---
title: "ggmatplot"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{ggmatplot}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  message = FALSE,
  warning = FALSE
)
```
`ggmatplot` is a quick and easy way to plot the columns of two matrices or data frames against each other using [`ggplot2`](https://ggplot2.tidyverse.org/).
Its functionality is inspired by base R's [`matplot`](https://www.rdocumentation.org/packages/graphics/versions/3.6.2/topics/matplot), so `ggmatplot` can be thought of as a `ggplot2` version of `matplot`.
## What does `ggmatplot` do?
Like [`matplot`](https://www.rdocumentation.org/packages/graphics/versions/3.6.2/topics/matplot), `ggmatplot` plots a vector against the columns of a matrix, the columns of two matrices against each other, or a vector or matrix on its own. Unlike `matplot`, however, `ggmatplot` returns a `ggplot` object.
Suppose we have a covariate vector `x` and a matrix `z` whose two columns hold the response `y` and the fitted values `fit.y`.
```{r defining-data}
# vector x
x <- rnorm(100, sd = 2)
head(x)
# matrix z
y <- x * 0.5 + rnorm(100, sd = 1)
fit.y <- fitted(lm(y ~ x))
z <- cbind(actual = y,
           fitted = fit.y)
head(z)
```
`ggmatplot()` plots vector `x` against each column of matrix `z` using the default `plot_type = "point"`. The two columns appear on the resulting plot as two groups, distinguished by different shapes and colors.
```{r point-plot}
library(ggmatplot)
ggmatplot(x, z)
```
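
For comparison, a roughly equivalent base R plot can be drawn with `matplot()` itself. This is a minimal sketch: the data are regenerated under the hypothetical names `x_demo` and `z_demo` (with an arbitrary seed) so the chunk stands alone, and the `pch` and `col` values are illustrative choices rather than a match for `ggmatplot`'s defaults.

```{r base-matplot-comparison}
# Hypothetical stand-alone copy of the example data (arbitrary seed)
set.seed(1)
x_demo <- rnorm(100, sd = 2)
y_demo <- x_demo * 0.5 + rnorm(100, sd = 1)
z_demo <- cbind(actual = y_demo,
                fitted = fitted(lm(y_demo ~ x_demo)))

# Base graphics: one point symbol and one color per column of z_demo
matplot(x_demo, z_demo,
  type = "p",
  pch = c(1, 2),
  col = c("black", "red"),
  xlab = "x", ylab = "z"
)

# matplot() does not add a legend, so it is drawn separately
legend("topleft",
  legend = colnames(z_demo),
  pch = c(1, 2),
  col = c("black", "red")
)
```

`matplot()` draws directly on the graphics device, so the result cannot be stored and extended the way a `ggplot` object can.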
The default aesthetics used to differentiate the two groups can be changed using `ggmatplot()` arguments. Since the two groups in this example are differentiated by shape and color, the `shape` and `color` parameters control them. If we want points in both groups to have the same shape, we can set the `shape` parameter to a single value. If we want the groups to remain differentiated by color, we can pass a vector of colors as the `color` parameter, making sure that the number of colors matches the number of groups.
```{r point-plot-w-parameters}
ggmatplot(x, z,
  shape = "circle", # using a single shape over both groups
  color = c("blue", "purple") # assigning two colors to the two groups
)
```
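
One way to keep the number of colors in step with the number of groups is to derive it from the number of columns. The sketch below assumes the groups correspond to the columns of the second argument, as in the example above, and uses `hcl.colors()` from base R's `grDevices` (available since R 3.6.0) with the `"Dark 3"` palette as an arbitrary choice. The names `x_demo`, `z_demo`, and `group_colors` are hypothetical, introduced only for illustration.

```{r point-plot-generated-colors}
library(ggmatplot)

# Hypothetical stand-alone copy of the example data (arbitrary seed)
set.seed(1)
x_demo <- rnorm(100, sd = 2)
y_demo <- x_demo * 0.5 + rnorm(100, sd = 1)
z_demo <- cbind(actual = y_demo,
                fitted = fitted(lm(y_demo ~ x_demo)))

# Generate exactly one color per column (group) of z_demo
group_colors <- hcl.colors(ncol(z_demo), palette = "Dark 3")

# Guard against a mismatch before plotting
stopifnot(length(group_colors) == ncol(z_demo))

p_colors <- ggmatplot(x_demo, z_demo,
  shape = "circle",
  color = group_colors
)
p_colors
```

Deriving the color vector from `ncol()` means the call keeps working if columns are later added to or removed from the matrix.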
Since `ggmatplot` is built upon [`ggplot2`](https://ggplot2.tidyverse.org/) and creates a `ggplot` object, [`ggplot2` add-ons](https://ggplot2.tidyverse.org/reference/index.html) such as scales, faceting specifications, coordinate systems, and themes can also be added to plots created using `ggmatplot`.
Each `plot_type` allowed by the `ggmatplot()` function is also built upon a `ggplot2` geom (geometric object), as listed [here](../reference/ggmatplot.html#plot-types). Therefore, `ggmatplot()` supports additional parameters specific to each plot type. These are often aesthetics set to a fixed value, like `size = 2` or `alpha = 0.5`, but they can also be other parameters specific to a given type of plot.
```{r point-plot-w-theme}
ggmatplot(x, z,
  shape = "circle",
  color = c("blue", "purple"),
  size = 2,
  alpha = 0.5
) +
  theme_bw()
```
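
Because the result is a `ggplot` object, it can also be stored and extended later. The following is a minimal sketch using only standard `ggplot2` functions (`labs()`, `theme_bw()`, `theme()`). The data and the names `x_demo`, `z_demo`, and `p_demo` are hypothetical and regenerated here so the chunk stands alone, and the title text is an arbitrary example.

```{r point-plot-w-addons}
library(ggplot2)
library(ggmatplot)

# Hypothetical stand-alone copy of the example data (arbitrary seed)
set.seed(1)
x_demo <- rnorm(100, sd = 2)
y_demo <- x_demo * 0.5 + rnorm(100, sd = 1)
z_demo <- cbind(actual = y_demo,
                fitted = fitted(lm(y_demo ~ x_demo)))

# Store the plot, then add ggplot2 components to it
p_demo <- ggmatplot(x_demo, z_demo)

p_demo +
  labs(title = "Actual vs. fitted values") + # add a title
  theme_bw() + # switch to a complete theme
  theme(legend.position = "bottom") # then adjust one theme element
```

Placing `theme()` after `theme_bw()` matters here, since a complete theme added later would reset the legend position.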
[This list of examples](../index.html#examples) includes other types of plots we can create using `ggmatplot`.
## When can we use `ggmatplot` over `ggplot2`?
[`ggplot2`](https://ggplot2.tidyverse.org/) requires wide-format data to be wrangled into long format before plotting, which can be cumbersome when creating simple plots. The motivation for `ggmatplot` is therefore to let wide-format data be plotted with `ggplot2` directly. Although `ggmatplot` doesn't provide the same flexibility as `ggplot2`, it saves us from reshaping wide-format data when all we need is a simple plot.
Suppose we want to use the `iris` dataset to plot the distributions of its numeric variables individually.
```{r iris-data}
iris_numeric <- iris[, setdiff(colnames(iris), "Species")]
head(iris_numeric)
```
If we were to plot this data using [`ggplot2`](https://ggplot2.tidyverse.org/), we'd have to wrangle the data into long format before plotting.
```{r iris-ggplot2-density-plot}
library(tidyr)
iris_numeric_long <- iris_numeric %>%
  pivot_longer(cols = everything(),
               names_to = "Feature",
               values_to = "Measurement")
head(iris_numeric_long)
ggplot(iris_numeric_long, aes(x = Measurement, color = Feature)) +
  geom_density()
```
The wide-format data, however, can be used directly with `ggmatplot` to achieve the same result. Note that the order of the categories in the legend follows the column order in the original dataset.
```{r iris-ggmatplot-density-plot}
ggmatplot(iris_numeric, plot_type = "density", alpha = 0)
```
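
The legend order differs between the two approaches because `ggplot2` sorts character categories alphabetically. If the `ggplot2` version should follow the column order as well, one option is to convert the long-format `Feature` column to a factor whose levels are the original column names. This is a sketch of that idea, rebuilding the data so the chunk stands alone.

```{r iris-ggplot2-legend-order}
library(ggplot2)
library(tidyr)

iris_numeric <- iris[, setdiff(colnames(iris), "Species")]

iris_numeric_long <- pivot_longer(iris_numeric,
  cols = everything(),
  names_to = "Feature",
  values_to = "Measurement"
)

# Fix the category order to the column order of the wide data
iris_numeric_long$Feature <- factor(
  iris_numeric_long$Feature,
  levels = colnames(iris_numeric)
)

ggplot(iris_numeric_long, aes(x = Measurement, color = Feature)) +
  geom_density()
```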
Suppose we also have the following dataset of monthly totals of international airline passengers (in thousands) from January 1949 to December 1960.
```{r airline-data}
AirPassengers <- matrix(AirPassengers,
  ncol = 12, byrow = FALSE,
  dimnames = list(month.abb, as.character(1949:1960))
)
AirPassengers
```
If we want to plot the trend of the number of passengers over the years using `ggplot2`, we'd have to wrangle the data into long format. But we can use `ggmatplot` as a workaround.
First, we can split the data into a vector and a matrix as follows:

* `months`: a vector containing the list of months
* `nPassengers`: a matrix of passenger numbers with each column representing a year
```{r airline-data-split}
months <- rownames(AirPassengers)
nPassengers <- AirPassengers[, 1:12]
```
Then we can use `ggmatplot()` to plot the `months` vector against each column of the `nPassengers` matrix, which can be more simply understood as grouping the plot by each column (year) of the `nPassengers` matrix.
```{r line-plot}
ggmatplot(
  x = months,
  y = nPassengers,
  plot_type = "line",
  size = 1,
  legend_label = 1949:1960,
  xlab = "Month",
  ylab = "Total airline passengers (in thousands)",
  legend_title = "Year"
) +
  theme_minimal()
```
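
For reference, the long-format route that `ggmatplot` avoids here might look like the sketch below. It reads `datasets::AirPassengers` explicitly so that it starts from the original time series, and the names `passengers_wide` and `passengers_long` are hypothetical, introduced only for this illustration.

```{r airline-ggplot2-line-plot}
library(ggplot2)
library(tidyr)

# Wide data frame: one row per month, one column per year
passengers_wide <- as.data.frame(
  matrix(datasets::AirPassengers,
    ncol = 12, byrow = FALSE,
    dimnames = list(month.abb, as.character(1949:1960))
  )
)

# Keep months in calendar order rather than alphabetical order
passengers_wide$Month <- factor(month.abb, levels = month.abb)

# Reshape to long format: one row per month-year combination
passengers_long <- pivot_longer(passengers_wide,
  cols = -Month,
  names_to = "Year",
  values_to = "Passengers"
)

ggplot(
  passengers_long,
  aes(x = Month, y = Passengers, color = Year, group = Year)
) +
  geom_line() +
  labs(x = "Month", y = "Total airline passengers (in thousands)") +
  theme_minimal()
```

The reshaping step and the explicit `group` aesthetic are the parts that the `ggmatplot()` call above handles from the wide matrix alone.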