---
title: "ggmatplot"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{ggmatplot}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  message = FALSE,
  warning = FALSE
)
```

`ggmatplot` is a quick and easy way of plotting the columns of two matrices or data frames against each other using [`ggplot2`](https://ggplot2.tidyverse.org/).

`ggmatplot` is built upon [`ggplot2`](https://ggplot2.tidyverse.org/), and its functionality is inspired by [`matplot`](https://www.rdocumentation.org/packages/graphics/versions/3.6.2/topics/matplot). Therefore, `ggmatplot` can be considered as a `ggplot` version of [`matplot`](https://www.rdocumentation.org/packages/graphics/versions/3.6.2/topics/matplot).

## What does `ggmatplot` do?

Similar to [`matplot`](https://www.rdocumentation.org/packages/graphics/versions/3.6.2/topics/matplot), `ggmatplot` plots a vector against the columns of a matrix, or the columns of two matrices against each other, or a vector/matrix on its own. However, unlike [`matplot`](https://www.rdocumentation.org/packages/graphics/versions/3.6.2/topics/matplot), `ggmatplot` returns a `ggplot` object.

Suppose we have a covariate vector `x` and a matrix `z` with the response `y` and the fitted value `fit.y` as the two columns.

```{r defining-data}
# vector x
x <- c(rnorm(100, sd = 2))
head(x)

# matrix z
y <- x * 0.5 + rnorm(100, sd = 1)
fit.y <- fitted(lm(y ~ x))
z <- cbind(actual = y, fitted = fit.y)
head(z)
```

`ggmatplot` plots vector `x` against each column of matrix `z` using the default `plot_type = "point"`. This will be represented on the resulting plot as two groups, identified using different shapes and colors.

```{r point-plot}
library(ggmatplot)

ggmatplot(x, z)
```

The default aesthetics used to differentiate the two groups can be updated using `ggmatplot()` arguments.
Since the two groups in this example are differentiated using their shapes and colors, the `shape` and `color` parameters can be used to change them.

If we want points in both groups to have the same shape, we can simply set the `shape` parameter to a single value. If we still want the groups to be differentiated by color, we can pass a vector of colors as the `color` parameter, making sure the number of colors matches the number of groups.

```{r point-plot-w-parameters}
ggmatplot(x, z,
  shape = "circle", # using a single shape over both groups
  color = c("blue", "purple") # assigning two colors to the two groups
)
```

Since `ggmatplot` is built upon [`ggplot2`](https://ggplot2.tidyverse.org/) and creates a `ggplot` object, [`ggplot2` add-ons](https://ggplot2.tidyverse.org/reference/index.html) such as scales, faceting specifications, coordinate systems, and themes can also be added to plots created using `ggmatplot`.

Each `plot_type` allowed by the `ggmatplot()` function is also built upon a `ggplot2` geom (geometric object), as listed [here](../reference/ggmatplot.html#plot-types). Therefore, `ggmatplot()` will support additional parameters specific to each plot type. These are often aesthetics, used to set an aesthetic to a fixed value, like `size = 2` or `alpha = 0.5`. However, they can also be other parameters specific to different types of plots.

```{r point-plot-w-theme}
ggmatplot(x, z,
  shape = "circle",
  color = c("blue", "purple"),
  size = 2,
  alpha = 0.5
) +
  theme_bw()
```

[This list of examples](../index.html#examples) includes other types of plots we can create using `ggmatplot`.

## When can we use `ggmatplot` over `ggplot2`?

[`ggplot2`](https://ggplot2.tidyverse.org/) requires wide format data to be wrangled into long format for plotting, which can be quite cumbersome when creating simple plots.
Therefore, the motivation for `ggmatplot` is to provide a solution that allows [`ggplot2`](https://ggplot2.tidyverse.org/) to handle wide format data. Although `ggmatplot` doesn't provide the same flexibility as [`ggplot2`](https://ggplot2.tidyverse.org/), it removes the need to wrangle wide format data into long format when creating simple plots with [`ggplot2`](https://ggplot2.tidyverse.org/).

Suppose we want to use the `iris` dataset to plot the distributions of its numeric variables individually.

```{r iris-data}
iris_numeric <- iris[, setdiff(colnames(iris), "Species")]
head(iris_numeric)
```

If we were to plot this data using [`ggplot2`](https://ggplot2.tidyverse.org/), we'd have to wrangle the data into long format before plotting.

```{r iris-ggplot2-density-plot}
library(tidyr)

iris_numeric_long <- iris_numeric %>%
  pivot_longer(
    cols = everything(),
    names_to = "Feature",
    values_to = "Measurement"
  )
head(iris_numeric_long)

ggplot(iris_numeric_long, aes(x = Measurement, color = Feature)) +
  geom_density()
```

But the wide format data can be used directly with `ggmatplot` to achieve the same result. Note that the order of the categories in the legend follows the column order in the original dataset.

```{r iris-ggmatplot-density-plot}
ggmatplot(iris_numeric, plot_type = "density", alpha = 0)
```

Suppose we also have the following dataset of the monthly totals of international airline passengers (in thousands) from January 1949 to December 1960.

```{r airline-data}
AirPassengers <- matrix(AirPassengers,
  ncol = 12, byrow = FALSE,
  dimnames = list(month.abb, as.character(1949:1960))
)
AirPassengers
```

If we want to plot the trend of the number of passengers over the years using `ggplot2`, we'd have to wrangle the data into long format. But we can use `ggmatplot` as a workaround.
First, we can split the data into two objects as follows:

- `months`: a vector containing the list of months
- `nPassengers`: a matrix of passenger numbers with each column representing a year

```{r airline-data-split}
months <- rownames(AirPassengers)
nPassengers <- AirPassengers[, 1:12]
```

Then we can use `ggmatplot()` to plot the `months` vector against each column of the `nPassengers` matrix, which can be more simply understood as grouping the plot by each column (year) of the `nPassengers` matrix.

```{r line-plot}
ggmatplot(
  x = months,
  y = nPassengers,
  plot_type = "line",
  size = 1,
  legend_label = c(1949:1960),
  xlab = "Month",
  ylab = "Total airline passengers (in thousands)",
  legend_title = "Year"
) +
  theme_minimal()
```
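
For comparison, the long format wrangling that plain `ggplot2` would need for a similar line plot might look roughly like the sketch below. It is only an illustration: the names `passengers_wide`, `passengers_long`, and `p` are hypothetical and not part of `ggmatplot`, and the series is read from `datasets::AirPassengers` because `AirPassengers` was reassigned to a matrix earlier in this vignette.

```{r airline-ggplot2-long}
library(ggplot2)

# Rebuild the 12 x 12 month-by-year matrix from the built-in series
# (hypothetical name; mirrors the matrix created earlier in this vignette)
passengers_wide <- matrix(
  as.numeric(datasets::AirPassengers),
  ncol = 12,
  dimnames = list(month.abb, as.character(1949:1960))
)

# Long format: one row per month-year pair.
# as.vector() reads the matrix column by column, so months cycle fastest.
passengers_long <- data.frame(
  Month = factor(
    rep(rownames(passengers_wide), times = ncol(passengers_wide)),
    levels = month.abb # keep calendar order on the x axis
  ),
  Year = factor(rep(colnames(passengers_wide), each = nrow(passengers_wide))),
  Passengers = as.vector(passengers_wide)
)

# One line per year, grouped and colored by the Year column
p <- ggplot(
  passengers_long,
  aes(x = Month, y = Passengers, color = Year, group = Year)
) +
  geom_line() +
  labs(x = "Month", y = "Total airline passengers (in thousands)") +
  theme_minimal()
p
```

The reshaping step is the part that `ggmatplot()` spares us: the wide matrix has to be unrolled into explicit `Month` and `Year` columns before `ggplot2` can map them to aesthetics.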