plot_UMAP(sce, umap_slot = "UMAP_Harmony", color_by = "individual", group_by = "disease")
Plotting
cellula implements a few simple plotting functions for exploratory data analysis:
- plot_UMAP() to plot a UMAP (or any other 2D dimensional reduction)
- plot_Coldata() to plot data from the colData slot as a boxplot, scatterplot or confusion matrix
- plot_dots() to plot a dot plot of gene expression
plot_UMAP()
You can choose point color using the color_by argument, and faceting is supported via the group_by argument. Additionally, you can choose a shape_by variable for symbols and a label_by variable to place labels on the plot. Note that shape, group, and labels need to be categorical (i.e. factor) variables, whereas color can be numeric. The color palette is automatically generated, but the user can set it through the color_palette argument.
plot_UMAP(sce, umap_slot = "UMAP_Harmony", color_by = "sum")
plot_Coldata()
Takes as input x and y as column names from colData(sce), with optional color_by and group_by arguments, the latter for faceting.
This function returns different plots depending on the class of the 2 colData columns selected:
- if y is numeric and x is categorical (character or factor), it returns a combined violin-boxplot with one plot per level of x.
plot_Coldata(sce, x = "individual", y = "sum") + scale_y_log10()
Additionally, if the color_by argument specifies another column, every level of x will be divided by levels of color_by. With the appropriate use of the x, color_by and group_by variables, one can look at 3 different groupings of y at once.
plot_Coldata(sce, x = "individual", y = "sum", color_by = "disease", group_by = "cell type") + scale_y_log10()
- if y and x are both categorical, it returns a heatmap of the confusion matrix where every value is the pairwise Jaccard index between sets for any given level pair (this is mostly useful to check for differences in clustering/annotations)
- if y and x are both numeric, it returns a scatterplot with an optional 2D kernel density contour plot overlaid.
plot_Coldata(sce, x = "sum", y = "detected") + scale_x_log10()
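To make the categorical case more concrete, here is a minimal base R sketch of how a pairwise Jaccard index between the levels of two labelings could be computed. The function name jaccard_matrix() and the toy vectors are hypothetical and only illustrate the idea; this is not necessarily how cellula computes the values internally.

```r
# Hypothetical helper: pairwise Jaccard index between the levels of two
# categorical labelings of the same cells (e.g. two clustering results).
jaccard_matrix <- function(x, y) {
  tab <- table(x, y)                 # cells shared by each pair of levels
  inter <- unclass(tab)              # |A intersect B| for every level pair
  # |A union B| = |A| + |B| - |A intersect B|
  union <- outer(rowSums(tab), colSums(tab), "+") - inter
  inter / union
}

# Toy labelings for 5 cells (made-up data)
clusters_a <- c("a", "a", "b", "b", "b")
clusters_b <- c("k1", "k1", "k1", "k2", "k2")

jm <- jaccard_matrix(clusters_a, clusters_b)
print(round(jm, 2))
```

A value of 1 means that two levels contain exactly the same cells, while 0 means that they share none.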
plot_dots()
You can also use the plot_dots() function to draw the popular dot plot for marker genes.
This function takes in a SingleCellExperiment object, together with a vector of genes (matched to the rownames of the object), and a grouping variable specified by the group_by argument. Additionally, dots can be ordered by hierarchical clustering on either genes, groups, or both (set cluster_genes and/or cluster_groups to TRUE, which is the default). Colors can also be customized via the color_palette argument. Finally, the user can choose whether they want genes to be columns (format = "wide", the default) or rows (format = "tall").
plot_dots(sce, genes = top5, group_by = "SNN_0.5")
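The ordering by hierarchical clustering mentioned above can be sketched in base R on a toy matrix of average expression per group. The matrix, gene names and cluster names are made up, and the distance and linkage used here (Euclidean, complete linkage, the defaults of dist() and hclust()) are assumptions, not necessarily what plot_dots() uses.

```r
# Toy matrix of average expression: 4 hypothetical genes x 3 hypothetical clusters
avg_expr <- matrix(
  c(5.0, 0.2, 0.1,
    4.8, 0.3, 0.2,
    0.1, 3.9, 0.4,
    0.2, 0.3, 4.5),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC", "GeneD"),
                  c("cl1", "cl2", "cl3"))
)

# Cluster rows (genes) and columns (groups) separately
gene_order  <- hclust(dist(avg_expr))$order
group_order <- hclust(dist(t(avg_expr)))$order

# Reorder so that similar genes and similar groups sit next to each other
avg_ordered <- avg_expr[gene_order, group_order]
print(avg_ordered)
```

In this toy example GeneA and GeneB have nearly identical profiles, so they end up adjacent after reordering.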
Color palettes
cellula has a few standard color palettes for data visualization, which were taken from different packages. All plotting functions allow the user to supply custom palettes.
Standard qualitative
The standard qualitative palette comes from iterations of the qualpalr package in the default implementation of qualpal(). These palettes are optimized for maximal color differences in a perceptual space, i.e. by finding the N farthest points in the DIN99 color space. Since the number of colors changes the coordinates, using 2, 3, 4, … N colors will create different palettes.
This palette is automatically selected by default when the data is categorical, e.g. for clusters, samples, etc. It can be selected manually by specifying color_palette = "Qualpal" where possible.
CVD-adjusted qualitative
There are two qualitative palettes from qualpalr adjusted for Color Vision Deficiency (CVD) using a severity of 0.5. One is adjusted for protanopia and one for tritanopia.
They can be selected by specifying color_palette = "Protan" or color_palette = "Tritan" where possible.
Other qualitative palettes
cellula offers a few alternative qualitative palettes out of the box, with different sizes.
- Tableau: the Tableau palette contains 10 colors. This palette is not optimized for CVD and is not maximally separated, just pleasant to look at. The Tableau palette comes from base R.
- Pear: the Pear18 palette contains 18 colors. This palette is not optimized for CVD and is not maximally separated, just pleasant to look at. The Pear18 palette comes from a subset of the original Pear36 palette on Lospec by user PineappleOnPizza.
- Polychrome: the Polychrome24 palette contains 24 colors. This palette is not optimized for CVD and is not maximally separated, just pleasant to look at. The Polychrome24 palette comes from a subset of the Polychrome36 palette from base R.
- Polylight: the Polylight palette contains 24 colors. This palette is not optimized for CVD and is not maximally separated, just pleasant to look at. This is the Polychrome24 palette lightened by a factor of 0.4 using colorspace::lighten().
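As a rough illustration of where the base R palettes come from, the sketch below retrieves them with grDevices::palette.colors() (R >= 4.0). Taking the first 24 Polychrome colors is an assumption made here for the example; the text above does not say which subset cellula actually uses. The lightening step is only run if the colorspace package is installed.

```r
# Tableau palette (10 colors) as shipped with base R
tableau <- palette.colors(n = 10, palette = "Tableau 10")

# Polychrome 36 from base R; the choice of the first 24 colors is an assumption
poly36 <- palette.colors(n = 36, palette = "Polychrome 36")
poly24 <- poly36[1:24]

# Lightened variant, only if colorspace is available
if (requireNamespace("colorspace", quietly = TRUE)) {
  polylight <- colorspace::lighten(poly24, amount = 0.4)
  print(head(polylight))
}

print(tableau)
```

Any character vector of colors built this way can then be passed wherever a plotting function accepts a custom palette.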
Quantitative palettes
The Sunset, Heat and truncated Yellow-Green-Blue (YlGnBu) quantitative palettes come from the colorspace package, through the sequential_hcl() function using 25 colors (Sunset and Heat) or 40 colors truncated to the last 30 (YlGnBu).
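A minimal sketch of how palettes of this kind can be generated is shown below. To keep it free of extra dependencies it uses grDevices::hcl.colors(), the base R counterpart of colorspace::sequential_hcl(), which should give matching or very similar colors for the same palette names. Using tail() to keep the last 30 colors is an interpretation of the truncation described above, not a confirmed detail of the cellula implementation.

```r
# 25-color sequential palettes
sunset <- hcl.colors(25, palette = "Sunset")
heat   <- hcl.colors(25, palette = "Heat")

# 40 colors, keeping only the last 30 (assumed meaning of "truncated")
ylgnbu <- tail(hcl.colors(40, palette = "YlGnBu"), 30)

# Quick visual check of the three palettes
op <- par(mfrow = c(3, 1), mar = c(1, 1, 2, 1))
image(matrix(seq_along(sunset)), col = sunset, axes = FALSE, main = "Sunset")
image(matrix(seq_along(heat)),   col = heat,   axes = FALSE, main = "Heat")
image(matrix(seq_along(ylgnbu)), col = ylgnbu, axes = FALSE, main = "YlGnBu (truncated)")
par(op)
```

Truncating a sequential palette like this removes one end of its range, which can help when the extreme colors would be hard to see against the plot background.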
The Parula and Turbo quantitative palettes come from the pals package, using the parula() and turbo() functions respectively, both using 25 colors.
Sunset is the default palette for quantitative data in dimensionality reductions. YlGnBu is the default palette for quantitative data in dot plots and heatmaps. Heat 2 is the palette for kernel densities on scatterplots. Parula and Turbo are there just for testing purposes but can be chosen using color_palette = "Parula" or color_palette = "Turbo" where possible.