Several tools display gene expression data in a click-and-plot system, most notably iSEE. More packages are listed in Related Tools. Geyser differs from iSEE and the others in that it does little beyond showing expression data. The advantage of doing so little is that the number of dials and options is substantially reduced, and it is, I hope, very simple to operate.
Commented out for now as it is not on Bioconductor
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("geyser")
The latest version can be installed from GitHub:
if (!requireNamespace("remotes", quietly=TRUE))
    install.packages("remotes")
remotes::install_github("davemcg/geyser")
library(geyser)
load(system.file('extdata/tiny_rse.Rdata', package = 'geyser'))
Running geyser is as simple as giving the SummarizedExperiment object to the geyser function.
if (interactive()){
    geyser(tiny_rse)
}
The Shiny-based GUI first shows you the metadata (colData slot) of the SummarizedExperiment (SE) object in a reactive DT data table.
The idea is that this helps you identify the relevant fields to plot against (in this example, tissue and disease).
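The same screening can be sketched outside the app. The snippet below is only an illustration: `meta` is a hypothetical stand-in for what `as.data.frame(colData(se))` might return, and every column name and value in it is invented. It counts the distinct values per column and keeps the columns that vary across samples without being unique to each sample, which are the usual candidates for sample grouping.

```r
# Hypothetical metadata table standing in for as.data.frame(colData(se));
# all column names and values here are invented for illustration.
meta <- data.frame(
    sample_id = paste0("S", 1:6),
    tissue    = c("PRC", "PRC", "PRC", "PR", "PR", "PR"),
    disease   = c("AMD", "Normal", "AMD", "Normal", "AMD", "Normal"),
    platform  = rep("Illumina", 6),
    stringsAsFactors = FALSE
)

# Number of distinct values in each metadata column
n_distinct <- vapply(meta, function(col) length(unique(col)), integer(1))

# Keep columns with more than one value but fewer values than samples
grouping_candidates <- names(n_distinct)[n_distinct > 1 & n_distinct < nrow(meta)]
print(grouping_candidates)
```

Columns that are constant (like `platform` here) or unique per sample (like `sample_id`) are dropped, because neither splits the samples into useful groups.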
You can then click over to the Plotting section of the app and start typing in those fields (tissue and disease) in the “Sample Grouping(s)” box. The text will auto-complete as you type.
After that you can type in the genes you are interested in. Again, the genes will auto-complete as you type.
When you click the orange “Draw Box Plot” button, the plot will be made.
You can custom filter which samples are shown by clicking on the “triangle” next to “Sample Filtering” and then selecting the samples you want to display. Here we select only the normal samples and then use the Heatmap visualization (you can swap between Box Plot and Heatmap by clicking between the tabs).
If you want to reset the custom sample filtering, just click the “Clear Rows” button.
The plots can be “outputted” by either right-clicking or by
We also lightly tweak the metadata to create human-usable splits.
# If needed: BiocManager::install("recount3")
if (interactive()){
    library(recount3)
    library(geyser)
    human_projects <- available_projects()
    proj_info <- subset(
        human_projects,
        project == "SRP107937" & project_type == "data_sources"
    )
    rse_SRP107937 <- create_rse(proj_info)
    assay(rse_SRP107937, "counts") <- transform_counts(rse_SRP107937)
    # first tweak that glues the gene name onto the gene id in the row names
    rownames(rse_SRP107937) <- paste0(rowData(rse_SRP107937)$gene_name, ' (', row.names(rse_SRP107937), ')')
    # creates two new metadata fields by pulling keywords out of the sample titles
    colData(rse_SRP107937)$tissue <- stringr::str_extract(colData(rse_SRP107937)$sra.sample_title, 'PRC|PR')
    colData(rse_SRP107937)$disease <- stringr::str_extract(colData(rse_SRP107937)$sra.sample_title, 'AMD|Normal')
    geyser(rse_SRP107937, " geyser: SRP107937")
}
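To show what the two metadata tweaks do, here is a minimal base R sketch of the same keyword extraction. The sample titles are made up and may not resemble the real `sra.sample_title` values for SRP107937. `extract_first()` is a hypothetical helper written for this illustration; it is not part of geyser or recount3, and it only approximates `stringr::str_extract()` by returning `NA` where nothing matches.

```r
# Invented sample titles; the real sra.sample_title values may look different.
titles <- c("PRC_AMD_1", "PR_Normal_2", "PRC_Normal_3", "unlabelled")

# Hypothetical helper: first regex match per element, NA when nothing matches.
extract_first <- function(x, pattern) {
    m <- regexpr(pattern, x)
    out <- rep(NA_character_, length(x))
    out[m > 0] <- regmatches(x, m)
    out
}

# Same patterns as in the recount3 example above
tissue  <- extract_first(titles, "PRC|PR")
disease <- extract_first(titles, "AMD|Normal")

print(data.frame(title = titles, tissue = tissue, disease = disease))
```

Titles that match neither keyword end up as `NA`, so it is worth checking the new fields for missing values before using them as sample groupings.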
The key step is to supply matched metadata, where each row of the metadata corresponds to one column of the count matrix.
library(SummarizedExperiment)
counts <- matrix(runif(10 * 6, 1, 1e4), 10)
row.names(counts) <- paste0('gene', seq(1,10))
colnames(counts) <- LETTERS[1:6]
sample_info <- data.frame(Condition = c(rep("Unicorn", 3),
                                        rep("Horse", 3)),
                          row.names = LETTERS[1:6])
se_object <- SummarizedExperiment(assays=list(counts = counts),
                                  colData = sample_info)
if (interactive()){
    geyser(se_object, "Magical Creatures")
}
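When the metadata comes from a separate file, its rows may not be in the same order as the columns of the count matrix. The following base R sketch shows one way to check and restore that matching before building the object. The matrix, sample names, and conditions are all invented for illustration.

```r
# Invented count matrix: 4 genes x 3 samples
counts <- matrix(1:12, nrow = 4,
                 dimnames = list(paste0("gene", 1:4), c("S1", "S2", "S3")))

# Invented metadata, deliberately listed in a different sample order
sample_info <- data.frame(Condition = c("Horse", "Unicorn", "Unicorn"),
                          row.names = c("S3", "S1", "S2"))

# Every sample in the matrix needs exactly one metadata row, and vice versa
stopifnot(setequal(rownames(sample_info), colnames(counts)),
          !anyDuplicated(rownames(sample_info)))

# Reorder the metadata rows to follow the column order of the count matrix
sample_info <- sample_info[colnames(counts), , drop = FALSE]

# After reordering, row i of the metadata describes column i of the counts
stopifnot(identical(rownames(sample_info), colnames(counts)))
```

The `drop = FALSE` argument keeps `sample_info` a data frame even when it has a single column, as it does here.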
This example is taken from the DESeq2 guide. A DESeqDataSet is highly similar to the SummarizedExperiment class.
library(DESeq2)
library(airway)
data(airway)
ddsSE <- DESeqDataSet(airway, design = ~ cell + dex)
if (interactive()){
    geyser(ddsSE, "DESeq Airway Example")
}
sessionInfo()
#> R Under development (unstable) (2024-10-21 r87258)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/New_York
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] airway_1.27.0 DESeq2_1.47.1
#> [3] SummarizedExperiment_1.37.0 Biobase_2.67.0
#> [5] GenomicRanges_1.59.1 GenomeInfoDb_1.43.2
#> [7] IRanges_2.41.2 S4Vectors_0.45.2
#> [9] BiocGenerics_0.53.3 generics_0.1.3
#> [11] MatrixGenerics_1.19.1 matrixStats_1.5.0
#> [13] geyser_0.99.8 BiocStyle_2.35.0
#>
#> loaded via a namespace (and not attached):
#> [1] beeswarm_0.4.0 shape_1.4.6.1 circlize_0.4.16
#> [4] gtable_0.3.6 rjson_0.2.23 xfun_0.50
#> [7] bslib_0.8.0 ggplot2_3.5.1 GlobalOptions_0.1.2
#> [10] lattice_0.22-6 vctrs_0.6.5 tools_4.5.0
#> [13] parallel_4.5.0 tibble_3.2.1 cluster_2.1.8
#> [16] pkgconfig_2.0.3 Matrix_1.7-1 RColorBrewer_1.1-3
#> [19] lifecycle_1.0.4 GenomeInfoDbData_1.2.13 compiler_4.5.0
#> [22] munsell_0.5.1 codetools_0.2-20 ComplexHeatmap_2.23.0
#> [25] clue_0.3-66 vipor_0.4.7 httpuv_1.6.15
#> [28] htmltools_0.5.8.1 sass_0.4.9 yaml_2.3.10
#> [31] tidyr_1.3.1 pillar_1.10.1 later_1.4.1
#> [34] crayon_1.5.3 jquerylib_0.1.4 BiocParallel_1.41.0
#> [37] DelayedArray_0.33.3 cachem_1.1.0 iterators_1.0.14
#> [40] abind_1.4-8 foreach_1.5.2 mime_0.12
#> [43] locfit_1.5-9.10 tidyselect_1.2.1 digest_0.6.37
#> [46] purrr_1.0.2 dplyr_1.1.4 bookdown_0.42
#> [49] fastmap_1.2.0 grid_4.5.0 colorspace_2.1-1
#> [52] cli_3.6.3 SparseArray_1.7.2 magrittr_2.0.3
#> [55] S4Arrays_1.7.1 scales_1.3.0 UCSC.utils_1.3.0
#> [58] promises_1.3.2 ggbeeswarm_0.7.2 rmarkdown_2.29
#> [61] XVector_0.47.2 httr_1.4.7 GetoptLong_1.0.5
#> [64] png_0.1-8 shiny_1.10.0 evaluate_1.0.3
#> [67] knitr_1.49 doParallel_1.0.17 rlang_1.1.4
#> [70] Rcpp_1.0.14 xtable_1.8-4 glue_1.8.0
#> [73] BiocManager_1.30.25 jsonlite_1.8.9 R6_2.5.1