DEGpattern visualizations
1 Overview of this report
This template was developed with materials from the HBC training: Intro-to-DGE.
The default test data originally come from this paper; the required raw data can be downloaded using the links (Salmon data, Annotation file).
The steps taken from raw data to the intermediate files required for this visualization can be found in Data_prep.R
and are adapted from two main DGE training materials: data set up; count normalization.
The three intermediate files required for this tutorial are .rds files containing:
deseq_obj: a DESeq2 object formatted from your tximport
deseq_meta: a data.frame specifying the sample groups of interest
deseq_deg: a named vector with Differentially Expressed Genes (DEG) as the name and adjusted p value as the value.
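As a minimal sketch of how such an .rds file round-trips through R, the example below saves and reloads a toy named vector shaped like deseq_deg. The file name, the gene names, and the values are hypothetical and chosen only for illustration; the real files are produced by Data_prep.R.

```r
# Toy object standing in for deseq_deg (made-up genes and adjusted p-values).
deseq_deg <- c(geneA = 0.001, geneB = 0.02)

# Hypothetical file name, written to a temporary directory so the sketch runs anywhere.
rds_path <- file.path(tempdir(), "deseq_deg.rds")
saveRDS(deseq_deg, rds_path)

# readRDS() restores the object as it was saved, including the gene names.
deseq_deg_loaded <- readRDS(rds_path)
print(deseq_deg_loaded)
```

The same readRDS() call would apply to the deseq_obj and deseq_meta files, assuming they were written with saveRDS().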
All test data can be found in the bcbioR test data GitHub repo.
Two additional parameters can be tuned when generating deseq_deg from the original DESeq2 results:
padj.cutoff: cutoff for the adjusted p-value of the DESeq results. Default: 0.05
topN: a second filter, applied after padj.cutoff, that keeps only the top significant genes for clustering to improve computing efficiency. If the number of significant genes is less than the number supplied here, all genes will be used for clustering. Default: 1000
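The two-stage filtering described above can be sketched in base R as follows. The helper select_deg() is hypothetical (it is not part of DESeq2 or DEGreport), the input values are made up, and the use of a strict "less than" comparison against padj.cutoff is an assumption; Data_prep.R may implement the filter differently.

```r
# Made-up adjusted p-values for six genes; geneD has no adjusted p-value.
padj <- c(geneA = 0.20, geneB = 0.001, geneC = 0.04,
          geneD = NA, geneE = 0.0005, geneF = 0.03)

# Hypothetical helper illustrating the padj.cutoff and topN parameters.
select_deg <- function(padj, padj.cutoff = 0.05, topN = 1000) {
  # First filter: drop missing values and keep genes below the cutoff.
  sig <- padj[!is.na(padj) & padj < padj.cutoff]
  # Second filter: order by significance and keep at most topN genes.
  # head() returns everything when fewer than topN genes remain.
  head(sort(sig), topN)
}

# With topN = 3, only the three most significant genes are kept.
deseq_deg <- select_deg(padj, padj.cutoff = 0.05, topN = 3)
print(deseq_deg)
```

With the default topN of 1000, all four genes passing the cutoff in this toy input would be returned, which mirrors the fallback behaviour described for topN.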
3 Zoom in on a specific cluster of genes
Since we are interested in Group 1, we can filter the dataframe to keep only those genes:
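A minimal sketch of this filtering step, using a toy data frame in place of the real cluster table. The column names genes and cluster are assumptions modelled on the table that DEGreport's degPatterns() output exposes, and the gene names and cluster assignments are made up.

```r
# Toy stand-in for the cluster table (one row per gene with its cluster number).
cluster_df <- data.frame(
  genes = c("geneA", "geneB", "geneC", "geneD", "geneE"),
  cluster = c(1, 2, 1, 3, 1),
  stringsAsFactors = FALSE
)

# Keep only the rows assigned to cluster 1 (Group 1).
group1 <- cluster_df[cluster_df$cluster == 1, ]

# Extract the gene identifiers for downstream annotation or functional analysis.
group1_genes <- group1$genes
print(group1_genes)
```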
After extracting a group of genes, we can use annotation packages to obtain additional information. We can also use these lists of genes as input to downstream functional analysis tools to obtain more biological insight and see whether the groups of genes share a specific function.
This lesson has been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Materials and hands-on activities were adapted from RNA-seq workflow on the Bioconductor website
4 Conclusions
5 Methods
5.1 R package references
To cite package ‘DEGreport’ in publications use:
Pantano L (2025). DEGreport: Report of DEG analysis. doi:10.18129/B9.bioc.DEGreport https://doi.org/10.18129/B9.bioc.DEGreport, R package version 1.44.0, https://bioconductor.org/packages/DEGreport.
A BibTeX entry for LaTeX users is
@Manual{,
  title = {DEGreport: Report of DEG analysis},
  author = {Lorena Pantano},
  year = {2025},
  note = {R package version 1.44.0},
  url = {https://bioconductor.org/packages/DEGreport},
  doi = {10.18129/B9.bioc.DEGreport},
}
To cite package ‘DESeq2’ in publications use:
Love, M.I., Huber, W., Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 Genome Biology 15(12):550 (2014)
A BibTeX entry for LaTeX users is
@Article{,
  title = {Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2},
  author = {Michael I. Love and Wolfgang Huber and Simon Anders},
  year = {2014},
  journal = {Genome Biology},
  doi = {10.1186/s13059-014-0550-8},
  volume = {15},
  issue = {12},
  pages = {550},
}
To cite ggplot2 in publications, please use
H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.
A BibTeX entry for LaTeX users is
@Book{,
  author = {Hadley Wickham},
  title = {ggplot2: Elegant Graphics for Data Analysis},
  publisher = {Springer-Verlag New York},
  year = {2016},
  isbn = {978-3-319-24277-4},
  url = {https://ggplot2.tidyverse.org},
}
To cite package ‘dplyr’ in publications use:
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. doi:10.32614/CRAN.package.dplyr https://doi.org/10.32614/CRAN.package.dplyr, R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
A BibTeX entry for LaTeX users is
@Manual{,
  title = {dplyr: A Grammar of Data Manipulation},
  author = {Hadley Wickham and Romain François and Lionel Henry and Kirill Müller and Davis Vaughan},
  year = {2023},
  note = {R package version 1.1.4},
  url = {https://CRAN.R-project.org/package=dplyr},
  doi = {10.32614/CRAN.package.dplyr},
}
5.2 R session
List of the tools, with their versions, used to generate this report.
R version 4.5.1 (2025-06-13)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.5 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
locale:
[1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
[4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
[7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
time zone: UTC
tzcode source: system (glibc)
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] ggprism_1.0.6 grafify_5.0.0.1
[3] R.utils_2.13.0 R.oo_1.27.1
[5] R.methodsS3_1.8.2 glue_1.8.0
[7] knitr_1.50 ggplot2_3.5.2
[9] dplyr_1.1.4 DESeq2_1.48.1
[11] SummarizedExperiment_1.38.1 Biobase_2.68.0
[13] MatrixGenerics_1.20.0 matrixStats_1.5.0
[15] GenomicRanges_1.60.0 GenomeInfoDb_1.44.0
[17] IRanges_2.42.0 S4Vectors_0.46.0
[19] BiocGenerics_0.54.0 generics_0.1.4
[21] DEGreport_1.44.0
loaded via a namespace (and not attached):
[1] RColorBrewer_1.1-3 ggdendro_0.2.0
[3] rstudioapi_0.17.1 jsonlite_2.0.0
[5] shape_1.4.6.1 magrittr_2.0.3
[7] estimability_1.5.1 farver_2.1.2
[9] nloptr_2.2.1 rmarkdown_2.29
[11] GlobalOptions_0.1.2 vctrs_0.6.5
[13] minqa_1.2.8 base64enc_0.1-3
[15] htmltools_0.5.8.1 S4Arrays_1.8.1
[17] broom_1.0.8 SparseArray_1.8.0
[19] Formula_1.2-5 sass_0.4.10
[21] bslib_0.9.0 htmlwidgets_1.6.4
[23] plyr_1.8.9 cachem_1.1.0
[25] emmeans_1.11.2 lifecycle_1.0.4
[27] iterators_1.0.14 pkgconfig_2.0.3
[29] Matrix_1.7-3 R6_2.6.1
[31] fastmap_1.2.0 GenomeInfoDbData_1.2.14
[33] rbibutils_2.3 clue_0.3-66
[35] digest_0.6.37 numDeriv_2016.8-1.1
[37] colorspace_2.1-1 reshape_0.8.10
[39] patchwork_1.3.1 crosstalk_1.2.1
[41] Hmisc_5.2-3 labeling_0.4.3
[43] httr_1.4.7 abind_1.4-8
[45] mgcv_1.9-3 compiler_4.5.1
[47] withr_3.0.2 doParallel_1.0.17
[49] htmlTable_2.4.3 ConsensusClusterPlus_1.72.0
[51] backports_1.5.0 BiocParallel_1.42.1
[53] carData_3.0-5 psych_2.5.6
[55] MASS_7.3-65 DelayedArray_0.34.1
[57] rjson_0.2.23 tools_4.5.1
[59] foreign_0.8-90 nnet_7.3-20
[61] nlme_3.1-168 grid_4.5.1
[63] checkmate_2.3.2 cluster_2.1.8.1
[65] gtable_0.3.6 tidyr_1.3.1
[67] data.table_1.17.8 car_3.1-3
[69] XVector_0.48.0 ggrepel_0.9.6
[71] foreach_1.5.2 pillar_1.11.0
[73] stringr_1.5.1 limma_3.64.1
[75] logging_0.10-108 circlize_0.4.16
[77] splines_4.5.1 lattice_0.22-7
[79] tidyselect_1.2.1 ComplexHeatmap_2.24.1
[81] locfit_1.5-9.12 reformulas_0.4.1
[83] gridExtra_2.3 edgeR_4.6.3
[85] xfun_0.52 statmod_1.5.0
[87] DT_0.33 stringi_1.8.7
[89] UCSC.utils_1.4.0 yaml_2.3.10
[91] boot_1.3-31 evaluate_1.0.4
[93] codetools_0.2-20 tibble_3.3.0
[95] cli_3.6.5 rpart_4.1.24
[97] xtable_1.8-4 Rdpack_2.6.4
[99] jquerylib_0.1.4 Rcpp_1.1.0
[101] png_0.1-8 parallel_4.5.1
[103] lme4_1.1-37 mvtnorm_1.3-3
[105] lmerTest_3.1-3 scales_1.4.0
[107] purrr_1.0.4 crayon_1.5.3
[109] GetoptLong_1.0.5 rlang_1.1.6
[111] cowplot_1.2.0 mnormt_2.1.1