RNA Sequencing (RNA-seq): Definition & Applications

Table of contents

What is RNA-seq?
Understanding the Transcriptome
Types of RNA Sequencing
RNA-seq Workflow
Bioinformatics Pipeline
Applications in Life Sciences
Drug Discovery & Targets
Oncology & Precision Medicine
RNA-seq vs. Microarray
AI & ML in RNA-seq
Challenges & Considerations
How Excelra Supports RNA-seq
Conclusion
FAQ

QUICK DEFINITION

RNA Sequencing (RNA-seq) is a next-generation sequencing (NGS) method used to detect, sequence, and quantify the RNA transcripts present in a biological sample at a given moment. By converting cellular RNA into a stable complementary DNA (cDNA) library before high-throughput sequencing, RNA-seq delivers a largely unbiased, genome-wide view of transcriptional activity and alternative splicing.

Key takeaways

Dynamic Transcriptome Capture: Unlike static DNA profiling, RNA-seq records point-in-time changes in gene expression that reflect cellular responses to drugs, environments, or disease phenotypes.
Superiority Over Microarrays: Offers a wide digital dynamic range without probe cross-hybridization background, allowing unbiased, hypothesis-free identification of novel transcripts, low-abundance genes, and rare fusion events.
Granular Structural Insights: Identifies transcriptomic features including expressed single-nucleotide variants (SNVs), alternative splicing isoforms, long non-coding RNAs (lncRNAs), and, with suitable protocols, post-transcriptional modifications.
Downstream Bioinformatics Load: Demands robust analysis pipelines, including quality assessment (FastQC), splice-aware alignment (STAR, HISAT2), transcript assembly, and differential gene expression (DGE) analysis.

What is RNA Sequencing (RNA-seq)?

RNA sequencing (RNA-seq) is an NGS-based technique that captures and quantifies all RNA molecules present in a biological sample at a given point in time. RNA sequencing works by converting RNA into complementary DNA (cDNA), sequencing the resulting library at high throughput, and mapping the reads back to a reference transcriptome or genome to measure the abundance of every transcript detected. The result is a comprehensive, quantitative snapshot of gene expression — the transcriptome — across thousands of genes simultaneously.

Unlike earlier gene expression technologies such as microarrays, RNA sequencing is not limited to pre-defined probes. RNA-seq can detect any transcript present in a sample at sufficient abundance, including novel splice variants, fusion transcripts, non-coding RNAs, and genes not previously annotated, making it a discovery-first technology that is not constrained by existing annotation.

Since its first description in 2008, RNA sequencing has become the dominant transcriptomics technology in both academic and pharmaceutical research. It is used to understand how cells respond to disease, drugs, and environmental stimuli; to identify the molecular drivers of cancer and other complex diseases; and to discover and validate biomarkers for patient stratification and companion diagnostic development. RNA-seq is directly supported by Excelra’s bioinformatics services and transcriptomics capabilities.

Understanding the Transcriptome in RNA Sequencing

The transcriptome is the complete set of RNA molecules — transcripts — produced by the genome of a cell at a specific moment. While the genome is largely static (the same DNA sequence in every cell of an organism), the transcriptome is dynamic and cell-type specific: different cell types express different genes, and the same cell will express genes differently depending on its developmental stage, environmental conditions, disease state, or exposure to a drug.

The human transcriptome includes several classes of RNA that RNA sequencing can measure:

Messenger RNA (mRNA) — protein-coding transcripts that carry genetic instructions from DNA to ribosomes
Long non-coding RNA (lncRNA) — transcripts longer than 200 nucleotides with regulatory functions but no protein-coding capacity
MicroRNA (miRNA) — small ~22-nucleotide RNAs that post-transcriptionally regulate gene expression
Small interfering RNA (siRNA) — short double-stranded RNAs involved in RNA interference pathways
Circular RNA (circRNA) — covalently closed RNA molecules with emerging roles in gene regulation
Transfer RNA (tRNA) and ribosomal RNA (rRNA) — abundant structural/functional RNAs (usually depleted in RNA-seq libraries to enrich for other species)

By measuring all of these RNA species — or a selected subset — RNA sequencing provides an unparalleled window into cellular biology that no other single assay can match.

Types of RNA Sequencing

RNA sequencing has evolved into a family of related techniques, each optimized for different biological questions and experimental contexts. Understanding the differences is essential for selecting the right approach.

Bulk RNA Sequencing

Bulk RNA sequencing — the original and most widely used form — measures the average gene expression across all cells in a sample simultaneously. It is fast, cost-effective, and analytically mature, making it the default choice for comparing gene expression between experimental conditions (e.g., treated vs. untreated, disease vs. healthy). The main limitation is that bulk RNA-seq averages out cell-to-cell variability — a heterogeneous sample (such as a tumor) produces a composite expression profile that masks differences between cell types.

Single-Cell RNA Sequencing (scRNA-seq)

Single-cell RNA sequencing profiles gene expression in each individual cell, revealing the cellular heterogeneity hidden within a tissue or sample. Technologies such as 10x Genomics Chromium, Drop-seq, and Smart-seq2 enable profiling of thousands to hundreds of thousands of individual cells per experiment. scRNA-seq is transforming our understanding of tumor microenvironments, immune cell states, developmental biology, and drug resistance mechanisms. Excelra’s single-cell transcriptomics pipeline and our scRNA-seq analysis whitepaper detail our approach.

Spatial Transcriptomics

Spatial transcriptomics combines RNA sequencing with spatial coordinates, preserving the physical location of gene expression within a tissue section. Platforms such as 10x Genomics Visium, Slide-seq, and MERFISH map gene expression in situ — enabling analysis of tissue architecture, cell-cell communication, and regional gene expression gradients. This is particularly powerful for understanding tumor microenvironments, tissue zonation, and the spatial organization of immune infiltrates. See Excelra’s spatial transcriptomics solutions whitepaper for more.

Long-Read RNA Sequencing

Long-read RNA sequencing (using Oxford Nanopore or PacBio platforms) generates full-length transcript reads, enabling direct isoform characterization and phasing of alternative splicing events across the full transcript length. This is particularly valuable for resolving complex transcript structures that short-read RNA-seq cannot distinguish.

Targeted RNA Sequencing

Targeted RNA sequencing uses hybridization capture or amplicon-based enrichment to focus sequencing depth on a defined set of genes or transcripts. It delivers higher sensitivity for low-expressed targets and is more cost-efficient when the research question is focused on a specific gene set — such as an immune gene panel or a cancer fusion transcript panel.

RNA Sequencing Types Comparison
Type	Resolution	Cost	Best For
Bulk RNA-seq	Population average	Low–Medium	Differential expression, pathway analysis, biomarkers
scRNA-seq	Single cell	High	Cell heterogeneity, rare populations, cell state transitions
Spatial Transcriptomics	Spot/cell + spatial	Very High	Tissue architecture, tumor microenvironment, cell-cell interactions
Long-Read RNA-seq	Full isoform	High	Isoform characterization, novel splice variant discovery
Targeted RNA-seq	Gene panel	Low	Focused gene sets, fusion detection, clinical panels

Step-by-Step RNA Sequencing Workflow

The RNA sequencing workflow progresses from biological sample to interpreted results through five steps: four wet-lab stages that produce and sequence the library, followed by bioinformatics analysis. Quality failures at any step propagate through the entire pipeline, making rigorous QC at every stage essential.

1. Sample Collection & RNA Extraction

Total RNA is extracted from the biological material of interest — cell lines, fresh or frozen tissue, blood, FFPE sections, organoids, or single-cell suspensions. RNA stabilization during extraction is critical because RNA is highly susceptible to degradation by ubiquitous RNase enzymes. RNA quantity is measured by Qubit fluorometry and quality assessed by the RNA Integrity Number (RIN) score using Bioanalyzer or TapeStation. A RIN score ≥7 is generally required for high-quality RNA-seq library preparation; FFPE-derived RNA (typically RIN 2–4) requires specialized low-input or FFPE-optimized protocols.

2. RNA Selection & Enrichment

The RNA population of interest is selected or enriched before library preparation. Poly-A selection uses oligo-dT magnetic beads to capture polyadenylated mRNAs and is the method of choice when protein-coding gene expression is the focus. Ribosomal RNA depletion (ribo-depletion) removes the highly abundant rRNA species (which typically make up roughly 80–90% of total RNA) while retaining all other RNA types; it is necessary when lncRNAs, pre-mRNAs, or total transcriptome profiling is desired. For small RNA sequencing (miRNA, siRNA), size selection is used instead.

3. cDNA Library Preparation

Selected RNA is reverse-transcribed into complementary DNA (cDNA) by reverse transcriptase, typically using random hexamers or oligo-dT primers. After second-strand synthesis, the cDNA is fragmented to the target size (~200–300 bp for short-read platforms), end-repaired, A-tailed, and ligated to indexed sequencing adapters; many kits instead fragment the RNA before reverse transcription. Strand-specific (directional) library preparation, which preserves the sense/antisense orientation of each transcript, is strongly recommended because it improves transcript quantification accuracy and reduces ambiguity in antisense and overlapping gene regions.

4. High-Throughput Sequencing

The prepared cDNA library is loaded onto an NGS sequencer. Illumina platforms (NovaSeq, NextSeq) are most widely used for bulk RNA-seq, generating 2×75 bp or 2×150 bp paired-end reads. Standard bulk RNA-seq requires 20–50 million paired-end reads per sample for well-annotated organisms; deeper sequencing (50–100M reads) improves detection of lowly expressed genes and enables novel transcript discovery. For scRNA-seq, 10x Genomics Chromium is the most common platform for cell capture and library preparation; the resulting libraries are sequenced on short-read instruments to ~20,000–50,000 reads per cell across thousands to millions of cells.
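The depth figures above translate into a simple read budget when planning a run. The following is a minimal Python sketch; the function names and the example sample and cell counts are illustrative assumptions, not recommendations.

```python
def bulk_read_budget(n_samples: int, reads_per_sample: float) -> float:
    """Total paired-end reads needed for a bulk RNA-seq run."""
    return n_samples * reads_per_sample


def single_cell_read_budget(n_cells: int, reads_per_cell: float) -> float:
    """Total reads needed for one scRNA-seq library."""
    return n_cells * reads_per_cell


if __name__ == "__main__":
    # Hypothetical study: 24 bulk samples at 30 million reads each.
    bulk = bulk_read_budget(24, 30e6)
    # Hypothetical library: 10,000 cells at 20,000 reads per cell.
    single_cell = single_cell_read_budget(10_000, 20_000)
    print(f"bulk: {bulk / 1e6:.0f}M reads")
    print(f"single-cell: {single_cell / 1e6:.0f}M reads")
```

Comparing these totals with the output of the chosen flow cell indicates how many samples or libraries can share one run.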

5. Bioinformatics Analysis & Interpretation

Raw sequencing reads are processed through an RNA-seq bioinformatics pipeline for QC, alignment, quantification, differential expression analysis, and pathway interpretation. This is described in detail in the RNA Sequencing Bioinformatics Pipeline section below.

RNA Sequencing Bioinformatics Pipeline

RNA sequencing bioinformatics analysis transforms raw sequencing reads into biologically interpretable results. Excelra’s bioinformatics team and Bulk Transcriptomics Pipeline on OP² deliver validated, scalable RNA-seq workflows configurable for bulk, single-cell, and spatial transcriptomics.

Read Quality Control & Trimming

Raw FASTQ files are assessed with FastQC for per-base quality scores, adapter content, GC content distribution, and duplication levels. Adapter sequences and low-quality bases are removed using Trimmomatic, fastp, or Cutadapt. RNA-seq-specific QC metrics, such as 5′/3′ bias (indicating RNA degradation), ribosomal RNA contamination rate, and exon/intron ratio, are evaluated to confirm library quality; several of these are computed from aligned reads.
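To make the quality scores concrete, here is a minimal Python sketch of decoding Phred+33 qualities and trimming low-quality bases from the 3′ end of a read. It is a deliberately simplified illustration: Trimmomatic and Cutadapt use more refined algorithms (sliding windows and running sums), and the function names and the Q20 threshold are assumptions made for this example.

```python
def phred_scores(quality_line: str, offset: int = 33) -> list:
    """Decode a FASTQ quality string (Phred+33 by default) into integer scores."""
    return [ord(ch) - offset for ch in quality_line]


def mean_quality(quality_line: str) -> float:
    """Average Phred score of a read; 0.0 for an empty read."""
    scores = phred_scores(quality_line)
    return sum(scores) / len(scores) if scores else 0.0


def trim_3prime(seq: str, qual: str, min_q: int = 20) -> tuple:
    """Remove bases from the 3' end while their quality is below min_q."""
    scores = phred_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], qual[:end]


if __name__ == "__main__":
    # Toy read: six bases at Q40 ('I') followed by two at Q2 ('#').
    seq, qual = "ACGTACGT", "IIIIII##"
    print(mean_quality(qual))
    print(trim_3prime(seq, qual))
```

A read that falls below a minimum length after trimming would normally be discarded, together with its mate in paired-end data.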

Alignment to Reference Genome/Transcriptome

Trimmed reads are aligned to the reference genome (GRCh38 for human) using splice-aware aligners. STAR (Spliced Transcripts Alignment to a Reference) is the de facto standard for bulk RNA-seq, handling millions of reads rapidly and accurately mapping splice junctions; HISAT2 is an alternative with lower memory requirements. Alternatively, Salmon and kallisto skip full genome alignment and use lightweight mapping (pseudo-alignment) against the transcriptome for ultra-fast transcript-level quantification. Post-alignment QC checks on-target rate, duplication rate, and strand specificity.

Read Counting & Quantification

Aligned reads are assigned to genomic features (genes, transcripts, exons) using counting tools such as featureCounts or HTSeq. For transcript-level quantification, pseudo-alignment tools (Salmon, kallisto) report estimated counts and transcripts-per-million (TPM) values directly; counts-per-million (CPM) values are derived from counts downstream. The choice between gene-level and transcript-level quantification depends on whether the study requires differential gene expression or differential transcript usage (alternative splicing) analysis.
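The two units mentioned above differ in whether feature length is taken into account. A minimal Python sketch, assuming a plain list of counts and feature lengths in base pairs (the function names and toy numbers are illustrative):

```python
def cpm(counts):
    """Counts per million: scale each count by library size only."""
    total = sum(counts)
    return [c / total * 1e6 for c in counts]


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    # Reads per kilobase of feature length.
    rates = [c / (length / 1000) for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    return [r / scale * 1e6 for r in rates]


if __name__ == "__main__":
    counts = [100, 200, 700]
    lengths = [1000, 2000, 7000]  # counts are proportional to length here
    print(cpm(counts))           # differs between features
    print(tpm(counts, lengths))  # identical once length is accounted for
```

In the toy data the three features have the same read density, so their TPM values are equal even though their CPM values differ. TPM values always sum to one million per sample.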

Differential Expression Analysis

Differential gene expression (DGE) analysis identifies genes whose expression levels differ significantly between conditions (e.g., treated vs. control, disease vs. healthy). The two most widely used tools are DESeq2 and edgeR. Both fit negative binomial models to raw counts and generally give similar results, so the choice often comes down to experimental design and familiarity. Results are reported as log2 fold changes and adjusted p-values (Benjamini-Hochberg FDR correction), and are visualized through volcano plots, heatmaps, and MA plots. This directly supports Excelra’s data science and biomarker workflows.
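The two reported quantities can be illustrated in a few lines of Python. This sketch is not how DESeq2 or edgeR work internally (both estimate dispersions and fit negative binomial models); it only shows a pseudocount-based log2 fold change and the Benjamini-Hochberg adjustment, with function names and the pseudocount chosen for illustration.

```python
import math


def log2_fold_change(mean_treated, mean_control, pseudocount=1.0):
    """log2 ratio of mean expression; the pseudocount avoids division by zero."""
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    print(log2_fold_change(7, 1))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

Genes are then typically called significant when the adjusted p-value is below a chosen FDR threshold (0.05 is a common convention) and the absolute log2 fold change exceeds a study-specific cutoff.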

Pathway & Functional Enrichment Analysis

Lists of differentially expressed genes are interpreted in the context of biological pathways and processes using Gene Set Enrichment Analysis (GSEA), clusterProfiler, or Ingenuity Pathway Analysis (IPA). Pathway enrichment reveals which biological processes, signaling cascades, or molecular functions are most significantly altered between conditions — connecting differential gene expression to mechanism of action, disease biology, or drug effect.
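One common form of enrichment testing is over-representation analysis, which asks whether a pathway contains more differentially expressed genes than expected by chance. The Python sketch below uses a one-sided hypergeometric test, assuming all genes are equally likely to be selected; the function and parameter names are illustrative. GSEA itself is a rank-based method and works differently.

```python
from math import comb


def enrichment_p_value(n_universe, n_pathway, n_de, n_overlap):
    """P(overlap >= n_overlap) under the hypergeometric distribution.

    n_universe: genes tested in the experiment
    n_pathway:  genes in the universe annotated to the pathway
    n_de:       differentially expressed genes
    n_overlap:  differentially expressed genes that are in the pathway
    """
    upper = min(n_pathway, n_de)
    total = comb(n_universe, n_de)
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 10 genes tested, 4 in the pathway, 3 DE genes, 2 of them in the pathway.
    print(enrichment_p_value(10, 4, 3, 2))
```

Because many pathways are tested at once, the resulting p-values need multiple-testing correction before interpretation.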

Single-Cell RNA-seq Analysis

scRNA-seq data requires a specialized analytical pipeline. Raw count matrices from 10x Genomics Cell Ranger are processed in Seurat (R) or Scanpy (Python) for quality filtering (removing low-quality cells and doublets), normalization, highly variable gene selection, dimensionality reduction (PCA, UMAP), unsupervised clustering, and cell type annotation. Trajectory analysis (Monocle, PAGA) infers developmental or differentiation pseudotime. Cell-cell interaction analysis (CellChat, NicheNet) models ligand-receptor signaling between cell populations — critical for understanding tumor microenvironments and immune cell crosstalk.
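The first two steps of that pipeline, quality filtering and normalization, can be sketched in plain Python on a small cells-by-genes count matrix. This mirrors in spirit the common "scale each cell to 10,000 counts, then log1p" approach; the function names, thresholds, and toy matrix are assumptions for illustration, and real analyses would use Seurat or Scanpy on sparse matrices.

```python
import math


def filter_cells(matrix, min_genes=2):
    """Keep cells (rows) in which at least min_genes genes have a nonzero count."""
    return [cell for cell in matrix if sum(1 for c in cell if c > 0) >= min_genes]


def log_normalize(matrix, scale=10_000):
    """Scale each cell to a common total count, then apply log(1 + x)."""
    normalized = []
    for cell in matrix:
        total = sum(cell)
        if total == 0:
            # An empty cell stays all-zero rather than dividing by zero.
            normalized.append([0.0] * len(cell))
            continue
        normalized.append([math.log1p(c / total * scale) for c in cell])
    return normalized


if __name__ == "__main__":
    counts = [
        [0, 0, 5, 0],     # only one gene detected: removed by the filter
        [10, 0, 30, 60],
        [1, 1, 0, 2],
    ]
    kept = filter_cells(counts, min_genes=2)
    print(log_normalize(kept))
```

Zero counts remain zero after this transformation, which preserves the sparsity that single-cell tools rely on.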

Key Applications of RNA Sequencing in Life Sciences

RNA sequencing has broad applications across basic research, translational medicine, and drug development. Its ability to measure the complete transcriptome without prior hypotheses makes RNA-seq a discovery-first technology relevant to virtually every area of biology.

Disease Mechanism Characterization

RNA sequencing is used extensively to characterize the molecular mechanisms underlying disease — identifying dysregulated genes, pathways, and networks that distinguish diseased from healthy tissue. In complex diseases such as cancer, neurodegeneration, and autoimmune disorders, RNA-seq of patient samples reveals the transcriptional landscape of disease in unprecedented detail, generating hypotheses for drug target identification and therapeutic intervention.

Biomarker Discovery & Patient Stratification

Gene expression signatures derived from RNA sequencing are powerful biomarkers for patient stratification, disease subtyping, and treatment response prediction. Established multi-gene expression assays in breast cancer, such as Oncotype DX (RT-PCR-based) and MammaPrint (originally microarray-based, and FDA-cleared), show how expression signatures are used in clinical practice, although both predate routine RNA-seq. Excelra’s biomarker discovery capabilities integrate RNA-seq with clinical data to develop and validate expression-based biomarkers.

Immunology & Immune Profiling

RNA sequencing is central to immunology research — profiling immune cell populations, cytokine expression, T cell receptor (TCR) and B cell receptor (BCR) diversity, and immune checkpoint pathway activation. scRNA-seq of peripheral blood mononuclear cells (PBMCs) or tumor-infiltrating lymphocytes (TILs) maps the immune landscape in disease and in response to immunotherapy, directly informing immuno-oncology drug development programs. Excelra’s immunomics pipeline on OP² is purpose-built for this application.

Agrigenomics & Animal Health

RNA sequencing is used in plant and animal biology to study gene expression responses to stress, disease, pesticide treatment, and environmental change. In animal health, RNA-seq profiles host transcriptional responses to pathogens, informs vaccine development, and supports the discovery of novel therapeutic targets in veterinary medicine. Excelra’s animal health and agrigenomics teams apply RNA-seq across these verticals.

RNA Sequencing in Drug Discovery & Target Identification

RNA sequencing has become an essential tool in modern drug discovery — providing the transcriptomic data needed to identify novel targets, characterize compound mechanism of action (MoA), and optimize the drug development pipeline from early discovery through clinical trials.

Drug Target Identification

Differential RNA-seq analysis comparing disease versus healthy tissue, or genetic models of disease (e.g., CRISPR knockouts, patient-derived organoids) versus controls, identifies genes that are consistently dysregulated in disease. Genes that are specifically upregulated in disease and encode druggable protein classes (receptors, kinases, transporters, enzymes) are prioritized as candidate drug targets. Integration of RNA-seq data with genetic association data (such as GWAS results) strengthens target credentialing by linking transcriptional dysregulation to genetic evidence of causality.

Mechanism of Action (MoA) Profiling

Transcriptome-wide RNA sequencing of cells treated with a compound, compared to vehicle-treated controls, reveals the transcriptional MoA of a drug. This identifies on-target pathway activation, off-target effects, secondary gene expression changes, and compensatory responses. Large-scale MoA profiling resources such as the LINCS L1000 project have profiled thousands of compounds across many cell lines using a reduced-representation expression assay (about 1,000 measured landmark genes, with the rest inferred) rather than RNA-seq, creating a reference atlas for drug mechanism inference.

Toxicogenomics

RNA sequencing of hepatocytes, cardiomyocytes, or other safety-relevant tissues treated with drug candidates reveals transcriptional signatures of toxicity before phenotypic adverse effects are apparent. Toxicogenomics RNA-seq can predict hepatotoxicity, cardiotoxicity, and nephrotoxicity risk in preclinical studies — enabling earlier compound selection decisions that reduce late-stage attrition. This connects to Excelra’s clinical data services and preclinical data curation capabilities.

Drug Repurposing via Transcriptomics

Transcriptome-based drug repurposing uses RNA-seq disease expression signatures to identify approved drugs whose transcriptional MoA is the inverse of the disease signature — suggesting they could reverse the disease phenotype. The Connectivity Map (CMap) approach, pioneered at the Broad Institute, formalizes this strategy. Excelra’s drug repurposing services integrate RNA-seq with GOSTAR bioactivity data for multi-evidence repurposing workflows.
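The idea of an "inverse" signature can be illustrated with a simple correlation between disease and drug log2 fold changes over their shared genes. This Python sketch is a simplified proxy: the Connectivity Map uses a rank-based enrichment statistic rather than Pearson correlation, and the function name and gene identifiers here are hypothetical.

```python
import math


def reversal_score(disease_lfc, drug_lfc):
    """Pearson correlation of log2 fold changes over shared genes.

    Values near -1 suggest the drug signature opposes the disease signature.
    """
    genes = sorted(set(disease_lfc) & set(drug_lfc))
    if len(genes) < 2:
        raise ValueError("need at least two shared genes")
    x = [disease_lfc[g] for g in genes]
    y = [drug_lfc[g] for g in genes]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("signature has no variation across shared genes")
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Hypothetical signatures keyed by placeholder gene names.
    disease = {"GENE_A": 2.0, "GENE_B": -1.0, "GENE_C": 0.5}
    drug = {"GENE_A": -2.0, "GENE_B": 1.0, "GENE_C": -0.5, "GENE_D": 3.0}
    print(reversal_score(disease, drug))
```

Ranking many drug signatures by such a score gives a shortlist of candidates, which then needs orthogonal evidence before any repurposing claim is made.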

RNA Sequencing in Oncology & Precision Medicine

Cancer genomics has been transformed by RNA sequencing. While DNA-based methods (WGS, WES) reveal what mutations are present, RNA sequencing reveals what those mutations do — how they alter gene expression, activate oncogenic pathways, and remodel the tumor microenvironment.

Tumor Transcriptome Profiling & Subtyping

RNA sequencing enables molecular subtyping of tumors beyond histology, revealing transcriptional subtypes with different prognoses, driver pathways, and treatment sensitivities. In breast cancer, expression-based intrinsic subtypes (Luminal A, Luminal B, HER2-enriched, Basal-like) help inform adjuvant chemotherapy decisions. In colorectal cancer, Consensus Molecular Subtypes (CMS1–CMS4) derived from bulk gene expression data are associated with differences in prognosis and in response to targeted therapies and immunotherapy.

Fusion Transcript Detection

Cancer genomes frequently harbor chromosomal rearrangements that create oncogenic fusion genes — such as BCR-ABL in CML and EML4-ALK in NSCLC. RNA sequencing detects these fusion transcripts with high sensitivity through tools such as STAR-Fusion, Arriba, and FusionCatcher, enabling companion diagnostic-grade fusion detection from RNA-seq data as an alternative or complement to DNA-based fusion panels.

Tumor Microenvironment (TME) Analysis

Single-cell RNA sequencing of tumor samples maps the complete cellular landscape of the tumor microenvironment — characterizing tumor cell states, cancer-associated fibroblasts (CAFs), tumor-associated macrophages (TAMs), T cell exhaustion states, and natural killer cell activity. This TME characterization informs immunotherapy response prediction and identifies novel combination therapy strategies. Bioinformatic deconvolution tools (CIBERSORT, TIMER2.0) enable TME cell fraction estimation from bulk RNA-seq data when single-cell profiling is not feasible.

Neoantigen Prediction & Immunotherapy Support

RNA-seq data — particularly HLA typing from RNA-seq reads and tumor neoantigen expression confirmation — supports neoantigen vaccine design and CAR-T cell therapy development. By confirming that predicted neoantigens are actually expressed in tumor RNA (and not silenced), RNA-seq improves the precision of immunotherapy target selection.

RNA-seq vs. Microarray: Key Differences

RNA sequencing has largely superseded microarrays as the gene expression technology of choice, but understanding the differences is important for interpreting legacy datasets and deciding when microarrays may still be appropriate.

RNA Sequencing vs. Microarray Comparison
Feature	RNA Sequencing (RNA-seq)	Microarray
Dynamic Range	Very high — no saturation ceiling	Limited — saturates at high expression
Novel Transcript Detection	Yes — detects any transcript in sample	No — limited to pre-defined probe set
Alternative Splicing	Yes — full isoform resolution	Limited — exon arrays only
Non-coding RNA Detection	Yes — lncRNA, miRNA, circRNA	Limited — requires specific array designs
Background Noise	Low	Higher — cross-hybridization artefacts
Cost	Moderate–High (falling rapidly)	Lower for large cohorts
Bioinformatics Complexity	Higher — requires alignment and counting	Lower — established pipelines
Best Used When	Discovery, novel biology, isoforms, ncRNA	Very large cohorts, known gene sets, legacy compatibility

AI & Machine Learning in RNA Sequencing Analysis

The scale and complexity of RNA sequencing datasets — particularly single-cell data spanning millions of cells and thousands of genes — have driven rapid adoption of AI and machine learning tools across the RNA-seq analysis spectrum.

Deep Learning for Cell Type Annotation

Automated cell type annotation using machine learning models (scANVI, CellTypist, scBERT) addresses one of the most labor-intensive steps in scRNA-seq analysis. These models are trained on large reference atlases (Human Cell Atlas, Tabula Sapiens) and can classify cell types in new datasets with far less manual expert review, accelerating analysis timelines, although predictions for rare or novel populations still benefit from expert checking.

Foundation Models for Transcriptomics

Large-scale foundation models trained on millions of single-cell transcriptomes, such as Geneformer and scGPT, have been applied to gene perturbation prediction, cell state classification, and drug response inference from RNA-seq data, in some cases without task-specific training. These models represent a significant shift in how RNA sequencing data is analyzed and interpreted, and are directly relevant to Excelra’s AI/ML capabilities.

Predictive Biomarker Modeling

Machine learning models trained on RNA-seq gene expression data are being used to predict clinical outcomes, including drug response, disease progression, and survival. Regularized regression (LASSO, elastic net), random forests, and gradient boosting models identify sparse gene expression signatures that can outperform individual biomarkers in predicting complex clinical phenotypes.

Challenges & Considerations in RNA Sequencing

RNA Degradation & Sample Quality

RNA is inherently unstable and degrades rapidly after sample collection if not properly stabilized. FFPE-derived RNA is particularly challenging — formalin crosslinking fragments RNA and introduces chemical modifications that reduce library complexity and increase artefactual variants. Rigorous pre-analytical standardization of sample collection, stabilization, and storage is essential for reliable RNA-seq results.

Batch Effects

RNA-seq data generated across different experiments, laboratories, time points, or sequencing runs can contain systematic technical variation (batch effects) that confounds biological interpretation. Batch correction methods (ComBat, limma removeBatchEffect, Harmony for scRNA-seq) are essential when integrating multi-batch RNA-seq datasets — such as combining publicly available datasets with in-house data for meta-analysis or biomarker discovery.

Normalization & Statistical Modeling

Choosing the appropriate normalization method and statistical model for differential expression analysis significantly affects results. DESeq2’s size-factor normalization is preferred for most experimental designs; TMM normalization (edgeR) is better for some library composition scenarios. For scRNA-seq, sparse count matrices require specialized normalization approaches (SCTransform, scran) that account for the high proportion of zero counts (dropout) characteristic of single-cell data.
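The size-factor idea can be sketched with the median-of-ratios method: each sample's counts are compared to a per-gene geometric mean across samples, and the median ratio becomes that sample's size factor. The Python below is a simplified illustration of the approach on a small genes-by-samples matrix, with an illustrative function name, and is not a replacement for DESeq2's own implementation.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix."""
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        # Genes with a zero in any sample have no finite geometric mean; skip them.
        if all(c > 0 for c in gene):
            log_geo_mean = sum(math.log(c) for c in gene) / n_samples
            for j, c in enumerate(gene):
                log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has nonzero counts in every sample")
    return [math.exp(median(r)) for r in log_ratios]


if __name__ == "__main__":
    # Toy data: the second sample was sequenced twice as deeply as the first.
    counts = [[10, 20], [100, 200], [0, 5], [30, 60]]
    factors = size_factors(counts)
    print(factors)
    # Dividing each column by its size factor puts samples on a common scale.
    print([[c / f for c, f in zip(gene, factors)] for gene in counts])
```

Using the median makes the estimate robust to a minority of strongly differentially expressed genes, which is why it behaves better than simple total-count scaling when a few genes dominate the library.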

Data Management & FAIR Compliance

Large-scale RNA-seq studies generate terabytes of data that must be stored, managed, and shared in ways that preserve long-term usability. Applying FAIR data principles — ensuring data is Findable, Accessible, Interoperable, and Reusable — and using robust Scientific Data Management Systems (SDMS) are essential for organizations running multi-study RNA-seq programs.

How Excelra Supports RNA Sequencing Projects

Excelra offers comprehensive RNA sequencing bioinformatics capabilities — from raw FASTQ processing to clinical-grade biomarker reporting — delivered through validated, scalable pipelines and expert computational biology support.

Bulk RNA-seq Pipeline — validated end-to-end workflow (QC → alignment → counting → DGE → pathway analysis) on Excelra’s OP² Bulk Transcriptomics Pipeline, deployable on AWS, Azure, or GCP
scRNA-seq Analysis — Seurat/Scanpy-based single-cell pipelines including clustering, cell type annotation, trajectory analysis, and cell-cell interaction modeling via the OP² Single-Cell Transcriptomics Pipeline
Spatial Transcriptomics — 10x Visium and emerging spatial platforms analysis; see our spatial transcriptomics whitepaper and our spatial transcriptomics target discovery case study
Multi-Omics Integration — integration of RNA-seq with WGS, WES, ChIP-seq, proteomics, and clinical data for multi-omics analysis
Biomarker Discovery — machine learning-driven gene expression biomarker identification, validation, and companion diagnostic support; see our PDX RNA-seq analysis case study
FAIR-Compliant Data Management — SDMS integration and RNA-seq data lake design; our RNA-seq Nextflow pipeline modernization case study demonstrates this capability

Conclusion

RNA sequencing has fundamentally changed how we understand biology. By providing a comprehensive, quantitative view of the entire transcriptome — which genes are active, how active they are, how they are spliced, and how expression changes between conditions — RNA-seq has become the foundational technology for modern drug discovery, translational genomics, and precision medicine.

From identifying novel drug targets through differential expression analysis, to profiling tumor microenvironments at single-cell resolution, to mapping gene expression in its native spatial context — RNA sequencing now spans a family of powerful techniques that are reshaping what is possible in life sciences research. The emergence of AI-powered analysis tools — from deep learning cell type classifiers to transcriptomic foundation models — is further accelerating what RNA-seq data can reveal, turning terabytes of sequence reads into actionable biological and clinical intelligence.

For organizations running RNA sequencing programs — whether in drug discovery, clinical genomics, biomarker development, or agricultural research — the quality of bioinformatics analysis is as important as the quality of the sequencing itself. Validated RNA sequencing pipelines, rigorous QC frameworks, expert computational biology, and FAIR-aligned data management are all essential to realizing the full scientific and commercial value of transcriptomic data. Excelra’s end-to-end RNA sequencing bioinformatics capabilities — delivered through the OP² Online Pipeline Platform and a team of expert computational biologists — are built to support RNA-seq programs at any scale and for any application.

FAQ

What is RNA sequencing (RNA-seq)?

RNA sequencing (RNA-seq) is an NGS-based technique that captures and quantifies all RNA molecules in a biological sample, measuring gene expression levels, detecting alternative splicing, identifying novel transcripts, and quantifying non-coding RNAs. It provides a comprehensive snapshot of cellular activity at a given time and condition.

What is the difference between bulk RNA-seq and single-cell RNA-seq?

Bulk RNA-seq measures average gene expression across all cells in a sample — ideal for comparing expression between conditions. Single-cell RNA-seq (scRNA-seq) profiles each individual cell, revealing cell-type heterogeneity, rare populations, and cell state transitions that bulk averages out. The choice depends on whether population-level or single-cell resolution is needed.

What are the main applications of RNA-seq in drug discovery?

RNA-seq is used for: novel drug target identification through differential expression; compound mechanism of action (MoA) profiling; toxicogenomics; biomarker discovery for patient stratification; and transcriptome-based drug repurposing using Connectivity Map approaches.

What bioinformatics tools are used for RNA-seq analysis?

Key tools: FastQC/Trimmomatic (QC); STAR/HISAT2/Salmon (alignment/quantification); featureCounts/HTSeq (counting); DESeq2/edgeR (differential expression); GSEA/clusterProfiler (pathway enrichment); Seurat/Scanpy (scRNA-seq). Pipeline orchestration uses Nextflow or Snakemake.

How many reads are needed for RNA-seq?

20–50M paired-end reads per sample for standard bulk RNA-seq DGE studies; 50–100M for novel transcript discovery or low-expression genes; ~20,000–50,000 reads per cell for scRNA-seq. Depth depends on the organism, sample type, and biological question.

What is the difference between RNA-seq and microarray for gene expression?

RNA-seq has no dynamic range ceiling, detects novel transcripts and splice variants, measures non-coding RNAs, and has lower background noise. Microarrays are faster and cheaper for large cohorts focused on known gene sets. RNA-seq has largely replaced microarrays for most research applications.

What is spatial transcriptomics and how does it differ from RNA-seq?

Spatial transcriptomics maps gene expression to physical locations within a tissue section — combining RNA sequencing with spatial coordinates. Unlike bulk RNA-seq (no spatial info) or scRNA-seq (no tissue context), spatial transcriptomics preserves tissue architecture, enabling analysis of TME organization, cell-cell interactions, and regional expression patterns.
