In a eukaryotic cell, the mRNA population constitutes approximately 1% of total RNA, with the number of distinct transcripts varying from several thousand to several tens of thousands. Normally, the high-abundance transcripts (several thousand mRNA copies per cell) of as few as 5-10 genes account for 20% of the cellular mRNA. The intermediate-abundance transcripts (several hundred copies per cell) of 500-2000 genes constitute about 40-60% of the cellular mRNA. The remaining 20-40% of mRNA is represented by rare transcripts (from one to several dozen mRNA copies per cell) (Alberts et al., 1994). Such an enormous difference in abundance complicates large-scale transcriptome analysis, because the more abundant cDNAs are sequenced again and again.

cDNA normalization decreases the prevalence of high-abundance transcripts and equalizes transcript concentrations in a cDNA sample, thereby dramatically increasing the efficiency of sequencing and rare gene discovery.
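
The effect on sequencing effort can be illustrated with a rough calculation. The sketch below estimates how many randomly picked clones must be sequenced to observe a given transcript at least once, first in a non-normalized sample and then in an idealized, perfectly equalized one. The pool size (300,000 mRNA molecules per cell) and the number of distinct transcripts (12,000) are hypothetical round numbers chosen only for illustration, and the function name `clones_needed` is likewise an assumption; real normalization never equalizes transcripts perfectly.

```python
import math


def clones_needed(fraction, confidence=0.95):
    """Smallest number n of randomly picked clones such that a transcript
    making up `fraction` of the sample is seen at least once with
    probability >= `confidence`, i.e. 1 - (1 - fraction)**n >= confidence.
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must be between 0 and 1 (exclusive)")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1 (exclusive)")
    return math.ceil(math.log(1 - confidence) / math.log(1 - fraction))


# Hypothetical illustration values (assumptions, not measured data).
TOTAL_MRNA_PER_CELL = 300_000   # assumed size of the mRNA pool
DISTINCT_TRANSCRIPTS = 12_000   # assumed number of different transcripts

# A rare transcript present at one copy per cell, before normalization.
rare_fraction_before = 1 / TOTAL_MRNA_PER_CELL

# Idealized normalization: every transcript has the same concentration.
rare_fraction_after = 1 / DISTINCT_TRANSCRIPTS

before = clones_needed(rare_fraction_before)
after = clones_needed(rare_fraction_after)

print(f"Clones needed before normalization: {before}")
print(f"Clones needed after idealized normalization: {after}")
print(f"Approximate reduction in sequencing effort: {before / after:.0f}-fold")
```

Under these assumptions the required sequencing depth drops by roughly the ratio of the two fractions, which is why equalizing transcript concentrations makes rare gene discovery so much more efficient.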

Normalization is used to enhance the gene discovery rate of a cDNA library and to facilitate the identification and analysis of rare transcripts. This approach is essential for transcriptome sequencing and is useful in other applications, such as functional screening, construction of specific RNA libraries, and Transcript End Sequence Profiling.

cDNA normalization using duplex-specific nuclease (DSN) is a highly efficient approach that can be applied to full-length-enriched cDNA (Zhulidov et al., 2004; Zhulidov et al., 2005). In the resulting cDNA, the abundance of different transcripts is equalized. This cDNA can be used for construction of cDNA libraries and for direct sequencing, including high-throughput sequencing on next-generation sequencing platforms (Roche/454, ABI/SOLiD or Illumina/Solexa).