[paper] RNA-seq expression

by wycho 2021. 6. 9.

Differential Transcript Usage Analysis Incorporating Quantification Uncertainty via Compositional Measurement Error Regression Modeling

... Commonly used quantities for RNA-seq transcript isoform expression include Transcripts Per Million (TPM) (Wagner, Kin and Lynch, 2012), which provides a transcript-length and library size normalized estimate of expression for each transcript isoform within a sample. This correction facilitates comparison of abundance estimates across transcript isoforms within the same gene, as the estimated number of RNA-seq reads mapping to a gene or transcript isoform is often correlated with its length (Wagner, Kin and Lynch, 2012). 
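The normalization described in this excerpt can be sketched in a few lines. This is only an illustration of the usual TPM arithmetic, not code from the paper: the function name `tpm`, the toy counts, and the use of plain transcript lengths in kilobases (rather than the effective lengths that quantification tools typically report) are all assumptions made for the example.

```python
def tpm(counts, lengths_kb):
    """Convert per-transcript read counts to Transcripts Per Million.

    Assumes lengths are given in kilobases and at least one count is non-zero.
    """
    # Step 1: reads per kilobase, which corrects for transcript length.
    rpk = [c / length for c, length in zip(counts, lengths_kb)]
    # Step 2: rescale so the values within one sample sum to one million,
    # which corrects for library size.
    scale = sum(rpk) / 1e6
    return [r / scale for r in rpk]


# Toy sample (hypothetical numbers): counts grow in proportion to length,
# so after normalization all three transcripts get the same TPM.
counts = [100, 200, 300]
lengths_kb = [1.0, 2.0, 3.0]
values = tpm(counts, lengths_kb)
```

In this toy case the longer transcripts collect more reads only because they are longer, and the length correction removes that difference.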

 

Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2

... Often the goal of differential analysis is to produce a list of genes passing multiple-test adjustment, ranked by P-value. However, small changes, even if statistically highly significant, might not be the most interesting candidates for further investigation. Ranking by fold change, on the other hand, is complicated by the noisiness of LFC estimates for genes with low counts. Furthermore, the number of genes called significantly differentially expressed depends as much on the sample size and other aspects of experimental design as it does on the biology of the experiment - and well-powered experiments often generate an overwhelmingly long list of hits.

... For each gene, we fit a generalized linear model (GLM) as follows. We model read counts $K_{ij}$ as following a negative binomial distribution (sometimes also called a gamma-Poisson distribution) with mean $\mu_{ij}$ and dispersion $\alpha_i$.
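To make the mean and dispersion parameterization concrete, here is a minimal sketch of the negative binomial probability mass function written in terms of those two parameters. The helper names `nb_log_pmf` and `nb_variance` and the example values are hypothetical. The variance relation used, mean plus dispersion times mean squared, is the standard one for this parameterization. The sketch checks numerically that the distribution has the requested mean and that variance.

```python
import math


def nb_log_pmf(k, mu, alpha):
    """Log-probability of count k under a negative binomial with
    mean mu and dispersion alpha (alpha > 0)."""
    r = 1.0 / alpha        # "size" parameter of the gamma-Poisson mixture
    p = r / (r + mu)       # success probability in the classical NB form
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log(1.0 - p))


def nb_variance(mu, alpha):
    """Variance implied by the mean/dispersion parameterization."""
    return mu + alpha * mu ** 2


# Hypothetical example values for one gene in one sample.
mu, alpha = 10.0, 0.5

# Sum over a range long enough that the remaining tail mass is negligible.
probs = [math.exp(nb_log_pmf(k, mu, alpha)) for k in range(2000)]
total = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
```

As the dispersion approaches zero, the variance approaches the mean and the distribution approaches a Poisson, which is why the dispersion is read as the extra variability beyond Poisson counting noise.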

... We use GLMs with a logarithmic link, $\log_2 q_{ij} = \sum_r x_{jr} \beta_{ir}$, with design matrix elements $x_{jr}$ and coefficients $\beta_{ir}$.

... the GLM fit returns coefficients indicating the overall expression strength of the gene and the log2 fold change between treatment and control.
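The log2 link and the meaning of the two coefficients can be illustrated with a toy two-group design. This is a sketch under stated assumptions, not DESeq2 itself: the function name `fitted_q`, the design rows, and the coefficient values are hypothetical, and no model fitting is performed. It only shows how an intercept and a treatment coefficient map to expression levels and a log2 fold change.

```python
import math


def fitted_q(design_row, beta):
    """Apply the log2 link: log2(q) is the dot product of one sample's
    design row with the gene's coefficients."""
    log2_q = sum(x * b for x, b in zip(design_row, beta))
    return 2.0 ** log2_q


# Hypothetical coefficients for one gene:
# beta[0] = intercept (overall expression strength on the log2 scale),
# beta[1] = log2 fold change between treatment and control.
beta = [3.0, 1.0]

# Design rows: first column is the intercept, second marks treated samples.
control = [1, 0]
treated = [1, 1]

q_control = fitted_q(control, beta)
q_treated = fitted_q(treated, beta)
log2_fc = math.log2(q_treated / q_control)
```

With these values the treated sample has twice the expression of the control, so the recovered log2 fold change equals the second coefficient.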

 

 

 
