Statistical analysis of real-time PCR data
- Methodology article
- Open Access
- Published in BMC Bioinformatics, volume 7, Article number: 85
Abstract
Background
Even though real-time PCR has been broadly applied in biomedical sciences, data processing procedures for the analysis of quantitative real-time PCR are still lacking; specifically in the realm of appropriate statistical treatment. Confidence interval and statistical significance considerations are not explicit in many of the current data analysis approaches. Based on the standard curve method and other useful data analysis methods, we present and compare four statistical approaches and models for the analysis of real-time PCR data.
Results
In the first approach, a multiple regression analysis model was developed to derive ΔΔCt from estimation of the interaction of gene and treatment effects. In the second approach, an ANCOVA (analysis of covariance) model was proposed, and the ΔΔCt can be derived from analysis of the effects of variables. The other two models involve calculation of ΔCt followed by a two-group t-test and its non-parametric analogue, the Wilcoxon test. SAS programs were developed for all four models, and the output for analysis of a sample data set is presented. In addition, a data quality control model was developed and implemented using SAS.
Conclusion
Practical statistical solutions with SAS programs were developed for real-time PCR data and a sample dataset was analyzed with the SAS programs. The analysis using the various models and programs yielded similar results. Data quality control and analysis procedures presented here provide statistical elements for the estimation of the relative expression of genes using real-time PCR.
Background
Real-time PCR is one of the most sensitive and reliably quantitative methods for gene expression analysis. It has been broadly applied to microarray verification, pathogen quantification, cancer quantification, transgenic copy number determination and drug therapy studies [1–4]. A PCR has three phases: the exponential phase, the linear phase and the plateau phase, as shown in Figure 1. The exponential phase is the earliest segment of the PCR, in which product increases exponentially because the reagents are not limiting. The linear phase is characterized by a linear increase in product as PCR reagents become limiting. The PCR eventually reaches the plateau phase during later cycles, when the amount of product no longer changes because some reagents are depleted. Real-time PCR exploits the fact that, under ideal conditions, the quantity of PCR product in the exponential phase is proportional to the quantity of initial template [5, 6]. During the exponential phase, the PCR product ideally doubles during each cycle if efficiency is perfect, i.e. 100%. It is possible to bring the PCR amplification efficiency close to 100% in the exponential phase if the PCR conditions, primer characteristics, template purity, and amplicon lengths are optimal.
Both genomic DNA and reverse transcribed cDNA can be used as templates for real-time PCR. The dynamics of PCR are typically observed through DNA binding dyes like SYBR green or DNA hybridization probes such as molecular beacons (Stratagene) or TaqMan probes (Applied Biosystems) [2]. The basis of real-time PCR is a direct positive association between the dye signal and the number of amplicons. As shown in Figure 1B and 1C, the plot of the base 2 logarithm of the fluorescence signal versus cycle number yields a linear range in which the logarithm of the fluorescence signal correlates with the original template amount. A baseline and a threshold can then be set for further analysis. The cycle number at the threshold level of log-based fluorescence is defined as the Ct number, which is the observed value in most real-time PCR experiments and therefore the primary statistical metric of interest.
Real-time PCR data can be quantified absolutely or relatively. Absolute quantification employs an internal or external calibration curve to derive the input template copy number. Absolute quantification is important when the exact transcript copy number needs to be determined; however, relative quantification is sufficient for most physiological and pathological studies. Relative quantification relies on two comparisons: the expression of a target gene versus a reference gene, and the expression of the same gene in the target sample versus reference samples [7].
Since relative quantification is the goal of most real-time PCR experiments, several data analysis procedures have been developed. Two mathematical models are very widely applied: the efficiency-calibrated model [7, 8] and the ΔΔCt model [9]. The experimental systems for both models are similar. The experiment involves a control sample and a treatment sample. For each sample, a target gene and a reference gene (the internal control) are amplified from serially diluted aliquots. Typically, several replicates are used at each dilution to derive the amplification efficiency. PCR amplification efficiency can be defined either as a percentage (from 0 to 1) or as the fold increase of PCR product per cycle (from 1 to 2). Unless specified as percentage amplification efficiency (PE), amplification efficiency (E) in this article refers to the fold increase of PCR product per cycle (1 to 2). The efficiency-calibrated model is a more generalized ΔΔCt model. Ct number is first plotted against cDNA input (or logarithm of cDNA input), and the slope of the plot is calculated to determine the amplification efficiency (E). ΔCt for each gene (target or reference) is then calculated by subtracting the Ct number of the treatment sample from that of the control sample. As shown in Equation 1, the ratio of target gene expression in treatment versus control is the ratio between the target gene efficiency (E_{target}) raised to the power of the target ΔCt (ΔCt_{target}) and the reference gene efficiency (E_{reference}) raised to the power of the reference ΔCt (ΔCt_{reference}). The ΔΔCt model can be derived from the efficiency-calibrated model if both target and reference genes reach their highest PCR amplification efficiency. In this circumstance, both the target efficiency (E_{target}) and the reference efficiency (E_{reference}) equal 2, indicating amplicon doubling during each cycle, and the same expression ratio is given by 2^{-ΔΔCt} [7, 9].
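Ratio = (E_{target})^{ΔCt_{target}} / (E_{reference})^{ΔCt_{reference}}   Equation 1
where ΔCt_{target} = Ct_{control} - Ct_{treatment} for the target gene and ΔCt_{reference} = Ct_{control} - Ct_{treatment} for the reference gene
Ratio = 2^{-ΔΔCt}   Equation 2
where ΔΔCt = ΔCt_{reference} - ΔCt_{target}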
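The two ratio formulas can be checked against each other numerically. The following Python sketch is only an illustration under stated assumptions: the function names and the Ct values are hypothetical and are not taken from the sample data set. It shows that the efficiency-calibrated ratio reduces to 2^{-ΔΔCt} when both efficiencies equal 2.

```python
def efficiency_calibrated_ratio(e_target, e_reference, dct_target, dct_reference):
    """Equation 1: expression ratio of the target gene, treatment vs. control.

    dct_* is Ct(control) - Ct(treatment) for the given gene.
    Efficiencies are on the 1-to-2 scale (2 means perfect doubling).
    """
    return (e_target ** dct_target) / (e_reference ** dct_reference)


def delta_delta_ct_ratio(dct_target, dct_reference):
    """Equation 2: ratio = 2^(-ddCt), with ddCt = dCt_reference - dCt_target."""
    ddct = dct_reference - dct_target
    return 2.0 ** (-ddct)


# Hypothetical mean Ct values (illustration only, not the paper's data).
ct = {
    ("target", "control"): 26.0,
    ("target", "treatment"): 23.0,
    ("reference", "control"): 20.0,
    ("reference", "treatment"): 20.5,
}
dct_target = ct[("target", "control")] - ct[("target", "treatment")]
dct_reference = ct[("reference", "control")] - ct[("reference", "treatment")]

ratio_eq1 = efficiency_calibrated_ratio(2.0, 2.0, dct_target, dct_reference)
ratio_eq2 = delta_delta_ct_ratio(dct_target, dct_reference)
print(ratio_eq1, ratio_eq2)  # the two values agree when E = 2 for both genes
```

With efficiencies below 2, only the first function remains appropriate, which is the situation treated later in the section on amplification efficiency less than 2.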
Even though both the efficiency-calibrated and ΔΔCt models are widely applied in gene expression studies, few papers discuss thoroughly the statistical considerations involved in analyzing the effect of each experimental factor or in significance testing. One of the few studies that employed substantial statistical analysis used the REST^{®} program [8]. The software presented in that article is based on the efficiency-calibrated model and employs randomization tests to obtain the significance level. However, the article did not provide a detailed model for the effects of the different experimental factors involved. Another statistical study of real-time PCR data used a simple linear regression model to estimate the ratio through Ct calculation [10]. However, the logarithm-based fluorescence was used as the dependent variable in the model, which we believe does not adequately reflect the nature of real-time PCR data. Ct should be the dependent variable for statistical analysis, because it is the outcome value directly influenced by treatment, concentration and sample effects. Both studies used efficiency-calibrated models. Despite the publication of these two methods, many research articles published with real-time PCR data do not present P values and confidence intervals [11–13]. We believe that these statistics are desirable to facilitate robust interpretation of the data.
A priori, we consider the confidence interval and P value of ΔΔCt data to be very important because these directly influence the interpretation of ratio. Without a proper statistical modeling and analysis, the interpretation of real-time PCR data may lead the researcher to false positive conclusions, which is especially potentially troublesome in clinical applications. We hereby developed four statistical methodologies for processing real-time PCR data using a modified ΔΔCt method. The statistical methodologies can be adapted to other mathematical models with modifications. SAS programs implementing the methodologies and data control are presented with real-time PCR practitioners in mind for turnkey data analysis. Standard deviations, confidence levels and P values are presented directly from the SAS output. We also included analysis of the sample data set and SAS programs for the analysis in the online supplementary materials.
Results and discussion
Data quality control
From the two mathematical models for relative quantification of real-time PCR data, we observe disparities between data quality standards. For the efficiency-calibrated method, the author who described the procedure [7] assumed that the amplification efficiency for each gene (target and reference) is the same among different experimental samples (treatment and control), although an amplification efficiency of 2 is not required. In contrast, the ΔΔCt method is more stringent: it assumes that all reactions reach an amplification efficiency of 2. In other words, the amount of product should double during each cycle [9]. Moreover, the ΔΔCt method assumes that the PCR amplification efficiency for every sample will be 2 if the PCRs for one set of samples reach full amplification efficiency. However, this assumption neglects the effect of different cDNA samples.
Data quality can be examined through a correlation model. Even though examining the correlation between Ct number and concentration can provide effective quality control, a better approach might be to examine the correlation between Ct and the logarithm (base 2) transformed concentration of template, which should yield a significant simple linear relationship for each gene and sample combination. For example, for a target gene in the control sample, the Ct number should correlate with the logarithm transformed concentration following the simple linear regression model in Equation 3. In the equation, X_{lcon} represents the logarithm transformed concentration, β_{0} represents the intercept of the regression line, and β_{con} represents the slope of the regression line [14]. Acceptable real-time PCR data should have two features in the regression analysis. First, the slope should not be significantly different from -1. Second, the slopes for all four combinations of genes and samples shown in Table 1 should not be significantly different from one another. A SAS program was developed to perform the data quality control in Program1_QC.sas (additional file 1).
Ct = β_{0} + β_{con}X_{lcon} + ε   Equation 3
The input data are grouped as shown in Table 1 and additional file 2. Each combination of gene and sample was assigned to one group, numbered from 1 to 4. The SAS procedure PROC MIXED was used to perform simple linear regression for each group based on the model described above. The 95% confidence intervals for the slopes were estimated; the slopes are expected not to be significantly different from -1. The abbreviated SAS output for the analysis of a sample data set is presented in SASOutput.doc (additional file 3). Slopes of Ct against logarithm transformed concentration for all four groups were not significantly different from -1 at the 95% confidence level. In addition to the numeric output, the program also provides a visualization of data quality as shown in Figure 2, where the Ct number is plotted against logarithm transformed template concentration. A simple linear relationship should be observed between the Ct number and logarithm transformed concentration.
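The slope check for a single gene and sample group can be sketched in a few lines of Python. This is not the authors' Program1_QC.sas; it is a minimal ordinary least-squares fit under stated assumptions: the two-fold dilution series and Ct values are invented, and the critical value 2.228 is the two-sided 95% t quantile for 10 degrees of freedom (12 points minus 2 parameters), supplied by hand because the standard library has no t distribution.

```python
import math


def slope_with_ci(x, y, t_crit):
    """Fit y = b0 + b1*x by least squares; return (b1, se, lower, upper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return b1, se, b1 - t_crit * se, b1 + t_crit * se


# Hypothetical group: concentrations 8, 4, 2, 1 (arbitrary units), 3 replicates each.
log2_conc = [3, 3, 3, 2, 2, 2, 1, 1, 1, 0, 0, 0]
ct_values = [22.1, 21.9, 22.0, 23.05, 22.95, 23.1,
             23.9, 24.0, 24.1, 25.0, 24.9, 25.1]

slope, se, lower, upper = slope_with_ci(log2_conc, ct_values, t_crit=2.228)
passes_qc = lower <= -1.0 <= upper  # slope not significantly different from -1
print(slope, se, (lower, upper), passes_qc)
```

Running the same check on each of the four groups covers the first quality criterion; comparing slopes across groups (the second criterion) requires a model with a group by concentration interaction, as in the SAS program.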
Multiple regression model
Several effects need to be taken into consideration in the ΔΔCt method, namely the effects of treatment, gene, concentration, and replicates. If we consider these effects as quantitative variables and relate the Ct number to these effects and their interactions, we can develop the multiple regression model in Equation 4.
Ct = β_{0} + β_{con}X_{icon}+ β_{treat}X_{itreat}+ β_{gene}X_{igene}+ β_{contreat}X_{icon}X_{itreat}+ β_{congene}X_{icon}X_{igene}+ β_{genetreat}X_{igene}X_{itreat}+ β_{congenetreat}X_{icon}X_{itreat}X_{igene}+ ε Equation 4
In this model, Ct is the dependent variable, β_{0} is the intercept, the β_{x} terms are the regression coefficients for the corresponding X (independent) terms, and ε is the error term [14]. The model considers the effects of concentration, treatment, gene and their interactions. We are principally interested in the interaction between gene and treatment, which addresses the degree of the Ct differences between target gene and reference gene in treated vs. control samples, i.e., ΔΔCt. ΔΔCt can therefore be estimated from the different combinations of values of β_{genetreat}. The four groups in Table 1 also represent the possible combinational effects of treatment and gene. The goal is to test statistically for differences between target and reference genes in treatment vs. control samples. Therefore, the null hypothesis is that the Ct differences between target and reference genes are the same in treatment and control samples, which can be represented by combinational effects (CE) as CE1-CE3 = CE2-CE4. An equivalent formula is CE1-CE2-CE3+CE4 = 0, which yields an estimate of ΔΔCt. If the null hypothesis is not rejected, the ΔΔCt is not significantly different from 0; otherwise, the ΔΔCt can be derived from the estimate produced by the test. In this way, we can test the different combinational effects of β_{genetreat} and estimate the ΔΔCt from them. As shown in Equation 2, if ΔΔCt is equal to 0, the ratio is 1, which indicates no change in gene expression between control and treatment.
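As a rough illustration of the contrast CE1-CE2-CE3+CE4, the Python sketch below computes its point estimate from group mean Ct values. It is not the authors' SAS program. The Ct values are invented, and the mapping of group numbers to gene and sample combinations (1 = target/control, 2 = target/treatment, 3 = reference/control, 4 = reference/treatment) is an assumption, since Table 1 is not reproduced here. Using plain group means is a shortcut that is only expected to agree with a regression-based estimate when every group is measured at the same concentrations with the same number of replicates and the slopes are equal.

```python
from statistics import mean


def interaction_contrast(ct_by_group):
    """Point estimate of CE1 - CE2 - CE3 + CE4 from group mean Ct values."""
    m = {group: mean(values) for group, values in ct_by_group.items()}
    return m[1] - m[2] - m[3] + m[4]


# Hypothetical Ct values, one list per group, same dilutions in every group.
ct_by_group = {
    1: [26.0, 27.0, 28.0],   # assumed: target gene, control sample
    2: [23.1, 24.0, 24.9],   # assumed: target gene, treatment sample
    3: [20.0, 21.0, 22.0],   # assumed: reference gene, control sample
    4: [20.4, 21.5, 22.6],   # assumed: reference gene, treatment sample
}

contrast = interaction_contrast(ct_by_group)
print(contrast)  # a value near 0 would indicate no gene-by-treatment interaction
```

The sign of the result depends on how the groups are numbered, so it should be matched against the definition of ΔΔCt in Equation 2 before the ratio is computed. A standard error and P value require the full regression fit, which this sketch does not attempt.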
A SAS program for multiple regression model
The SAS procedure PROC GLM was used for ΔΔCt estimation in Program2_MR.sas (additional file 4). The multiple regression model is stated in the model statement. The combinational effects of gene and treatment are evaluated in the estimate and contrast statements. The null hypothesis CE1-CE2-CE3+CE4 = 0 is tested in the contrast statement, and the parameter estimate yields the ΔΔCt value. The SAS input file is available in additional file 5 and the SAS output for the multiple regression is in SASOutput.doc (additional file 3).
The SAS output gives a very comprehensive analysis of the data. We are interested in two aspects of the analysis. First, we want to test whether the ΔΔCt value is significantly different from 0 at P = If the ΔΔCt is not significantly different from 0, then we conclude the treatment does not have a significant effect on target gene expression; otherwise, the inverse is concluded. If the effect is significant, we are interested in the standard deviation of ΔΔCt value, from which we can derive the ratio of gene expression as discussed later. The SAS output provides the point estimation () and standard error () for the ΔΔCt. PROC GLM or PROC MIXED are interchangeable in this application. If the experiments involve multiple biological replicates, replicate effect can also be considered through modifying the SAS program. Then the estimation will be the combined effect of gene, treatment and replicate.
Analysis of covariance and SAS code
Another way to approach the real-time PCR data analysis is by using an analysis of covariance (ANCOVA). A simplified model can be derived from transforming the data into a grouped data as shown in Table 1 and additional file 2 resulting in Equation 5.
Ct = β_{0} + β_{con}X_{icon}+ β_{group}X_{igroup}+ β_{groupcon}X_{igroup}X_{icon}+ ε. Equation 5
We are interested in two questions here. First, are the covariance-adjusted averages of the four groups equal? Second, what is the Ct difference of the target gene between treatment and control samples after correction by the reference gene? In this case, the null hypothesis is (μ2-μ1)-(μ4-μ3) = 0, and the test yields a parameter estimate of ΔΔCt as shown in Program3_ANCOVA.sas (additional file 6).
The SAS code implementing the ANCOVA model is similar to that of multiple regression model. Either SAS procedures PROC GLM or PROC MIXED can be employed to implement the ANCOVA model; and we used PROC MIXED here. The class statement defines which variables will be grouped for significance testing. In this case, the variables are concentration and group, and ANCOVA assumes that these are co-varying in nature. The contrast and estimate statements were used to contrast the group effect, which will yield ΔΔCt (), as well as its standard error () and 95% confidence interval (, ). The SAS output with both confidence level and P value is presented in SASOutputs.doc (additional file 3).
Simplified alternatives – t-test and Wilcoxon two-group test
Simpler alternatives can be used to analyze real-time PCR data with biological replicates for each experiment. The primary assumption of this approach is that the additive effects of concentration, gene, and replicate can be adjusted for by subtracting the Ct number of the target gene from that of the reference gene, which provides ΔCt as shown in Table 2. The ΔCt values for treatment and control can then be subjected to a simple t-test, which yields an estimate of ΔΔCt.
As a non-parametric alternative to the t-test, a Wilcoxon two-group test can also be used to analyze the two pools of ΔCt values. Two of the assumptions of the t-test are that both groups of ΔCt values have Gaussian distributions and that they have equal variances. However, these assumptions are not valid in many real-time PCR experiments with realistically small sample sizes. A distribution-free Wilcoxon test is therefore a more robust and appropriate alternative in this case [15].
A SAS program has been developed for both the t-test and the Wilcoxon two-group test, as shown in the attached program Program4_TW.sas (additional file 7). The SAS procedures TTEST and UNIVARIATE were used to analyze the data. The SAS macro 'moses.sas' [15] in additional file 8 was employed to derive the confidence levels. The SAS input file is in additional file 9 and the SAS output for the sample data analysis is available in SASOutput.doc (additional file 3). Since the estimate of the difference is obtained by subtracting treatment from control, the actual ΔΔCt is the negative of the output estimate.
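The two simplified tests can be sketched in plain Python for small samples. The sketch below is an illustration only, not a port of Program4_TW.sas: the ΔCt values are invented, the function names are hypothetical, the t statistic uses the pooled (equal variance) form, and the Wilcoxon rank-sum P value is computed exactly by enumerating all rank assignments, which assumes no tied values and is only practical for small groups.

```python
import math
from itertools import combinations
from statistics import mean, variance


def pooled_t(a, b):
    """Return (mean difference a - b, standard error, t statistic), pooled variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    diff = mean(a) - mean(b)
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    return diff, se, diff / se


def rank_sum_exact_p(a, b):
    """Exact two-sided Wilcoxon rank-sum test; returns (rank sum of a, P value)."""
    pooled = sorted(a + b)
    rank = {value: i + 1 for i, value in enumerate(pooled)}  # assumes no ties
    w_obs = sum(rank[value] for value in a)
    n, k = len(pooled), len(a)
    expected = k * (n + 1) / 2
    sums = [sum(c) for c in combinations(range(1, n + 1), k)]
    extreme = sum(1 for s in sums
                  if abs(s - expected) >= abs(w_obs - expected))
    return w_obs, extreme / len(sums)


# Hypothetical dCt values, one per biological replicate.
dct_treatment = [2.6, 2.4, 2.9, 2.5]
dct_control = [6.1, 5.9, 6.3, 6.0]

ddct, se, t_stat = pooled_t(dct_treatment, dct_control)
w_obs, p_wilcoxon = rank_sum_exact_p(dct_treatment, dct_control)
print(ddct, se, t_stat, w_obs, p_wilcoxon)
```

With four replicates per group the smallest attainable two-sided exact P value is 2/70, which illustrates why the rank-based test is conservative at small sample sizes.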
Comparison of four approaches and data presentation
A comparison of the four approaches is presented in Table 3. Multiple regression and ANCOVA yield exactly the same result for ΔΔCt estimation, because both methods employ the same mathematical approach for parameter estimation. The t-test provides the same point estimation of ΔΔCt, however, the standard error is slightly greater, which leads to a larger confidence interval. Wilcoxon two group test provides a slightly smaller estimation of ΔΔCt. The highly similar results from the four approaches validated the models and SAS programs presented. The choice of the models and programs will depend on the experimental design and the stringency and quality of the experiment. However, the most conservative test, owing to its nonparametric nature, is the Wilcoxon two group test, which is distribution-independent.
Data quality control
Many of the current real-time PCR experiments do not include a standard curve design, nor do they use a method to estimate the amplification efficiency. We argue here that real-time PCR data without proper quality controls are not reliable, since the efficiency of real-time PCR could have significant impact on the ratio estimation and dynamic range. For example, if a PCR has a percentage amplification efficiency (PE) of (i.e. PCR product will increase 2^{} times instead of two times per cycle), a ΔCt value of 3 can only be transformed into times differences in ratio instead of 8 times. This problem gets amplified when the ΔΔCt or ΔCt values are larger and the amplification efficiency is lower, which could lead to severely skewed interpretations.
We therefore propose two standards for real-time PCR data quality control according to the model, using the SAS programs presented in this paper. First, experiments with a serial dilution of template need to be included in order to estimate the amplification efficiency of each gene with each sample. Some researchers assume that the amplification efficiency for each gene is the same in different samples because the same primer pair and amplification conditions are used. However, we found that the sample does have an effect on amplification efficiency. In other words, the amplification efficiency could be different for the same gene when amplified from different cDNA template samples. We therefore consider an experimental design with a standard curve for each gene and sample combination to be optimal. Second, under optimal conditions, a plot of the Ct number against the logarithm (base 2) of the template amount should yield a slope not significantly different from -1, which indicates an amplification efficiency of nearly 2. Even though both the efficiency-calibrated model and the modified ΔΔCt model tolerate amplification efficiencies lower than 2, it is most reliable to have all reactions at an amplification efficiency approximating 2 by optimizing primer choices, amplicon lengths and experimental conditions. In our experience, maintaining all amplification efficiencies near 2 is the best way to reach equal amplification efficiency among the samples and thus to ensure high quality data. We also observed that an amplification efficiency near 2 can help to expand the dynamic range of ratio estimation.
P-value, confidence intervals and data presentation
The P-value is an important parameter for significance level, and confidence intervals help to establish the reliable range for ΔΔCt estimation. Most of current real-time PCR publications do not present P-values and confidence intervals [11–13]. We believe disclosing P-values is important when the researchers claim differential expression between the samples or treatments exists. In the program we present, all the P-values are derived from testing the null hypothesis that ΔΔCt are equal to 0. Therefore, a small P-value indicates that the ΔΔCt is significantly different from 0, which demonstrates a significant effect. The interpretation of a P- value will depend on the experimental objectives. For example, at P = in a treatment versus control experiment, we can claim that the treatment has a significant effect; and in a tissue comparison experiment, we can claim that the gene expression is significantly different among the tissues.
Some publications present a standard deviation of the ratio as a meaningful metric. However, we argue here that the standard deviation of the ratio should be derived from the standard deviation of ΔΔCt, and the confidence interval of the ratio should be derived from the confidence interval of ΔΔCt. In other words, the point estimate of the ratio should be 2^{-ΔΔCt} and the confidence interval for the ratio should be (2^{-ΔΔCt_{HCL}}, 2^{-ΔΔCt_{LCL}}), where HCL and LCL denote the higher and lower confidence limits of ΔΔCt. Since Ct is the observed value from experimental procedures, it should be the subject of statistical analysis. Performing statistical analysis directly on the ratio is not appropriate. The presentation of data needs to refer to the ΔΔCt and subsequently to the ratio and confidence intervals derived from 2^{-ΔΔCt}.
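The back-transformation from ΔΔCt to the ratio is simple enough to show directly. In the Python sketch below, the function name and the numeric inputs are hypothetical; the only point is that the upper confidence limit of ΔΔCt maps to the lower limit of the ratio and vice versa, because 2^{-x} is a decreasing function.

```python
def ratio_with_ci(ddct, ddct_lcl, ddct_hcl):
    """Return (ratio, ratio lower limit, ratio upper limit) from ddCt and its CI.

    ddct_lcl and ddct_hcl are the lower and higher confidence limits of ddCt.
    """
    return 2.0 ** (-ddct), 2.0 ** (-ddct_hcl), 2.0 ** (-ddct_lcl)


# Hypothetical estimate: ddCt = -3.5 with 95% CI (-4.0, -3.0).
ratio, ratio_low, ratio_high = ratio_with_ci(-3.5, -4.0, -3.0)
print(ratio, (ratio_low, ratio_high))
```

The resulting interval for the ratio is not symmetric around the point estimate, which is one reason a single standard deviation quoted on the ratio scale can be misleading.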
Statistical analysis for real-time PCR data with amplification efficiency less than 2
As stated before, the PCR amplification efficiency can be optimized to approximately 2 with proper amplification primers, RNA quality, and cDNA synthesis protocol. Recent advances in real-time PCR primer design have made experimental optimization easier [16, 17]. However, less than ideal real-time PCR data can occur regardless of stringent control of experimental conditions. There are three scenarios for suboptimal real-time PCR data. In the first scenario, all of the PCR reactions have the same amplification efficiency, yet the efficiency differs from 2. In the second scenario, the PCR amplification efficiency differs by gene only. In other words, the amplification efficiency is the same for the same gene in all the biological samples; however, the amplification efficiency varies among the different genes. In the third scenario, the PCR amplification efficiency differs both by gene and by sample. We consider data in the third scenario unacceptable, as many others have reported [10, 18]. In the first two scenarios, the adjusted ΔΔCt can be derived from the ANCOVA model by including the PE in the 'estimate' and 'contrast' statements of the SAS program.
Several approaches have been developed to calculate the amplification efficiency in low quality data. One such approach is the so-called 'dynamic data analysis', in which the fluorescence history of a PCR reaction is used to calculate the amplification efficiency [19, 20]. The advantage of the approach lies in its capacity to analyze low quality data and in the cost saved by avoiding the standard curve. However, due to its mathematical complexity and the controversy over its reliability, this method is not as widely applied as the traditional standard curve method [10, 16, 18]. In our method, a standard curve already exists and can be used to derive the amplification efficiency (E). Considering the simple linear regression model in Equation 3, if X_{lcon} represents the base 10 logarithm transformed concentration, the amplification efficiency (E) is 10^{-(1/slope)} according to Rasmussen and Pfaffl [7, 21]. In our model, X_{lcon} represents the base 2 logarithm transformed concentration, so the amplification efficiency (E) is 2^{-(1/slope)}, or 2^{PE}, where PE can be represented as -(1/β_{con}).
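The relation between the standard curve slope and the amplification efficiency can be written as two small helper functions. The Python sketch below uses hypothetical function names and example slopes; it only restates the formulas E = base^{-(1/slope)} and PE = -(1/β_{con}) given above.

```python
import math


def efficiency_from_slope(slope, log_base=2.0):
    """E = base^(-1/slope), where the slope comes from Ct vs. log_base(concentration)."""
    return log_base ** (-1.0 / slope)


def percentage_efficiency(slope_log2):
    """PE = -(1/slope) for a slope obtained with base 2 log concentration."""
    return -1.0 / slope_log2


# A slope of -1 on the log2 scale corresponds to perfect doubling.
e_perfect = efficiency_from_slope(-1.0)
# The equivalent slope on the log10 scale is -1/log10(2), about -3.32.
e_perfect_log10 = efficiency_from_slope(-1.0 / math.log10(2.0), log_base=10.0)
# A steeper hypothetical slope gives PE below 1 and E below 2.
pe_low = percentage_efficiency(-1.25)
e_low = efficiency_from_slope(-1.25)
print(e_perfect, e_perfect_log10, pe_low, e_low)
```

Both helpers describe the same quantity on different scales: E equals 2 raised to the power PE when the base 2 slope is used.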
In the first scenario discussed above, all PCR amplifications have the same efficiency, but the percentage efficiency (PE) is not equal to 1. The ratio of gene expression can then be represented by the following equation.
Ratio = 2^{-ΔΔCt_{adjust}}   Equation 6
where PE = -(1/β_{con}) and ΔΔCt_{adjust} = PE*ΔΔCt
In Equation 6, β_{con} is the pooled slope of the plot of Ct against the base 2 logarithm of concentration. β_{con} can be calculated with a correlation function in SAS as shown in Program5_LowQualityData.sas in additional file 10. In the second scenario, the amplification efficiency differs by gene only. According to Equation 1, we have the following equation, in which β_{con} is the pooled slope of the plot of Ct against log_{2} (concentration) for each gene.
where PE_{target} = -(1/β_{conTarget}), PE_{control} = -(1/β_{conControl}), and ΔΔCt_{adjust} = PE_{target}*ΔCt_{target} - PE_{control}*ΔCt_{control}
In Equation 7, β_{conTarget} and β_{conControl} are the pooled slopes of the plot of Ct against the base 2 logarithm of concentration for the target gene and the reference gene respectively. The slopes can be calculated by Program5_LowQualityData.sas (additional file 10). The ΔΔCt_{adjust} can be calculated with the same program. Theoretically, an equation can also be derived for the third scenario, in which PCR amplification efficiency differs both by gene and by sample. However, in actual application, we do not consider data in the third scenario acceptable because of the significant variation in amplification efficiency [10, 18].
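The two adjustments can be sketched as follows. This Python fragment is an illustration, not the authors' Program5_LowQualityData.sas: the function names, slopes and ΔCt values are hypothetical, and the formulas are taken directly from the definitions of ΔΔCt_{adjust} given with Equations 6 and 7.

```python
def adjusted_ddct_common(ddct, pooled_slope):
    """Scenario 1: one efficiency for all reactions; ddCt_adjust = PE * ddCt."""
    pe = -1.0 / pooled_slope
    return pe * ddct


def adjusted_ddct_by_gene(dct_target, dct_control, slope_target, slope_control):
    """Scenario 2: efficiency differs by gene.

    ddCt_adjust = PE_target * dCt_target - PE_control * dCt_control,
    where each PE is -(1/pooled slope) for that gene.
    """
    pe_target = -1.0 / slope_target
    pe_control = -1.0 / slope_control
    return pe_target * dct_target - pe_control * dct_control


# Hypothetical inputs: with slopes of -1 the adjustment leaves the values unchanged.
unchanged = adjusted_ddct_common(-3.5, -1.0)
scenario1 = adjusted_ddct_common(-3.5, -1.25)               # PE = 0.8
scenario2 = adjusted_ddct_by_gene(3.0, -0.5, -1.25, -1.0)   # PE = 0.8 and 1.0
print(unchanged, scenario1, scenario2)
```

In the SAS program the same adjustment is made inside the 'estimate' and 'contrast' statements, so that the standard error and confidence interval are adjusted together with the point estimate; this sketch covers the point estimate only.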
Program5_LowQualityData.sas in additional file 10 provides the solution for deriving the adjusted ΔΔCt in the first and second scenarios. A data set in which amplification efficiency differs by gene is provided in LowQualityData.txt (additional file 11) to illustrate the use of the SAS program. The data set is of lower quality mainly because of the limited number of replicates involved in the experiment. Four steps are involved in calculating the ΔΔCt_{adjust}. The first step is to perform the data quality control test as shown in Methods. From the SAS output, we can conclude that the LowQualityData dataset does not meet the requirements of the 2^{-ΔΔCt} method, since one group of PCRs has a percentage amplification efficiency significantly different from 1, as shown in the data quality control part of SASOutput.doc for the LowQualityData dataset (additional file 3).
The second step is to test the equal PCR efficiency (or slope) by observing the Type III sums of squares for lcon and class interaction. A low p value will indicate the interaction of different groups of PCR (class) with logarithm transformed concentration, which in turn indicates the unequal slope among different groups of PCR. If all PCR amplification efficiency are equal, then the pooled amplification efficiency can be calculated and integrate into the SAS program for ΔΔCt_{adjust}calculation. In this set of data, the Type III sums of squares has a p value smaller than , and the amplification efficiency are not equal for all PCRs. Tests of equal slopes are then performed for each gene to decide whether PCR amplification efficiency is the same for each gene. For either gene, the amplification efficiency is not significantly different with an α of All of the Type III sums of squares outputs can be found in SASOutput.doc (additional file 3).
The next step is to calculate the pooled slope (β_{con}) for each gene to derive the percentage amplification efficiency (PE = -(1/β_{con})) for each gene. The pooled slopes are derived based on the correlation between Ct and logarithm 2 based concentrations. The β_{con}s for the two genes are and respectively as shown in SASOutput.doc (additional file 3) for the amplification efficiency calculation of LowQualityData dataset. With the β_{con}, -(1/β_{con}) or PE can be calculated for each gene as and respectively. The ΔΔCt_{adjust} can then be computed with PEs substituting the 1 for each gene in the 'estimate' and 'contrast' statement. The SAS program is as follows in additional file
Title2 'Calculate the deltadeltaCt with Adjusted efficiency';
PROC MIXED data=TR2 Order=Data;
CLASS Class Con;
MODEL Ct = Con Class Con*Class/SOLUTION NOINT;
Contrast 'Intercepts' Class - -;
Estimate 'Intercepts' Class - -/cl;
Run;
The SAS output for the analysis is in SASOutput.doc (additional file 3). The ΔΔCt_{adjust} is therefore and the change is significant since p value is very small. The ratio can be represented as discussed in the standard ΔΔCt method. The point estimation of the ratio in this example is , and the 95% confidence interval is (, ).
Overall, in the less optimized PCR reactions, statistical analysis is not only complicated but also compromised for precision and efficiency. Therefore caution should be exercised when performing statistical analysis with the low quality real-time PCR data, which may easily introduce error due to the efficiency adjustment [10, 18].
Conclusion
In this report, we presented four models for the statistical analysis of real-time PCR data and one procedure for data quality control. SAS programs were developed for all the applications and a sample data set was analyzed. The analyses with the different models and programs yielded the same or closely similar estimates of ΔΔCt and similar confidence intervals. The data quality control and analysis procedures will help to establish robust systems for studying relative gene expression with real-time PCR.
Methods
Plant material, RNA extraction, real-time PCR and sample data set
The sample data set (Table 1) used for the analysis came from the experiment described below. Arabidopsis thaliana (Col1) plants were grown in a growth chamber at 23°C with 14 hours of light for four weeks. Total RNA was isolated with the RNeasy Plant Mini Kit (Qiagen, Inc.) from methyl-jasmonate treated Arabidopsis, alamethicin treated Arabidopsis and control plants, and DNA contamination was removed with an on-column DNase (Qiagen, Inc.) treatment. One microgram of total RNA was synthesized into first strand cDNA in a 20 μL reaction using the iScript cDNA synthesis kit (BioRad Laboratories). cDNA was then diluted into 10 ng/μL, 2 ng/μL, ng/μL and ng/μL concentration series. Three replicates of real-time PCR experiments were performed for each concentration using an ABI Sequence Detection System (Applied Biosystems). Ubiquitin was used as the reference gene, and the primer sequences for the Arabidopsis ubiquitin gene were CACACTCCACTTGGTCTTGCG (F) and TGGTCTTTCCGGTGAGAGTCTTCA (R). The primers for the target gene (MT_7) were designed with Primer Express software (Applied Biosystems) and the sequences were CCGCGGTACAAACCTTAATT (F) and TGGAACTCGATTCCCTCAAT (R). MT-7 gene is the Arabidopsis thaliana gene At3g encoding a protein with high catalytic specificity for farnesoic acid [22]. Primer titration and dissociation experiments were performed to ensure that no primer dimers or false amplicons would interfere with the result. After the real-time PCR experiment, the Ct number was extracted for both the reference gene and the target gene with auto baseline and manual threshold.
Real-time PCR experimental design, data output, transformation, and programming
A main limitation of the efficiency-calibrated method and the ΔΔCt method is that only one set of cDNA samples is used to determine the amplification efficiency. It is assumed that the same amplification efficiency applies to other cDNA samples as long as the primers and amplification conditions are the same. However, amplification efficiency depends not only on the primer characteristics but also varies among cDNA samples. Using a standard curve for only one set of tested samples to derive the amplification efficiency might overlook the error introduced by sample differences. In our experimental design, we performed standard curve experiments with four concentrations and three replicates for all samples and genes involved. ΔΔCt is derived from the standard curves only, and the data quality is examined for each gene and sample combination. The analysis of two samples is presented in the paper as an example. A minimum of two replicates at each of three concentrations is required for each sample. Even though more effort is required, the data are more reliable because of the stringent data quality control and the data analysis based on statistical models.
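The standard-curve idea in the paragraph above can be illustrated with a minimal sketch. It is written in Python rather than the SAS used by the authors and is not their program. It fits Ct against log10(concentration) by ordinary least squares for one gene and sample combination, then converts the slope to an amplification efficiency with the standard relation E = 10^(-1/slope) - 1. The function name `fit_standard_curve` and the dilution values in the example are hypothetical.

```python
import math


def fit_standard_curve(concentrations, ct_values):
    """Least-squares fit of Ct against log10(concentration).

    Returns (slope, intercept, r_squared, efficiency), where
    efficiency = 10 ** (-1 / slope) - 1 (1.0 means perfect doubling).
    """
    x = [math.log10(c) for c in concentrations]
    y = list(ct_values)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy)
    efficiency = 10 ** (-1.0 / slope) - 1.0
    return slope, intercept, r_squared, efficiency


# Hypothetical two-fold dilution series, three replicates per concentration.
# Ct rises by exactly one cycle per dilution, i.e. ideal doubling.
example_conc = [10.0] * 3 + [5.0] * 3 + [2.5] * 3 + [1.25] * 3
example_ct = [20.0] * 3 + [21.0] * 3 + [22.0] * 3 + [23.0] * 3

if __name__ == "__main__":
    slope, intercept, r2, eff = fit_standard_curve(example_conc, example_ct)
    print(f"slope={slope:.3f} r2={r2:.3f} efficiency={eff:.3f}")
```

Running the fit separately for each gene and sample combination, rather than once per primer pair, is what lets sample-to-sample differences in efficiency show up.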
The output dataset included Ct number, gene name, sample name, concentration and replicate. We used Microsoft® Excel to open the Ct file exported from an ABI sequence analysis system and then to transform the data into a tab-delimited text file for SAS processing. The sample data set is shown in Table 1.
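As a rough illustration of the tab-delimited layout described above, the following Python sketch groups replicate Ct values by gene, sample and concentration. The column headers (`gene`, `sample`, `concentration`, `replicate`, `ct`) and the values in the example are assumptions made for illustration; the actual export format and the data in Table 1 may differ.

```python
import csv
import io
from collections import defaultdict


def load_ct_table(text):
    """Parse tab-delimited rows into {(gene, sample, concentration): [Ct, ...]}."""
    groups = defaultdict(list)
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    for row in reader:
        key = (row["gene"], row["sample"], float(row["concentration"]))
        groups[key].append(float(row["ct"]))
    return dict(groups)


# Hypothetical file content; not the data from Table 1.
example_text = (
    "gene\tsample\tconcentration\treplicate\tct\n"
    "target\tcontrol\t10\t1\t24.1\n"
    "target\tcontrol\t10\t2\t24.3\n"
    "reference\tcontrol\t10\t1\t18.0\n"
)

if __name__ == "__main__":
    for key, cts in load_ct_table(example_text).items():
        print(key, cts)
```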
All programs were developed with SAS (SAS Institute).
References
- 1.
Klein D: Quantification using real-time PCR technology: applications and limitations. Trends in Mol Med , 8: – /S(02)
- 2.
Bustin SA: Absolute quantification of mRNA using real-time reverse transcription polymerase chain reaction assays. J Mol Endocrinol , – /jme
- 3.
Mocellin S, Rossi CR, Pilati P, Nitti D, Marincola FD: Quantitative real-time PCR: a powerful ally in cancer research. Trends in Mol Med , 9: – /S(03)
- 4.
Mason G, Provero P, Vaira AM, Accotto GP: Estimating the number of integrations in transformed plants by quantitative real-time PCR. BMC Biotechnology , 2: /
- 5.
Heid CA, Stevens J, Livak KJ, Williams PM: Real time quantitative PCR. Genome Res , 6: –
- 6.
Gibson UE, Heid CA, Williams PM: A novel method for real time quantitative RT-PCR. Genome Res , 6: –
- 7.
Pfaffl MW: A new mathematical model for relative quantification in real-time RT-PCR. Nucl Acids Res , – /nar/e45
- 8.
Pfaffl MW, Horgan GW, Dempfle L: Relative expression software tool (REST(C)) for group-wise comparison and statistical analysis of relative expression results in real-time PCR. Nucl Acids Res , e /nar/e36
- 9.
Livak KJ, Schmittgen TD: Analysis of relative gene expression data using real-time quantitative PCR and the 2^{-ΔΔCT} method. Methods , – /meth
- 10.
Cook P, Fu C, Hickey M, Han ES, Miller KS: SAS programs for real-time RT-PCR having multiple independent samples. Biotechniques , –
- 11.
Zenoni S, Reale L, Tornielli GB, Lanfaloni L, Porceddu A, Ferrarini A, Moretti C, Zamboni A, Speghini A, Ferranti F, Pezzotti M: Downregulation of the Petunia hybrida α-expansin gene PhEXP1 reduces the amount of crystalline cellulose in cell walls and leads to phenotypic changes in petal limbs. Plant Cell , – /tpc
- 12.
Eleaume H, Jabbouri S: Comparison of two standardisation methods in real-time quantitative RT-PCR to follow Staphylococcus aureus genes expression during in vitro growth. J Micro Methods , – /j.mimet
- 13.
Shen H, He LF, Sasaki T, Yamamoto Y, Zheng SJ, Ligaba A, Yan XL, Ahn SJ, Yamaguchi M, Hideo S, Matsumoto S: Citrate secretion coupled with the modulation of soybean root tip under aluminum stress. Up-Regulation of transcription, translation, and threonine-oriented phosphorylation of plasma membrane H^{+}-ATPase. Plant Physiol , – /pp
- 14.
Kutner MH, Nachtsheim CJ, Neter J, William L: Applied Linear Statistical Models. Fifth edition. McGraw-Hill, Irwin, CA;
- 15.
Hollander M, Wolfe DA: Nonparametric Statistical Methods. John Wiley and Sons, New York;
- 16.
Bustin SA, Benes V, Nolan T, Pfaffl MW: Quantitative real-time RT-PCR – a perspective. J Mol Endocrinol , – /jme
- 17.
Pattyn F, Speleman F, De Paepe A, Vandesompele J: RTPrimerDB: the real-time PCR primer and probe database. Nucl Acids Res , – /nar/gkg
- 18.
Peirson SN, Butler JN, Foster RG: Experimental validation of novel and conventional approaches to quantitative real-time PCR data analysis. Nucl Acids Res , e /nar/gng
- 19.
Liu W, Saint DA: A new quantitative method of real-time reverse transcription polymerase chain reaction assay based on simulation of polymerase chain reaction kinetics. Anal Biochem , 52– /abio
- 20.
Tichopad A, Dilger M, Schwarz G, Pfaffl MW: Standardized determination of real-time PCR efficiency from a single reaction set-up. Nucl Acids Res , e /nar/gng
- 21.
Rasmussen R: Quantification on the LightCycler. In Rapid Cycle Real-time PCR, Methods and Applications; Heidelberg. Edited by: Meuer S, Wittwer C, Nakagawara K. Springer Press; –
- 22.
Yang Y, Yuan JS, Ross J, Noel JP, Pichersky E, Chen F: An Arabidopsis thaliana methyltransferase capable of methylating farnesoic acid. Arch Biochem Biophys , in press.
Author information
Affiliations
Department of Plant Sciences, University of Tennessee, Knoxville, TN, , USA
Joshua S Yuan, Feng Chen & C Neal Stewart Jr
University of Tennessee Institute of Agriculture Genomics Hub, University of Tennessee, Knoxville, TN, , USA
Joshua S Yuan
Statistical Consulting Center, University of Tennessee, Knoxville, TN, , USA
Ann Reed
Corresponding author
Correspondence to C Neal Stewart Jr.
Additional information
Authors' contributions
JSY carried out the real-time PCR experiments, developed the statistical model and SAS programs for analysis, and drafted the article. AR provided assistance in SAS programming and data modeling. FC provided assistance in real-time PCR experiments. CNS provided oversight of the work, conceptualized non-parametric elements, and finalized the draft.
Rights and permissions
Open Access. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/ ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Yuan, J.S., Reed, A., Chen, F. et al. Statistical analysis of real-time PCR data. BMC Bioinformatics 7, 85 (). https://doi.org//
Keywords
- Reference Gene
- Amplification Efficiency
- Data Quality Control
- Simple Linear Regression Model
- Logarithm Transformed Concentration
Analyzing real-time PCR data by the comparative C_{T} method
Abstract
Two different methods of presenting quantitative gene expression exist: absolute and relative quantification. Absolute quantification calculates the copy number of the gene, usually by relating the PCR signal to a standard curve. Relative quantification presents the data for the gene of interest relative to some calibrator or internal control gene. A widely used method to present relative gene expression is the comparative C_{T} method, also referred to as the 2^{-ΔΔC_{T}} method. This protocol provides an overview of the comparative C_{T} method for quantitative gene expression studies. Also presented here are various examples of how to present quantitative gene expression data using this method.
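To make the comparative C_{T} calculation concrete, here is a minimal Python sketch of the arithmetic as it is commonly written: ΔC_{T} = C_{T}(target) - C_{T}(reference) within each sample, ΔΔC_{T} = ΔC_{T}(treated) - ΔC_{T}(calibrator), and relative expression = 2^{-ΔΔC_{T}}. It assumes that both genes amplify with close to 100% efficiency. The function name and the C_{T} values are hypothetical and are not taken from the protocol.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression of the target gene, treated vs. control (2^-ddCt)."""
    # Normalize the target gene to the reference gene within each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated sample to the control (calibrator) sample.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical mean Ct values: dCt is 4 in treated and 7 in control,
    # so ddCt = -3 and the target is 2**3 = 8-fold higher in treated.
    print(fold_change(22.0, 18.0, 25.0, 18.0))  # 8.0
```

In practice the inputs would be means over replicate wells, and the spread of the replicates would be carried through to an interval on the fold change rather than reporting a single number.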
Author information
Affiliations
Division of Pharmaceutics, College of Pharmacy, Ohio State University, Parks Hall, West 12th Avenue, Columbus, OH , Ohio, USA
Thomas D Schmittgen
Applied Biosystems, Lincoln Center Drive, Foster City, CA , California, USA
Kenneth J Livak
Corresponding author
Correspondence to Thomas D Schmittgen.
About this article
Cite this article
Schmittgen, T., Livak, K. Analyzing real-time PCR data by the comparative C_{T} method. Nat Protoc 3, – (). https://doi.org//nprot
Real-Time PCR Data Analysis
All of Bio-Rad's real-time PCR detection systems include the powerful, easy-to-use CFX Maestro Software. With one software for all four systems, you can easily open and read files from any of the instruments, collaborate with other researchers, and move to another system without learning to use new software.
Similar to other real-time PCR software, CFX Maestro software offers several analysis modules, including quantification, melt curve, gene expression, allelic discrimination, and end-point analyses. An example of how to analyze a gene expression experiment in CFX Maestro software is shown below.
Set up the plate — indicate position of unknown, no template controls (NTCs), and standard curve samples on plate (Figure 1). This can be done before, during, or after a run.
Fig. 1. Plate setup in the Plate Editor window.
Analyze the data — a data file is automatically generated after a run with the gene expression module. Easily view up to six different charts or tables, such as the amplification plot, standard curve, gene expression chart, plate layout, or melt peak with the Custom Data View tab (Figure 2). Check the efficiency and R^{2} of the standard curve. The efficiency should be within 90–% and the R^{2} should be > If these values are out of range, you will need to troubleshoot your experiment.
Fig. 2. Custom Data View window.
Export results for publication — Quickly export any charts or tables by right clicking in the window and selecting Save Image As (Figure 3) or Export to Excel (Figure 4). You can also create reports or real-time PCR data markup language (RDML) files for quick import into qbase+ software.
Fig. 3. Select Save Image As in the data analysis window.
Fig. 4. Select Export to Excel in the data analysis window.