Bioinformatics Asked on August 19, 2021
First of all, I apologize if the question is very basic; I am taking my first steps in bioinformatics.
We are evaluating the correlation (using the Pearson, Kendall or Spearman method) between gene expression and miRNA expression using the corAndPvalue function of WGCNA.
The result would be a DataFrame containing every combination of each gene with each miRNA, with the following columns:
Gene   miRNA    Correlation  P-value
Gen_1  miRNA_1   0.959       0.00311
Gen_1  miRNA_2  -0.039       0.1041
Gen_1  miRNA_3  -0.344       0.0021
Gen_2  miRNA_1   0.1333      0.00451
Gen_2  miRNA_2   0.877       0.07311
...
Considering the huge number of correlation tests we are going to evaluate, we need to adjust the p-values so that we do not report correlations that look significant only by chance. Bonferroni does not seem to be the best solution, so we would use the Benjamini-Hochberg (BH) method. The question is:
Should the BH correction for the Gen_1 x miRNA_1 combination consider only the p-values of the combinations that include Gen_1 (Option 1), or the p-values of all gene x miRNA combinations (Option 2)?
For example, let’s assume an expression dataset of 20,000 genes and another of 15,000 miRNAs.
Option 1:
To adjust Gen_1 x miRNA_1 we would use 15,000 p-values (Gen_1 x miRNA_1, Gen_1 x miRNA_2, …, Gen_1 x miRNA_15000).
Option 2:
To adjust Gen_1 x miRNA_1 we would use 300,000,000 p-values (Gen_1 x miRNA_1, Gen_1 x miRNA_2, …, Gen_1 x miRNA_15000, Gen_2 x miRNA_1, Gen_2 x miRNA_2, …, Gen_2 x miRNA_15000, and so on).
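To make the difference between the two options concrete, here is a minimal sketch in plain Python that applies the BH adjustment to the five example p-values from the table above, first within each gene (Option 1) and then across all combinations at once (Option 2). The function name `bh_adjust` and the dictionary layout are illustrative assumptions, not part of WGCNA or statsmodels.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Example p-values taken from the table in the question.
pvals = {
    ("Gen_1", "miRNA_1"): 0.00311,
    ("Gen_1", "miRNA_2"): 0.1041,
    ("Gen_1", "miRNA_3"): 0.0021,
    ("Gen_2", "miRNA_1"): 0.00451,
    ("Gen_2", "miRNA_2"): 0.07311,
}

# Option 1: adjust separately within each gene.
option1 = {}
for gene in sorted({g for g, _ in pvals}):
    gene_keys = [k for k in pvals if k[0] == gene]
    gene_adj = bh_adjust([pvals[k] for k in gene_keys])
    option1.update(zip(gene_keys, gene_adj))

# Option 2: adjust all gene x miRNA p-values together.
keys = list(pvals)
option2 = dict(zip(keys, bh_adjust([pvals[k] for k in keys])))

for k in keys:
    print(k, round(option1[k], 6), round(option2[k], 6))
```

Even on this tiny table the two options give different adjusted values, because both the number of tests m and the rank of each p-value change with the set of tests being corrected.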
The documentation of the fdrcorrection function from the Python statsmodels library suggests that for negative correlations (which could be frequent in an mRNA x miRNA correlation analysis) Benjamini-Yekutieli would work better; is that right? Or would the Benjamini-Hochberg method be appropriate for this case?
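As a point of comparison between the two procedures, the sketch below implements both adjustments in plain Python. Benjamini-Yekutieli is the BH adjustment multiplied by the harmonic sum c(m) = 1 + 1/2 + … + 1/m, so it remains valid under arbitrary dependence between tests but is never less conservative than BH. The function name `fdr_adjust` and its `method` values are hypothetical; in statsmodels the same choice is made through the `method` argument of `fdrcorrection` (`'indep'` or `'negcorr'`).

```python
def fdr_adjust(pvalues, method="bh"):
    """Adjust p-values with Benjamini-Hochberg ("bh") or Benjamini-Yekutieli ("by")."""
    m = len(pvalues)
    if method == "bh":
        factor = 1.0
    elif method == "by":
        # Harmonic sum c(m); this is the extra penalty BY applies on top of BH.
        factor = sum(1.0 / i for i in range(1, m + 1))
    else:
        raise ValueError("method must be 'bh' or 'by'")

    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank * factor)
        adjusted[idx] = running_min
    return adjusted


# Example p-values from the table in the question.
example = [0.00311, 0.1041, 0.0021, 0.00451, 0.07311]
bh = fdr_adjust(example, "bh")
by = fdr_adjust(example, "by")
for p, q_bh, q_by in zip(example, bh, by):
    print(p, round(q_bh, 6), round(q_by, 6))
```

For m = 300,000,000 the factor c(m) is roughly 20, so Benjamini-Yekutieli would require p-values about 20 times smaller than Benjamini-Hochberg to reach the same adjusted value.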
Any kind of help would be much appreciated, thanks in advance!
I asked the same question on the Cross Validated forum and got an excellent answer!
The important part:
You need to correct for all of the comparisons you are doing. So if that's 300,000,000 comparisons you need to correct for that many multiple comparisons.
For more information, see the full answer on Cross Validated.
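To give a sense of what correcting for 300,000,000 comparisons means in practice, this small sketch computes the BH rejection thresholds: the p-value with rank k (smallest first) is compared against k/m × α, and every p-value up to the largest rank that passes is declared significant. The function name `bh_threshold` and the choice α = 0.05 are assumptions made for illustration.

```python
def bh_threshold(rank, n_tests, alpha=0.05):
    """BH critical value for the p-value of a given rank (1 = smallest)."""
    if not 1 <= rank <= n_tests:
        raise ValueError("rank must be between 1 and n_tests")
    return rank / n_tests * alpha


n_genes, n_mirnas = 20_000, 15_000
n_tests = n_genes * n_mirnas  # every gene x miRNA pair is one test

# Thresholds for a few ranks in the sorted list of p-values.
for rank in (1, 1_000, 1_000_000):
    print(rank, bh_threshold(rank, n_tests))
```

The threshold for the smallest p-value equals the Bonferroni cutoff α/m (about 1.7e-10 here), while later ranks get progressively looser thresholds, which is why BH is less strict than Bonferroni on the same set of tests.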
Correct answer by Genarito on August 19, 2021