Bioinformatics Asked by Morteza Hadizadeh on January 20, 2021
I have a microarray dataset from the Agilent platform in which all of the samples are cancerous. I need normal (control) samples to compare against the cancerous ones. Can I combine control samples from another platform (for example, Illumina) with the Agilent data and then remove the batch effect (with the ComBat function in the sva package)?
I would be grateful if you could guide me to do it.
In short, the answer is no, because you have to standardise between the arrays. Both arrays would need to run exactly the same control sample, and you would then perform statistical standardisation (it's a formal term): a variance measure where you look at the variation around the 'mean' for all samples in the array, where the 'mean' is the standard sample. If that can be done for both arrays, they can then be combined.
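As a rough illustration of one way to read this standardisation, the sketch below expresses every sample on an array as a deviation from the shared standard sample and scales those deviations by their overall spread. The function name `standardise_to_reference`, the probe-by-probe subtraction, and the use of a root-mean-square spread are all assumptions made for illustration; they are not taken from any particular package.

```python
import math

def standardise_to_reference(samples, reference):
    """Express each sample as scaled deviations from the shared standard sample.

    samples   -- dict mapping sample name to a list of probe intensities
    reference -- probe intensities of the standard sample run on the same array
    (Hypothetical helper: the standard sample plays the role of the 'mean'.)
    """
    # Deviation of every sample from the standard, probe by probe.
    deviations = {
        name: [x - r for x, r in zip(values, reference)]
        for name, values in samples.items()
    }
    # Spread of all deviations around the standard (root mean square).
    pooled = [d for devs in deviations.values() for d in devs]
    scale = math.sqrt(sum(d * d for d in pooled) / len(pooled))
    if scale == 0:
        raise ValueError("all samples are identical to the standard sample")
    # Scale each deviation so both arrays end up in comparable units.
    return {name: [d / scale for d in devs] for name, devs in deviations.items()}
```

Each array would be standardised separately against its own run of the same standard sample, and only the standardised values would be combined.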
The problem is that intensity varies per run, and you have to control for this.
In this scenario you can certainly combine samples by assessing whether the control samples between runs approximate identity. You could assess this against a uniform distribution: the variance must be MUCH smaller than the mean for the STANDARD sample(s). I don't really do uniform distribution statistics, so I'd need to research this; 'uniform' here means identity, so standard deviations against the uniform distribution are a bit meaningless. Again, though, the uniform distribution is denoted by a very small variance in comparison to the mean. The statistic of dividing the variance by the mean is, from memory, called the index of dispersion (I could be wrong about the name); dividing the standard deviation by the mean gives the coefficient of variation. If this value is at an extreme (<<<1), you're good to go; if it is remotely close to 1, you're in trouble; if it is >1, you need to use standardisation.
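The check of variance relative to the mean can be sketched in a few lines. This is a minimal example, assuming the standard sample has been measured on both runs and that the measurements are pooled before computing the ratio; the function name `dispersion_ratio` and the intensity values are made up for illustration.

```python
from statistics import mean, variance

def dispersion_ratio(values):
    """Sample variance divided by the mean (variance-to-mean ratio)."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero, ratio is undefined")
    return variance(values) / m

# Hypothetical intensities of the same standard sample on two runs.
standard_run_a = [1000.0, 1002.0]
standard_run_b = [998.0]

# Pool the standard's measurements across runs and compare spread to mean.
ratio = dispersion_ratio(standard_run_a + standard_run_b)
print(ratio)
```

A ratio far below 1 suggests the standard behaves almost identically across runs; no numeric cut-off is given in the answer, so none is hard-coded here.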
Presumably you have several standards to ensure the mean is meaningful. A t-test or its non-parametric equivalents are not stringent enough. If there is a difference, estimate a correction factor and rerun the uniform distribution check. The same correction is then applied to all samples.
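The correct-and-recheck loop might look like the sketch below. It assumes a single multiplicative correction factor, estimated as the ratio of the mean standard intensity between the two runs; the answer does not say what form the correction takes, so this choice, the helper names, and the numbers are all illustrative.

```python
from statistics import mean, variance

def dispersion_ratio(values):
    """Sample variance divided by the mean."""
    return variance(values) / mean(values)

def correction_factor(standards_kept_raw, standards_other):
    """Factor that scales the other run's standards onto the raw run's mean."""
    return mean(standards_kept_raw) / mean(standards_other)

def apply_correction(values, factor):
    """Apply the same multiplicative correction to any sample from that run."""
    return [v * factor for v in values]

# Hypothetical standard-sample intensities on each run.
standards_run_a = [1000.0, 1010.0, 990.0]   # this run is kept as raw data
standards_run_b = [500.0, 505.0, 495.0]

before = dispersion_ratio(standards_run_a + standards_run_b)
factor = correction_factor(standards_run_a, standards_run_b)
corrected_b = apply_correction(standards_run_b, factor)
after = dispersion_ratio(standards_run_a + corrected_b)
print(factor, before, after)
```

Only one run is rescaled, so the other stays as raw data, and the same `factor` would be passed to `apply_correction` for every non-standard sample from the rescaled run.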
The undisputed method for combining is standardisation, but accept that in this scenario the differences between runs are potentially very small, so what you are attempting here is essentially to combine raw data, or at least to keep one of the runs as pure raw data.
Answered by M__ on January 20, 2021