Bioinformatics Asked on May 7, 2021
I’m working with a set of homologous genes (let’s call it gene A) from several bacterial species.
I know (from previously published research) that in gene B (a close paralogue of gene A), there is a non-coding, antisense RNA. This antisense RNA regulates gene B by forming a duplex with the gene B mRNA.
I have noticed that an antisense RNA is sometimes predicted at the same location for gene A in the IMG/MER database. It is predicted for only some of the available gene A sequences, not all of them. I believe IMG/MER uses INFERNAL to predict RNAs.
My question: is there any way to assess how real this prediction is without wet lab data? As it stands, it could be that there is no actual antisense RNA in gene A and the prediction is spurious, arising only from homology with gene B. Or it could be that the antisense RNA is present in all of the gene A sequences and the RNA prediction software simply missed some of them. Is there a computational way of testing which of these options (if any) is more likely? Thank you for your time!
Is there any conserved secondary structure in this RNA? One could align all known instances of this RNA and see if there is any significant covariation according to R-scape. That could serve as a computational signal that the RNA is real (although wet lab experiments would still be needed).
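To make the idea of covariation concrete, here is a minimal sketch in Python that scores how strongly two alignment columns vary together, using plain mutual information. This is not R-scape's method: R-scape uses its own covariation statistics and also tests whether the signal is statistically significant, which this sketch does not do. The function name `mutual_information` and the toy alignment are hypothetical and only for illustration.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j, gap_chars="-."):
    """Mutual information (in bits) between alignment columns i and j.

    `alignment` is a list of equal-length aligned sequences.
    Rows with a gap in either column are skipped.
    """
    pairs = [
        (seq[i].upper(), seq[j].upper())
        for seq in alignment
        if seq[i] not in gap_chars and seq[j] not in gap_chars
    ]
    n = len(pairs)
    if n == 0:
        return 0.0

    pair_counts = Counter(pairs)
    left_counts = Counter(a for a, _ in pairs)
    right_counts = Counter(b for _, b in pairs)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = left_counts[a] / n
        p_b = right_counts[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment (made up): columns 0 and 4 change together so that they
# can always base-pair, while columns 1-3 never change.
toy = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]

covarying = mutual_information(toy, 0, 4)   # compensatory changes
invariant = mutual_information(toy, 0, 1)   # column 1 is constant
print(covarying, invariant)
```

A high score for a pair of columns that are proposed to base-pair, next to low scores for unpaired columns, is the kind of signal being looked for. On a real alignment the raw score is inflated by phylogeny and small sample size, which is why a tool that tests significance is the better choice.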
Another question is whether this RNA is already in Rfam. If so, making an alignment of known instances is easier.
Correct answer by apetrov on May 7, 2021