Bioinformatics Asked by Pitagoras Alves on August 22, 2021
I am building a de novo transcriptome reference assembly for a eukaryotic organism for which I have a genome.
I’ve created several assemblies with rnaSPAdes using different k-mer sizes (19 to 69, in steps of 10). Now I would like to merge them into one final transcriptome.
How could I do that?
Is using a genome assembly reconciliation tool such as Metassembler a good idea?
A simple option would be to supply SPAdes with contigs that you trust, using the --trusted-contigs option.
There are tools that specifically address the problem of merging transcriptome assemblies. I am unsure whether merging transcriptomes differs enough from merging genome assemblies for these tools to do better than the genome assembly mergers. Here is a partial list:
This paper seems to have some more information on comparing tools, though it is not focused specifically on merging.
See also the SEQanswers thread on this topic, and this discussion too.
Answered by Maximilian Press on August 22, 2021
You may combine all your transcriptomes into a single file and then apply a clustering method to collapse very similar transcripts into a single representative. For that purpose, you may try CD-HIT-EST or MMseqs2. For each identity threshold you test, you can assess the quality of the result with BUSCO or by BLASTing against reference sequences.
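As a rough illustration of the pooling step, here is a minimal Python sketch. It assumes each assembly is available as FASTA text and that you label each one by its k-mer size (the labels "k19", "k29" and the file names are made up for the example). The helper names read_fasta, pool_assemblies and cd_hit_est_command are hypothetical, not part of any tool. It prefixes transcript IDs so they stay unique across assemblies, drops transcripts whose sequence is exactly identical to one already seen (reverse complements are not considered), and builds a CD-HIT-EST command line using the -i, -o, -c and -n options. The word size passed to -n has to match the identity threshold, so check the CD-HIT user guide before changing it.

```python
def read_fasta(lines):
    """Yield (id, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            # Keep only the first word of the header as the transcript ID.
            header, chunks = line[1:].split()[0], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def pool_assemblies(assemblies):
    """Pool transcripts from several assemblies into one list.

    `assemblies` maps a label (assumed here to be the k-mer size, e.g. "k19")
    to an iterable of FASTA lines. IDs are prefixed with the label, and
    transcripts with a sequence already seen are skipped.
    """
    seen = set()
    pooled = []
    for label, lines in assemblies.items():
        for header, seq in read_fasta(lines):
            if seq in seen:
                continue
            seen.add(seq)
            pooled.append((f"{label}_{header}", seq))
    return pooled


def cd_hit_est_command(fasta_in, fasta_out, identity, word_size):
    """Build (but do not run) a CD-HIT-EST command as an argument list."""
    return ["cd-hit-est", "-i", fasta_in, "-o", fasta_out,
            "-c", str(identity), "-n", str(word_size)]


# Tiny in-memory demo; in practice each value would be an open FASTA file.
demo = {
    "k19": [">t1 len=8", "ACGTACGT", ">t2", "GGGG", "CCCC"],
    "k29": [">t1", "acgtacgt", ">t3", "TTTTAAAA"],
}
pooled = pool_assemblies(demo)
for name, seq in pooled:
    print(f">{name}\n{seq}")

print(" ".join(cd_hit_est_command("pooled.fasta", "clustered.fasta", 0.95, 10)))
```

The command is only built here, not executed; you could pass the list to subprocess.run once the pooled FASTA has been written to disk, and repeat it for each identity threshold you want to compare with BUSCO.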
Answered by thomas duge de bernonville on August 22, 2021