Bioinformatics Asked by shadi on January 29, 2021
I manually selected all 4000 sequences across the 20 pages of SARS-CoV-2 sequences on GenBank here.
When I click on Build phylogenetic tree, it only has 200 nodes.
Is this expected? I was expecting to see 4000 nodes.
Is it simply a display issue, where only the 200 sequences shown on the current page of the list are used? Is there a way to get the tree for all 4000 sequences?
Okay, I'll provide a formal answer. The GenBank phylogeny is there to give a gist of the diversity; it isn't a formal phylogenetic analysis. You can download the tree as a PDF, but I don't think you can download the tree file (in fact I'm pretty certain you can't). GenBank will perform a reasonable distance method and, from recollection, it will "collapse clades", so a little triangle on the tree stands for a large number of taxa.
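To make "distance method" concrete, here is a minimal sketch of one such approach: pairwise p-distances followed by UPGMA clustering, written out as a Newick string. This is an illustration only; I don't know which distance method GenBank actually uses, and the function names `p_distance` and `upgma` and the toy sequences are made up for the example. It assumes the sequences are already aligned to the same length.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(seqs):
    """Cluster aligned sequences with UPGMA and return the topology as Newick.

    seqs maps a sequence name to its aligned sequence. Branch lengths are
    left out to keep the sketch short.
    """
    # Each cluster is keyed by its Newick string and stores its leaf count.
    clusters = {name: 1 for name in seqs}
    dist = {
        frozenset((a, b)): p_distance(seqs[a], seqs[b])
        for a, b in combinations(seqs, 2)
    }
    while len(clusters) > 1:
        # Merge the closest pair of clusters.
        a, b = sorted(min(dist, key=dist.get))
        merged = f"({a},{b})"
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # Distance to the merged cluster is the size-weighted average.
        for c in clusters:
            dist[frozenset((merged, c))] = (
                dist[frozenset((a, c))] * size_a
                + dist[frozenset((b, c))] * size_b
            ) / (size_a + size_b)
        # Drop every distance that still refers to the two old clusters.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        clusters[merged] = size_a + size_b
    return next(iter(clusters)) + ";"


# Hypothetical toy alignment, not real SARS-CoV-2 data.
toy = {
    "A": "ACGTACGT",
    "B": "ACGTACGA",
    "C": "TTGTACCA",
    "D": "TTGTACCT",
}
tree = upgma(toy)
print(tree)
```

A distance tree like this is quick to compute, which is presumably why it suits a quick overview, but it is no substitute for a formal analysis.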
I was surprised to learn there are 4000 SARS-CoV-2 sequences.
To make a tree you need to download the data, align it (e.g. with MUSCLE) and run the alignment through a formal phylogeny program. If you think GenBank is okay, then try MEGA X; it's very point-and-click. Generally, maximum likelihood phylogeny has been back in vogue for at least a decade now (previously it was Bayesian). However, the BEAST package, which performs Bayesian analysis, is still going strong and is now the main tool used in molecular dating.
In summary, to perform a tree analysis beyond GenBank: download the sequences, align them, and run the alignment through a phylogeny program.
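The download, align and tree-building steps could be scripted roughly as follows. Treat this as a sketch under assumptions rather than a tested pipeline: it assumes MUSCLE (v3 command-line syntax) and IQ-TREE 2 are installed under the executable names `muscle` and `iqtree2`, and IQ-TREE is just one example of a maximum likelihood program. The helper names `efetch_url`, `build_commands` and `run_pipeline`, the file names, and the two example accessions are only for illustration. The download URL uses the NCBI E-utilities `efetch` endpoint; a long accession list would likely need to be split into batches.

```python
import shutil
import subprocess
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accessions):
    """Build an E-utilities URL that returns the accessions as FASTA text."""
    query = {
        "db": "nuccore",
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(query)}"


def build_commands(fasta, aligned, muscle="muscle", iqtree="iqtree2"):
    """Return the command lines for the alignment and tree steps.

    The flags are assumptions about the installed versions:
    MUSCLE v3 uses -in/-out (v5 uses -align/-output instead).
    """
    return [
        [muscle, "-in", fasta, "-out", aligned],  # multiple sequence alignment
        [iqtree, "-s", aligned],                  # maximum likelihood tree
    ]


def run_pipeline(fasta, aligned):
    """Run each step in order and stop at the first tool that is missing."""
    done = []
    for cmd in build_commands(fasta, aligned):
        if shutil.which(cmd[0]) is None:
            print("not installed, stopping before:", " ".join(cmd))
            break
        subprocess.run(cmd, check=True)
        done.append(cmd)
    return done


# Example accessions for illustration only.
url = efetch_url(["MN908947.3", "NC_045512.2"])
commands = build_commands("sequences.fasta", "aligned.fasta")
```

Calling `run_pipeline("sequences.fasta", "aligned.fasta")` after saving the downloaded FASTA would then run the two tools in turn. With thousands of full genomes, expect the alignment step to be the slow part.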
Answered by M__ on January 29, 2021