Bioinformatics Asked on February 2, 2021
I seem to have landed on the mitomap.org site, but I don’t know what to make of it, what to do with it, or how to get the genomes onto my computer. It sounds like the genomes are stored in GenBank, and that mitomap simply lists which of the GenBank genomes are human mitochondrial haplogroup genomes.
First of all, I can’t find a simple list of links or list of names of the haplogroups the mitomap has… Can you show me where I can get or download that? They say:
This brings our total number of FL sequences to 51,673, and the number of CR sequences to 74,660. Our SNVs now total 19,227.
I don’t know what FL or CR (Control Region sequences?) means, but I am expecting to see lists of genomes or something. What am I to do with this information? Where do I find this data? Is this it? If I click on one of the results, I see this. It has the path nuccore/MT742594.1. Can I find this on some NCBI FTP server somewhere? I don’t see anything with that folder structure on the GenBank FTP server…
Basically my question is:
Hmmm, I am not experienced enough in biology or bioinformatics to truly understand the question, but I am guessing you want to download the FASTA or GenBank file containing the sequence for the accession number you listed above. There are two ways to do this:
https://www.ncbi.nlm.nih.gov/nuccore/MT742594.1
If you are familiar with Biopython and Python programming, you can download multiple files at once in whatever format you want. There is an easy script to do this, taken from Biopython's documentation on biopython.org:
    import os
    from Bio import SeqIO
    from Bio import Entrez

    # Always tell NCBI who you are; replace this placeholder with your own address
    Entrez.email = "your.name@example.com"
    filename = "MT742594.1.gb"

    # Only download if the file is not already on disk
    if not os.path.isfile(filename):
        # Downloading...
        # The id must match the accession the file is named after
        net_handle = Entrez.efetch(
            db="nucleotide", id="MT742594.1", rettype="gb", retmode="text"
        )
        out_handle = open(filename, "w")
        out_handle.write(net_handle.read())
        out_handle.close()
        net_handle.close()
        print("Saved")

    print("Parsing...")
    # SeqIO.read expects exactly one record in the file
    record = SeqIO.read(filename, "gb")
    print(record)
You can make a list of the accession numbers of the sequences you want to download and pass that list as the id argument. If you are curious how the code works, visit biopython.org.
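As a rough sketch of the list idea, assuming Biopython is installed: the helper names chunked and download_genbank, the batch size of 100, and the output file name are made up for illustration and are not part of Biopython. Only Entrez.efetch and its db, id, rettype and retmode arguments come from the script above.

```python
from typing import Iterator, List


def chunked(accessions: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield successive slices of the accession list (hypothetical helper)."""
    for start in range(0, len(accessions), size):
        yield accessions[start:start + size]


def download_genbank(accessions: List[str], email: str, out_path: str,
                     batch_size: int = 100) -> None:
    """Fetch GenBank records for all accessions and append them to one file."""
    # Imported here so that chunked() can be used without Biopython installed.
    from Bio import Entrez

    Entrez.email = email  # NCBI asks every caller to identify themselves
    with open(out_path, "w") as out_handle:
        for chunk in chunked(accessions, batch_size):
            # A comma-separated id string requests several records in one call.
            net_handle = Entrez.efetch(
                db="nucleotide", id=",".join(chunk), rettype="gb", retmode="text"
            )
            out_handle.write(net_handle.read())
            net_handle.close()
```

It could be called as download_genbank(["MT742594.1", "MG762674"], "your.name@example.com", "mito.gb"). Because the resulting file holds more than one record, it would need to be parsed with SeqIO.parse("mito.gb", "gb") rather than SeqIO.read, which accepts only a single record.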
Answered by Neeleshwar Pandey on February 2, 2021