Bioinformatics Asked on September 26, 2021
I need to obtain sample data from modern humans in FASTA format. I only need a few megabytes of data per individual. I currently use a script that downloads the CRAM file from here (ftp.1000genomes.ebi.ac.uk) and then processes it to produce the FASTA file.
The problem is that CRAM files are large, slow to download, and slow to process. It takes days to get the samples.
Is there a better way to get these samples in FASTA format?
The script already uses samtools to retrieve only the part of the BAM file it needs, but that doesn't help much. The CRAM files are still gigabytes in size for the few megabytes of data that I need.
I have the same problem with data from the 1000 Genomes Project.
You can download HGDP data in FASTQ format here: https://www.internationalgenome.org/data-portal/data-collection/hgdp
Correct answer by Dan Bolser on September 26, 2021
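Since the download linked in the answer is FASTQ while the question asks for FASTA, a small conversion step is still needed. The following is a minimal sketch of one way to do it, not part of the original answer: it streams a FASTQ file and stops once roughly the requested number of bases has been written, so only the start of the file has to be read. The function names (`open_maybe_gzip`, `read_fastq`, `fastq_to_fasta`) and the file names in the usage note are hypothetical, and the parser assumes the common four-lines-per-record FASTQ layout with no wrapped sequence lines.

```python
import gzip
from typing import Iterator, TextIO, Tuple


def open_maybe_gzip(path: str) -> TextIO:
    """Open a text file, transparently decompressing it if the name ends in .gz."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (read_name, sequence) pairs.

    Assumption: each record is exactly four lines
    (@name, sequence, '+', quality), which is the usual layout.
    """
    while True:
        header = handle.readline()
        if not header:  # end of file
            return
        seq = handle.readline().rstrip()
        handle.readline()  # '+' separator line, ignored
        handle.readline()  # quality line, ignored for FASTA output
        if not header.startswith("@"):
            raise ValueError(f"unexpected FASTQ header: {header!r}")
        yield header[1:].rstrip(), seq


def fastq_to_fasta(src: TextIO, dst: TextIO, max_bases: int) -> int:
    """Copy reads from src to dst as FASTA until max_bases is reached.

    Stops before the first read that starts at or beyond the limit,
    so the total may overshoot by less than one read length.
    Returns the number of bases written.
    """
    written = 0
    for name, seq in read_fastq(src):
        if written >= max_bases:
            break
        dst.write(f">{name}\n{seq}\n")
        written += len(seq)
    return written
```

For example, with a hypothetical input file `sample.fastq.gz`, opening it with `open_maybe_gzip("sample.fastq.gz")`, opening `sample.fasta` for writing, and calling `fastq_to_fasta(src, dst, max_bases=5_000_000)` would write roughly 5 Mb of reads. Note that the result is unaligned reads rather than a reconstructed genomic region, which may or may not suit the downstream analysis.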