Bioinformatics Asked on March 17, 2021
I came across references to "movie files" from PacBio sequencing in this paper:
https://www.jimmunol.org/content/204/12/3434
Specifically:
Movie files used to generate results presented in this article have been submitted to the National Center for Biotechnology Information BioProject (https://www.ncbi.nlm.nih.gov/bioproject/) under accession number PRJNA389440.
Looking at the SRA entries, it looks like it’s just the raw reads as I expected. But what is meant by "movie files"? (This is probably just my ignorance about PacBio as opposed to Illumina or others but I’ve never seen that term before and can’t find much online.)
[disclaimer: not a PacBio person, I'm just vaguely aware of the technology]
Movie files are the PacBio raw data format: they record the signal intensities observed in each sequencing well over the course of the sequencing run. The movie files can be used to re-call sequences with updated statistical models, which can improve accuracy, or to test hypotheses about how epigenetic modifications (e.g. methylation) alter the rate of nucleotide incorporation.
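To make the "rate of nucleotide incorporation" idea concrete, here is a minimal sketch of comparing inter-pulse durations (the waiting time before each base is incorporated) between a sample and an unmodified control. Everything in it is an assumption for illustration: the function names, the toy timing values, and the threshold of 2.0 are made up, and this is not PacBio's actual file format or analysis pipeline.

```python
from statistics import mean


def ipd_ratios(sample_ipds, control_ipds):
    """Return the per-position ratio of mean inter-pulse duration (sample / control).

    Each argument is a list with one entry per template position; each entry is a
    list of observed inter-pulse durations (in seconds) at that position.
    """
    if len(sample_ipds) != len(control_ipds):
        raise ValueError("sample and control must cover the same positions")
    return [mean(s) / mean(c) for s, c in zip(sample_ipds, control_ipds)]


def flag_slow_positions(ratios, threshold=2.0):
    """Return indices where the polymerase paused noticeably longer than in the control.

    The threshold is a hypothetical cut-off chosen for this toy example.
    """
    return [i for i, r in enumerate(ratios) if r >= threshold]


# Invented toy data: three template positions, three observations each.
sample = [[0.20, 0.25, 0.22], [0.90, 1.10, 1.00], [0.30, 0.28, 0.32]]
control = [[0.21, 0.24, 0.23], [0.30, 0.35, 0.25], [0.29, 0.31, 0.30]]

ratios = ipd_ratios(sample, control)
slow = flag_slow_positions(ratios)
print(ratios, slow)
```

The point is only that this kind of comparison needs the timing information; once the data have been reduced to bases and quality scores, it can no longer be done.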
Illumina has a static analogue of this: an image (or multiple images, depending on chemistry) of the flow cell after each synthesis cycle. Illumina's raw image data consumes tens of terabytes per run and is of minimal value after base calling, so it is almost always discarded.
Anything uploaded to NCBI SRA must contain FASTQ information. While the original uploaded files can be kept (and may include time/signal information), they are converted into a FASTQ-like format for NCBI's internal representation. As such, there's a chance that movie files could be uploaded to SRA, but those files would also need to include the called FASTQ information (e.g. in an HDF5 container). In contrast, ENA does allow uploading raw data files without called information.
Correct answer by gringer on March 17, 2021