Bioinformatics Asked by guan on April 4, 2021
I have some questions about the hisat2 --rna-strandness option and how its output affects downstream analysis. Please see below.
I understand that the --rna-strandness option produces an XS tag that indicates which strand (+ or -) a transcript originates from, for use in downstream transcriptome assembly. I have a paired-end stranded sequencing library that was aligned to the genome using hisat2 without specifying --rna-strandness (in other words, the default unstranded mode was used). The reads were then assigned to genes using htseq-count, this time with “-s reverse” specified because the assay is strand-specific.
Would this handling affect the alignment and counting results, given that hisat2 was run with the default --rna-strandness setting and htseq-count was then run with -s reverse on a strand-specific assay? Since --rna-strandness serves transcriptome assembly through the XS tags it generates, and htseq-count does not use XS tags for counting, I presume there should be no practical impact. Could you shed light on this, in case I have overlooked other aspects of how these tools are used?
To help verify this, I re-aligned and re-counted the reads from 2 samples with --rna-strandness RF switched on in hisat2. The alignment rates and htseq-count summary counters are given below for assessment.
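As a rough sketch of the two runs being compared, the command lines can be assembled as below. All paths and file names (genome_index, the FASTQ names, genes.gtf) are placeholders for illustration, and the SAM input format is an assumption; these are not the exact commands that were run.

```python
import shlex


def build_commands(sample, strandness=None):
    """Return (hisat2, htseq-count) command strings for one paired-end sample.

    All file names are hypothetical placeholders.
    """
    sam = f"{sample}.sam"
    hisat2 = [
        "hisat2", "-x", "genome_index",
        "-1", f"{sample}_R1.fastq.gz",
        "-2", f"{sample}_R2.fastq.gz",
        "-S", sam,
    ]
    # Only add the flag when a strandness is requested; omitting it
    # leaves hisat2 in its default unstranded mode.
    if strandness is not None:
        hisat2 += ["--rna-strandness", strandness]
    # Counting is strand-aware in both pipelines (-s reverse).
    htseq = ["htseq-count", "-f", "sam", "-s", "reverse", sam, "genes.gtf"]
    return shlex.join(hisat2), shlex.join(htseq)


for label, strandness in [("unstranded (default)", None), ("RF", "RF")]:
    align_cmd, count_cmd = build_commands("sample1", strandness)
    print(label)
    print("  " + align_cmd)
    print("  " + count_cmd)
```

The only difference between the two pipelines in this sketch is the presence of --rna-strandness RF in the alignment step; the counting step is identical.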
Overall alignment rate of Sample 1: 94.52% (--rna-strandness RF) vs. 94.12% (default, unstranded)
Overall alignment rate of Sample 2: 94.57% (--rna-strandness RF) vs. 94.15% (default, unstranded)
Feature counts of Sample 1 (hisat2 --rna-strandness RF, then htseq-count -s reverse):
__no_feature 6327294
__ambiguous 2954776
__too_low_aQual 3784481
__not_aligned 688856
__alignment_not_unique 4858182
Feature counts of Sample 1 (hisat2 default unstranded, then htseq-count -s reverse):
__no_feature 6291151
__ambiguous 2911298
__too_low_aQual 4075017
__not_aligned 754400
__alignment_not_unique 16136045
Feature counts of Sample 2 (hisat2 --rna-strandness RF, then htseq-count -s reverse):
__no_feature 5417882
__ambiguous 1708510
__too_low_aQual 3532352
__not_aligned 564596
__alignment_not_unique 2859501
Feature counts of Sample 2 (hisat2 default unstranded, then htseq-count -s reverse):
__no_feature 5359434
__ambiguous 1676091
__too_low_aQual 3813344
__not_aligned 623122
__alignment_not_unique 2891792
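One way to make the comparison concrete is to compute the percent change of each counter between the two alignments. The sketch below uses the numbers exactly as posted above; it assumes the fourth block of counts belongs to Sample 2 (unstranded), which is what its position and values suggest.

```python
# htseq-count special counters copied from the post.
# Assumption: the last block of counts is Sample 2 (unstranded).
counts = {
    "Sample 1": {
        "RF": {
            "__no_feature": 6327294,
            "__ambiguous": 2954776,
            "__too_low_aQual": 3784481,
            "__not_aligned": 688856,
            "__alignment_not_unique": 4858182,
        },
        "unstranded": {
            "__no_feature": 6291151,
            "__ambiguous": 2911298,
            "__too_low_aQual": 4075017,
            "__not_aligned": 754400,
            "__alignment_not_unique": 16136045,
        },
    },
    "Sample 2": {
        "RF": {
            "__no_feature": 5417882,
            "__ambiguous": 1708510,
            "__too_low_aQual": 3532352,
            "__not_aligned": 564596,
            "__alignment_not_unique": 2859501,
        },
        "unstranded": {
            "__no_feature": 5359434,
            "__ambiguous": 1676091,
            "__too_low_aQual": 3813344,
            "__not_aligned": 623122,
            "__alignment_not_unique": 2891792,
        },
    },
}


def relative_change(rf, unstranded):
    """Percent change of each counter going from the RF run to the unstranded run."""
    return {key: 100.0 * (unstranded[key] - rf[key]) / rf[key] for key in rf}


for sample, runs in counts.items():
    print(sample)
    for name, pct in relative_change(runs["RF"], runs["unstranded"]).items():
        print(f"  {name:<24}{pct:+9.2f}%")
```

Any counter whose change stands out from the rest in this listing is worth checking against the original htseq-count output for a transcription error.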
These results look comparable to me across pipelines.
Thanks
Guan
If you reran the command with the correct settings, just leave it at that. (It is not at all clear to me that --rna-strandness RF is correct.)
If you want people to tell you whether you ran the commands correctly, you need to post the commands you used.
Answered by swbarnes2 on April 4, 2021