Bioinformatics Asked on August 22, 2021
I am suddenly running into an error when running DESeqDataSetFromTximport:
txi.rsem <- tximport(files, type = "rsem", txIn = FALSE, txOut = FALSE)
dds <- DESeqDataSetFromTximport(txi = txi.rsem,
                                colData = SampleFile,
                                design = ~ Compound)
using counts and average transcript lengths from tximport
Error in DESeqDataSetFromTximport(txi = txi.rsem, colData = SampleFile, :
all(lengths > 0) is not TRUE
Any suggestions as to what causes this error?
I had this issue as well. It looks like all zero-length genes also have zero expression across all samples. The code below makes sure only to exclude genes that have both zero length and zero expression.
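```r
library(tximport)
library(DESeq2)

txi = tximport(file_paths, type = "rsem", txIn = TRUE, txOut = TRUE)
# Flag rows with zero abundance in every sample and a zero length in at least one sample.
zero_length_and_unexpressed = (apply(txi$abundance, 1, max) == 0) &
  (apply(txi$length, 1, min) == 0)
# Drop the flagged rows from all three matrices so they stay aligned.
txi$length = txi$length[!zero_length_and_unexpressed,]
txi$abundance = txi$abundance[!zero_length_and_unexpressed,]
txi$counts = txi$counts[!zero_length_and_unexpressed,]
dds = DESeqDataSetFromTximport(txi, sampleTable, ~ 1)
```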
Answered by winni2k on August 22, 2021
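The claim that zero-length genes are also unexpressed can be checked before filtering. The following is a minimal sketch on a toy stand-in for the tximport result; the gene names, sample names and values are made up for illustration, and a real `txi` would come from `tximport()`.

```r
# Toy stand-in for a tximport result (hypothetical values, genes in rows,
# samples in columns; matrix() fills column by column).
genes <- c("geneA", "geneB", "geneC", "geneD")
samples <- c("s1", "s2", "s3")
txi <- list(
  abundance = matrix(c(5, 0, 2, 0,
                       6, 0, 0, 0,
                       4, 0, 3, 0),
                     nrow = 4, dimnames = list(genes, samples)),
  length = matrix(c(1000, 0, 800, 0,
                    1000, 0,   0, 0,
                     990, 0, 810, 0),
                  nrow = 4, dimnames = list(genes, samples))
)

# Genes with a zero length in at least one sample.
has_zero_length <- apply(txi$length, 1, min) == 0
# Genes with zero abundance in every sample.
unexpressed <- apply(txi$abundance, 1, max) == 0

# Cross-tabulate the two conditions to see how they overlap.
print(table(has_zero_length, unexpressed))

# Genes that have a zero length somewhere but are still expressed.
# These would not be removed by the filter above.
zero_length_but_expressed <- names(which(has_zero_length & !unexpressed))
print(zero_length_but_expressed)
```

If `zero_length_but_expressed` is not empty on real data, the filter above leaves some zero lengths in place, so the `all(lengths > 0)` check would presumably still fail and those genes would need separate handling.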
I ran into the same error today using this code:
pad.gene.data <-
  tximport(files = file.list.for.gene.analysis,
           type = "rsem",
           txIn = FALSE,
           txOut = FALSE)
dds.gene.level <-
  DESeqDataSetFromTximport(txi = pad.gene.data,
                           colData = pad.characteristic.data,
                           design = ~ condition)
I had done this same differential gene expression analysis, using transcript counts aggregated to the gene level, on a Mac about 5 weeks ago. This past week I switched to a Windows machine and re-installed R 3.6.3 along with whatever the current versions of tximport and DESeq2 are.
I was able to replicate the results I had gotten on the Mac 5 weeks ago by keeping only the rows in which every gene length is non-zero, i.e. removing any gene with a length of 0 in at least one sample:
# Keep only genes whose length is non-zero in every sample.
keep <- apply(pad.gene.data$length, 1, function(row) all(row != 0))
pad.gene.data$abundance <- pad.gene.data$abundance[keep, ]
pad.gene.data$counts <- pad.gene.data$counts[keep, ]
pad.gene.data$length <- pad.gene.data$length[keep, ]
So this first removes any gene with a length of 0 in any sample from the abundance and counts matrices, then removes the same genes from the length matrix so that all three matrices have the same dimensions. DESeqDataSetFromTximport ran without the error after that.
I am skeptical about why this should be necessary: if this is new behavior in a recent version, I did not see it advertised that the data need to be preprocessed beforehand. I hope someone who knows the code better can comment on this.
Answered by TJ Butler on August 22, 2021
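An alternative that is sometimes suggested for RSEM input is to keep every gene and replace the zero lengths with 1 instead of dropping rows. The sketch below shows only the replacement step on a toy list; the matrices and their values are made up, and whether this substitution suits a given analysis is an assumption to check against the tximport and DESeq2 documentation.

```r
# Toy stand-in for a gene-level tximport result (hypothetical values).
gene_ids <- c("g1", "g2", "g3")
sample_ids <- c("s1", "s2")
pad.gene.data <- list(
  counts = matrix(c(10, 0, 7, 12, 0, 0), nrow = 3,
                  dimnames = list(gene_ids, sample_ids)),
  length = matrix(c(1200, 0, 900, 1180, 0, 0), nrow = 3,
                  dimnames = list(gene_ids, sample_ids))
)

# Replace zero lengths with 1 so that every length is positive.
# No rows are dropped, so the matrices keep identical dimensions.
pad.gene.data$length[pad.gene.data$length == 0] <- 1

# The condition named in the error message now holds for the toy data.
stopifnot(all(pad.gene.data$length > 0))
```

Unlike row filtering, this keeps all genes in the object, so downstream results still list the genes that had a zero length.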