Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

why multiple .bam files(>=7) cannot work #411

Open
Yumy526 opened this issue Jan 10, 2024 · 14 comments
Open

why multiple .bam files(>=7) cannot work #411

Yumy526 opened this issue Jan 10, 2024 · 14 comments

Comments

@Yumy526
Copy link

Yumy526 commented Jan 10, 2024

IZ}~6H XI7Y1~0(%$B{1R
I'm putting multiple .bam files as inputs for reads, but I'm getting an error. When I take less than 10 or so files as readings for reads, it runs successfully, don't know why, A little anxious, very eager and looking forward to your reply!

my code:se1 <- bambu(reads = list.files(pattern="GV"), annotations = bambuAnnotations, genome = fa.file)
error:Error in checkInputs(annotations, reads, readClass.outputDir = rcOutDir, :
Reads should either be: a vector of paths to .bam files, a vector of paths to Bambu RCfile .rds files, or a list of loaded Bambu RCfiles

my code:se1 <- bambu(reads = "/public/home/ymwang/PacBio/6_bambu/bam/.bam", annotations = bambuAnnotations, genome = fa.file)
error:Error: BiocParallel errors
1 remote errors, element index: 1
0 unevaluated and other errors
first remote error:
Error in value[3L]: failed to open BamFile: file(s) do not exist:
‘/public/home/ymwang/PacBio/6_bambu/bam/
.bam’

my code:se1 <- bambu(reads = c("GV10O_10.align.bam","GV11O_11.align.bam","GV12O_12.align.bam","GV13O_13.align.bam","GV14O_14.align.bam","GV15O_15.align.bam","GV16O_16.align.bam","GV1O_1.align.bam","GV20O_20.align.bam","GV22O_22.align.bam","GV23O_23.align.bam","GV2O_2.align.bam","GV3O_3.align.bam","GV4O_4.align.bam","GV5O_5.align.bam","GV7O_7.align.bam","GV8O_8.align.bam","GV9O_9.align.bam"), annotations = bambuAnnotations, genome = fa.file)
error:

@andredsim
Copy link
Collaborator

Hi,

For the first code could you show me what list.files(pattern="GV") outputs, is it possible you have another file in your working directory with GV in the name?

The second code is an expected error because I assume there is no file called .bam

The third looks fine to me, but unfortunately the error is missing in your post. Would you be able to add that in so I might see what could be happening?

Kind Regards,
Andre Sim

@Yumy526
Copy link
Author

Yumy526 commented Jan 11, 2024 via email

@andredsim
Copy link
Collaborator

Hi,

Unfortunately I cannot see it. Was it in your last comment or did you send it another way. Is it possible some syntax might be formatting it in a way that makes in invisible on github? Perhaps a screenshot of the error might work?
Below is what I see (the "..." in the final comment makes the email you replied to appear)

image

@Yumy526
Copy link
Author

Yumy526 commented Jan 11, 2024 via email

@andredsim
Copy link
Collaborator

Hi,

I have not seen this issue before so I am not exactly sure how to solve it, however here are somethings to try that might circumvent the problem, or provide me a clue on if/where a bug might be in the code.

  1. add rcOutDir = "/path/somewhere/" to bambu()
    The only difference when running with 10 files in the code is that it outputs to a tmp directory. By setting a manual output directory that might skip that issue
  2. Try a different combination of 11 rc files and see if that works. One issue cause be that 1 of the bam files from the 18 is not compatible with bambu for some reason. Use the 10 that worked + 1 that was missing.
  3. Run each of the bam files individually for read class construction (set quant = FALSE, discovery = FALSE). Then combine them for the subsequent steps.
    i.e (fill in the ... with the rest of the files)
se1 <- bambu(reads = "GV10O_10.align.bam", annotations = bambuAnnotations, genome = fa.file, quant = FALSE, discovery = FALSE)    
se2 <- bambu(reads = "GV11O_11.align.bam"), annotations = bambuAnnotations, genome = fa.file, quant = FALSE, discovery = FALSE)    
...
se_final = bambu(reads = c(se1, se2, se3, ....), annotations = bambuAnnotations, genome = fa.file)

I also notice some strange warnings in the run that did work which might be a hint. Are you expecting a very low number < 50 known transcripts to be in your bam files? Are you able to check that the chromosome names in your gtf file match that of your fasta file, and that you are using the same genome fasta file as the bambu input that use used to align your reads to. A common issue is a gtf file has "Chr1" as a name and the genome is "1".
For the run with <10 bam files that did work. Could you share the output from metadata(se1)$warnings

I hope to solve this for you and future users so please let me know what from the above works and doesn't work.

@Yumy526
Copy link
Author

Yumy526 commented Jan 11, 2024 via email

@Yumy526
Copy link
Author

Yumy526 commented Jan 22, 2024 via email

@andredsim
Copy link
Collaborator

Hi,

All the gene counts being named bambu means it has predicted a lot of novel genes. I believe this is related to the warning I flagged above. As in standard runs of Bambu this warning should not appear.

"I also notice some strange warnings in the run that did work which might be a hint. Are you expecting a very low number < 50 known transcripts to be in your bam files? Are you able to check that the chromosome names in your gtf file match that of your fasta file, and that you are using the same genome fasta file as the bambu input that use used to align your reads to. A common issue is a gtf file has "Chr1" as a name and the genome is "1"."

Double check that the gtf file you are using matches the fasta file as this is the most common cause, and that the fasta file you are using is the exact same file used for the genome (not transcriptome) alignments.

If neither of the above is the case, I will need to look at some of the outputs to try and figure out what is happening.
So that I can be of the most help please include:

  1. The script you used to run bambu including the prepareAnnotations step
  2. The first 10 lines of the gtf file you are using
  3. The warnings and output from one of these lines
    se1 <- bambu(reads = "GV10O_10.align.bam", annotations = bambuAnnotations, genome = fa.file, quant = FALSE, discovery = FALSE, verbose = TRUE)
    print(metadata(se1)$warnings)
  4. Bonus would be if you could attach the output of this line.
    saveRDS(se1.rds, "outputPath/se1.rds")

Kind Regards,
Andre Sim

@Yumy526
Copy link
Author

Yumy526 commented Feb 20, 2024 via email

@andredsim
Copy link
Collaborator

Hi,

The formatting of your last comment is a bit hard for me to parse (attached picture).

From what I gather you are surprised at the results that all the marker genes you are detecting are novel genes. As I mentioned in some of my earlier comments I think there may be something going wrong with you run as there were some concerning warnings messages which is the likely reason only bambu genes are being identified. For me to be able to help you please attach the following.

  1. The script you used to run bambu including the prepareAnnotations step
  2. The first 10 lines of the gtf file you are using
  3. The warnings and output from one of these lines
    se1 <- bambu(reads = "GV10O_10.align.bam", annotations = bambuAnnotations, genome = fa.file, quant = FALSE, discovery = FALSE, verbose = TRUE)
    print(metadata(se1)$warnings)
  4. Bonus would be if you could attach the output of this line.
    saveRDS(se1.rds, "outputPath/se1.rds")

image

Kind Regards,
Andre Sim

@Yumy526
Copy link
Author

Yumy526 commented Mar 3, 2024 via email

@andredsim
Copy link
Collaborator

Hi,
I am not sure I understand your question. Are you able to attach in image or an example of what you mean to better describe it?

@Yumy526
Copy link
Author

Yumy526 commented Mar 12, 2024 via email

@cying111
Copy link
Collaborator

cying111 commented Mar 13, 2024

Hi @Yumy526 ,

你的上一个提问格式有点问题,我们这边看到的全是乱码。
你可能需要重新发一遍

如果不太方便的话,你也可以直接发中文,谢谢!

Ying

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants