non-European populations GC-waves models and ASA chip #97
Comments
1. Yes, you need to compile the GC model file yourself, based on
gc5Base.txt.gz from the PennCNV package (in the lib/ folder).
2. Yes, it can be used for non-European groups; it depends only on the
reference genome.
…On Thu, Oct 27, 2022 at 10:58 PM Pam ***@***.***> wrote:
Hi, Kai
Thank you for your tool. I am trying to apply it in my work.
I have microarray data for about 3000 individuals genotyped on Illumina ASA
(Asian Screening Array) chips. I am trying to correct GC waves in LRR/BAF
using PennCNV.
1. Do I need to compile the GC model file myself based on
gc5Base.txt.gz (hg19)?
2. Can the above file (gc5Base.txt.gz) also be applied to non-European
groups?
I am looking forward to hearing from you soon.
Thank you for your reply. I will apply it to ASA.
I do not have experience with mitochondria. The current gc5Base file cannot
be used for mitochondria because its values are at 5kb resolution. Even if
you want to adjust for GC, you have to compile a GC model yourself using a
custom window, such as the 1kb of sequence surrounding each marker in the
mitochondria.
…On Fri, Oct 28, 2022 at 9:04 AM Pam ***@***.***> wrote:
Hi, Kai
I have another question about "genomic_wave.pl".
Are autosomal and mitochondrial probes corrected for GC waves separately by
"PennCNV"? When I was using "genomic_wave.pl" I found that it has a
parameter "--distance", which sets the minimum marker-marker distance for
training the model (default = 1Mb). However, the mitochondrial chromosome
is only 16569 bp long. So what "--distance" setting is appropriate for
mitochondria?
I am looking forward to hearing from you soon.
Hi, Kai
Thank you for your reply. If I understand correctly, gc5Base.txt.gz cannot be used to compile the mitochondrial GC model with "cal_gc_snp.pl". I would need to retrain a GC model for mitochondria using the mitochondrial reference genome, similar to the reference model (chr 11) in the paper. Is there a specific method for this training? It seems very difficult to do.
I am looking forward to hearing from you soon.
I suggest not doing any adjustment for mitochondria since it is too small.
But if you do want to adjust, you only need to compile a GC model file
that lists the GC content of the region surrounding each marker. There is
nothing to train.
No, you cannot compile the GC model using cal_gc_snp.pl, because it
requires a specific input file that does not include mitochondrial
information. You need to compile the GC model yourself, by writing a script
that calculates the GC content of the window around each marker. Because of
the 16kb size, though, I doubt that GC has much influence on copy number
estimates. So if it is too challenging, you can skip the adjustment and see
how the results look first.
…On Mon, Oct 31, 2022 at 9:34 PM Pam ***@***.***> wrote:
Hi, Kai
Thank you for your advice. I am mainly focused on mitochondrial copy
number. I don't know how much GC affects the mitochondrial copy number
estimates.
To summarize: if I want to perform GC-wave correction, I first compile
the GC model of the mitochondria using "cal_gc_snp.pl" and then correct
GC waves with "genomic_wave.pl", setting "--distance" to 1 kb.
I am looking forward to hearing from you soon.
I meant that you have to write your own script to compile GC statistics,
because the current gc5Base is actually at 5kb resolution, not 5bp
resolution. You can write a script yourself that takes the mitochondrial
sequence and calculates the GC content for each 1kb window. It should also
account for the circular shape, since the current calculation always
assumes a linear genome. (You do not need to train any model; it should
work directly as long as you have the GC content information in the same
gcmodel file.)
Furthermore, chrM is not necessarily the same mitochondrial sequence that
you are using. I explained this in question 46 of
https://annovar.openbioinformatics.org/en/latest/misc/faq/. So it is best
to use the exact mitochondrial reference sequence on which your SNP array
is based.
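The script described above could be sketched roughly as follows in Python. This is only a minimal sketch under stated assumptions, not part of PennCNV: the function names `gc_percent_circular` and `gcmodel_lines`, the marker names, the `MT` chromosome label, and the toy sequence are all hypothetical. It assumes that GC is reported as a percentage and that the output uses the Name, Chr, Position, GC columns mentioned elsewhere in this thread.

```python
from typing import Iterable, Iterator, Tuple


def gc_percent_circular(seq: str, pos: int, window: int = 1000) -> float:
    """GC percentage of a window around the 1-based position `pos`.

    The sequence is treated as circular, so a window that runs past either
    end wraps around instead of being truncated.
    """
    n = len(seq)
    if not 1 <= pos <= n:
        raise ValueError(f"position {pos} is outside the sequence (1..{n})")
    half = window // 2
    start = pos - 1 - half  # 0-based window start; may be negative
    size = min(window, n)   # never count a base twice on a tiny genome
    bases = [seq[(start + i) % n].upper() for i in range(size)]
    called = [b for b in bases if b in "ACGT"]  # ignore N and other codes
    if not called:
        return 0.0
    return 100.0 * sum(b in "GC" for b in called) / len(called)


def gcmodel_lines(
    seq: str,
    markers: Iterable[Tuple[str, int]],
    chrom: str = "MT",  # assumption: must match the label in your signal files
    window: int = 1000,
) -> Iterator[str]:
    """Yield tab-separated Name/Chr/Position/GC lines for each marker."""
    yield "Name\tChr\tPosition\tGC"
    for name, pos in markers:
        gc = gc_percent_circular(seq, pos, window)
        yield f"{name}\t{chrom}\t{pos}\t{gc:.4f}"


if __name__ == "__main__":
    toy_seq = "GGGGAAAAAA"  # stand-in for the real mitochondrial reference
    for line in gcmodel_lines(toy_seq, [("mt_marker_1", 1)], window=4):
        print(line)
```

In real use, `toy_seq` would be replaced by the mitochondrial reference sequence that the array probes were designed against, and the marker list by the mitochondrial probes and positions from the array manifest.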
…On Wed, Nov 2, 2022 at 4:57 AM Pam ***@***.***> wrote:
Hi, Kai
Thank you for your reply. The main goal of my work is to perform GC-wave
correction of the LRR of all SNPs in the array and to estimate
mitochondrial copy number using the corrected LRR.
Why can't I compile the GC model? I found that the "gc5Base" file
downloaded from the PennCNV GitHub repository contains mitochondrial
information (chrM in hg19.gc5Base.txt). The mitochondrial records are as
follows (4 rows in total, that is, four ~5kb fragments):
585 chrM 0 5120 chrM.566093 5 1024 579463291 /gbdb/hg19/wib/gc5Base.wib 0 100 1024 45520 2531200
585 chrM 5120 10240 chrM.566094 5 1024 579464315 /gbdb/hg19/wib/gc5Base.wib 0 100 1024 45900 2605200
585 chrM 10240 15360 chrM.566095 5 1024 579465339 /gbdb/hg19/wib/gc5Base.wib 0 100 1024 45160 2511200
585 chrM 15360 16570 chrM.566096 5 242 579466363 /gbdb/hg19/wib/gc5Base.wib 0 100 242 10840 603200
So I can use this file to compile the GC model for autosomes and
mitochondria using "cal_gc_snp.pl".
Considering that mitochondria are smaller, it should also be possible to
train autosomes (1Mb) and mitochondria (1kb) separately.
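To see why these four rows give only a coarse GC signal, they can be read as block summaries. The sketch below assumes the rows follow the UCSC wiggle-table column order (bin, chrom, chromStart, chromEnd, name, span, count, offset, file, lowerLimit, dataRange, validCount, sumData, sumSquares); that layout is an assumption and is not stated in this thread. Under that assumption, each row carries a single mean GC value for a block of up to 5120 bp. The helper name `block_mean_gc` is hypothetical.

```python
ROWS = """\
585 chrM 0 5120 chrM.566093 5 1024 579463291 /gbdb/hg19/wib/gc5Base.wib 0 100 1024 45520 2531200
585 chrM 5120 10240 chrM.566094 5 1024 579464315 /gbdb/hg19/wib/gc5Base.wib 0 100 1024 45900 2605200
585 chrM 10240 15360 chrM.566095 5 1024 579465339 /gbdb/hg19/wib/gc5Base.wib 0 100 1024 45160 2511200
585 chrM 15360 16570 chrM.566096 5 242 579466363 /gbdb/hg19/wib/gc5Base.wib 0 100 242 10840 603200
"""


def block_mean_gc(row: str):
    """Return (chrom, start, end, mean GC) for one whitespace-separated row.

    Assumed columns: index 11 is validCount and index 12 is sumData, so the
    block mean is sumData / validCount.
    """
    fields = row.split()
    chrom, start, end = fields[1], int(fields[2]), int(fields[3])
    valid_count, sum_data = int(fields[11]), float(fields[12])
    return chrom, start, end, sum_data / valid_count


blocks = [block_mean_gc(r) for r in ROWS.splitlines() if r.strip()]
for chrom, start, end, mean_gc in blocks:
    print(f"{chrom}:{start}-{end}\tlength={end - start}\tmean GC={mean_gc:.2f}")
```

With one value per block, every marker inside the same block would receive the same GC estimate, which is why a per-marker window computed from the sequence itself is suggested instead.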
Hi, Kai
Name Chr Position GC
Finally, I merge the two files into one and use "genomic_wave.pl" to correct the GC waves in the LRR. The "source" column of my ASA array's probe file lists different sources, such as dbSNP, rCRS, and NCBI. I think the source of the probes does not matter, so I will just pick rCRS as the reference.
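The merge step mentioned here could look something like the sketch below. It is an illustration only: the function name `merge_gcmodels` and the marker names are hypothetical, and it assumes both files are tab-separated with the same Name, Chr, Position, GC header line. It keeps a single header and refuses to write a marker twice.

```python
def merge_gcmodels(*tables: str) -> str:
    """Concatenate GC model tables given as text, keeping one header line.

    Raises ValueError when the headers differ or a marker name repeats, since
    either case suggests the inputs were not built the same way.
    """
    header = None
    seen = set()
    body = []
    for text in tables:
        lines = [ln for ln in text.splitlines() if ln.strip()]
        if not lines:
            continue
        if header is None:
            header = lines[0]
        elif lines[0] != header:
            raise ValueError(f"header mismatch: {lines[0]!r} vs {header!r}")
        for ln in lines[1:]:
            name = ln.split("\t", 1)[0]
            if name in seen:
                raise ValueError(f"duplicate marker name: {name}")
            seen.add(name)
            body.append(ln)
    if header is None:
        raise ValueError("no input tables contained any lines")
    return "\n".join([header, *body]) + "\n"


if __name__ == "__main__":
    # Toy tables with made-up marker names and values.
    autosomal = "Name\tChr\tPosition\tGC\nsnp_a\t1\t1000\t41.5000\n"
    mito = "Name\tChr\tPosition\tGC\nmt_a\tMT\t73\t44.2000\n"
    print(merge_gcmodels(autosomal, mito), end="")
```

Reading the two files with `open(path).read()` and writing the returned text to a new file would give the single merged GC model file.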