earlier CCTCCTT prevents finding leader/D1 #25
Comments
@phycologia what are the flanking regions for this one? I find GACAA as a start (2 of them), and the matching end I have for that is '[AT]TGTC', which means either ATGTC or TTGTC. There is an ATGTC, but it is 500 bases downstream, so I don't think that's it. Am I missing an end pattern for that?
@nlabrad D1 begins with GACCT and ends with AGGTC.
Aha, I see it. And now I see what you mean. Maybe if more than one ITS starting pattern is found, we do something. Maybe if there is more than one and d1d1 is not found, we try again. I'm not sure yet.
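The "try again" idea could be sketched roughly as below. This is only an illustration, not the project's actual code: the function name `find_d1_with_retry`, the `max_leader` cutoff, and the 200-base cap on D1 length are all assumptions made up for the example. It walks through every CCTCCT[TA] hit in order and accepts the first one that has a GACCT...AGGTC stretch starting close behind it.

```python
import re

# Patterns taken from this thread; everything else here is hypothetical.
ITS_START = re.compile(r"CCTCCT[TA]")
# D1 begins with GACCT and ends with AGGTC; the 200-base cap is an assumption.
D1 = re.compile(r"GACCT[ACGT]{0,200}?AGGTC")


def find_d1_with_retry(seq, max_leader=20):
    """Try each ITS start candidate in turn until one yields a nearby D1.

    Returns (its_start, d1_start, d1_end) or None. max_leader is an assumed
    upper bound on the distance between the ITS start motif and D1.
    """
    seq = seq.upper()
    for cand in ITS_START.finditer(seq):
        d1 = D1.search(seq, cand.end())
        # Reject this candidate if D1 is missing or too far downstream.
        if d1 and d1.start() - cand.end() <= max_leader:
            return cand.start(), d1.start(), d1.end()
    return None
```

With a sequence that has an early stray CCTCCTT far upstream of the real one, the first candidate is skipped because D1 is too far away, and the second candidate is returned.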
@nlabrad @callmcgovern have we ever come across a sequence that doesn't have "AGGGA" right at the beginning of the ITS region? If that's always present, then what if the code searched for "CCTCCT[TA]" followed by "(however you would code '2 or 3 variable bases')" followed by "AGGGA"? What's the range of leader sequence lengths we've seen? I think I've found only 7 and 8, but if it's sometimes longer then I guess it would need a wider range for the number of bases between CCTCCT[TA] and AGGGA.
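For reference, "2 or 3 variable bases" can be written in a regular expression as `[ACGT]{2,3}`. A minimal Python sketch of the combined search follows; the function name `find_its_start` and the `min_gap`/`max_gap` parameters are hypothetical and only there so the range can be widened if longer leaders turn up.

```python
import re


def find_its_start(seq, min_gap=2, max_gap=3):
    """Return the index just past the first CCTCCT[TA] that is followed by
    min_gap..max_gap bases and then AGGGA, or None if there is no such site.
    """
    # The lookahead checks for the gap and AGGGA without consuming them,
    # so the match ends right after CCTCCT[TA].
    pattern = re.compile(
        r"CCTCCT[TA](?=[ACGT]{%d,%d}AGGGA)" % (min_gap, max_gap)
    )
    m = pattern.search(seq.upper())
    return m.end() if m else None
```

An earlier CCTCCTT that is not followed by AGGGA within the allowed gap is simply skipped, which is the case described in this issue. Note that `[ACGT]` will not match ambiguity codes such as N; that would need a wider character class.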
The code worked and found boxb, but it couldn't find the leader or D1. D1 has typical start and end sequences, so I think the cause may be an earlier "CCTCCTT" that occurs in the sequence.
Accession number: MT135015.1