Also, I think it's pretty instructive to look at how the well-thought-out covidcg.org does things. It provides frequencies of clades, lineages and AAs across the entire genome (one at a time). It's fully expressive in terms of filtering by geography, but this means a very long list of check boxes for different regions, countries and admin divisions. I like how you can get to exactly the view you're interested in. However, it's too much clicking much of the time.
I really like the general direction, and having the prototype is super helpful. My main immediate use case and interest is in situations like B.1.526, a.k.a. the "New York variant", which is of interest due to its constellation of spike mutations alongside its recent rise in New York.
At the moment, there are 748 B.1.526 viruses in GISAID, but if we look at this in current Nextstrain we have just 2 in the North American build and just 37 in the SPHERES New York build. This makes accurate estimation of frequencies quite difficult.
So, in the "nextfrequencies" case, I'd want a JSON with enough granularity to filter to country USA or division New York and look at the frequency of B.1.526 (or the frequency of 253G+484K).
I would first think this could be accomplished by adding "region", "country" and "division" columns to the list of "traits" in the `data/frequencies.json` file and then exposing the ability in the app to both (1) "group by" and (2) "filter by" elements in "traits".

In this case you'd hope that the resulting JSON wouldn't be too bloated by splitting "haplotypes" based on geography. However, this seems doubtful as we have 1411 divisions currently categorized and you could easily imagine a >100X increase in JSON size from incorporating division.
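To make the "filter by" and "group by" idea concrete, here is a minimal sketch. The record layout (a `traits` dict plus per-timepoint `counts`), the helper names and all of the numbers are assumptions made up for illustration; the actual `data/frequencies.json` schema may look quite different.

```python
from collections import defaultdict

# Hypothetical records: one row per haplotype/trait combination, with
# made-up sample counts at three time points. Not real data.
records = [
    {"traits": {"lineage": "B.1.526", "country": "USA", "division": "New York"}, "counts": [1, 3, 6]},
    {"traits": {"lineage": "B.1.1.7", "country": "USA", "division": "New York"}, "counts": [1, 2, 4]},
    {"traits": {"lineage": "B.1.1.7", "country": "USA", "division": "Florida"}, "counts": [2, 5, 10]},
    {"traits": {"lineage": "B.1.1.7", "country": "United Kingdom", "division": "England"}, "counts": [8, 9, 10]},
]


def filter_by(records, **wanted):
    """Keep only records whose traits match every requested key/value pair."""
    return [r for r in records if all(r["traits"].get(k) == v for k, v in wanted.items())]


def group_frequencies(records, trait):
    """Sum counts per value of `trait`, then normalize each time point to a frequency."""
    if not records:
        return {}
    n = len(records[0]["counts"])
    grouped = defaultdict(lambda: [0] * n)
    for r in records:
        for i, count in enumerate(r["counts"]):
            grouped[r["traits"][trait]][i] += count
    totals = [sum(series[i] for series in grouped.values()) for i in range(n)]
    return {
        key: [c / t if t else 0.0 for c, t in zip(series, totals)]
        for key, series in grouped.items()
    }


# Filter to division New York, then group by lineage.
new_york = group_frequencies(filter_by(records, division="New York"), "lineage")
# Same grouping, filtered to country USA instead.
usa = group_frequencies(filter_by(records, country="USA"), "lineage")
```

Note that every extra trait multiplies the number of rows that have to be stored, which is the bloat concern raised above.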
Thus, it seems necessary to pre-build a series of JSONs filtered to various geographies. This would be quick, and it wouldn't be difficult to serve a number of different JSON files, one per geography. Then we just need an interface to select the JSON file of interest.
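A rough sketch of the pre-build step, under stated assumptions: the record layout, the `prebuild` helper and the `frequencies_<trait>_<value>.json` naming scheme are all hypothetical and are not the file names from the original proposal.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical records with made-up counts; the real schema may differ.
records = [
    {"traits": {"lineage": "B.1.526", "division": "New York"}, "counts": [1, 3, 6]},
    {"traits": {"lineage": "B.1.1.7", "division": "New York"}, "counts": [1, 2, 4]},
    {"traits": {"lineage": "B.1.1.7", "division": "Florida"}, "counts": [2, 5, 10]},
    {"traits": {"lineage": "B.1.1.7", "division": "England"}, "counts": [8, 9, 10]},
]


def prebuild(records, trait, out_dir):
    """Write one JSON file per distinct value of `trait`; return {value: path}."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    by_geography = {}
    for r in records:
        by_geography.setdefault(r["traits"][trait], []).append(r)
    paths = {}
    for geography, subset in by_geography.items():
        slug = geography.lower().replace(" ", "-")
        # Hypothetical naming scheme, chosen only for this sketch.
        path = out_dir / f"frequencies_{trait}_{slug}.json"
        path.write_text(json.dumps({"filter": {trait: geography}, "records": subset}))
        paths[geography] = path
    return paths


with tempfile.TemporaryDirectory() as tmp:
    paths = prebuild(records, "division", tmp)
    written_names = sorted(p.name for p in paths.values())
    new_york = json.loads(paths["New York"].read_text())
```

Each file then stays small because it only carries the haplotypes seen in its own geography, and the selection interface reduces to picking a file name.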
And as discussed you could imagine an interface to compare multiple JSON files, which could "color by" the same type across multiple frequency panels. This could expose things like B.1.1.7 frequencies across multiple countries or could compare clade frequency predictions across different models. This approach is nice in that you can treat geographic "filtering" in the same fashion as different prediction models.
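As a sketch of that comparison idea: if each pre-built file is loaded into the same shape, a geography and a prediction model can be handled by the same code. The dataset layout (`pivots` plus a `frequencies` map), the dataset names and all of the numbers below are invented for illustration.

```python
# Hypothetical datasets keyed by panel name. Two are geographies and one is
# a prediction model; all values are made up.
datasets = {
    "USA": {"pivots": [2021.0, 2021.1, 2021.2], "frequencies": {"B.1.1.7": [0.05, 0.15, 0.40]}},
    "United Kingdom": {"pivots": [2021.0, 2021.1, 2021.2], "frequencies": {"B.1.1.7": [0.70, 0.90, 0.98]}},
    "model-a": {"pivots": [2021.0, 2021.1, 2021.2], "frequencies": {"B.1.526": [0.01, 0.02, 0.04]}},
}


def series_for(dataset, value):
    """Return the frequency series for `value`, or zeros if the dataset lacks it."""
    return dataset["frequencies"].get(value, [0.0] * len(dataset["pivots"]))


def compare(datasets, value):
    """Build one series per panel so the same value can be colored across panels."""
    return {name: series_for(dataset, value) for name, dataset in datasets.items()}


panels = compare(datasets, "B.1.1.7")
```

Because `compare` only looks at the panel name and the series, it does not need to know whether a panel came from a geographic filter or from a model.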
Does this seem like a reasonable approach?