Unexpected sampling behaviour #242

Open
smithara opened this issue Sep 17, 2024 · 2 comments

@smithara
Member

When setting a custom sampling step, the sampling "resets" at the beginning of the next day (file?).

Demonstrated here with MAGx_LR, but first observed by Alexander Grayver using the AUX_OBS products.

from viresclient import SwarmRequest

request = SwarmRequest()
request.set_collection("SW_OPER_MAGA_LR_1B", verbose=False)
request.set_products(
    sampling_step="PT10M"
)
data = request.get_between("2024-04-15T23:41:00", "2024-04-16T00:15:00", asynchronous=False, show_progress=False)
df = data.as_dataframe()
                         Radius   Longitude   Latitude Spacecraft
Timestamp                                                        
2024-04-15 23:41:00  6852707.30  -34.592644   4.318464          A
2024-04-15 23:51:00  6857419.52  -35.118510 -33.871531          A
2024-04-16 00:00:00  6858899.52  -32.532711 -68.117402          A
2024-04-16 00:10:00  6858813.01  129.400782 -73.372255          A

The expected behaviour is that the sampling rate continues uniformly, i.e. 23:51, 00:01, 00:11, ...

@pacesm
Contributor

pacesm commented Sep 17, 2024

The sampling is currently implemented so that each product is sampled independently. That is, the new daily product starts sampling at 00:00 rather than 00:01, which produces the discontinuity you observe.

This is not technically a bug, but I understand that it is not what users expect. If it is an issue, I could find a way to preserve the sampling across product boundaries (i.e., carry a time offset from one product to the next).
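
To make the difference concrete, here is a toy model of the two behaviours, assuming regularly sampled data split into daily products. The function `sample_times` and its `carry_offset` flag are hypothetical names for illustration only; this is not the server's actual implementation.

```python
from datetime import datetime, time, timedelta

def sample_times(start, end, step, carry_offset):
    """Toy model: sample times in [start, end) for data stored in daily products.

    carry_offset=False mimics independent per-product sampling (restart at 00:00).
    carry_offset=True keeps the cadence running across the day boundary.
    """
    times = []
    t = start
    while t < end:
        times.append(t)
        nxt = t + step
        if not carry_offset and nxt.date() != t.date():
            # A new daily product begins: sampling restarts at its first record.
            nxt = datetime.combine(nxt.date(), time.min)
        t = nxt
    return times

start = datetime(2024, 4, 15, 23, 41)
end = datetime(2024, 4, 16, 0, 15)
step = timedelta(minutes=10)

for carry in (False, True):
    stamps = [t.strftime("%H:%M") for t in sample_times(start, end, step, carry)]
    print(carry, stamps)
# False ['23:41', '23:51', '00:00', '00:10']
# True ['23:41', '23:51', '00:01', '00:11']
```

The first line reproduces the timestamps reported above; the second is the uniform cadence that carrying the offset would give.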

pacesm added the defect label Sep 17, 2024
@smithara
Member Author

I agree, not exactly a bug, and I think I understand the reasoning here. It might instead be addressed in the documentation, by giving users recipes to get what they want.
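
One possible recipe, as a rough sketch: fetch the data at full resolution (i.e. without `sampling_step`) and subsample client-side with pandas. The helper `pick_on_uniform_grid` is a hypothetical name, and the synthetic 1 Hz frame below only stands in for the result of `data.as_dataframe()`.

```python
import pandas as pd

def pick_on_uniform_grid(df, step="10min"):
    """Keep only rows whose timestamps lie on a uniform grid anchored at the first row."""
    grid = pd.date_range(start=df.index[0], end=df.index[-1], freq=step)
    return df.loc[df.index.isin(grid)]

# Synthetic 1 Hz stand-in for a full-resolution dataframe spanning midnight
index = pd.date_range(
    "2024-04-15 23:41:00", "2024-04-16 00:15:00", freq=pd.Timedelta(seconds=1)
)
full = pd.DataFrame({"value": range(len(index))}, index=index)

print(pick_on_uniform_grid(full).index.strftime("%H:%M").tolist())
# ['23:41', '23:51', '00:01', '00:11']
```

The cost is transferring the full-resolution data, and rows are only kept where a timestamp matches the grid exactly, so this suits regularly sampled products.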

If the process were changed, I'm not sure how it would affect non-uniformly sampled datasets.

I wonder if it could be approached with a new, alternative process that instead picks data samples at specific times set by the chosen cadence. The number of sample points returned should match the chosen cadence, with gaps filled with NaN, and maybe with user-configurable matching behaviour (exact-only/nearest/pick-last/pick-next).
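
As a client-side illustration of what such a process might look like, the sketch below reindexes a dataframe onto a fixed time grid with pandas. The function `sample_on_grid` and the mode names are hypothetical, mapped here onto the `method` options of `DataFrame.reindex`; nothing like this exists on the server as far as this thread says.

```python
import pandas as pd

# Hypothetical mode names from the comment above, mapped onto pandas reindex methods
_METHODS = {"exact": None, "nearest": "nearest", "pick-last": "ffill", "pick-next": "bfill"}

def sample_on_grid(df, start, end, step, match="exact", tolerance=None):
    """Return one row per grid time; grid times without a matching sample become NaN."""
    grid = pd.date_range(start=start, end=end, freq=step)
    method = _METHODS[match]
    if method is None:
        # Exact matching: pandas does not accept a tolerance without a method
        return df.reindex(grid)
    return df.reindex(grid, method=method, tolerance=tolerance)

# Non-uniformly sampled toy data: nothing exactly at 00:01:00
index = pd.to_datetime([
    "2024-04-15 23:41:00", "2024-04-15 23:51:00",
    "2024-04-16 00:00:30", "2024-04-16 00:11:00",
])
df = pd.DataFrame({"value": [0, 1, 2, 3]}, index=index)

for mode in _METHODS:
    out = sample_on_grid(
        df, "2024-04-15 23:41:00", "2024-04-16 00:11:00", "10min",
        match=mode, tolerance=pd.Timedelta(minutes=1),
    )
    print(mode, out["value"].tolist())
```

In this toy case every mode returns four rows. The 00:01 slot is NaN for exact and pick-next (the next sample is ten minutes away, outside the one-minute tolerance), and takes the 00:00:30 sample for nearest and pick-last.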
