Process trace events into intermediate storage format #35
Comments
@mjcarroll I would just like to clarify a few things:
So I'm guessing you're talking about an alternative to step 2.ii, and not about storing the events themselves? Then, in parallel, we can change steps 1.i/2.i, which is more closely related to #22.
Yes, I'm mostly talking about an alternative to 2.ii in this outline. Depending on the outcome of #22, it may be possible to collapse 2.i and 2.ii into a single step. For example, if
Coming at this from the usage end, we have two different kinds of information in the CTF: meta-data and activity data.
Meta-data is emitted first, but because of things like the life-cycle, system modes, and more complex launch scenarios, the entire trace file has to be scanned to be sure to get everything. We usually need all meta-data for later association. For reasons of efficiency and storage size, I am assuming that we want to store meta-data separately during later stages as well, but note that we never measured the advantage of this, and because of things like category tables etc., merged storage might actually be comparable.
In contrast, for activity data, it is often sufficient, and quite often very useful, to process just parts of it, usually temporal chunks. For example, for performance analysis, we usually need to differentiate at least between the system starting up, being idle, being active, and shutting down. Many systems also frequently switch between active and idle. Last but not least, memory-wise it can be necessary to load data partially.
I think it doesn't matter very much in practice whether we store data after it has been converted into a pandas dataframe or before, assuming that we're using one of several data storage formats which can be easily written from and loaded into pandas dataframes (like those from Apache Arrow).
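To make the split described above concrete, here is a minimal sketch, assuming the events have already been decoded into flat records. The event names, the `handle` column, and the `META_EVENTS` set are hypothetical and chosen only for illustration; they are not the actual CTF or tracetools_read layout.

```python
import pandas as pd

# Hypothetical set of event names that carry meta-data (an assumption).
META_EVENTS = {"node_init", "callback_registered"}

# Hypothetical decoded events; timestamps are plain integers for simplicity.
events = pd.DataFrame([
    {"name": "node_init", "timestamp": 10, "handle": 1},
    {"name": "callback_start", "timestamp": 100, "handle": 7},
    {"name": "callback_end", "timestamp": 150, "handle": 7},
    # Meta-data can show up late, so the whole trace has to be scanned.
    {"name": "callback_registered", "timestamp": 900, "handle": 8},
    {"name": "callback_start", "timestamp": 1_000, "handle": 8},
    {"name": "callback_end", "timestamp": 1_100, "handle": 8},
])


def split_events(df):
    """Return (meta-data, activity) frames from one full pass over the events."""
    is_meta = df["name"].isin(META_EVENTS)
    return df[is_meta].reset_index(drop=True), df[~is_meta].reset_index(drop=True)


def time_chunk(activity, start, end):
    """Select the activity events with start <= timestamp < end."""
    in_window = (activity["timestamp"] >= start) & (activity["timestamp"] < end)
    return activity[in_window]


metadata, activity = split_events(events)
startup = time_chunk(activity, 0, 500)  # e.g. only the start-up phase
```

Either frame could then be written with `DataFrame.to_parquet` and read back with `pandas.read_parquet`, both of which need an Arrow-backed engine such as pyarrow to be installed. Whether separate storage actually pays off is, as noted above, not something that has been measured.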
I was hoping, but could not find evidence, that |
As first discussed in safe-ros/ros2_profiling#1
The idea would be to read raw CTF traces into some intermediate time-series data that is well suited for analysis tasks. Further high-level APIs could be built to ingest the intermediate data.
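As a rough illustration of what such an intermediate form could look like, the sketch below turns already-decoded events into a time-ordered pandas table and builds one small high-level helper on top of it. The record layout, the field names, and both function names are assumptions made for this example, not an existing API.

```python
import pandas as pd

# Hypothetical decoded trace events (field names are assumptions);
# timestamps are integer nanoseconds and arrive out of order.
raw_events = [
    {"name": "callback_start", "timestamp": 2_000, "callback": 1},
    {"name": "callback_end", "timestamp": 2_500, "callback": 1},
    {"name": "callback_start", "timestamp": 1_000, "callback": 2},
    {"name": "callback_end", "timestamp": 1_200, "callback": 2},
]


def to_time_series(events):
    """Build a table indexed and sorted by timestamp."""
    return pd.DataFrame(events).set_index("timestamp").sort_index()


def callback_durations(ts):
    """Pair start/end events per callback and compute durations in ns.

    Assumes exactly one start and one end event per callback id.
    """
    starts = ts[ts["name"] == "callback_start"].reset_index()
    ends = ts[ts["name"] == "callback_end"].reset_index()
    merged = starts.merge(ends, on="callback", suffixes=("_start", "_end"))
    merged["duration"] = merged["timestamp_end"] - merged["timestamp_start"]
    return merged.set_index("callback")["duration"]


intermediate = to_time_series(raw_events)
durations = callback_durations(intermediate)
```

The point of the sketch is only the layering: analysis code such as `callback_durations` reads the intermediate table and never touches the raw trace.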
General requirements:
tracetools_read is currently using pandas DataFrames
Proposed alternatives:
CC: @iluetkeb