[Question] Stats #41

Open
hlitz opened this issue Jul 16, 2020 · 5 comments
@hlitz
Contributor

hlitz commented Jul 16, 2020

Is there a way to print out the stats in a machine-readable format?
Do you have stats parsers for the output files that you could share?

Also, it seems as if rows 2 and 3 in the stats files are duplicated as rows 4 and 5. What is the reason behind this?

thanks, Heiner

@hlitz hlitz added the question Further information is requested label Jul 16, 2020
@spruett
Collaborator

spruett commented Jul 16, 2020

Hey Heiner,

Printing stats:
Scarab can print stats in one of two ways: 1) Scarab dumps all stats at the end of simulation. These are the *.stat.out files that appear in your run directory. 2) Scarab can be configured to periodically dump a stat (e.g., dump cycle_count every x instructions).

We are working on a stats parser for batch runs, i.e., one that rolls up results across multiple SimPoints or reports a given stat for each benchmark in a suite. Is this the functionality you are looking for? We do not have a script available yet, but it is actively being worked on and should be ready relatively soon (a month, give or take).

As for printing the stats in a machine-readable format (I assume you mean something like JSON), we have discussed this, but there are currently no plans to do it.

Duplicate rows (columns?)
Are you referring to the multiple columns in the stats file? If so, there is a subtle difference between columns 2/3 vs 4/5.

Columns 2/3 are affected when the stats are reset. For example, you may choose to simulate a warmup period and reset the stats before you begin full simulation. Once full simulation ends, columns 2/3 will reflect the stat values from full simulation only, whereas columns 4/5 will represent the complete stats, including warmup. Resetting the stats does not reset columns 4/5.
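
The reset behaviour described above can be illustrated with a minimal Python sketch. This is not Scarab code: the `Stat` class, its field names, and the numbers are hypothetical, chosen only to mirror the idea of one counter that is cleared on reset (like columns 2/3) next to one that is never cleared (like columns 4/5).

```python
from dataclasses import dataclass


@dataclass
class Stat:
    """Hypothetical stat holding a resettable count and a cumulative count."""

    name: str
    count: int = 0        # analogous to columns 2/3: cleared by reset()
    total_count: int = 0  # analogous to columns 4/5: never cleared

    def inc(self, n: int = 1) -> None:
        # Every event is added to both counters.
        self.count += n
        self.total_count += n

    def reset(self) -> None:
        # Only the resettable counter is cleared.
        self.count = 0


cycles = Stat("cycle_count")
cycles.inc(1_000)  # warmup period
cycles.reset()     # stats reset before full simulation
cycles.inc(5_000)  # full simulation

print(cycles.count, cycles.total_count)  # prints: 5000 6000
```

With no reset during the run, the two counters would hold the same value, which is why the column pairs can look like duplicates.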

Does that make sense?
Stephen

@hlitz
Contributor Author

hlitz commented Jul 17, 2020

Hi Stephen,
thank you!
By machine-readable, I mean a format that can easily be plotted with, e.g., matplotlib. If the stats were in CSV format, it would be trivial to read them into a Python script for plotting. In their current form, a more advanced parser is needed to extract the right values from the output files.
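
To make the CSV point concrete, here is a rough sketch using only the Python standard library. The CSV layout (a `stat,count` header), the `inst_count` name, and the values are assumptions for illustration; Scarab does not emit this format according to this thread.

```python
import csv
import io

# Hypothetical CSV export of a stats file (not a format Scarab produces).
csv_text = """stat,count
cycle_count,6000
inst_count,4500
"""


def load_stats(fileobj):
    """Return {stat name: integer value} from a two-column CSV."""
    return {row["stat"]: int(row["count"]) for row in csv.DictReader(fileobj)}


stats = load_stats(io.StringIO(csv_text))

# Derived metrics become one-liners once the values are in a dict.
ipc = stats["inst_count"] / stats["cycle_count"]
print(stats, ipc)
```

The resulting dict could be handed directly to a plotting call such as matplotlib's `plt.bar(list(stats), list(stats.values()))`.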

Duplicate rows: yes, I meant columns. Thanks!

@spruett
Collaborator

spruett commented Jul 17, 2020

Yeah, I see. The script we are working on will solve your problem. It is a Python script that parses the stats files and can either be imported as a library (for use with matplotlib) or run as a standalone script. The only trouble is that it has not been finished yet. It is being developed, along with a few other script updates, on the "scarab_batch_update" branch of this repository.

That being said, there is a current (not working) version of the script in bin/scarab_globals/scarab_stats.py, which you can look at if you like. I will leave this question open for now, and update it when the working script has been pushed.

@spruett spruett self-assigned this Aug 12, 2020
@pranav-vempati

pranav-vempati commented May 3, 2023

@spruett Do you happen to know whether this script (bin/scarab_globals/scarab_stats.py) is functional, or whether the aforementioned issues still persist?

@spruett
Collaborator

spruett commented May 12, 2023

Yes, these scripts are functional. I’ve been meaning to write some documentation on them for a while but have not gotten around to it.

In short, there is a scarab_batch script that can be used to launch jobs either locally or on a SLURM system. Then scarab stats can be used to collect stats across multiple benchmarks or a suite.

The stats script can collect any stat in the stat file as well as user-defined stats. Either way, the stats are rolled up and weighted properly in the case of checkpoints.
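
As an illustration of what a weighted roll-up across checkpoints can look like, here is a minimal sketch. It is not taken from the Scarab scripts: the `weighted_rollup` function, the per-checkpoint CPI values, and the weights are all hypothetical, and it assumes each checkpoint carries a SimPoint-style weight used for a weighted average.

```python
def weighted_rollup(values, weights):
    """Weighted average of per-checkpoint stat values (hypothetical helper)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Normalizing by the total lets the weights be given unnormalized.
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Made-up cycles-per-instruction values for three checkpoints of one benchmark.
cpi_per_checkpoint = [1.0, 2.0, 0.5]
checkpoint_weights = [0.5, 0.25, 0.25]

benchmark_cpi = weighted_rollup(cpi_per_checkpoint, checkpoint_weights)
print(benchmark_cpi)  # prints: 1.125
```

An unweighted mean of the same values would give about 1.17, so the weights matter whenever checkpoints represent unequal shares of the program.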
