Skip to content

Commit

Permalink
Adds /timetrace command to generate a flame graph from an input trace (
Browse files Browse the repository at this point in the history
…#8)

* Accepts a new command-line switch to output Chrome's trace instead of WPA's (still does nothing useful with it other than checking it's provided)

* Refactors DoAnalyze and DoStop to separate WPA and Chrome trace outputs

* Refactors new arguments into the same line

There's no apparent max line length to cut them and looks like out arguments are the ones to get cut to the next line

* Adds new analyzer to recreate the execution hierarchy, starts adding data

* Updates entry's StopTimestamp when its associated activity ends

* Adds a new view to export to Chrome trace's JSON, uses nlohmann library

Exported entry names are the raw activity names, for now

* Removes leftover comment

* Creates auxiliary class to calculate ProcessId and ThreadId remappings of entries, to keep subhierarchies packed together

* Moves ChromeFlameGraphView into the new subfolder, to keep related classes together

* Calculates ProcessId remapping for root entries in the flame graph

* Renames method and creates new one to calculate ThreadId remaps

It will consume some data that's not yet calculated

* Calculates local ThreadId for entries, starts working on calculating the required number of ThreadId to fit a given hierarchy

* Calculates and applies new ThreadId for all entries so they are tightly packed with their hierarchies

* Adds the ability to filter entries in the Chrome trace to keep output file manageable

Currently allows filtering templates via /templates flag and both templates and functions that last for little time

* Adds names to TemplateInstantiation activities

* Removes "TODO: cache" comments for entry names, could be an interesting optimization for memory however

* Fixes spacing on error message

* Fixes "/stop /chrometrace", which wasn't working

* Moves ChromeFlameGraphView::AddEntry to an internal function to prevent including nlohmann\json.hpp from the header

* Adds documentation bit

* Removes indentation from internal namespace, similar to other places in the code

* Updates README.md to include /chrometrace option

* Fixes /start arguments section in README.md

* Updates README.md to give a little description of the new files

* Adds some extra data to Chrome trace: command line, useful environment variables (check BuildExplorerView.cpp for reference) and compiler/linker tool version, path, working directory

* Compiler/Linker's tool path may not be available if the analyzed project was compiled with an old toolset version (i.e. v141)

* Assumes symbol keys are unique (after re-reading the SDK's docs)

* Fixes wrong .h/.cpp placement in filters

* Adds type to constants

* Refactor for improved understandability

* Renames flame graph options to make it vendor-independent, switches to edge://tracing in README.md

* Renames TimeTraceFlameGraphView to TimeTraceGenerator

The View suffix is related to WPA

* Moves filters around to separate WPA-related files from TimeTrace-related ones

* Prevent copying pair

* Removes size_t to int warning by using iterators

* Refactors opening bracket for if and else whose body is just one line, to match the rest of the codebase

* Adds path from FileInput and FileOutput events to Compiler and Linker entries

* Removes namespace from method signature, it's already part of an using clause

* Modifies how TemplateInstantiations are filtered: if root passes, full hierarchy passes (children don't get filtered)

* Adds Invocation ID to Compiler and Linker entries, much like ContextBuilder does

* Enforces /timetrace to be placed right before the output file

* Moves files into a new physical structure to match solution Filters (modified in commit 986da76)

* Updates "Overview of the code" section in README.md to match the new physical structure of files and folders

* Updates /stop and /analyze commands documentation to make /timetrace option clearer

* Prints /stop and /analyze command line hints in two lines: one for the original command and other for the /timetrace version

* Uses bool operator instead of the explicit != nullptr check

* Promotes check out of the if statement

* Functions and types from the Microsoft::Cpp:BuildInsights::Internal aren't meant to be used

* Uses CP_UTF8 codepage when converting between std::wstring and std::string

* Updates variable names to match new class name (leftover from previous refactors)

* Removes leftover code from previous refactor

* Refactors activity filtering from TimeTraceGenerator to ExecutionHierarchy

Thanks to this, TimeTraceGenerator deals with real leaf entries instead of entries that may end up being leafs if all their children get filtered out
It will now output one X trace event instead of a pair of B and E types next to each other

* Adds screenshot of the /timetrace option and uses it in README.md

* Renames Thread activities to reflect their parent (i.e. Thread within CodeGeneration becomes CodeGenerationThread)

* Invocation activities can have several FileInput and FileOutput events

Pad number of FileInput/FileOutput with leading zeroes to keep them sorted as they were emitted (or else viewer would show 1 then 10s, 2 then 20s, ...)

* Updates timetrace media for README.me to include latest changes in activity names

* Refactors OnFinishRootTemplateInstantiation and OnFinishNestedTemplateInstantiation into OnFinishTemplateInstantiation

There was a bug in CppBuildInsights v1.0.0 that forced us to separate them, but with v1.1.0 that bug is gone and they can be merged

* Fixing a problem that occurs due to a rare bug in the linker that causes it to sometimes emit FileInput events with an empty path.

Co-authored-by: Kevin Cadieux <[email protected]>
  • Loading branch information
MetanoKid and kevcadieux authored Aug 25, 2020
1 parent 1b86b7f commit 172a0f5
Show file tree
Hide file tree
Showing 27 changed files with 1,244 additions and 135 deletions.
31 changes: 19 additions & 12 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,10 @@ vcperf is a C++ build analysis tool for the MSVC toolchain. It is built on top o

![Overview of what a vcperf trace looks like when viewed in WPA.](media/vcperf-wpa-overview.gif)

vcperf can also generate flame graphs viewable in Microsoft Edge's trace viewer, as shown below:

![Overview of what a vcperf trace looks like when viewed in Microsoft Edge.](media/vcperf-timetrace-overview.png)

## How vcperf works

vcperf makes use of the [Event Tracing for Windows](https://docs.microsoft.com/windows/win32/etw/about-event-tracing) (ETW) relogging interface available in the [C++ Build Insights SDK](https://docs.microsoft.com/cpp/build-insights/reference/sdk/overview?view=vs-2019). This interface allows vcperf to translate an MSVC build trace into a new, customized ETW event format that is suitable for viewing in WPA. The translation process involves determining the context of each event, and emiting new events that include this information. For example, when vcperf emits an event for the code generation time of a function, it also includes the compiler or linker invocation in which the code generation took place. Having this context available allows gaining more insight from the data, such as determining the functions that took longest to generate for one particular invocation.
Expand Down Expand Up @@ -71,33 +75,36 @@ Viewing vcperf traces in WPA requires the C++ Build Insights WPA add-in. Install

| Option | Arguments and description |
|------------------|---------------------------|
| `/start` | `[/nocpusampling]` `[/level1 | /level2 | /level3]` `<sessionName>` |
| `/start` | `[/nocpusampling]` `[/level1 \| /level2 \| /level3]` `<sessionName>` |
| | Tells *vcperf.exe* to start a trace under the given session name. There can only be one active session at a time on a given machine. <br/><br/> If the `/nocpusampling` option is specified, *vcperf.exe* doesn't collect CPU samples. It prevents the use of the CPU Usage (Sampled) view in Windows Performance Analyzer, but makes the collected traces smaller. <br/><br/>The `/level1`, `/level2`, or `/level3` option is used to specify which MSVC events to collect, in increasing level of information. Level 3 includes all events. Level 2 includes all events except template instantiation events. Level 1 includes all events except template instantiation, function, and file events. If unspecified, `/level2` is selected by default.<br/><br/>Once tracing is started, *vcperf.exe* returns immediately. Events are collected system-wide for all processes running on the machine. That means that you don't need to build your project from the same command prompt as the one you used to run *vcperf.exe*. For example, you can build your project from Visual Studio. |
| `/stop` | `<sessionName>` `<outputFile.etl>` |
| | Stops the trace identified by the given session name. Runs a post-processing step on the trace to generate a file viewable in Windows Performance Analyzer (WPA). The `<outputFile.etl>` parameter specifies where to save the output file. |
| `/stop` | (1) `[/templates]` `<sessionName>` `<outputFile.etl>`<br/>(2) `[/templates]` `<sessionName>` `/timetrace` `<outputFile.json>` |
| | Stops the trace identified by the given session name. Runs a post-processing step on the trace to generate a file specified by the `<outputFile>` parameter.<br/><br/>If the `/templates` option is specified, also analyze template instantiation events.<br/><br/>(1) Generates a file viewable in Windows Performance Analyzer (WPA). The output file requires a `.etl` extension.<br/>(2) Generates a file viewable in Microsoft Edge's trace viewer ([edge://tracing](edge://tracing)). The output file requires a `.json` extension. |
| `/stopnoanalyze` | `<sessionName>` `<rawOutputFile.etl>` |
| | Stops the trace identified by the given session name and writes the raw, unprocessed data in the specified output file. The resulting file isn't meant to be viewed in WPA. <br/><br/> The post-processing step involved in the `/stop` command can sometimes be lengthy. You can use the `/stopnoanalyze` command to delay this post-processing step. Use the `/analyze` command when you're ready to produce a file viewable in Windows Performance Analyzer. |

### Miscellaneous commands

| Option | Arguments and description |
|------------|---------------------------|
| `/analyze` | [`/templates`] `<rawInputFile.etl> <outputFile.etl>` |
| | Accepts a raw trace file produced by the `/stopnoanalyze` command. Runs a post-processing step on this trace to generate a file viewable in Windows Performance Analyzer. If the `/templates` option is specified, also analyze template instantiation events. |
| `/analyze` | (1) `[/templates]` `<rawInputFile.etl>` `<outputFile.etl>`<br/>(2) `[/templates]` `<rawInputFile.etl>` `/timetrace` `<outputFile.json>` |
| | Accepts a raw trace file produced by the `/stopnoanalyze` command. Runs a post-processing step on this trace to generate a file specified by the `<outputFile>` parameter.<br/><br/>If the `/templates` option is specified, also analyze template instantiation events.<br/><br/>(1) Generates a file viewable in Windows Performance Analyzer (WPA). The output file requires a `.etl` extension.<br/>(2) Generates a file viewable in Microsoft Edge's trace viewer ([edge://tracing](edge://tracing)). The output file requires a `.json` extension. |

## Overview of the code

This section briefly describes the source files found in the `src` directory.

|Item name|Description|
|-|-|
|Analyzers\ContextBuilder.cpp/.h|Analyzer that determines important information about every event, such as which *cl* or *link* invocation it comes from. This data is used by all *View* components when writing their events in the relogged trace.|
|Analyzers\ExpensiveTemplateInstantiationCache.cpp/.h|Analyzer that pre-computes the templates with the longest instantiation times. This data is later consumed by *TemplateInstantiationsView*.|
|Analyzers\MiscellaneousCache.h|Analyzer that can be used to cache miscellaneous data about a trace.|
|Views\BuildExplorerView.cpp/.h|Component that builds the view responsible for showing overall build times in WPA.|
|Views\FilesView.cpp/.h|Component that builds the view responsible for showing file parsing times in WPA.|
|Views\FunctionsView.cpp/.h|Component that builds the view responsible for showing function code generation times in WPA.|
|Views\TemplateInstantiationsView.cpp/.h|Component that builds the view responsible for showing template instantiation times in WPA.|
|WPA\Analyzers\ContextBuilder.cpp/.h|Analyzer that determines important information about every event, such as which *cl* or *link* invocation it comes from. This data is used by all *View* components when writing their events in the relogged trace.|
|WPA\Analyzers\ExpensiveTemplateInstantiationCache.cpp/.h|Analyzer that pre-computes the templates with the longest instantiation times. This data is later consumed by *TemplateInstantiationsView*.|
|WPA\Analyzers\MiscellaneousCache.h|Analyzer that can be used to cache miscellaneous data about a trace.|
|WPA\Views\BuildExplorerView.cpp/.h|Component that builds the view responsible for showing overall build times in WPA.|
|WPA\Views\FilesView.cpp/.h|Component that builds the view responsible for showing file parsing times in WPA.|
|WPA\Views\FunctionsView.cpp/.h|Component that builds the view responsible for showing function code generation times in WPA.|
|WPA\Views\TemplateInstantiationsView.cpp/.h|Component that builds the view responsible for showing template instantiation times in WPA.|
|TimeTrace\ExecutionHierarchy.cpp/.h|Analyzer that creates a number of hierarchies out of a trace. Its data is later consumed by *TimeTraceGenerator*.|
|TimeTrace\TimeTraceGenerator.cpp/.h|Component that creates and outputs a `.json` trace viewable in Microsoft Edge's trace viewer.|
|TimeTrace\PackedProcessThreadRemapping.cpp/.h|Component that attempts to keep entries on each hierarchy as close as possible by giving a more *logical distribution* of processes and threads.|
|Commands.cpp/.h|Implements all commands available in vcperf.|
|GenericFields.cpp/.h|Implements the generic field support, used to add custom columns to the views.|
|main.cpp|The program's starting point. This file parses the command line and redirects control to a command in the Commands.cpp/.h file.|
Expand Down
Binary file added media/vcperf-timetrace-overview.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
1 change: 1 addition & 0 deletions packages.config
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
<?xml version="1.0" encoding="utf-8"?>
<packages>
<package id="Microsoft.Cpp.BuildInsights" version="1.1.0" targetFramework="native" />
<package id="nlohmann.json" version="3.7.3" targetFramework="native" />
</packages>
145 changes: 99 additions & 46 deletions src/Commands.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5,13 +5,15 @@

#include "VcperfBuildInsights.h"

#include "Analyzers\ExpensiveTemplateInstantiationCache.h"
#include "Analyzers\ContextBuilder.h"
#include "Analyzers\MiscellaneousCache.h"
#include "Views\BuildExplorerView.h"
#include "Views\FunctionsView.h"
#include "Views\FilesView.h"
#include "Views\TemplateInstantiationsView.h"
#include "WPA\Analyzers\ExpensiveTemplateInstantiationCache.h"
#include "WPA\Analyzers\ContextBuilder.h"
#include "WPA\Analyzers\MiscellaneousCache.h"
#include "WPA\Views\BuildExplorerView.h"
#include "WPA\Views\FunctionsView.h"
#include "WPA\Views\FilesView.h"
#include "WPA\Views\TemplateInstantiationsView.h"
#include "TimeTrace\ExecutionHierarchy.h"
#include "TimeTrace\TimeTraceGenerator.h"

using namespace Microsoft::Cpp::BuildInsights;

Expand Down Expand Up @@ -165,6 +167,78 @@ void PrintError(RESULT_CODE failureCode)
std::wcout << std::endl;
}

RESULT_CODE StopToWPA(const std::wstring& sessionName, const std::filesystem::path& outputFile, bool analyzeTemplates,
TRACING_SESSION_STATISTICS& statistics)
{
ExpensiveTemplateInstantiationCache etic{ analyzeTemplates };
ContextBuilder cb;
MiscellaneousCache mc;
BuildExplorerView bev{ &cb, &mc };
FunctionsView funcv{ &cb, &mc };
FilesView fv{ &cb, &mc };
TemplateInstantiationsView tiv{ &cb, &etic, &mc, analyzeTemplates };

auto analyzerGroup = MakeStaticAnalyzerGroup(&cb, &etic, &mc);
auto reloggerGroup = MakeStaticReloggerGroup(&etic, &mc, &cb, &bev, &funcv, &fv, &tiv);

unsigned long long systemEventsRetentionFlags = RELOG_RETENTION_SYSTEM_EVENT_FLAGS_CPU_SAMPLES;

int analysisPassCount = analyzeTemplates ? 2 : 1;

return StopAndRelogTracingSession(sessionName.c_str(), outputFile.c_str(),
&statistics, analysisPassCount, systemEventsRetentionFlags, analyzerGroup, reloggerGroup);
}

RESULT_CODE StopToTimeTrace(const std::wstring& sessionName, const std::filesystem::path& outputFile, bool analyzeTemplates,
TRACING_SESSION_STATISTICS& statistics)
{
ExecutionHierarchy::Filter f{ analyzeTemplates,
std::chrono::milliseconds(10),
std::chrono::milliseconds(10) };
ExecutionHierarchy eh{ f };
TimeTraceGenerator ttg{ &eh, outputFile };

auto analyzerGroup = MakeStaticAnalyzerGroup(&eh, &ttg);
int analysisPassCount = 1;

return StopAndAnalyzeTracingSession(sessionName.c_str(), analysisPassCount, &statistics, analyzerGroup);
}

RESULT_CODE AnalyzeToWPA(const std::filesystem::path& inputFile, const std::filesystem::path& outputFile, bool analyzeTemplates)
{
ExpensiveTemplateInstantiationCache etic{ analyzeTemplates };
ContextBuilder cb;
MiscellaneousCache mc;
BuildExplorerView bev{ &cb, &mc };
FunctionsView funcv{ &cb, &mc };
FilesView fv{ &cb, &mc };
TemplateInstantiationsView tiv{ &cb, &etic, &mc, analyzeTemplates };

auto analyzerGroup = MakeStaticAnalyzerGroup(&cb, &etic, &mc);
auto reloggerGroup = MakeStaticReloggerGroup(&etic, &mc, &cb, &bev, &funcv, &fv, &tiv);

unsigned long long systemEventsRetentionFlags = RELOG_RETENTION_SYSTEM_EVENT_FLAGS_CPU_SAMPLES;

int analysisPassCount = analyzeTemplates ? 2 : 1;

return Relog(inputFile.c_str(), outputFile.c_str(), analysisPassCount,
systemEventsRetentionFlags, analyzerGroup, reloggerGroup);
}

RESULT_CODE AnalyzeToTimeTrace(const std::filesystem::path& inputFile, const std::filesystem::path& outputFile, bool analyzeTemplates)
{
ExecutionHierarchy::Filter f{ analyzeTemplates,
std::chrono::milliseconds(10),
std::chrono::milliseconds(10) };
ExecutionHierarchy eh{ f };
TimeTraceGenerator ttg{ &eh, outputFile };

auto analyzerGroup = MakeStaticAnalyzerGroup(&eh, &ttg);
int analysisPassCount = 1;

return Analyze(inputFile.c_str(), analysisPassCount, analyzerGroup);
}

HRESULT DoStart(const std::wstring& sessionName, bool cpuSampling, VerbosityLevel verbosityLevel)
{
TRACING_SESSION_OPTIONS options{};
Expand Down Expand Up @@ -206,29 +280,18 @@ HRESULT DoStart(const std::wstring& sessionName, bool cpuSampling, VerbosityLeve
}


HRESULT DoStop(const std::wstring& sessionName, const std::filesystem::path& outputFile, bool analyzeTemplates)
HRESULT DoStop(const std::wstring& sessionName, const std::filesystem::path& outputFile, bool analyzeTemplates, bool generateTimeTrace)
{
TRACING_SESSION_STATISTICS statistics{};

ExpensiveTemplateInstantiationCache etic{analyzeTemplates};
ContextBuilder cb;
MiscellaneousCache mc;
BuildExplorerView bev{&cb, &mc};
FunctionsView funcv{&cb, &mc};
FilesView fv{&cb, &mc};
TemplateInstantiationsView tiv{&cb, &etic, &mc, analyzeTemplates};

auto analyzerGroup = MakeStaticAnalyzerGroup(&cb, &etic, &mc);
auto reloggerGroup = MakeStaticReloggerGroup(&etic, &mc, &cb, &bev, &funcv, &fv, &tiv);

std::wcout << L"Stopping and analyzing tracing session " << sessionName << L"..." << std::endl;

unsigned long long systemEventsRetentionFlags = RELOG_RETENTION_SYSTEM_EVENT_FLAGS_CPU_SAMPLES;

int analysisPassCount = analyzeTemplates ? 2 : 1;

auto rc = StopAndRelogTracingSession(sessionName.c_str(), outputFile.c_str(),
&statistics, analysisPassCount, systemEventsRetentionFlags, analyzerGroup, reloggerGroup);
TRACING_SESSION_STATISTICS statistics{};
RESULT_CODE rc;
if (!generateTimeTrace) {
rc = StopToWPA(sessionName, outputFile, analyzeTemplates, statistics);
}
else {
rc = StopToTimeTrace(sessionName, outputFile, analyzeTemplates, statistics);
}

PrintTraceStatistics(statistics);

Expand Down Expand Up @@ -270,27 +333,17 @@ HRESULT DoStopNoAnalyze(const std::wstring& sessionName, const std::filesystem::
return S_OK;
}

HRESULT DoAnalyze(const std::filesystem::path& inputFile, const std::filesystem::path& outputFile, bool analyzeTemplates)
HRESULT DoAnalyze(const std::filesystem::path& inputFile, const std::filesystem::path& outputFile, bool analyzeTemplates, bool generateTimeTrace)
{
ExpensiveTemplateInstantiationCache etic{analyzeTemplates};
ContextBuilder cb;
MiscellaneousCache mc;
BuildExplorerView bev{&cb, &mc};
FunctionsView funcv{&cb, &mc};
FilesView fv{&cb, &mc};
TemplateInstantiationsView tiv{&cb, &etic, &mc, analyzeTemplates};

auto analyzerGroup = MakeStaticAnalyzerGroup(&cb, &etic, &mc);
auto reloggerGroup = MakeStaticReloggerGroup(&etic, &mc, &cb, &bev, &funcv, &fv, &tiv);

std::wcout << L"Analyzing..." << std::endl;

unsigned long long systemEventsRetentionFlags = RELOG_RETENTION_SYSTEM_EVENT_FLAGS_CPU_SAMPLES;

int analysisPassCount = analyzeTemplates ? 2 : 1;

auto rc = Relog(inputFile.c_str(), outputFile.c_str(), analysisPassCount,
systemEventsRetentionFlags, analyzerGroup, reloggerGroup);

RESULT_CODE rc;
if (!generateTimeTrace) {
rc = AnalyzeToWPA(inputFile, outputFile, analyzeTemplates);
}
else {
rc = AnalyzeToTimeTrace(inputFile, outputFile, analyzeTemplates);
}

if (rc != RESULT_CODE_SUCCESS)
{
Expand Down
4 changes: 2 additions & 2 deletions src/Commands.h
Original file line number Diff line number Diff line change
Expand Up @@ -15,8 +15,8 @@ enum class VerbosityLevel
};

HRESULT DoStart(const std::wstring& sessionName, bool cpuSampling, VerbosityLevel verbosityLevel);
HRESULT DoStop(const std::wstring& sessionName, const std::filesystem::path& outputFile, bool analyzeTemplates = false);
HRESULT DoStop(const std::wstring& sessionName, const std::filesystem::path& outputFile, bool analyzeTemplates = false, bool generateTimeTrace = false);
HRESULT DoStopNoAnalyze(const std::wstring& sessionName, const std::filesystem::path& outputFile);
HRESULT DoAnalyze(const std::filesystem::path& inputFile, const std::filesystem::path& outputFile, bool analyzeTemplates = false);
HRESULT DoAnalyze(const std::filesystem::path& inputFile, const std::filesystem::path& outputFile, bool analyzeTemplates = false, bool generateTimeTrace = false);

} // namespace vcperf
Loading

0 comments on commit 172a0f5

Please sign in to comment.