BUGS:
- ThreadScope DEADLOCKs occasionally, more often with --debug, why?
- X Window System error sometimes?
- the background of some widgets is white on Windows but grey on Linux
- Make ^C work on Windows
- resizing the panes causes a grab lockup?
Happened to Mikolaj on 7.11.2011, too.
- fix, rewrite or disable partial redrawing of the graphs pane
that causes many of the graphical bugs reported below
- rewrite lots of drawing code to sidestep the fixed point precision
problem of cairo
- scrolling to the right, we get some over-rendering at the boundary
causing a thick line
- (probably the same as above)
the gray vertical lines sometimes get randomly darker when scrolling
by moving the scrollbar handle slowly to the right (probably caused
by testing an x coordinate up to integral division by slice or without
taking into account the hadj_value, so the line is drawn
many times when scrolling)
- scrolling when event labels are on chops off some labels
- rendering bug when scrolling: we need to clip to the rectangle being
drawn, otherwise we overwrite other parts of the window when there
are long bars being drawn, which makes some events disappear.
- sideways scrolling leaves curve rendering artifacts (e.g., the thicker
fragments of the flat line at the end of the Activity graph)
- sometimes 2 labels are written on top of each other even at max zoom,
e.g. "thread 3 yielding" and "thread 3 is runnable"
- some sequences of enabling/disabling labels and traces leave the timeline
too short to display all traces; refreshing fixed this
- may be a feature: filling graphs with colours is from line 1 upwards,
not line 0, so lines at level 0 seem under the filled area, not level with it
- a few levels of zoom in and then zoom out sometimes results in only
the rightmost fragment of the timeline shown, no indication in the scrollbar
that scroll is possible, but scrolling indeed fixes the view
- ticking the trace boxes off sometimes shows a black rectangle in place
of the switched off graph
OTHER:
- sort bookmark view by time?
- Delete key deletes bookmarks?
- hotkey to add a bookmark?
- 25%/50%/75% percentiles for pool size, see
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.10.6199
figure 15
WARNING: unlike mean, the median and other percentiles can't be computed
from percentiles of subnodes of the trees --- they need the whole data
for the interval in question at each level of the tree. This increases
the cost from O(n) to O(n * log n), where n is the total number of samples.
The additional log n factor, in comparison with mean, is probably inevitable
unless we put the data in an array, because otherwise we have to
sort the data for each interval to find the k-th element.
An extra problem is that to get accurate percentiles for slices
that do not match a single subtree node, we have to get the whole
data for the slice again, completely repeating the calculations.
Then the preprocessing via creating the tree would only be useful
if the tree stores the whole data at each tree level, already partitioned
and the data for each slice may be gathered cheaply (but a bit inaccurately,
see the use of SparkStats.rescale in the current code) by only looking
at a few nodes of the tree at a single level, instead of traversing
a very long list. There are better data structures than the spark tree
for quick lookup of sorted data, so let's remove the pool sizes from
the spark tree altogether and hack them separately (or use the better
structures for everything). Use the trick with calculating percentiles
from all raw data, but after quantizing it into a histogram,
to make it tractable; the trick is orthogonal to the change of data
representation to multi-level array aggregates, but with old data
representation it may be too slow
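A possible sketch of the histogram trick, in Python for brevity (not ThreadScope code). The function names, the (hypothetical) bucket width parameter and the integer pool sizes are assumptions for illustration. The point is that bucket counts of adjacent intervals can be added, while percentiles themselves cannot be combined.

```python
from collections import Counter

def quantize(samples, bucket_width):
    """Collapse raw pool sizes into a histogram: bucket index -> count."""
    return Counter(s // bucket_width for s in samples)

def merge(h1, h2):
    """Histograms of adjacent intervals combine by adding counts."""
    return h1 + h2

def percentile(hist, p, bucket_width):
    """Approximate p-th percentile (0 < p <= 100): the lower bound
    of the bucket that contains the sample of that rank."""
    total = sum(hist.values())
    rank = max(1, -(-total * p // 100))  # ceiling of total * p / 100
    seen = 0
    for bucket in sorted(hist):
        seen += hist[bucket]
        if seen >= rank:
            return bucket * bucket_width
    raise ValueError("empty histogram")
```

The result is off by at most one bucket width, and the cost per slice depends on the number of buckets rather than on the number of raw samples.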
- resample the data (morally) uniformly, unless the sampling
is changed from GC time points to equal intervals
(note that with resampling, with enough extra inserted sample points,
the median approaches the mean, so calculating the median for pool size
does not make sense; however, percentiles still make sense --- they are
not just mean*(n/m), e.g., for y=5, all percentiles are 5,
while for y=x, from 0 to 10, they differ, regardless of sample density
and uniformity, except trivial cases)
- remove the grey halo in trivial cases of pool size,
like the line: ___/----, but keep min/max for ___/\___
- make sure the user understands that the _areas_ are proportional
to the total number and curve points to rates and that the green area
in spark creation and the total area in spark conversion are equal;
perhaps tell so in the help tab or in the timeline text summary tab
- the same aggregate style for the activity graph, to see min/max
(the green area does convey the total runtime, so perhaps mark min
just as a line, no grey shadow); or just redo it completely as the pool graphs
- test, in particular the quality of sampling, on the parfib suite;
ideally generate sampled events from accurate events and compare visually
and/or numerically
- either change scale together with zoom level (and keep a colored box
showing how much picture space corresponds to how many sparks (should
probably be constant)) or (this one implemented currently:)
start with the best scale for the complete view
and then let the peaks be clipped, let the user manually readjust scale then
- limit the use of save/restore and similar crude hacks
- in zoom out, activity and spark graphs seem cut off at the ends,
which is mathematically correct, but one more slice or even pixel
on each end would make it look better, without compromising correctness
too much (just make sure the extra space does not grow with zoom level!);
the sparks and Activity are already rendered with (up to?) one extra slice
at the ends, but somehow this does not show (perhaps one is not enough)
- perhaps enlarge the main timeline canvas to the right to the nearest tick
- perhaps draw the trace labels vertically and reduce the size
of the Y axis area
- click and drag the view (or the selected interval,
perhaps shift-click or control-click when starting to drag)
- draw the detailed view in the background
- bookmarks
- save
- measure the time between two markers
- search for events in the event window, also filter events using regexps;
alternatively, adding more categories of events to the RTS could help
so that the user can enable only the needed events
- indicate when one thread wakes up another thread, or a thread is migrated;
perhaps draw lines/arrows between the events?
- event list view
- respond to page up / page down / arrows (FOCUS)
- interact with bookmarks
- left pane for
- bookmarks
- list of traces
- traces would be a tree-view, e.g.
* HECs
* 0
* 1 etc.
* Threads
* 0
* 1 etc.
* RTS
* live heap
* ETW / Linux Performance counters
* cache misses
* stalls
* etc.
- some way to reorder the traces? dragging and dropping?
- when rendering threads, we want some way to indicate the time
between creation and death - colour grey?
- a better progress bar
- animate zoom level transitions:
ways to make the zoom in/out less confusing for users
(e.g. the sudden appearance of spikiness once thresholds are crossed)
animating the transitions would make it clearer
generate the bitmap of the higher resolution new view,
and animate it expanding to cover the new view
it'd be quick since it's just bitmap scaling
so the user can see the region in the old view
that is expanding out to become the new view
- let the user set the interval size/scale,
as an advanced option, _separately_ for each graph stack,
each HEC, each visible region at the current zoom level,
each selected region (if they are implemented)
- a button for vertical zoom (clip graphs if they don't fit at that zoom)
and/or select regions of time in the view and zoom only that region of display;
- an option to automatically change scale at zoom in to take
into account only the visible and smoothed part of the graphs
- label coloured areas with a mouseover, according to what they represent
- move the text extents stuff out of drawing ThreadRuns
- overlay ETW events
- integrate strace events in the event view on Linux (note that
linux perf events are already available in TS)
- colour threads differently
- thread names rather than numbers
- live heap graph
- summary analysis
- perhaps draw the graphs even if only the fully-accurate,
per-spark events are present in the logs (by transforming them
to the sampled format)
- and/or extrapolate data for large zoom levels with scarce samples
by slanted lines, not just horizontal lines (which work great with
numerous enough samples, though)
- merge adjacent 0 samples:
if we have equal adjacent samples we just take the second/last
can be done when it's still a list, before making the tree;
simple linear pass, and lazy too;
does not make sense in per-spark events mode
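A minimal sketch of that linear, lazy pass, in Python for brevity (not ThreadScope code); the (time, value) sample representation and the function name are assumptions:

```python
def merge_equal_runs(samples):
    """Lazily yield (time, value) pairs, keeping only the last sample
    of every run of adjacent samples that share the same value."""
    pending = None
    for sample in samples:
        # A change of value closes the current run: emit its last sample.
        if pending is not None and pending[1] != sample[1]:
            yield pending
        pending = sample
    if pending is not None:
        yield pending
```

Being a generator, it consumes the input list one element at a time, so it can run before the tree is built without holding a second copy of the data.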
- use the extra info about when the spark pool gets full and empty;
we know when the spark pool becomes empty because we can observe
that the threads evaluating sparks terminate; similarly, we know overflow
only occurs when the pool is full; but note the following:
"dcoutts: also noticed that we cannot currently reliably discover when the spark
queue is empty. I had thought that "the" spark thread terminating
implied that the queue is empty, however there can be multiple spark
threads and they can also terminate when there are other threads on that
capability's run queue (normal forkIO work takes precedence over
evaluating sparks)"
- consider making the graphs more accurate by drilling down the tree
to the base events at each slice borders:
they don't go down to the base events at each slice boundary
but only at the two viewport borders (hence the extra slices
at the ends dcoutts noticed).
In the current state this generally results in smoothing the curve,
but a side-effect is that the graphs grow higher visually
(the max is higher) as the zoom level increases.
- or increase the accuracy by dividing increments not by the slice
length (implicitly), but the length of the sum of tree node spans
that cover the slice, similarly as in gnuplot graphs
- perhaps we should change the spark event sampling to emit events
every N sparks rather than at GC; but only if experiments (do more accurate
sampling and compare the general look of the graphs)
show that linear extrapolation of the GC data is not correct
(large spark transitions happen in the long periods between GC
and we don't know when exactly) (the 4K pool size guarantees that at least
with large visible absolute spark transition, the invisible transitions
can't be huge in proportion to the visible ones, so then linear extrapolation
is correct)
- use adaptive interval, depending on the sample density at the viewed fragment
- perhaps, depending on sample density, alternate between raw data,
min/max, percentiles; so the raw data line slowly explodes into a band,
a big smudge, like a string of beads, that gets even more detailed
and perhaps wider, when data density/uniformity allows it;
in other words: a thin line means we only guess that's where the data might be
a thicker one, with min/max, means we have some data,
but too irregular/scarce to say more, and full thickness line,
with percentiles means we have enough data or evenly distributed enough
to say all
- have *configurable* colors, like in Eden; where to save them?
- Eden TV has good visualisation for messages between processes & nodes,
(steal it, when we do more work on Cloud Haskell)
- show time corresponding to the eventlog buffer flushing,
after the event for this is introduced
- verify correctness of the input eventlog before visualizing; define a method
of passing the error report from ghc-events to TS and displaying it; have
a grace period when that can be disabled via an option, until RTS and validation
stabilize
- perhaps the far left and far right bars on the histogram should be in red,
since extreme spark sizes are usually bad (or mark it in some other way)
- or colour stacked bars of the histogram according to strategy label
- determine which bars represent too small and too large sparks
based on the estimation of spark overhead divided by spark duration, etc.
- dynamic strategy labelling (lots of stuff has to be done elsewhere, first)
- mousing over or clicking a bar should tell "1.3s worth of sparks (28 total)
that took between 16--64ms to evaluate"; people want to know the number
of sparks as well as the total eval time, in particular kosmikus wants
to know this
- either show "Loading" when calculating spark duration histogram
for a selected range on the timeline graph, or store precomputed data
in the data structures for speedy drawing of the main timeline graphs,
however they are (re)implemented in the end (mipmap or trees or anything)
- manage histogram interval selection as gimp does: drag to create a selection,
then move the selection, etc.
- extend the selection stuff, e.g. keep bookmarks sorted and let you select multiple bookmarks which would then give you a range selection
- add a new info / stats tab giving numeric info about the current selection, e.g., the amount of cpu time used on each hec, mutator, gc, also sparks, basically all the things that ./prog +RTS --blah can give you, but for any time period, not just whole run
- perhaps the current "reset zoom" button could change the x and y scale of timeline and histogram to fit the current region (which initially and most of the time later on is just the whole eventlog)?
- perhaps we need a floating "resize me" button to pop up on the graph whenever your current view ought to be resized, e.g. clipping at the top, or tallest bar taking up less than half the vertical space, or put that in red letters into the popups that we'll have at some point, or mark the upper or lower edge of the graph red (depending if it's clipped or too small) and blink the y-zoom reset button (to the right of the x-zoom reset button) in red, too
- select a region and display it in a separate tab
(e.g. next to the events tab)
- scroll around the graph image via a small zoomed out window
"The Information Mural: A Technique for Displaying
and Navigating Large Information Spaces"
- zoom histogram for granularity: increase the number of bars,
keeping the min/max unchanged; that requires log with base other than 2
- zoom histogram for detail: keep the number of bars the same,
but change the scale to only show sparks between a narrower min/max;
for this log 2 is enough, with a constant offset
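A rough sketch of the "zoom for granularity" case, in Python for brevity (not ThreadScope code). The function names, the half-open range [lo, hi) and the duration unit are assumptions. Keeping min/max fixed and changing the number of bars changes the logarithm base to (hi/lo)**(1/n_bars), which is 2 only for one particular bar count.

```python
import math

def bar_index(duration, lo, hi, n_bars):
    """Index of the logarithmic bar that holds `duration`, for n_bars
    bars covering [lo, hi); each bar spans a factor (hi/lo)**(1/n_bars)."""
    if not lo <= duration < hi:
        raise ValueError("duration outside the histogram range")
    position = math.log(duration / lo) / math.log(hi / lo)  # in [0, 1)
    return min(n_bars - 1, int(position * n_bars))

def histogram(durations, lo, hi, n_bars):
    """Per bar: (number of sparks, total evaluation time)."""
    bars = [(0, 0.0)] * n_bars
    for d in durations:
        i = bar_index(d, lo, hi, n_bars)
        count, total = bars[i]
        bars[i] = (count + 1, total + d)
    return bars
```

With lo=1 and hi=256, 8 bars give base 2 and put durations 2.5 and 3.5 into the same bar; 16 bars (base sqrt 2) separate them. Each bar carries both the spark count and the total time, which is what a mouseover would need.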
- compare any of the graphs for the whole time and for the interval: either
by opening/detaching window/pane dedicated to the interval data
or by a permanent tab in the bottom pane with selectable traces
that only ever show the data from the selected interval
and when you zoom it, the interval changes in the upper pane
(but histogram should not be among those traces, because it does not scroll,
does not have the time X axis, and generally does not fit and is confusing)
- perhaps use the tab with traces to guide what is exported to PDF/PNG;
also, since histogram is not among the traces, use some other UI element
to specify if the histogram should be shown (currently disabled permanently)
- events in the info pane should be searchable, and possibly numbered
- selecting a range of events (via a shift-click) in the info pane should create a selection as might be done with the mouse.
- select and copy events from the info pane to the OS clipboard
- perhaps, on the X scale of timeline, show only offsets, to save space:
that is, only give the full time for the left point of the view and all other
times relative to that, e.g., 2.356 s, +1ms, +2ms, etc.
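A small sketch of such labels, in Python for brevity (not ThreadScope code); integer tick times in milliseconds and the function name are assumptions:

```python
def offset_labels(tick_times_ms):
    """Full time for the leftmost tick, offsets relative to it for the rest."""
    if not tick_times_ms:
        return []
    left = tick_times_ms[0]
    first = f"{left / 1000:.3f} s"
    return [first] + [f"+{t - left}ms" for t in tick_times_ms[1:]]
```

For ticks at 2356, 2357 and 2358 ms this produces the labels from the example above.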
- and add a way to add a bookmark for a user-specified event, or a kind of events
- medium-term: let the user configure which instant events are shown in the trace
- medium-term: user-defined visualisation frequency for the new trace (at least)
and any other tricks needed to visualize instant events well (show density
even if many events at almost the same spot and so the shades of green cue
is not possible) and fast (in particular, don't draw many lines at the same spot)
some random user feedback on the "lots of events" use case of traceEvents:
1. I don't need/use bookmarking (I can use the visual clues to jump), or even grep the event log
2. I find the visualization helpful, but disturbing, because it hides other information
3. it would be great to be able to visually distinguish different "messages"
4. it would be nice if drawing wouldn't slow down significantly even for a relatively small number of events (i.e., 100 is noticeable slowdown already)
- make TS useful for par-monad: generate and visualize user code / libs events, make the same kind of granularity histogram as for sparks, etc.
- make TS useful for concurrent code: forkIO concurrency events, properly implement the event merging for the cluster use case
- make TS useful for sequential code, in addition to showing the GC pattern
- make TS useful for the cloud haskell use cases
- add finite machines for the GC and other events (after new events are added and/or extended) and use validation profiles to count some of the summary data that is calculated by hand right now
- another possibility to visualize sparks when there are not many of them would be to show an activity graph purely for spark evaluation, ie only counting those threads that are doing spark evaluation
- a button to snap scale to a power of 10, that'd solve:
I need a way to show two different eventlog files with the same time
scale. The attached picture was generated from two different files, but I
couldn't find a way to make the time scale consistent. I expect this is
because the 'zoom' button scales the plot based on the total length of the
eventlog file, instead of using a fixed pixels/second ratio, which is what I
really need.
Alternatively, have a dialog to set the time scale specifically,
as in an electronic oscilloscope, but in our case, with zoom levels
that multiply/divide the scale by 2, detailed settings are rather irrelevant.
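A possible sketch of the snapping, in Python for brevity (not ThreadScope code); the function name and the seconds-per-pixel representation of the scale are assumptions:

```python
import math

def snap_to_power_of_10(seconds_per_pixel):
    """Round a positive time scale to the nearest power of ten
    (nearest on a logarithmic scale)."""
    if seconds_per_pixel <= 0:
        raise ValueError("scale must be positive")
    return 10.0 ** round(math.log10(seconds_per_pixel))
```

Two eventlogs whose fit-to-window scales are 0.0007 and 0.0013 s/pixel would both snap to 0.001 s/pixel, so their pictures become directly comparable.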
- another idea illustrated by an electronic oscilloscope: folding a graph
at all occurrences of an event and showing the overlaid plot
Given a regex named "compute iteration N start", find all instances of
this event within a user-defined time interval, and overlay the plots on
top of each other. Ideally, each plot would be drawn partially transparent,
so the color would be brighter when most instances were active at the same
time since the event start. Alternatively, one could just average all the
instances.
I want to see if different iterations behave similarly.
On an electronic oscilloscope this would be done with the
"trigger" setting. On an oscilloscope the beam traces from left to
right on the display. When it runs off the right-hand-side it waits
for a particular event in the signal before it starts tracing again.
For a continuous signal the event would be that signal reaching a
given threshold -- and the trigger knob on the bottom right-hand-side
sets this threshold. Due to persistence-of-vision, you'd see many
separate traces overlaid on the display. For ThreadScope the trigger
should be a particular event selected with a regex.
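A rough sketch of the folding step, in Python for brevity (not ThreadScope code). The (time, message) and (time, value) representations, the function name and the fixed window length are assumptions; drawing the overlaid windows is left out.

```python
import re

def fold_at_trigger(events, samples, pattern, window):
    """Cut `samples` ((time, value) pairs) into windows of length `window`
    starting at every event whose message matches `pattern`; sample times
    are rebased so that each window starts at 0."""
    trigger = re.compile(pattern)
    starts = [t for t, msg in events if trigger.search(msg)]
    return [[(t - s, v) for t, v in samples if s <= t < s + window]
            for s in starts]
```

The returned windows share a time axis, so they can be drawn on top of each other with partial transparency, or averaged point by point. The sketch rescans all samples per trigger; a real version would want something faster.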
- show a histogram of the time difference between non-overlapping event pairs.
Specifically, if I have events named "image filter iteration N start" and
"image filter iteration N end" I want a histogram of how long those
iterations took.
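A minimal sketch of collecting those durations, in Python for brevity (not ThreadScope code); the (timestamp, message) event representation and the names below are assumptions, with the two message patterns taken from the example above:

```python
import re

START = re.compile(r"image filter iteration (\d+) start")
END = re.compile(r"image filter iteration (\d+) end")

def iteration_durations(events):
    """events: iterable of (timestamp, message) in time order.
    Returns the durations of matched start/end pairs, in order of the ends."""
    started = {}
    durations = []
    for time, message in events:
        m = START.fullmatch(message)
        if m:
            started[m.group(1)] = time
            continue
        m = END.fullmatch(message)
        # Ignore an end without a matching start.
        if m and m.group(1) in started:
            durations.append(time - started.pop(m.group(1)))
    return durations
```

The resulting list can be binned like any other set of durations to draw the histogram.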
- highlight the HEC of the current event selected in the Raw Events view
(in the timeline)
- for small screens with many HECs, make it possible to zoom the canvas
vertically (perhaps via M-mouse wheel?)