<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>General Value Friends</title>
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/3.0.5/es5/startup.js"></script>
<script type="text/javascript" async
src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.7/latest.js?config=TeX-MML-AM_CHTML">
</script>
<script type="text/x-mathjax-config">
MathJax.Hub.Config({
tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}
});
</script>
<link href="{{ url_for('static', filename='bootstrap/css/bootstrap.css') }}" rel="stylesheet">
<!--Custom CSS-->
<!--Custom Fonts-->
<link href="https://maxcdn.bootstrapcdn.com/font-awesome/4.1.0/css/font-awesome.min.css" rel="stylesheet"
type="text/css">
<link href='https://fonts.googleapis.com/css?family=Lora:400,700,400italic,700italic' rel='stylesheet'
type='text/css'>
<link href='https://fonts.googleapis.com/css?family=Open+Sans:300italic,400italic,600italic,700italic,800italic,400,300,600,700,800'
rel='stylesheet' type='text/css'>
<!--HTML5 Shim and Respond.js IE8 support of HTML5 elements and media queries-->
<!--WARNING: Respond.js doesn't work if you view the page via file://-->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script>
<script src="https://oss.maxcdn.com/libs/respond.js/1.4.2/respond.min.js"></script>
<![endif]-->
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.4.1/css/bootstrap.min.css">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.4.1/js/bootstrap.min.js"></script>
</head>
<body>
<div class="container" style="min-width: 100%">
<div class="row h-100">
<div class="col-lg-7 col-md-7 col-sm-12">
<div class="row h-auto" style="padding:10px;">
<div class="col-lg-6 col-md-6 col-sm-6 flex-fill" id="system_state" style="text-align:center">
<div class="row flexbox text-center">
<div class="col-lg-8 flex-fill d-flex h-100 flex-column" style="min-height:400px">
<div class="row text-center">
<strong>
<a
href="#"
data-toggle="popover"
data-placement="bottom"
data-content="A visualization of the robotic third arm that is being
simulated. The robotic third arm moves back and forth opening its hand on
the most extreme end of its position.">Bento Arm Visualization</a>
</strong>
</div>
<div class="row">
<?xml version="1.0" encoding="UTF-8"?>
<svg id="bento-arm" style="max-width:150px;max-height:190px" width="189" height="911" version="1.1" viewBox="0 0 189 911"
xmlns="http://www.w3.org/2000/svg" xmlns:cc="http://creativecommons.org/ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<metadata>
<rdf:RDF>
<cc:Work rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type rdf:resource="http://purl.org/dc/dcmitype/StillImage"/>
<dc:title/>
</cc:Work>
</rdf:RDF>
</metadata>
<g>
<path d="m40.338 326.9c7.9396 43.482 8.8024 86.963 9.9536 130.44 35.973-0.6985 80.832 4.5276 107.92-2.0955-4.8356-43.855-3.0504-86.386-1.0478-128.87l-5.2387-0.52388c-4.7552 38.189-13.04 75.496-7.3342 116.3-30.361 9.5762-60.218 13.687-88.535 1.0478 0.91309-39.431 0.9598-78.765-10.477-116.82z"
fill="none" stroke="#000" stroke-width="1px"/>
<path d="m88.918 457.24c-11.237-0.36043-24.349-0.67834-29.137-0.70648l-8.7059-0.0512-0.35089-12.442c-1.4313-50.752-3.6448-80.524-7.9859-107.41-0.75729-4.69-1.2851-8.6112-1.1729-8.7137 0.1122-0.10254 0.93688-0.34328 1.8326-0.53496l1.6286-0.34852 1.6345 6.1809c6.6066 24.983 9.3324 53.101 8.9074 91.88l-0.2056 18.758 3.9426 1.555c5.2335 2.0642 12.243 4.0324 18.349 5.1522 7.7805 1.4269 26.727 1.2968 36.423-0.25018 7.1992-1.1486 19.749-4.0646 27.427-6.3723l3.8661-1.1621-0.37049-3.4937c-1.1419-10.768-1.31-14.081-1.3229-26.075-0.0204-18.934 0.73647-27.427 5.5039-61.762 1.5603-11.237 2.9775-21.448 3.1494-22.691 0.3074-2.2227 0.33996-2.2541 1.978-1.9033 0.91601 0.19614 1.7497 0.43292 1.8526 0.52619 0.10292 0.0933-0.0815 6.1361-0.40981 13.429-2.0441 45.403-1.5781 84.278 1.3308 111.01l0.31923 2.9342-2.0234 0.34985c-16.997 2.9389-29.194 3.3309-66.46 2.1356z"
fill="#999" fill-rule="evenodd" opacity=".99" stroke="#4d4d4d"
stroke-width=".5247"/>
<g fill="#1a1a1a" fill-rule="evenodd" stroke="#4d4d4d" stroke-width="1.0016">
<rect x="59.817" y="348.47" width="77.01" height="94.821" ry="8.382"
opacity=".99"/>
<rect x="56.578" y="313.28" width="1.0477" height="65.484" ry=".52387"
opacity=".99"/>
<rect x="138.83" y="315.63" width="1.0477" height="65.484" ry=".52387"
opacity=".99"/>
</g>
<rect x="57.626" y="315.63" width="82.248" height="12.573" ry=".52388"
fill="#4d4d4d" fill-rule="evenodd" opacity=".99" stroke="#4d4d4d"
stroke-width="1.0016"/>
<path d="m57.626 315.63-7.8581-1.3097v-35.623c37.585-4.93 70.501-5.1913 96.917 1.0478v33.004l-6.8104 2.8813z"
fill="none" stroke="#000" stroke-width="1px"/>
<g fill-rule="evenodd" stroke="#4d4d4d">
<path d="m58.177 313.86c-0.22048-1.5454-1.807-1.8347-2.2586-0.4119-0.29612 0.933-0.62983 1.0468-2.2712 0.77479-1.0583-0.1754-2.2188-0.41102-2.5789-0.52359-0.5062-0.15822-0.65484-4.0599-0.65484-17.189v-16.984l1.7026-0.29743c0.93643-0.16358 5.003-0.65902 9.0368-1.101 31.418-3.4422 60.1-2.9276 81.201 1.457l3.1432 0.65316 0.13843 16.006 0.13842 16.006-6.5633 2.7859-40.433-1e-3 -40.433-2e-3z"
fill="#4d4d4d" opacity=".99" stroke-width=".5247"/>
<path d="m55.621 314.72-0.3591-0.0827 0.39184-0.22311c0.38929-0.22165 0.39184-0.22101 0.39184 0.0982 0 0.17673-0.01473 0.31436-0.03274 0.30583-0.01801-9e-3 -0.19434-0.0527-0.39184-0.0982z"
fill="#4d4d4d" opacity=".99" stroke-width=".13118"/>
<rect x="59.29" y="187.07" width="77.01" height="94.821" ry="8.382"
fill="#1a1a1a" opacity=".99" stroke-width="1.0016"/>
</g>
<g fill="#4d4d4d" fill-rule="evenodd" stroke="#4d4d4d" stroke-width=".13118">
<path d="m57.388 315.01c-0.09629-0.0389-0.16371-0.30429-0.16371-0.64451 0-0.31814-0.05894-0.57844-0.13097-0.57844-0.07203 0-0.13097 0.27152-0.13097 0.60337 0 0.49639-0.03673 0.58928-0.20716 0.52388-0.11394-0.0437-0.29075-0.0795-0.39291-0.0795-0.1367 0-0.18587-0.18157-0.18623-0.68759l-4.85e-4 -0.68758-0.26151 0.42891c-0.4137 0.67853-0.76747 0.75254-2.3569 0.49308-2.4339-0.39733-2.6325-0.46303-2.8303-0.93652-0.35688-0.85412-0.43985-4.4577-0.44185-19.19l-2e-3 -14.882 0.2792-0.073c0.15356-0.0402 1.0081-0.17764 1.899-0.30552 0.89092-0.12787 1.6198-0.27685 1.6198-0.33105s0.07738-0.0689 0.17195-0.0326c0.09457 0.0363 2.5551-0.19892 5.4679-0.52269 2.9128-0.32377 5.738-0.62504 6.2783-0.66949 0.54025-0.0444 1.0707-0.11339 1.1787-0.15319s0.4455-0.0604 0.74989-0.0457c0.30439 0.0146 0.62613-0.0183 0.71499-0.0732 0.08886-0.0549 0.413-0.0844 0.72033-0.0655 0.30732 0.0189 0.62144-4e-3 0.69803-0.0517 0.07659-0.0473 0.50531-0.0992 0.9527-0.11528 1.5794-0.0567 7.6247-0.52638 7.6893-0.59738 0.03602-0.0396 0.62538-0.0773 1.3097-0.0839 0.68431-7e-3 2.6292-0.0821 4.322-0.16778 16.578-0.8396 33.66-0.33351 46.343 1.373 0.81745 0.10999 1.5689 0.16827 1.6698 0.12953 0.10098-0.0387 0.18359-7e-3 0.18359 0.0707 0 0.0841 0.18446 0.11965 0.45632 0.0879 0.25098-0.0293 0.47199-8e-3 0.49113 0.0474 0.0192 0.0554 0.38843 0.12649 0.82063 0.15802 0.43219 0.0315 0.87421 0.0876 0.98226 0.12463 0.10805 0.037 1.1394 0.22408 2.292 0.41567 1.1525 0.19159 3.2068 0.57713 4.565 0.85675 1.3582 0.27961 2.5075 0.50111 2.5539 0.49221 0.0464-9e-3 0.17279 0.0192 0.28084 0.0624s0.31433 0.0976 0.45839 0.12071c0.28551 0.0459 0.28309-0.0839 0.47101 25.294l0.0522 7.051-6.3947 2.7275h-40.7c-38.652 0-40.703-0.0115-40.76-0.2292-0.05704-0.21828-0.06062-0.21828-0.07509 0-0.01467 0.22128-0.28962 0.28275-0.63729 0.14246zm-0.32508-2.157 0.35782-0.0827-0.38259-0.0155c-0.21043-9e-3 -0.49037 0.082-0.6221 0.20125-0.32806 0.29689-0.3048 0.49183 0.02477 0.20762 0.14535-0.12536 0.4253-0.26515 0.6221-0.31065z"
opacity=".99"/>
<path d="m50.37 313.83c-0.04802-0.048-0.07996-0.47531-0.07097-0.94953l0.01634-0.86221 0.09691 0.65957c0.0533 0.36277 0.13832 0.79005 0.18894 0.94953 0.08745 0.27551-0.04354 0.39031-0.23122 0.20264z"
opacity=".99"/>
</g>
</g>
<g id="chopstick-gripper">
<path d="m111.87 2.423-4.9768-0.78581-3.9291 33.528-50.816 84.606-8.1201 59.984 40.338 43.482c19.969 14.365 42.304-2.8518 32.48-29.861l-55.793-60.769c-3.8985-3.7124-3.2551-8.2505 0.52388-13.359l46.887-78.057z"
fill="#ccc" stroke="#000" stroke-width="1px"/>
</g>
<path d="m136.32 241.15 0.37043-46.675 13.336-62.233-34.821-86.311-0.37044-43.711 5.927-0.37044-0.37044 34.45 36.303 88.904v117.43h-20.004z"
fill="none" stroke="#000" stroke-width="1px"/>
<path d="m137.23 217.97 0.19577-23.986 13.338-61.83-34.879-86.344-0.0576-42.97 4.327-0.22419-0.24759 16.967-0.24758 16.967 36.36 88.886v116.52h-18.986z"
fill="#e6e6e6" fill-rule="evenodd" opacity=".99" stroke="#4d4d4d"
stroke-width=".37102"/>
<path d="m7.3342 777.43c-7.8363-98.395 24.486-208.26 43.482-314.32l106.87-0.52387c13.422 85.3 24.268 170.17 29.337 254.08l-1.5716 72.295-27.765-53.435-60.486-17.812-54.766 20.955z"
fill="none" stroke="#000" stroke-width="1px"/>
<path d="m183.56 774.3c-6.7865-29.629-22.363-52-43.458-62.416-10.324-5.0975-18.339-7.0199-31.019-7.4395-6.466-0.21397-10.183-0.0705-14.753 0.56953-29.525 4.1352-58.043 22.239-75.013 47.621-3.8449 5.7508-8.0356 13.721-9.8222 18.681l-1.2212 3.3903-0.40798-1.8187c-0.78244-3.488-1.0994-36.021-0.46946-48.182 1.488-28.724 4.6401-55.813 10.515-90.368 4.3963-25.858 7.51-41.646 22.007-111.59 2.3292-11.237 5.8787-29.017 7.8876-39.51s3.7111-19.127 3.7825-19.185c0.07143-0.0586 23.823-0.27447 52.782-0.47978l52.652-0.37328 1.4609 9.605c12.449 81.848 19.512 139.22 24.434 198.46 3.3965 40.881 3.3795 40.398 2.6613 75.962-0.35788 17.72-0.68391 32.258-0.7245 32.307-0.0406 0.0487-0.62289-2.3088-1.294-5.2388z"
fill="#999" fill-rule="evenodd" opacity=".99" stroke="#4d4d4d"
stroke-width=".5247"/>
<g fill="#333" fill-rule="evenodd" stroke-width="1.0016">
<path d="m149.25 816.66a51.152 52.042 0 0 1-50.816 52.341 51.152 52.042 0 0 1-51.486-51.658 51.152 52.042 0 0 1 50.734-52.423 51.152 52.042 0 0 1 51.567 51.574"
fill-opacity=".22672" opacity=".99" stroke="#2b0000"/>
<g stroke="#4d4d4d">
<path d="m70.289 816.96a4.448 4.8928 0 0 1-4.4188 4.9209 4.448 4.8928 0 0 1-4.4771-4.8567 4.448 4.8928 0 0 1 4.4116-4.9287 4.448 4.8928 0 0 1 4.4841 4.8489"
opacity=".99"/>
<path d="m133.01 816.96a4.448 4.8928 0 0 1-4.4188 4.9209 4.448 4.8928 0 0 1-4.4771-4.8567 4.448 4.8928 0 0 1 4.4116-4.9287 4.448 4.8928 0 0 1 4.4841 4.8489"
opacity=".99"/>
<path d="m117.88 843.2a4.448 4.8928 0 0 1-4.4188 4.9209 4.448 4.8928 0 0 1-4.4771-4.8567 4.448 4.8928 0 0 1 4.4116-4.9287 4.448 4.8928 0 0 1 4.4841 4.8489"
opacity=".99"/>
<path d="m85.413 843.2a4.448 4.8928 0 0 1-4.4188 4.9209 4.448 4.8928 0 0 1-4.4771-4.8567 4.448 4.8928 0 0 1 4.4116-4.9287 4.448 4.8928 0 0 1 4.4841 4.8489"
opacity=".99"/>
<path d="m85.413 788.04a4.448 4.8928 0 0 1-4.4188 4.9209 4.448 4.8928 0 0 1-4.4771-4.8567 4.448 4.8928 0 0 1 4.4116-4.9287 4.448 4.8928 0 0 1 4.4841 4.8489"
opacity=".99"/>
<path d="m117.88 788.04a4.448 4.8928 0 0 1-4.4188 4.9209 4.448 4.8928 0 0 1-4.4771-4.8567 4.448 4.8928 0 0 1 4.4116-4.9287 4.448 4.8928 0 0 1 4.4841 4.8489"
opacity=".99"/>
<path d="m112.62 804.29a4.448 4.8928 0 0 1-4.4188 4.9209 4.448 4.8928 0 0 1-4.4771-4.8567 4.448 4.8928 0 0 1 4.4116-4.9287 4.448 4.8928 0 0 1 4.4841 4.8489"
opacity=".99"/>
<path d="m90.032 804.29a4.448 4.8928 0 0 1-4.4188 4.9209 4.448 4.8928 0 0 1-4.4771-4.8567 4.448 4.8928 0 0 1 4.4116-4.9287 4.448 4.8928 0 0 1 4.4841 4.8489"
opacity=".99"/>
<path d="m90.032 828.06a4.448 4.8928 0 0 1-4.4188 4.9209 4.448 4.8928 0 0 1-4.4771-4.8567 4.448 4.8928 0 0 1 4.4116-4.9287 4.448 4.8928 0 0 1 4.4841 4.8489"
opacity=".99"/>
<path d="m112.62 827.67a4.448 4.8928 0 0 1-4.4188 4.9209 4.448 4.8928 0 0 1-4.4771-4.8567 4.448 4.8928 0 0 1 4.4116-4.9287 4.448 4.8928 0 0 1 4.4841 4.8489"
opacity=".99"/>
</g>
</g>
<g fill="none" stroke="#000" stroke-width="1px">
<path d="m4.2715 809.64c-7.1045-104.63 176.33-170.85 183.08 1e-5"/>
<path d="m4.2715 809.64 4.5753-0.10608"/>
<path d="m8.8468 809.53c-2.8944-102.33 170.07-158.43 174.95 0.79255"/>
<path d="m183.8 810.32 3.5482-0.68646"/>
</g>
<path d="m46.995 732.12c0.14333-0.23192 0.45035-0.42166 0.68226-0.42166 0.23191 0 0.30439 0.18974 0.16106 0.42166-0.14333 0.23191-0.45035 0.42166-0.68226 0.42166-0.23191 0-0.30439-0.18975-0.16106-0.42166z"
fill="#666" fill-rule="evenodd" opacity=".99" stroke="#4d4d4d"
stroke-width=".42232"/>
<path d="m184.49 808.79c-0.0956-0.30616-0.28869-2.9982-0.4291-5.9824-0.70816-15.051-4.2594-31.878-9.4779-44.91-1.6747-4.182-5.5782-11.614-8.222-15.655-13.646-20.855-34.39-32.046-59.4-32.046-14.65 0-27.853 3.227-42.256 10.328-25.557 12.6-45.322 35.397-52.827 60.933-2.2148 7.5354-3.2045 13.514-3.7168 22.453l-0.28395 4.9545h-1.3771c-1.3638 0-1.3798-0.0235-1.6604-2.4413-0.32485-2.799 0.34197-12.194 1.2184-17.166 5.0167-28.46 24.214-54.367 51.978-70.146 11-6.2513 23.438-10.613 35.93-12.601 6.8693-1.0928 19.416-0.97178 25.977 0.25055 27.178 5.0635 46.971 23.25 57.807 53.116 4.6061 12.695 8.0574 31.059 8.6919 46.246 0.11634 2.7849 0.0696 2.9592-0.82732 3.0866-0.52284 0.0742-1.0288-0.11551-1.1244-0.42166z"
fill="#666" fill-rule="evenodd" opacity=".99" stroke="#4d4d4d"
stroke-width=".42232"/>
<path d="m8.8468 809.53 9.639-3.0101c30.148-121.93 156.16-92.391 161.01-2.6834l4.3084 6.4861"
fill="none" stroke="#000" stroke-width="1px"/>
<path d="m9.5574 803.98c0.71221-18.491 6.8309-35.567 18.28-51.016 22.878-30.87 62.995-47.408 95.958-39.559 22.262 5.3011 39.564 21.02 49.355 44.838 4.1076 9.9928 7.081 21.811 8.6252 34.282 0.623 5.0316 1.2241 11.986 1.2228 14.148l-9e-4 1.2076-1.4462-2.173c-1.3923-2.0921-1.4534-2.2739-1.6382-4.8785-0.34749-4.8975-1.5819-11.81-3.0269-16.95-3.9153-13.929-10.903-25.98-20.851-35.96-6.6606-6.6818-12.651-10.978-21.188-15.196-7.6613-3.7852-15.124-6.053-24.151-7.3387-4.0285-0.57382-14.72-0.48654-18.933 0.15455-13.331 2.0286-24.344 6.7249-35.332 15.067-3.7548 2.8506-11.79 10.837-15.04 14.948-9.7958 12.392-17.765 28.959-22.417 46.598l-1.0054 3.8126-3.8761 1.2225c-2.1318 0.67235-4.0653 1.2937-4.2966 1.3809-0.3557 0.13397-0.39238-0.57377-0.23777-4.5879z"
fill-rule="evenodd" opacity=".99" stroke="#4d4d4d" stroke-width=".29863"/>
<path d="m186.2 816.96c1.8606 139.53-186.45 111.75-184.56-2.9816l7.0067-0.59632c-0.89168 105.78 169.02 137.6 170.25 0.59632z"
fill="#666" stroke="#000" stroke-width="1px"/>
<path d="m97.282 910.42c-14.068-1.0761-24.756-3.5807-36.375-8.5245-8.1977-3.4879-17.227-8.8543-24.002-14.265-3.9253-3.1348-10.366-9.3965-13.486-13.111-13.055-15.544-20.267-34.391-21.01-54.909l-0.17992-4.9672 1.7589-0.18334c0.96742-0.10084 2.2422-0.23173 2.8328-0.29087l1.0738-0.10752 0.1952 4.0892c0.51726 10.836 2.6631 20.973 6.3401 29.952 9.1035 22.231 26.711 39.891 49.607 49.754 27.362 11.788 57.447 10.239 79.31-4.082 21.946-14.376 34.505-40.193 36.013-74.032l0.21291-4.7774 6.1428 2.4556-0.17778 4.6215c-0.82753 21.512-4.8261 36.8-13.396 51.217-12.139 20.422-33.322 33.609-58.769 36.586-3.8541 0.45083-13.293 0.78677-16.092 0.57271z"
fill="#999" fill-rule="evenodd" opacity=".99" stroke="#4d4d4d"
stroke-width=".29863"/>
<path d="m94.002 909.83c-11.129-1.151-19.195-2.9718-28.422-6.4163-10.005-3.7346-22.103-10.654-29.57-16.911-3.2715-2.7414-11.154-10.711-13.591-13.741-12.22-15.194-18.825-33.041-19.695-53.219l-0.19827-4.5983 1.7595-0.18722c0.96775-0.10297 2.1093-0.24213 2.5368-0.30926l0.77726-0.12204 0.19907 3.7855c1.0214 19.423 6.8949 35.787 18.206 50.723 20.403 26.942 56.743 41.543 89.465 35.947 29.67-5.0741 50.913-25.158 59.774-56.511 2.5305-8.9538 4.4752-22.317 4.4783-30.772 4e-4 -1.1069 0.0669-2.0126 0.14774-2.0126 0.0809 0 1.3717 0.49016 2.8685 1.0892l2.7215 1.0892-0.20695 4.5012c-0.6937 15.088-2.6759 26.126-6.6177 36.852-1.3658 3.7162-4.9275 11.051-7.0943 14.61-10.665 17.515-28.477 29.904-49.512 34.438-5.2538 1.1323-9.1653 1.6139-15.672 1.9296-7.2288 0.35079-7.3843 0.3487-12.355-0.16537z"
fill="#666" fill-rule="evenodd" opacity=".99" stroke="#4d4d4d"
stroke-width=".29863"/>
<path d="m8.6466 813.38 9.5411 0.29816c58.491 81.31 111.76 76.383 160.71 0.29816"
fill="none" stroke="#000" stroke-width="1px"/>
<path d="m93.555 904.13c-18.689-1.9248-36.48-9.1398-50.985-20.677-4.0344-3.2089-11.156-10.32-14.26-14.238-6.9173-8.7328-12.489-19.445-15.432-29.668-1.9856-6.8987-3.1072-13.893-3.5433-22.097l-0.18862-3.5487 2.4747 0.19263c1.3611 0.10595 3.2942 0.19344 4.2957 0.19442l1.8211 2e-3 3.9906 5.2923c15.88 21.06 31.094 35.59 46.586 44.492 17.87 10.269 35.929 12.131 53.339 5.4993 18.064-6.8806 36.14-23.369 53.58-48.873l3.0582-4.4724-0.19428 3.4288c-1.0176 17.96-4.1509 31.169-10.432 43.978-11.345 23.137-31.288 37.277-56.78 40.259-3.4055 0.39836-14.314 0.54747-17.33 0.23689z"
fill="#1a1a1a" fill-rule="evenodd" opacity=".99" stroke="#4d4d4d"
stroke-width=".29863"/>
</svg>
</div>
</div>
</div>
</div> <!-- End container for robotic visualization -->
<div class="col-lg-6 col-md-6 col-sm-6 h-100" style="text-align:center">
<strong>
<a
href="#"
data-toggle="popover"
data-placement="bottom"
data-content="A chart depicting the learned prediction (grey) vs the true return
(yellow). We do not know the true-return, so we maintain a buffer of previous cumulants
c to estimate the return. This estimate is often called the empirical return.
Because it is an estimate based on a buffer of data, the yellow-line moves over-time;
As more cumulants are observed on subsequent time-steps, we can better estimate the
return for previous time-steps."
>Learned Prediction vs Signal</a>
</strong>
<div id="graph" width="300" height="200"></div>
</div> <!-- End container for prediction visualization -->
</div> <!-- End container for first row -->
<hr>
<div class="row h-auto">
<div class="col-lg-6 col-md-6 col-sm-6">
<div class="row">
<div class="col-lg-6 pull-left">
<h3 id="time-step-counter"></h3>
</div>
<div class="col-lg-6">
<h3 id="active_state"></h3>
</div>
<div class="col-lg-12">
<!--TD eq'n-->
<strong>
<a
href="#"
data-toggle="popover"
data-placement="top"
title="The Temporal difference error"
data-content="The difference between the estimated value of the current state v(s)
and the observed value c + γ v(s'). We call the observed value bootstrapped,
as it is depends on the observed value c, and the discounted estimate of the next
state γ v(s')."
>
δ</a>
←
<a
href="#"
data-toggle="popover"
data-placement="top"
title="Cumulant (a.k.a Pseudo-reward)"
data-content="The signal of interest we accumulate.">c</a> + <a
href="#"
data-toggle="popover"
data-placement="top"
title="The discounting function"
data-content="γ(s,a,s') determines how future values are discounted. In this
example, our discount is a constant value 0 ≤ γ ≤ 1. When the discount
is a constant value, we can think of the prediction as being over a horizon of
1/(1-γ).">γ</a>
<a
href="#"
data-toggle="popover"
data-placement="top"
title="The value of the next state"
data-content="The estimated value of the next state s'.">
v(s')
</a>
- <a
href="#"
data-toggle="popover"
data-placement="top"
title="The value of the current state"
data-content="Estimate of the value of the current state s.">v(s)</a> <br>
</strong>
<!--Placeholders for the update values-->
<p style="display:inline" id="td-td">0</p>
←
<p style="display:inline" id="td-cumulant">0</p>
+
<p style="display:inline" id="td-gamma">0</p>
*
<p style="display:inline" id="td-v-next">
0
</p> - <p style="display:inline" id="td-v">0</p>
<hr>
<strong>
<!--Eligibility trace description-->
<a
href="#"
data-toggle="popover"
data-placement="top"
title="Eligibility Traces"
data-content="Trace of state visitation in the current state s.">
e(s)
</a>
←
<a
href="#"
data-toggle="popover"
data-placement="top"
title="The discounting function"
data-content="γ(s,a,s') determines how future values are discounted. In this
example, our discount is a constant value 0 ≤ γ ≤ 1. When the discount
is a constant value, we can think of the prediction as being over a horizon of
1/(1-γ).">
γ
</a>
<a
href="#"
data-toggle="popover"
data-placement="top"
title="Lambda, the eligibility decay."
data-content="Todo.">λ</a> <a
href="#"
data-toggle="popover"
data-placement="top"
title="todo"
data-content="todo">e(s)</a> + <a
href="#"
data-toggle="popover"
data-placement="top"
title="todo"
data-content="todo">1</a> </strong> <br>
<p style="display:inline" id="e-e">0</p> ← <p style="display:inline" id="e-gamma">0</p>
* <p style="display:inline" id="e-lambda">0</p> * <p style="display:inline" id="e-e2">0</p>
+ <p style="display:inline" id="e-phi">1</p>
<hr>
<strong>
<a
href="#"
data-toggle="popover"
data-placement="top"
title="Value of state S (after update)"
data-content="The value of state S after it has been updated using the experience
from the current time-step.">v(s)</a> ← <a
href="#"
data-toggle="popover"
data-placement="top"
title="Value of state S (before update)"
data-content="The value of state s before we update it.">v(s)</a> + <a
href="#"
data-toggle="popover"
data-placement="top"
title="Step-size (a.k.a learning rate)"
data-content="A scaling factor that determines the amount by which we reduce the
error for this particular time-step.">α</a> <a
href="#"
data-toggle="popover"
data-placement="top"
title="Temporal-difference error"
data-content="Using the previously calculated TD error δ, we update the
weights for the given state, moving them in a direction to reduce the error for
this time-step.">δ</a> </strong> <br>
<p style="display:inline" id="w-w">0</p> ← <p style="display:inline" id="w-w2">0</p> +
<p style="display:inline" id="w-alpha">0</p> * <p style="display:inline" id="w-td">0</p>
</div>
</div>
</div>
<div class="col-lg-6 col-md-6 col-sm-6 text-center">
<strong>
<a
href="#"
data-toggle="popover"
data-placement="top"
title="Learned weights: where we store the predictions"
data-content="The Weight vector is where we store our estimates V(s). We have an array whose
length is the number of states |S|. When we are performing an update, we adjust the value of
the weight corresponding to the state. This is considered a tabular representation of the
world: an agent maintains a table. However, we can use many different kinds of
representations, such as those constructed by neural nets.
">Weight Vector</a> & <a
href="#"
data-toggle="popover"
data-placement="top"
title="todo"
data-content="todo">Elegibilty
Traces</a></strong>
<div style="max-height:150px" id="chart">
</div>
</div>
</div>
<hr>
<div class="row h-auto flex ">
<div class="col-lg-4 col-md-4 col-sm-4" style="text-align: center; vertical-align: middle">
<a
href="#"
data-toggle="popover"
data-placement="top"
title="What a prediction is about"
data-content="The question parameters determine what a predictive question is about.
The cumulant c determines what the signal of interest is: it can be any signal available to
the agent from the environment, or its own learning processes. The discount parameter γ
determines what kind of accumulation the value function is approximating. In this case we
consider only constant discounts. For constant discounts, the γ can be thought of as
describing the horizon over which a prediction is being made, where horizon = 1/(1-γ)."
><b>Question Parameters</b></a>
<div class="col-lg-12">
c : <select id="cumulant"></select>
<br>
</div>
<div class="col-lg-12">
γ : <input type="number" id="gamma" value="null">
<br>
</div>
</div>
<div class="col-lg-4 col-md-4 col-sm-4" style="text-align: center; vertical-align: middle">
<a
href="#"
data-toggle="popover"
data-placement="top"
title="How the answer is learned"
data-content="These parameters determine modify how the learning algorithm operates.
The step-size α specifies the amount by which the TD-error δ is used to update
the weights on each time-step. The eligibility decay λ determines how much previous
states are updated based on the current observation."
>
<b>Answer Parameters</b>
</a>
<div class="col-lg-12">
λ : <input type="number" id="lambda" value="null">
<br>
</div>
<div class="col-lg-12">
α : <input type="text" id="step_size" name="lname">
<br>
</div>
</div>
<div class="col-lg-4 col-md-4 col-sm-4 justify-content-center mx-auto text-center" style="text-align: center; vertical-align: middle">
<a style="font-size:35px; text-decoration: none;padding:10px" id="play-pause">⏯︎</a>
<a style="font-size:35px; text-decoration: none;padding:10px" id="step">⏭︎</a>
<a style="font-size:35px; text-decoration: none;padding:10px" id="ff">⏩︎</a>
<a style="font-size:35px; text-decoration: none;padding:10px" id="reset">🔄</a>
</div>
</div>
<hr>
</div>
<div class="col-lg-5 h-100"
style="padding: 10px 10px 0px 10px;box-shadow: inset 7px 0 9px -7px rgba(0,0,0,0.4);overflow-y:scroll;max-height: 100vh;">
<div class="">
<h1>General Value Functions: A Visual Primer</h1>
<p>
Before we come to understand how predictions are made, we first must have an understanding of how we
could describe
an agent interacting with its environment. The interaction of the agent with the world is formalized
as a series
of observations and actions. An agent may take an action $a_t$, and be presented with a new
observation
\(o_{t+1}\). If the problem is, say, a board game, $a_t$ may be the move taken by the agent, and
\(o_{t+1}\)
might be an encoding of the change in the board resulting from the move. Because the board is
<i>fully observable</i>, $o_t$ perfectly describes the state of the world.
Oftentimes, real-world problems are not fully observable. For instance, imagine a robot interacting
with the
world where observations $o$ describe the sensor readings available to the robot on a
moment-to-moment basis.
The world is <i>partially observable</i> to the robot. There are many aspects of the world which are
not
perceived by the sensors of the robot---e.g., objects outside of its visual field---so $o_t$ in this
case
does not describe the state of the world, but rather the <i>agent state</i>: the state of the world
from
the agent's perspective.
</p>
<p>
General Value Functions (GVFs) estimate the future accumulation of a signal $c$. In the simplest
case, this might be the accumulation of some element of an agent's observation, $c \in o$. The
discounted sum of $c$ is called the <i>return</i>, and is defined over discrete time-steps
$t = 1,2,3,...,n$ as
$G_t = \mathbb{E}_\pi [ \sum^\infty_{k=0}(\prod^{k}_{j=1}\gamma_{t+j})C_{t+k+1} ]$---the expectation
of how a signal will accumulate over time. What the GVF's prediction is about is determined by its
<i>question parameters</i>: the signal of interest $C$ (often called the <i>cumulant</i>), a
discounting function $0 \leq \gamma(o_t, a_t, o_{t+1}) \leq 1$, and a policy $\pi$ which describes
the behaviour over which the predictions are made. In the simplest case, the discounting function
is a constant value that describes the horizon over which a prediction is made. For example, if
$\gamma=0.9$, the corresponding GVF predicts the accumulation of $c$ over $\frac{1}{1-\gamma} = 10$
time-steps. By making predictions like this, we can anticipate how a signal changes over a period
of time.
</p>
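<p>
To make the return concrete, the following is a minimal JavaScript sketch (the function is
hypothetical and not part of simulator.js) of how $G_t$ can be approximated from a buffer of
observed cumulants under a constant discount, much as the empirical return in the chart above is
estimated:
</p>
<pre><code>// Approximate G_t from a buffer of future cumulants [c_{t+1}, c_{t+2}, ...]
// assuming a constant discount 0 ≤ gamma ≤ 1.
function empiricalReturn(cumulants, gamma) {
    let g = 0;
    let discount = 1;
    for (const c of cumulants) {
        g += discount * c;   // add gamma^k * c_{t+k+1}
        discount *= gamma;   // maintain the running product of discounts
    }
    return g;
}
// e.g., empiricalReturn([1, 1, 1, 1], 0.9) ≈ 3.439
</code></pre>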
<p>
The discounting function and cumulant can also be used to express more complex predictions.
For instance, we can specify a GVF that asks the question "How long until we see <i>x</i>" by using
the
following cumulant and discount:
\[
c =
\begin{cases}
1 &\text{if}\quad o_i = x\\
0 &\text{otherwise}
\end{cases}
\qquad
\gamma =
\begin{cases}
0 &\text{if}\quad c = 1\\
0.9 &\text{otherwise}
\end{cases}
\]
This has the effect of counting the time-steps until $o_i$ takes on the value of $x$.
There are many more possible ways that cumulants and discounts could be specified, but we choose
these two
examples to give a flavour of what can be expressed.
</p>
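<p>
As a sketch (these helper functions are illustrative and not part of the simulator), the two
question parameters above could be written as:
</p>
<pre><code>// Cumulant: 1 on the time-step where the observation of interest equals x, else 0.
function cumulant(o_i, x) {
    return o_i === x ? 1 : 0;
}

// Discount: terminate the accumulation (gamma = 0) when the cumulant fires, else continue at 0.9.
function discount(c) {
    return c === 1 ? 0 : 0.9;
}
</code></pre>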
<p>
There is one final question parameter that we haven't discussed yet: the policy $\pi$. We want our
predictions to be a function not only of what we observe, but also of the actions we take: they
should capture how the environment changes in response to our own behaviours. To do so, we
condition the expectation on the policy, where $\pi(a \mid o) = \mathbb{P}(a \mid o)$. That is, the
policy describes the probability of taking an action $a$ given observation $o$. For instance, if an
agent had three actions---turn left, move forwards, or turn right---we could specify a policy as
follows:
\[\pi = [0,1,0],\]
which would mean that each prediction is conditioned on the agent moving forwards.
If this were the policy of our counting GVF, the question would become "how long until we see
<i>x</i> if we continue moving forwards?" Possibly the most powerful aspect of General Value
Functions is their ability to condition predictions on an agent's actions.
</p>
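<p>
For illustration, here is a small sketch of that policy in JavaScript (assuming the three actions
are indexed in the order listed above; this code is not part of the simulator):
</p>
<pre><code>// pi[a] is the probability of taking action a; here it ignores the observation.
const ACTIONS = ["turn-left", "move-forwards", "turn-right"];
const pi = [0, 1, 0];   // always move forwards

// Sample an action from the policy's probabilities.
function sampleAction(probs) {
    const r = Math.random();
    let cumulative = 0;
    for (const [a, p] of probs.entries()) {
        cumulative += p;
        if (cumulative > r) return ACTIONS[a];
    }
    return ACTIONS[ACTIONS.length - 1];   // guard against floating-point rounding
}
// sampleAction(pi) always returns "move-forwards" for pi = [0, 1, 0].
</code></pre>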
<p>
Having determined how we want to express predictions, we need a way to learn them: to estimate
their values using the observations and actions available to the agent. General Value Functions
can be estimated using standard value-function approximation methods from computational
reinforcement learning. In this context, we consider temporal-difference learning.
</p>
<p>
In temporal-difference learning we estimate a value function $v$ such that
$v(\phi(o_t)) \approx \mathbb{E}_\pi [G_t | o_t]$: we learn a function that estimates the return at
a given time-step from the agent's observations. On each time-step the agent receives a vector of
observations $o \in \mathbb{R}^m$, where each element $o_i$ is a different real-valued input. A
function approximator $\phi : o \rightarrow \mathbb{R}^n$---such as a neural net, Kanerva coder, or
tile coder---may be used to encode the observations into a <i>feature vector</i>. The estimate for
each time-step, $v(\phi(o_t))$, is a linear combination of the learned weights $w\in \mathbb{R}^n$
and the current feature vector: $v(o_t) = w^\top\phi(o_t)$.
</p>
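<p>
In code, this linear estimate is just a dot product between the weight vector and the feature
vector. A minimal sketch (not the simulator's implementation; in this demo's tabular setting,
$\phi(o)$ is simply a one-hot indicator of the current state):
</p>
<pre><code>// v(o) = w · phi(o): dot product of the weight and feature vectors.
function value(w, phi) {
    return w.reduce(function (sum, wi, i) { return sum + wi * phi[i]; }, 0);
}
</code></pre>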
<p>
How do we learn the weights? We need an error metric with which we can adapt the weights over time:
a measure of how accurate our guess $v(o_t)$ was. In traditional supervised learning,
we compare the estimated value to the true value. Here, we do not yet know the true value
of the expected return $G_t$: to compute it exactly, we would need to collect
$c_t \cdots c_n$, where $n$ is possibly infinite. To resolve this, we estimate the return by
<i>bootstrapping</i> on our own estimates: we estimate the value of the return using our current
approximate value function $v$, that is, $G_t \approx c_t + \gamma(o_t, a_t, o_{t+1}) v(o_{t+1})$.
We can then form the temporal-difference error
$\delta_t = c_t + \gamma(o_t, a_t, o_{t+1}) v(o_{t+1}) - v(o_t)$
(line 3, Algorithm 1). The more accurate our estimate $v(o_{t+1})$ is, the more accurate our error
$\delta$ is: we build the error through which we learn our estimates using our existing estimates.
The value function's weights are learned iteratively, updating them on each time-step to reduce the
temporal-difference error: $w_{t+1} = w_t + \alpha\delta\phi(o_t)$, where the step-size (or
learning rate) satisfies $\alpha > 0$.
</p>
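<p>
Putting the pieces together, one time-step of learning combines the TD error with the
eligibility-trace and weight updates shown in the panel on the left. The following is a minimal
sketch with accumulating traces (a hypothetical helper, not the simulator's actual implementation):
</p>
<pre><code>// One TD(lambda) update with accumulating traces.
// w: weights, e: eligibility traces, phi/phiNext: feature vectors for s and s',
// c: cumulant, gamma: discount, lambda: trace decay, alpha: step-size.
function tdLambdaUpdate(w, e, phi, phiNext, c, gamma, lambda, alpha) {
    const dot = (a, b) => a.reduce((s, ai, i) => s + ai * b[i], 0);
    const delta = c + gamma * dot(w, phiNext) - dot(w, phi);   // TD error
    w.forEach(function (wi, i) {
        e[i] = gamma * lambda * e[i] + phi[i];   // decay the trace, then add the current feature
        w[i] = wi + alpha * delta * e[i];        // move the weight to reduce this step's error
    });
    return delta;
}
</code></pre>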
<p>
We call the parameters of the learning methods <i>answer parameters</i>. Answer parameters change
how an
agent answers a question. Answer parameters include the step-size (also known as the
<i>learning rate</i>) $\alpha$, which scales updates to the weights, and the linear or non-linear
function approximator $\phi$ used to construct state.
</p>
</div>
</div>
</div>
</div>
<script src="static/js/chart.min.js"></script>
<script src="static/js/apexcharts.min.js"></script>
<script src="static/js/simulator.js"></script>
<script>
$(document).ready(function(){
$('[data-toggle="tooltip"]').tooltip();
});
$(document).ready(function() {
$('[data-toggle="popover"]').popover({
trigger: 'hover'
});
});
</script>
</body>
</html>