LeoPostings.leo
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet ekr_stylesheet?>
<leo_file>
<leo_header file_format="2" tnodes="0" max_tnode_index="0" clone_windows="0"/>
<globals body_outline_ratio="0.5" body_secondary_ratio="0.5">
<global_window_position top="50" left="50" height="500" width="700"/>
<global_log_window_position top="0" left="0" height="0" width="0"/>
</globals>
<preferences/>
<find_panel_settings/>
<vnodes>
<v t="ekr.20071028032354"><vh>@chapters</vh></v>
<v t="ekr.20050421221914"><vh>About this file</vh></v>
<v t="ekr.20050425064819"><vh>3.x 2001 @file trees</vh>
<v t="ekr.20050425064819.1"><vh>Designing @file trees (From LeoDocs.Leo)</vh>
<v t="ekr.20050425064819.2"><vh>Deciding to do Leo2</vh></v>
<v t="ekr.20050425064819.3"><vh>A prototype</vh></v>
<v t="ekr.20050425064819.4"><vh>User interaction</vh></v>
<v t="ekr.20050425064819.5"><vh>The write code</vh></v>
<v t="ekr.20050425064819.6"><vh>The read code</vh></v>
<v t="ekr.20050425064819.7"><vh>The load/save code</vh></v>
<v t="ekr.20050425064819.8"><vh>Attributes, mirroring and dummy nodes</vh></v>
<v t="ekr.20050425064819.9"><vh>Clones</vh></v>
<v t="ekr.20050425064819.10"><vh>Error recovery, at last</vh></v>
</v>
</v>
<v t="ekr.20050421212523"><vh>4.0 (2002-2003) New file format w/o child indices, eliminated error "recovery"</vh>
<v t="ekr.20050422071739"><vh> 2003-10-17 From 4.0 readme</vh></v>
<v t="ekr.20050422065602.7"><vh>2002 & 2003: Early ideas</vh>
<v t="ekr.20050421205312"><vh>2002-10-21 design.doc</vh></v>
<v t="ekr.20050421192149.12"><vh>2002-10 gti-open.doc New (long) design notes ***</vh>
<v t="ekr.20050422055636"><vh>2002-10-21 Theme 1: Global Tnode Indices </vh></v>
<v t="ekr.20050422055636.1"><vh>2002-10-21 Theme 2: Small (template) .leo files </vh></v>
<v t="ekr.20050422055636.2"><vh>2002-10-21 Theme 3: @@file nodes </vh></v>
<v t="ekr.20050422055636.8"><vh>2002-10-21 Summary of themes 1-3 </vh></v>
<v t="ekr.20050422055636.6"><vh>2002-10-21 Theme 4: Revised XML file format</vh></v>
<v t="ekr.20050422055636.7"><vh>2002-10-22 RE: Theme 4: Revised XML file format</vh></v>
<v t="ekr.20050422055636.3"><vh>2002-10-23 Yes, GTI's _are_ possible ***</vh></v>
<v t="ekr.20050422055636.4"><vh>2002-10-24 RE: Yes, GTI's _are_ possible, more </vh></v>
<v t="ekr.20050422055636.5"><vh>2002-10-24 sequence numbers in gti's </vh></v>
<v t="ekr.20050422055636.9"><vh>2002-10-24 Glorious unification & leo.py 4.0 **</vh></v>
<v t="ekr.20050422055636.10"><vh>2002-10-26 Setting global name: LeoID.txt </vh></v>
<v t="ekr.20050422060227"><vh>2002-10-24 Embedded XML != XML </vh></v>
<v t="ekr.20050422060227.1"><vh>2002-10-27 Embedded XML escapes</vh></v>
<v t="ekr.20050422060227.2"><vh>2002-10-29 Embedded XML escapes: second thoughts</vh></v>
</v>
<v t="ekr.20050421192149.13"><vh>2002-12 gti's.doc Big picture: why 4.0 is important ***</vh>
<v t="ekr.20050421205312.1"><vh>2002-12-06 Big picture: why 4.0 is important: more</vh></v>
<v t="ekr.20050421205312.2"><vh>2002-12-07 Big picture: why 4.0 is important: more</vh></v>
<v t="ekr.20050421192149.18"><vh>2002-12-07 structure rule.doc (Continuation of big picture)</vh></v>
<v t="ekr.20050421205312.3"><vh>2002-12-17 Big picture: thick or thin?</vh></v>
<v t="ekr.20050421205312.4"><vh>2002-12-18 Big picture: thick or thin?</vh></v>
<v t="ekr.20050421210335"><vh>2002-12-18 Big picture: thick or thin?</vh></v>
<v t="ekr.20050421210335.1"><vh>2002-12-18 Big picture: thick or thin?</vh></v>
<v t="ekr.20050421210335.2"><vh>2002-12-18 Big picture: thick or thin?</vh></v>
<v t="ekr.20050421210335.3"><vh>2002-12-19 Teamwork with LEO</vh></v>
<v t="ekr.20050421210335.4"><vh>2002-12-19 RE: Teamwork with LEO</vh></v>
<v t="ekr.20050421210335.5"><vh>2002-12-19 RE: Teamwork with LEO</vh></v>
<v t="ekr.20050421210335.6"><vh>2002-12-19 RE: Teamwork with LEO</vh></v>
<v t="ekr.20050421210335.7"><vh>2002-12-18 Separate presentation from content</vh></v>
<v t="ekr.20050421210335.8"><vh>2002-12-18 RE: Separate presentation from content</vh></v>
<v t="ekr.20050421210335.9"><vh>2002-12-18 RE: Separate presentation from content</vh></v>
<v t="ekr.20050421210335.10"><vh>2002-12-19 RE: Separate presentation from content</vh></v>
<v t="ekr.20050421210335.11"><vh>2002-12-19 RE: Separate presentation from content</vh></v>
<v t="ekr.20050421210335.12"><vh>2002-12-19 RE: Separate presentation from content</vh></v>
<v t="ekr.20050421210335.13"><vh>2002-12-19 RE: Separate presentation from content</vh></v>
<v t="ekr.20050421210335.14"><vh>2002-12-30 Why thick is required </vh></v>
<v t="ekr.20050421210335.15"><vh>2002-12-31 RE: Thick & thin</vh></v>
<v t="ekr.20050421210335.16"><vh>2003-01-02 RE: Thick & thin</vh></v>
<v t="ekr.20050421210335.17"><vh>2003-01-06 RE: Thick & thin</vh></v>
</v>
<v t="ekr.20050421192149.11"><vh>2003-02-18 gti_summary.doc</vh></v>
</v>
<v t="ekr.20050421194542.1" a="EM"><vh>2003-04 thru 2003-07 Doubts about reliability & resolution</vh>
<v t="ekr.20050421211313"><vh>2003-04-30 Design questions</vh></v>
<v t="ekr.20050421204424" a="M"><vh>2003-05-01 thru 2003-05-30 vxnodes (shared nodes)</vh>
<v t="ekr.20050421192149.15"><vh>2003-05-01 nodes.doc</vh></v>
<v t="ekr.20050421203956"><vh>2003-05-01 shared nodes.doc</vh></v>
<v t="ekr.20050421192149.14"><vh>2003-05-02 inodes redux.doc</vh></v>
<v t="ekr.20050421203956.4"><vh>2003-05-29 scrolling.doc</vh></v>
<v t="ekr.20050421203956.2"><vh>2003-05-30 More about vxnodes.doc</vh></v>
</v>
<v t="ekr.20050421192149.6"><vh>2003-05-06 Conflicts.doc</vh></v>
<v t="ekr.20050421192149.5"><vh>2003-05-07 Conflicts2.doc ** (user can't resolve conflicts)</vh></v>
<v t="ekr.20050421211313.1"><vh>2003-05-13 Progress.doc</vh></v>
<v t="ekr.20050421192149.2"><vh>2003-05-25 4.0 is dead, long live leo.doc ** (Valid concerns, wrong conclusion: see 2004-02-5)</vh></v>
<v t="ekr.20050421192149.20"><vh>2003-05-26 Using gnx's safely.doc</vh></v>
<v t="ekr.20050421192149.7"><vh>2003-05-27 Eliminate clones.doc</vh></v>
<v t="ekr.20050421192149.16"><vh>2003-05-30 Objections to link nodes.doc</vh></v>
<v t="ekr.20050421192149.3"><vh>2003-05-31 clones2links script.doc</vh></v>
<v t="ekr.20050421203956.3"><vh>2003-06-02 positions.doc</vh>
<v t="ekr.20050421204424.1"><vh>Overview</vh></v>
<v t="ekr.20050421204424.2"><vh>Positions</vh></v>
</v>
<v t="ekr.20050421203956.1"><vh>2003-06-02 Giant Aha re positions.doc ***</vh></v>
<v t="ekr.20050421211313.2"><vh>2003-06-09 Progress report</vh></v>
<v t="ekr.20050421192149.4"><vh>2003-06-12 Cold feet.doc</vh></v>
<v t="ekr.20050421211313.3"><vh>2003-06-17 Progress report (shared tnodes delayed)</vh></v>
<v t="ekr.20050421192149.10"><vh>2003-06-18 gnxs must go.doc ** (Valid concerns, wrong conclusion: see 2004-02-5)</vh></v>
<v t="ekr.20050421192149.19"><vh>2003-06-18 ugnx.doc</vh></v>
<v t="ekr.20050421192149.8"><vh>2003-06-19 Eliminating child indices.doc</vh></v>
<v t="ekr.20050421192149.17"><vh>2003-06-22 Reply 6-22.doc</vh></v>
<v t="ekr.20050421192149"><vh>2003-06-26 What has been gained.doc</vh></v>
<v t="ekr.20050421212523.1"><vh>2003-07-09 New design principles.doc</vh></v>
<v t="ekr.20050421192149.1"><vh>2003-07-26 ++About consistency.doc</vh>
<v t="ekr.20050421195802.1"><vh>A new 4.0? Consistency</vh></v>
<v t="ekr.20050421195802.2"><vh>A new 4.0? ironical gnx's</vh></v>
<v t="ekr.20050421195802.3"><vh>A new 4.0? Derived files can be the SUM</vh></v>
<v t="ekr.20050421195802.4"><vh>A new 4.0? .leo files must be disjoint unions</vh></v>
<v t="ekr.20050421195802.5"><vh>A new 4.0? Owned & unowned clones</vh></v>
<v t="ekr.20050421195802.6"><vh>A new 4.0? Acid tests</vh></v>
<v t="ekr.20050421195802.7"><vh>A new 4.0? Primary & secondary data</vh></v>
<v t="ekr.20050421195802.8"><vh>A new 4.0? gnx's redux</vh></v>
</v>
<v t="ekr.20050421212523.2"><vh>2003-07-30 About at-include.doc</vh></v>
<v t="ekr.20050421212523.3"><vh>2003-07-30 More about at-include.doc</vh></v>
<v t="ekr.20050421212523.4"><vh>2003-07-31 comments about 4-0 design.doc</vh></v>
<v t="ekr.20050421212523.5"><vh>2003-07-31 synch reply 2.doc</vh></v>
</v>
<v t="ekr.20050422065602.8"><vh>2003-09 Code details</vh>
<v t="ekr.20050421212523.6"><vh>2003-09-05 New 4-0 Design Notes.doc</vh>
<v t="ekr.20050421212523.7"><vh>Executive summary</vh></v>
<v t="ekr.20050421212523.8"><vh>Background and discussion</vh></v>
<v t="ekr.20050421212523.9"><vh>Examples</vh></v>
</v>
<v t="ekr.20050421212523.10"><vh>2003-09-03 ProgressReport.doc</vh></v>
<v t="ekr.20050421214628"><vh>2003-09-14 tempBodyString.doc</vh></v>
<v t="ekr.20050421214628.1"><vh>2003-09-17 Progress.doc *** (summary of features of 4.0)</vh></v>
<v t="ekr.20050421214628.2"><vh>2003-09-18 Farewell to at-ws.doc</vh></v>
<v t="ekr.20050421214628.3"><vh>2003-09-22 4-0 complete.doc</vh></v>
<v t="ekr.20050421214628.4"><vh>2003-09-23 Transition.doc</vh></v>
</v>
</v>
<v t="ekr.20050421214628.5"><vh>4.1 (2003) Unicode, gui-agnostic code and gnx's in .leo files.</vh>
<v t="ekr.20050422071739.1"><vh> 2004-02-20 From 4.1 readme</vh></v>
<v t="ekr.20050421214628.6"><vh>2003-11-03 4-1a1 released.doc</vh></v>
</v>
<v t="ekr.20050421214704"><vh>4.2 (2004) shared tnodes, positions and @thin (gnx's in derived files)</vh>
<v t="ekr.20050422071739.2"><vh> 2004-09-20 From 4.2 readme</vh></v>
<v t="ekr.20050421214921.6" a="M"><vh>2004-02-05 at-file-thin.doc ***** (abandon the synchronization principle)</vh></v>
<v t="ekr.20050421214921.16"><vh>2004-02-29 New plans-2-29-04.doc **</vh></v>
<v t="ekr.20050421214921.21"><vh>2004-02-29 shared tnode design.doc ***</vh></v>
<v t="ekr.20050421214921.9"><vh>2004-02-29 Code details of shared tnode.doc</vh></v>
<v t="ekr.20050421214921.12"><vh>2004-02-29 Details and schedule.doc</vh></v>
<v t="ekr.20050421214921"><vh>2004-03-02 Transition notes.doc</vh></v>
<v t="ekr.20050421214921.7"><vh>2004-03-02 Better convert routine.doc</vh></v>
<v t="ekr.20050421214921.19"><vh>2004-03-02 Progress report 3-2-04.doc</vh></v>
<v t="ekr.20050421214921.20"><vh>2004-03-04 Progress report 3-4-04.doc</vh></v>
<v t="ekr.20050421214921.14"><vh>2004-03-04 Iter test code.doc</vh></v>
<v t="ekr.20050421214921.11"><vh>2004-03-04 Design answers.doc</vh></v>
<v t="ekr.20050421214921.23"><vh>2004-03-04 Status report 3-5-04.doc</vh></v>
<v t="ekr.20050421214921.15"><vh>2004-03-05 iterators make positions safe.doc ***</vh></v>
<v t="ekr.20050421214921.18"><vh>2004-03-07 positions can be compatible.doc **</vh></v>
<v t="ekr.20050421214921.4"><vh>2004-03-08 A little gem.doc **</vh></v>
<v t="ekr.20050421214921.17"><vh>2004-03-09 New read logic works.doc</vh></v>
<v t="ekr.20050421214921.8"><vh>2004-03-10 cmp and nonzero.doc</vh></v>
<v t="ekr.20050421214921.24"><vh>2004-03-11 Status report 3-11-04.doc</vh></v>
<v t="ekr.20050421214921.10"><vh>2004-03-11 Compatibility report.doc</vh></v>
<v t="ekr.20050421214921.13"><vh>2004-03-13 Heavy lifting.doc</vh></v>
<v t="ekr.20050421214921.22"><vh>2004-03-14 Small code--big aha.doc **</vh></v>
<v t="ekr.20050421214921.1"><vh>2004-03-15 4-2 liftoff near.doc</vh></v>
<v t="ekr.20050421214921.26"><vh>2004-03-19 The taste of dog food.doc *** (eliminating positions)</vh></v>
<v t="ekr.20050421214921.25"><vh>2004-03-23 Status report 3-23-04.doc</vh></v>
<v t="ekr.20050421214921.3"><vh>2004-03-25 4-2a1 now on cvs.doc</vh></v>
<v t="ekr.20050421214921.2"><vh>2004-03-26 4-2 looks solid.doc</vh></v>
<v t="ekr.20050421214921.5"><vh>2004-05-01 at-file-thin works.doc</vh>
<v t="ekr.20050421221330"><vh>@all directive</vh></v>
<v t="ekr.20050421221330.1"><vh>@file-thin-wait won't work</vh></v>
<v t="ekr.20050421221330.2"><vh>Organizing projects with @file-thin</vh></v>
<v t="ekr.20050421221330.3"><vh>Embedded sentinels are essential</vh></v>
</v>
</v>
<v t="ekr.20050422065602.9"><vh>4.3 (2005) Settings dialog</vh></v>
<v t="ekr.20050425053621"><vh>The essentials</vh>
<v t="ekr.20050422065602.1"><vh>The big questions that Leo must answer</vh>
<v t="ekr.20050422065602.2"><vh>How to read derived files reliably?</vh></v>
<v t="ekr.20050422065602.3"><vh>How to ensure the integrity of data?</vh></v>
<v t="ekr.20050422065602.4"><vh>How to represent clones?</vh></v>
<v t="ekr.20050422065602.5"><vh>How to make derived files friendly to cvs?</vh></v>
<v t="ekr.20050422071828"><vh>How to handle unicode reliably?</vh></v>
</v>
<v t="ekr.20050421214628.1"></v>
<v t="ekr.20050425053635"><vh>About consistency</vh>
<v t="ekr.20050421192149.5"></v>
<v t="ekr.20050421192149.2"></v>
<v t="ekr.20050421192149.10"></v>
<v t="ekr.20050421214921.6" a="M"></v>
</v>
<v t="ekr.20050425060514"><vh>About positions</vh>
<v t="ekr.20050421203956.1"></v>
<v t="ekr.20050421214921.21"></v>
<v t="ekr.20050421214921.15"></v>
<v t="ekr.20050425064819.11"><vh>Missing paper here? Referenced in previous paper</vh></v>
<v t="ekr.20050421214921.18"></v>
<v t="ekr.20050421214921.4"></v>
<v t="ekr.20050421214921.22"></v>
<v t="ekr.20050421214921.26"></v>
</v>
</v>
<v t="ekr.20050425060514.1"><vh>To do</vh></v>
</vnodes>
<tnodes>
<t tx="ekr.20050421192149">What has been gained?
The recent design changes are subtle: much remains from the old 4.0 design. In this posting I'd like to summarize what has been gained.
1. The design of Leo is complete and solid. There should be no need for further extended design discussions about the fundamentals of Leo.
2. The "single-owner" rule for clones ensures that .leo files will remain consistent and meaningful even in collaborative environments. This ensures that @file-thin and @file x.leo (@include) can be made to work reliably.
3. There is now a simple strategy for resolving conflicts: namely the Resolve Conflicts command. This command will depend neither on detailed information from cvs nor on gnx's. This command may _use_ such information, but the Resolve Conflicts command will be designed to work even if that information is missing or unreliable.
4. It is now clear that gnx's create only non-essential information such as clone links from .leo files to derived files. Information used to resolve conflicts is also non-essential.
5. There is a clear plan for changes to Leo's file formats. In 4.0 sentinels will contain gnx's. Clone indices will be gone. Except for these changes the format of derived files will remain the same. .leo files will contain xml elements needed to recreate non-essential information such as marks and node order.
Edward
Minor choices
There are a few minor choices yet to make about the new 4.0. Please make your views known.
1. Should the new gnx's include an id field?
Now that gnx's give non-essential data it would be conceivable to eliminate the id field. This would make it a bit more convenient for new users: Leo wouldn't immediately prompt them for an id. OTOH, I believe this field might be useful in some situations, say the Resolve Conflicts command. I am inclined to retain the id field.
2. Make minimal changes to the format of derived files for 4.0?
In 4.0 sentinels will contain gnx's. Clone indices will be gone. No other changes are required for 4.0. I would prefer not to make any changes to make derived files "friendlier" to cvs. Such changes would make derived files a bit more cluttered, for very little gain.
Edward
Transitioning to the "new Leo"
I plan to implement the new design in the following phases:
Phase 1: Implement the "single-owner" restriction for clones.
This can be done in 3.13: no changes to file formats are needed. The atFile.read code will no longer do error "recovery". Experience shows such recovery is useless. This might be delayed until 4.0.
Phase 2: Revise file formats
This will be the basis of 4.0. I plan to do this in September. Derived files will no longer contain child indices. Sentinels will have full gnx's. .leo files will contain xml elements needed to recreate non-essential information such as marks and node order.
Phase 3: Implement @file-flat.
Phase 4: Implement @file x.leo
These last two phases could be done in Phase 2, but we'll probably have our hands full with the transition to 4.0.
Edward
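A minimal Python sketch of what "non-essential" could mean for marks and node order. This is an illustration only, not Leo's code: the function name apply_order_and_marks and the data shapes (a dict from gnx to node, plus plain lists of gnx's) are assumptions made for the example. The point is that stale or missing order information degrades gracefully instead of causing a read error.

```python
def apply_order_and_marks(nodes, order, marks):
    """Return (gnx, node) pairs in the saved order, with marks restored.

    nodes: dict mapping gnx to a dict with at least a "headline" key.
    order: list of gnx's in the saved outline order (may be stale or partial).
    marks: list of gnx's of nodes that were marked.
    All three shapes are hypothetical; the postings do not define them.
    """
    marked = set(marks)
    # Drop duplicates and gnx's that no longer exist: the saved order is
    # only a hint, so bad entries are ignored rather than reported.
    known = [gnx for gnx in dict.fromkeys(order) if gnx in nodes]
    seen = set(known)
    # Nodes missing from the saved order keep their relative order and go last.
    extra = [gnx for gnx in nodes if gnx not in seen]
    result = []
    for gnx in known + extra:
        node = dict(nodes[gnx])  # copy, so the essential data is never mutated
        node["marked"] = gnx in marked
        result.append((gnx, node))
    return result
```

Calling it with an order list that names an unknown gnx, or that omits a node, still yields every node exactly once.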
</t>
<t tx="ekr.20050421192149.1">A new 4.0?
I do appreciate people's efforts to revive 4.0. This shows good "fighting spirit".
It should be possible to get just about everything anyone has ever wanted for 4.0. I'll be writing up my thoughts in separate, shorter postings in this thread. I've noticed that I have extreme difficulty following long postings, and I suspect others have the same problem :-)
Edward
P.S. I am fairly confident that the scheme I am about to discuss will meet with general approval. I put a question mark in the title to indicate that nothing has been firmly decided.
</t>
<t tx="ekr.20050421192149.10">gnx's MUST DIE
[WARNING: the concerns in this posting are real. The conclusions are WRONG! See 2004-02-05]
Standing in the shower this morning I saw again how dangerous gnx's are.
Proof: Suppose I wanted to create a file called LeoAttic.leo containing old project nodes that aren't very useful but that I wanted to keep around "just in case". Creating this file _decouples_ the contents of all the nodes in the attic from the ongoing development. Would I _ever_ want to _recouple_ these nodes? Absolutely not! The "old" nodes are _exceedingly_ dangerous! Bad (old) data are the enemy of all good (up-to-date) data!
As we shall see, the notion of time pervades the entire discussion.
Once I saw how really bad gnx's might be, a whole new train of thought arose immediately:
Gnx's are an attempt to solve a problem at the wrong level. Conflicts are not about nodes, they are about outlines, or even entire projects.
Gnx's are not needed for LeoN. What we want is collaboration at the outline level, not at the node level. The identities of outlines and @file nodes change very slowly if at all.
Why are we so eager to have global node indices? Why aren't we as suspicious of global nodes as we are of global variables? What we are asking for is a completely chaotic situation that _falsely identifies_ nodes that a) were created at the same instant and b) now have arbitrarily different data and structure. In my mind, this is a recipe for disaster. It's a completely stupid idea. My criticism can be so harsh because the idea was mine :-)
It is impossible to resolve conflicts between versions of code that vary greatly _in time_. We know this in our bones! Programs are really complex, and changing anything can have profound consequences throughout the code. Yes, even in Python. The recent fiasco with cut/paste shows this clearly. You want such fiascos to become routine? Then start messing with arbitrary conflicting nodes!
We can only resolve conflicts in code that has _recently_ been changed. Even if merging code separated in time could be done, it would be really foolish to create an environment that makes such a horrid undertaking part of the anticipated work flow. BTW, the algorithms that Rodrigo has been studying are attempts to synchronize development that is happening "concurrently". Even that is complex. Synchronizing present development with work that happened two months ago is futile.
Leo works _because of_, not in spite of, the close relationship between Leo outlines and derived files. Indeed, Leo outlines guarantee that all derived files are related _in time_. In other words, Leo ensures that derived files were all written "at the same time" or were all current at the time the .leo file was written.
Consistency is a property of the entire outline, not of parts of it! In other words, consistency is a global property, not the sum of individual properties of nodes. N.B. This is a far different use of the word "global" than in the so-called "global" indices. You can't recreate global properties by summing the properties of individual nodes !!
We aren't going to cure cvs's problems with a new file format. LeoN isn't based on cvs, and the "Resolve Conflicts" command really needs to know only that there is, in fact, a conflict between files. Something like a gnx might be tempting for the Resolve Conflicts command. I would consider adding some kind of identifying mark to sentinel lines provided that Leo doesn't use such marks to join nodes!
I would be more willing to add _modification_ dates to nodes rather than creation dates. But adding modification dates is going to make cvs's problems worse, not better.
Similarly, @include is also dangerous. There is no way to keep @included info joined _in time_. If we are going to have @include at all, we must _break_ the links between nodes in different files. For sure we must never create links between nodes that are separated in time.
Conclusions
At last the picture is clear. Gnx's are dangerous because they join old data to new.
Unless I hear an _absolutely convincing_ argument to the contrary, I plan to abandon 4.0 immediately. Please note: I will be the sole judge of what "absolutely convincing" means. This is not a matter for experiment or "muddling through". I won't even consider gnx's further unless somebody shows why connecting old data to new data makes any kind of sense. Good luck :-)
No matter how many false starts we have taken with gnx's, this clear result is worthwhile and encouraging. I trust you will agree with me.
The sooner I stop trying to solve the wrong problems the sooner I can get 3.12 out the door and the sooner we can do LeoN and the "Resolve Conflicts" command :-)
Edward
P.S. The solution to the attic problem is either:
a) To _throw away_ stuff that is no longer useful (like we should) or
b) To make _dead_ copies of stuff and put them in the attic.
We _never_ want stuff in the attic to come to life automatically. The consequences of the dead coming to life would be similar to a "Friday the 13th" movie :-)
EKR
</t>
<t tx="ekr.20050421192149.11">In the last day or so I have been reviewing and rewriting all the notes about gti's that have appeared in the Leo Forums. This posting summarizes what I plan to do and why. I plan to begin rewriting leoAtFile.py in the next day or so. This is a good time to make any comments...
1. Global Tnode Indices (gti's) are the defining feature of 4.0. A "full" gti is a string of the form: "userid:location:timestamp:index" where userid is a cvs name, like edream or dthein, location identifies a location, timestamp denotes a time, and index is an integer used to disambiguate gti's that would otherwise be identical. The user will specify userid and location strings in a file, say leoID.txt. This file is private: it will not be part of any distribution nor will it be part of cvs. I'm not sure what Leo should do if it can't find leoID.txt.
Derived files will specify defaults for the userid and location strings. A minimal gti is a string of the form "::timestamp", where the userid, location and index strings are taken to be the defaults.
2. Leo does not need childIndex values in sentinels in order to reconstruct the outline. Leo can deduce the order of nodes introduced in the derived file by the @others directive. The order for nodes introduced in the derived file by section references is inessential. Therefore, we may store order information separately in .leo files. .leo files will have <marks> and <order> elements containing this inessential information. The <marks> and <order> elements will be a list of gti's of nodes.
3. All essential information (structure and content) of an @file tree must be kept together. Therefore, derived files must be fat.
@file trees in outlines can be thin. We no longer need information in the .leo file to recreate clone links.
@file-asis and @file-nosent trees in outlines must be thick. The corresponding derived files contain no sentinels. The _only_ way to create a thin derived file is to use @file-nosent or @file-asis. This way all essential information is in one place, namely in the outline.
4. Whether a node is clone or not is a property of the outline in which the node resides; it is _not_ an intrinsic property of a node. There can be no such thing as a clone index.
Unanswered question: what happens if a node is used in several derived files and has different text in each? This issue may arise more often now that outlines don't mirror structure in derived files. This can't be a show-stopper, and it must be handled somehow, if only to warn people away from certain practices...
5. Gti's should allow Leo to handle included .leo files in an outline. These will probably be represented in an outline as @file x.leo, though perhaps @include x.leo would be more accurate and less likely to cause confusion with the various flavors of @file nodes.
6. There is still the possibility of cvs corrupting the structure of derived files. The format of derived files should be designed so that Leo can recover from corrupted derived files. The plan is for @+node sentinels to contain only the gti, and for the headline to _follow_ the @+node. This scheme may be expanded in order to facilitate recovering from cvs interference.
7. Some way should be found to eliminate extra blank lines in derived files. This is a long-standing request, and it should be a requirement of 4.0 derived files. This may involve defining new sentinels to handle whitespace issues.
The following are the goals of the new format for derived files:
a) The minimum of sentinels needed to properly recreate the outline.
b) A robust way of telling whether newlines belong to sentinels or not.
c) A minimum of intrusion and ugliness.
d) No unnecessary blank lines.
8. The code in leoAtFile.py will follow the present model, except that routines may be dispatched using a dispatching dict as in the syntax colorer. The code in leoFileCommands.py will change slightly (mainly to handle <marks> and <order> elements). It would be possible to use Python's xmllib or similar modules, but this is a fairly low priority and not really connected with any other 4.0 design issues.
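As a rough illustration of the gti format described in point 1, here is a minimal Python sketch. It is not code from leoAtFile.py: the names Gti, parse_gti and format_gti are hypothetical, and it assumes that timestamps contain no colons and that the default index is 0, neither of which is settled above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gti:
    """Hypothetical container for the four fields of a "full" gti."""
    userid: str
    location: str
    timestamp: str
    index: int = 0  # assumed default; the notes do not fix one


def parse_gti(text, default_userid, default_location):
    """Parse a full or minimal gti, filling empty fields from the defaults."""
    parts = text.split(":")
    if len(parts) == 3:
        parts.append("")  # the minimal form "::timestamp" omits the index
    if len(parts) != 4:
        raise ValueError("expected userid:location:timestamp[:index], got %r" % text)
    userid, location, timestamp, index = parts
    if not timestamp:
        raise ValueError("a gti must have a timestamp: %r" % text)
    return Gti(
        userid=userid or default_userid,
        location=location or default_location,
        timestamp=timestamp,
        index=int(index) if index else 0,
    )


def format_gti(gti, default_userid, default_location):
    """Write the shortest form: fields equal to the defaults are left empty."""
    userid = "" if gti.userid == default_userid else gti.userid
    location = "" if gti.location == default_location else gti.location
    text = "%s:%s:%s" % (userid, location, gti.timestamp)
    if gti.index:
        text += ":%d" % gti.index
    return text
```

With defaults of "edream" and "home" (stand-ins for whatever leoID.txt and the derived file would supply), the string "::20030218120000" parses to a full gti and formats back to the same minimal string.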
Edward
P.S. A note to myself: Python dictionaries will simplify both the read and write code.
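The dispatching-dict idea from point 8 might look something like the following Python sketch. Everything here is illustrative: the class SentinelReader, its handler names, and the simplified sentinel form "#@kind arg" are assumptions for the example, not the actual format of derived files or the actual structure of leoAtFile.py.

```python
class SentinelReader:
    """Toy reader that routes each sentinel line to a handler through a dict."""

    def __init__(self):
        self.events = []
        # Hypothetical sentinel kinds mapped to bound handler methods.
        # Adding a new sentinel means adding one entry, not another elif.
        self.dispatch = {
            "+node": self.start_node,
            "-node": self.end_node,
            "+others": self.start_others,
            "-others": self.end_others,
        }

    def start_node(self, arg):
        self.events.append(("start_node", arg))

    def end_node(self, arg):
        self.events.append(("end_node", arg))

    def start_others(self, arg):
        self.events.append(("start_others", arg))

    def end_others(self, arg):
        self.events.append(("end_others", arg))

    def read_line(self, line):
        """Dispatch one line; lines without the "#@" prefix are body text."""
        if not line.startswith("#@"):
            self.events.append(("text", line))
            return
        kind, _, arg = line[2:].partition(" ")
        handler = self.dispatch.get(kind)
        if handler is None:
            raise ValueError("unknown sentinel: %r" % line)
        handler(arg.strip())
```

Feeding the reader a start sentinel, a body line and an end sentinel records three events in order. An unrecognized sentinel raises an error rather than being silently skipped.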
</t>
<t tx="ekr.20050421192149.12">By: edream ( Edward K. Ream )
New (long) design notes
2002-10-21 08:49
One of the great joys of the Leo project is the way it takes, in unexpected ways and at unexpected times, surprising new directions. The last time a major change in Leo happened was a little more than a year ago when I decided that @file trees were feasible. I believe a similar seismic shift is about to happen.
There seems to be a natural rhythm involved: expansion and contraction, invention and consolidation/completion, positive and negative. I believe part of this natural rhythm involves forgetting, especially forgetting why things don’t work. Often a slightly new point of view invalidates formerly real obstacles.
Such changes and rhythms are heralded by largely unconscious thought. Recently there have been a great many requests for user options, as well as other features. I believe this has had the mostly unconscious effect of changing my thinking from “what is the right way?” to “why not do it every way?” Another formerly unconscious impetus for the present avalanche of ideas was the recent question about whether someone had imported the entire Linux kernel into Leo. That brought to my mind several problems with the present way of doing things:
1. Leo files can get very large.
2. It can take a long time to read all derived files.
3. The larger the file, and in particular the more clones, the more time it takes to move cloned nodes.
4. Leo practically hangs when recovering from read errors.
5. Using .leo files with CVS is a real pain. I’ve pushed this to the background by promising the “Resolve CVS conflicts” command, but this command may not be easy to do, or even possible.
These thoughts got me thinking about all parts of Leo’s implementation, especially clones. I recalled the discussion about “global” clone indices. These thoughts have suddenly created a flood of new ideas, in several related directions or themes.
The major problems facing any new implementation strategy involve clones. At present, clone linking happens entirely within .leo files as the result of redundantly saving all information in the derived files as part of the .leo file. The derived files create content, the .leo file creates clone links, marks, etc.
This post is already too long, and it is just the introduction. So I am going to break the rest of this posting into 5 themes, each a "response" on this thread. </t>
<t tx="ekr.20050421192149.13">Big picture: why 4.0 is important
2002-12-03 22:05
There are many reasons why 4.0 is important. Yes, gti's solve many implementation problems. Yes, gti's allow for much smaller .leo files. But these are minor issues in the grand scheme of things.
As I see it, the biggest unfinished project is to make Leo suitable for using _everywhere_. In particular, I'd like to see some or all of the Python project done in Leo. Whether or not that ever happens isn't up to me, but this goal keeps me focused.
As I see it, there are several main drawbacks to doing Python in Leo:
1. Sentinels in derived files. I'm not sure how serious this issue is in general, and recent developments (@file-asis
and @file-nosent) are a pretty complete solution.
2. Spurious CVS diffs. Without gti's _many_ sentinels change whenever a node gets moved. With gti's sentinels
_never_ change. Of course this issue won't arise if, say, Python uses @file-nosent trees, but in general gti's will
make using Leo with CVS much easier.
3. Dreaded read errors. These can and will happen if .leo files aren't downloaded from CVS "in synch" with derived files. Such read errors will disappear completely with gti's.
So 4.0 will make Leo CVS friendly. In the Open Source world I think this is absolutely essential.
</t>
<t tx="ekr.20050421192149.14">Actually, my first analysis was incomplete. Suppose we eliminate vnodes completely? Leo would then redraw the screen directly from the inodes. This wouldn't be so hard: each inode would contain a list of Tk.Text widgets. The drawing code merely has to place an unused widget in the correct place on the screen (the Tk.Canvas).
In some sense, the array of Tk.Text widgets in each inode is like the join list, but only visible nodes need be on this list. Furthermore, this list only needs to be updated when the outline is actually redrawn. It would be easy to insert or delete new Text widgets in this list. There are lots of possibilities, all easy to do in Python.
However, replacing vnodes with inodes is likely to be a very bad idea, for several reasons:
1. As I mentioned earlier, this implementation would require massive changes throughout Leo's code. All "user" code would have to use an iterator to traverse the outline. In particular, the fundamental code to manage the outlines would be changed significantly and would almost certainly become more complex.
2. Inodes complicate Leo from the user's point of view. The present data model is much better because
** The vnode tree corresponds directly to what the user sees on the screen **
Giving up this correspondence seems like a big step backward.
3. As mentioned in an earlier post, marks present a problem without vnodes. Perhaps each inode could contain a list of locations (in the full tree traversal) that should be marked. However, updating this kind of list when the outline changes could be very complex. It wouldn't be horrible to say that all joined nodes must be marked in synchronization, but it wouldn't be a step forward.
4. There are other ways of improving Leo's performance without touching the data model at all. In particular, rewriting the vnode, tnode, atFile and fileCommands modules as C++ code in a Python extension will almost certainly double the speed of key operations. And as mentioned in the first post, there are optimizations that the vnode class can do to avoid deleting dependent trees and then immediately recreating them. And don't forget that the average speed of our computers doubles every 2-3 years or so. So just waiting for a faster machine is a highly effective optimization!
Revised conclusions
Replacing vnodes with inodes is possible, and it might even provide some performance gains for huge outlines containing many clones. However, replacing vnodes with inodes would be an extremely high risk project: very complex, with possible negative consequences.
The present code base is plenty good enough for most outlines, and there are much simpler and better ways to speed up key outline operations. The result of all this noodling is that I have no more interest in inodes and their attendant complexities.
Edward
</t>
<t tx="ekr.20050421192149.15">Design notes: tnodes and vnodes
I have just realized that only tnodes need to be uniquely identified with a gnx (global node index). Vnodes do _not_ need to be so identified. Indeed, vnodes are now, and can always be, "anonymous". This will simplify the format of both .leo files and derived files in 4.0. I shall justify this conclusion in several informal ways:
1. Only tnodes have indices in pre-4.0 .leo files. This causes no problems whatever. Indeed, the vnodes section of .leo files shows the nesting structure of vnodes by the nesting of the v tags. This is all that is required to create vnodes properly.
2. After creating an outline, Leo never at any time needs to refer to a particular vnode "by name." Instead, Leo simply traverses vnodes using the threadNext, back, next, etc. methods.
3. While we speak loosely of cloned vnodes, what we have in fact are _separate_ vnodes that share uniquely identified tnodes. In pre-4.0 files, tnodes are identified in .leo files by tx fields. These indices are generated as needed when Leo writes the .leo file.
4. Conceptually, vnodes are simply locations on the screen (or equivalently, positions in an outline) attached to tnodes that hold body text. The implementation is different from the concept: headlines are held in vnodes. This was done for historical reasons: in the Borland version of C it was natural to put headlines in vnodes.
However, it would be more natural to place both headlines and body text in tnodes because cloned nodes must all have the same headline. In fact, you could say that the present code acts "as if" headlines were really stored in tnodes. It may be desirable later on (after 3.11.x becomes truly stable) to move headlines into tnodes. Whatever representation is finally chosen, however, Leo (in particular the read code and the event handlers) will ensure that all vnodes sharing the same tnode will in fact have the same headline.
5. The identity of tnodes (represented by gnx's in 4.0) is sufficient to do everything that Leo needs to do. A formal proof would be difficult. An informal proof is easy: Leo's 4.0 code is very similar to the pre-4.0 code in all respects, and the pre-4.0 code did not use the identity of vnodes in any real way.
For example, when reading a 4.0 derived file, Leo can create (anonymous) vnodes in the outline without any kind of identity whatever. All that is needed is the nesting structure of vnodes indicated by the sentinel lines in the derived file.
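The relationship between anonymous vnodes and uniquely identified tnodes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Leo's actual classes: the names TNode, VNode, new_gnx and clone are hypothetical, the gnx format is invented, headlines are stored in the tnode (the alternative discussed in point 4, not the present implementation), and clone makes a shallow copy that does not recreate descendants.

```python
import itertools

_counter = itertools.count(1)

def new_gnx(user="ekr"):
    # Hypothetical gnx format: a user id plus a running counter.
    return "%s.%d" % (user, next(_counter))

class TNode:
    """Shared data, uniquely identified by a gnx."""
    def __init__(self, headline, body=""):
        self.gnx = new_gnx()
        self.headline = headline
        self.body = body
        self.vnodes = []  # every vnode sharing this tnode (the join list)

class VNode:
    """An anonymous position in the outline: no index, no name."""
    def __init__(self, tnode, parent=None):
        self.t = tnode
        self.parent = parent
        self.children = []
        tnode.vnodes.append(self)
        if parent is not None:
            parent.children.append(self)

    def headline(self):
        return self.t.headline

    def is_cloned(self):
        # A node is "cloned" when several vnodes share its tnode.
        return len(self.t.vnodes) != 1

def clone(vnode, new_parent):
    """Create a separate vnode that shares the original's tnode."""
    return VNode(vnode.t, new_parent)

root = VNode(TNode("root"))
a = VNode(TNode("a", "body of a"), root)
b = VNode(TNode("b"), root)
a2 = clone(a, b)

# Changing the shared tnode changes what both vnodes show.
a2.t.headline = "renamed"
print(a.headline())
```

Nothing in the sketch ever looks a vnode up by identity. The only identifier is the gnx on the tnode, and the two "clones" are simply distinct vnodes that point at it.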
Summary & Conclusion
However one looks at it, it seems clear that the identity of vnodes is never needed. Leo's present code (including the 4.0 code) never uses the identity of vnodes, and it is not at all clear how Leo could use the identity of vnodes even if Leo had them.
The only thing that is important is the identity of tnodes. Leo uses the fact that vnodes share tnodes to create join links and in turn to ensure that all vnodes have the same headline. I believe the 4.0 code should do essentially the same.
The present 4.0 code needlessly writes gnx fields for vnodes in .leo files and derived files. I shall remove these fields very soon. This will remove quite a bit of clutter, which is especially important in derived files. Perhaps more importantly, this discussion shows that the present 4.0 code is already essentially complete.
Edward
</t>
<t tx="ekr.20050421192149.16">Serious objections to link-target nodes?
I am seriously considering adding link-target nodes to Leo 4.x. As stated in the 10 breakthroughs post, there are a number of advantages to doing so, chiefly the following:
1. A foundation for making distinctions about kinds of clones. This may be very important in LeoN.
2. A way to greatly speed up fundamental outline operations.
3. A way to unify vnodes and tnodes internally, and a way to simplify the format of .leo files.
The drawbacks:
1. Slightly different user interface. Target nodes will have a bulls-eye.
N.B. The key features of clones will be retained. Link nodes will appear to have descendents just like today's clones, and you will be able to edit those virtual descendents just as today.
2. There may possibly be restrictions on using link nodes in LeoN. No such restrictions will exist in single-user Leo.
3. This change implies some differences in Leo's data model. Scripts may be affected slightly.
4. This change implies some differences in how Leo traverses trees. By default (and maybe always?) Leo will simply skip virtual descendents during tree traversals. I'll have a bit more to say about this in another post, coming today.
If there are any serious objections to this plan I'd like to hear them immediately.
Edward
P.S. Users of single-user Leo will probably always have the option of using old-style clones. Certainly this option will be available for the foreseeable future.
EKR
</t>
<t tx="ekr.20050421192149.17">Many, many thanks for this posting. I was beginning to think people didn't care about the end of 4.0 :-)
Let me assure you, LeoN is important to me, and I shall be glad to support it in any way I can, _including_ adding "names" that are exactly like the "old" gnx's. The point of the original post was not that any particular tool or technique was wrong. Such an idea would be brain-dead. Rather, the main idea was this: it would be _fatal_ to Leo to confuse old data with new data. We simply must not allow this.
> Still, I strongly dislike this idea that the whole of the .LEO is a monolithic, binary block which changes drastically with each little change in the LEO source tree.
My original mistake was thinking that gnx's would allow Leo to link cloned nodes reliably across different files. That idea was COMPLETELY WRONG. Old nodes are poison: we must never link to them. So it is not the gnx's themselves that are dangerous, it is using gnx's to link nodes that is wrong, wrong, wrong.
This is basically a database discussion. Whatever the means, we must maintain a single, unified and consistent view of all the data that we presently keep in a .leo file. Failing to do this kills the Leo project. I can see no way to maintain consistency of clone links when these links somehow reside in distinct files that may be changed arbitrarily at different times. No distributed database could possibly exist in this kind of chaotic environment.
In short, CONSISTENCY OF DATA is driving everything. The discussion about "worse is better" is irrelevant here. We are not talking about marketing or hype. We are talking about the engineering foundations of Leo. Also, the question of "heroic" solutions does not apply here. I didn't kill 4.0 because I was timid, I killed 4.0 to protect the consistency of Leo's data.
Please note: cvs also creates a "unified" view of data. Yes, cvs may manage many data files, but for each file cvs presents a consistent picture. Moreover, cvs forces users to update before committing, so that changes always happen "at the same time".
> For me, LEO nodes can serve as a "grand unifying concept"
No. You can't unify data using "atoms". Consistency of data is a global property, especially where clones are concerned. And it's not just the "identity" of data that is important. The data must be "up-to-date" as well. You CAN NOT create a distributed database when that database does not control each of its parts!
> the current implementation of LEO is totally unsuited for representing large source trees, and even more so for multiple people working on the same tree.
As you know, I've spent quite a bit of time thinking about alternative representations of data in Leo. However, I am inclined to disagree with you. Sure, Leo would bog down dealing with huge outlines, but so what? Large enough data files are going to break any implementation. And I fail to see how the present implementation would discourage the LeoN project. Leo's vnodes and tnodes have proven themselves to be robust views of Leo's data.
> To summarize this post:
1. 4.x is a step to make LEO more modular. A monolithic LEO file is unsuitable for collaboration without massive additional tool support.
2. Factoring out suboutlines as self-contained LEO files, which seamlessly integrate with the super-outline, is a relatively simple way to both get "low-tech" collaboration support and a dramatic increase in scalability.
Alas, both of these points ignore the fundamental problem of ensuring the consistency of Leo's data.
Edward
P.S. I believe that the "Resolve Conflicts" command should be relatively straightforward. The key idea is that this command should _not_ try to guess the intentions of programmers. Rather, this command will display differences between outlines in some simple way and leave it to programmers to resolve those differences.
EKR
</t>
<t tx="ekr.20050421192149.18">This really is a continuation of the thread Big picture: why 4.0 is important.
Yesterday, after writing that gti's don't solve all CVS problems, I had another Aha regarding 4.0 derived files. This is still not a complete solution, but it may be a big step in that direction. Recall that even with gti's two problems with CVS remain:
Problem 1: moving nodes can cause @node sentinels to change.
Problem 2: CVS "helpfully" inserts lines into a file when it detects a conflict.
These two problems are related in a nasty way: CVS can corrupt the structure of changed sentinels!
I was lightly dozing in the middle of the day, thinking quite vaguely about these problems, when I suddenly realized both problems can be made to go away! With gti's there is really no need to specify outline structure in derived files at all! We could adopt the following rule:
Structure rule: derived files specify content; the outline specifies structure.
The structure rule can be made to solve both Problem 1 and Problem 2 as follows:
1. Leo 4.0 could generate just a single @gti sentinel for each section reference (or for body text in @file-noref trees). This sentinel will contain only the gti of the defining vnode. This gti doesn't change no matter how the outline is reorganized, so CVS will never alter it!
2. For documentation purposes, Leo should generate the headline text of the vnode as a comment following the @gti sentinel. Headlines _can_ change, so CVS might alter such comments when CVS detects a conflict, but changing this comment will _not_ corrupt the @gti lines! Robust recovery from CVS meddling may be possible.
3. Similarly, @ref sentinels could represent the actual reference in the body text. Again, the @ref sentinel would be followed by the actual text of the reference, so the actual @ref sentinel will never be altered by CVS.
The result is a radical simplification of derived files. At present (without gti's) Leo is forced to represent outline structure using nested @nodes sentinels and @body sentinels. When outline structure changes, arbitrarily many of these sentinels can change. In the new scheme, all these sentinels can be replaced by a single @gti sentinel that can never change.
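As a rough illustration of the structure rule, the following Python sketch writes one sentinel per node followed by a headline comment. Everything concrete here is an assumption made for the example: the sentinel spelling "#@gti", the gti values, and the helper names write_node, write_derived and sentinel_lines are hypothetical and are not Leo's actual format or code.

```python
def write_node(gti, headline, body, comment="#"):
    """Return the derived-file lines for one node.

    The sentinel spelling ("#@gti ...") is invented for illustration.
    """
    lines = ["%s@gti %s" % (comment, gti)]
    # The headline is documentation only; a reader would ignore it.
    lines.append("%s %s" % (comment, headline))
    lines.extend(body.splitlines())
    return lines

def write_derived(nodes):
    """nodes is a list of (gti, headline, body) tuples in outline order."""
    out = []
    for gti, headline, body in nodes:
        out.extend(write_node(gti, headline, body))
    return "\n".join(out) + "\n"

def sentinel_lines(text):
    """Return only the sentinel lines of a derived file."""
    return [s for s in text.splitlines() if s.startswith("#@gti ")]

nodes = [
    ("ekr.1", "imports", "import os"),
    ("ekr.2", "main", "def main():\n    pass"),
]
text = write_derived(nodes)
# Reorganizing the outline reorders the file, but no sentinel line changes.
moved = write_derived(list(reversed(nodes)))
print(sorted(sentinel_lines(text)) == sorted(sentinel_lines(moved)))
```

Each sentinel carries only an identifier, so moving a node changes where its sentinel line appears but never the text of that line.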
Earlier I said that the Aha (i.e., the structure rule) does not solve all problems:
1. CVS could still corrupt @gti sentinels if it decides that a range of lines including @gti sentinels have changed. I'm not sure exactly what to do about this, but clearly the structure rule will reduce the number of times this will happen. Indeed, rather than posing problems for CVS's diff (as changed @node sentinels presently do), @gti lines will provide "islands of stability" for CVS.
2. As always, there is the problem of keeping .leo files and derived files in synch. Clearly, this problem can never go away completely. What Leo must do is to provide ways of a) discovering out-of-synch conditions and b) recovering in a straightforward way. With the new scheme, we know we are out-of-synch if an @file or @file-noref node refers to a gti not found in the derived file, or conversely, if the derived file contains an @gti sentinel with no corresponding tnode in the outline. The outline and derived files could be out-of-synch in other ways that would be undetectable. For example, suppose the only change to an outline is that a node was moved. This will cause no changes to gti's. On the other hand, this kind of out-of-synch condition might safely be ignored. I'm not sure about this though...
We might try to associate a global time stamp (using the same techniques used to create gti's) with @file nodes and @file-noref nodes. Leo's atFile read logic can then determine whether the derived file was created by the @file node. However, this is a dubious idea: if this timestamp is represented in the derived file then CVS will complain and interfere when it changes...
3. If we adopt the structure rule then outlines must specify the structure of all @file trees, just as it does now. In particular, @file and @file-noref nodes can _not_ be placeholders. But we have the option of not saving body text that is used _only_ in @file and @file-noref trees. This has the potential to radically reduce the size of .leo files.
However, removing information from .leo files makes it more difficult to recover from errors. I think the solution to this kind of dilemma is an option, either in leoConfig.txt or the .leo file itself, specifying whether the .leo file will contain all information (that is, whether some body text may be deleted). I'm not sure about what this option should be when using CVS.
4. With or without the structure rule CVS can still interfere with .leo files in unpleasant ways. However, the structure rule makes it impossible to use placeholders for @file and @file-noref nodes, so the structure rule ensures that LeoPy.leo must be part of Leo's cvs tree. This might be considered a step backwards.
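The out-of-synch check described in item 2 above amounts to comparing two sets of identifiers. Here is a minimal sketch, assuming a hypothetical sentinel spelling "#@gti" and a hypothetical function name find_out_of_synch; neither comes from Leo's code.

```python
def find_out_of_synch(outline_gtis, derived_text, marker="#@gti "):
    """Compare the gti's known to the outline with those in a derived file.

    Returns (missing_in_file, missing_in_outline) as two sets.
    The marker spelling is an assumption made for this sketch.
    """
    derived_gtis = set()
    for line in derived_text.splitlines():
        if line.startswith(marker):
            derived_gtis.add(line[len(marker):].strip())
    outline_gtis = set(outline_gtis)
    missing_in_file = outline_gtis - derived_gtis
    missing_in_outline = derived_gtis - outline_gtis
    return missing_in_file, missing_in_outline

derived = "#@gti ekr.1\n# imports\nimport os\n#@gti ekr.3\n# extra\nx = 1\n"
print(find_out_of_synch(["ekr.1", "ekr.2"], derived))
```

Both result sets are empty when every gti matches. That is also the case when a node has merely been moved, which is exactly the undetectable condition mentioned in item 2.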
To summarize:
1. The structure rule completely solves Problem 1, and goes a long way towards solving Problem 2.
2. The structure rule greatly simplifies derived files.
3. The structure rule allows smaller .leo files, at the cost of making error recovery more difficult.
4. The structure rule would require that LeoPy.leo be part of Leo's CVS tree.
An important way to evaluate designs is whether the design is "headed in the right direction", that is, whether the design is tending to become more complex or less complex. Heading in the right direction is always important because simplifications tend to suggest further simplifications. I have some hope that further improvements may be possible...
Clearly, the structure rule greatly simplifies derived files. As a result, both the atFile.read and the atFile.write code will become simpler. The structure rule simplifies error recovery by essentially making it impossible! Either body text for a gti appears somewhere (in the .leo file or the derived file) or it doesn't. If it doesn't, then we have an out-of-synch condition for which no error recovery is possible. We can make sure that .leo files contain all the text for all their gti's by writing all body text to .leo files, just as is done now. Even so, out-of-synch conditions could still happen if the derived file defines a gti that does not appear in the outline. In that case the outline is missing a node used to create the derived file and no recovery is possible. But similar situations exist today, so the new way is no worse than the old.
To summarize further: the structure rule appears to be headed in the right direction, and only experience will tell whether error recovery considerations will allow us to use "small" .leo files. The structure rule appears to require that LeoPy.leo remain a part of Leo's CVS tree, which might be considered a step backward. I doubt that CVS will ever handle XML files well. On balance, I think the structure rule is worth implementing to see what happens.
Now is a good time for your comments.
Edward
</t>
<t tx="ekr.20050421192149.19">Many thanks, Paul, for these thoughtful comments.
> Speaking off the top of my head, I understood than Gnx's mainly attempted to solve the problem of false conflicts in CVS raised when the local structure around a node changed but the text did not. eg in a derived file.
I had _many_ fond hopes for gnx's. This was one of them.
Let me be clear: there is still plenty of room for invention concerning Leo file formats, data structures, whatever. Today's "aha" was simply a realization that it is stupid to rely _too much_ on a particular kind of data. We could, in fact, put something that looks like a gnx into derived files _provided_ that we don't use the "new gnx" stupidly.
We must not use the new gnx's to create clone links, but I see no harm in using them as aids to the present "mirroring" scheme. I trust you see the importance of the distinction. The mirroring scheme relies on data _in the .leo file_ to create links, especially clone links. The forbidden gnx scheme threw all that data away pursuing a fool's errand. The fundamental reason the mirroring scheme works is that all the data that must be consistent are in a single outline.
> The situation in a .leo file is worse because the resulting CVS conflict marker would break the XML structure.
My present opinion is that trying to recover from the damage cvs may do to _any_ file is futile. Better to take the much simpler approach of using cvs conflicts merely to signal that we must do conflict resolution. Conflict resolution should be done on two successive, undamaged versions of a file. Both adjectives are important. "Successive:" We don't want to do anything but the simplest merges. Note that cvs enforces this constraint by requiring updates before commits. "Undamaged:" It's so much easier just to ignore cvs's notion of what is going on. What is _really_ going on is that the user has changed, inserted, deleted _outlines_, not just blocks of text that happen to have these confusing sentinels in them ;-)
> There are two problems here,
1. Structural changes cause node ID's to change
2. CVS conflict markers break XML structure.
As I indicated in my reply to Rich, there is a good chance that we can make problem 1 go away, and in so doing clean up the appearance of derived files. I'd like to sidestep problem 2 completely by dealing with only undamaged files. I think this is entirely reasonable.
Furthermore, the old gnx's would not really have made the Resolve Conflicts command easier to do. In fact, they may have misled us into considering algorithms that were fundamentally flawed. Make no mistake, the Resolve Conflicts command is non-trivial. I am only suggesting that the real problems with the old gnx's would likely have made gnx's useless (or even worse than useless) for resolving conflicts.
My vision is this: The Resolve Conflicts command _must_ rely on the user to make sense out of the differences between two conflicting files. All we can expect is that the Resolve Conflicts command can display the _approximate_ differences between two outlines. The user is going to have to sort out what is important and accurate. At worst, two users are going to have to email back and forth to discuss what actually was intended. N.B. This is _exactly_ the worst case in how people presently use cvs.
In other words, diffs are _always_ and always _inherently_ approximate, so the "pseudo precision" of the old gnx's was always going to be nothing but a red herring.
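One simple way to display approximate differences between two undamaged outlines is to flatten each outline to indented text and run an ordinary text diff. The sketch below uses Python's standard difflib module. The outline representation (nested (headline, body, children) tuples) and the names flatten and outline_diff are assumptions made for the example, not a proposed design for the command.

```python
import difflib

def flatten(node, level=0):
    """Turn a (headline, body, children) tuple into indented text lines."""
    headline, body, children = node
    indent = "  " * level
    lines = ["%s+ %s" % (indent, headline)]
    for line in body.splitlines():
        lines.append("%s  %s" % (indent, line))
    for child in children:
        lines.extend(flatten(child, level + 1))
    return lines

def outline_diff(old, new):
    """Return unified-diff lines describing how two outlines differ."""
    return list(difflib.unified_diff(
        flatten(old), flatten(new),
        fromfile="old outline", tofile="new outline", lineterm=""))

old = ("root", "", [("a", "x = 1", [])])
new = ("root", "", [("a", "x = 2", [])])
for line in outline_diff(old, new):
    print(line)
```

The output only shows what changed. Deciding what the changes mean, and which version to keep, is left to the user.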
> What if we had a two component GNX?
My first, and probably last, reaction is that this kind of scheme is way too heroic. I don't see how it could help, and even if it could help we don't want to put ugnx's in derived files. Also, I've considered various schemes to "put all the pollution in the derived file in one place". I don't like the whole notion, though I play with it from time to time.
No. The way to solve problems is not with heroism but with fundamental simplicity. If we accept that the "Resolve Conflicts" command will work only on undamaged files we get the following benefits:
1. We can play with schemes to simplify derived files still further. This won't really change anything, and it would placate those who dislike (oh horrors!) the data that Leo puts into derived files. And it would make cvs a bit happier, though not enough happier to make a real difference ;-)
2. These schemes can add new data to .leo files. We can "think these thoughts" because the unrealistic expectations for gnx's have died.
Summary
The "Resolve Conflicts" command should work only on undamaged files. I am willing to consider adding data either to .leo files or derived files to help the Resolve Conflicts command, provided we don't have unrealistic expectations about what that data can do.
Paul, was it you who pointed out fundamental problems with resolving conflicts, even using "plain" cvs? I think we should remember those fundamental limitations and design a Resolve Conflicts command that relies on the user to do what people do best, namely understand the _meaning_ and _intention_ of changes to code.
Edward
</t>
<t tx="ekr.20050421192149.2">4.0 is dead! Long live Leo!
[WARNING: the concerns in this posting are real. The conclusions are WRONG! See 2004-02-05]
I have been growing increasingly uneasy about 4.0. The thoughts weren't fully formed, and I had vague worries about gnx's creating problems similar to the ill-fated backup .leo files.
Last night, the picture suddenly became clear. Gnx's have no chance of working. Don't panic; good things will come of this.
Conflicts: the fatal flaw
Lying in bed last night I saw this simple picture: two versions of the _same_ node (two nodes with the same gnx), each having different subtrees!
Yes, this is possible. There is nothing to prevent two people from editing separate copies of a .leo file so as to create different children for any node. People can rearrange .leo files in endless ways.
Unless I am greatly mistaken, there is no way of resolving all the messes that could result. This is the kind of problem that invalidates an entire design. Gnx's are history.
In retrospect, the situation seems clear: it is not nearly enough to identify nodes uniquely. Entire trees must match in outlines and derived files. Identifying _individual_ nodes as "the same" in no way guarantees that the trees of which they are a part have similar shape or contents. End of story. End of design.
Long live Leo
Last night I was more relieved than upset. Maybe I was confident that something good would appear. Maybe I was just grateful to see the true situation clearly. More likely, my intuition has known for quite awhile that gnx's wouldn't work. Anyway, when I awoke this morning a completely new train of thought appeared. It went something like this:
1. Gnx's won't work, so the problem they solved must be solved anew. The fundamental problem that gnx's actually _would_ have solved is creating sturdy links between nodes in derived files and nodes in outlines.
2. Alas, we can easily imagine those "sturdy links" creating conflicts when reading derived files: the structure of cloned nodes in a tree won't match the structure of the cloned nodes in the derived files. The read code is toast: it has no way of recovering. This is the "dreaded read error" with a vengeance!
3. Therefore, we are stuck with the "mirroring" scheme used in all recent versions of Leo: clone links must be contained in the .leo file, not in derived files. What if we make a virtue out of necessity? That is, what if .leo files once again become the primary source files?
We can remove _almost all_ sentinel lines from derived files!
Could the read code detect changes made to the derived file and incorporate those changes into the outline? Yes, it could! Leo would only need #@ lines in derived files that denote the start of a node. These lines would contain no other information: just raw marker lines. We could follow those lines with #@<<name>> lines to denote the section names, but such lines would be strictly optional.
Leo's atFile.read code would first ensure that the number of nodes in the derived file matched the number of nodes in the outline. If not, no simple "untangling" is possible, and a warning message would be sent to the log pane. Otherwise, Leo would compare the text section by section, and replace the outline with new text in the derived file as needed. This would be a major simplification of the read code!
4. What if we do the unthinkable and remove _all_ sentinel lines from the derived files?
There are several consequences:
A: @file nodes would become @file-nosentinel nodes by default. Don't worry! For the foreseeable future Leo will allow you to write @file-sentinel nodes by default if you want.
B: Derived files become "clean"; Leo adds nothing to them. This is _crucial_ for the wider adoption of Leo. Aside: I do like the context provided by #@<< name >> comments. There could be an option to write such lines in @file-nosentinel files. Such extra lines don't matter at all if Leo never reads those derived files.
C: Leo will load .leo files much more quickly. The pass that loads @file nodes will do nothing.
D: We can't update outlines from changes made to derived files; no more automatic untangling.
This last consequence seems severe. However, recent developments make it seem bearable, for the following reasons:
A: Leo's Open With command provides an easy way of updating derived files outside of Leo and then integrating the changes back into the Leo outline.
B: The new workflow has almost entirely eliminated the need for me to make changes to derived files outside of Leo. I run tests in a separate copy of Leo, and if that copy gets corrupted I make changes in an earlier version of Leo that still works.
Yes, I must remember not to close Leo. (Maybe a new Inhibit Close command would prevent me from doing something that I don't want to do.) Anyway, even if I forget and do close Leo with a corrupted copy of leo.py, all I would need to do is open LeoPy.leo from a non-corrupted copy of leo.py.
So there is less need for reading derived files when opening a .leo file. We might rely on an explicit Untangle command to update the outline as needed, provided of course, that at least minimal sentinels were written to the derived files.
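The minimal-sentinel update described in point 3 above (count the nodes, then compare text section by section) might look roughly like the following. This is a sketch under stated assumptions: the outline is reduced to a flat list of body strings, only bare "#@" marker lines are recognized (the optional section-name lines are not handled), and the names split_sections and update_outline are hypothetical.

```python
MARKER = "#@"

def split_sections(derived_text):
    """Split a derived file into body sections, one per bare marker line."""
    sections = []
    for line in derived_text.splitlines():
        if line.strip() == MARKER:
            sections.append([])
        elif sections:
            sections[-1].append(line)
        # Lines before the first marker are ignored in this sketch.
    return ["\n".join(s) for s in sections]

def update_outline(bodies, derived_text, log=print):
    """Return (new_bodies, changed_indices).

    If the node counts differ, no simple update is possible: a warning is
    logged and the original bodies are returned unchanged.
    """
    sections = split_sections(derived_text)
    if len(sections) != len(bodies):
        log("warning: node count mismatch; cannot update outline")
        return list(bodies), []
    changed = [i for i, (old, new) in enumerate(zip(bodies, sections))
               if old != new]
    return sections, changed

bodies = ["import os", "def main():\n    pass"]
derived = "#@\nimport os\n#@\ndef main():\n    return 0\n"
new_bodies, changed = update_outline(bodies, derived)
print(changed)
```

Because the markers carry no identity, the sketch can only pair sections with nodes by position. That is why it gives up as soon as the counts disagree.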
Leo and cvs
I am considering making @file-nosentinel files the standard way of interacting with cvs. Let's look at the consequences:
1. Cvs conflicts will have less direct effect on derived files. Cvs conflicts alter derived files, and must be dealt with, but the conflicts will not affect how Leo reads the corresponding .leo file.
2. We should be able to design a Resolve CVS Conflicts command to deal with such conflicts in a semi-automatic way. This command might just set up a script-oriented Find/Change panel. The script would walk through the derived file and present changes in the outline, using the code for the Go To Line command.
Leo should be able to open plain text files in a plain text window. This should have been done long ago. We may want to open the conflicting derived file in such a text window while executing the Resolve CVS Conflicts command.
3. Cvs is sufficient to manage development of derived files! It no longer has to worry about tracking clones across files. This is a _major_ step forward.
4. As before, cvs is completely incompetent to handle changes or conflicts in .leo files. In some ways, the situation is just the same as it has always been. In some ways, the situation is actually better, because cvs can handle derived files properly.
A manual solution is viable in the short term: developers would see the conflicts in derived files, agree which conflicting .leo file to use as the "base" .leo file, and put the resolved code into the base .leo file by hand.
Longer term, we clearly need to implement the xmldiff approach to resolving conflicting .leo files. It should be possible to run xmldiff from within a reference copy of LeoPy.leo, i.e., a copy not affected by the cvs conflicts. N.B.: the xmldiff command or script can be self contained and will affect no other part of Leo's design.
Questions to consider
1. How important is it to have the option of automatic untangling when reading .leo files? Are the advantages gained in dealing with cvs enough to overcome the inconvenience of not having automatic untangling?
2. Supposing that Leo does write sentinels, what format should be used? The choices:
A: Use the old way. This is fairly appealing visually and it robustly identifies structure.
B: Use the minimalist way using only #@ markers, possibly followed by #@<<name>> lines. This way probably isn't robust enough for automatic untangling, but it could be used to support an explicit Untangle @file Node command.
C: Use the way developed for 4.0, modified so that it doesn't use gnx's. I am not comfortable with this approach. It seems to be neither as visually clear as option A nor as insensitive to cvs conflicts as option B. My vague unhappiness with this format has been growing for quite some time.
Summary and conclusions
Gnx's create problems that cannot be resolved. My intuition is relieved to be rid of them and I am not looking for ways to bring them back. Yes, I am open to ways of resuscitating 4.0, but I don't expect that this can be done.
The section called "Long Live Leo" proposes new ways of using Leo. These changes are a major new way of understanding how Leo fits into the world. These changes imply no major changes to Leo's code base.
These changes will create many good things in the long run, regardless of how unsettling they may be in the short run. My intuition tells me that the "new old Leo" is a big step in the right direction.
Using clean derived files, devoid of all sentinels, may be best when using cvs. Clean derived files promise to simplify how people collaborate using Leo. Resolving cvs conflicts _inside Leo_ becomes feasible using relatively simple scripts.
Your comments and suggestions are very important now. I shall make no major changes in Leo's code until your comments have sunk in thoroughly.
Edward
P.S. I envisage continuing the 3.x version numbering for the foreseeable future. It may well be that 3.12 can come out soon. I'd like that very much.
EKR
</t>
<t tx="ekr.20050421192149.20">Using gnx's safely
[Warning: these are obsolete ideas:
- thin derived files carry all essential info, including structure and content.
- The root @thin node in an outline carries only marks, expansion state and uA's]
My initial intuition wasn't completely faulty, I think: using gnx's naively has the potential to create almost unlimited chaos. The problem is this: if "flat" (derived) files contain embedded structure, the structure implied by a derived file may not match the structure of the corresponding (cloned) node in a .leo file. As I have said before, the situation may result in an extreme form of "dreaded read errors". Note that such mismatches of structure may occur even if @file-thin is in effect. When many people are editing flat files simultaneously the potential for conflicts seems almost unlimited.
Maybe we should revisit an old idea, namely that outlines should carry structure and that flat files should carry content. The implications:
1. There will be no @file-thin or @file-thick options. Outlines, not flat files, will be the _only_ determinant of outline structure.
2. Flat files will use gnx's _only_ to delimit text, not to show structure. This will greatly simplify the format of flat files. Indeed, about the only sentinels will be:
#@+ <tnx>
#@- <tnx>
These mark the start and end of body text for the tnode with the given tnx. There will be a few other sentinels, roughly the same as presently, for marking section references, verbatim escapes, etc. What will _not_ be part of flat files are the sentinels that describe nested vnodes. Note: we still have to handle the _effects_ of nested vnodes, which is why we need #@- <tnx> sentinels and @ref sentinels.
3. Reading a flat file might be considerably simpler than the present atFile.read code. The code need not recreate structure; it need only update the contents of tnodes that exist in the outline. It might be that the new simpler code could handle cvs conflict markers in a separate pre-scan.
4. This scheme works well with the Resolve CVS Conflicts command that I just described on a recent thread.
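To make items 2 and 3 concrete, here is a minimal sketch of a reader that uses only the two sentinels shown above. It is an illustration under stated assumptions, not Leo's atFile.read code: the function name read_flat_file is hypothetical, the tnx is taken to be the rest of the sentinel line, the outline is modeled as a plain dict from tnx to body text, and section references, @ref sentinels and verbatim escapes are ignored.

```python
def read_flat_file(lines, bodies):
    """Update bodies (a dict mapping tnx to body text) from flat-file lines.

    Only '#@+ tnx' and '#@- tnx' sentinels are recognized. Structure is
    never created: a tnx missing from bodies is reported, not added.
    """
    stack = []      # tnx's of the regions that are currently open
    new_text = {}   # tnx -> list of body lines seen so far
    unknown = []    # tnx's in the file that the outline does not have
    for line in lines:
        s = line.strip()
        if s.startswith('#@+ '):
            tnx = s[4:]
            stack.append(tnx)
            new_text.setdefault(tnx, [])
        elif s.startswith('#@- '):
            tnx = s[4:]
            if not stack or stack[-1] != tnx:
                raise ValueError('mismatched sentinel: ' + s)
            stack.pop()
        elif stack:
            # Ordinary text belongs to the innermost open region.
            new_text[stack[-1]].append(line)
    if stack:
        raise ValueError('unterminated region: ' + stack[-1])
    for tnx, body_lines in new_text.items():
        if tnx in bodies:
            bodies[tnx] = ''.join(body_lines)
        else:
            unknown.append(tnx)
    return unknown
```

The stack is what handles the effects of nested vnodes: text after a child's closing sentinel goes back to the parent's body. A separate pre-scan for cvs conflict markers could run before this function sees the lines.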
We might not want this simplification if Gil can convince me that Leo can easily represent conflicts in tree structure.
Edward
</t>
<t tx="ekr.20050421192149.3">Links, clones & mode bits
Today I wrote a script that scans a .leo file looking to see how clones are used. Something like this script could be used to convert clones to links.
The results were most interesting:
targetsInDerivedFiles: 418
[snip] Clones for which exactly one item on the join list is in an @file tree.
clonesInNoDerivedFiles: 20
[snip] Clones for which no item on the join list is in an @file tree.
clonedAtFileNodes: 9
[snip] Cloned @file trees themselves.
multipleTargetsInDerivedFiles: 8
<< Append any unused text to the parent's body text >>
<< Check both parts for @ comment conventions >>
<< Compare single characters >>
<< Set the default directory >>
class nodeIndices
frame.OpenWithFileName
recentButtonCallback
replacePatterns
The sections in multipleTargetsInDerivedFiles show a bug in how I have been using Leo (!) For example, OpenWithFileName is indeed defined twice in LeoPy.leo (!!) Can you see how this happened?
This script shows that the vast majority of clones could be converted to links automatically, simply by picking as the target the unique node on the join list that appears in some @file tree, including cloned @file nodes themselves. multipleTargetsInDerivedFiles indicate problems with the present version of LeoPy.leo. clonesInNoDerivedFiles are clones that are "floating" without any target in any derived file. The clones2links script could pick one at random to be the target without great harm being done. This might be subject to the constraint that no link can point to an ancestor, something for which the present test script did _not_ check.
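The classification that the script performs can be sketched as follows. This is not the original script: classify_clones is a hypothetical name, the input is a simplified stand-in for Leo's join lists (one boolean per joined node, true when the node lies inside some @file tree), and the clonedAtFileNodes category is left out because it needs information about the @file nodes themselves.

```python
from collections import defaultdict

def classify_clones(join_lists):
    """Classify clones by how many joined nodes lie inside an @file tree.

    join_lists maps a clone's headline to one boolean per joined node:
    True if that node is inside some @file tree.
    """
    result = defaultdict(list)
    for headline, in_file_flags in join_lists.items():
        count = sum(1 for flag in in_file_flags if flag)
        if count == 0:
            # "Floating" clone: any joined node could serve as the target.
            result['clonesInNoDerivedFiles'].append(headline)
        elif count == 1:
            # The unique node in an @file tree is the obvious link target.
            result['targetsInDerivedFiles'].append(headline)
        else:
            # The same code would be written to derived files more than once.
            result['multipleTargetsInDerivedFiles'].append(headline)
    return dict(result)
```

A clones2links conversion would still have to check that no link points to an ancestor, which this sketch, like the test script, does not do.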
So I am beginning to think that links might actually be better, in some sense, than clones. Leo would refuse to allow targets that appear twice in derived files, whether twice in the same derived file or in two separate derived files.
N.B. This is special pleading!! This argument is _not_ disinterested. I am considering reneging on my statement that clones will remain as they are in the single-user version of Leo. The problem is that this is going to lead to all sorts of coding problems. In effect Leo would need one or more "mode bits" that indicate which flavor of code to use:
- use clones or use links.
- use gnx's or use file indices.
The present code uses the app().use_gnx switch to determine whether to write gnx's or not. This was always intended as a _strictly temporary_ expedient. Enshrining such mode bits as part of Leo would be very bad practice. For example, the recent train wreck involving cut & paste of nodes was probably related to app().use_gnx: the old code works only if the tnodesDict is cleared, but that wasn't done when use_gnx was false.
In any event, there is no way I am going to allow even 1 mode bit in Leo on a permanent basis. Each mode bit in effect doubles the number of paths through the code. Sure, we must have backward compatibility, but that kind of special-case code can't be helped. Mode bits must be avoided.
Edward
</t>
<t tx="ekr.20050421192149.4">Please read: cold feet
After all the interesting design work with shared vnodes, I am having major doubts that such a change to Leo is wise. Here are my concerns:
1. Using shared vnodes will likely cause all kinds of subtle compatibility problems: Eliminating tnodes (even with the self.t = self hack) has the potential to create complex and subtle changes in the meaning of code. For example, at present both vnodes and tnodes contain status bits. In particular, merging the vnode and tnode "visited" bits into a single vnode is going to affect existing code. I am already seeing this kind of problem as the result of comparatively minor changes needed for 4.0. Eliminating tnodes altogether makes me extremely nervous.
2. Sharing vnodes means that different nodes on the screen are _really_ identical. For example, with a shared vnode scheme all shared vnodes would have to be marked if any shared vnode were marked. This would be a small violation of what users might expect.
3. The present separation between vnodes and tnodes is actually quite natural. Tnodes represent shared information. Vnodes represent nodes on the screen.
4. Although theoretically interesting, major changes are needed neither to support 4.x nor to support LeoN.
In short, sharing vnodes looks like all pain and no gain:
1. The potential for a major disruption to Leo's progress is very real, in spite of the trick of making positions look like vnodes.
2. Scalability issues do not seem pressing now. Moreover, there may be less drastic ways of improving Leo's performance besides changing Leo's fundamental data model.
3. Abandoning the shared vnode scheme would mean that I could release 4.0 beta 1 in a matter of a week or three.
I am going to let these thoughts and fears sink in for a few more days at the least. There is no need for another quiet time. Indeed, I encourage your comments on this vital subject.
Edward
P.S: Leo's file formats are not, in fact, strongly connected to how Leo represents data internally. We are pretty much free to represent data in .leo files and derived files as we like: Leo's various kinds of read/write can easily and cleanly translate from any external file format to/from any internal data representation.
EKR
</t>
<t tx="ekr.20050421192149.5">Note: a copy of this will be on Leo's wiki shortly. I am posting this here because I think it is important that everyone sees it.
Yesterday in the bath I was mulling over how Leo should deal with conflicting contents of the "same" node (nodes with the same vnx or tnx). In particular, I was considering Gil's picture of conflict nodes on Leo's wiki.
I didn't understand a lot of the details and assumptions about that picture, and I was wondering how to go about understanding the details, when I suddenly realized that it wouldn't matter if I _did_ understand the details. Any conflict scheme based on a complex model, with complex operations, implementing subtle distinctions would fail completely. To be useful to the user, conflicts must be dead simple to see, to use and to understand. Everything must be obvious and intuitive to the _naive_ user.
This conclusion is based on my own experience with "dreaded read errors" (errors that arise in 3.x as the result of mismatches between the outline structure in .leo files and the corresponding derived files.) Here I was, the creator of Leo, with intimate knowledge of all aspects of Leo and its implementation, and my one and only reaction to a dialog asking me to make a choice was helpless, blind panic. The dialog didn't provide me with nearly enough information to make a proper choice, and the panic that the dialog induced in me would have prevented me from making an informed choice even if I did have the missing information!
A similar situation pertains regarding conflicts. There is no use in having fancy distinctions about kinds of conflicts. The user won't understand those distinctions, no matter how sophisticated the user is. I am sure that I would not be able to understand those distinctions. Moreover, we _must_ assume that the user knows nothing about various flavors of conflicts. Users won't read documentation until well _after_ the conflicts have been presented to him or her, if ever. And I would be most unwilling to answer endless questions on Leo's Help Forum regarding various kinds of conflicts.
With all this in mind, the fundamental design issues became quite clear. The following is based on Gil's diagram, with several simplifying assumptions:
- Leo must indeed handle conflicts. Gil has just pointed out that cvs can't detect conflicts in nodes in different files.
- Leo can't use dialogs to resolve conflicts: dialogs offer too little context.
- N.B. Leo will represent conflicts by a single _conflict node_ (possibly cloned). Conflict nodes behave _exactly_ as regular nodes as far as most of Leo is concerned. In particular, conflict nodes are cloned if and only if the corresponding regular node would be cloned. Moving, inserting, deleting, cloning conflict nodes happens just as with any other nodes. A change to any part of a conflict node is propagated to all other joined nodes, and all those joined nodes will be conflict nodes.
- N.B. On the screen, Leo shows conflicts as "fat nodes." Fat nodes look like a tree, but in reality _fat nodes are a single node_. Leo draws fat nodes as a _conflict_ tree, consisting of a _main headline_ and one or more _conflict headlines_. Leo will draw the conflict tree in a distinctive manner to indicate that the nodes of a conflict tree are closely related to the main headline. There may be a command to delete a selected conflict headline. BTW, the details of how to draw fat tnodes are confined solely to code in leoTree.py.
- N.B. The _main headline_ represents the _main data_ of the conflict node. Leo uses the main data of a node in all operations. In particular, ** Leo only writes main data to .leo or derived files **
In other words, _conflicts are invisible_ to Leo for most purposes. All conflicts will be lost when Leo exits. Until then, users can update or replace the main data with whatever data they choose.
- N.B. Leo will represent fat nodes in memory as a _fat tnode_. Fat tnodes are exactly like present tnodes except that they contain a new conflictList ivar. This ivar holds a list of alternate values for the headline and body text stored in tnodes. This ivar does _not_ affect the rest of Leo (except as noted below). In particular, tnode getters and setters _ignore_ the contents of conflictList, so the file code is unchanged, the find command ignores alternative conflicting values, etc. etc. (Of course, it would be possible to write scripts that would access t.conflictList data.)
In this representation, the _conflict data_ is the contents of t.conflictList. The main headline is the fat tnode's regular headline text. Conflict headlines are headlines associated with items in conflictList.
- N.B. There is only one conflict tnode associated with any set of conflicting nodes with the same gnx or tnx. This is because in 4.0 tnodes contain headline text. No matter how many conflicts are associated with a particular vnode or tnode, all such conflicts get merged into a single conflict tnode.
- We can extend the Cut Node and Copy Node commands to work on conflict headlines. We shall certainly also want to have Go To Next/Previous Conflict commands. We might want to add Insert Conflict Node or Delete Conflict Node commands, but these are optional. There should be no need for other operations on conflicts. As always, users could create new nodes to squirrel away data wherever they please.
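The fat tnode described above can be sketched in a few lines. Only the conflictList ivar comes from the description; the class name FatTnode, the method names, the duplicate check in addConflict, and the promote helper are hypothetical choices made for illustration.

```python
class FatTnode:
    """A tnode plus a list of alternate (headline, body) values."""

    def __init__(self, headline='', body=''):
        self.headString = headline
        self.bodyString = body
        self.conflictList = []  # alternate (headline, body) pairs

    # Getters see only the main data, so file and find code is unchanged.
    def getHeadline(self):
        return self.headString

    def getBody(self):
        return self.bodyString

    def isConflictNode(self):
        return bool(self.conflictList)

    def addConflict(self, headline, body):
        """Record an alternate value unless it duplicates existing data."""
        item = (headline, body)
        main = (self.headString, self.bodyString)
        if item != main and item not in self.conflictList:
            self.conflictList.append(item)

    def promote(self, index):
        """Swap the alternate at index with the main data."""
        headline, body = self.conflictList[index]
        self.conflictList[index] = (self.headString, self.bodyString)
        self.headString, self.bodyString = headline, body

def write_tnode(t):
    """Write only main data; alternates in conflictList are never written."""
    return t.getHeadline() + '\n' + t.getBody()
```

Because write_tnode goes through the getters, conflicts vanish when the outline is saved, which matches the statement that all conflicts are lost when Leo exits.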
That is just about all there is to it. Note in particular what is _not_ present:
- We haven't created anything new; we have just added the conflictList ivar to tnodes. True, Leo draws fat nodes differently, but this changes little.
- All operations on nodes remain exactly as they have been in the past. In particular, Leo always writes main data to .leo and derived files. In other words, Leo mostly ignores conflicts.
- Leo will not expect the user to see or understand any kind of distinctions regarding conflicts. It doesn't matter how or why the conflicts arose: if data conflicts, no matter what the reasons, Leo simply shows all the possibilities. Leo will pick one of those possibilities (somewhat at random) to use as the main headline. I say "somewhat at random" because in some cases Leo might make an educated guess about what the main data should be. See the next section.
Avoiding conflicts
In some cases Leo won't even bother to create items in t.conflictList, but will instead silently replace one version of data with another. In other words, it may be valid for Leo to make distinctions about types of conflicts, _as long as those distinctions never become apparent to the user_ !! In particular, I plan to change the format of .leo files to add a file modification date to the <v> element of each @file node. This will allow Leo to do the following:
1. Leo will silently ignore conflicts that arise solely because the user has edited a derived file after the derived file is created. Leo does this now: it's essential to make "automatic untangle" work without being intrusive.
2. Leo must warn when reading a derived file that was created _before_ the modification date in the corresponding <v> element in the .leo file. Such situations are more like file reversions than conflicts.
3. Leo probably will warn when conflicts arise on the _same_ node as the result of reading the node from two different derived files. Such conflicts seem to me to reflect bad organization. Note: in spite of this warning (probably given in the log pane) Leo will create entries in t.conflictList as usual. There is _nothing special_ about such conflicts.
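The three rules above might be sketched like this. Everything here is an assumption made for illustration: the function names are hypothetical, dates are plain numbers such as those returned by os.path.getmtime, and the log pane is modeled as a list of strings.

```python
def check_file_dates(stored_date, file_date):
    """Compare the date stored in the .leo file with the derived file's date."""
    if stored_date > file_date:
        return 'reversion'   # rule 2: the file predates the outline, so warn
    return 'read'            # rule 1: later edits are accepted silently

def note_node_text(seen, tnx, text, file_name, warnings):
    """Track the text read for each node and return any conflicting alternate.

    seen maps tnx to (text, file_name) for the first file that defined it.
    """
    if tnx not in seen:
        seen[tnx] = (text, file_name)
        return []
    first_text, first_file = seen[tnx]
    if first_text == text:
        return []
    if first_file != file_name:
        # Rule 3: warn, but treat the conflict like any other.
        warnings.append('node %s differs in %s and %s'
                        % (tnx, first_file, file_name))
    return [text]
```

The list returned by note_node_text is what would be appended to t.conflictList; the warning changes nothing about how the conflict is stored.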
Conclusions & Summary
I believe the general form of the conflict issue is becoming clear:
- Conflicts must be simple. Users, even sophisticated and experienced users like me, simply won't be able to understand complex conflicts.
- Leo must present only one kind of conflict to the user, and there must be no new kinds of outline nodes or outline operations. Fat nodes only appear to be new. In fact, they work exactly as always.
- Adding a few simple commands will suffice to support conflicts.
- Leo can continue to ignore some (most?) conflicts, silently replacing one node with another.
The scheme just outlined can handle all conflicts because it doesn't attempt to do too much. Conflicts are simply presented to the user for the user to handle or ignore. Leo can't be expected to reconcile conflicting versions of nodes.
I believe this scheme presents a firm foundation for further work. Indeed, I probably won't implement fat tnodes now, secure in the knowledge that they can be added easily at any time. For 4.0 alpha 0 it will suffice to do the following:
- Add modification dates to <v> elements in .leo files corresponding to @file nodes.
- Warn on conflicts in the log pane and use the version of the conflict in the derived file that is read last.
With these changes we can see how often serious conflicts actually arise.
That's all for now. I am most interested in your comments.
Edward
P.S. Thanks again, Gil, for your many comments. To repeat, your diagram on Leo's wiki was crucial to all these thoughts. Fat tnodes are obviously based directly on that diagram.
EKR
</t>
<t tx="ekr.20050421192149.6">Conflicts: new directions
Sometimes in the creative process it is useful to _increase_ the confusion.
With that in mind, I'd like to propose two different, contradictory approaches to resolving node conflicts in 4.x.
I. Avoid the conflicts entirely
It might be possible to avoid conflicts completely using cvs. Indeed, cvs is quite successful in managing concurrent development. Let's not forget this fact!
Suppose we declare that all developers using Leo must use cvs if we are going to use Leo cooperatively. Might it not be true that this rule will, by itself, solve all the 4.x conflict issues? It certainly seems plausible to me. After all, that's really cvs's job!
II. Redesign Leo along client-server lines
Another way would be to use the Lotus Notes approach, whatever that is. Here is where homework comes in. Does anyone have personal knowledge of Lotus Notes? One of my friends does have that experience. He's a realtor, so he won't be tainted by any knowledge of Leo, which might be really useful: he will focus on the big picture.
The nice thing about design is that we can contemplate huge changes in Leo (like making it work like Lotus Notes) without any investment in coding.
Comments please.
Edward
</t>
<t tx="ekr.20050421192149.7">OTW: Replace clones with hoists?
I've titled this Off The Wall because it is pure speculation.
Nobody is more fond of clones than I, yet we are seeing all sorts of implementation and design problems that arise directly from clones:
- Clones slow down fundamental outline operations. This limits how big Leo outlines can be. OTOH, all programs have limits, and it may be that clones really aren't the limiting factor. Clearly, drawing many nodes on a screen is going to be slow in any case.
- Clones appear to make @file-thin impossible.
- Clones greatly complicate error recovery, and error recovery drives all aspects of Leo's design.
- Clones complicate resolving conflicts between files, both .leo files and derived files.
So my question is: can we get the _benefits_ of clones without actually implementing clones?
Well, what are the benefits of clones? How we answer this question is crucial! I am going to focus on how I actually use clones; I'm going to ignore uses of clones that I haven't yet thought of.
1. Clones allow different views of a tree. We clone nodes and gather them together for easy viewing.
2. Clones join all these different views together.
3. Clones are "live". Altering any clone, including its structure, alters all other clones.
At present, all clones of a node are equivalent to the node itself. There is no such thing as a "master" node from which all clones are derived. Suppose we alter this as follows:
1. We replace clone nodes with "link nodes" (patent pending). Link nodes have no structure and no content. They merely point to another node, the target node. Linking to a link node is exactly equivalent to linking to the target node.
2. Selecting a link node takes us to the target node.
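A minimal sketch of how link nodes might resolve to their targets follows. It assumes that nodes are looked up by gnx in a dict; the class name Node, the target field, and the function resolve are all hypothetical. Following the chain until a regular node is reached is what makes linking to a link node equivalent to linking to its target.

```python
class Node:
    def __init__(self, gnx, headline='', body='', target=None):
        self.gnx = gnx
        self.headline = headline
        self.body = body
        self.target = target  # gnx of the target node; None for a regular node

def resolve(nodes, gnx):
    """Follow links from gnx until a regular node is reached.

    Returns None for a broken link (a target that no longer exists).
    """
    visited = set()
    while True:
        if gnx in visited:
            raise ValueError('circular link: ' + gnx)
        visited.add(gnx)
        node = nodes.get(gnx)
        if node is None:
            return None
        if node.target is None:
            return node
        gnx = node.target
```

Selecting a link node would then select resolve(nodes, link.gnx), and a None result is one way to notice that the original node has been deleted.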
We must make sure that we don't get dizzy as Leo skips around the outline. The great advantage of clones is that we can move from clone to clone in a project headline _without_ constantly redrawing the outline, and without having the vertiginous experience of the outline changing. So if we eliminate clones we want to do so in a way that doesn't create visual chaos.
Hoisting may provide a way to do this. Hoisting is one of MORE's other cute ideas. Hoisting just means replacing the view of the outline in the outline pane by the presently selected suboutline. Naturally, there must be a way of "dehoisting": going back to the original view. BTW, in MORE, (and in Leo without clones) there can be arbitrarily many levels of hoisting.
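Arbitrarily many levels of hoisting suggest a stack. The following is only a sketch of that idea with hypothetical names (HoistStack, visible_root); it says nothing about how the outline pane would actually be redrawn.

```python
class HoistStack:
    """Track which suboutline the outline pane should display."""

    def __init__(self, root):
        self.root = root   # the whole outline
        self.stack = []    # hoisted nodes, innermost last

    def hoist(self, node):
        """Show only the suboutline rooted at node."""
        self.stack.append(node)

    def dehoist(self):
        """Go back one level; do nothing if no hoist is in effect."""
        if self.stack:
            self.stack.pop()

    def visible_root(self):
        """Return the node whose suboutline is presently shown."""
        return self.stack[-1] if self.stack else self.root
```

Each hoist narrows the view and each dehoist restores the previous one, so repeated dehoisting always ends at the original outline.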
Hoists can be done in several ways:
1. Inside the Leo window: Leo would replace the view of the entire outline with a view only of the linked outline. This isn't so jolly because the set of link nodes in the project view disappears and we would have to dehoist in order to see the project view.
2. In a separate Hoist Window. This can be done in two ways: we could show the hoisted outline either in the Hoist Window or in the main Leo Window. There are advantages to either. Perhaps we shall want to allow either way. Either way, clicking a link node hoists the target node.
If we show the hoisted outline in the main Leo window we need a command to copy the present node (presumably a project node including links) to the Hoist Window. If we show hoisted code in the Hoist Window then we need no such command. Clicking a link node shows the hoisted tree immediately in the Hoist Window.
I think I favor using the Hoist Window for showing the project view: then we always see the project view while hoisting. The alternative is to show the hoisted outline and the body text in two panes of the Hoist Window. The advantage is that the entire outline is always visible in the main Leo window.
Consequences
This would be a radical restructuring and simplification of Leo:
1. There would no longer be a need for separate tnodes and vnodes. All we need is a separate link for link nodes.
2. Gnx's make link nodes possible!
3. Leo could insist that link nodes appear only outside of @file trees. This kind of restriction would be similar to the ban on orphan and ignored nodes.
4. Therefore, only .leo files contain link nodes. Link nodes refer to whatever tree the gnx refers to. Link nodes _have no structure_. Actually, we could allow link nodes to have descendants, but the structure of such descendants does not affect the link node in any way. It would probably be less confusing, though, to disallow link nodes from having children. In any event, this is a minor point.
5. Leo could use a much simpler file format, possibly even OPML. However, we would need a kludge so that we could represent link nodes in OPML.
6. Leo's fundamental outline operations would be much simpler. No need for dependent trees. No need for join lists!
7. @file-thin nodes become possible again. Derived files can completely specify structure. Link nodes refer to gnx's, whatever their structure. Without clones we have no more structure mismatches when reading derived files.
8. Structure conflicts are still possible between two .leo files, i.e., between two versions of a single flat file. I'm not sure how to resolve this. In any case, the problems are no worse than before.
9. When deleting an original node all link nodes become "broken". We must deal with this somehow. Maybe we can even ignore it and let the user delete the links. Or we can prompt. Note that undo can restore the target of the link, so we probably don't want to automatically delete link nodes (unless undo also restores the link nodes!).
Summary
Hoists would eliminate clones and dependent trees, at the expense of a bit more work on the user's part.
This is a completely experimental idea. I would have to implement it completely before deciding whether it is a good idea. It does have appeal.
The notion of tabbed Leo windows and the recent work with Mark and Recent windows has shown that altering Leo's visual presentation is relatively easily done.
Naturally, your comments and suggestions are crucial.
Edward
P.S. I think the best time for blue-sky thinking is when things are already completely confused :-) Now is that time, so let her rip.