CoEditor Unable to Reconcile Simultaneous Changes #185

ccotter · 2012-06-12T16:38:22Z

The CoEditor appears to work properly when clients atomically make changes, but when two clients make simultaneous changes, the operational transform algorithm destroys the HTML structure embedded in the raw character text. See the example below.

Initially:

Client 1 html:  
Client 2 html:  
Client 1 changes   into a  and sends the following syncs to the server:
- insert 2 >
- insert 3 a
- insert 4 <
- insert 5 /
- insert 6 b
- insert 7 >
- insert 8 <
- insert 9 b
Client 2 changes   into s and sends the following syncs to the server:
- delete 0
- delete 0
- delete 0
- update 0 s

At this point:

Client 1 html: a 
Client 2 html: s

After the operational transform:

Client 1 receives the following remote syncs:
- insert 0 >
- insert 1 a
- insert 2 <
- insert 3 /
- insert 4 b
- insert 5 >
- insert 6 <
- insert 7 b
Client 2 receives the following remote syncs:
- delete 0
- delete 0
- delete 8
- update s 8

The final HTML of both clients does in fact remain the same - however, it is malformed: >a<bs

The core issue is the operational transform is agnostic to information encoded in HTML tags (i.e. the OT only provides syntactical consistency, not semantic consistency). This is not an OT bug, but rather a design flaw in how the editor data is synced.
#167 is related in behavior, although the underlying problems are different and perhaps more severe for this issue.

The text was updated successfully, but these errors were encountered:

ccotter · 2012-06-12T16:55:21Z

Try to brainstorm some ideas here...

Solutions

We could treat the editor content as a DOM tree and not as a single sequence of characters. We could use the same method of syncing the tree as the CoTree demo.

Current Approach

I have been working for the past two days on a DOM tree based approach to representing and syncing the editor data. I've come up with a representation (likely to change somewhat) of the DOM tree that we can use to sync. The difficult part is doing a diff on two different DOM trees. I had not considered how to diff two different trees initially, and by far figuring out an efficient approach had taken up most of my time.

Tree Diffing

I could not find any useful existing libraries that could diff trees (in JavaScript), so I found some papers describing techniques for doing so. I have implemented an algorithm described by Chawathe et. al. [1], and have been testing it for correctness and designing a series of regression tests. The algorithm essentially takes two trees and finds an EditScript, or a sequence of operations (insert, update, move, and delete) to change one tree into another.

Currently, one part of the algorithm is unacceptably slow, but [1] and [2] offer solutions. Another caveat is that [1] makes some assumptions about the tree structure and data in order to guarantee to find a minimal EditScript - these assumptions do not hold for our editor content. Fortunately, [2] and others (not listed here) offer more efficient solutions that relax assumptions about the trees that are acceptable for the CoEditor.

Update June 17th 2012

I've created a branch (treeBasedEditor https://github.com/opencoweb/cowebx/tree/treeBasedEditor) with the code changes. In the current state, the coeditor has been converted to use the tree based approach, but there are some bugs that prevent syncing to work properly. In other words, I just haven't finish fully implementing everything, but the majority of the work is done, and with some tweaking the editor should be fully functional.

[1] http://ilpubs.stanford.edu:8090/115/1/1995-46.pdf
[2] http://www.ifi.uzh.ch/pax/uploads/pdf/publication/704/fluri-tse2007.pdf

ccotter · 2012-07-31T19:26:14Z

I believe we can improve the raw HTML based editor to a state that eliminates the effects of the inherent issue discussed above.

Add a moderator to the editor to track changes to the rich text content.
Anytime the raw HTML becomes invalid, the moderator will revert back to a version that is valid HTML.
Anytime the raw HTML becomes invalid, browser clients will destroy their collaborative object, and create a new one, and will retrieve the valid copy from the moderator.

Since all clients will converge to the invalid HTML (guaranteed by the OT algorithm), eventually each will request a fresh copy from the moderator, who will always have a valid HTML string.

ccotter · 2012-08-01T20:43:06Z

The coedit app uses a moderator to track the rich text editor widget changes. Now, I need to

Add logic to the browser javascript to check for inconsistent states and reinitialize the collab object to pull valid state from the moderator.
Implement a validator in the moderator to check if a string is valid html.

ccotter · 2012-08-02T14:27:04Z

Always forget to include issue number in commit messages.

49881468001c65e81a4e42b24d468a72104c896d
a9056738f88122d15da078a31c27fe8522a9761a
14411fb3f9d0aad04cc748a6f52666637404cd5f
ac8e22fed3d69c94a74388f93d476871cfa7a845
e66471d936b5fcd11b8950b3953976b5f135abfa

ghost assigned ccotter Jul 1, 2012

ccotter mentioned this issue Jul 1, 2012

Replace coedit with DOM-based version #195

Closed

ccotter mentioned this issue Oct 16, 2012

[CoEditor] HTML control characters are being displayed #167

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CoEditor Unable to Reconcile Simultaneous Changes #185

CoEditor Unable to Reconcile Simultaneous Changes #185

ccotter commented Jun 12, 2012

ccotter commented Jun 12, 2012

ccotter commented Jul 31, 2012

ccotter commented Aug 1, 2012

ccotter commented Aug 2, 2012

CoEditor Unable to Reconcile Simultaneous Changes #185

CoEditor Unable to Reconcile Simultaneous Changes #185

Comments

ccotter commented Jun 12, 2012

Initially:

At this point:

After the operational transform:

ccotter commented Jun 12, 2012

Solutions

Current Approach

Tree Diffing

Update June 17th 2012

ccotter commented Jul 31, 2012

ccotter commented Aug 1, 2012

ccotter commented Aug 2, 2012