forked from konvpalto/offlineimaporg
-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathhowitworks.html
102 lines (95 loc) · 8.46 KB
/
howitworks.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<!--[if lt IE 9]>
// teach older IEs to render those elements at all (http://diveintohtml5.org/semantics.html#unknown-elements)
<script>
var e = ("abbr,article,aside,audio,canvas,datalist,details," +
"figure,footer,header,hgroup,mark,menu,meter,nav,output," +
"progress,section,time,video").split(',');
for (var i = 0; i < e.length; i++) {
document.createElement(e[i]);
}
</script>
<![endif]-->
<link href='http://fonts.googleapis.com/css?family=Walter+Turncoat' rel='stylesheet' type='text/css'>
<link rel="stylesheet" href="style.css" type="text/css" />
</head>
<body>
<header>
<img id="logo" src="img/logo.png">
<hgroup>
<h1><a href="http://offlineimap.org">OfflineImap</a></h1>
<h2>Syncing IMAP accounts since 2002</h2>
</hgroup>
<nav>
<a class="reference" href="index.html">Home</a>
<a class="reference" href="index.html#ref-download" >Download</a>
<a class="reference" href="index.html#ref-quick-start" >Quick Start</a>
<a class="reference" href="index.html#ref-documentation" >Documentation</a>
<a class="reference" href="development.html" >Development</a>
<a class="reference" href="index.html#ref-links">Links</a>
</nav>
</header>
<section id="content">
<h3 class="entry-title">How OfflineIMAP works</h3>
<h4 class="vcard author">by <span class="fn">Edward Z. Yang</span></h4>
<div class="entry-content">
<div class="document">
<!-- -*- mode: rst -*- -->
<p>As software engineers, we are trained to be a little distrustful of marketing copy like this:</p>
<blockquote>
OfflineIMAP is SAFE; it uses an algorithm designed to prevent mail loss at all costs. Because of the design of this algorithm, even programming errors should not result in loss of mail. I am so confident in the algorithm that I use my own personal and work accounts for testing of OfflineIMAP pre-release, development, and beta releases.</blockquote>
<p>What is this algorithm? Why does it work? Where is the correctness proof? Unfortunately, no where in OfflineIMAP’s end user documentation is the algorithm described in any detail that would permit a software engineer to convince himself of OfflineIMAP’s correctness. Fortunately for us, OfflineIMAP is open source, so we can find out what this mysterious algorithm is. In fact, OfflineIMAP's synchronization algorithm is very simple and elegant. (Nota bene: for simplicity’s sake, we don’t consider message flag synchronization.)</p>
<div class="section" id="preliminaries">
<h3>Preliminaries</h3>
<p>Define our local and remote repositories (Maildir and IMAP, respectively) to consist of sets over messages L and R. In a no-delete synchronization scheme, we would like to perform some set of operations such that end states of the repositories L' and R' are L ∪ R.</p>
<p>However, no-delete synchronization schemes work poorly for email, where we would like the ability to delete messages and have those changes be propagated too. To this end, OfflineIMAP defines a third repository called the status repository, also a set over messages, which says whether or not a message has been synchronized in the past without an intervening synchronized delete. There are now seven possible states for a message to have, based on which repositories it is a member:</p>
<div class="outer-image"><div class="inner-image"><img alt="img/state-space.png" src="img/state-space.png" /></div></div>
<p>Considering all possible combinations:</p>
<ul class="simple">
<li><strong>Synchronized</strong> (L,R,S): The message is fully synchronized and needs no further processing.</li>
<li><strong>New Local</strong> (L): The message was newly added to the local repository and needs to be uploaded.</li>
<li><strong>New Remote</strong> (R): The message was newly added to the remote repository and needs to be downloaded.</li>
<li><strong>Status Missing</strong> (L,R): The message is synchronized but our status is out-of-date.</li>
<li><strong>Remote Removed</strong> (L,S): The message was synchronized, but since then was removed from the remote; it should now be removed from local.</li>
<li><strong>Local Removed</strong> (R,S): The message was synchronized, but since then was removed from the local; it should now be removed from remote.</li>
<li><strong>Missing</strong> (S): The message has been deleted everywhere and our status has a stale entry for it.</li>
</ul>
<p>The green-shaded region of the Venn diagram is what we would like L, R and S to cover at the end of synchronization.</p>
</div>
<div class="section" id="algorithm">
<h3>Algorithm</h3>
<p>Define a synchronization operation on a source, destination and status repository <tt class="docutils literal">syncto(src, dst, status)</tt> to be these two steps:</p>
<ol class="arabic simple">
<li>Calculate the set difference <tt class="docutils literal">src - status</tt>, and copy these messages to <tt class="docutils literal">dst</tt> and <tt class="docutils literal">status</tt>.</li>
<li>Calculate the set difference <tt class="docutils literal">status - src</tt>, and delete these messages from <tt class="docutils literal">dst</tt> and <tt class="docutils literal">status</tt>.</li>
</ol>
<p>The full synchronization algorithm is then:</p>
<ol class="arabic simple">
<li><tt class="docutils literal">syncto(R, L, S)</tt> (download changes)</li>
<li><tt class="docutils literal">syncto(L, R, S)</tt> (upload changes)</li>
</ol>
</div>
<div class="section" id="how-it-works">
<h3>How it works</h3>
<p>In the absence of crashes, the correctness proof only involves verifying that the status repository invariant (that messages in status have been synchronized in the past without an intervening synchronized delete) is preserved over all four operations, and that the set differences are, in fact, precisely the sets of messages we want to copy and delete. However, we can also try and look at how the local, remote and status repositories change as the algorithm progresses. In particular, the contents of the status repository in the first <tt class="docutils literal">syncto</tt> is slightly surprising as it evolves differently from <tt class="docutils literal">local</tt>, despite having the same operations applied to it (it then evolves in lockstep with <tt class="docutils literal">remote</tt>).</p>
<div class="outer-image"><div class="inner-image"><img alt="img/normal.png" src="img/normal.png" /></div></div>
<p>Another important correctness claim is that OfflineIMAP never “loses mail”. Under what conditions is mail deleted? When it is present in status repository, but not in the local or remote repository. So it is easy to see that when the status repository is “lost” (either corrupted, or deleted as the instructions tell you to if you delete the contents of your local folders), OfflineIMAP will conservatively perform a full, no-delete synchronization between the two sources. So long as the status repository never contains data for more messages than it ought to, OfflineIMAP will not delete your mail.</p>
</div>
<div class="section" id="variations">
<h3>Variations</h3>
<p>Suppose that I have more disk space available on local disk for Maildir than my remote IMAP server. Eventually, you will end up in the awkward position of wanting to delete messages from your remote IMAP server without correspondingly nuking them from your local mail store. OfflineIMAP provides the <tt class="docutils literal">maxage</tt> option (in which OfflineIMAP refuses to acknowledge the existence of messages older than some sliding window), but what if we <em>really</em> wanted to be sure that OfflineIMAP would never ever delete messages from my local repository?</p>
<p>Simple: Skip step 1-2.</p>
<div class="outer-image"><div class="inner-image"><img alt="img/asymmetric.png" src="img/asymmetric.png" /></div></div>
</div>
<div class="section" id="conclusion">
<h3>Conclusion</h3>
<p>By utilizing a third repository, for which data loss results in a <em>conservative</em> action on the part of the program, OfflineIMAP achieves its claims of <em>an algorithm designed to prevent mail loss at all costs</em>. It is also a simple algorithm, and I hope that any computer scientist or software engineer using this software will take the time to convince themselves of its correctness, rather than relying on the hearsay of some marketing material.</p>
</div>
</div>
</div>
</section>
</body>
</html>