-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathspec.sup
366 lines (255 loc) · 10.3 KB
/
spec.sup
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
====================================================================
Sup - The ^Super Unambiguous Unobstrusive Understandable Plain Text^
====================================================================
Objective
=========
Provide an unambiguous and extensible file format with an intuitive markup
that allows the same file to be easily read as plain text or be parsed
and converted in books etc.
The Sup markup consists in the minimum amount of symbols while being the most
unambiguous as possible. It is parsed to a list of blocks of content, and each
block of content is parsed to a list of markup pieces.
This document specification is under GNU-FDL License. See credits at the end
of this document.
Basic markup
============
The styling markups are the following:
* The ````verbatim```` that becomes ``verbatim``
* The ``^^strong^^`` that becomes ^^strong^^
* The ``^emphasis^`` that becomes ^emphasis^
Headings and Titles
===================
They are used to write section titles and delimitate the bigger parts of your
text:
=============
Title Level 1
=============
Title Level 2
=============
Title Level 3
-------------
Title Level 4
~~~~~~~~~~~~~
Lists
=====
Lists are written prepending ``*`` or ``-`` for unordered lists
and ``1.`` or ``a.`` or ``A.`` for ordered lists. Sublists can be created
simply adding a two space indentation. Blank lines can also be added between
items.
* First of unordered list
* Second of unordered list
1. First of ordered sub list
2. Second of ordered sub list
- First of unordered sub sub list
- Second of unordered sub sub list
Preformated Text
================
Just indent with 4 spaces more than the previous indentation. All subsequent
lines with at least the same indentation will become part of the same
preformated block until it find a smaller indentation.
Code Blocks
===========
They act exactly like preformated text, but give hints about how external
parsers can process its content syntax.
:: code :: Lua
function sum(a,b)
return a+b
end
Code Signature
==============
Used to create very smaller section headings on the text with preformated text
in a single line. Convenient to create source code documentation about
functions that can be called or represented in many different ways.
## string.sub ## lua
string.sub( str @string, start @number, end @number ) @string
@string:sub(start @number, end @number) @string
## myDatatype ## c
typedef struct {
int x;
char **str;
} myDatatype;
In the example above, two entries will be created at the document index, one for
``string.sub`` and other for ``myDataType``.
Links
=====
The linking markups are between ``<<`` and ``>>``. The type of link is defined
immediately after the angle braces and can or not be followed by a token or
complementing value inside single backticks `````.
before and after it.
* The ``<<word>>`` links to the anchor heading in the document.
* The ``<<word>>@`http://somewhere.com``` make ``word`` point to an address
outside the document.
* The ``<<<word>>>`` (triple chevrons) points to another document, as in wikis.
The internal reference is made on each heading or <<signature header>>
by tokenization (replacing any ASCII non alphanumerical character by space,
trimming and removing duplicate spaces that are replaced by a single ``-``.
Non ASCII characters are preserved).
Footnotes
=========
Footnotes are marked by a single asterisk (see <<Choices>>) right after a word
or styling marker:
This is a footnote*, and this also is ^other footnote^*.
This is a footnote*`foo` with a name.
This is a numbered footnote*1.
** This explains the first footnote.
** This explains the second footnote.
** `foo` This explains a named footnote.
** 1 This explains a numbered footnote.
Glossaries
==========
Glossary references are marked with a hash ``#`` immediattely the word or
expression. To group different words reffering to the same concept, it
can be followed by the target glossary reference between backticks:
There are many planets# in our Universe. All planets seen nearer of its
star can seen as it if had moon phases, even a waxing#`crescent` one.
# `planets` are near spherical objects that not shine like stars and are
bigger than our moon.
# `crescent` or waxing is the appearance of a celestial spheric body when
its brighter hemisphere becomes more apparent.
Glossary can have no explicative notes and be referenced multiple ways to just
create indexes of where in the document such words were used.
Image (draft)
=============
Images are a distinct block where you can put its address with some metadata
if you want to.
:: image :: Your short description
src = https://xxx.yyyy/x.img
width =
height =
author =
year =
copyright =
Breaks
======
Breaks can be used to disambiguate on list or block ending. It consists in just
adding new line with only four carets ``^^^^`` positionated at the desired
block level. This shouldn't be used as decorative separation lines.
Blocks
======
A block is formed by contiguous non-empty lines.
A simple blank line is sufficient to separate two blocks. If more blank lines
are found they are discared.
The only block not ruled by this is the preformated block.
Custom blocks can also incorporate subsequent blocks with wider indentation,
but this logic is not applied to ordered or unordered lists where each item
is considered a distinct block.
Other block types:
Quotation (draft)
:: quote ::
Se essa rua fosse minha
eu mandava ela calar
A bagunça da vizinha
eu não sei se tem pra lá
Tables (draft)
--------------
:: table :: `My table title`
# Column A
# A line 1
# A line 2
# A line 3
# Column B
# B line 1
# B line 2
# C line 3
Bibliography (draft)
--------------------
:: bib :: JUNG-MDT
name = Carl Gustav Jung
title = Memory, dreams and thoughts.
year = 1968
page = 250
:: bib :: FRANZ-SH
name = Marie-Louise von Franz
title = Shadow in Fairy tales
year = 1960
page = 120
Specifications
==============
ASCII delimiter
---------------
^Sup^ consider only ASCII printable characters in its parsing.
So a file can be written using UTF-8; As UTF-8 contains punctuation marks other
than the found in ASCII, when using them after an inline markup ending, it
should be preceded by an escaped whitespace.
Parsing order
-------------
1. First the text will be split in blocks, that are delimited by a empty lines.
A single empty line is sufficient for that and the immediately following empty
lines are discarded.
2. Detect the type of each block and it to the list of blocks with the type
information.
3. Accordingly to the block type, call the block type parser.
4. Do inline parsing that should parse the block content, when applicable.
5. The default inline parsing should follow this order:
a. verbatim
b. links
c. strong
d. emphasis
Inline escaping
---------------
- A backslash before any character make it has no special meaning for parsing.
- Escaped whitespace is a space character preceeded by a backslash. It allows
to force closing inline markups when inside a word. At the end of document
parsing these escaped spaces are completely removed.
- Inline verbatim disable any escaping, so if a backslash is found inside the
verbatim chunk, it will be kept as is.
Inline markup boundaries
------------------------
The start delimiter in inline markup is valid when:
- It is at the start of a line;
- It is anteceded by a space character;
- It is anteceded by one of the ASCII charaters:
' " ( { [
The end delimiter in inline markup is valid when:
- It is followed by a space;
- It is followed by one of the ASCII characters:
' " ) } ] , . : ; ! ?
Verbatim parsing
----------------
- Verbatim starts with a pair of backticks `````` preceded by any valid start
delimiter.
- Cannot be nested. Its contents are not escaped.
- It only ends when a pair of backticks are found followed by a boundary ending
as defined on <<Inline markup boundaries>>.
Link parsing
------------
It has a pair of left angle brackets `<<` followed by words and finalized by
another pair of right angle brackets `>>`.
To diferentiate between the kinds of link, a modifier is added immediately after
the right angle bracket. These modifiers are:
- The at sign ``@`` for URLs;
- The asterisk ``*``` for footnotes;
- The hash ``#`` for glossary/vocabulary;
- (draft) The double asterisk ``**`` for bibliographical references;
After the modifier, a single parameter can be passed to the link. This parameter
should be written between single backticks and shouldn't be parsed for more
markup.
- For URLs, the parameter is the target (email, site etc.);
- For footnotes, the parameter use an explicit numbering (instead of the
positional) or an explicit name.
- For glossary, the parameter can be used to link a word variant to another in
the glossary.
- (draft) For bibliographical reference, it should point to the token pointing
to the right bibliographical description.
Choices
=======
* Links: angle brackets (chevrons) in literature are most commonly used to
quote someone, say something in foreign language or insert asides. Just because
this is the main usage in literature, but is rarely used (giving place to
parenthesis or quotation marks) it is well suited to represent in plain text
what should be found elsewhere in document or outside it.
* Asterisks: the main usage of asterisks or stars (or daggers) in printed text
is to mark a place in text where there is an anotation or further information.
Use them for emphasis or strong markup looks strange and distractful to a
reader.
* Carets: as the names says it resembles the original marks added in
"proofreading and typography to indicate that additional material needs to
be inserted at the point indicated in the text"*. As in digital writing it is
rarely used outside technical usage (except in programming or mathematical
representation) it is the perfect sign to indicate where a text is an emphasis
or a strong emphasis.
** https://en.wikipedia.org/wiki/Caret_(proofreading)
Credits
=======
This document is under GNU-FDL license.
Copyright (c) 2023 - Thadeu A. C. de Paula