some questions regarding the new <t-hspace> tag #95

kosloot · 2021-04-12T09:42:00Z

Recently a <t-hspace> tag was introduced, but when I started using it, some questions arose:

It is possible to add some text to a <t-hspace> like this:
<t-hspace>extra text</t-hspace>
This is accepted by foliavalidator and folialint, but doesn't show up in text() output. Probably that is OK.
In libfolia, it DOES show up, which is a bug I assume?
But shouldn't we disallow this construct? To avoid strange effects and misunderstandings?
There are NO predefined class values for <t-hspace>. I understand the rationale, but that places a big burden on all tools that would like to make use of it. They all have to create their own text() extraction functions and would be greatly helped by a predefined set that the libraries support, like "tab", "space", "wide-space", or similar.
I realize that defining such a set might be a challenge, but still.
The text() function is very complex and replicating it is cumbersome (as the handling of the 'tag' feature already showed us).
Another possibility might be a way of providing a translation table for those class values:
tab ==> '\t'
space ==> ' _'
wide-space ==> ' __'
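
A minimal sketch of what such a translation table could look like in C++. The class names and replacement strings are assumptions chosen for illustration (FoLiA defines no such set at this point), and `hspace_text` is a hypothetical helper, not a libfolia function:

```cpp
#include <map>
#include <string>

// Hypothetical mapping from <t-hspace> class values to output text.
// Neither the class names nor the widths are defined by FoLiA.
const std::map<std::string, std::string> HSPACE_TABLE = {
  { "tab", "\t" },
  { "space", " " },
  { "wide-space", "  " }
};

// Return the text for a class value, falling back to a single space
// when the class is not present in the table.
std::string hspace_text( const std::string& cls,
                         const std::map<std::string, std::string>& table = HSPACE_TABLE ) {
  auto it = table.find( cls );
  return it != table.end() ? it->second : std::string( " " );
}
```

Passing the table as a parameter would let a tool supply its own mapping while still relying on a shared default, which matches the opt-in idea.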


proycon · 2021-04-12T10:08:42Z

Good point, this is indeed not intentional and should be disallowed.
We could define a set, implement some support for it in the libraries, and recommend its usage. It's then simply up to users whether they decide to use that set or not (i.e. it'll be an opt-in choice).

kosloot · 2021-04-12T10:55:53Z

> Good point, this is indeed not intentional and should be disallowed.

Maybe the same holds for a few of the other text markup tags too?

> We could define a set, implement some support for it in the libraries, and recommend its usage. It's then simply up to users whether they decide to use that set or not (i.e. it'll be an opt-in choice).

That would be great. It leaves us with the challenge of creating a reasonable set.

kosloot · 2021-04-12T12:33:33Z

We can simply forbid text in a TextMarkupHSpace by adding 1 line in folia_properties.cxx:

//------ TextMarkupHSpace -------
    TextMarkupHSpace::PROPS = AbstractTextMarkup::PROPS;
    TextMarkupHSpace::PROPS.ACCEPTED_DATA.erase( XmlText_t );           // <=== 1 extra line
    TextMarkupHSpace::PROPS.ELEMENT_ID = TextMarkupHSpace_t;

But maybe this is not generic enough?

Otherwise, XmlText_t could be removed from AbstractTextMarkup::PROPS and explicitly added for the subclasses it applies to?

proycon · 2021-04-12T12:40:45Z

Generally we have the TEXTCONTAINER property for this. ACCEPTED_DATA only carries FoLiA elements in my implementations.

kosloot · 2021-04-12T12:47:31Z

Ah, right. That is a better solution, and it works:

folialint tests/bug59.xml
tests/bug59.xml failed: XML error: found extra text 'test' inside element <t-hspace>, NOT allowed there.

The input contained:

    <div xml:id="example.div.4" class="section" n="4">
      <t>Space,<t-hspace>test</t-hspace>the<t-hspace/>final<t-hspace/><t-hspace/>frontier</t>
    </div>
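
The idea behind a TEXTCONTAINER-style check can be sketched in isolation as follows. This is a simplified model, not libfolia code: `ElementProps` and `check_text` are hypothetical names, and only the error message is modelled on the folialint output above:

```cpp
#include <stdexcept>
#include <string>

// Simplified stand-in for an element's properties; not libfolia's real type.
struct ElementProps {
  std::string xmltag;
  bool textcontainer;  // may the element hold raw text directly?
};

// Throw when raw text is found inside an element that is not a text container.
void check_text( const ElementProps& props, const std::string& text ) {
  if ( !text.empty() && !props.textcontainer ) {
    throw std::runtime_error( "found extra text '" + text + "' inside element <"
                              + props.xmltag + ">, NOT allowed there." );
  }
}
```

With a per-element flag like this, forbidding text in one element is a matter of setting its flag to false, without touching the set of accepted child elements.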

kosloot · 2021-04-12T14:21:45Z

Ok, but still there is room for rather suspicious constructions like:

      <t>Space,<t-hspace><t-str>test</t-str><t-hbr>what</t-hbr></t-hspace>the<t-hspace/>final<t-hspace/><t-hspace/>frontier</t>

This passes folialint and foliavalidator, and both folia2txt and FoLiA-2text ignore everything inside the <t-hspace>, but
this is still confusing and should be rejected, imho.
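
A rough sketch of how such nested constructions could be rejected, using a toy element tree rather than libfolia's classes. `Node`, `MUST_BE_EMPTY` and `first_violation` are hypothetical names, and the assumption that only <t-hspace> must stay empty is made purely for illustration:

```cpp
#include <set>
#include <string>
#include <vector>

// Toy element tree; not libfolia's data model.
struct Node {
  std::string tag;
  std::vector<Node> children;
};

// Assumption: tags listed here may not contain any child elements.
const std::set<std::string> MUST_BE_EMPTY = { "t-hspace" };

// Return the tag of the first element that must be empty but has children,
// or an empty string when the tree is fine.
std::string first_violation( const Node& node ) {
  if ( MUST_BE_EMPTY.count( node.tag ) > 0 && !node.children.empty() ) {
    return node.tag;
  }
  for ( const auto& child : node.children ) {
    std::string found = first_violation( child );
    if ( !found.empty() ) {
      return found;
    }
  }
  return "";
}
```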

proycon · 2021-04-13T09:42:27Z

Agreed

proycon added question low priority labels Jul 6, 2021
