Fix all-space text events not being trimmed when trim_text_start = false and trim_text_end = true

This is still not a complete fix, because an empty Event::Text can still be generated when it should not be,
and it is hard to prevent the generation of such an event. Moreover, it would be better to remove automatic
trimming completely, because it does not work correctly anyway: in some cases events should not be
trimmed at the boundary of text / CDATA, text / PI, or text / comment.
Mingun committed Jun 14, 2024
1 parent 4c2cc84 commit 28795e1
Showing 3 changed files with 20 additions and 17 deletions.
4 changes: 4 additions & 0 deletions Changelog.md
@@ -14,11 +14,15 @@

 ### Bug Fixes

+- [#755]: Fix incorrect missing of trimming all-space text events when
+  `trim_text_start = false` and `trim_text_end = true`.
+
 ### Misc Changes

 - [#650]: Change the type of `Event::PI` to a new dedicated `BytesPI` type.

 [#650]: https://github.com/tafia/quick-xml/issues/650
+[#755]: https://github.com/tafia/quick-xml/pull/755


 ## 0.32.0 -- 2024-06-10
16 changes: 12 additions & 4 deletions src/reader/mod.rs
@@ -244,13 +244,21 @@ macro_rules! read_event_impl {
         }
         ReadTextResult::UpToMarkup(bytes) => {
             $self.state.state = ParseState::InsideMarkup;
-            // Return Text event with `bytes` content or Eof if bytes is empty
-            Ok($self.state.emit_text(bytes))
+            // FIXME: Can produce an empty event if:
+            // - event contains only spaces
+            // - trim_text_start = false
+            // - trim_text_end = true
+            Ok(Event::Text($self.state.emit_text(bytes)))
         }
         ReadTextResult::UpToEof(bytes) => {
             $self.state.state = ParseState::Done;
-            // Return Text event with `bytes` content or Eof if bytes is empty
-            Ok($self.state.emit_text(bytes))
+            // Trim bytes from end if required
+            let event = $self.state.emit_text(bytes);
+            if event.is_empty() {
+                Ok(Event::Eof)
+            } else {
+                Ok(Event::Text(event))
+            }
         }
         ReadTextResult::Err(e) => Err(Error::Io(e.into())),
     }
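
After this change the two text arms treat an empty trimmed result differently: text that ends at markup is always wrapped in a `Text` event (the case the FIXME describes), while text that ends at end of input collapses into `Eof` when nothing is left. The sketch below illustrates that split in isolation. It is not the crate's code: the `Event` enum, `emit_text`, `up_to_markup`, and `up_to_eof` are simplified, hypothetical stand-ins, trimming is hard-wired as if `trim_text_end = true`, and the whitespace set (space, tab, CR, LF) is an assumption about the crate's private `is_whitespace` helper.

```rust
/// Hypothetical, simplified stand-in for the crate's `Event`
/// (only the two variants relevant to this hunk).
#[derive(Debug, PartialEq)]
enum Event<'a> {
    Text(&'a [u8]),
    Eof,
}

/// Assumed whitespace set: space, tab, CR, LF.
fn is_whitespace(b: u8) -> bool {
    matches!(b, b' ' | b'\t' | b'\r' | b'\n')
}

/// Stand-in for `emit_text` with `trim_text_end = true`:
/// returns the slice with trailing whitespace removed.
fn emit_text(bytes: &[u8]) -> &[u8] {
    let len = bytes
        .iter()
        .rposition(|&b| !is_whitespace(b))
        .map_or(0, |p| p + 1);
    &bytes[..len]
}

/// Text followed by markup: always a `Text` event, even if it is empty.
fn up_to_markup(bytes: &[u8]) -> Event<'_> {
    Event::Text(emit_text(bytes))
}

/// Text followed by end of input: an empty result becomes `Eof`.
fn up_to_eof(bytes: &[u8]) -> Event<'_> {
    let text = emit_text(bytes);
    if text.is_empty() {
        Event::Eof
    } else {
        Event::Text(text)
    }
}
```

Under these assumptions, `up_to_markup(b"   ")` yields an empty `Event::Text`, which is the remaining gap the commit message mentions, whereas `up_to_eof(b"   ")` yields `Event::Eof`.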
17 changes: 4 additions & 13 deletions src/reader/state.rs
@@ -52,31 +52,22 @@ pub(super) struct ReaderState {
 }

 impl ReaderState {
-    /// Trims end whitespaces from `bytes`, if required, and returns a [`Text`]
-    /// event or an [`Eof`] event, if text after trimming is empty.
+    /// Trims end whitespaces from `bytes`, if required, and returns a text event.
     ///
     /// # Parameters
     /// - `bytes`: data from the start of stream to the first `<` or from `>` to `<`
-    ///
-    /// [`Text`]: Event::Text
-    /// [`Eof`]: Event::Eof
-    pub fn emit_text<'b>(&mut self, bytes: &'b [u8]) -> Event<'b> {
+    pub fn emit_text<'b>(&mut self, bytes: &'b [u8]) -> BytesText<'b> {
         let mut content = bytes;

         if self.config.trim_text_end {
             // Skip the ending '<'
             let len = bytes
                 .iter()
                 .rposition(|&b| !is_whitespace(b))
-                .map_or_else(|| bytes.len(), |p| p + 1);
+                .map_or(0, |p| p + 1);
             content = &bytes[..len];
         }
-
-        if content.is_empty() {
-            Event::Eof
-        } else {
-            Event::Text(BytesText::wrap(content, self.decoder()))
-        }
+        BytesText::wrap(content, self.decoder())
     }

     /// reads `BytesElement` starting with a `!`,
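
The change from `map_or_else(|| bytes.len(), ...)` to `map_or(0, ...)` is what makes all-whitespace text trim to nothing: when `rposition` finds no non-whitespace byte, the old fallback kept the whole slice, and the new one keeps none of it. The following is a minimal sketch of the two variants side by side. The function names `trim_end_old` and `trim_end_new` are hypothetical, and the local `is_whitespace` is an assumed equivalent of the crate's private helper (space, tab, CR, LF).

```rust
/// Assumed whitespace set: space, tab, CR, LF.
fn is_whitespace(b: u8) -> bool {
    matches!(b, b' ' | b'\t' | b'\r' | b'\n')
}

/// Pre-fix logic: if every byte is whitespace, `rposition` returns `None`
/// and the fallback keeps the full length, so nothing is trimmed.
fn trim_end_old(bytes: &[u8]) -> &[u8] {
    let len = bytes
        .iter()
        .rposition(|&b| !is_whitespace(b))
        .map_or_else(|| bytes.len(), |p| p + 1);
    &bytes[..len]
}

/// Post-fix logic: if every byte is whitespace, the fallback length is 0,
/// so the result is an empty slice.
fn trim_end_new(bytes: &[u8]) -> &[u8] {
    let len = bytes
        .iter()
        .rposition(|&b| !is_whitespace(b))
        .map_or(0, |p| p + 1);
    &bytes[..len]
}
```

For input that contains any non-whitespace byte the two functions agree; they differ only on all-whitespace input, where `trim_end_old` returns the input unchanged and `trim_end_new` returns an empty slice.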