Parse UTF-16 surrogates pairs for calculating pattern's position #11915
Good catch, thanks!
@msftbot make sure @PankajBhojwani signs off on this one
Hello @zadjii-msft! Because you've given me some instructions on how to help merge this pull request, I'll be modifying my merge approach. Here's how I understand your requirements for merging this pull request:
If this doesn't seem right to you, you can tell me to cancel these instructions and use the auto-merge policy that has been configured for this repository. Try telling me "forget everything I just told you".
@@ -2585,16 +2586,17 @@ PointTree TextBuffer::GetPatterns(const size_t firstRow, const size_t lastRow) c
     // match and the previous match, so we use the size of the prefix
     // along with the size of the match to determine the locations
     size_t prefixSize = 0;

-    for (const auto ch : i->prefix().str())
+    for (const std::vector<wchar_t> parsedGlyph : Utf16Parser::Parse(i->prefix().str()))
I would have left `auto` for the left side here and below... but I'm not going to block over it.
@msftbot merge this in 1 minute
## Summary of the Pull Request

Properly handle UTF-16 surrogates when calculating the position of matched pattern.

Fix #8709

## References

https://github.com/microsoft/terminal/blob/b88ffb21b0725331877ba76bac5a79a4c21eaa03/src/buffer/out/search.cpp#L335-L339

## PR Checklist
* [ ] Closes #8709
* [x] CLA signed. If not, go over [here](https://cla.opensource.microsoft.com/microsoft/Terminal) and sign the CLA
* [ ] Tests added/passed
* [ ] Documentation updated. If checked, please file a pull request on [our docs repo](https://github.com/MicrosoftDocs/terminal) and link it here: #xxx
* [ ] Schema updated.
* [ ] I've discussed this with core contributors already. If not checked, I'm ready to accept this work might be rejected in favor of a different grand plan. Issue number where discussion took place: #xxx

## Detailed Description of the Pull Request / Additional comments

use `Utf16Parser::Parse` to handle code points from U+010000 to U+10FFFF in UTF-16.

## Validation Steps Performed

![image](https://user-images.githubusercontent.com/1068203/145421736-c842c7d4-0136-42d0-ad72-f004f58d9e3b.png)

also the case by @mas90 #8709 (comment):

![image](https://user-images.githubusercontent.com/1068203/145420264-3fe220b4-42c5-44ac-aa94-4e604b164ed3.png)

(cherry picked from commit a2d96d6)
Summary of the Pull Request
Properly handle UTF-16 surrogates when calculating the position of matched pattern.
Fix #8709
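For readers unfamiliar with the underlying issue: iterating a UTF-16 string one code unit at a time counts any character outside the Basic Multilingual Plane twice, because such a character is stored as a lead/trail surrogate pair. The sketch below illustrates that grouping step in isolation. It is not the Terminal's `Utf16Parser`; the helper names `isLeadSurrogate`, `isTrailSurrogate` and `splitIntoCodePoints` are hypothetical, and it uses `char16_t` rather than `wchar_t` so that it behaves the same on every platform.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helpers for illustration only; not part of the Terminal codebase.
constexpr bool isLeadSurrogate(char16_t ch) { return ch >= 0xD800 && ch <= 0xDBFF; }
constexpr bool isTrailSurrogate(char16_t ch) { return ch >= 0xDC00 && ch <= 0xDFFF; }

// Groups UTF-16 code units so that each element holds one code point:
// a valid lead/trail surrogate pair becomes a single two-unit element,
// everything else (including an unpaired surrogate) stays a one-unit element.
std::vector<std::u16string> splitIntoCodePoints(std::u16string_view text)
{
    std::vector<std::u16string> glyphs;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        if (isLeadSurrogate(text[i]) && i + 1 < text.size() && isTrailSurrogate(text[i + 1]))
        {
            glyphs.emplace_back(text.substr(i, 2));
            ++i; // skip the trail surrogate that was just consumed
        }
        else
        {
            glyphs.emplace_back(text.substr(i, 1));
        }
    }
    return glyphs;
}
```

With a prefix such as `u"a\U0001F600b"`, a per-code-unit loop sees four elements while the grouped version sees three, which is the kind of discrepancy that shifts a computed pattern position.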
References
terminal/src/buffer/out/search.cpp
Lines 335 to 339 in b88ffb2
PR Checklist
Detailed Description of the Pull Request / Additional comments
Use `Utf16Parser::Parse` to handle code points from U+010000 to U+10FFFF in UTF-16.
Validation Steps Performed
Also covers the case reported by @mas90 in #8709 (comment).