Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Markdown: turn plain URLs into links #2884

Merged
merged 3 commits into from
Nov 7, 2023
Merged

Conversation

danyx23
Copy link
Contributor

@danyx23 danyx23 commented Nov 1, 2023

By default, plain link like https://domain.com are not shown as links in markdown (see commonmark spec) and instead have to be written in the [text](https://domain.com) style.

This PR adds autolinking to plain links. This is done in two different way

  • once for the SimpleMarkdown component that uses react-remark via a remark transformation plugin
  • and once for the MarkdownTextWrap component by doing the translation as part of the mapping to IRNodes

In theory it should be possible to use the remark plugin in both cases, but navigating the set of interdependent libraries at the state two years ago is a pain and so I went for the two bespoke versions. When we upgrade our setup to be ESM everywhere we can upgrade to the latest versions of the unified, react-remark et al libraries and try again to use the same code in both cases.

@danyx23 danyx23 force-pushed the 11-01-markdown-plain-urls branch from bf6380d to 165afaa Compare November 2, 2023 09:03
@danyx23 danyx23 changed the title markdown-plain-urls Markdown: turn plain URLs into links Nov 2, 2023
@danyx23 danyx23 force-pushed the 11-01-markdown-plain-urls branch from 165afaa to ffed23d Compare November 2, 2023 11:43
@danyx23 danyx23 marked this pull request as ready for review November 2, 2023 12:25
@@ -0,0 +1,23 @@
import findAndReplace from "mdast-util-find-and-replace"

export const urlRegex = /https?:\/\/([\w-]+\.)+[\w-]+(\/[\w\- .\+/?:%&=~#]*)?/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, now that I think of it, including .?: can be annoying in cases like Learn more at https://ourworldindata.com/poverty..

I know that some other sites do something like "match .?: unless they are at the very end of the URL", which seems sensible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I took this opportunity to also make sure we match urls with unicode word characters in the path.

AFAIK to properly match this with a regex you need to use negative lookbehind assertions and they are not supported in Safari before 16.4 which came out in Spring this year so I instead handle trailing punctuation characters in the replacement function.

@danyx23 danyx23 force-pushed the switch-markdown-renderer branch from 01111b6 to 3e3744f Compare November 7, 2023 14:36
@danyx23 danyx23 force-pushed the 11-01-markdown-plain-urls branch 3 times, most recently from c4ea834 to d8387dc Compare November 7, 2023 19:18
Copy link
Contributor Author

danyx23 commented Nov 7, 2023

Merge activity

@danyx23 danyx23 force-pushed the switch-markdown-renderer branch from 531cc39 to 3863877 Compare November 7, 2023 21:02
@danyx23 danyx23 force-pushed the 11-01-markdown-plain-urls branch from d8387dc to 944d5b2 Compare November 7, 2023 21:02
Base automatically changed from switch-markdown-renderer to master November 7, 2023 21:03
@danyx23 danyx23 merged commit 0541da5 into master Nov 7, 2023
@danyx23 danyx23 deleted the 11-01-markdown-plain-urls branch November 7, 2023 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants