I built a feed to spot Bluesky posts with DOIs—now I need your help to make the RegEx more accurate.

My RegEx is so basic. Help!

September 29, 2025

I’ve built a feed that surfaces Bluesky posts containing DOIs. It uses regex to detect DOI links, but I’m not sure my patterns are robust enough to catch all valid variations.

Can you help?

The Feed

Latest Academic DOIs

A feed of the newest Bluesky posts containing DOIs—helping researchers quickly spot fresh academic references.

Bluesky

https://bsky.app/profile/did:plc:s2rczyxit2v5vzedxqs326ri/feed/aaajnmvcrx3k6

Regex Strings

And here are the two regex strings in use:

https://doi.org/10.\d{4,9}/[-._;()/:A-Za-z0-9]+

10.\d{4,9}/[-._;()/:A-Za-z0-9]+

They work for most common DOI formats, but DOIs can be tricky, and I may be missing edge cases. If you’re familiar with regex and DOI formats, I’d really appreciate your feedback or suggestions for improvements.
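One edge case worth testing first: in both patterns the `.` after `10` (and the one in `doi.org`) is unescaped, so it matches any character rather than a literal dot. Below is a minimal Python sketch of a tightened version. It is not the feed's actual implementation: `DOI_RE` and `extract_dois` are hypothetical names, and the trailing-punctuation cleanup is an assumption about how DOIs tend to appear at the end of a sentence or inside parentheses.

```python
import re

# Same character class as the patterns above, but with the dot escaped
# ("10\." matches a literal dot; "10." matches "10" plus any character)
# and a word boundary so the match cannot start in the middle of a number.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def extract_dois(text: str) -> list[str]:
    """Return DOI-like strings found in text (hypothetical helper)."""
    found = []
    for match in DOI_RE.finditer(text):
        doi = match.group(0)
        # Assumption: sentence punctuation at the very end is not part of the DOI.
        doi = doi.rstrip(".,;:")
        # Drop a closing parenthesis only when it has no opening partner,
        # since DOIs may legitimately contain balanced parentheses.
        while doi.endswith(")") and doi.count(")") > doi.count("("):
            doi = doi[:-1].rstrip(".,;:")
        found.append(doi)
    return found


if __name__ == "__main__":
    # Made-up DOIs for illustration only.
    print(extract_dois("See https://doi.org/10.1000/xyz123."))
    print(extract_dois("(10.1234/abc(05)def)"))
    print(extract_dois("1051234/abc"))
```

Because the bare pattern also matches the DOI portion of a `https://doi.org/...` link, a single escaped pattern may be enough, depending on whether the feed tool matches against post text, link targets, or both.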

More Info

Here are two articles I used to inspire these strings:

Finding a DOI in a document or page

The DOI system places basically no useful limitations on what constitutes a reasonable identifier. However, being able to pull DOIs out of PDFs, web pages, etc. is quite useful for citation informa...

https://stackoverflow.com/questions/27910/finding-a-doi-in-a-document-or-page

Finding and Validating DOIs with Regex

To find a DOI (Digital Object Identifier) in a document or webpage using regex (regular expressions), you can create a pattern that matches the standard format of DOIs. A typical DOI looks like this: `10.1000/xyz123`.

https://stackhub.net/manuals/finding-and-validating-dois-with-regex
