vinnie

Reputation: 363

Twitter status URL regex

I have an existing regex:

/^http:\/\/twitter\.com\/(\w+)\/status(es)*\/(\d+)$/

that I use to determine whether a URL is a Twitter status update URL, e.g.

http://twitter.com/allWPthemes/status/2040410213974016

But ever since "new" Twitter came out, the status URLs have looked like this:

http://twitter.com/#!/allWPthemes/status/2040410213974016

with the added /#!

So my question is: how can I modify my regex to match both URLs?

My final failed attempt was:

^http:\/\/twitter\.com\/(#!\/w+|\w+)\/status(es)*\/(\d+)$

Upvotes: 8

Views: 6581

Answers (6)

zax

Reputation: 946

2023 update: With the new ownership, it looks like nowadays you'll have to use

(?:https?:\/\/(?:twitter|x)\.com)(\/(?:#!\/)?(\w+)\/status(es)?\/(\d+))

and then prepend https://x.com to match group 1 if you find a match, to ensure forwards and backwards compatibility.
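
A minimal Python sketch of that normalization step, treating the pattern as a non-capturing host part followed by the captured path. The helper name `to_x_url` is made up for illustration, and Python is only one possible choice of language here:

```python
import re

# Host part is non-capturing; group 1 is the whole path starting at "/".
STATUS_URL = re.compile(
    r"(?:https?:\/\/(?:twitter|x)\.com)(\/(?:#!\/)?(\w+)\/status(es)?\/(\d+))"
)

def to_x_url(url):
    """Hypothetical helper: rewrite a twitter.com or x.com status URL onto https://x.com."""
    m = STATUS_URL.search(url)
    if m is None:
        return None
    # Prepend the new host to match group 1, as described above.
    return "https://x.com" + m.group(1)

print(to_x_url("https://twitter.com/USATODAY/status/982270433385824260?s=19"))
# https://x.com/USATODAY/status/982270433385824260
```

Because the pattern is not anchored at the end, anything after the status ID (such as the `?s=19` query string) is dropped from the result.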

Upvotes: 0

webolizzer

Reputation: 335

The accepted answer will not match shared Twitter URLs such as https://twitter.com/USATODAY/status/982270433385824260?s=19, because of the end-of-string anchor "$".

// working solution
/^https?:\/\/twitter\.com\/(?:#!\/)?(\w+)\/status(es)?\/(\d+)/

test: https://regex101.com/r/mNsp3o/4
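
To see the difference, here is a small Python sketch that compares the `$`-anchored pattern with the unanchored one on the shared URL. The variable names are my own; the patterns are copied verbatim, since Python's `re` accepts the escaped slashes:

```python
import re

# Same pattern twice: once with the trailing "$" and once without it.
ANCHORED = re.compile(r"^https?:\/\/twitter\.com\/(?:#!\/)?(\w+)\/status(es)?\/(\d+)$")
UNANCHORED = re.compile(r"^https?:\/\/twitter\.com\/(?:#!\/)?(\w+)\/status(es)?\/(\d+)")

shared = "https://twitter.com/USATODAY/status/982270433385824260?s=19"

# The "?s=19" suffix sits between the digits and the end of the string,
# so the anchored pattern finds nothing.
print(ANCHORED.match(shared))  # None

m = UNANCHORED.match(shared)
print(m.group(1), m.group(3))  # USATODAY 982270433385824260
```

One trade-off to keep in mind: without the anchor the pattern accepts any trailing text after the digits, not just a query string.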

Upvotes: 6

zurfyx

Reputation: 32767

An updated version of @Kevin's answer:

^https?:\/\/twitter\.com\/(?:#!\/)?(\w+)\/status(?:es)?\/(\d+)(?:\/.*)?$

Matches both:

https://twitter.com/someone/status/866002913604149248
https://twitter.com/someone/status/857179125076963329/video/1

You can test them yourself here:

https://regex101.com/r/mNsp3o/3

Upvotes: 1

Kevin

Reputation: 3841

Try this: /^https?:\/\/twitter\.com\/(?:#!\/)?(\w+)\/status(es)?\/(\d+)$/

This will match both the original URLs and the new hashbang (#!) URLs.

If you just want to match the new URLs, this should do it: /^https?:\/\/twitter\.com\/#!\/(\w+)\/status(es)?\/(\d+)$/
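
As a quick check, here is a Python sketch that runs the first pattern against both URL styles from the question. The function name `parse_status_url` is made up for illustration:

```python
import re

# First pattern from this answer, copied verbatim (Python's re accepts the escaped slashes).
STATUS_URL = re.compile(r"^https?:\/\/twitter\.com\/(?:#!\/)?(\w+)\/status(es)?\/(\d+)$")

def parse_status_url(url):
    """Return (username, status_id) for a status URL, or None if it does not match."""
    m = STATUS_URL.match(url)
    if m is None:
        return None
    # Group 1 is the username, group 2 the optional "es", group 3 the numeric ID.
    return m.group(1), m.group(3)

old_style = parse_status_url("http://twitter.com/allWPthemes/status/2040410213974016")
new_style = parse_status_url("http://twitter.com/#!/allWPthemes/status/2040410213974016")
print(old_style)  # ('allWPthemes', '2040410213974016')
print(new_style)  # ('allWPthemes', '2040410213974016')
```

Since `(?:#!\/)?` is non-capturing, the group numbers are the same for both URL styles, so code that reads the username and ID does not need to care which form was matched.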

Upvotes: 14

tchrist

Reputation: 80384

Ewww! ☺ Don’t use slashes as the regex quoting delimiter when you have slashes inside that would therefore need backwhacking. Otherwise you get icky LTS (Leaning Toothpick Syndrome) and an infectious case of backslashitis to boot.

Something like this is much better:

    m~http://twitter.com/(#!/)?\w+/status(es)?/(\d+)$~

or

    m{http://twitter.com/(#!/)?\w+/status(es)?/(\d+)$}

or if you don’t need to capture portions:

    m{http://twitter.com/(?:#!/)?\w+/status(?:es)?/(?:\d+)$}

or if you want to make it readable:

    m{ http:// twitter.com / ( \x23 ! / )? \w+ / status (es)? / (\d+) $ }x

which is even better when broken up across multiple lines so you can comment it:

    m{ 
           http:
        // twitter.com
        /  ( \x23 ! / )?       # optional new "#!" element
           \w+ 
        / status (es)?         # "status" or "statuses"
        / ( \d+ ) 
          $
     }x
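
The same idea carries over to other regex flavours. As a rough sketch in Python (not Perl), `re.VERBOSE` plays a role similar to the `/x` flag; the variable name, the added `^`, `https?` and escaped dot are my own choices rather than part of the patterns above:

```python
import re

# re.VERBOSE ignores unescaped whitespace and treats "#" as a comment start,
# so the literal hash in "#!" has to be written as "\#".
STATUS_URL = re.compile(r"""
    ^ https?://
    twitter\.com /
    (?: \# ! / )?        # optional "#!" element of "new" Twitter URLs
    (\w+) /              # username
    status (?:es)? /     # "status" or "statuses"
    (\d+)                # numeric status ID
    $
""", re.VERBOSE)

for url in (
    "http://twitter.com/allWPthemes/status/2040410213974016",
    "http://twitter.com/#!/allWPthemes/status/2040410213974016",
):
    print(STATUS_URL.match(url).groups())
# ('allWPthemes', '2040410213974016') for both URLs
```

A raw string also means the slashes need no escaping at all, which avoids the leaning toothpicks in the first place.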

Upvotes: 2

Gavin Miller

Reputation: 43815

Your solution is pretty close. You can simply add the #!/ as an optional element like this:

(#!\/)?

So the full regex would look like this:

/^http:\/\/twitter\.com\/(#!\/)?(\w+)\/status(es)*\/(\d+)$/

Upvotes: 2
