Reputation: 1690
Given the following Facebook profile and page URLs, my intent is to extract profile ids or usernames into the first match position.
http://www.facebook.com/profile.php?id=123456789
http://www.facebook.com/someusername
www.facebook.com/pages/Regular-Expressions/207279373093
The regex I have so far looks like this:
(?:http:\/\/)?(?:www.)?facebook.com\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[?\w\-]*\/)?(?:profile.php\?id=(\d.*))?([\w\-]*)?
Which produces the following results:
Result 1:
Result 2:
Result 3:
The ideal outcome would look like:
Result 1:
Result 2:
Result 3:
That is to say, I'd like to have the profile identifier to always be returned in the first position.
It would also be ideal if www.facebook.com/ and facebook.com/ didn't match either.
Upvotes: 25
Views: 54412
Reputation: 338
I am not sure why I got so invested in this, since it is not something I need or lack a solution for. Still, I put together a stricter solution that combines all of the rules and also rejects consecutive periods and profile names shorter than 5 characters.
Check out my regex here: https://regex101.com/r/6OQYWr/2
(See the unit tests section there for all of the cases it handles.)
Upvotes: 0
Reputation: 557
I've tried every single answer above and each one doesn't work for at least one reason. This most likely won't be helpful to OP, but if anybody like me finds this in a web search, I believe this is the correct answer:
^(?:.*)\/(?:pages\/[A-Za-z0-9-]+\/)?(?:profile\.php\?id=)?([A-Za-z0-9.]+)
Supports basically everything I can think of, except verifying that the domain contains facebook.com. If you need to check if the URL is valid, this should be done outside of a regular expression to make sure the page or profile actually exists. Why check it twice, especially when one of the checks is incomplete?
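As a rough check, the pattern above can be run against the three URLs from the question. This is only a minimal sketch using Python's re module; the names PATTERN and extract_id are illustrative and not part of the answer.

```python
import re

# Pattern copied from the answer above; group 1 is the id or username.
PATTERN = re.compile(
    r"^(?:.*)\/(?:pages\/[A-Za-z0-9-]+\/)?(?:profile\.php\?id=)?([A-Za-z0-9.]+)"
)

def extract_id(url):
    """Return the captured identifier, or None when nothing matches."""
    match = PATTERN.match(url)
    return match.group(1) if match else None

urls = [
    "http://www.facebook.com/profile.php?id=123456789",
    "http://www.facebook.com/someusername",
    "www.facebook.com/pages/Regular-Expressions/207279373093",
]
for url in urls:
    print(url, "=>", extract_id(url))
```

Because the pattern never looks at the domain, any host check has to happen separately, as the answer suggests.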
Upvotes: 6
Reputation: 947
This regex correctly identifies profile pages with a period in the name, such as www.facebook.com/my.name, and it excludes www.facebook.com/ and home.php, since neither is a valid Facebook page.
https://regex101.com/r/koN8C2/2
(?:(?:http|https):\/\/)?(?:www.|m.)?facebook.com\/(?!home.php)(?:(?:\w)*#!\/)?(?:pages\/)?(?:[?\w\-]*\/)?(?:profile.php\?id=(?=\d.*))?([\w\.-]+)
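The claims above can be spot-checked with a short Python sketch. The pattern is copied as written (including its unescaped dots); PATTERN_DOTTED and dotted_id are illustrative names, and the test URLs other than www.facebook.com/my.name are assumptions chosen for the check.

```python
import re

# Pattern copied from the answer above, split across lines for readability.
PATTERN_DOTTED = re.compile(
    r"(?:(?:http|https):\/\/)?(?:www.|m.)?facebook.com\/(?!home.php)"
    r"(?:(?:\w)*#!\/)?(?:pages\/)?(?:[?\w\-]*\/)?"
    r"(?:profile.php\?id=(?=\d.*))?([\w\.-]+)"
)

def dotted_id(url):
    """Return group 1 of the first match, or None if the URL is rejected."""
    match = PATTERN_DOTTED.search(url)
    return match.group(1) if match else None

print(dotted_id("www.facebook.com/my.name"))    # my.name
print(dotted_id("www.facebook.com/"))           # None
print(dotted_id("www.facebook.com/home.php"))   # None
print(dotted_id("http://www.facebook.com/profile.php?id=123456789"))  # 123456789
```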
Let me know if you find any that are not matched.
Upvotes: 0
Reputation: 93
Only this regular expression works correctly for all FB URLs:
/(?:https?:\/\/)?(?:www\.)?(?:facebook|fb|m\.facebook)\.(?:com|me)\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*([\w\-\.]+)(?:\/)?/i
Upvotes: 9
Reputation: 5732
I'd recommend Rad Software Regular Expression Designer.
This online tool is also great: https://regex101.com/ (though most people prefer http://regexr.com/)
(?:(?:http|https):\/\/)?(?:www.)?facebook.com\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[?\w\-]*\/)?(?:profile.php\?id=(?=\d.*))?([\w\-]*)?
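Compared with the regex in the question, this pattern turns the capturing (\d.*) after profile.php?id= into the lookahead (?=\d.*), so only one capturing group remains and the numeric id falls through to it. A minimal Python sketch of that behaviour follows; PATTERN_LOOKAHEAD and first_position are illustrative names, not part of the answer.

```python
import re

# Pattern copied from the answer above, split across lines for readability.
PATTERN_LOOKAHEAD = re.compile(
    r"(?:(?:http|https):\/\/)?(?:www.)?facebook.com\/(?:(?:\w)*#!\/)?"
    r"(?:pages\/)?(?:[?\w\-]*\/)?(?:profile.php\?id=(?=\d.*))?([\w\-]*)?"
)

def first_position(url):
    """Return group 1 of the first match, or None when there is no match."""
    match = PATTERN_LOOKAHEAD.search(url)
    return match.group(1) if match else None

# The lookahead checks for digits without capturing them,
# so the pattern has exactly one capturing group.
print(PATTERN_LOOKAHEAD.groups)  # 1

print(first_position("http://www.facebook.com/profile.php?id=123456789"))  # 123456789
print(first_position("http://www.facebook.com/someusername"))              # someusername
print(first_position("www.facebook.com/pages/Regular-Expressions/207279373093"))  # 207279373093
```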
Upvotes: 23
Reputation: 17
This works well for me. It detects personal profile URLs and excludes all fan pages and groups.
.+www.facebook.com\/[^\/]+$
Upvotes: -2
Reputation: 6858
Matches facebook.com, m.facebook.com, mbasic.facebook.com and fb.me (short link)
/(?:https?:\/\/)?(?:www\.)?(mbasic.facebook|m\.facebook|facebook|fb)\.(com|me)\/(?:(?:\w\.)*#!\/)?(?:pages\/)?(?:[\w\-\.]*\/)*([\w\-\.]*)/ig
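One thing to keep in mind with this pattern is that the host and the top-level domain are captured too, so the identifier ends up in group 3 rather than group 1. The sketch below shows this in Python, with re.IGNORECASE standing in for the i flag (the g flag has no direct equivalent and is not needed for a single search). The names PATTERN_MULTI and parts are illustrative, and the sample URLs are assumptions.

```python
import re

# Pattern copied from the answer above, without the /.../ig delimiters.
PATTERN_MULTI = re.compile(
    r"(?:https?:\/\/)?(?:www\.)?(mbasic.facebook|m\.facebook|facebook|fb)"
    r"\.(com|me)\/(?:(?:\w\.)*#!\/)?(?:pages\/)?(?:[\w\-\.]*\/)*([\w\-\.]*)",
    re.IGNORECASE,
)

def parts(url):
    """Return (host, tld, identifier) for the first match, or None."""
    match = PATTERN_MULTI.search(url)
    return match.groups() if match else None

print(parts("https://mbasic.facebook.com/someusername"))
# ('mbasic.facebook', 'com', 'someusername')
print(parts("fb.me/someusername"))
# ('fb', 'me', 'someusername')
print(parts("www.facebook.com/pages/Regular-Expressions/207279373093"))
# ('facebook', 'com', '207279373093')
```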
Upvotes: 4
Reputation: 8192
The most complete pattern for a Facebook profile URL:
/(?:https?:\/\/)?(?:www\.)?facebook\.com\/.(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*([\w\-\.]*)/
It handles all of the cases, with one important difference. Other regex patterns accept http://www.facebook.com/ as a valid profile URL, although it is only the Facebook home page and not a user or page address. This regex distinguishes a plain URL from a profile or page URL and accepts only the latter.
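A quick way to check this is to run the pattern under Python's re module. In the sketch below the pattern does reject the bare URL, but the "." right after facebook\.com\/ (which is what forces at least one character) also consumes the first character of the identifier, so group 1 comes back without it. PATTERN_PLUS is a hypothetical variant, an assumption and not part of the answer: it drops that "." and instead requires a non-empty capture with "+". All names here are illustrative.

```python
import re

# Pattern copied from the answer above.
PATTERN_DOT = re.compile(
    r"(?:https?:\/\/)?(?:www\.)?facebook\.com\/.(?:(?:\w)*#!\/)?"
    r"(?:pages\/)?(?:[\w\-]*\/)*([\w\-\.]*)"
)

# Hypothetical variant (an assumption, not from the answer): no "." after
# the slash, and "+" instead of "*" so the capture must be non-empty.
PATTERN_PLUS = re.compile(
    r"(?:https?:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?"
    r"(?:pages\/)?(?:[\w\-]*\/)*([\w\-\.]+)"
)

def first_group(pattern, url):
    """Return group 1 of the first match, or None when there is no match."""
    match = pattern.search(url)
    return match.group(1) if match else None

home = "http://www.facebook.com/"
profile = "http://www.facebook.com/someusername"

print(first_group(PATTERN_DOT, home))      # None
print(first_group(PATTERN_DOT, profile))   # omeusername (first letter lost)
print(first_group(PATTERN_PLUS, home))     # None
print(first_group(PATTERN_PLUS, profile))  # someusername
```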
Upvotes: 5
Reputation: 66465
I made a gist a while back that works fine against the given examples:
# Matches patterns such as:
# http://www.facebook.com/my_page_id => my_page_id
# http://www.facebook.com/#!/my_page_id => my_page_id
# http://www.facebook.com/pages/Paris-France/Vanity-Url/123456?v=app_555 => 123456
# http://www.facebook.com/pages/Vanity-Url/45678 => 45678
# http://www.facebook.com/#!/page_with_1_number => page_with_1_number
# http://www.facebook.com/bounce_page#!/pages/Vanity-Url/45678 => 45678
# http://www.facebook.com/bounce_page#!/my_page_id?v=app_166292090072334 => my_page_id
/(?:http:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?(?:pages\/)?(?:[\w\-]*\/)*([\w\-]*)/
To get the latest version: https://gist.github.com/733592
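The listed examples can be turned into a small self-check. This is a sketch in Python rather than the gist's own language; PATTERN_GIST, page_id, and CASES are illustrative names, and the expected values are what the pattern captures in group 1 under Python's re module.

```python
import re

# Pattern copied from the answer above, without the /.../ delimiters.
PATTERN_GIST = re.compile(
    r"(?:http:\/\/)?(?:www\.)?facebook\.com\/(?:(?:\w)*#!\/)?"
    r"(?:pages\/)?(?:[\w\-]*\/)*([\w\-]*)"
)

def page_id(url):
    """Return group 1 of the first match, or None when there is no match."""
    match = PATTERN_GIST.search(url)
    return match.group(1) if match else None

# URL => identifier captured by the pattern.
CASES = {
    "http://www.facebook.com/my_page_id": "my_page_id",
    "http://www.facebook.com/#!/my_page_id": "my_page_id",
    "http://www.facebook.com/pages/Paris-France/Vanity-Url/123456?v=app_555": "123456",
    "http://www.facebook.com/pages/Vanity-Url/45678": "45678",
    "http://www.facebook.com/#!/page_with_1_number": "page_with_1_number",
    "http://www.facebook.com/bounce_page#!/pages/Vanity-Url/45678": "45678",
    "http://www.facebook.com/bounce_page#!/my_page_id?v=app_166292090072334": "my_page_id",
}

for url, expected in CASES.items():
    print(url, "=>", page_id(url), "(ok)" if page_id(url) == expected else "(mismatch)")
```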
Upvotes: 10