Reputation: 685
I have a URL string like "https://example.com". I want to extract the parts of this URL, such as the protocol, domain, and extension. How can I do this using a regular expression?
Upvotes: 0
Views: 361
Reputation: 7777
In Ruby, I have used something like this:
user:~/workspace $ irb
2.3.4 :018 > url = "https://example.com"
=> "https://example.com"
2.3.4 :019 > u = url.match(/(?<protocol>[\w]+):\/\/(?<domain>[\w-]+)\.(?<extension>\w+)/)
=> #<MatchData "https://example.com" protocol:"https" domain:"example" extension:"com">
2.3.4 :020 > u[:protocol]
=> "https"
2.3.4 :021 > u[:domain]
=> "example"
2.3.4 :022 > u[:extension]
=> "com"
If the URL also has a subdomain, use a regular expression like the one below:
2.3.4 :034 > url = "https://sub.example.com"
2.3.4 :035 > u = url.match(/(?<protocol>[\w]+):\/\/(?<domain>[[a-zA-Z0-9]\.-]+)\.(?<extension>\w+)/)
=> #<MatchData "https://sub.example.com" protocol:"https" domain:"sub.example" extension:"com">
2.3.4 :036 > u[:protocol]
=> "https"
2.3.4 :037 > u[:domain]
=> "sub.example"
2.3.4 :038 > u[:extension]
=> "com"
I have created a snippet on http://rubular.com/ for testing the regular expression, which does not fail when the URL has a subdomain; see this Rubular.
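The two patterns above can be folded into one small helper. This is only a sketch: the method name split_url and the constant URL_PATTERN are made up for illustration, and the pattern treats whatever follows the last dot of the host as the extension, so a multi-part suffix such as "co.uk" would yield "uk".

```ruby
# Hypothetical helper; split_url and URL_PATTERN are illustrative names.
# The domain group is greedy, so it absorbs any subdomains and leaves
# only the text after the last dot of the host for the extension group.
URL_PATTERN = %r{\A(?<protocol>\w+)://(?<domain>[a-zA-Z0-9.-]+)\.(?<extension>[a-zA-Z0-9]+)}

def split_url(url)
  match = URL_PATTERN.match(url)
  return nil unless match # no protocol://host.ext prefix found

  {
    protocol: match[:protocol],
    domain: match[:domain],
    extension: match[:extension]
  }
end

parts = split_url("https://sub.example.com")
puts parts[:protocol]   # https
puts parts[:domain]     # sub.example
puts parts[:extension]  # com
```

Returning nil for a string that does not match keeps the caller from calling [] on a failed match.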
Upvotes: 1
Reputation: 2872
You could easily use Ruby's built-in URI class for this:
require 'uri'

uri = URI("http://www.example.com")
uri.scheme # => "http"
uri.host   # => "www.example.com"
See also: http://ruby-doc.org/stdlib-2.0.0/libdoc/uri/rdoc/URI.html
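URI gives the scheme and host but has no notion of an "extension", so that part still has to be split off the host by hand. A minimal sketch, assuming the extension is simply the text after the last dot of the host; the helper name url_parts is hypothetical and not part of the URI library.

```ruby
require 'uri'

# Hypothetical helper; url_parts is an illustrative name, not a URI method.
def url_parts(url)
  uri = URI.parse(url)
  host = uri.host
  return nil if host.nil? # e.g. "example.com" without a scheme has no host

  # rpartition splits on the last dot:
  # "www.example.com" -> ["www.example", ".", "com"]
  domain, _dot, extension = host.rpartition(".")
  { protocol: uri.scheme, domain: domain, extension: extension }
rescue URI::InvalidURIError
  nil # the string could not be parsed as a URI at all
end

parts = url_parts("http://www.example.com")
puts parts[:protocol]   # http
puts parts[:domain]     # www.example
puts parts[:extension]  # com
```

For a host with no dot at all, such as "localhost", rpartition leaves the domain empty and puts the whole host in the extension, so that case may need its own handling.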
Upvotes: 2