Reputation: 35572
Here are my test cases.
Expected:
JUNKINFRONThttp://francium.tech should be http://francium.tech
JUNKINFRONThttp://francium.tech/http should be http://francium.tech/http
francium.tech/http should be francium.tech/http (unaffected)
Actual result:
http://francium.tech
francium.tech/http
http
I am trying to write a regex replacement for this. I tried the following:
text.sub(/.*http/,'http')
However, my second and third test cases fail because .* is greedy and matches up to the last http in the string. It would help if the answer could also handle case insensitivity.
2.5.0 :001 > url = 'francium.tech/http'
=> "francium.tech/http"
2.5.0 :002 > url.sub(/^.*?(?=http)/i,'')
=> "http"
Upvotes: 0
Views: 224
Reputation: 22837
As per my original comments, you can use the first pattern below. For a very small performance gain, the second pattern saves one step in the regex. If you are especially concerned with performance, the third is quicker still.
^.*?(?=https?://)
^.*?(?=https?:/{2})
^.*?(?=ht{2}ps?:/{2})
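If you want to check the difference on your own machine, here is a rough sketch using Ruby's standard Benchmark module. The sample strings come from the question; the names PATTERNS, SAMPLES, RESULTS and strip_all, as well as the iteration count, are arbitrary choices for illustration. Timings will vary with Ruby version and hardware, so treat any gap as indicative at best.

```ruby
require 'benchmark'

# The three lookahead patterns listed above, keyed by a short label.
PATTERNS = {
  'https?://'     => %r{^.*?(?=https?://)},
  'https?:/{2}'   => %r{^.*?(?=https?:/{2})},
  'ht{2}ps?:/{2}' => %r{^.*?(?=ht{2}ps?:/{2})}
}.freeze

# Test strings taken from the question.
SAMPLES = [
  'JUNKINFRONThttp://francium.tech',
  'JUNKINFRONThttp://francium.tech/http',
  'francium.tech/http'
].freeze

# Apply one pattern to every sample and return the cleaned strings.
def strip_all(pattern, samples)
  samples.map { |s| s.sub(pattern, '') }
end

# Sanity check first: all three patterns should give identical results.
RESULTS = PATTERNS.transform_values { |re| strip_all(re, SAMPLES) }

# Then time each pattern over the same workload (20_000 is arbitrary).
PATTERNS.each do |label, re|
  seconds = Benchmark.realtime { 20_000.times { strip_all(re, SAMPLES) } }
  puts format('%-14s %.4fs', label, seconds)
end
```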
strings = [
"JUNKINFRONThttp://francium.tech",
"JUNKINFRONThttp://francium.tech/http",
"francium.tech/http"
]
strings.each { |s| puts s.sub(%r{^.*?(?=https?://)}, '') }
Outputs the following:
http://francium.tech
http://francium.tech/http
francium.tech/http
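The question also asked about case insensitivity. A minimal sketch of that, assuming it is enough to add the i flag to the first pattern; the helper name strip_leading_junk and the mixed-case inputs are made up for illustration:

```ruby
# Same lookahead as the first pattern, with the i flag so HTTP, Http, etc. also match.
SCHEME_AHEAD = %r{^.*?(?=https?://)}i

# Hypothetical helper: drop everything before the first http:// or https://.
def strip_leading_junk(text)
  text.sub(SCHEME_AHEAD, '')
end

[
  'JUNKINFRONTHTTP://francium.tech',
  'junkHttps://francium.tech/http',
  'francium.tech/HTTP'
].each { |s| puts strip_leading_junk(s) }
```

The scheme keeps whatever casing it had in the input; only the junk in front of it is removed. A string with no scheme followed by :// is returned unchanged.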
Upvotes: 2
Reputation: 2586
When using a regex, make sure to anchor on a distinctive string such as http://
or, better, http://[SOMETHING].[AT_LEAST_TWO_CHARS][MAYBE_A_SLASH]
and so on.
This works for your given cases:
urls = ['JUNKINFRONThttp://francium.tech',
        'JUNKINFRONThttp://francium.tech/http',
        'francium.tech/http']
# Use a distinct block variable so it does not shadow the array.
urls.each do |url|
  puts url.sub(/^.*?(https?:\/{2})/, '\1') # with capturing group
  puts url.sub(/^.*?(?=https?:\/{2})/, '') # with positive lookahead
end
With a capturing group, the matched scheme can be reused in the replacement. The alternative is a positive lookahead, which leaves the scheme out of the match so the replacement can be an empty string.
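To illustrate the stricter "[SOMETHING].[AT_LEAST_TWO_CHARS][MAYBE_A_SLASH]" idea, here is one possible sketch. The exact character classes are my own assumption of what a host might look like, not a full URL validator, and the names STRICT_URL_AHEAD, LOOSE_URL_AHEAD and the sample sentence are invented for the example:

```ruby
# Loose: any http:// or https:// is enough to stop stripping.
LOOSE_URL_AHEAD  = %r{^.*?(?=https?://)}i

# Strict: the scheme must be followed by something that looks like host.tld,
# i.e. non-space/non-slash characters, a dot, at least two letters, maybe a slash.
STRICT_URL_AHEAD = %r{^.*?(?=https?://[^\s/]+\.[a-z]{2,}/?)}i

text = 'see http://, then http://francium.tech'

puts text.sub(LOOSE_URL_AHEAD, '')   # stops at the bare "http://" with no host
puts text.sub(STRICT_URL_AHEAD, '')  # skips it and stops at http://francium.tech

# The question's cases behave the same under the strict pattern.
puts 'JUNKINFRONThttp://francium.tech/http'.sub(STRICT_URL_AHEAD, '')
puts 'francium.tech/http'.sub(STRICT_URL_AHEAD, '')
```

The trade-off is that the strict version will not match hosts without a dot, such as http://localhost.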
Upvotes: 2
Reputation: 134
I think this may solve your problem.
str1 = 'JUNKINFRONThttp://francium.tech'       # should be http://francium.tech
str2 = 'JUNKINFRONThttp://francium.tech/http'  # should be http://francium.tech/http
str3 = 'francium.tech/http'                    # should be francium.tech/http (unaffected)
str4 = 'JUNKINFRONThttps://francium.tech/http' # should be https://francium.tech/http
[str1, str2, str3, str4].each do |str|
puts str.gsub(/^.*(http|https):\/\//i, "\\1://")
end
Result:
http://francium.tech
http://francium.tech/http
francium.tech/http
https://francium.tech/http
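One thing to be aware of: the .* in this pattern is greedy, so if the string contains more than one URL, everything up to the last scheme is removed. The sketch below compares it with a lazy .*? variant. The input string with a second URL in the query string is made up for illustration, and the constant names are hypothetical.

```ruby
# Greedy version, as used above: backtracks from the end, so the LAST scheme wins.
GREEDY = /^.*(http|https):\/\//i

# Lazy variant: stops at the FIRST scheme it can find.
LAZY = /^.*?(https?):\/\//i

# Hypothetical input with a second URL embedded in the query string.
nested = 'JUNKINFRONThttp://francium.tech/?ref=https://example.com'

puts nested.sub(GREEDY, '\1://') # keeps only the trailing https://example.com
puts nested.sub(LAZY, '\1://')   # keeps http://francium.tech/?ref=https://example.com
```

For the four test strings above, both variants give the same results, so the difference only matters when a later scheme can appear in the input.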
Upvotes: 2