user4233940

Ruby regex search and replace array

I've got an array with a list of image URLs that I'm trying to search/replace with a regex (via gsub). The values are in the format //subdomain.website.com/folder/image.extension. I want to add 'https' in front of each array entry.

I've tried to use gsub, but the array remains unchanged:

matches = source.scan(/(\/\/\w+\.\w+\.\w{2,4}\/\w+\/\w+\.\w{2,4})/).uniq
matches.each {|value| value.to_s.gsub!(/\/\//, 'https://')}

In Perl, I could do something like this to change each value:

for (@matches) {
    s/\/\//https:\/\//g;
}

Am I calling the gsub function in an incorrect manner?

Upvotes: 1

Views: 812

Answers (4)

the Tin Man

Reputation: 160571

Ruby's standard library includes the URI module for exactly this, so take advantage of it:

require 'uri'

uri = URI.parse('//www.example.com')  # => #<URI::Generic:0x007ff0098581e8 URL://www.example.com>
uri.scheme = 'https'                  # => "https"
uri.to_s                              # => "https://www.example.com"

If you want to process a list of URLs:

%w[
  //www.example.com
].map{ |url|                          # => ["//www.example.com"]
  uri = URI.parse(url)                # => #<URI::Generic:0x007ff009853350 URL://www.example.com>
  uri.scheme = 'https'                # => "https"
  uri.to_s                            # => "https://www.example.com"
}                                     # => ["https://www.example.com"]

The advantage of URI is that it does the right thing whether the URL already has a scheme or is scheme-relative (starts with //):

require 'uri'

%w[
  http://foo.com
  https://foo.com
  //foo.com
].map { |url|
  uri = URI.parse(url)
  uri.scheme = 'https'
  uri.to_s 
} # => ["https://foo.com", "https://foo.com", "https://foo.com"]

If you insist on using a regex, then simplify it:

url = '//www.example.com'
url[/^/] = 'https:'
url # => "https://www.example.com"

And:

%w[
  //www.example.com
].map{ |url|           # => ["//www.example.com"]
  url[/^/] = 'https:'  # => "https:"
  url                  # => "https://www.example.com"
}                      # => ["https://www.example.com"]

A regular expression like this isn't smart enough to detect whether a scheme already exists, so more code has to be written to handle that situation.
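As a rough sketch of what that extra code could look like, the pattern can optionally match an existing http: or https: prefix and replace it along with the slashes. The helper name force_https is made up for illustration, and this assumes every entry either starts with // or already has an http/https scheme:

```ruby
# Hypothetical helper (not part of Ruby or URI): force https on a URL that is
# either scheme-relative (//host/path) or already uses http/https.
def force_https(url)
  # \A anchors at the start of the string. The optional group matches an
  # existing "http:" or "https:" so it is replaced instead of duplicated.
  url.sub(%r{\A(?:https?:)?//}i, 'https://')
end

urls = %w[
  http://foo.com
  https://foo.com
  //foo.com
]

p urls.map { |url| force_https(url) }
# => ["https://foo.com", "https://foo.com", "https://foo.com"]
```

Anything that doesn't match the pattern (an ftp:// URL, say) is returned unchanged, which may or may not be what you want.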

Upvotes: 0

Darkmouse

Reputation: 1939

You could try something like this.

matches.map{|m| "https:#{m}"}

This returns a new array with https: added to the front of every element.

Upvotes: 0

pjd

Reputation: 1173

If you know every element of your array is formatted properly and ready to have "https:" prepended, it seems like concatenating would be simpler than gsub. For example,

matches.map! { |value| "https:" << value }

should work once you have an array of strings as described by @August.
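If you'd rather keep the capture group, one way to get there is to flatten the nested result first. This is only a sketch: the nested array below is sample data standing in for what scan returns with one capture group, and the file names are invented.

```ruby
# Shape returned by scan when the regex has one capture group (sample data).
nested = [['//subdomain.website.com/folder/image.png'],
          ['//subdomain.website.com/folder/logo.gif']]

# flatten turns the one-element sub-arrays into plain strings;
# interpolation builds a new string, so nothing is mutated.
urls = nested.flatten.map { |value| "https:#{value}" }

p urls
# => ["https://subdomain.website.com/folder/image.png", "https://subdomain.website.com/folder/logo.gif"]
```

Interpolation is used here instead of << so the example also works in files that start with the frozen_string_literal magic comment.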

Upvotes: 0

August

Reputation: 12558

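First of all, calling to_s on value is strange. Because your regex contains a capture group, scan returns an array of arrays, so value is itself an array. Converting it to a string includes the array notation, so value.to_s might look something like ["//subdomain.website.com/folder/image.exte"]. On top of that, gsub! then modifies only that temporary string, which is why matches never changes.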

You can avoid this by changing your regex to not include a capture group:

/\/\/\w+\.\w+\.\w{2,4}\/\w+\/\w+\.\w{2,4}/
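To see the difference, here is a small comparison of the two patterns. The source string is made up for the example; only the regexes come from the question.

```ruby
# Sample input invented for illustration.
source = 'img src="//cdn.example.com/img/a.png" and "//cdn.example.com/img/b.jpg"'

# With a capture group, scan returns one sub-array per match.
with_group    = source.scan(/(\/\/\w+\.\w+\.\w{2,4}\/\w+\/\w+\.\w{2,4})/)
# Without a capture group, scan returns the matched strings themselves.
without_group = source.scan(/\/\/\w+\.\w+\.\w{2,4}\/\w+\/\w+\.\w{2,4}/)

p with_group     # => [["//cdn.example.com/img/a.png"], ["//cdn.example.com/img/b.jpg"]]
p without_group  # => ["//cdn.example.com/img/a.png", "//cdn.example.com/img/b.jpg"]

# to_s on a sub-array keeps the brackets and quotes.
puts with_group.first.to_s  # prints ["//cdn.example.com/img/a.png"]
```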

Now to the main part of your question: you should be calling map on matches instead of each. The map method returns a new array in which each element is the result of calling the supplied block with the corresponding element. Use map! if you want to replace the elements of matches in place.

Put together it might look like this:

matches = source.scan(/\/\/\w+\.\w+\.\w{2,4}\/\w+\/\w+\.\w{2,4}/).uniq
matches.map { |value| value.gsub(/\/\//, 'https://') }
# => ["https://subdomain.website.com/folder/image.exte"]
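For comparison, a quick sketch of how map, map!, and each with gsub! behave once the array holds plain strings. The array contents are sample data standing in for the result of scan, and the variable names converted and perl_style are just for this example.

```ruby
# Sample data standing in for the result of source.scan(...).uniq
matches = ['//subdomain.website.com/folder/image.png']

# map returns a new array and leaves matches alone.
converted = matches.map { |value| value.gsub(/\/\//, 'https://') }
p converted  # => ["https://subdomain.website.com/folder/image.png"]
p matches    # => ["//subdomain.website.com/folder/image.png"]

# map! replaces each element of matches with the block's result.
matches.map! { |value| value.gsub(/\/\//, 'https://') }
p matches    # => ["https://subdomain.website.com/folder/image.png"]

# Closest to the Perl loop: each plus gsub! mutates every string in place.
# dup makes sure the sample string is not a frozen literal.
perl_style = ['//subdomain.website.com/folder/image.png'.dup]
perl_style.each { |value| value.gsub!(/\/\//, 'https://') }
p perl_style # => ["https://subdomain.website.com/folder/image.png"]
```

The last form only works because each element is a real, unfrozen String; it silently does nothing useful if you call to_s first, as in the question.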

Upvotes: 1
