kimpalita

Reputation: 15

How to substitute all characters in a string except for some (in Ruby)

I'm having trouble finding an appropriate method for string substitution. I would like to replace every character in a string except for a selection of words or phrases (provided in an array). I know there's a gsub method, but what I'm trying to achieve is essentially its reverse. For example:

My string: "Part of this string needs to be substituted"

Keywords: ["this string", "substituted"]

Desired output: "**** ** this string ***** ** ** substituted"

P.S. This is my first question ever, so your help will be greatly appreciated!

Upvotes: 1

Views: 1779

Answers (6)

Cary Swoveland

Reputation: 110725

You can do that using the form of String#split that uses a regex with a capture group.

Code

def sub_some(str, keywords)
  str.split(/(#{keywords.join('|')})/)
     .map {|s| keywords.include?(s) ? s : s.gsub(/./) {|c| (c==' ') ? c : '*'}}
     .join
end

Example

str = "Part of this string needs to be substituted"
keywords = ["this string", "substituted"]
sub_some(str, keywords)
  #=> "**** ** this string ***** ** ** substituted" 

Explanation

r = /(#{keywords.join('|')})/
  #=> /(this string|substituted)/ 
a = str.split(r)
  #=> ["Part of ", "this string", " needs to be ", "substituted"] 
e = a.map
  #=> #<Enumerator: ["Part of ", "this string", " needs to be ",
  #     "substituted"]:map> 

s = e.next
  #=> "Part of " 
keywords.include?(s) ? s : s.gsub(/./) { |c| (c==' ') ? c : '*' }
  #=> s.gsub(/./) { |c| (c==' ') ? c : '*' }
  #=> "Part of ".gsub(/./) { |c| (c==' ') ? c : '*' }
  #=> "**** ** " 

s = e.next
  #=> "this string" 
keywords.include?(s) ? s : s.gsub(/./) { |c| (c==' ') ? c : '*' }
  #=> s
  #=> "this string" 

and so on... Lastly,

["**** ** ", "this string", " ***** ** ** ", "substituted"].join
  #=> "**** ** this string ***** ** ** substituted" 

Note that, prior to v.1.9.3, Enumerable#map did not return an enumerator when no block is given. The calculations are the same, however.
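One caveat with building the pattern from keywords.join('|'): any keyword containing regex metacharacters (such as $, ( or .) is read as regex syntax rather than as literal text. A minimal sketch of a variant that escapes the keywords with Regexp.union follows; the method name sub_some_escaped and the sample string are made up for illustration and are not part of the answer above.

```ruby
# Hypothetical variant of sub_some: Regexp.union escapes each keyword,
# so characters like "$" or "(" are matched literally.
def sub_some_escaped(str, keywords)
  pattern = /(#{Regexp.union(keywords)})/   # one capture group, so split keeps the keywords
  str.split(pattern)
     .map { |s| keywords.include?(s) ? s : s.gsub(/\S/, '*') }
     .join
end

puts sub_some_escaped('Cost is $5 (approx.) today', ['$5', '(approx.)'])
# prints: **** ** $5 (approx.) *****
```

The rest of the logic is unchanged: split around the keywords, mask everything that is not a keyword, and join.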

Upvotes: 0

Edgar Mkaka Joel

Reputation: 96

str = "Part of this string needs to be substituted"
keywords = ["this string", "substituted"]

pattern = /(#{keywords.join('|')})/

str.split(pattern).map {|i| keywords.include?(i) ? i : i.gsub(/\S/,"*")}.join
#=> "**** ** this string ***** ** ** substituted"

A more readable version of the same code

str = "Part of this string needs to be substituted"
keywords = ["this string", "substituted"]

# Use a regexp pattern to split the string around the keywords.
pattern = /(#{keywords.join('|')})/ # pattern => /(this string|substituted)/
parts = str.split(pattern) #=> ["Part of ", "this string", " needs to be ", "substituted"]

redacted = parts.map do |i|
  if keywords.include?(i)
    i
  else
    i.gsub(/\S/, "*") # replace all non-whitespace characters with "*"
  end
end
# redacted => ["**** ** ", "this string", " ***** ** ** ", "substituted"]
redacted.join #=> "**** ** this string ***** ** ** substituted"

Upvotes: 0

Wiktor Stribiżew

Reputation: 627082

You can use the following approach: collect the substrings that you need to turn into asterisks, and then perform this replacement:

str = "Part of this string needs to be substituted"
arr = ["this string", "substituted"]

arr_to_remove = str.split(Regexp.new("\\b(?:" + arr.map { |x| Regexp.escape(x) }.join('|') + ")\\b|\\s+")).reject { |s| s.empty? }

arr_to_remove.each do |s|
    str = str.gsub(s, "*" * s.length)
end
puts str

Output of the demo program:

**** ** this string ***** ** ** substituted
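Note that str.gsub(s, ...) with a string argument replaces every occurrence of the fragment, so a short fragment such as "is" would also be starred out inside a kept phrase like "this string". A possible way around this is to do the whole job in a single gsub pass, sketched below. The method name mask_except is hypothetical, and this sketch drops the \b word-boundary checks used above, so keywords are also kept when they occur inside longer words.

```ruby
# Hypothetical helper: one pass over the string.
# At each position the regex first tries to match a whole keyword (group 1);
# failing that, it matches a single non-whitespace character.
def mask_except(str, keywords)
  keep = Regexp.union(keywords)   # escapes regex metacharacters in the keywords
  str.gsub(/(#{keep})|\S/) { Regexp.last_match(1) || '*' }
end

puts mask_except("Part of this string needs to be substituted",
                 ["this string", "substituted"])
# prints: **** ** this string ***** ** ** substituted
```

Whitespace is never matched by the pattern, so it passes through untouched.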

Upvotes: 0

Brian Davis

Reputation: 357

Here's a different approach. First, do the reverse of what you ultimately want: redact what you want to keep. Then compare this redacted string to your original character by character, and if the characters are the same, redact, and if they are not, keep the original.

class String
  # Returns a string with all words except those passed in as keepers
  # redacted.
  #
  #      "Part of this string needs to be substituted".gsub_except(["this string", "substituted"], '*')
  #      # => "**** ** this string ***** ** ** substituted"
  def gsub_except(keep, mark)
    reverse_keep = dup
    # Mask every keeper in the copy, so kept characters now differ from the original.
    keep.each { |word| reverse_keep.gsub!(word, mark * word.length) }
    # Characters the masking left unchanged are the ones to redact (spaces stay as they are).
    reverse_keep.chars.zip(chars).map do |redacted, original|
      redacted == original && original != ' ' ? mark : original
    end.join
  end
end

Upvotes: 1

EdgeCaseBerg

Reputation: 2841

This might be a little more understandable than my last answer:

s = "Part of this string needs to be substituted"
k = ["this string", "substituted"]

tmp = s
for(key in k) {
    tmp = tmp.replace(k[key], function(x){ return "*".repeat(x.length)})
}

res = s.split("")
for(charIdx in s) {
    if(tmp[charIdx] != "*" && tmp[charIdx] != " ") {
        res[charIdx] = "*"
    } else {
        res[charIdx] = s.charAt(charIdx)
    }
}
var finalResult = res.join("")

Explanation:

This builds on my previous idea of using the positions of the keywords to decide which portions of the string to replace with stars. First off:

We replace each keyword with a run of stars of the same length. So:

s.replace("this string", function(x){
    return "*".repeat(x.length)
})

replaces the portion of s that matches "this string" with x.length *'s

We do this for each keyword. For completeness, you should make sure that the replacement is global and not limited to the first match found, for example by using /this string/g. I didn't do that in the answer, but you should be able to work out how to build the pattern with new RegExp yourself.

Next up, we split a copy of the original string into an array. If you're a visual person, it should make sense to think of this as a weird sort of character addition:

"Part of this string needs to be substituted"
"Part of *********** needs to be substituted" +
---------------------------------------------
 **** ** this string ***** ** ** ***********

is what we're going for. So if our tmp variable has stars, then we want to bring over the original string, and otherwise we want to replace the character with a *

This is easily done with an if statement. And to make it like your example in the question, we also bring over the original character if it's a space. Lastly, we join the array back into a string via .join("") so that you can work with a string again.
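The code above is JavaScript, while the question asks about Ruby. A rough Ruby sketch of the same two-step idea might look like the following; the method name mask_by_comparison is made up for illustration, and String#gsub is already global, so every occurrence of a keyword is covered.

```ruby
# Hypothetical Ruby rendering of the approach described above.
def mask_by_comparison(str, keywords)
  # Step 1: replace every occurrence of each keyword with stars of equal length.
  tmp = keywords.reduce(str) { |acc, kw| acc.gsub(kw) { |m| '*' * m.length } }

  # Step 2: where tmp holds a star or a space, bring over the original
  # character; everywhere else, emit a star.
  str.each_char.with_index.map { |ch, i|
    tmp[i] == '*' || tmp[i] == ' ' ? ch : '*'
  }.join
end

puts mask_by_comparison("Part of this string needs to be substituted",
                        ["this string", "substituted"])
# prints: **** ** this string ***** ** ** substituted
```

Because each keyword is replaced by a run of stars of the same length, tmp and str stay aligned index by index, which is what makes the character-wise comparison work.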

Makes sense?

Upvotes: 0

pepegasca

Reputation: 84

You can use something like:

str = "Part of this string needs to be substituted"
keep = ["this", "string", "substituted"]

str.split(" ").map{|word| keep.include?(word) ? word : word.split("").map{|w| "*"}.join}.join(" ")

but this will work only to keep words, not phrases.

Upvotes: 0
