Reputation: 5251
I need to clean up a string by removing the phrase "not" and hashtags (#). (I also have to get rid of spaces and capslock and return the results in arrays, but I have the latter three taken care of.)
Expectation:
"not12345" #=> ["12345"]
" notabc " #=> ["abc"]
"notone, nottwo" #=> ["one", "two"]
"notCAPSLOCK" #=> ["capslock"]
"##doublehash" #=> ["doublehash"]
"h#a#s#h" #=> ["hash"]
"#notswaggerest" #=> ["swaggerest"]
This is the code I have
def some_method(string)
string.split(", ").map{|n| n.sub(/(not)/,"").downcase.strip}
end
All of the above tests do what I need except for the hash ones. I don't know how to get rid of the hashes; I have tried modifying the regex part: n.sub(/(#not)/), n.sub(/#(not)/), n.sub(/[#]*(not)/), to no avail. How can I make the regex remove #?
Upvotes: 2
Views: 975
Reputation: 110755
arr = ["not12345", " notabc", "notone, nottwo", "notCAPSLOCK",
"##doublehash:", "h#a#s#h", "#notswaggerest"]
arr.flat_map { |str| str.downcase.split(',').map { |s| s.gsub(/#|not|\s+/,"") } }
#=> ["12345", "abc", "one", "two", "capslock", "doublehash:", "hash", "swaggerest"]
When the block variable str is set to "notone, nottwo",
s = str.downcase
#=> "notone, nottwo"
a = s.split(',')
#=> ["notone", " nottwo"]
b = a.map { |s| s.gsub(/#|not|\s+/,"") }
#=> ["one", "two"]
Because I used Enumerable#flat_map, "one" and "two" are added individually to the array being returned, rather than as a nested array. When str is "notCAPSLOCK",
s = str.downcase
#=> "notcapslock"
a = s.split(',')
#=> ["notcapslock"]
b = a.map { |s| s.gsub(/#|not|\s+/,"") }
#=> ["capslock"]
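For a single input string, as in the question, the same steps can be wrapped in a method. This is only a sketch: the name clean_tokens is made up for illustration, and it returns one array per string rather than one flattened array.

```ruby
# Hypothetical helper (the name is an assumption, not from the question).
# Lowercases, splits on commas, then removes every "#", every "not",
# and all whitespace from each piece.
def clean_tokens(string)
  string.downcase.split(",").map { |s| s.gsub(/#|not|\s+/, "") }
end

clean_tokens("notone, nottwo")  #=> ["one", "two"]
clean_tokens("h#a#s#h")         #=> ["hash"]
clean_tokens("#notswaggerest")  #=> ["swaggerest"]
```

Note that gsub removes "not" wherever it appears, so a word such as "knotted" would become "kted"; anchor the pattern if only a leading "not" should go.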
Upvotes: 3
Reputation: 29613
Here is one more solution that uses a different technique: capturing what you want rather than (for the most part) dropping what you don't want:
a = ["not12345", " notabc", "notone, nottwo",
"notCAPSLOCK", "##doublehash:","h#a#s#h", "#notswaggerest"]
a.map do |s|
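# strip first: [^not] is a character class (any char except n, o, t), so it would otherwise match a leading space
s.downcase.delete("#").strip.scan(/(?<=not)\w+|^[^not]\w+/)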
end
#=> [["12345"], ["abc"], ["one", "two"], ["capslock"], ["doublehash"], ["hash"], ["swaggerest"]]
I had to delete the # because of h#a#s#h; otherwise, delete could have been avoided with a regex like /(?<=not|^#[^not])\w+/
Upvotes: 2
Reputation: 646
Ruby regular expressions allow comments (in extended mode, /x), so to match a literal octothorpe (#) you can escape it:
"#foo".sub(/\#/, "") #=> "foo"
Upvotes: 1
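A small sketch of how # behaves in Ruby regex literals, as far as I know: a bare # is a literal character in an ordinary regex, and the escape only becomes necessary in extended mode (/x), where an unescaped # starts a comment. The variable names are just for illustration.

```ruby
# In an ordinary regex literal, # needs no escaping.
plain   = "#foo".sub(/#/, "")
escaped = "#foo".sub(/\#/, "")

# In extended mode (/x), whitespace is ignored and # starts a comment,
# so a literal hash must be written as \#.
extended = "#foo".sub(/
  \#   # a literal hash; unescaped, this would begin a comment
/x, "")

# sub replaces only the first match; gsub or delete removes them all.
first_only  = "h#a#s#h".sub(/#/, "")
all_removed = "h#a#s#h".delete("#")

p plain, escaped, extended, first_only, all_removed
```

The last two lines are probably the real issue in the question: sub stops after the first hash, so "h#a#s#h" needs gsub or delete.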
Reputation: 13921
Fun problem because it can use the most common string functions in Ruby:
result = values.map do |string|
string.strip # Remove spaces in front and back.
.tr('#','') # Transform single characters. In this case remove #
.gsub('not','') # Substitute patterns
.split(', ') # Split into arrays.
end
p result #=>[["12345"], ["abc"], ["one", "two"], ["CAPSLOCK"], ["doublehash"], ["hash"], ["swaggerest"]]
I prefer this way rather than a regexp as it is easy to understand the logic of each line.
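For completeness, here is a runnable sketch of the same chain. The values array is an assumption (taken from the inputs in the question), and I have added downcase, which the question asks for, and used delete in place of tr.

```ruby
# Inputs copied from the question; the array itself is assumed.
values = ["not12345", " notabc ", "notone, nottwo", "notCAPSLOCK",
          "##doublehash", "h#a#s#h", "#notswaggerest"]

result = values.map do |string|
  string.strip          # Remove spaces in front and back.
        .downcase       # Normalize case.
        .delete("#")    # Remove every # character.
        .gsub("not", "") # Remove every occurrence of "not".
        .split(", ")    # Split into an array.
end

p result
```

Each input yields its own array, so "notone, nottwo" becomes ["one", "two"] while the others become one-element arrays.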
Upvotes: 1
Reputation: 10472
You can use this regex to solve your problem. I tested and it works for all of your test cases.
/^\s*#*(not)*/
^ means match start of string
\s* matches any whitespace at the start
#* matches 0 or more #
(not)* matches the phrase "not" zero or more times
Note: this regex won't work for cases where "not" comes before "#"; for example, not#hash would return #hash.
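A rough sketch of plugging this regex into the method from the question. The method name strip_prefix is hypothetical; the split, downcase and strip steps are taken from the question's code.

```ruby
# Hypothetical wrapper around the question's code, using the regex above.
def strip_prefix(string)
  leading = /^\s*#*(not)*/  # leading whitespace, then hashes, then "not"
  string.split(", ").map { |n| n.sub(leading, "").downcase.strip }
end

strip_prefix("#notswaggerest")  #=> ["swaggerest"]
strip_prefix("##doublehash")    #=> ["doublehash"]
strip_prefix("not#hash")        #=> ["#hash"]
```

Because the pattern is anchored at the start, hashes in the middle of a string (as in "h#a#s#h") are left alone; those would still need something like delete("#").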
Upvotes: 1