Reputation: 19110
I'm using Rails 5. How do I remove all words from the beginning of my string whose first character is not a letter (i.e. !~ /\p{L}/)? So if I have a string
"1000 22 cc33 dfdsf"
I would want the result to be
"cc33 dfdsf"
Similarly, if the string were
"7nnn 2000 388 a 4000 bbb"
I would expect the result to be
"a 4000 bbb"
Upvotes: 1
Views: 276
Reputation: 626747
I suppose the "words" are just chunks of non-whitespace symbols.
You may use
rx = /\G[^[:space:]\p{L}][^[:space:]]*[[:space:]]*/
puts "1000 22 cc33 dfdsf".gsub(rx, '') # => cc33 dfdsf
puts "7nnn 2000 388 a 4000 bbb".gsub(rx, '') # => a 4000 bbb
See the Ruby demo online
Details:
- \G - start of string or the end of the previous match (thus, we only get consecutive matches from the start of the string)
- [^[:space:]\p{L}] - a char that is not a whitespace and not a letter
- [^[:space:]]* - 0+ non-whitespaces
- [[:space:]]* - 0+ whitespaces.
Another regex you can use is /\A(?:[^[:space:]\p{L}][^[:space:]]*[[:space:]]*)+/. Here, \A matches the start of the string, and (?:...)+ matches 1 or more consecutive occurrences of the pattern described above.
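As a minimal sketch of the \A variant in use: because the whole run of leading words is consumed by a single anchored match, sub is enough and gsub is not needed. The method name strip_leading_non_letter_words is hypothetical and only serves the illustration.

```ruby
# Hypothetical helper name; the regex is the \A variant quoted in this answer.
def strip_leading_non_letter_words(str)
  # \A anchors the match at the start of the string, so one sub call
  # removes the entire run of leading words that do not start with a letter.
  str.sub(/\A(?:[^[:space:]\p{L}][^[:space:]]*[[:space:]]*)+/, '')
end

puts strip_leading_non_letter_words("1000 22 cc33 dfdsf")        # => cc33 dfdsf
puts strip_leading_non_letter_words("7nnn 2000 388 a 4000 bbb")  # => a 4000 bbb
puts strip_leading_non_letter_words("abc 123")                   # => abc 123
```

A string that already starts with a letter is returned unchanged, since the pattern cannot match at \A.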
NOTE: If you want to match specifically alphanumeric words, that is, if you want to remove all words starting with a digit at the beginning of the string, you may use
/\G\p{N}[[:alnum:]]*[^[:alnum:]]*/
or
/\A(?:\p{N}[[:alnum:]]*[^[:alnum:]]*)+/
where \p{N} matches any Unicode number character (use \p{Nd} if you only want decimal digits), [[:alnum:]] matches any alphanumeric and [^[:alnum:]] matches any char that is not alphanumeric. See another Ruby demo.
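To make the difference from the first pattern concrete, here is a small sketch of the digit-specific \A variant. The method name strip_leading_digit_words and the "!x 1000 abc" input are made up for illustration.

```ruby
# Hypothetical helper name; the regex is the digit-specific \A variant above.
def strip_leading_digit_words(str)
  # Only words whose first char is a number are removed; a leading word
  # that starts with punctuation stops the match immediately.
  str.sub(/\A(?:\p{N}[[:alnum:]]*[^[:alnum:]]*)+/, '')
end

puts strip_leading_digit_words("1000 22 cc33 dfdsf")        # => cc33 dfdsf
puts strip_leading_digit_words("7nnn 2000 388 a 4000 bbb")  # => a 4000 bbb
puts strip_leading_digit_words("!x 1000 abc")               # => !x 1000 abc
```

The last call is left untouched because "!" is not a number, whereas the more general pattern would strip both "!x" and "1000" from that input.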
Upvotes: 2
Reputation: 5695
This pattern searches until it finds the first occurrence of a "word" by your criteria and then takes everything up to the end of the string. You can extract the result from the captured group.
.*?\b([A-Za-z].*)
Swap it for:
.*?\b([A-Za-z][\s\S]*)
if you need line terminators included.
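One possible way to pull out the captured group in Ruby is String#[] with a regex and a group index. This is only a sketch: the method name from_first_letter_word is hypothetical, and falling back to an empty string when nothing matches is an assumption about the desired behaviour.

```ruby
# Hypothetical helper: returns the text from the first letter-initial word on.
# Returning '' when there is no such word is an assumption, not from the answer.
def from_first_letter_word(str)
  # String#[] with a capture index returns group 1, or nil if nothing matched.
  str[/.*?\b([A-Za-z][\s\S]*)/, 1] || ''
end

puts from_first_letter_word("1000 22 cc33 dfdsf")        # => cc33 dfdsf
puts from_first_letter_word("7nnn 2000 388 a 4000 bbb")  # => a 4000 bbb
p from_first_letter_word("123 456")                      # => ""
p from_first_letter_word("!abc def")                     # => "abc def"
```

Note that \b also matches between punctuation and a letter, so a word such as "!abc" is cut down to "abc" rather than removed, and [A-Za-z] covers ASCII letters only.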
Upvotes: 0