Reputation: 7136
Considering this string:
Looking for a front-end developer who can fix a bug on my Wordpress site. The header logo disappeared after I updated some plugins. \n\nI have tried disabling all plugins but it didn't help.Budget: $25\nPosted On: May 06, 2016 16:29 UTCCategory: Web, Mobile & Software Dev > Web DevelopmentSkills: WordPress Country: Denmarkclick to apply
I'd like to retrieve the price value that follows the string "Budget:". I have a number of strings, all with the same pattern (the price comes right after the "Budget:" string).
I tried /\$[\d.]+/ to extract the price, but that matches any price amount in the string, not only the one following "Budget:".
How can I accomplish that?
Upvotes: 0
Views: 67
Reputation: 110725
r = /
\b # match a word break
[Bb] # match "B" or "b"
udget: # match string
\s+\$ # match one or more spaces followed by a dollar sign
\K # discard all matches so far
\d{1,3} # match between one and three digits
(?:\,\d{3}) # match a comma followed by three digits in a non-capture group
* # perform the preceding match zero or more times
(?:\.\d\d) # match a period followed by two digits in a non-capture group
? # make the preceding match optional
/x # free-spacing regex definition mode
"Some text Budget: $25\nsome more text"[r] #=> "25"
"Some text Budget: $25.42\nsome more text"[r] #=> "25.42"
"Some text Budget: $25,642,328\nsome more text"[r] #=> "25,642,328"
"Some text Budget: $25,642,328.01\nsome more text"[r] #=> "25,642,328.01"
This is actually not quite right because
"Some text Budget: $25,64,328.01\nsome more text"[r] #=> "25"
should return nil. Unfortunately, the fix calls for major surgery:
r = /
\b # match a word break
[Bb] # match "B" or "b"
udget: # match string
\s+\$ # match 1 or more spaces followed by a dollar sign
\K # discard all matches so far
\d{1,3} # match between 1 and 3 digits
(?: # begin a non-capture group
(?![\,\d]) # negative lookahead: the next character must not be a comma or digit
| # or
(?: # begin a non-capture group
(?:\,\d{3}) # match a comma followed by 3 digits in a non-capture group
+ # perform preceding match 1 or more times
) # end non-capture group
) # end non-capture group
(?:\.\d\d) # match a period followed by 2 digits in a non-capture group
? # make the preceding match optional
/x
"Some text Budget: $25\nsome more text"[r] #=> "25"
"Some text Budget: $25.42\nsome more text"[r] #=> "25.42"
"Some text Budget: $25,642,328\nsome more text"[r] #=> "25,642,328"
"Some text Budget: $25,642,328.01\nsome more text"[r] #=> "25,642,328.01"
"Some text Budget: $25,64,328.01\nsome more text"[r] #=> nil
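If you need a number rather than a string, something along these lines might work. This is only a sketch: the constant BUDGET_RE and the helper budget_amount are names made up for illustration, the pattern is the stricter regex above written on one line, and converting with to_f is an assumption about what you want to do with the value.

```ruby
# One-line form of the stricter pattern above (same pieces, no free-spacing mode).
BUDGET_RE = /\b[Bb]udget:\s+\$\K\d{1,3}(?:(?![,\d])|(?:,\d{3})+)(?:\.\d\d)?/

# Hypothetical helper: returns the amount as a Float, or nil when nothing matches.
def budget_amount(text)
  raw = text[BUDGET_RE]          # e.g. "25,642,328.01" or nil
  raw && raw.delete(",").to_f    # strip thousands separators before converting
end

posts = [
  "it didn't help.Budget: $25\nPosted On: May 06, 2016 16:29 UTC",
  "Some text Budget: $25,642,328.01\nsome more text",
  "Some text Budget: $25,64,328.01\nsome more text"
]

amounts = posts.map { |post| budget_amount(post) }
#=> [25.0, 25642328.01, nil]
```

The malformed amount in the last string yields nil instead of a partial match, which is the behaviour the second regex was written for.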
Upvotes: 3
Reputation: 4050
Try this:
def extract_budget(s)
  m = s.match(/Budget: \$([\d,.]+)\n/)
  if m.nil?
    nil
  else
    m.captures[0].gsub(/,/, "").to_f
  end
end
If s1 is your string and s2 is the same string but with "Budget: $25,000.53":
irb> extract_budget s1
=> 25.0
irb> extract_budget s2
=> 25000.53
irb> extract_budget "foo"
=> nil
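A possible variation on the same idea, in case the amount is not always followed by a newline: String#[] accepts a regex plus a capture index and returns nil when there is no match, so the if/else can be dropped. The name extract_budget_loose and the shortened s1 below are assumptions for illustration only.

```ruby
# Hypothetical variant of extract_budget that does not require a trailing "\n".
def extract_budget_loose(s)
  raw = s[/Budget: \$([\d,.]+)/, 1]   # first capture group, or nil if no match
  raw && raw.delete(",").to_f
end

# Shortened stand-in for the string in the question (assumed, not the full text).
s1 = "it didn't help.Budget: $25\nPosted On: May 06, 2016 16:29 UTC"
s2 = s1.sub("$25", "$25,000.53")

extract_budget_loose(s1)     #=> 25.0
extract_budget_loose(s2)     #=> 25000.53
extract_budget_loose("foo")  #=> nil
```

Like the original, the character class [\d,.]+ accepts any mix of digits, commas and periods, so it does not validate the amount.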
Upvotes: 1
Reputation: 130
Since you say the string "Budget:" doesn't change, and assuming there are no decimal values, I'd use something like this:
/Budget:(\s*\$\d*)/
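Here is a rough usage sketch, with a shortened sample line that is assumed rather than taken verbatim from the question. Note that the capture group as written also picks up the whitespace and the dollar sign; the second pattern is a small adjustment (not part of the regex above) that moves them outside the group so only the digits are captured.

```ruby
# Shortened sample input (assumed).
line = "it didn't help.Budget: $25\nPosted On: May 06, 2016 16:29 UTC"

# The regex as written: group 1 includes the leading space and "$".
as_written = line[/Budget:(\s*\$\d*)/, 1]
#=> " $25"

# Adjusted pattern (an assumption): capture only the digits, then convert.
digits = line[/Budget:\s*\$(\d+)/, 1]
amount = digits && Integer(digits, 10)
#=> 25
```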
Upvotes: 1