Reputation: 111
Given this test string:
alt="mass |36 grams\nserving volume | 63 mL (milliliters)\nserving density | 0.57 g\/cm^3 (grams per cubic centimeter)" title="mass | 36 grams.
\btitle="mass| \b.*+\s*+\K.*(?=serving volume\b)
This is my regex, but it does not return what I need. How can I extract "36 grams" from this text?
It would be great if someone could share a link from where I can learn regex.
Upvotes: 0
Views: 60
Reputation: 28461
gsub('mass \\|([0-9]* [A-Za-z]*).*', '\\1', alt)
[1] "36 grams"
To exclude the unit:
gsub('mass \\|([0-9]*).*', '\\1', alt)
[1] "36"
Be careful with the trailing space: if it is inside the capture group, it is captured too. This is probably not what you want:
gsub('mass \\|([0-9]* ).*', '\\1', alt)
[1] "36 "
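As for why the pattern in the question returns nothing: `\K` is a PCRE feature, so in R it only works with `perl = TRUE`, and the possessive `.*+` consumes the rest of the line without backtracking, so the text after it can never match. A minimal sketch of a `\K`-based alternative follows, assuming `alt` holds only the attribute value (as in the calls above) and that the value is a number followed by a single-word unit:

```r
# Test string from the question (attribute value only; an assumption about
# how the text has been loaded into R).
alt <- "mass |36 grams\nserving volume | 63 mL (milliliters)\nserving density | 0.57 g/cm^3 (grams per cubic centimeter)"

# \K drops everything matched so far from the reported match, so only the
# value after the pipe is returned. perl = TRUE is required for \K.
# \s* tolerates optional spaces on either side of the pipe.
m <- regexpr("mass\\s*\\|\\s*\\K[0-9.]+ [A-Za-z]+", alt, perl = TRUE)
mass <- regmatches(alt, m)
print(mass)
# [1] "36 grams"
```

Unlike `gsub`, `regmatches` returns an empty character vector when there is no match, instead of handing back the unchanged input.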
Upvotes: 2
Reputation: 819
For the example you gave, this will work, but depending on what you want to do, you might need something more general:
alt<-"mass |36 grams\nserving volume | 63 mL (milliliters)\nserving density | 0.57 g/cm^3 (grams per cubic centimeter)"
gsub(".*\\|([0-9]+ grams).*","\\1",alt)
[1] "36 grams"
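One possible way to generalize this is to skip the regex and split the text into `label | value` records. The sketch below assumes one record per line and a single pipe per line; `extract_field` is a hypothetical helper name, not an existing function.

```r
alt <- "mass |36 grams\nserving volume | 63 mL (milliliters)\nserving density | 0.57 g/cm^3 (grams per cubic centimeter)"

# Hypothetical helper: return the value stored under `label`.
extract_field <- function(text, label) {
  # One record per line.
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]
  # Split each record on the literal pipe character.
  parts <- strsplit(lines, "|", fixed = TRUE)
  # Trim surrounding whitespace from labels and values.
  keys <- trimws(vapply(parts, `[`, character(1), 1))
  vals <- trimws(vapply(parts, `[`, character(1), 2))
  vals[keys == label]
}

mass <- extract_field(alt, "mass")
volume <- extract_field(alt, "serving volume")
print(mass)
# [1] "36 grams"
print(volume)
# [1] "63 mL (milliliters)"
```

Because the lookup is by label, the same call works for any field in the string, at the cost of assuming the line-per-record layout.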
Upvotes: 1