Reputation: 3596
I'd say I'm getting the hang of regex, but when it comes to extracting data, I'm lost. Here are the inputs I have to parse:
Format:
String(String,...String,Integer)
Ex.
Jeff(White,Male,24)
Mark Zuckerberg(Facebook,9)
Grocery(Eggs,Cheese,Pancake,Bread,Milk,Strawberry,0)
I want to match the strings and the integer, but not the commas or parentheses.
This one is fairly easy because the strings don't have symbols in them, but the other day I needed to extract the word cake out of something like this:
<Header><Body><font=Tahoma,15pt><b>cake <\b><\font>
and whenever I tried, I'd match the entire statement instead of just the word cake, because I'd write something like:
.*<b>[a-zA-Z]+<\b>.*
So yeah... the whole concept of using regex to extract bits of a string is foreign to me. How is it usually done in these two examples?
Upvotes: 1
Views: 359
Reputation: 8318
Try the following:
(?<=<b>)\s*cake\s*(?=<\\b>)
If you want to match a word other than cake, try the following:
(?<=<b>)\s*\w+\s*(?=<\\b>)
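To see how that extraction looks in code, here is a minimal sketch in Python (my choice of language; the question doesn't name one). It runs the lookaround pattern and, as an alternative that isn't in the patterns above, a capture group that matches the tags but returns only the part in parentheses. The variable names are just for illustration.

```python
import re

# Sample input from the question; a raw string keeps the backslashes literal.
html = r"<Header><Body><font=Tahoma,15pt><b>cake <\b><\font>"

# Lookaround version: the tags are checked but are not part of the match.
lookaround = re.compile(r"(?<=<b>)\s*\w+\s*(?=<\\b>)")
m = lookaround.search(html)
# The match still includes surrounding whitespace, so strip it.
word_lookaround = m.group(0).strip() if m else None

# Capture-group version: the tags are matched, but group(1) holds only the word.
capture = re.compile(r"<b>\s*(\w+)\s*<\\b>")
m2 = capture.search(html)
word_capture = m2.group(1) if m2 else None

print(word_lookaround, word_capture)
```

The capture group is usually the simpler habit: match as much context as you need, and read out only the group you care about.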
Regex to match the whole string in the first part of your question (String(String,...,String,Integer)):
^[\w ]+\((\w+,)+\d+\)$
In the first part of your question, if you want to match only the words and the number (Grocery, Eggs, ..., 0) in your string, try the following:
(?<=^|\(|\,)\w+
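A rough sketch of the first example in Python, in case it helps. Python's re module only accepts fixed-width lookbehinds, so I'd expect it to reject a lookbehind that mixes ^ with single characters. This sketch sidesteps that: it validates the line with one anchored pattern and then splits the inner part on commas. The names LINE and parse are made up for illustration, and I'm assuming the leading name may contain spaces (as in "Mark Zuckerberg") and the integer may have several digits (as in 24).

```python
import re

# Hypothetical helper pattern: name, then "(", comma-separated words, an integer, ")".
LINE = re.compile(r"^([\w ]+)\(((?:\w+,)+\d+)\)$")

def parse(line):
    """Return (list_of_strings, integer), or None if the line does not fit the format."""
    m = LINE.match(line)
    if m is None:
        return None
    name, inner = m.groups()
    # Everything before the last comma is a string; the last field is the integer.
    *strings, number = inner.split(",")
    return [name, *strings], int(number)

for sample in ("Jeff(White,Male,24)",
               "Mark Zuckerberg(Facebook,9)",
               "Grocery(Eggs,Cheese,Pancake,Bread,Milk,Strawberry,0)"):
    print(parse(sample))

# If validation is not needed, grabbing every run of word characters and
# spaces skips the commas and parentheses in one call.
tokens = re.findall(r"[\w ]+", "Jeff(White,Male,24)")
print(tokens)
```

The findall call is the closest equivalent to the lookbehind pattern: it returns just the fields and never the separators.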
Upvotes: 1