Reputation: 1302
Using regular expressions, how can I extract all text inside double quotes, and all words outside the quotes, from a string such as:
01AB "SET 001" IN SET "BACK" 09SS 76 "01 IN" SET
The first regular expression should extract all text inside double quotes:
SET 001
BACK
01 IN
The second expression should extract all the other words in the string:
01AB
IN
SET
09SS
76
SET
For the first case, ("(.*?)") works fine. How can I extract all the words outside the quotes?
Upvotes: 6
Views: 4559
Reputation: 15218
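If you need all the blocks of the sentence, both quoted and unquoted, there is a simpler way: split the source string with Regex.Split. Because the pattern is wrapped in a capturing group, the quoted blocks are kept in the result instead of being discarded: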
static Regex QuotedTextRegex = new Regex(@"("".*?"")", RegexOptions.IgnoreCase | RegexOptions.Compiled);
var result = QuotedTextRegex
.Split(sourceString)
.Select(v => new
{
value = v,
isQuoted = v.Length > 0 && v[0] == '\"'
});
Upvotes: 1
Reputation: 10347
Try this regex:
\"[^\"]*\"
Use Regex.Matches for the text in double quotes, and Regex.Split for all the other words:
var strInput = "01AB \"SET 001\" IN SET \"BACK\" 09SS 76 \"01 IN\" SET";
var otherWords = Regex.Split(strInput, "\"[^\"]*\"");
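Note that splitting on the quoted pattern returns the text between the quoted sections ("01AB ", " IN SET ", and so on) rather than individual words, so each piece still has to be split on whitespace. A minimal sketch of both steps, assuming words are separated by spaces or tabs; the names QuoteSplitter, QuotedTexts and UnquotedWords are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuoteSplitter
{
    // Text inside double quotes; group 1 leaves out the quotes themselves.
    public static List<string> QuotedTexts(string input) =>
        Regex.Matches(input, "\"([^\"]*)\"")
             .Cast<Match>()
             .Select(m => m.Groups[1].Value)
             .ToList();

    // Words outside quotes: split on the quoted sections first,
    // then split every remaining piece on spaces/tabs.
    public static List<string> UnquotedWords(string input) =>
        Regex.Split(input, "\"[^\"]*\"")
             .SelectMany(part => part.Split(new[] { ' ', '\t' },
                                            StringSplitOptions.RemoveEmptyEntries))
             .ToList();
}
```

For the string from the question, QuotedTexts should return SET 001, BACK, 01 IN and UnquotedWords should return 01AB, IN, SET, 09SS, 76, SET.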
Upvotes: 4
Reputation: 2750
Maybe you can try replacing the quoted text with an empty string, like this:
Regex r = new Regex("\".*?\"", RegexOptions.CultureInvariant | RegexOptions.Compiled | RegexOptions.Singleline);
string p = "01AB \"SET 001\" IN SET \"BACK\" 09SS 76 \"01 IN\" SET";
// Removing the quoted text leaves double spaces behind; collapse them to single spaces.
Console.Write(r.Replace(p, "").Replace("  ", " "));
Upvotes: 2
Reputation: 394
You need to negate the pattern in your first expression, for example with a negative lookahead:
(?!pattern)
Check out this link.
Upvotes: 1
Reputation: 726609
Try this expression:
(?:^|")([^"]*)(?:$|")
The groups matched by it will exclude the quotation marks, because the quotes are enclosed in non-capturing parentheses, (?: and ). Of course, you need to escape the double quotes for use in C# code.
If the target string starts and/or ends in a quoted value, this expression will match empty groups as well (for the initial and for the trailing quote).
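As a rough sketch of how this expression might be used from C#: the pattern is written as a verbatim string with each double quote doubled. The class name OutsideQuotes and method name Segments are hypothetical, and trimming the segments and dropping the empty ones is an addition, not part of the expression itself:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class OutsideQuotes
{
    // Same expression as above; inside @"..." a double quote is written as "".
    static readonly Regex Outside = new Regex(@"(?:^|"")([^""]*)(?:$|"")");

    // Returns the unquoted segments, trimmed, with the empty groups
    // (produced by a leading or trailing quoted value) filtered out.
    public static string[] Segments(string input) =>
        Outside.Matches(input)
               .Cast<Match>()
               .Select(m => m.Groups[1].Value.Trim())
               .Where(s => s.Length > 0)
               .ToArray();
}
```

For the string from the question this should give the segments 01AB, IN SET, 09SS 76 and SET; each segment can then be split on spaces to get the individual words.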
Upvotes: 5