new_linux_user

Reputation: 731

Regex: C# extract text within double quotes

I want to extract only those words within double quotes. So, if the content is:

Would "you" like to have responses to your "questions" sent to you via email?

The answer must be:

  1. you
  2. questions

Upvotes: 55

Views: 76703

Answers (8)

crokusek

Reputation: 5624

A slight improvement on the answer by @Ria:

\"[^\" ][^\"]*\"

It recognizes an opening double quote only when the quote is not followed by a space, so that a double quote used as a trailing inch mark is not mistaken for the start of a quoted value.

Side effect: It will not recognize "" as a quoted value.
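
A minimal sketch of how this pattern behaves on text that contains inch marks. The class name `InchAwareQuotes` and the sample sentence are made up for illustration; only the pattern itself comes from this answer.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class InchAwareQuotes
{
    // An opening quote must be followed by a character that is
    // neither a space nor another double quote.
    private static readonly Regex Quoted = new Regex("\"[^\" ][^\"]*\"");

    public static List<string> Find(string input)
    {
        var results = new List<string>();
        foreach (Match m in Quoted.Matches(input))
        {
            results.Add(m.Value); // the value still includes its quotes
        }
        return results;
    }
}
```

For the input `The 5" x 7" frame is "on sale"` this should return only `"on sale"`, whereas the plain `\"[^\"]*\"` pattern would also pick up `" x 7"`.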

Upvotes: 1

sthames42

Reputation: 1009

I needed to do this in C# for parsing CSV, and none of these worked for me, so I came up with this:

\s*(?:(?:(['"])(?<value>(?:\\\1|[^\1])*?)\1)|(?<value>[^'",]+?))\s*(?:,|$)

This will parse out a field with or without quotes and will exclude the quotes from the value while keeping embedded quotes and commas. <value> contains the parsed field value. Without using named groups, either group 2 or 3 contains the value.

There are better and more efficient ways to do CSV parsing and this one will not be effective at identifying bad input. But if you can be sure of your input format and performance is not an issue, this might work for you.
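
A rough sketch of how the pattern might be applied, reading the value from the named group. The class name `CsvFields`, the method name `Parse`, and the sample line are hypothetical; the pattern is copied unchanged from above, and the same caveats about malformed input apply.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CsvFields
{
    // Verbatim string: each "" inside it stands for one double quote in the pattern.
    private static readonly Regex Field = new Regex(
        @"\s*(?:(?:(['""])(?<value>(?:\\\1|[^\1])*?)\1)|(?<value>[^'"",]+?))\s*(?:,|$)");

    public static List<string> Parse(string line)
    {
        var fields = new List<string>();
        foreach (Match m in Field.Matches(line))
        {
            // Both alternatives capture into the same named group.
            fields.Add(m.Groups["value"].Value);
        }
        return fields;
    }
}
```

Calling `CsvFields.Parse("\"Smith, John\",42,'NY'")` should give three fields: `Smith, John`, `42` and `NY`. Note that empty fields are not captured by this pattern.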

Upvotes: 0

Jared Chu

Reputation: 2852

I combine Regex and Trim:

const string searchString = "This is a \"search text\" and \"another text\" and not \"this text";
var collection = Regex.Matches(searchString, "\\\"(.*?)\\\"");
foreach (var item in collection)
{
    Console.WriteLine(item.ToString().Trim('"'));
}

Result:

search text
another text

Upvotes: 4

vapcguy

Reputation: 7537

This also borrows the regex from @Ria, but collects the matches in a MatchCollection, from which you then strip the quotes:

string strText = "Would \"you\" like to have responses to your \"questions\" sent to you via email?";
MatchCollection mc = Regex.Matches(strText, "\"([^\"]*)\"");
for (int z=0; z < mc.Count; z++)
{
    Response.Write(mc[z].ToString().Replace("\"", ""));
}

Upvotes: 5

Ria

Reputation: 10347

Try this regex:

\"[^\"]*\"

or

\".*?\"

Explanation:

[^ character_group ]

Negation: Matches any single character that is not in character_group.

*?

Matches the previous element zero or more times, but as few times as possible.

Sample code:

// Each match includes the surrounding quotes.
foreach (Match match in Regex.Matches(inputString, "\"([^\"]*)\""))
    Console.WriteLine(match.ToString());

// or with LINQ
var result = from Match match in Regex.Matches(inputString, "\"([^\"]*)\"")
             select match.ToString();
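
Since the pattern already has a capturing group, the text without the quotes can be read from `Groups[1]` instead of trimming afterwards. A minimal sketch, where the class name `QuotedText` and the method name `Extract` are made up for illustration:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedText
{
    public static List<string> Extract(string input)
    {
        return Regex.Matches(input, "\"([^\"]*)\"")
                    .Cast<Match>()                  // MatchCollection -> IEnumerable<Match>
                    .Select(m => m.Groups[1].Value) // group 1 is the text between the quotes
                    .ToList();
    }
}
```

For the sentence in the question, `QuotedText.Extract(...)` returns `you` and `questions`.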

Upvotes: 76

bart s

Reputation: 5100

I like the regex solutions. You could also consider something like this:

string str = "Would \"you\" like to have responses to your \"questions\" sent to you via email?";
var stringArray = str.Split('"');

Then take the odd-indexed elements of the array. With LINQ, you can do it like this:

var stringArray = str.Split('"').Where((item, index) => index % 2 != 0);
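
One caveat: if the input has an unmatched opening quote, the text after it also lands at an odd index. A possible variation, assuming you want to skip such unclosed text (the names `SplitQuotes` and `Extract` are hypothetical):

```csharp
using System.Collections.Generic;

public static class SplitQuotes
{
    public static List<string> Extract(string input)
    {
        string[] parts = input.Split('"');
        var result = new List<string>();
        // An odd-indexed part follows an opening quote; it was closed
        // only if at least one more part comes after it.
        for (int i = 1; i < parts.Length - 1; i += 2)
        {
            result.Add(parts[i]);
        }
        return result;
    }
}
```

With `This is a "search text" and not "this text`, this keeps `search text` and drops the unclosed `this text`.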

Upvotes: 16

Edi Wang

Reputation: 3637

Based on @Ria's answer:

static void Main(string[] args)
{
    string str = "Would \"you\" like to have responses to your \"questions\" sent to you via email?";
    var reg = new Regex("\".*?\"");
    var matches = reg.Matches(str);
    foreach (var item in matches)
    {
        Console.WriteLine(item.ToString());
    }
}

The output is:

"you"
"questions"

You can use string.TrimStart() and string.TrimEnd() to remove the double quotes if you don't want them.

Upvotes: 23

opewix

Reputation: 5083

Try this: (\"\w+\")+

I suggest downloading Expresso to test your patterns:

http://www.ultrapico.com/Expresso.htm
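
Keep in mind that `\w+` only matches word characters, so a quoted phrase containing a space is not matched. A small sketch to illustrate (the class name `WordOnlyQuotes` and the sample input are made up):

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordOnlyQuotes
{
    public static List<string> Find(string input)
    {
        var results = new List<string>();
        // \w+ allows letters, digits and underscores only, so no spaces.
        foreach (Match m in Regex.Matches(input, "(\"\\w+\")+"))
        {
            results.Add(m.Value); // the value still includes its quotes
        }
        return results;
    }
}
```

For `Would "you" like "more questions"?` this finds `"you"` but not `"more questions"`.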

Upvotes: 1
