user594166

Extract double-quoted items in a phrase

I would like to extract all double-quoted phrases inside an input phrase and keep the non-matching elements as individual words.

Let's say I have "sales people" IT; I want the output to be:

sales people
IT

The same goes for input = "SO \"sales manager\" marketing \"management\""

The output should be:

SO
sales manager
marketing
management

If input = "SO \"sales manager\" marketing management\" insurance"

The output should be:

SO
sales manager
marketing
management
insurance

I have found a regex, but I don't know how to extract the items with it:

string InputText="SO \"sales manager\" marketing \"management\"" ;
string pattern0 = "^\"(.*?)\"$";
string pattern = "^(.*?)\"(.*?)\"(.*?)$";
Regex regex = new Regex(pattern);
string[] temOperands;
bool isMatch = regex.IsMatch(InputText);
if (isMatch)
{
    //here goes the extraction
}

Upvotes: 3

Views: 1048

Answers (4)

Oleks

Reputation: 32333

I think you need something like "(?<word>[^"]+)"|(?<word>\w+). This will match both text in double quotes and single words:

var str = @"SO ""sales manager"" marketing hello ""management""";
var regex = new Regex(@"""(?<word>[^""]+)""|(?<word>\w+)");
var words = regex.Matches(str)
    .Cast<Match>()
    .Select(m => m.Groups["word"].Value)
    .ToArray();

For the test string this will return:

SO
sales manager
marketing
hello
management
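
As a minimal sketch, the same pattern can be wrapped in a helper and run against all three inputs from the question. The class name `QuotedWords` and the method name `Extract` are made up for illustration; only the regex comes from the answer above.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are assumptions, not part of the original answer.
public static class QuotedWords
{
    // Either a double-quoted run (captured without the quotes) or a bare word.
    // .NET allows both alternatives to share the group name "word".
    private static readonly Regex Token =
        new Regex(@"""(?<word>[^""]+)""|(?<word>\w+)");

    public static List<string> Extract(string input)
    {
        return Token.Matches(input)
            .Cast<Match>()
            .Select(m => m.Groups["word"].Value)
            .ToList();
    }
}
```

Calling `QuotedWords.Extract("\"sales people\" IT")` should give `sales people` and `IT`. For the input with the stray quote after `management`, the quoted alternative fails at that quote because no closing quote follows, so the remaining text is picked up word by word. Note that `\w+` stops at characters such as hyphens, so an unquoted `co-op` would come back as two items.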

Upvotes: 4

userGS

Reputation: 220

The input string has only two items within double quotes: "sales manager" and "management". The code below extracts only the strings within double quotes:

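// Requires: using System.Collections.Generic;
var arr = new List<string>();

// Index of the next opening quote, or -1 when there is none.
int open = InputText.IndexOf('"');
while (open != -1)
{
    int close = InputText.IndexOf('"', open + 1);
    if (close == -1)
    {
        break; // unmatched opening quote: nothing more to extract
    }
    arr.Add(InputText.Substring(open + 1, close - open - 1));
    open = InputText.IndexOf('"', close + 1);
}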
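
The loop above collects only the quoted strings. To also keep the unquoted words, as the question asks, one option is a single pass over the characters that tracks whether it is currently inside quotes. This is a sketch under assumptions: `QuoteScanner`, `Tokenize`, and `Flush` are hypothetical names, and an unmatched quote is treated as opening a quoted run that lasts to the end of the input.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper; the names are assumptions for illustration.
public static class QuoteScanner
{
    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        foreach (char c in input)
        {
            if (c == '"')
            {
                Flush(tokens, current);   // a quote always ends the current token
                inQuotes = !inQuotes;
            }
            else if (!inQuotes && char.IsWhiteSpace(c))
            {
                Flush(tokens, current);   // whitespace separates bare words
            }
            else
            {
                current.Append(c);        // inside quotes, spaces are kept
            }
        }
        Flush(tokens, current);           // emit whatever is left at the end
        return tokens;
    }

    // Adds the trimmed token if it is non-empty, then resets the buffer.
    private static void Flush(List<string> tokens, StringBuilder current)
    {
        string token = current.ToString().Trim();
        if (token.Length > 0)
        {
            tokens.Add(token);
        }
        current.Clear();
    }
}
```

For the question's three inputs this should produce the requested lists. Because of the unmatched-quote assumption, an input such as `a "b c` would yield `a` and `b c` rather than three separate words.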

Upvotes: 0

SMK

Reputation: 2158

You can also use the Split function:

string s="SO \"sales manager\" marketing \"management\"";
string[] ExtractedString= Regex.Split(s, "\"");
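
On its own, splitting on the quote character leaves the pieces `"SO "`, `"sales manager"`, `" marketing "`, `"management"` and a trailing empty string, so the unquoted parts still carry spaces and may hold several words. A possible follow-up, sketched here with the hypothetical names `QuoteSplitter` and `Split`, relies on the observation that pieces at odd indexes sat between quotes, while pieces at even indexes are plain text that can be split again on whitespace. It uses `string.Split` instead of `Regex.Split`, which behaves the same for a single literal character.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; the names are assumptions for illustration.
public static class QuoteSplitter
{
    public static List<string> Split(string input)
    {
        var result = new List<string>();
        string[] parts = input.Split('"');

        for (int i = 0; i < parts.Length; i++)
        {
            if (i % 2 == 1)
            {
                // Odd index: text that followed an opening quote, kept as one item.
                string quoted = parts[i].Trim();
                if (quoted.Length > 0)
                {
                    result.Add(quoted);
                }
            }
            else
            {
                // Even index: unquoted text, split into individual words.
                result.AddRange(parts[i].Split(
                    new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries));
            }
        }
        return result;
    }
}
```

With an unbalanced quote, everything after the last quote lands at an odd index and is kept as a single item, which happens to match the expected output for the third input in the question but would not split a multi-word tail.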

Upvotes: -1

Likurg

Reputation: 2760

You can use Replace:

string InputText="SO \"sales manager\" marketing \"management\"" ;
InputText=InputText.Replace("\"","\n");

In the output, each quoted item ends up on its own line.

Upvotes: -1
