thatandrey

Reputation: 287

Regex command, OR doesn't seem to work

I have the following excerpt of text that I'm reading with C#:

"I have to see your driver’s license and print you an ID tag before I can send you through," he said in a flat, automatic sort of way, staring at the horns with blank-eyed fascination.

I'm reading in some lines of this one book, and I'd like to create strings out of all the words, including those with apostrophes. I'd like to split the lines based on non word characters, but I want apostrophes to be included with the word characters, so I ultimately get a list of strings with just words, so that the word "driver's" is together.

I'm using Sublime Text to test out the expressions, but when I use (\W+|\'), apostrophes are still captured. I don't want to split something like "you'd" into two strings. \W+ is perfect, but I'd just like to include apostrophes. How could I do that?

Upvotes: 0

Views: 49

Answers (2)

Alexander Bell

Reputation: 7918

You can try String.Split. An example follows:

string _input = "I have to see your driver’s license and print you an ID tag before I can send you through";
// Split on single spaces; punctuation stays attached to the neighbouring word.
string[] _words = _input.Split(' ');

If you want to remove other characters, for example the single quote (apostrophe) "'" and the comma ",", use Replace() before splitting, like:

// Strip apostrophes and commas, then split on spaces again.
_input = _input.Replace("'", String.Empty).Replace(",", String.Empty);
_words = _input.Split(' ');

You can also use Regex, but its performance is typically worse than that of these methods (if that matters).

Also, you can try as an example my 'semantic analyzer' app at: http://webinfocentral.com/TECH/SemanticAnalyzer.aspx . It's doing all that stuff and much more (characters to exclude are listed at the left pane). Rgds,

Upvotes: 0

jcaron

Reputation: 17710

If you're looking for a regex matching "between" the words:

[^\w']+

should do.
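The reason (\W+|\') still splits on apostrophes is that \W already matches them, so the alternation only adds to the set of separators. A negated character class leaves the apostrophe out instead. Below is a minimal C# sketch of that idea. The class and method names (WordSplitter, SplitWords) are made up for illustration, and adding \u2019 to the class is an assumption based on the sample text, which uses the typographic apostrophe (’) rather than the ASCII one.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordSplitter
{
    // Separator: one or more characters that are neither word characters
    // nor apostrophes. \u2019 (the typographic apostrophe) is an assumed
    // addition so that "driver’s" from the sample text stays together.
    private static readonly Regex Separator = new Regex(@"[^\w'\u2019]+");

    // Hypothetical helper: returns the words of a line, apostrophes kept.
    public static string[] SplitWords(string line)
    {
        // Regex.Split yields an empty string when the input starts or ends
        // with a separator (e.g. a quote mark or a full stop), so drop those.
        return Array.FindAll(Separator.Split(line), w => w.Length > 0);
    }

    public static void Main()
    {
        string line = "\"I have to see your driver’s license,\" he said; you'd agree.";
        foreach (string word in SplitWords(line))
        {
            Console.WriteLine(word);
        }
    }
}
```

Note that hyphenated words such as "blank-eyed" are still split in two, since the hyphen is not in the excluded set; adding it to the character class would keep them whole if that is wanted.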

Upvotes: 1
