Reputation: 1251
I have a parsing question. I have a paragraph which has instances of : word
. So basically it has a colon, two spaces, a word (could be anything), then two more spaces.
So when I have those instances I want to convert the string so I have
:
and the word. Don't know exactly how about to do this. I'm using C# to do this. Bullet point 2 above is what I'm having a hard time doing this.
Thanks
Upvotes: 1
Views: 403
Reputation: 36522
The following is an example using Regular Expressions. See also this question for more info.
Basically the pattern string tells the regex to look for a colon followed by two spaces. Then we save in a capture group named "word" whatever the word is surrounded by two spaces on either side. Finally two more spaces are specified to finish the pattern.
The replace uses a lambda which says for every match, replace it with a colon, a new line, the "lone" word, and another newline.
string Paragraph = "Jackdaws love my big sphinx of quartz: fizz The quick onyx goblin jumps over the lazy dwarf. Where: buzz The crazy dogs.";
string Pattern = @": (?<word>\S*) ";
string Result = Regex.Replace(Paragraph, Pattern, m =>
{
var LoneWord = m.Groups[1].Value;
return @":" + Environment.NewLine + LoneWord + Environment.NewLine;
},
RegexOptions.IgnoreCase);
Input
Jackdaws love my big sphinx of quartz: fizz The quick onyx goblin jumps over the lazy dwarf. Where: buzz The crazy dogs.
Output
Jackdaws love my big sphinx of quartz:
fizz
The quick onyx goblin jumps over the lazy dwarf. Where:
buzz
The quick brown fox.
Note, for item 3 on your list, if you also want to replace individual occurrences of two spaces with newlines, you could do this:
Result = Result.Replace(" ", Environment.NewLine);
Upvotes: 1
Reputation: 108977
You can try
var str = ": first : second ";
var result = Regex.Replace(str, ":\\s{2}(?<word>[a-zA-Z0-9]+)\\s{2}",
":\n${word}\n");
Upvotes: 2
Reputation: 1541
Using RegularExpressions will give you exact matches on what you are looking for.
The regex match for a colon, two spaces, a word, then two more spaces is:
Dim reg as New Regex(": [a-zA-Z]* ")
[a-zA-Z]
will look for any character within the alphabetical range. Can append 0-9
on as well if you accept numbers within the word. The *
afterwards indicated that there can be 0 or more instances of the preceding value.
[a-zA-Z]*
will attempt to do a full match of any set of contiguous alpha characters.
Upon further reading, you may use [\w]
in place of [a-zA-Z0-9]
if that's what you are looking for. This will match any 'word' character.
source: http://msdn.microsoft.com/en-us/library/ms972966.aspx
You can retrieve all the matches using reg.Matches(inputString)
.
Review http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.replace.aspx for more information on regular expression replacements and your options from there out
edit: Before I was using \s
to search for spaces. This will match any whitespace character including tabs, new lines and other. That is not what we want, so I reverted it back to search for exact space characters.
Upvotes: 2
Reputation: 499132
Assuming your original string is exactly in the form you described, this will do:
var newString = myString.Trim().Replace(" ", "\n");
The Trim()
removes leading and trailing whitespaces, taking care of your spaces at the end of the string.
Then, the Replace
replaces the remaining " "
two space characters, with a "\n"
new line character.
The result is assigned to the newString
variable. This is needed, as myString
will not change - as strings in .NET are immutable.
I suggest you read up on the String
class and all its methods and properties.
Upvotes: 3
Reputation: 11088
You can use string.TrimEnd - http://msdn.microsoft.com/en-us/library/system.string.trimend.aspx - to trim spaces at the end of the string.
Upvotes: 1