Reputation: 741
The problem is this:
I want to find a regex match in a text file and get the complete block of text that it belongs to.
Text example:
text text text text text text text text text
!
title
text text text text text text text text text text text text text text text
text text text text text text text text text text text text text text text
text text text text text text text text text text text text text text text
!
text text text text text text text text text
Finding the "title" part is easy, but I want to get the following result:
title
text text text text text text text text text text text text text text text
text text text text text text text text text text text text text text text
text text text text text text text text text text text text text text text
What is the best way to go: working with a regex pattern, or selecting the text until I reach a "!"? (I want simple, fast, readable code.)
Code for finding a pattern: (with rtxtText as the richtextbox)
private String searchInfo(String pattern)
{
    String text = rtxtText.Text;
    Regex regExp = new Regex(pattern);
    String result = "";
    foreach (Match match in regExp.Matches(text))
    {
        result += "\n" + match.ToString();
    }
    return result;
}
Upvotes: 2
Views: 7995
Reputation: 236248
public IEnumerable<string> ParseParagraphs(string text)
{
    Regex regex = new Regex(@"title[^!]*");
    foreach (Match match in regex.Matches(text))
        yield return match.Value;
}
Usage is simple:
foreach (var p in ParseParagraphs(your_text))
    Console.WriteLine(p);
UPDATE: Use a StringBuilder in your SearchInfo method to avoid creating many intermediate strings in memory:
private string SearchInfo(String pattern)
{
    MatchCollection matches = Regex.Matches(rtxtText.Text, pattern);
    if (matches.Count == 0)
        return String.Empty;
    StringBuilder sb = new StringBuilder();
    foreach (Match match in matches)
        sb.AppendLine(match.Value);
    return sb.ToString();
}
And call it this way:
var result = SearchInfo(@"title[^!]*");
Upvotes: 1
Reputation: 9729
Your regex should be changed to match the unknown characters as well: title followed by [^!]*. ([^ ] means "any character not in this set", so [^!]* matches everything except !, any number of times.)
Regex regex = new Regex("title[^!]*", RegexOptions.Singleline);
MatchCollection matches = regex.Matches(text);
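To see what the pattern actually captures, here is a quick check, sketched in Python rather than C# (this particular pattern is written the same way for both regex engines). The sample string is made up to mirror the layout in the question, and the names sample, pattern and blocks are only for illustration.

```python
import re

# Made-up sample mirroring the question: text, '!', title block, '!', text.
sample = "intro text\n!\ntitle\nline one\nline two\n!\ntrailing text\n"

# [^!]* matches any run of characters other than '!', newlines included,
# so the match runs from "title" up to (not including) the next '!'.
pattern = re.compile(r"title[^!]*")

# strip() drops the newline that sits just before the closing '!'.
blocks = [m.group(0).strip() for m in pattern.finditer(sample)]
print(blocks)
```

Note that the match stops at the first ! it meets, so this only works if the block itself never contains that character.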
Upvotes: 4
Reputation: 70324
The best way is just to loop over the lines of text until you find the first '!' and then collect until you find the next:
# textfile is assumed to be an already opened file object
line = textfile.readline()
while line and line.strip() != '!':
    line = textfile.readline()  # skip until first '!'
title = textfile.readline()     # now on title line
text = ''
line = textfile.readline()
while line and line.strip() != '!':
    text += line
    line = textfile.readline()
print(title)
print(text)
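If the file can hold more than one block, the same idea can be wrapped in a generator. This is only a sketch: read_blocks is a hypothetical helper name, and it assumes every pair of consecutive '!' lines encloses one block whose first line is the title. Anything before the first '!' or after the last one is ignored.

```python
import io

def read_blocks(lines):
    """Yield (title, body) for each run of lines between two '!' lines."""
    current = None              # None until the first '!' has been seen
    for line in lines:
        if line.strip() == '!':
            if current:         # a non-empty block has just been closed
                yield current[0].strip(), ''.join(current[1:])
            current = []        # start collecting the next block
        elif current is not None:
            current.append(line)
    # Lines after the last '!' are never closed, so they are dropped.

# io.StringIO stands in for an opened text file here.
sample = io.StringIO("intro\n!\ntitle\nline one\nline two\n!\noutro\n")
blocks = list(read_blocks(sample))
print(blocks)
```

Because it only iterates over lines, the function works on an open file as well as on a list of strings, and it never loads the whole file at once.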
Upvotes: 1