Reputation: 599
I have searched SO but did not find anything that specifically addresses this issue. I have a text file in which each paragraph is split across several lines, each ending with a line break. I would like to merge the lines of each paragraph into a single line. I am using StreamReader in C# (VS 2010).
Example:
GE1:1
xxxxxxxxxxxxxxxxxxxxx
yyyyyyyyyyyyyy.
hhhhhhhhhhhhh.
GE1:2
zzzzzzzzzzz
kkkkkkkkkkkkkkkkkkkkkkk
and so on....
As you can see in the example above, some paragraphs have three lines and some have two; it varies. There are thousands of these paragraphs in the text file.
Basically, I would like my variable "templine" to contain the following (it will be used for further processing):
var templine = "xxxxxxxxxxxxxxxxxxxxx yyyyyyyyyyyyyy. hhhhhhhhhhhhh."
Code:
using (StreamReader sr = new StreamReader(@"C:\Test.txt"))
using (StreamWriter sw = new StreamWriter(@"C:\Test2.txt"))
{
    StringBuilder sb = new StringBuilder();
    while (!sr.EndOfStream)
    {
        string templine = sr.ReadLine();
        // further processing code not relevant.
    }
}
UPDATE: What I need is a way to detect whether a paragraph has three lines or two. I know how to remove the newline characters; I just can't work out how to tell where a paragraph ends.
Upvotes: 0
Views: 6453
Reputation: 9790
You could use a regular expression.
Regex parser = new Regex(@"GE\d*\:\d*\r\n(?<lines>(.*?\r\n){2,3})",
RegexOptions.Singleline);
and then just get all you need:
string[] paragraphs = parser.Matches(File.ReadAllText(@"C:\Test.txt")).Cast<Match>().Select(T =>
    Regex.Replace(T.Groups["lines"].Value, @"\t|\n|\r", string.Empty)).ToArray();
(I haven't tested this yet.)
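Since the paragraph length varies, a variant of the regex idea is to split the text on the header lines instead of counting two or three lines. The following is only a sketch: it assumes every header looks like "GE" followed by digits, a colon, and more digits (as in the question's sample), and the class and method names are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class RegexParagraphs
{
    // Hypothetical helper: splits the whole text on header lines
    // (assumed form: GE<digits>:<digits>) and joins each chunk's lines
    // with single spaces.
    public static string[] ToParagraphs(string text)
    {
        // "\r?$" lets the pattern match both "\r\n" and "\n" line endings,
        // because "$" in Multiline mode only matches before "\n".
        return Regex.Split(text, @"^GE\d+:\d+\r?$", RegexOptions.Multiline)
            // Collapse each line break (plus surrounding whitespace) into one space.
            .Select(chunk => Regex.Replace(chunk.Trim(), @"\s*\n\s*", " "))
            // Drop the empty piece that precedes the first header.
            .Where(paragraph => paragraph.Length > 0)
            .ToArray();
    }
}
```

Calling RegexParagraphs.ToParagraphs(File.ReadAllText(@"C:\Test.txt")) would then give one string per paragraph, however many lines each one had. Note that this joins lines with a space rather than with string.Empty, which matches the desired output shown in the question.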
Upvotes: 0
Reputation: 176936
You can remove the newline characters from the string like this:
string replacement = Regex.Replace(templine, @"\t|\n|\r", "");
or
templine = templine.Replace("\n", String.Empty);
templine = templine.Replace("\r", String.Empty);
templine = templine.Replace("\t", String.Empty);
This turns multiple lines into a single line.
Upvotes: 0
Reputation: 2765
To bring all the text into a single string:
var templine = File.ReadAllText(@"c:\temp.txt").Replace(Environment.NewLine, " ");
The .Replace call is there because it looks like you want your newlines replaced with spaces.
If you want to break the text up into two- or three-line paragraphs, you'll need to tell us what the delimiter is.
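If the delimiter turns out to be the header lines themselves (an assumption based on the "GE1:1" and "GE1:2" lines in the sample), a line-by-line pass can close the current paragraph whenever the next header shows up. This is a minimal sketch, not a tested solution; the class name, method name, and header pattern are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

static class ParagraphMerger
{
    // Assumption: a header is a line such as "GE1:1"
    // (letters, digits, a colon, digits, and nothing else).
    static readonly Regex Header = new Regex(@"^[A-Za-z]+\d+:\d+$");

    public static List<string> Merge(TextReader reader)
    {
        var paragraphs = new List<string>();
        var sb = new StringBuilder();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            line = line.Trim();
            if (line.Length == 0) continue;      // skip blank lines

            if (Header.IsMatch(line))
            {
                // A new header means the previous paragraph is complete.
                if (sb.Length > 0)
                {
                    paragraphs.Add(sb.ToString());
                    sb.Clear();
                }
                continue;
            }

            // Join the lines of one paragraph with a single space.
            if (sb.Length > 0) sb.Append(' ');
            sb.Append(line);
        }

        // Flush the last paragraph, which has no header after it.
        if (sb.Length > 0) paragraphs.Add(sb.ToString());
        return paragraphs;
    }
}
```

With the StreamReader from the question, this would be called as ParagraphMerger.Merge(sr); each element of the returned list plays the role of "templine". The number of lines per paragraph never has to be known in advance, because the paragraph ends where the next header begins or where the file ends.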
Upvotes: 1