greenoldman

Reputation: 21062

How to read a file into a string with CR/LF preserved?

If I asked the question "how to read a file into a string", the answer would be obvious. However, here is the catch: the CR/LF characters must be preserved.

The problem is that File.ReadAllText strips those characters. StreamReader.ReadToEnd converted LF into CR for me, which led to a long investigation into where I had a bug in pretty obvious code ;-)

So, in short, if I have a file containing foo\n\r\nbar, I would like to get foo\n\r\nbar (i.e. exactly the same content), not foo bar, foobar, or foo\n\n\nbar. Is there a ready-to-use way to do this in .NET?

The outcome should always be a single string containing the entire file.

Upvotes: 10

Views: 29066

Answers (6)

Jesper

Reputation: 430

This piece of code will preserve LF and CR:

string r = File.ReadAllText(@".\TestData\TR120119.TRX", Encoding.ASCII);

Upvotes: 6

vapcguy

Reputation: 7537

This is similar to the accepted answer, but I wanted to be more to the point. sr.ReadToEnd() will read the content as desired:

string myFilePath = @"C:\temp\somefile.txt";
string myEvents = String.Empty;

FileStream fs = new FileStream(myFilePath, FileMode.Open);
StreamReader sr = new StreamReader(fs);
myEvents = sr.ReadToEnd();
sr.Close();
fs.Close();

You could also do this with cascaded using statements. But I wanted to describe how the way you write to that file in the first place determines how you read the content from the myEvents string, and that might really be where the problem lies. I wrote to my file like this:

using System.Reflection;
using System.IO;

private static void RecordEvents(string someEvent)
{
    string folderLoc = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location);
    if (!folderLoc.EndsWith(@"\")) folderLoc += @"\";
    folderLoc = folderLoc.Replace(@"\\", @"\"); // replace double-slashes with single slashes
    string myFilePath = folderLoc + "myEventFile.txt";

    if (!File.Exists(myFilePath))
        File.Create(myFilePath).Close(); // must .Close() since will conflict with opening FileStream, below

    FileStream fs = new FileStream(myFilePath, FileMode.Append);
    StreamWriter sw = new StreamWriter(fs);
    sw.Write(someEvent + Environment.NewLine);
    sw.Close();
    fs.Close();
}

Then I could use the reading code above to get the contents as a string. Because I also wanted the individual strings, I put this code after the reading code:

if (myEvents != String.Empty) // we have something
{
    // '\u2660' is ♠  -- I could have chosen any delimiter I did not
    // expect to find in my text
    char delimiter = '\u2660';
    myEvents = myEvents.Replace(Environment.NewLine, delimiter.ToString());
    string[] eventArray = myEvents.Split(delimiter);
    foreach (string s in eventArray)
    {
        if (!String.IsNullOrEmpty(s))
        {
            // do whatever with the individual strings from your file
        }
    }
}

And this worked fine. So I know that myEvents must have had the Environment.NewLine characters preserved, because I was able to replace them with the ♠ delimiter and do a .Split() on that character to divide the string into the individual segments.
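
For reference, the cascaded using form mentioned earlier in this answer might look roughly like the sketch below. The class and method names (FileText, ReadWhole) are made up for illustration, and passing the encoding explicitly is an assumption; the original read code relied on the StreamReader default.

```csharp
using System.IO;
using System.Text;

public static class FileText
{
    // Hypothetical helper: reads the entire file into a single string.
    // The using statements dispose the reader and the stream even if
    // ReadToEnd throws, so no explicit Close() calls are needed.
    public static string ReadWhole(string path, Encoding encoding)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (StreamReader sr = new StreamReader(fs, encoding))
        {
            return sr.ReadToEnd();
        }
    }
}
```

Calling FileText.ReadWhole(myFilePath, Encoding.UTF8) would then replace the five lines of open, read, and close in the first snippet.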

Upvotes: 0

pitiklan

Reputation: 1

ReadAllText doesn't return carriage returns.

This method opens a file, reads each line of the file, and then adds each line as an element of a string. It then closes the file. A line is defined as a sequence of characters followed by a carriage return ('\r'), a line feed ('\n'), or a carriage return immediately followed by a line feed. The resulting string does not contain the terminating carriage return and/or line feed.

From MSDN - https://msdn.microsoft.com/en-us/library/ms143368(v=vs.110).aspx

Upvotes: 0

Douglas

Reputation: 54897

Are you sure that those methods are the culprits that are stripping out your characters?

I tried to write up a quick test; StreamReader.ReadToEnd preserves all newline characters.

string str = "foo\n\r\nbar";
using (Stream ms = new MemoryStream(Encoding.ASCII.GetBytes(str)))
using (StreamReader sr = new StreamReader(ms, Encoding.UTF8))
{
    string str2 = sr.ReadToEnd();
    Console.WriteLine(string.Join(",", str2.Select(c => ((int)c))));
}

// Output: 102,111,111,10,13,10,98,97,114
//           f   o   o \n \r \n  b  a   r

An identical result is achieved when writing to and reading from a temporary file:

string str = "foo\n\r\nbar";
string temp = Path.GetTempFileName();
File.WriteAllText(temp, str);
string str2 = File.ReadAllText(temp);
Console.WriteLine(string.Join(",", str2.Select(c => ((int)c))));

It appears that your newlines are getting lost elsewhere.
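
One way to narrow down where they get lost is to look at the raw bytes on disk before any text decoding happens. The following is only a sketch under the assumption that the file uses an ASCII-compatible encoding such as ASCII or UTF-8; NewlineProbe and CountCrLf are hypothetical names.

```csharp
using System.IO;
using System.Linq;

public static class NewlineProbe
{
    // Counts the raw CR (13) and LF (10) bytes in a file without decoding it.
    // Assumption: an ASCII-compatible encoding. In UTF-16, for example, these
    // byte values can also appear as parts of other characters.
    public static (int Cr, int Lf) CountCrLf(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        int cr = bytes.Count(b => b == 13);
        int lf = bytes.Count(b => b == 10);
        return (cr, lf);
    }
}
```

For a file containing foo\n\r\nbar this reports one CR and two LFs. If the counts on disk already differ from what you expect, the newlines were changed when the file was written rather than when it was read.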

Upvotes: 12

user1695736

Reputation:

You can read the contents of a file using File.ReadAllLines, which will return an array of the lines. Then use String.Join to merge the lines together using a separator.

string[] lines = File.ReadAllLines(@"C:\Users\User\file.txt");
string allLines = String.Join("\r\n", lines);

Note that this loses the original line terminator characters. For example, if the lines end in only \n or \r, the resulting string allLines will have them replaced with \r\n line terminators.

There are of course other ways of achieving this without losing the true EOL terminator; however, ReadAllLines is handy in that it can detect many types of text encoding by itself, and it takes very few lines of code.

Upvotes: 1

Hans Passant

Reputation: 942000

The outcome should be always single string, containing entire file.

It takes two hops. The first one is File.ReadAllBytes() to get all the bytes in the file. It doesn't try to translate anything; you get the raw data in the file, so the weirdo line endings are preserved as-is.

But that's bytes, and you asked for a string. So the second hop is to apply Encoding.GetString() to convert the bytes to a string. The one thing you have to do is pick the right Encoding class, the one that matches the encoding used by the program that wrote the file. Given that the file is pretty messed up if it contains \n\r\n sequences, and you didn't document anything else about the file, your best bet is to use Encoding.Default. Tweak as necessary.
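
The two hops could be sketched as follows. RawFileReader and ReadAllTextRaw are hypothetical names used only for illustration, and the caller is assumed to know (or guess) the right encoding.

```csharp
using System.IO;
using System.Text;

public static class RawFileReader
{
    // Hypothetical helper combining the two hops described above.
    public static string ReadAllTextRaw(string path, Encoding encoding)
    {
        // Hop 1: raw bytes, with no newline translation of any kind.
        byte[] bytes = File.ReadAllBytes(path);

        // Hop 2: decode the bytes with the encoding the writer used.
        return encoding.GetString(bytes);
    }
}
```

A call such as RawFileReader.ReadAllTextRaw(path, Encoding.Default) follows the suggestion above. One thing to watch for: GetString() decodes every byte it is given, so if the file starts with a byte order mark, the resulting string will begin with that character too.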

Upvotes: 2
