Cass

Reputation: 557

How can I tell if there is an Environment.NewLine at the end of StreamReader.ReadLine()?

I am trying to read a text file line by line and create one line from multiple lines until the line read in has \r\n at the end. My data looks like this:

BusID|Comment1|Text\r\n
1010|"Cuautla, Inc. d/b/a 3 Margaritas VIII\n
State Lic. #40428210000   City Lic.#4042821P\n
9/26/14      9/14/14 - 9/13/15    $175.00\n
9/20/00    9/14/00 - 9/13/01    $575.00 New License"\r\n
1020|"7-Eleven Inc., dba 7-Eleven Store #20638\n
State Lic. #24111110126; City Lic. #2411111126P\n
SEND ISSUED LICENSES TO DALLAS, TX\r\n

I want the data to look like this:

BusID|Comment1|Text\r\n
1010|"Cuautla, Inc. d/b/a 3 Margaritas VIII State Lic. #40428210000   City Lic.#4042821P 9/26/14      9/14/14 - 9/13/15    $175.00 9/20/00    9/14/00 - 9/13/01    $575.00 New License"\r\n
1020|"7-Eleven Inc., dba 7-Eleven Store #20638 State Lic. #24111110126; City Lic. #2411111126P SEND ISSUED LICENSES TO DALLAS, TX\r\n

My code is like this:

FileStream fsFileStream = new FileStream(strInputFileName, FileMode.Open,
    FileAccess.Read, FileShare.ReadWrite);

using (StreamReader srStreamRdr = new StreamReader(fsFileStream))
{
    while ((strDataLine = srStreamRdr.ReadLine()) != null && !blnEndOfFile)
    {
        //code evaluation here
    }
}

I have tried:

if (strDataLine.EndsWith(Environment.NewLine))
{
    blnEndOfLine = true;
}

and

if (strDataLine.Contains(Environment.NewLine))
{
    blnEndOfLine = true;
}

These do not see anything at the end of the string variable. Is there a way for me to tell the true end of line so I can combine these rows into one row? Should I be reading the file differently?

Upvotes: 1

Views: 1062

Answers (3)

Steve

Reputation: 216353

You cannot use the ReadLine method of the StreamReader for this, because it removes every kind of line terminator from the input. Both \r\n and \n are stripped before the line is returned, so you can never know whether the characters removed were \r\n or just \n.
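As a small illustration of the point above, here is a sketch that feeds mixed terminators through ReadLine. The class and method names (ReadLineDemo, ReadAllLines) are made up for this example, and a StringReader stands in for the StreamReader so that no file is needed; both readers treat \n, \r and \r\n as line terminators.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ReadLineDemo
{
    // Hypothetical helper: returns every line that ReadLine produces
    // for the given text.
    public static List<string> ReadAllLines(string text)
    {
        var result = new List<string>();
        using (var reader = new StringReader(text))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // The terminator has already been stripped at this point,
                // whether it was "\n" or "\r\n".
                result.Add(line);
            }
        }
        return result;
    }
}
```

Calling ReadLineDemo.ReadAllLines("a\nb\r\nc") gives three strings with no \r or \n left in any of them, which is why EndsWith(Environment.NewLine) never matches in the question's loop.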

If the file is not really big, you can load everything into memory and split it into separate lines yourself:

// Load everything in memory
string fileData = File.ReadAllText(@"D:\temp\myData.txt");

// Split on the \r\n (I don't use Environment.NewLine because it
// follows the OS conventions, which could be wrong in this context)
string[] lines = fileData.Split(new string[] { "\r\n"}, StringSplitOptions.RemoveEmptyEntries);

// Now replace the remaining \n with a space 
lines = lines.Select(x => x.Replace("\n", " ")).ToArray();

foreach(string s in lines)
   Console.WriteLine(s);

EDIT
If your file is really big (3.5GB, as you say), then you cannot load everything into memory and need to process it in blocks instead. Fortunately, the StreamReader provides a method called ReadBlock that allows us to implement code like this:

// Where we store the lines loaded from file
List<string> lines = new List<string>();

// Read a block of 10MB
char[] buffer = new char[1024 * 1024 * 10];
bool lastBlock = false;
string leftOver = string.Empty;

// Start the streamreader
using (StreamReader reader = new StreamReader(@"D:\temp\localtext.txt"))
{
    // We exit when the last block is reached
    while (!lastBlock)
    {
        // Read 10MB
        int loaded = reader.ReadBlock(buffer, 0, buffer.Length);

        // Exit if we have no more blocks to read (EOF)
        if(loaded == 0) break;

        // If we get fewer characters than the block size, then
        // we are on the last block
        lastBlock = (loaded != buffer.Length);

        // Create the string from the buffer
        string temp = new string(buffer, 0, loaded);

        // prepare the working string adding the remainder from the 
        // previous loop
        string current = leftOver + temp;

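        // Search for the last \r\n in the working string, so that a \r\n
        // split across two blocks is found too (ordinal comparison avoids
        // culture-sensitive matching)
        int lastNewLinePos = current.LastIndexOf("\r\n", StringComparison.Ordinal);

        if (lastNewLinePos > -1)
        {
             // Save the incomplete part for the next loop
             leftOver = current.Substring(lastNewLinePos + 2);

             // Keep only the complete lines in the working string
             current = current.Substring(0, lastNewLinePos + 2);
        }
        else
        {
             // No \r\n at all: everything is processed below, so nothing
             // is carried over to the next loop
             leftOver = string.Empty;
        }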
        // Process the lines
        AddLines(current, lines);
    }
}

void AddLines(string current, List<string> lines)
{
    var splitted = current.Split(new string[] { "\r\n" }, StringSplitOptions.RemoveEmptyEntries);
    lines.AddRange(splitted.Select(x => x.Replace("\n", " ")).ToList());
}

This code assumes that your file always ends with a \r\n and that you always get a \r\n inside a block of 10MB of text. More tests are needed with your actual data.
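If holding 10MB blocks and the full list of lines in memory is a concern, a character-by-character reader is another option. The following is only a sketch under the same data assumptions as above (\r\n ends a record, a lone \n is an embedded line break); RecordReader and ReadRecords are names invented for this example, not part of the framework.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class RecordReader
{
    // Hypothetical helper: yields one record per "\r\n". A lone "\n"
    // inside a record is replaced by a space. Only the current record
    // is buffered, so the whole file is never held in memory.
    public static IEnumerable<string> ReadRecords(TextReader reader)
    {
        var record = new StringBuilder();
        bool pendingCr = false; // true when the previous char was '\r'
        int c;
        while ((c = reader.Read()) != -1)
        {
            char ch = (char)c;
            if (pendingCr)
            {
                pendingCr = false;
                if (ch == '\n')
                {
                    // "\r\n" seen: the record is complete
                    if (record.Length > 0) yield return record.ToString();
                    record.Clear();
                    continue;
                }
                record.Append('\r'); // lone '\r': keep it as data
            }

            if (ch == '\r') pendingCr = true;
            else if (ch == '\n') record.Append(' '); // embedded line break
            else record.Append(ch);
        }

        // Trailing text that was not terminated by "\r\n"
        if (pendingCr) record.Append('\r');
        if (record.Length > 0) yield return record.ToString();
    }
}
```

It can be driven with any TextReader, for example foreach (string rec in RecordReader.ReadRecords(new StreamReader(path))), where path is a placeholder for your input file. The StreamReader buffers internally, so reading one character at a time does not mean one disk access per character, but this has not been measured against your 3.5GB file.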

Upvotes: 1

Akash KC

Reputation: 16310

You can read all the text by calling File.ReadAllText(path) and parse it in the following way:

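// Note: the "\\r\\n" and "\\n" literals below match a backslash followed
// by a letter, so this only works if the file contains those escape
// sequences as visible text, exactly as shown in the question.
string input = File.ReadAllText(your_file_path);
string output = string.Empty;
input.Split(new[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries)
    .Skip(1) // skip the header row
    .ToList()
    .ForEach(x =>
    {
        // A literal \r\n ends a record; a literal \n becomes a space
        output += x.EndsWith("\\r\\n") ? x + Environment.NewLine
                                       : x.Replace("\\n", " ");
    });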

Upvotes: 0

StfBln

Reputation: 1157

If what you have posted is exactly what is in the file, meaning that the \r\n sequences are literally written out as text, you can use the following to unescape them. Strings are immutable, so the result has to be assigned back:

strDataLine = strDataLine.Replace("\\r", "\r").Replace("\\n", "\n");

This ensures you can now use Environment.NewLine to do your comparison, as in:

if (strDataLine.Replace("\\r", "\r").Replace("\\n", "\n").EndsWith(Environment.NewLine))
{
    blnEndOfLine = true;
}

Upvotes: 0
