Steph

Reputation: 39

Removing carriage returns from specific lines in C#

I have data like this in a pipe-delimited text file (CSV):

column1|column2|column3|column4|column5 (\r\n)
column1|column2|column3|column4|column5 (\r\n)
column1|column2 (\r\n)
column2 (\r\n)
column2|column3|column4|column5 (\r\n)

I would like to delete the \r\n at the end of lines 3 and 4 to get:

column1|column2|column3|column4|column5 (\r\n)
column1|column2|column3|column4|column5 (\r\n)
column1|column2/column2/column2|column3|column4|column5 (\r\n)

My idea is if the row doesn't have 4 column separators ("|") then delete the CRLF, and repeat the operation until you have only correct rows.

This is my code:

String path = "test.csv";

// Read file
string[] readText = File.ReadAllLines(path);

// Empty the file
File.WriteAllText(path, String.Empty);

int x = 0;
int countheaders = 0;
int countlines;
using (StreamWriter writer = new StreamWriter(path))
{
    foreach (string s in readText)
    {
        if (x == 0)
        {
            countheaders = s.Where(c => c == '|').Count();
            x = 1;
        }

        countlines = 0;
        countlines = s.Where(d => d == '|').Count();
        if (countlines == countheaders)
        {
            writer.WriteLine(s);
        }
        else
        {
            string s2 = s;
            s2 = s2.ToString().TrimEnd('\r', '\n');
            writer.Write(s2);
        }
    }
}

The problem is that I'm reading the file in a single pass, so the line break on line 4 is removed and lines 4 and 5 end up joined together...

Upvotes: 0

Views: 568

Answers (1)

InBetween

Reputation: 32770

You could probably do the following (I can't test it right now, but it should work):

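IEnumerable<string> batchValuesIn(
    IEnumerable<string> source,
    string separator,
    int size)
{
    var counter = 0;                    // values seen so far
    var buffer = new StringBuilder();   // row currently being rebuilt

    foreach (var line in source)
    {
        // Split(string) needs a runtime that has this overload; on older
        // frameworks use line.Split(new[] { separator }, StringSplitOptions.None).
        var values = line.Split(separator);

        // Skip empty lines entirely.
        if (line.Length != 0)
        {
            foreach (var value in values)
            {
                buffer.Append(value);
                counter++;

                // Every `size` values make up one complete row.
                if (counter % size == 0)
                {
                    yield return buffer.ToString();
                    buffer.Clear();
                }
                else
                    buffer.Append(separator);
            }
        }
    }

    // Emit any trailing partial row.
    if (buffer.Length != 0)
        yield return buffer.ToString();
}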

And you'd use it like:

var newLines = batchValuesIn(File.ReadLines(path), "|", 5);

The good thing about this solution is that you never load the entire original source into memory. You simply build the lines on the fly.
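
Because File.ReadLines is lazy, the source file is still open while the rebuilt lines are being produced, so they cannot be written straight back to the same path. A minimal sketch of one way around that, assuming a temporary file next to the original is acceptable (FileRewrite, RewriteLines and the ".tmp" suffix are hypothetical names, not part of the code above):

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class FileRewrite
{
    // Hypothetical helper: streams `path` through `transform` into a temporary
    // file, then replaces the original once everything has been written.
    public static void RewriteLines(
        string path,
        Func<IEnumerable<string>, IEnumerable<string>> transform)
    {
        string tempPath = path + ".tmp";   // assumption: this name is free to use

        // File.ReadLines reads lazily, so only the current row is held in
        // memory while the temporary file is written.
        File.WriteAllLines(tempPath, transform(File.ReadLines(path)));

        // Enumeration has completed, so the source is no longer open
        // and can be replaced.
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```

It could be called as FileRewrite.RewriteLines(path, lines => batchValuesIn(lines, "|", 5)); if something fails midway, the original file is left untouched until the delete step.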

DISCLAIMER: this may behave weirdly with malformed input strings.
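
One such case is the question's own sample: batchValuesIn counts values, and a line break inside a field turns one value into several, so rows 3 to 5 add up to seven values rather than five. An alternative is to count separators instead, as the question itself suggests. The sketch below assumes the first line is a complete header and that fragments should be glued with a caller-supplied joiner; CsvRepair, MergeBrokenRows and joiner are hypothetical names. It cannot detect a break that falls inside the last column, because the separator count is already complete at that point.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;

static class CsvRepair
{
    // Hypothetical helper: merges physical lines until the accumulated text
    // contains as many separators as the first (header) line.
    public static IEnumerable<string> MergeBrokenRows(
        IEnumerable<string> lines,
        char separator,
        string joiner = "")
    {
        int expected = -1;      // separator count of the header, set on first line
        int seen = 0;           // separators accumulated for the current row
        bool pending = false;   // true while a partial row sits in the buffer
        var buffer = new StringBuilder();

        foreach (var line in lines)
        {
            if (expected < 0)
                expected = line.Count(c => c == separator);

            // Glue a continuation fragment onto the partial row.
            if (pending)
                buffer.Append(joiner);

            buffer.Append(line);
            seen += line.Count(c => c == separator);
            pending = true;

            if (seen >= expected)
            {
                yield return buffer.ToString();
                buffer.Clear();
                seen = 0;
                pending = false;
            }
        }

        // Emit a trailing partial row so no input is silently dropped.
        if (pending)
            yield return buffer.ToString();
    }
}
```

Usage would look like var newLines = CsvRepair.MergeBrokenRows(File.ReadLines(path), '|'); passing "/" as the joiner gives the column2/column2/column2 form shown in the question.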

Upvotes: 1
