Fred James
Fred James

Reputation: 69

string.Join Linq query to merge two strings from an array and output as single comma delimited string

As part of a data cleansing exercise I need to correct the formatting of a csv file.

Due to poor formatting/lack of quotes an extra comma in a description field is breaking my DTS package.

So, to get around this I have created a simple C# script to find any line in the csv that contains more columns than the header row.

When the row contains more columns than the header I want to merge array item [10] and [11] into one column and then write the line to my new file - keeping all the other existing columns as they are.

Code:

    var columns = splitExpression.Split(line).Where(s => s != delimiter).ToArray();

    if (headers == null) headers = new string[columns.Length];

    if (columns.Length != headers.Length)
    {
        // TODO - Linq to write comma separated string but merge column 10 and 11 of the array
        // writer.WriteLine(string.Join(delimiter, columns));
    }
    else
    {
        writer.WriteLine(string.Join(delimiter, columns));
    }

Unfortunately, my Linq writing skills are somewhat lacking, can someone please help me fill in the TODO.

Upvotes: 2

Views: 372

Answers (1)

Sergey Berezovskiy
Sergey Berezovskiy

Reputation: 236208

Simply use for columns list instead of array. That will allow you to remove unnecessary columns after merge:

var columns = splitExpression.Split(line).Where(s => s != delimiter).ToList();

if (headers == null) headers = new string[columns.Count];

if (columns.Count != headers.Length)
{
    columns[10] = columns[10] + columns[11]; // combine columns here
    columns.RemoveAt(11);       
}

writer.WriteLine(string.Join(delimiter, columns));

Upvotes: 3

Related Questions