Kunjahamed P
Kunjahamed P

Reputation: 81

how to merge two csv files with different columns and rows in c#

I'm trying to merge two csv files which have different headers and different number of rows/lines. Using the following code, but doesn't get correct output. It's working when rows are same.

var first = File.ReadAllLines("firstfile.csv");
var second = File.ReadAllLines("secondfile.csv");
var result = first.Zip(second, (f, s) => string.Join(",", f, s));
File.WriteAllLines("combined.csv", result);

for ex: firstfile is

col1,colb,colc
a,b,c
a,v,f

the secondfile is

colx,coly
x,y
cc,aa
bb,vv
m,n

the output is get

col1,colb,colc,colx,coly
a,b,c,x,y
a,v,f,cc,aa

the second file rows are missiing. my expected output is

col1,colb,colc,colx,coly
a,b,c,x,y
a,v,f,cc,aa
,,,bb,vv
,,,m,n

Upvotes: 1

Views: 2645

Answers (2)

Jonathan Willcock
Jonathan Willcock

Reputation: 5255

There is no inbuilt method that allows you to merge two lists of unequal length. Zip only merges down to the shortest length. However, you can achieve what you want by modifying Marc Gravell's excellent answer here, in order to allow a default value. Create yourself an extensions class, something like this:

public static class Extensions
{
    public static IEnumerable<T> Merge<T>(this IEnumerable<T> first,
    IEnumerable<T> second, T defaultValue, Func<T, T, T> operation)
    {
        using (var iter1 = first.GetEnumerator())
        using (var iter2 = second.GetEnumerator())
        {
            while (iter1.MoveNext())
            {
                if (iter2.MoveNext())
                {
                    yield return operation(iter1.Current, iter2.Current);
                }
                else
                {
                    yield return operation(iter1.Current, defaultValue);
                }
            }
            while (iter2.MoveNext())
            {
                yield return operation(defaultValue, iter2.Current);
            }
        }
    }
}

You can now call it with code like this:

char separator = ',';
var first = File.ReadAllLines("firstfile.csv").AsEnumerable();
var second = File.ReadAllLines("secondfile.csv").AsEnumerable();

string defaultValue = "";
int cnt = 0;
if (first.Count() < second.Count())
{
    cnt = first.FirstOrDefault().Split(separator).Length;
}
else
{
    cnt = second.FirstOrDefault().Split(separator).Length;
}
defaultValue = defaultValue.PadLeft(cnt - 1, separator);
var result = first.Merge(second, defaultValue, (f, s) => string.Join(separator.ToString(), f, s));
File.WriteAllLines("combined.csv", result);

Note I have added a char separator and changed the result of ReadAllLines to give an IEnumerable<string> rather than string[] to make the code more generic. Also the above code assumes that the both files have an internally consistent number of columns.

Upvotes: 2

Longoon12000
Longoon12000

Reputation: 794

First you need to find out which of the two lists is the larger one so you can loop over that one and once you're past the length of the smaller list you can fill up the missing cells with empty values.

Next you need to know how many columns you have in the smaller list as you want to fill these columns with empty values. That means you have to take the header line of the smaller list, split it by comma and count the columns.

Then generate a string containing your empty cells (eg. if your smaller list has 3 columns, you need a string ",," - String Padding may be of help here).

So then you only have to loop over the larger list and get the two corresponding rows (or use the empty one you generated earlier) and concatenate them with a comma and put them in a list.

Upvotes: 0

Related Questions