Reputation: 83
using (StreamWriter writer = File.CreateText(FinishedFile))
{
int lineNum = 0;
while (lineNum < FilesLineCount.Min())
{
for (int i = 0; i <= FilesToMerge.Count() - 1; i++)
{
if (i != FilesToMerge.Count() - 1)
{
var CurrentFile = File.ReadLines(FilesToMerge[i]).Skip(lineNum).Take(1);
string CurrentLine = string.Join("", CurrentFile);
writer.Write(CurrentLine + ",");
}
else
{
var CurrentFile = File.ReadLines(FilesToMerge[i]).Skip(lineNum).Take(1);
string CurrentLine = string.Join("", CurrentFile);
writer.Write(CurrentLine + "\n");
}
}
lineNum++;
}
}
The current way i am doing this is just too slow. I am merging files that are each 50k+ lines long with various amounts of data.
for ex:
File 1
1
2
3
4
File 2
4
3
2
1
i need this to merge into being a third file
File 3
1,4
2,3
3,2
4,1
P.S. The user can pick as many files as they want from any locations.
Thanks for the help.
Upvotes: 2
Views: 345
Reputation: 109567
Here's another approach - this implementation only stores in memory one set of lines from each file simultaneously, thus reducing memory pressure significantly (if that is an issue).
public static void MergeFiles(string output, params string[] inputs)
{
var files = inputs.Select(File.ReadLines).Select(iter => iter.GetEnumerator()).ToArray();
StringBuilder line = new StringBuilder();
bool any;
using (var outFile = File.CreateText(output))
{
do
{
line.Clear();
any = false;
foreach (var iter in files)
{
if (!iter.MoveNext())
continue;
if (line.Length != 0)
line.Append(", ");
line.Append(iter.Current);
any = true;
}
if (any)
outFile.WriteLine(line.ToString());
}
while (any);
}
foreach (var iter in files)
{
iter.Dispose();
}
}
This also handles files of different lengths.
Upvotes: 1
Reputation: 460108
You approach is slow because of the Skip
and Take
in the loops.
You could use a dictionary to collect all line-index' lines:
string[] allFileLocationsToMerge = { "filepath1", "filepath2", "..." };
var mergedLists = new Dictionary<int, List<string>>();
foreach (string file in allFileLocationsToMerge)
{
string[] allLines = File.ReadAllLines(file);
for (int lineIndex = 0; lineIndex < allLines.Length; lineIndex++)
{
bool indexKnown = mergedLists.TryGetValue(lineIndex, out List<string> allLinesAtIndex);
if (!indexKnown)
allLinesAtIndex = new List<string>();
allLinesAtIndex.Add(allLines[lineIndex]);
mergedLists[lineIndex] = allLinesAtIndex;
}
}
IEnumerable<string> mergeLines = mergedLists.Values.Select(list => string.Join(",", list));
File.WriteAllLines("targetPath", mergeLines);
Upvotes: 6