Reputation: 527
I'm looking for different solutions, including those in which is forbidden to use .NET libraries
and those where I can use all of the advantages of them.
Here is the problem, I have two text files, textFile1
and textFile2
. Each of them contains sorted integer numbers(this is the most important condition), like these displayed below :
textFile1 textFile2
0 1
2 3
4 5
I need to create 3rd text file, for example textFile3
by merging those two files, and expected result should be :
textFile3
0
1
2
3
4
5
My first idea was to copy those two text files line by line into two separate arrays and than use solution for merging two sorted arrays in new one, provided
in this question.
After that, I will copy those members of new array into textFile3
, line by line.
Do you have any suggestion ? Maybe better approach ? Please write all of your ideas here, each of them will be helpful to me .
Upvotes: 0
Views: 819
Reputation: 205589
Merging two ordered sequences can easily be generalized and implemented as extension method like this:
public static class Algorithms
{
public static IEnumerable<T> MergeOrdered<T>(this IEnumerable<T> seq1, IEnumerable<T> seq2, IComparer<T> comparer = null)
{
if (comparer == null) comparer = Comparer<T>.Default;
using (var e1 = seq1.GetEnumerator())
using (var e2 = seq2.GetEnumerator())
{
bool more1 = e1.MoveNext(), more2 = e2.MoveNext();
while (more1 && more2)
{
int compare = comparer.Compare(e1.Current, e2.Current);
yield return compare < 0 ? e1.Current : e2.Current;
if (compare <= 0) more1 = e1.MoveNext();
if (compare >= 0) more2 = e2.MoveNext();
}
for (; more1; more1 = e1.MoveNext())
yield return e1.Current;
for (; more2; more2 = e2.MoveNext())
yield return e2.Current;
}
}
}
Then the concrete task can be accomplished simply with:
static void Merge(string inputFile1, string inputFile2, string outputFile)
{
Func<string, IEnumerable<KeyValuePair<int, string>>> readLines = file =>
File.ReadLines(file).Select(line =>
new KeyValuePair<int, string>(int.Parse(line), line));
var inputLines1 = readLines(inputFile1);
var inputLines2 = readLines(inputFile2);
var comparer = Comparer<KeyValuePair<int, string>>.Create(
(a, b) => a.Key.CompareTo(b.Key));
var outputLines = inputLines1.MergeOrdered(inputLines2, comparer)
.Select(item => item.Value);
File.WriteAllLines(outputFile, outputLines);
}
Upvotes: 2
Reputation: 133995
Merging two files is a fairly simple modification to merging two arrays. The idea is to replace the array index increment with a read of the next line of the file. For example, the standard merge algorithm that I show in my blog (http://blog.mischel.com/2014/10/24/merging-sorted-sequences/) is:
while (not end of List A and not end of List B)
if (List A current item <= List B current item)
output List A current item
advance List A index
else
output List B current item
advance List B index
// At this point, one of the lists is empty.
// Output remaining items from the other
while (not end of List A)
output List A current item
advance List A index
while (not end of List B)
output List B current item
advance List B index
To make that merge files, you start by opening and reading the first line of each file. It gets kind of screwy, though, because you have to check for end of file. "Get the next line" is a bit ... odd.
int item1;
int item2;
bool eof1 = false;
bool eof2 = false;
string temp;
var file1 = File.OpenText(textFile1);
temp = file1.ReadLine();
if (temp == null)
eof1 = true;
else
item1 = int.Parse(temp);
// do the same thing for file2
Then we can do the standard merge:
while (!eof1 && !eof2)
{
if (item1 <= item2)
{
outputFile.WriteLine(item1);
// get next item from file1
temp = file1.ReadLine();
if (temp == null)
eof1 = true;
else
item1 = int.Parse(temp);
}
else
{
// output item2 and get next line from file2
}
}
// and the cleanup
while (!eof1)
{
// output item1, and get next line from file1
}
while (!eof2)
{
// output item2, and get next file from file2
}
The only thing different is that getting the next item is more involved than just incrementing an array index.
Upvotes: 2
Reputation: 13745
They are both sorted lists and to avoid memory consumption, open a reader to both files. Read two line from both, compare ahead, write the sorted results and take action based on the current line of each file. E.g:Treat your sorted value in each file as a pointer and keep comparing and advancing from the lesser side until completion. This will ensure a small memory footprint that will perform better for large files than for smaller ones.
You can pinch an algorithm off of the web, here is one and another that even mentions 0(1). Ignore the fact it talks about arrays, your files are effectively sorted arrays so you don't need to duplicate that in memory.
Upvotes: 1