Reputation: 476
In .NET/C#, I need to parse a large file that should not be loaded into memory all at once. Is there an efficient technique to read it line by line and process each line, while keeping the last n lines read in memory so that I can go back and forth through them? Which collection would be best suited for this?
Upvotes: 0
Views: 99
Reputation: 7855
For this you'll need a custom collection type: a fixed-size array that you can keep adding to, which discards the oldest entries instead of resizing. I came up with this after trying for a few minutes. It is extremely rough, has no validation, and may contain logic errors in some cases, but it seems to work. (I'm also not happy with the class name, so if you have a better idea, tell me.)
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

class SizeLimitedList<T> : IEnumerable<T>
{
    private readonly T[] _internalArray;
    private readonly int _capacity;

    public SizeLimitedList(int capacity)
    {
        _internalArray = new T[capacity];
        _capacity = capacity;
    }

    public SizeLimitedList(IEnumerable<T> collection)
    {
        _internalArray = collection.ToArray();
        _capacity = _internalArray.Length;
    }

    // Drops the oldest entry and stores the new item in the last slot.
    public void Add(T item)
    {
        MoveArray(1);
        _internalArray[_capacity - 1] = item;
    }

    // Returns the n most recent entries, oldest first.
    // Slots that have not been filled yet hold default(T).
    public T[] GetLastEntries(int n)
    {
        return _internalArray.Skip(_internalArray.Length - n).ToArray();
    }

    public T GetLastEntry()
    {
        return _internalArray[_internalArray.Length - 1];
    }

    // Shifts every entry "by" slots toward the front; the first "by" entries are lost.
    private void MoveArray(int by)
    {
        Array.Copy(_internalArray, by, _internalArray, 0, _capacity - by);
    }

    public IEnumerator<T> GetEnumerator()
    {
        return _internalArray.AsEnumerable().GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
You can use it like so:
var list = new SizeLimitedList<string>(maxLinesKept);
string line;
using (var file = new StreamReader(@"C:\My\Path\To\File.txt"))
{
    while ((line = file.ReadLine()) != null)
    {
        list.Add(line);
        if (/* Condition that requires you to read the last n lines */)
        {
            var lines = list.GetLastEntries(nLinesToGet);
            // Do whatever with these last lines
        }
    }
}
This is a little abstract, so here is an example. Let's say you want to print a line to the console, but only when the previous line is exactly "Print Next Line:". Take a file example.txt with this content:
Print Next Line:
Hello
This line will not be printed
Print Next Line:
World
So now let's implement it:
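// A capacity of 1 is enough: only the previous line is needed.
var list = new SizeLimitedList<string>(1);
string line;
using (var file = new StreamReader("example.txt"))
{
    while ((line = file.ReadLine()) != null)
    {
        // GetLastEntry() is null on the first pass, so nothing is printed then.
        if (list.GetLastEntry() == "Print Next Line:")
            Console.WriteLine(line);
        list.Add(line);
    }
}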
This will print the following to the console:
Hello
World
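One cost worth knowing about: Add in the class above shifts the whole array on every call, so each line read costs time proportional to the capacity. If the window is large, a ring buffer avoids the copy by overwriting the oldest slot in place. The following is only a minimal sketch of that idea; the name RingBuffer and its indexer are my own choices for illustration, not part of any library.

```csharp
using System;

// Hypothetical fixed-capacity ring buffer: once full, Add overwrites the oldest item.
public class RingBuffer<T>
{
    private readonly T[] _items;
    private int _start;   // index of the oldest retained item
    private int _count;   // number of items currently stored

    public RingBuffer(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _items = new T[capacity];
    }

    public int Count => _count;

    // Constant-time add: no elements are moved.
    public void Add(T item)
    {
        int end = (_start + _count) % _items.Length;
        _items[end] = item;
        if (_count == _items.Length)
            _start = (_start + 1) % _items.Length; // full: the oldest item was just overwritten
        else
            _count++;
    }

    // Index 0 is the oldest retained item, Count - 1 is the newest.
    public T this[int index]
    {
        get
        {
            if (index < 0 || index >= _count) throw new ArgumentOutOfRangeException(nameof(index));
            return _items[(_start + index) % _items.Length];
        }
    }
}
```

With this, buffer[buffer.Count - 1] is the most recent line and buffer[buffer.Count - 2] the one before it, so you can step back and forth through the window by index.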
P.S. Feel free to leave a comment or update your original question with a sample of your file and the condition under which you need to read the last n lines, and I can update my example to match your use case.
Upvotes: 2
Reputation: 179
Well, you can use the BackwardReader class, which can be found here. I don't know exactly whether it will be helpful, because I don't know how you want to process the previous lines before you get to the last N. Anyway, you can use this class to read the file backwards, save the first N lines it returns (which are the last N lines of the file), and then process the other lines.
public static void ReadFile(int n, string logFile)
{
    int lineCnt = 0;
    List<string> lastNLines = new List<string>();
    BackwardReader br = new BackwardReader(logFile);
    while (!br.SOF())
    {
        string line = br.Readline();
        if (lineCnt < n) lastNLines.Add(line);
        // else your implementation for other lines
        lineCnt++;
    }
}
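If BackwardReader is not an option, the last N lines can also be collected in a single forward pass using only framework types. This is a rough sketch under that assumption: TailReader and ReadLastLines are hypothetical names, File.ReadLines streams the file lazily instead of loading it whole, and a Queue<string> holds the sliding window. Unlike the backward approach, it has to read the entire file.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of BackwardReader or the framework.
public static class TailReader
{
    // Returns the last n lines of the file, oldest first.
    public static List<string> ReadLastLines(string path, int n)
    {
        if (n <= 0) return new List<string>();

        var window = new Queue<string>(n);
        foreach (string line in File.ReadLines(path))
        {
            // Keep at most n lines: drop the oldest before adding a new one.
            if (window.Count == n) window.Dequeue();
            window.Enqueue(line);
        }
        return new List<string>(window);
    }
}
```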
Upvotes: 0