Reputation: 2069
I need to read a very large text file (~2 GB), but only from a specific line number to the end of the file.
I cannot load the whole file into memory due to performance issues, so I have used StreamReader. However, I noticed that there is no easy way to start reading from a specific line number. Instead, I start reading from line 1 and ignore every line until I reach the desired line number.
Is this the correct approach? This is what I have tried. Is there a better way to achieve this?
    static string ReadLogFileFromSpecificLine(int LineNumber)
    {
        string content = null;
        using (StreamReader sr = new StreamReader(LogFilePath))
        {
            sr.ReadLine();
            int currentLineNumber = 0;
            string line;
            while ((line = sr.ReadLine()) != null)
            {
                currentLineNumber++;
                if (currentLineNumber >= LineNumber - 1)
                {
                    content += line + "\n";
                }
            }
        }
        return content;
    }
Upvotes: 0
Views: 866
Reputation: 460340
Yes, using a StreamReader is the way to go (a MemoryMappedFile is overkill in this case). But you can simplify it by using File.ReadLines, which hides the StreamReader and does not read all lines into memory (as opposed to File.ReadAllLines). You should also use a StringBuilder:
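    // Assuming LineNumber is 1-based: skip the lines before it so that line LineNumber itself is kept.
    IEnumerable<string> lines = File.ReadLines(LogFilePath).Skip(LineNumber - 1);
    string content = new StringBuilder().AppendJoin(Environment.NewLine, lines).ToString();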
Upvotes: 3
Reputation: 67345
This is the correct approach. Because lines vary in length, there is no way to compute the offset of a particular line and seek to it without reading the lines that precede it.
You may be able to squeeze a little more performance out of it by having two loops. Once the starting line is reached, you could move to the second loop, which wouldn't need to check the line number. But that would improve performance only minimally at best.
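A minimal sketch of that two-loop idea, assuming 1-based line numbers. The names LogReader, ReadFromLine, path, and startLine are hypothetical and chosen only for illustration:

```csharp
using System;
using System.IO;
using System.Text;

static class LogReader
{
    // Hypothetical helper: returns everything from the 1-based line
    // `startLine` to the end of the file, with "\n" after each line.
    public static string ReadFromLine(string path, int startLine)
    {
        var content = new StringBuilder();
        using (var reader = new StreamReader(path))
        {
            // First loop: discard the lines before startLine.
            for (int i = 1; i < startLine; i++)
            {
                if (reader.ReadLine() == null)
                    return string.Empty; // the file has fewer lines than startLine
            }

            // Second loop: every remaining line is wanted, so no line-number check.
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                content.Append(line).Append('\n');
            }
        }
        return content.ToString();
    }
}
```

The preceding lines still have to be read from disk either way, so any gain from dropping the comparison is likely to be small.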
In addition, I'm not sure what you need to do with those lines. You can process one line at a time and avoid loading all of them into memory at once, or you could use a List<string> to build up the list. Or, if you want all the lines in a single string, use ReadToEnd(), as @Fildor recommended. Do not concatenate the lines as you are doing; that is extremely inefficient.
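As a rough illustration of two of these options, here is a sketch that assumes 1-based line numbers. TailReader, LinesFrom, and TextFrom are hypothetical names, not part of any library:

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class TailReader
{
    // Lazily yields lines from the 1-based `startLine` onward, so the
    // caller can process one line at a time without holding them all.
    public static IEnumerable<string> LinesFrom(string path, int startLine)
        => File.ReadLines(path).Skip(startLine - 1);

    // Skips the leading lines, then returns the rest of the file as a
    // single string via ReadToEnd (original line endings are preserved).
    public static string TextFrom(string path, int startLine)
    {
        using (var reader = new StreamReader(path))
        {
            for (int i = 1; i < startLine; i++)
            {
                if (reader.ReadLine() == null)
                    return string.Empty; // fewer lines than startLine
            }
            return reader.ReadToEnd();
        }
    }
}
```

Note that TextFrom still materializes the remainder of the file in memory, so for a file of this size the lazy LinesFrom variant is probably the safer default unless a single string is really required.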
Upvotes: 4