Reputation: 14263
I have to read a log file line by line. It's about 6 MB in size and 40,000 lines in total. After testing my program, I discovered that the log file is delimited by LF characters only, so I assumed I can't use the ReadLine method of the StreamReader class.
How can I fix this problem?
Edit: I tried to use TextReader, but my program still didn't work:
using (TextReader sr = new StreamReader(strPath, Encoding.Unicode))
{
    sr.ReadLine(); //ignore three first lines of log file
    sr.ReadLine();
    sr.ReadLine();
    int count = 0; //number of read line
    string strLine;
    while (sr.Peek()!=0)
    {
        strLine = sr.ReadLine();
        if (strLine.Trim() != "")
        {
            InsertData(strLine);
            count++;
        }
    }
    return count;
}
Upvotes: 3
Views: 7424
Reputation: 75673
I'd have guessed that LF (\n) alone would be fine, whereas CR-only (\r) might cause problems.
You could read each line a character at a time and process it when you read the terminator.
If profiling shows this is too slow, you could buffer on the application side by reading blocks into an array with read([]). But try simple character-at-a-time reading first!
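A minimal sketch of the character-at-a-time approach, assuming '\n' is the only terminator in the file. The class and method names (CharAtATime, ReadLfLines) are made up for illustration and are not part of any library:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class CharAtATime
{
    // Hypothetical helper: yields each line, treating '\n' as the only terminator.
    public static IEnumerable<string> ReadLfLines(TextReader reader)
    {
        var current = new StringBuilder();
        int c;
        // Read() returns -1 once the input is exhausted.
        while ((c = reader.Read()) != -1)
        {
            if (c == '\n')
            {
                // Terminator reached: hand the finished line to the caller.
                yield return current.ToString();
                current.Clear();
            }
            else
            {
                current.Append((char)c);
            }
        }
        // Emit a final line that has no trailing terminator.
        if (current.Length > 0)
        {
            yield return current.ToString();
        }
    }
}
```

It could be driven with something like foreach (string line in CharAtATime.ReadLfLines(new StreamReader(strPath, Encoding.Unicode))), where strPath is the path variable from the question. StreamReader buffers internally, so per-character Read() calls do not each hit the disk.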
Upvotes: 0
Reputation: 99979
Does File.ReadAllLines(fileName) not correctly load files with LF line ends? Use this if you need the whole file - I saw a site indicating it's slower than another method, but it's not if you pass the correct Encoding to it (default is UTF-8), plus it's as clean as you can get.
Edit: It does. And if you need streaming, TextReader.ReadLine() correctly handles Unix line ends as well.
Edit again: So does StreamReader. Did you just check the documentation and assume it won't handle LF line ends? I'm looking in Reflector and it sure seems like a proper handling routine.
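If ReadLine() is fine with LF, one likely cause of the failure is the loop condition in the question: Peek() returns -1 at the end of input, not 0, so sr.Peek() != 0 stays true at the end of the file, ReadLine() returns null, and strLine.Trim() throws. Below is a sketch of the same loop driven by ReadLine()'s null return instead. The names LfReadLineDemo, CountNonBlankLines and CountFromLfOnlyStream are hypothetical, and the in-memory sample data stands in for the real log file:

```csharp
using System;
using System.IO;
using System.Text;

public static class LfReadLineDemo
{
    // Counts non-blank lines after skipping a fixed number of header lines.
    public static int CountNonBlankLines(TextReader reader, int headerLines)
    {
        for (int i = 0; i < headerLines; i++)
        {
            reader.ReadLine();
        }

        int count = 0;
        string line;
        // ReadLine() returns null at end of input, so no Peek() is needed.
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim().Length != 0)
            {
                count++; // the question's InsertData(line) call would go here
            }
        }
        return count;
    }

    // Feeds LF-only, UTF-16 encoded sample text through a StreamReader.
    public static int CountFromLfOnlyStream()
    {
        byte[] bytes = Encoding.Unicode.GetBytes("h1\nh2\nh3\nfirst\n\nsecond\n");
        using (var reader = new StreamReader(new MemoryStream(bytes), Encoding.Unicode))
        {
            return CountNonBlankLines(reader, 3);
        }
    }
}
```

With that sample, three header lines are skipped and the two non-blank lines are counted, even though the text contains no \r at all.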
Upvotes: 4
Reputation: 1502935
TextReader.ReadLine already handles lines terminated just by \n.
From the docs:
A line is defined as a sequence of characters followed by a carriage return (0x000d), a line feed (0x000a), a carriage return followed by a line feed, Environment.NewLine, or the end of stream marker. The string that is returned does not contain the terminating carriage return and/or line feed. The returned value is a null reference (Nothing in Visual Basic) if the end of the input stream has been reached.
So basically, you should be fine. (I've talked about TextReader rather than StreamReader because that's where the method is declared - obviously it will still work with a StreamReader.)
If you want to iterate through lines easily (and potentially use LINQ against the log file) you may find my LineReader class in MiscUtil useful. It basically wraps calls to ReadLine() in an iterator. So for instance, you can do:
var query = from file in Directory.GetFiles("logs")
            from line in new LineReader(file)
            where !line.StartsWith("DEBUG")
            select line;

foreach (string line in query)
{
    // ...
}
All streaming :)
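For readers without MiscUtil, the idea of wrapping ReadLine() in an iterator can be sketched roughly as follows. This is not the actual LineReader source; LineIteration and ReadLines are hypothetical names, and taking a Func<TextReader> is just one way to defer opening the reader until enumeration starts:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class LineIteration
{
    // Hypothetical stand-in for the idea behind LineReader, not MiscUtil's code.
    public static IEnumerable<string> ReadLines(Func<TextReader> openReader)
    {
        // The reader is opened lazily, on the first MoveNext(), and disposed
        // when enumeration finishes or the enumerator is disposed.
        using (TextReader reader = openReader())
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }
}
```

A call such as LineIteration.ReadLines(() => new StreamReader(file)) could then take the place of new LineReader(file) in a query like the one above, and lines would still be pulled one at a time rather than loaded up front.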
Upvotes: 10