Reputation: 985
I am currently working on a method that takes in a text file and will reduce the file down to ~10 MB. This method is used to truncate log files and keep them within a 10 MB limit.
The logic behind the code is basically this: if the file is 250 MB or bigger, read bytes until the array reaches 250 MB, store them in a StringBuilder, set the position for the next read, and repeat until the StringBuilder contains ~10 MB of data. Then write that out to the file, erasing everything else and leaving only the 10 MB of most recent writes.
To prevent cutting lines in half, it checks to see where the last CrLf is and then writes out all the data from that point forward.
My problem is that I can't get the seek to position itself correctly after the first read. The first read returns the right data, but when I use the position from that read for the next iteration, it "ignores" the position and reads from the beginning of the file again.
If logFile.Length > (1024 * 1024 * 250) Then
    Dim DataToDelete As Integer = logFile.Length - (1024 * 1024 * 250)
    Dim ArrayIndex As Integer = 0
    While DataToDelete > 0
        Using fs As FileStream = New FileStream(logFile.FullName, FileMode.Open, FileAccess.ReadWrite)
            fs.Seek(ArrayIndex, SeekOrigin.Begin)
            If strBuilder.Length < (1024 * 1024 * 250) Then
                Dim bytes() As Byte = New Byte((1024 * 1024 * 250)) {}
                Dim n As Integer = fs.Read(bytes, 0, (1024 * 1024 * 250))
                ArrayIndex = bytes.Length
                Dim enc As Encoding = Encoding.UTF8
                strBuilder.Append(enc.GetString(bytes))
            Else
                If DataToDelete - strBuilder.Length < 0 And strBuilder.Length > (1024 * 1024 * My.Settings.Threshold) Then
                    Dim DataToCut As Integer = strBuilder.Length - (1024 * 1024 * My.Settings.Threshold)
                    While Not (strBuilder.Chars(DataToCut).ToString.Equals(vbCr)) And DataToCut <> 0
                        DataToCut -= 1
                    End While
                    strBuilder.Remove(0, DataToCut)
                    File.WriteAllText(logFile.FullName, strBuilder.ToString)
                Else
                    DataToDelete -= strBuilder.Length
                    strBuilder.Clear()
                End If
            End If
        End Using
    End While
End If
Upvotes: 4
Views: 3715
Reputation: 985
This is my end result, and it works like a charm!
Dim Maxsize As Integer = (1024 * 1024 * My.Settings.Threshold)
For Each logfile In filesToTrim
    Dim sb As New StringBuilder
    Dim buffer As String = String.Empty
    If logfile.Length > Maxsize Then
        Using reader As New StreamReader(logfile.FullName)
            reader.BaseStream.Seek(-Maxsize, SeekOrigin.End)
            buffer = reader.ReadToEnd()
            sb.Append(buffer)
        End Using
        Dim Midpoint As Integer = 0
        While Not (sb.Chars(Midpoint).ToString.Equals(vbCr)) And Midpoint <> sb.Length - 1
            Midpoint += 1
        End While
        sb.Remove(0, Midpoint)
        File.WriteAllText(logfile.FullName, sb.ToString)
    End If
Next
Upvotes: 0
Reputation: 43743
For what you are doing, it's unnecessary, and really not a great idea, to load the entire file into memory. It would be much better to just read the portion of the log file you intend to keep (the last 10MB). For instance, it would be much simpler and more efficient to do something like this:
Private Sub ShrinkLog(ByVal filePath As String, ByVal maxSize As Integer)
    Dim buffer As String
    If New FileInfo(filePath).Length > maxSize Then
        Using reader As New StreamReader(filePath)
            reader.BaseStream.Seek(-maxSize, SeekOrigin.End)
            buffer = reader.ReadToEnd()
        End Using
        File.WriteAllText(filePath, buffer)
    End If
End Sub
There are other ways to do this, too. It would be even more efficient, if you were going to be keeping a larger portion of the file, to not even load all of that into memory, but to simply go directly from one stream into another. Also, this simple example doesn't show how you could avoid chopping a line off part way in the file, but I'm sure you could keep seeking one byte at a time until you found the first line break.
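As a rough sketch of those two ideas combined (copying stream to stream, and skipping ahead to the first line break), something like the following might work. The name ShrinkLogStreaming and the ".tmp" temporary file are my own placeholders, not anything from your code, and I'm assuming the log is ASCII or UTF-8 so that a byte value of 10 is always a line feed. I haven't run this against your files, so treat it as a starting point:

```vbnet
Imports System.IO

Module LogTrimmer
    ' Hypothetical helper: keeps roughly the last maxSize bytes of the file,
    ' starting at the first complete line inside that window.
    Public Sub ShrinkLogStreaming(ByVal filePath As String, ByVal maxSize As Long)
        ' Nothing to do if the file is already within the limit
        If New FileInfo(filePath).Length <= maxSize Then Return

        Dim tempPath As String = filePath & ".tmp" ' assumed temp-file name

        Using source As New FileStream(filePath, FileMode.Open, FileAccess.Read)
            ' Jump to the start of the portion we want to keep
            source.Seek(-maxSize, SeekOrigin.End)

            ' Read one byte at a time until just past the first line feed (byte 10),
            ' so the kept portion starts on a line boundary
            Dim b As Integer = source.ReadByte()
            While b <> -1 AndAlso b <> 10
                b = source.ReadByte()
            End While

            ' Copy the remainder straight into the temp file without
            ' building one large string in memory
            Using target As New FileStream(tempPath, FileMode.Create, FileAccess.Write)
                source.CopyTo(target)
            End Using
        End Using

        ' Swap the trimmed copy in place of the original
        File.Delete(filePath)
        File.Move(tempPath, filePath)
    End Sub
End Module
```

Note that if there is no line feed anywhere in the last maxSize bytes, this leaves an empty file, so you may want to handle that case differently. Stream.CopyTo needs .NET Framework 4 or later; on older versions you would loop over Read and Write with a small byte buffer instead.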
Upvotes: 1