user1732364

Reputation: 985

Setting Position/Index of FileStream.Seek to retrieve "blocks" of data VB.NET

I am currently working on a method that takes in a text file and will reduce the file down to ~10 MB. This method is used to truncate log files and keep them within a 10 MB limit.

The logic behind the code is basically this: if the file is 250 MB or larger, read bytes until the array holds 250 MB. Store that in a StringBuilder, set the position for the next read, and repeat until the StringBuilder contains ~10 MB of data. Then overwrite the file so that only the most recent 10 MB of writes remain.

To prevent cutting lines in half, it checks to see where the last CrLf is and then writes out all the data from that point forward.

My problem is that I can't get the seek to position itself correctly after the first read. It reads the data correctly on the first pass, but when I use the position from the previous read for the next iteration, it "ignores" the position and reads from the beginning of the file again.

If logFile.Length > (1024 * 1024 * 250) Then
    Dim DataToDelete As Integer = logFile.Length - (1024 * 1024 * 250)
    Dim ArrayIndex As Integer = 0
    While DataToDelete > 0
        Using fs As FileStream = New FileStream(logFile.FullName, FileMode.Open, FileAccess.ReadWrite)
            fs.Seek(ArrayIndex, SeekOrigin.Begin)
            If strBuilder.Length < (1024 * 1024 * 250) Then
                Dim bytes() As Byte = New Byte((1024 * 1024 * 250)) {}
                Dim n As Integer = fs.Read(bytes, 0, (1024 * 1024 * 250))
                ArrayIndex = bytes.Length
                Dim enc As Encoding = Encoding.UTF8
                strBuilder.Append(enc.GetString(bytes))
            Else
                If DataToDelete - strBuilder.Length < 0 And strBuilder.Length > (1024 * 1024 * My.Settings.Threshold) Then
                    Dim DataToCut As Integer = strBuilder.Length - (1024 * 1024 * My.Settings.Threshold)
                    While Not (strBuilder.Chars(DataToCut).ToString.Equals(vbCr)) And DataToCut <> 0
                        DataToCut -= 1
                    End While
                    strBuilder.Remove(0, DataToCut)
                    File.WriteAllText(logFile.FullName, strBuilder.ToString)
                Else
                    DataToDelete -= strBuilder.Length
                    strBuilder.Clear()
                End If
            End If
        End Using
    End While
End If

Upvotes: 4

Views: 3715

Answers (2)

user1732364

Reputation: 985

This is my end result; it works like a charm!

Dim Maxsize As Integer = (1024 * 1024 * My.Settings.Threshold)
For Each logfile In filesToTrim
    If logfile.Length > Maxsize Then
        Dim sb As New StringBuilder
        Using reader As New StreamReader(logfile.FullName)
            ' Jump to the last Maxsize bytes and read only that tail.
            reader.BaseStream.Seek(-Maxsize, SeekOrigin.End)
            sb.Append(reader.ReadToEnd())
        End Using
        ' Walk forward to the first carriage return so no partial line is kept.
        Dim Midpoint As Integer = 0
        While Midpoint < sb.Length - 1 AndAlso sb.Chars(Midpoint) <> ControlChars.Cr
            Midpoint += 1
        End While
        sb.Remove(0, Midpoint)
        File.WriteAllText(logfile.FullName, sb.ToString)
    End If
Next

Upvotes: 0

Steven Doggart

Reputation: 43743

For what you are doing, it's unnecessary, and really not a great idea, to load the entire file into memory. It would be much better to just read the portion of the log file you intend to keep (the last 10MB). For instance, it would be much simpler and more efficient to do something like this:

Private Sub ShrinkLog(ByVal filePath As String, ByVal maxSize As Integer)
    Dim buffer As String
    If New FileInfo(filePath).Length > maxSize Then
        Using reader As New StreamReader(filePath)
            reader.BaseStream.Seek(-maxSize, SeekOrigin.End)
            buffer = reader.ReadToEnd()
        End Using
        File.WriteAllText(filePath, buffer)
    End If
End Sub

There are other ways to do this, too. If you were going to keep a larger portion of the file, it would be even more efficient not to load all of that into memory, but simply to copy directly from one stream into another. Also, this simple example doesn't show how to avoid cutting a line off partway through, but I'm sure you could keep seeking one byte at a time until you found the first line break.
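
As a rough illustration of those two ideas combined, here is a minimal sketch rather than a definitive implementation. The names `LogTrimmer`, `ShrinkLogAtLineBreak`, and the `.tmp` scratch file are hypothetical and not from the code above. It assumes the log is UTF-8 or ASCII, where the line-feed byte (10) never occurs inside a multi-byte character, and it assumes nothing else is writing to the file while it runs.

```vbnet
Imports System.IO

Module LogTrimmer
    ' Hypothetical helper: keeps roughly the last maxSize bytes of the file,
    ' starting at the first complete line inside that window.
    Public Sub ShrinkLogAtLineBreak(ByVal filePath As String, ByVal maxSize As Long)
        If New FileInfo(filePath).Length <= maxSize Then Return

        Dim tempPath As String = filePath & ".tmp" ' assumed scratch file name
        Using source As New FileStream(filePath, FileMode.Open, FileAccess.Read)
            source.Seek(-maxSize, SeekOrigin.End)

            ' Read one byte at a time until just past the first line feed (byte 10).
            ' This also handles CrLf endings, because the LF comes last.
            Dim b As Integer = source.ReadByte()
            While b <> -1 AndAlso b <> 10
                b = source.ReadByte()
            End While

            ' Copy the rest straight into the scratch file, stream to stream,
            ' without building a string in memory.
            Using target As New FileStream(tempPath, FileMode.Create, FileAccess.Write)
                source.CopyTo(target)
            End Using
        End Using

        ' Replace the original log with the trimmed copy.
        File.Delete(filePath)
        File.Move(tempPath, filePath)
    End Sub
End Module
```

A call such as `LogTrimmer.ShrinkLogAtLineBreak("C:\logs\app.log", 10 * 1024 * 1024)` (the path is only an example) would keep about the last 10 MB. Note that if the retained window contains no line feed at all, this sketch leaves an empty file, so a real version may want to handle that case differently.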

Upvotes: 1
