drewag

Reputation: 94733

Reading data from file handle leaks memory on Linux

I am experiencing a memory leak when reading data from files. This code creates the leak:

import Foundation

func read() throws {
    let url = URL(fileURLWithPath: "content.pdf")
    let fileHandle = try FileHandle(forReadingFrom: url)
    while true {
        let chunk = fileHandle.readData(ofLength: 256)
        guard !chunk.isEmpty else {
            break
        }
    }
    print("read")
}

do {
    for _ in 0 ..< 10000 {
        try read()
    }
}
catch {
    print("Error: \(error)")
}

Note: to run this code you need a "content.pdf" file in your working directory.

If I run this on Linux with Swift 3.1.1 (or 3.1), it gets through a number of iterations of the loop, consuming more and more memory, until the process is killed.

On Mac this also happens, because the data is put into the autorelease pool. There I can fix the memory issue by wrapping each iteration in an autorelease pool, but that does not exist on Linux, so I don't know how to free that memory there. Does anyone have an idea?
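For reference, the Mac-side fix described above can be sketched roughly as follows. The helper name withPoolIfAvailable is hypothetical and not part of the original code; it only shows how an autorelease pool can wrap each iteration on Apple platforms while compiling to a plain call on Linux. It does not address the Linux leak itself.

```swift
import Foundation

// Hypothetical helper (not from the original post): runs `body` inside an
// autorelease pool where one exists, and calls it directly elsewhere.
func withPoolIfAvailable<T>(_ body: () throws -> T) rethrows -> T {
    #if os(macOS) || os(iOS) || os(tvOS) || os(watchOS)
        // Autoreleased objects created by `body` are released when the pool drains.
        return try autoreleasepool(invoking: body)
    #else
        // There is no autorelease pool on Linux, so just run the closure.
        return try body()
    #endif
}

// Usage with the read() function from the question:
//     for _ in 0 ..< 10000 {
//         try withPoolIfAvailable { try read() }
//     }
```

On Linux the helper adds nothing, which is exactly the gap the question is asking about.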

Upvotes: 1

Views: 676

Answers (2)

drewag

Reputation: 94733

I found the problem: it is in the Linux implementation of Foundation. There is already a bug report open for it. The readData(ofLength:) method returns a Data object that does not free its underlying buffer when it is deallocated.

For now, I am using this workaround:

extension FileHandle {
    public func safelyReadData(ofLength length: Int) -> Data {
        #if os(Linux)
            // On Linux, the Data returned here never frees its buffer.
            var leakingData = self.readData(ofLength: length)
            var data: Data = Data()
            if leakingData.count > 0 {
                leakingData.withUnsafeMutableBytes({ (bytes: UnsafeMutablePointer<UInt8>) -> Void in
                    // Wrap the same bytes (no copy) in a Data that calls free() on deallocation.
                    data = Data(bytesNoCopy: bytes, count: leakingData.count, deallocator: .free)
                })
            }
            return data
        #else
            // Other platforms release the buffer correctly.
            return self.readData(ofLength: length)
        #endif
    }
}

Anywhere I was previously using readData(ofLength:), I am now using my safelyReadData(ofLength:) method. On all platforms other than Linux it simply calls the original, because those implementations are fine. On Linux I wrap the same bytes, without copying them, in a second Data whose deallocator actually frees the underlying buffer when it is deallocated.
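The workaround relies on the ownership behavior of Data(bytesNoCopy:count:deallocator:). Here is a minimal standalone sketch of that behavior, assuming a buffer allocated with malloc; the function name makeOwnedData is hypothetical and is not part of the workaround above.

```swift
import Foundation

// Hypothetical illustration: builds a Data that takes ownership of a
// malloc'ed buffer instead of copying it.
func makeOwnedData(count: Int) -> Data {
    guard count > 0, let raw = malloc(count) else {
        return Data()
    }
    // Fill the buffer with 0, 1, 2, ... so the contents are recognizable.
    let bytes = raw.bindMemory(to: UInt8.self, capacity: count)
    for i in 0 ..< count {
        bytes[i] = UInt8(i % 256)
    }
    // No copy is made; the .free deallocator calls free(raw) when the
    // Data's storage is released.
    return Data(bytesNoCopy: raw, count: count, deallocator: .free)
}
```

Because the new Data frees the pointer itself, this pattern is only safe when nothing else will free the same buffer, which is the situation the leaking Linux implementation creates.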

Upvotes: 3

Price Ringo

Reputation: 3440

Instead of asking how to work around the missing autorelease pool, a better question is how to prevent the leak. Maybe creating (and not closing) 10,000 FileHandles is the problem. Try this.

func read() throws {
    let url = URL(fileURLWithPath: "content.pdf")
    let fileHandle = try FileHandle(forReadingFrom: url)
    while true {
        let chunk = fileHandle.readData(ofLength: 256)
        guard !chunk.isEmpty else {
            break
        }
    }
    fileHandle.closeFile()
    print("read")
}

This may not be the problem, but it is still good code hygiene. How many iterations run before the crash?
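A slightly more defensive variant of the same idea, as a sketch: defer closes the handle on every exit path, including early returns added later. The function name readAndClose, its path parameter, and the byte count it returns are assumptions made for illustration and are not in the original code.

```swift
import Foundation

// Hypothetical variant of read(): reads the file in 256-byte chunks,
// returns the total number of bytes read, and always closes the handle.
func readAndClose(path: String) throws -> Int {
    let fileHandle = try FileHandle(forReadingFrom: URL(fileURLWithPath: path))
    // Runs when the function exits, however it exits.
    defer { fileHandle.closeFile() }

    var total = 0
    while true {
        let chunk = fileHandle.readData(ofLength: 256)
        guard !chunk.isEmpty else {
            break
        }
        total += chunk.count
    }
    return total
}
```

Returning the byte count just makes it easy to check that the whole file was read.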

Upvotes: 0
