David

Reputation: 267

Mule SFTP to File flow occasionally produces a corrupted pgp file

We are on Mule 3.2.1 and have run into a strange problem. We use the SFTP component to fetch PGP-encrypted report files from a server. The SFTP component archives each file, and a file:endpoint then writes a working copy for further processing.

Every once in a while, the working copy ends up corrupted while the SFTP-archived copy is fine. In a hex editor, the corrupted file starts with good bytes and then, at some point, contains nothing but null bytes for the remainder of the file. It looks as if the underlying file was deleted while Mule was still copying it.

To add to the confusion, we have downloaded a failed file a second time and had everything work, which leads me to believe that the file contents are not the problem. On the other hand, there is apparently one file that seems to fail consistently. All of this is happening on production servers, with files that I have no access to.

Without knowing the inner workings of Mule, I have no idea what conditions could create this problem.

Are there any smart folks out there familiar enough with the inner workings of Mule to venture a guess?

Also, we are not Mule experts and would welcome any critique of our Mule configuration. (The config below is a modified version of what runs in production; among other changes, it polls more frequently.)

<sftp:connector name="SftpConnector" validateConnections="true" autoDelete="true">
    <file:expression-filename-parser />
</sftp:connector>

<file:connector name="FileConnector" pollingFrequency="1000" fileAge="1000" streaming="false"
    autoDelete="false">
    <service-overrides messageFactory="org.mule.transport.file.FileMuleMessageFactory" />
    <file:expression-filename-parser />
</file:connector>

<sftp:endpoint name="SftpEndpoint" connector-ref="SftpConnector" host="localhost"
    port="22" user="tdr" password="password" path="/opt/tdr/outbound" archiveDir="/home/cps/mule/sftp-archive"
    responseTimeout="30000" sizeCheckWaitTime="2500" disableTransportTransformer="true">
    <file:filename-wildcard-filter pattern="*.pgp,*.gpg" />
</sftp:endpoint>

<file:endpoint name="FileEndpoint" connector-ref="FileConnector" path="/home/cps/mule/input" />

<flow name="DfrFileGrabber">
    <quartz:inbound-endpoint jobName="ptDfrGrabber" cronExpression="0/2 * * * * ?">
        <quartz:endpoint-polling-job>
            <quartz:job-endpoint ref="SftpEndpoint" />
        </quartz:endpoint-polling-job>
    </quartz:inbound-endpoint>

    <file:outbound-endpoint ref="FileEndpoint" outputPattern="#[header:originalFilename]" />
</flow>

Upvotes: 0

Views: 1601

Answers (1)

David

Reputation: 267

I think we got to the bottom of this one. I cannot guarantee that my test case duplicates the problem we are seeing in production, but I believe the root cause is this: if the file being copied is sufficiently large, the quartz timer can fire again before the previous SFTP copy has completed. The second poll picks up the same file, so multiple copies run at once and write to the same file name.

One solution is to set the tempDir attribute on the SFTP endpoint. With it set, the SFTP transport moves the file being retrieved into the tempDir directory on the server for the duration of the copy. If the quartz timer fires before the first copy is complete, the next poll no longer finds the file in the polled directory.
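
For reference, here is roughly what that change looks like against the endpoint from the question. Treat it as a sketch rather than our exact production config: the directory name "inprogress" is a placeholder I made up, and I am assuming the attribute is spelled tempDir on the endpoint, so check the SFTP transport reference for your Mule version. Everything else is unchanged from the question.

```xml
<!-- Sketch only: the endpoint from the question with tempDir added.
     "inprogress" is a hypothetical directory name, not taken from production. -->
<sftp:endpoint name="SftpEndpoint" connector-ref="SftpConnector" host="localhost"
    port="22" user="tdr" password="password" path="/opt/tdr/outbound"
    tempDir="inprogress"
    archiveDir="/home/cps/mule/sftp-archive"
    responseTimeout="30000" sizeCheckWaitTime="2500" disableTransportTransformer="true">
    <!-- Filter is unchanged: only *.pgp and *.gpg files are picked up. -->
    <file:filename-wildcard-filter pattern="*.pgp,*.gpg" />
</sftp:endpoint>
```

Because the file is moved on the server before it is read, the SFTP user needs write permission on the remote side for this to work.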

Upvotes: 1
