Astora

Reputation: 725

Flume not writing correctly to Amazon S3 (weird characters)

My Flume config:

agent.sinks = s3hdfs
agent.sources = MySpooler
agent.channels = channel

agent.sinks.s3hdfs.type = hdfs
agent.sinks.s3hdfs.hdfs.path = s3a://mybucket/test
agent.sinks.s3hdfs.hdfs.filePrefix = FilePrefix
agent.sinks.s3hdfs.channel = channel
agent.sinks.s3hdfs.hdfs.useLocalTimeStamp = true


agent.sources.MySpooler.channels = channel
agent.sources.MySpooler.type = spooldir
agent.sources.MySpooler.spoolDir = /flume_to_aws
agent.sources.MySpooler.fileHeader = true

agent.channels.channel.type = memory
agent.channels.channel.capacity = 100

Now I add a file to the /flume_to_aws folder with the following text content:

Oracle and SQL Server

After it was uploaded to S3, I downloaded the file and opened it, and it shows the following text:

    SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable

  Œúg ÊC•ý¤ïM·T.C   !     †"û­þ   Oracle and SQL ServerÿÿÿÿŒúg ÊC•ý¤ïM·T.C

Why isn't the file uploaded with only the text "Oracle and SQL Server"?

Upvotes: 0

Views: 22

Answers (1)

Astora

Reputation: 725

Problem solved. I found this question on Stack Overflow here

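Flume was generating files in binary format instead of text format. The SEQ header and the LongWritable/BytesWritable class names in the downloaded file indicate a Hadoop SequenceFile, which is the HDFS sink's default file type.

So I added the following lines: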

agent.sinks.s3hdfs.hdfs.writeFormat = Text
agent.sinks.s3hdfs.hdfs.fileType = DataStream
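
For reference, here is a sketch of how the whole sink section might look with those two lines merged in, assuming everything else stays exactly as in the question. The comments about default values reflect my reading of the Flume HDFS sink documentation, so check them against the version you are running.

```properties
# Sink section after the fix (same agent, sink, and channel names as in the question)
agent.sinks.s3hdfs.type = hdfs
agent.sinks.s3hdfs.hdfs.path = s3a://mybucket/test
agent.sinks.s3hdfs.hdfs.filePrefix = FilePrefix
agent.sinks.s3hdfs.channel = channel
agent.sinks.s3hdfs.hdfs.useLocalTimeStamp = true

# Default is SequenceFile (binary container with a "SEQ" header).
# DataStream writes the event bodies as-is, without the container or compression.
agent.sinks.s3hdfs.hdfs.fileType = DataStream

# Default is Writable; Text writes each event body as a plain text record.
agent.sinks.s3hdfs.hdfs.writeFormat = Text
```

After restarting the agent and dropping a new file into /flume_to_aws, the object downloaded from S3 should contain only the original text. Files that were already written as SequenceFiles are not converted.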

Upvotes: 0
