Reputation: 1335
While copying data from a local path to an HDFS sink, I am getting some garbage data in the file at the HDFS location.
My config file for Flume:
# spool.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = s1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.s1.type = spooldir
a1.sources.s1.spoolDir = /home/cloudera/spool_source
a1.sources.s1.channels = c1
# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = flumefolder/events
a1.sinks.k1.hdfs.filetype = Datastream
#Format to be written
a1.sinks.k1.hdfs.writeFormat = Text
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
I am copying a file from the local path "/home/cloudera/spool_source" to the HDFS path "flumefolder/events".
Flume command:
flume-ng agent --conf-file spool.conf --name a1 -Dflume.root.logger=INFO,console
File "salary.txt" at local path "/home/cloudera/spool_source" is:
GR1,Emp1,Jan,31,2500
GR3,Emp3,Jan,18,2630
GR4,Emp4,Jan,31,3000
GR4,Emp4,Feb,28,3000
GR1,Emp1,Feb,15,2500
GR2,Emp2,Feb,28,2800
GR2,Emp2,Mar,31,2800
GR3,Emp3,Mar,31,3000
GR1,Emp1,Mar,15,2500
GR2,Emp2,Apr,31,2630
GR3,Emp3,Apr,17,3000
GR4,Emp4,Apr,31,3200
GR7,Emp7,Apr,21,2500
GR11,Emp11,Apr,17,2000
At the target path "flumefolder/events", the data is copied with garbage values as:
1 W��ȩGR1,Emp1,Jan,31,2500W��ȲGR3,Emp3,Jan,18,2630W��ȷGR4,Emp4,Jan,31,3000W��ȻGR4,Emp4,Feb,28,3000W��ȽGR1,Emp1,Feb,15,2500W����GR2,Emp2,Feb,28,2800W����GR2,Emp2,Mar,31,2800W����GR3,Emp3,Mar,31,3000W����GR1,Emp1,Mar,15,2500W����GR2,Emp2,
What is wrong in my configuration file spool.conf? I am unable to figure it out.
Upvotes: 0
Views: 211
Reputation: 30089
Flume configuration is case sensitive, so change the filetype property name to fileType, and fix the Datastream value too, as it is also case sensitive:
a1.sinks.k1.hdfs.fileType = DataStream
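With your current setup the misspelled property is ignored, so the sink falls back to its default file type, SequenceFile. The odd characters are most likely the binary SequenceFile framing written around each event.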
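For reference, here is a minimal sketch of how the sink section of spool.conf might look with that fix applied. It assumes every other setting from the question stays unchanged; only the hdfs.fileType line differs, and the comment on that line is an added annotation.

```properties
# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = flumefolder/events
# Property names are case sensitive: fileType, not filetype
a1.sinks.k1.hdfs.fileType = DataStream
# Format to be written
a1.sinks.k1.hdfs.writeFormat = Text
```

Files that were already written as sequence files stay in that format, so the change should only show up in files created after the agent is restarted with the new configuration.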
Upvotes: 2