J Calbreath

Reputation: 2705

Scalding: Output schema from pipe operation

I am reading files on HDFS via Scalding, aggregating on some fields, and writing to a tab-delimited file via Tsv. How can I write out a file that contains the schema of my output file? For example:

UnpackedAvroSource(args("input"))
  .project('key, 'var1)
  .groupBy('key) { _.sum[Long]('var1 -> 'var1sum) }
  .write(Tsv(args("output")))

I want to write an output text file that contains "key, var1sum" so that someone who picks up my output file later knows what the columns are. I'm assuming Scalding doesn't embed this in the file somewhere?

Thanks.

Upvotes: 1

Views: 119

Answers (1)

J Calbreath

Reputation: 2705

I just found the option writeHeader = true on Tsv, which writes the column names as the first line of the output file, so there is no need to write a separate schema file.
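For reference, here is a minimal sketch of the pipeline from the question with the header enabled. The job class name SumWithHeaderJob is made up for illustration, and the import for UnpackedAvroSource assumes the scalding-avro module is on the classpath. Passing the field list explicitly to Tsv is my own addition to pin the column order; it is not required for writeHeader to take effect.

```scala
import com.twitter.scalding._
import com.twitter.scalding.avro.UnpackedAvroSource

// Hypothetical job name; the pipeline itself is the one from the question.
class SumWithHeaderJob(args: Args) extends Job(args) {
  UnpackedAvroSource(args("input"))
    .project('key, 'var1)
    // Sum var1 per key into a new field named var1sum.
    .groupBy('key) { _.sum[Long]('var1 -> 'var1sum) }
    // writeHeader = true asks the Tsv sink to emit the field names
    // as the first line of the output.
    .write(Tsv(args("output"), ('key, 'var1sum), writeHeader = true))
}
```

With this, the output should start with a tab-separated line containing key and var1sum. When running on Hadoop, expect the header to appear at the top of each part file rather than once per directory, so anyone concatenating the parts may need to drop the repeated lines.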

Upvotes: 2
