Reputation: 11
I have a Kafka consumer that deserializes Avro messages and prints them to stdout. I want to pipe the output into files, but with a separate file for each message, not all of the messages in a single file.
I have searched Google, and most results cover splitting output across multiple files or piping it into another program, which is not what I am trying to do. I need each message/line written to a file with a unique name, using either a counter, the message number from the output, or a date to the millisecond.
The output is in this format:
AVRO MESSAGE (1): {Data in JSON format}
AVRO MESSAGE (2): {Data in JSON format}
AVRO MESSAGE (3): {Data in JSON format}
AVRO MESSAGE (4): {Data in JSON format}
I want line 1 to go into a file named output1.txt or output20190518113126104, line 2 into a file named output2.txt or output20190518113126351, where the timestamped name is YYYYMMDDHHmmssSSS or something similar to ensure it is unique.
Upvotes: 0
Views: 790
Reputation: 212454
I would go with the awk solution presented by Ed Morton. The canonical method (IMO) in a shell would be:
cmd | { i=1; while IFS= read -r line; do printf '%s\n' "$line" > output.$((i++)); done; }
You might prefer a for loop, but IMO it's not as clean, since you cannot write for ((i=1; read line; i++))
as you would like (the second expression cannot be a command), e.g.:
cmd | for ((i=1;; i++)); do IFS= read -r line || break; printf '%s\n' "$line" > output.$i; done;
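If you prefer the timestamped filenames from the question over a counter, the same read loop works with a date-based name. A minimal sketch, assuming GNU date (the %3N millisecond specifier is a GNU extension, not POSIX):
cmd | while IFS= read -r line; do printf '%s\n' "$line" > "output$(date +%Y%m%d%H%M%S%3N)"; done
Note that two messages arriving within the same millisecond would still collide, so the counter-based names are the safer choice.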
Upvotes: 0
Reputation: 20022
Use split with the option -l (lines) and a count of 1:
cmd | split -l1
When you want a prefix for your output files, you can use:
split -l1 <(cmd) output
EDIT:
As suggested in the comments, you can force numeric suffixes with -d and let split read from stdin with -. This gives:
cmd | split -l1 -d - output
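If you also want a .txt extension and wider numeric suffixes, GNU coreutils split additionally offers -a (suffix length) and --additional-suffix; a sketch, assuming a GNU system (neither option is in POSIX split):
cmd | split -l1 -d -a 4 --additional-suffix=.txt - output
This writes output0000.txt, output0001.txt, and so on.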
Upvotes: 4
Reputation: 204154
foo | awk '{out="output" NR ".txt"; print > out; close(out)}'
Replace foo with whatever command is currently generating your output.
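If you want only the JSON payload in each file rather than the whole line, you could strip the leading label first. A minimal variation on the above, assuming every line starts with the AVRO MESSAGE (n): prefix shown in the question:
foo | awk '{out="output" NR ".txt"; sub(/^AVRO MESSAGE \([0-9]+\): /, ""); print > out; close(out)}'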
Upvotes: 2