Reputation: 63984
I have the following lines in 2 chunks (the real file has ~10K of them). In this example, each chunk contains 3 lines. The chunks are separated by an empty line, so they are like "paragraphs".
xox
91-233
chicago

koko
121-111
alabama
I want to turn it into tab-delimited lines, like so:
xox 91-233 chicago
koko 121-111 alabama
How can I do that?
I tried tr "\n" "\t", but it doesn't do what I want.
Upvotes: 5
Views: 575
Reputation: 18351
xargs -L3 < filename.log |tr ' ' '\t'
xox 91-233 chicago
koko 121-111 alabama
Upvotes: 3
Reputation: 437197
This answer offers the following:
* It works with blocks of nonempty lines of any size, separated by any number of empty lines; John1024's helpful answer (which is similar and came first) works with blocks of lines separated by exactly one empty line.
* It explains the awk command used in detail.
A more idiomatic (POSIX-compliant) awk solution:
awk -v RS= -F '\n' -v OFS='\t' '$1=$1""' file
* -v RS= tells awk to operate in paragraph mode: consider each run of nonempty lines a single record; RS is the input record separator.
* -F '\n' tells awk to consider each line of an input paragraph its own field (breaking the multiline input record into fields by lines); -F sets FS, the input field separator.
* -v OFS='\t' tells awk to separate fields with \t (tab characters) on output; OFS is the output field separator.
* $1=$1"" looks like a no-op, but, because it assigns to field variable $1 (the record's first field), it tells awk to rebuild the input record, using OFS as the field separator, thereby effectively replacing the \n separators with \t.
* The appended "" guards against the edge case of the first line in a paragraph evaluating to 0 in a numeric context; appending "" forces treatment as a string, and any nonempty string - even if it contains "0" - is considered true in a Boolean context (see the next point).
* Given that $1 is by definition nonempty and given that assignments in awk pass their value through, the result of the assignment $1=$1"" is also a nonempty string. Since the assignment is used as a pattern (a condition), a nonempty string is considered true, and there is no associated action block ({ ... }), the implied action is to print the rebuilt input record, which now consists of the input lines separated with tabs, terminated by the default output record separator (ORS), \n.
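To make the edge case concrete, here is a small sketch that runs the command with and without the appended "". The sample data and the variable names (tmp, with_guard, without_guard) are made up for illustration; the second block deliberately starts with a line that is numerically zero and is separated from the first block by two empty lines.

```shell
# Hypothetical sample: blocks of different sizes, separated by two empty lines.
tmp=$(mktemp)
printf 'xox\n91-233\nchicago\n\n\n0\nalabama\n' > "$tmp"

# With the appended "", the pattern is a nonempty string, so every paragraph is printed.
with_guard=$(awk -v RS= -F '\n' -v OFS='\t' '$1=$1""' "$tmp")

# Without it, the pattern value for the second paragraph is numeric 0 (false),
# so that paragraph is silently skipped.
without_guard=$(awk -v RS= -F '\n' -v OFS='\t' '$1=$1' "$tmp")

printf '%s\n' "$with_guard"
printf -- '---\n'
printf '%s\n' "$without_guard"
rm -f "$tmp"
```

The first command should print both paragraphs as tab-separated lines, while the second should print only the first one.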
Upvotes: 4
Reputation: 1317
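Another awk version that does this, assuming every block has exactly 3 lines (note that it keeps only the first whitespace-separated word of each line and leaves a trailing tab on each output line):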
awk '{if(NF>0){a=a$1"\t";i++};if(i%3==0&&NF>0){print a;a=""}}' input_file
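The same counting idea can be sketched with the block size passed in as a variable and without a trailing tab. This is only an illustrative variant, not the answer's original code: the names n, count, line, tmp and result are assumptions, and it joins whole lines ($0) rather than just the first word.

```shell
# Hypothetical sample matching the question's data.
tmp=$(mktemp)
printf 'xox\n91-233\nchicago\n\nkoko\n121-111\nalabama\n' > "$tmp"

# n is the assumed number of lines per block; empty lines (NF == 0) are skipped.
result=$(awk -v n=3 -v OFS='\t' '
  NF {
    line = (count ? line OFS : "") $0   # prepend a tab only from the second item on
    if (++count == n) { print line; line = ""; count = 0 }
  }
' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Like the one-liner above, this only works if every block has the same number of lines; for blocks of varying size, paragraph mode (RS set to the empty string) is the safer choice.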
Upvotes: 2
Reputation: 67467
Another alternative:
$ sed '/^$/d' file | pr -3ats$'\t'
xox 91-233 chicago
koko 121-111 alabama
This removes the empty lines with sed and prints the rest in 3 columns with a tab delimiter. For your real file, replace 3 with the number of lines per block.
Note that this will only work if all your blocks are of the same size.
Upvotes: 3
Reputation: 113824
$ awk -F'\n' '{$1=$1} 1' RS='\n\n' OFS='\t' file
xox 91-233 chicago
koko 121-111 alabama
Awk divides input into records and it divides each record into fields.
-F'\n'
This tells awk to use a newline as the field separator.
$1=$1
This tells awk to assign the first field to the first field. While this seemingly does nothing, it causes awk to treat the record as changed. As a consequence, the record is rebuilt and printed using our assigned value for OFS, the output field separator.
1
This is awk's cryptic shorthand for "print the current record".
RS='\n\n'
This tells awk to treat two consecutive newlines as a record separator.
OFS='\t'
This tells awk to use a tab as the field separator on output.
Upvotes: 5