Tyler D

Reputation: 594

Data transformation using sed

I have a file like:

A
B
C
D

E
F
G
H

I
J
K
L

and I want it to come out like:

A,B,C,D
E,F,G,H
I,J,K,L

I'm assuming I'd use sed, but actually I'm not even sure if that's the best tool. I'm open to using anything commonly available on a Linux system.

In Perl, I did it like this. It works, but it's dirty and leaves a trailing comma on every line. I was hoping for something simpler:

$ perl -ne 'if (/^(\w)\R/) {print "$1,";} else {print "\n";}' test
A,B,C,D,
E,F,G,H,
I,J,K,L,

Upvotes: 2

Views: 128

Answers (4)

Matt Jacob

Reputation: 6553

Set the input record separator to paragraph mode (-00) and then split each record on any remaining whitespace:

$ perl -00 -ne 'print join("," => split), "\n"' test

Add -l to enable automatic newlines (but make sure it comes before -00, because -l sets $\ to the current value of $/, and we want that to happen while $/ is still "\n", before -00 changes it):

$ perl -l -00 -ne 'print join("," => split)' test

Add -a to enable autosplit mode and implicitly split to @F:

$ perl -l -00 -ane 'print join("," => @F)' test

Swap out -n for -p for automatic printing:

$ perl -l -00 -ape '$_ = join("," => @F)' test

Upvotes: 8

glenn jackman

Reputation: 246744

You could use:

awk 'BEGIN {RS=""; FS="\n"; ORS="\n"; OFS=","} {$1=$1} 1' file

I see the gawk manual says this:

If RS is set to the null string, then records are separated by blank lines. When RS is set to the null string, the newline character always acts as a field separator, in addition to whatever value FS may have.

So we don't actually need to specify FS to get the desired output:

awk 'BEGIN {RS=""; ORS="\n"; OFS=","} {$1=$1} 1' file
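
For anyone who wants to try it on the sample data, here is a minimal sketch. It also drops ORS="\n", since a newline is already awk's default output record separator. The function name join_groups and the inline printf input are assumptions for illustration.

```shell
# Hypothetical helper: paragraph mode (RS="") makes each blank-line-separated
# group one record; assigning $1=$1 rebuilds the record using OFS.
join_groups() {
    awk 'BEGIN {RS=""; OFS=","} {$1=$1} 1' "$@"
}

# Sample input from the question, generated inline.
printf 'A\nB\nC\nD\n\nE\nF\nG\nH\n\nI\nJ\nK\nL\n' | join_groups
```

Note that with the default FS, spaces inside a line would also split fields, so this sketch assumes one word per line as in the question.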

Upvotes: 3

iamauser

Reputation: 11469

xargs could do it, as long as every group has exactly four lines and no line contains whitespace:

$ xargs -n4 < file | tr ' ' ','
A,B,C,D
E,F,G,H
I,J,K,L

Upvotes: 2

Socowi

Reputation: 27205

Replacing newlines with sed is a bit complicated (see this question). It is easier to use tr for the newlines. The rest can be done by sed.

The following command assumes that yourFile does not contain any , and that you are using GNU sed (\n in the replacement text is not portable to every sed).

tr '\n' , < yourFile | sed 's/,*$/\n/;s/,,/\n/g'

The tr part converts all newlines to ,. The resulting string will have no newlines.
s/,*$/\n/ removes trailing commas and appends a newline (text files usually end with a newline).
s/,,/\n/g replaces ,, with a newline. Two consecutive commas appear only where your original file contained two consecutive newlines, that is, where the sections are separated by an empty line.
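
The three steps above can be sketched on the sample data as follows. This assumes GNU sed, and the function name join_groups_sed and the inline printf input are made up for illustration.

```shell
# Hypothetical helper. Assumes GNU sed (for \n in the replacement) and
# input that contains no commas of its own.
join_groups_sed() {
    # 1. tr turns every newline into a comma: A,B,C,D,,E,F,G,H,,I,J,K,L,
    # 2. s/,*$/\n/ swaps the trailing comma(s) for a final newline
    # 3. s/,,/\n/g turns each former empty line into a line break
    tr '\n' ',' | sed 's/,*$/\n/;s/,,/\n/g'
}

# Sample input from the question, generated inline.
printf 'A\nB\nC\nD\n\nE\nF\nG\nH\n\nI\nJ\nK\nL\n' | join_groups_sed
```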

Upvotes: 0
