Reputation: 1522
I have a file, example.txt
0
A
B
C, C, C
D, D
E
F
1
A, A, A
B
C
2
A
B
C
D, D, D
E
I need to split the file at every line that holds a number, then join the lines between consecutive numbers into a single line, producing one output line per section:
A, B, C, C, C, D, D, E, F
A, A, A, B, C
A, B, C, D, D, D, E
The best I've come up with is:
cat example.txt | sed -e '1,/^[0-9]/d' -e '/^[0-9]/,$d' | paste -sd "," -
A, A, A, B, C
which is only the middle section, in this case. That, or printing all sections onto one line.
Upvotes: 2
Views: 380
Reputation: 2491
Another sed
sed -n '
N
:A
$bB
/\n[ ]*[0-9][0-9]*$/!{
N
bA
}
h
s/\n[^\n]*$//
:B
s/[^\n]*\n[ ]*//
s/\n[ ]*/, /g
p
$b
x
s/.*\n//
bA
' infile
Upvotes: 1
Reputation: 6345
A shorter idiomatic awk alternative:
$ awk '$1=$1{printf "%s%s",$0,(RT==","?OFS:ORS)}' RS="[0-9]|," OFS=", " file1
A, B, C, C, C, D, D, E, F
A, A, A, B, C
A, B, C, D, D, D, E
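RS is the record separator. The default is a newline; here it is set to a regular expression matching a single digit or a comma.
OFS is the output field separator, here a comma followed by a single space.
RT is the text that actually matched RS at the end of the current record. It is a GNU awk (gawk) extension, so this one-liner is not portable to other awk implementations.
ORS is the output record separator, a newline by default.
$1=$1 is an idiomatic assignment that forces awk to rebuild the record from its fields, joined with OFS. Used as the pattern, it also evaluates to false when the record has no fields, so empty records are skipped.
(RT==","?OFS:ORS) is a ternary expression with the syntax (condition ? value if true : value if false). It emits OFS after a record that ended at a comma and ORS after a record that ended at a digit.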
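To see the $1=$1 idiom in isolation, here is a minimal sketch on made-up input (the sample strings and the function name rebuild are illustrative, not from the question). It uses only standard awk features, so it does not depend on RT:

```shell
# Assigning $1 to itself makes awk rebuild $0 from its fields, joined by OFS.
# Used as a pattern, the assignment doubles as a condition: it is false when
# $1 is empty, so blank records are not printed.
rebuild() {
    awk -v OFS=", " '$1=$1'
}

# Mixed spaces and tabs collapse to ", "; the blank line is dropped.
printf 'A   B\tC\n\n  D  E\n' | rebuild
```

Here the whitespace between fields is replaced by OFS, which is how the one-liner turns the newlines inside a section into ", ".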
Upvotes: 5
Reputation: 1060
sed:
echo `sed 's:$:,:' example.txt` | sed -r 's:^:, :;s:,\s*[0-9]+,\s*:\n:g;s:^\s*::;s:,? *$::'
perl:
perl -p0777e 's:^:, :;s:\n\s*:, :g;s:,\s*[0-9]+,\s*:\n:g;s:^\s*::;s:,?\s*$:\n:' example.txt
echo..., or perl -p0777... - treat the whole file as a single long line (containing newline characters (perl) or spaces (echo))
s:^:, : - add an extra comma at the beginning
s:\n\s*:, :g - replace all newlines with commas (perl only; in the sed version the first sed appends the commas and echo joins the lines)
s:,\s*[0-9]+,\s*:\n:g - replace all numbers surrounded by commas with newlines
Upvotes: 2
Reputation: 133770
The following awk may also help.
awk '/^[0-9]+/ && val{print val;val="";next} FNR>1{sub(/^ +/,"");val=val?val ", " $0:$0} END{print val}' Input_file
Explanation: the same command in multi-line form, with comments.
awk '
/^[0-9]+/ && val{ ##If the line starts with digit(s) and the variable val is non-empty, then:
print val; ##print the section collected so far,
val=""; ##reset val for the next section,
next ##and skip the remaining rules for this line.
}
FNR>1{ ##For every line after the first one:
sub(/^ +/,""); ##strip leading spaces from the current line,
val=val?val ", " $0:$0 ##and append it to val, preceded by ", " if val already has content.
}
END{ ##The END block runs once the whole Input_file has been read.
print val ##Print the last section.
}
' Input_file ##Name of the input file.
Upvotes: 2
Reputation: 113994
Try:
$ awk 'function prn(line) {if(line){gsub(/[[:space:]]+/, " ", line); print line}} /^[0-9]/{prn(line); line=""; next} {if(line)line=line"," $0; else line=$0} END{prn(line)}' example.txt
A, B, C, C, C, D, D, E, F
A, A, A, B, C
A, B, C, D, D, D, E
Or, for those who prefer code spread over multiple lines:
awk 'function prn(line)
{
if(line){
gsub(/[[:space:]]+/, " ", line)
print line
}
}
/^[0-9]/{
prn(line)
line=""
next
}
{
if(line)
line=line"," $0
else
line=$0
}
END{
prn(line)
}' example.txt
function prn(line) {if(line){gsub(/[[:space:]]+/, " ", line); print line}}
This defines a function prn which compresses excess spaces and prints the line.
/^[0-9]/{prn(line); line=""; next}
If the current line starts with a number, call prn on the contents of line, reset line back to an empty string, skip the rest of the commands, and jump to the next line.
{if(line)line=line"," $0; else line=$0}
Add the current line to the end of the variable line.
END{prn(line)}
After we have reached the end of the file, call prn on line.
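The same accumulate-and-flush idea can be sketched so that the result does not depend on how the input lines are indented. This variant is an assumption-laden sketch, not the code above: the function name join_sections is hypothetical, each line is trimmed before joining, and ", " is used as the joiner instead of relying on leading spaces in the data.

```shell
# join_sections is an illustrative helper name, not part of the answer above.
join_sections() {
    awk '
        # A line starting with a digit (optionally indented) ends a section.
        /^[ \t]*[0-9]/ {
            if (line != "") print line
            line = ""
            next
        }
        {
            # Trim surrounding blanks so indentation does not matter.
            gsub(/^[ \t]+|[ \t]+$/, "")
            if ($0 == "") next
            line = (line == "" ? $0 : line ", " $0)
        }
        # Flush the last section, which has no number after it.
        END { if (line != "") print line }
    ' "$@"
}

printf '0\n  A\nB\n C, C\n1\nD\n' | join_sections
```

With a file name as argument, for example join_sections example.txt, the function reads the file instead of standard input.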
Upvotes: 2