Anand Abhay
Anand Abhay

Reputation: 359

Replace 2nd string with comma for specific rows using awk or sed

I have data in below format

select 
abc       string                                      
adef        double                                      
opr           timestamp                                   
from Test;

select 
dfg       string                                      
asd        double                                      
wer           timestamp                                   
from Test1;

I am trying to get data in below format

select 
abc,
adef,
opr
from Test;

select 
dfg,
asd,
wer
from Test1;

I tried using awk to get the 2nd field for the lines not starting with select or from awk '!/^ *from/ { print$2 }' sample.txt . But not sure how to replace with comma. Also, how to avoid comma for the row before from

Regards,

Upvotes: 0

Views: 51

Answers (4)

RomanPerekhrest
RomanPerekhrest

Reputation: 92854

Awk solution:

awk 'f && r{
         if (/from/){ sep = ORS $0; f = 0 }
         printf "%s%s\n", r, sep 
     }
     $1 == "select"{ f = 1; sep = ","; print; next }
     { r = $1; if (r == "") print }' file

The output:

select 
abc,
adef,
opr
from Test;

select 
dfg,
asd,
wer
from Test1;

Upvotes: 1

karakfa
karakfa

Reputation: 67467

this will work for any number of columns in the select statement

$ awk 'BEGIN{RS=""; FS=OFS="\n"; ORS=FS FS} 
            {for(i=2;i<NF;i++) gsub(/ .*/,(i==NF-1)?"":",",$i)}1' file

select
abc,
adef,
opr
from Test;

select
dfg,
asd,
wer
from Test1;

there will be an extra line break at the end of file can be removed by piping to sed '$d'

Upvotes: 1

Ed Morton
Ed Morton

Reputation: 203219

$ awk '{n=(NR%6)} n~/[234]/{$0=$1(n<4?",":"")} 1' file
select
abc,
adef,
opr
from Test;

select
dfg,
asd,
wer
from Test1;

or if you prefer:

$ awk -v RS= '{printf "%s\n%s,\n%s,\n%s\n%s %s\n\n", $1, $2, $4, $6, $8, $9}' file
select
abc,
adef,
opr
from Test;

select
dfg,
asd,
wer
from Test1;

Upvotes: 1

jas
jas

Reputation: 10865

There's not much to add by way of explanation since it's just a bunch of print statements and turning flags on and off. That's not to say it's easy to follow, perhaps someone else will have a more expressive solution.

For avoiding the comma in the row before from, since you don't know when to stop putting commas until you reach the last line of the block, it's easier to add the comma to the previous line (when appropriate) while processing the current line.

Add you don't need to actually "replace" the second field with a comma, you can just put a literal comma and ignore the second field altogether.

$ cat a.awk
/^ *from/ { printf "\n%s\n\n", $0; f1 = f2 = 0; }
f2 { printf ",\n%s", $1 }
f1 { printf "%s", $1; f1 = 0; f2 = 1;}
/^ *select/ {print; f1 = 1}

$ awk -f a.awk f
select
abc,
adef,
opr
from Test;

select
dfg,
asd,
wer
from Test1;

Upvotes: 1

Related Questions