hamadkh
hamadkh

Reputation: 349

convert table into comma separated in text file using bash

I have a text file like this:

+------------------+------------+----------+
|     col_name     | data_type  | comment  |
+------------------+------------+----------+
| _id              | bigint     |          |
| starttime        | string     |          |
+------------------+------------+----------+

how can i get a result like this using bash

(_id bigint, starttime string   )

so just the column names and type

#remove first 3 lines 
sed -e '1,3d' < columnnames.txt >clean.txt

#remove first character from each line
sed 's/^.//'  < clean.txt >clean.txt

#remove last character from each line
sed 's/.$//' < clean.txt >clean.txt


# remove certain characters 
sed 's/[+-|]//g' < clean.txt >clean.txt 

# remove last line 
sed  '$ d' < clean.txt >clean.txt

so this is what i have so far, if there is a better implementation let me know!

Upvotes: 0

Views: 328

Answers (6)

Ed Morton
Ed Morton

Reputation: 203254

$ awk -F'[| ]+' 'NR>3 && NF>1{v=v s $2" "$3; s=", "} END{print "("v")"}' file
(_id bigint, starttime string)

Upvotes: 1

David C. Rankin
David C. Rankin

Reputation: 84541

A variation on the other answers using awk with the field-separator being the '|' with optional spaces on either side as GNU awk allows, then taking fields 2 and 3 as the fields wanted in each record, and formatting the output as described in the question with the closing " )" provided in the END rule:

$ awk -F' *\\| *' '
    NR>3 && $1~/^[+]/{exit}                 # exit condition first line w/^+
    NR==4{$1=$1; printf "(%s %s", $2,$3}    # 1st data record is 4
    NR>4{$1=$1; printf ", %s %s", $2,$3}    # process all remainng records
    END{print "  )"}                        # output closing "  )"
' table
(_id bigint, starttime string  )

(note: if you don't want the two-spaces before the closing ")", just remove them from the print in the END rule)

Rather than using a BEGIN the first record of interest (4) is used to provide the opening "(". Look things over and let me know if you have questions.

Upvotes: 0

L&#233;a Gris
L&#233;a Gris

Reputation: 19545

Another way to implement Diego Torres Milano's solution as a stand-alone awk program:

tableconvert

#!/usr/bin/env -S awk -f

BEGIN {
  FS="[[:space:]]*[|][[[:space:]]*"
  printf "%s", "( "
}

{
  if (FNR <= 3 || match($0, /^[+]/))
    next
  else {
    printf("%s%s %s", c, $2, $3)
    c = ", "
  }
} 

END {
  print " )" 
}

Make tableconvert an executable:

chmod +x tableconvert

Run tableconvert on intablefile.txt

./tableconvert intablefile.txt 
( _id bigint, starttime string )

With added bonus that using FNR instead of NR allow the awk program to process multiple input files as arguments:

./tableconvert infille1.txt infile2.txt infile3.txt ...

Upvotes: 0

Diego Torres Milano
Diego Torres Milano

Reputation: 69198

Something similar, using only awk:

awk -F ' *[|]' 'BEGIN {printf("(")} NR>3 && NF>1 {printf("%s%s%s", NR>4 ? "," : "", $2, $3)} END {printf(" )\n")}' columnnames.txt

Upvotes: 2

jas
jas

Reputation: 10865

# Set the field separator to vertical bar surrounded by any number of spaces.
# BEGIN and END blocks print the opening and closing parens
# The line between skips the header lines and any line starting with '+'

$ awk -F"[[:space:]]*[|][[[:space:]]*" '
    BEGIN { printf "%s", "( "}  
    NR > 3 && $0 !~ /^[+]/ { printf("%s%s %s", c, $2, $3); c = ", " } 
    END { print " )" }' file

( _id bigint, starttime string )

Upvotes: 1

Matias Barrios
Matias Barrios

Reputation: 5056

I would do this :

cat input.txt \
| tail -n +4 \
| awk -F'[^a-zA-Z_]+' '{ for(i=1;i<=NF;i++) { printf $i" " }}' 

Its a little bit shorter.

Upvotes: 0

Related Questions