user8086348

Converting text data file to csv format via shell/bash

I'm looking for a straightforward console solution to change a text file which looks like this:

...
Gender: M
Age: 46
History: 01305
Gender: F
Age: 46
History: 01306
Gender: M
Age: 19
History: 01307
Gender: M
Age: 19
History: 01308
....

into a CSV file like this one:

Gender,Age,History
M,46,01305
F,46,01306
M,19,01307
M,19,01308

Any help appreciated


With the following solution I get this output. Am I doing something wrong?

awk 'BEGIN{printf "Gender,Age,History%s",ORS;FS=":"}{c++} {sub(/^   */,"",$2);printf "%s%s",$2,(c==3)?ORS:","}c==3{c=0}' data.txt >> 1.csv

Gender,Age,History
M
,37
,00001
M
,37
,00001
M
,41
,00001

Upvotes: 0

Views: 1059

Answers (4)

user8086348

I still don't know exactly where the problem was, so I decided to clean the data of every character except the ones that are supposed to be there (the culprit was most probably an unusual end-of-line character):

sed -e 's/[^a-zA-Z*0-9:]/ /g;s/  */ /g' history.txt > output.txt

After that, I successfully used the solution from @sjsam:

awk 'BEGIN{printf "Gender,Age,History%s",ORS;FS=":"}{c++} {sub(/^   */,"",$2);printf "%s%s",$2,(c==3)?ORS:","}c==3{c=0}' data.txt >> 1.csv
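If the stray character is in fact a carriage return from Windows-style (CRLF) line endings, which is only an assumption because the file was never inspected, a narrower cleanup is to delete just the `\r` characters and leave everything else untouched. A minimal sketch follows; the sample data and the temporary file names are hypothetical:

```shell
#!/bin/bash
# Build a small sample with CRLF line endings (values taken from the question).
tmpdir=$(mktemp -d)
printf 'Gender: M\r\nAge: 46\r\nHistory: 01305\r\nGender: F\r\nAge: 46\r\nHistory: 01306\r\n' \
    > "$tmpdir/data.txt"

# Count the lines that contain a carriage return; a non-zero count suggests CRLF.
cr_lines=$(grep -c $'\r' "$tmpdir/data.txt")
echo "lines with CR: $cr_lines"

# Delete only the carriage returns, then group every three lines into one row.
tr -d '\r' < "$tmpdir/data.txt" |
awk 'BEGIN { print "Gender,Age,History"; FS = ":" }
     { sub(/^ */, "", $2); printf "%s%s", $2, (NR % 3 == 0) ? ORS : "," }' \
    > "$tmpdir/1.csv"

cat "$tmpdir/1.csv"
```

Unlike the broader `sed` cleanup, this leaves punctuation and other characters in the values as they are.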

Thanks everyone!

Upvotes: 0

stalet

Reputation: 1365

Here is a way to do it in bash, assuming your data file is called data.txt:

#!/bin/bash

echo "Gender,Age,History"
while read -r line; do
  printf '%s' "$(cut -d ' ' -f2 <<< "$line")"
  if [[ "$line" =~ ^History.* ]]; then
    printf "\n"
  else
    printf ","
  fi
done < data.txt

Outputs:

Gender,Age,History
M,46,01305
F,46,01306
M,19,01307
M,19,01308
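
The loop above starts one `cut` process per input line. As a hedged alternative, assuming every record is exactly three lines in the order Gender, Age, History and each value follows a single space, `cut` can run once over the whole file and `paste` can join every three lines with commas. The sample file below is hypothetical:

```shell
#!/bin/bash
# Hypothetical sample file; the values come from the question.
paste_sample=$(mktemp)
printf '%s\n' 'Gender: M' 'Age: 46' 'History: 01305' \
              'Gender: F' 'Age: 46' 'History: 01306' > "$paste_sample"

# Print the header, take the second space-separated field of every line,
# and let paste merge each run of three lines into one comma-separated row.
csv_paste=$(
    echo "Gender,Age,History"
    cut -d ' ' -f2 "$paste_sample" | paste -d, - - -
)

echo "$csv_paste"
rm -f "$paste_sample"
```

This sketch does not skip stray lines, so any extra line in the input shifts the grouping.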

Upvotes: 1

tshiono

Reputation: 22012

With bash builtin commands only, I would say:

#!/bin/bash

echo "Gender,Age,History"
while read -r line; do
    if [[ $line =~ ^Gender:\ *([^\ ]+) ]]; then
        r=${BASH_REMATCH[1]}
    elif [[ $line =~ ^Age:\ *([^\ ]+) ]]; then
        r+=,${BASH_REMATCH[1]}
    elif [[ $line =~ ^History:\ *([^\ ]+) ]]; then
        echo "$r,${BASH_REMATCH[1]}"
    fi
done < data.txt

Upvotes: 1

Kent

Reputation: 195039

This line should help:

awk 'BEGIN{FS=":|\n";RS="Gender";OFS=",";print "Gender,Age,History"}$0{print $2,$4,$6}' file

With your example as input, it gives:

Gender,Age,History
 M, 46, 01305
 F, 46, 01306
 M, 19, 01307
 M, 19, 01308
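
The leading spaces in this output come from the space that follows each colon in the input. One possible sketch that avoids them, and that does not rely on a multi-character `RS`, splits each line on a colon plus any following spaces and keys on the label in the first field. The sample file and the variable names `g` and `a` are hypothetical:

```shell
#!/bin/bash
# Hypothetical sample file; the values come from the question, including
# the elided "..." lines, which the awk program simply ignores.
awk_sample=$(mktemp)
printf '%s\n' '...' 'Gender: M' 'Age: 46' 'History: 01305' \
              'Gender: F' 'Age: 46' 'History: 01306' '....' > "$awk_sample"

# Split on ": " (colon plus optional spaces), remember Gender and Age,
# and emit one CSV row each time a History line arrives.
csv_awk=$(awk -F': *' '
    BEGIN           { print "Gender,Age,History" }
    $1 == "Gender"  { g = $2 }
    $1 == "Age"     { a = $2 }
    $1 == "History" { print g "," a "," $2 }
' "$awk_sample")

echo "$csv_awk"
rm -f "$awk_sample"
```

Because rows are built from labels rather than line positions, unrelated lines between records do not break the grouping.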

Upvotes: 1
