Reputation:
I'm looking for a straightforward console solution to convert a text file that looks like this:
...
Gender: M
Age: 46
History: 01305
Gender: F
Age: 46
History: 01306
Gender: M
Age: 19
History: 01307
Gender: M
Age: 19
History: 01308
....
into a CSV file like this one:
Gender,Age,History
M,46,01305
F,46,01306
M,19,01307
M,19,01308
Any help is appreciated.
Using the solution below, I get the following output instead. Am I doing something wrong?
awk 'BEGIN{printf "Gender,Age,History%s",ORS;FS=":"}{c++} {sub(/^ */,"",$2);printf "%s%s",$2,(c==3)?ORS:","}c==3{c=0}' data.txt >> 1.csv
Gender,Age,History
M
,37
,00001
M
,37
,00001
M
,41
,00001
Upvotes: 0
Views: 1059
Reputation:
I still don't know exactly where the problem was, so I decided to clean the data of every character except the ones that are supposed to be there (most probably the culprit was an unusual end-of-line character):
sed -e 's/[^a-zA-Z*0-9:]/ /g;s/  */ /g' history.txt > output.txt
After that I successfully used the solution from @sjsam:
awk 'BEGIN{printf "Gender,Age,History%s",ORS;FS=":"}{c++} {sub(/^ */,"",$2);printf "%s%s",$2,(c==3)?ORS:","}c==3{c=0}' data.txt >> 1.csv
Thanks everyone!
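For anyone hitting the same symptom: values that each end up on their own line, with the comma at the start of the next one, are what you typically see when the input has Windows (CRLF) line endings. That is only a guess about this particular file. Here is a minimal sketch under that assumption, which deletes the carriage returns with tr before the awk step. The file name sample_crlf.txt and the variable csv are made up for the example, and the awk program is a simplified variant, not @sjsam's exact command.

```shell
# Assumption: the stray characters are carriage returns from CRLF line endings.
# Build a small sample with CRLF endings to reproduce the situation.
printf 'Gender: M\r\nAge: 46\r\nHistory: 01305\r\nGender: F\r\nAge: 46\r\nHistory: 01306\r\n' > sample_crlf.txt

# Delete every carriage return, then emit the value after ": " on each line,
# ending the row after every third line and writing a comma otherwise.
csv=$(tr -d '\r' < sample_crlf.txt | awk -F': *' '
    BEGIN { print "Gender,Age,History" }
    { printf "%s%s", $2, ((NR % 3 == 0) ? ORS : ",") }
')
printf '%s\n' "$csv"
```

If the file contains other unexpected characters, this will not remove them, so inspecting it first (for example with od -c) is a reasonable check.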
Upvotes: 0
Reputation: 1365
Here is a way to do it in bash, assuming your data file is called data.txt:
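#!/bin/bash
echo "Gender,Age,History"
while read -r line; do
    # Print the value, i.e. the second space-separated field of the line.
    printf '%s' "$(cut -d ' ' -f2 <<< "$line")"
    # A History line ends the record; any other line is followed by a comma.
    if [[ "$line" =~ ^History.* ]]; then
        printf "\n"
    else
        printf ","
    fi
done < data.txt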
Outputs:
Gender,Age,History
M,46,01305
F,46,01306
M,19,01307
M,19,01308
Upvotes: 1
Reputation: 22012
With bash builtin commands only, I would say:
#!/bin/bash
echo "Gender,Age,History"
while read -r line; do
    if [[ $line =~ ^Gender:\ *([^\ ]+) ]]; then
        r=${BASH_REMATCH[1]}
    elif [[ $line =~ ^Age:\ *([^\ ]+) ]]; then
        r+=,${BASH_REMATCH[1]}
    elif [[ $line =~ ^History:\ *([^\ ]+) ]]; then
        echo "$r,${BASH_REMATCH[1]}"
    fi
done < data.txt
Upvotes: 1
Reputation: 195039
This line should help:
awk 'BEGIN{FS=":|\n";RS="Gender";OFS=",";print "Gender,Age,History"}$0{print $2,$4,$6}' file
With your example as input, it gives:
Gender,Age,History
M, 46, 01305
F, 46, 01306
M, 19, 01307
M, 19, 01308
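The space after each comma comes from the blank that follows the colon in the input. If you want the exact format from the question, one possible tweak is to let the field separator swallow those blanks as well. This is a sketch, not a tested replacement for every awk: a multi-character RS is an extension supported by gawk and mawk rather than POSIX behaviour, and sample.txt and the variable trimmed are made-up names for the example.

```shell
# Small sample in the question's format (file name is hypothetical).
printf 'Gender: M\nAge: 46\nHistory: 01305\nGender: F\nAge: 46\nHistory: 01306\n' > sample.txt

# Same idea as above, but FS is ": *" (colon plus any spaces) or a newline,
# so the values arrive without their leading blank.
trimmed=$(awk 'BEGIN{FS=": *|\n";RS="Gender";OFS=",";print "Gender,Age,History"}$0{print $2,$4,$6}' sample.txt)
printf '%s\n' "$trimmed"
```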
Upvotes: 1