Reputation: 21
I am trying to read the contents of a file into variables using bash v4.1.x. The input file may look like this:
1373232436 785907701 "abc 245" 0 1
1373232436 1048824909 "def pqr" 1 0
1373232486 785907701 "uvw ghn" 0 1
1373232486 1048824909 "1109 xyz" 1 0
If I use
cat <filename>|while read col1 col2 col3 col4 col5 col6
do
...
...
done
I want the col3 values to be:
"abc 245"
"def pqr"
"uvw ghn"
"1109 xyz"
Upvotes: 2
Views: 189
Reputation: 531335
Assuming that only the third field can be quoted as shown, I would use a regular expression to split each line into columns.
while read -r line; do
[[ $line =~ ^(.*)\ (.*)\ (\".*\")\ (.*)\ (.*)$ ]] || continue
col1=${BASH_REMATCH[1]}
col2=${BASH_REMATCH[2]}
col3=${BASH_REMATCH[3]}
col4=${BASH_REMATCH[4]}
col5=${BASH_REMATCH[5]}
done < file.txt
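As a quick self-contained check of the loop above, here is a minimal sketch that runs it against two of the question's sample rows. The temporary file, the `re` variable and the `col3_values` array are assumptions added for illustration. It also swaps `.*` for the stricter `[^ ]+` in the unquoted columns, which is a variation on the pattern above rather than the same regex.

```shell
#!/bin/bash
# Sample rows from the question, written to a temporary file (illustrative only).
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
1373232436 785907701 "abc 245" 0 1
1373232436 1048824909 "def pqr" 1 0
EOF

# Keeping the pattern in a variable avoids escaping spaces and quotes inside [[ ]].
re='^([^ ]+) ([^ ]+) ("[^"]*") ([^ ]+) ([^ ]+)$'

col3_values=()
while read -r line; do
    [[ $line =~ $re ]] || continue          # skip lines that do not fit the layout
    col3_values+=("${BASH_REMATCH[3]}")     # third group keeps its double quotes
done < "$tmpfile"
rm -f "$tmpfile"

printf '%s\n' "${col3_values[@]}"
```

Storing the regex in a variable and expanding it unquoted on the right-hand side of `=~` is the usual way to keep such patterns readable in bash.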
Upvotes: 2
Reputation: 20980
You can also use gawk + FPAT:
$ gawk 'BEGIN{FPAT="([^ ]*)|\"([^\"]*)\""} {print "\nLine: " NR; for(i=1;i<=NF;i++){print $i}}' test.csv
Line: 1
1373232436
785907701
"abc 245"
0
1
Line: 2
1373232436
1048824909
"def pqr"
1
0
Line: 3
1373232486
785907701
"uvw ghn"
0
1
Line: 4
1373232486
1048824909
"1109 xyz"
1
0
Note 1: FPAT is a gawk feature, so it may not be available in your awk version.
Note 2: I just realized that, incidentally, the example in the link mentioned above deals with a requirement very similar to yours, though I wrote that regex myself. :-)
Upvotes: 1
Reputation: 785276
You can actually use:
grep -Eo '"[^"]*"|\w+' file
to print each field of your input file on its own line, keeping a quoted column together as one field.
You can use a script like this:
#!/bin/bash
numcols=$(awk -F '"[^"]*"|[^[:blank:]]+' '{print NF-1; exit}' file)
n=1
while read -r w; do
echo "$w"
(( (n++ % numcols) )) || echo "<-- End of line $(( (n / numcols) )) -->"
done < <(grep -Eo '"[^"]*"|\w+' file)
For your input file it gives:
1373232436
785907701
"abc 245"
0
1
<-- End of line 1 -->
1373232436
1048824909
"def pqr"
1
0
<-- End of line 2 -->
1373232486
785907701
"uvw ghn"
0
1
<-- End of line 3 -->
1373232486
1048824909
"1109 xyz"
1
0
<-- End of line 4 -->
You can process them individually instead of doing echo "$w".
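One way to do that processing, as a rough sketch only: run the same grep pattern on each input line and load the tokens into an array, so each column can be addressed by index. The `fields` and `third_cols` names and the inlined sample rows are assumptions for illustration, and `\w` relies on GNU grep, as in the command above.

```shell
#!/bin/bash
# Two sample rows from the question, inlined (illustrative only).
input='1373232436 785907701 "abc 245" 0 1
1373232486 1048824909 "1109 xyz" 1 0'

third_cols=()
while IFS= read -r line; do
    # grep -o prints one token per line: a quoted string or a run of word characters.
    mapfile -t fields < <(grep -Eo '"[^"]*"|\w+' <<< "$line")
    third_cols+=("${fields[2]}")   # index 2 is the third column, quotes included
done <<< "$input"

printf '%s\n' "${third_cols[@]}"
```

Because the tokens are split per line, this does not need the column count computed with awk; `mapfile` requires bash 4.0 or later, which fits the bash 4.1.x mentioned in the question.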
Upvotes: 0
Reputation: 20980
I think your input file is essentially a CSV file with a space as the field separator.
In that case you can use csvtool:
csvtool -t " " cols 1-6 test.csv | while IFS=, read col1 col2 col3 col4 col5 col6; do
...
...
done
Run csvtool --help for more details.
Note: There will NOT be surrounding double quotes around the col3 data, so you would get abc 245 and not "abc 245" in the value.
Upvotes: 0