Reputation: 53
I am having trouble with a script I wrote in bash. I am reading in a .txt file line by line in a while loop. The line is in the form of x:y:z:a:b:c. Each element ('x', 'a', etc.) corresponds to something such as age, height, name, and so on. I have successfully managed to print out the desired fields, but when I try to print the whole "x:y:z:a:b:c" "x:y:z" will print out to a .csv in one cell, and "a:b:c" will go to the next cell, due to the fact that there is a comma in the last name and first name. I know this sounds confusing so I can clarify further if necessary but here is what it looks like:
Cell 1 Cell 2
age:height:number:last name first name:language
I want all this information in one cell. I think the problem is due to the fact that there is a comma between the last and first name. Please let me know if you have any tips or pointers.
Edit: Here is what the code looks like:
while read INPUT do
str=$INPUT
IFS=: read -r -a ARR <<< "$str"
NAME=${ARR[0]}
AGE=${ARR[3]}
echo $AGE, $NAME >>Filedirectory.filename.csv
done<filedirectory.filename.txt
...where filedirectory.filename.txt
contains the following:
INPUT=Smith, John:126 lbs:67in:26:College University
Taking a sample input of str = "Smith, John:126 lbs:67in:26:College University"
So, `NAME = Smith, John, AGE = 26. Ideally the output should be 2 cells next to each other populated as:
26 - Smith, John
Instead the output in excel when I open the CSV file ends up being:
12 - Smith - John
Where each '-' symbolizes a column (12
is in one cell, last name in the next, etc.) What I want is for Last Name and First Name to be in one cell, but apparently this is not happening due to the fact that the Name is formatted as "Smith, John" and the comma is messing things up
Upvotes: 1
Views: 1190
Reputation: 652
One way I have found to do this is to is to use a CSV-aware tool, such as csvkit
, which provides the command csvformat
:
while IFS=: read -r name _ age _; do
echo -e "${name}\t${age}";
done < filedirectory.filename.txt | csvformat -t > outputfile.csv
Upvotes: 0
Reputation: 20002
You can use an Excel feature:
When the first line is SEP=:
, the csv file will be read using this as SEP.
Now it is easy:
sed -r ' s/([^:]*):([^:]*):([^:]*):([^:]*):.*/\4:\1/;
1s/.*/SEP=:\n&/' < filedirectory.filename.txt > outputfile;
Upvotes: 0
Reputation: 295403
A simple implementation might look like:
while IFS=: read -r name _ age _; do
printf '"%s","%s"\n' "${name//\"/\"\"}" "${age//\"/\"\"}"
done <in.txt >out.csv
Less correctly (see the APPLICATION USAGE and RATIONALE sections of the linked specification describing caveats), you could do this with echo
:
while IFS=: read -r name _ age _; do
echo '"'"${name//\"/\"\"}"'","'"${age//\"/\"\"}"'"'
done <in.txt >out.csv
In both cases, we're surrounding the cell contents with quotes, and using a parameter expansion to double-up any quotes found inside of the strings, which makes those quotes literal in CSV syntax.
Upvotes: 1