Reputation: 2582
I want a bash script to add a header line (with generic column names) to a CSV file.
My CSV file content:
a,b,c,d,e,f,g,h,i,j
a,b,c,d,e,f,g,h,i,j
Desired CSV file content:
1,2,3,4,5,6,7,8,9,10
a,b,c,d,e,f,g,h,i,j
a,b,c,d,e,f,g,h,i,j
I have been trying to convert a CSV file to the ARFF file format. However, the CSV2Arff.java code example from Weka requires the input CSV file to have a header, and my CSV file has none.
Upvotes: 0
Views: 2381
Reputation: 563
This can be done in one line in the shell (bash). For example, with a file called "dat.csv":
$ cat dat.csv
a,b,c,d,e,f,g,h,i,j
a,b,c,d,e,f,g,h,i,j
then
$ cat <(seq -s, 1 $(( `head -n 1 dat.csv | tr -dc "," | wc -c` + 1 ))) dat.csv
1,2,3,4,5,6,7,8,9,10
a,b,c,d,e,f,g,h,i,j
a,b,c,d,e,f,g,h,i,j
You can write the result to a new file like this:
$ cat <(seq -s, 1 $(( `head -n 1 dat.csv | tr -dc "," | wc -c` + 1 ))) dat.csv > newfile.csv
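As an alternative to counting commas with tr and wc, the same idea can be sketched with awk, which exposes the number of fields on a row directly as NF. The function name add_numeric_header is hypothetical (it is not part of the answer above), and, like the one-liner, this sketch assumes plain comma-separated rows with no quoted fields that contain commas.

```shell
# Hypothetical helper: print the given CSV file to stdout with a numeric
# header line (1,2,...,N) prepended, where N is the field count of row 1.
# Assumes simple CSV: no quoted fields containing commas.
add_numeric_header() {
    awk -F, '
        NR == 1 {
            # NF is the number of comma-separated fields on the first row.
            for (i = 1; i <= NF; i++)
                printf "%s%s", i, (i < NF ? "," : "\n")
        }
        { print }   # pass every original row through unchanged
    ' "$1"
}
```

It could then be used in the same way as the one-liner:
$ add_numeric_header dat.csv > newfile.csv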
Upvotes: 1
Reputation: 2582
Usage:
./add_header.sh "input.csv"
The bash script (add_header.sh) takes the CSV file name as its only argument.
#!/bin/bash
timestamp=$(date +"%Y-%m-%d_%H-%M")
input_csv_file=$1
output_csv_file="header_${timestamp}_${input_csv_file}"
o=""
# Count the commas in the first row; the number of columns is one more than that.
n=$(( $(head -n 1 "$input_csv_file" | tr -dc ',' | wc -c) + 1 ))
for i in $(seq 1 "$n"); # Loop over the column numbers 1..n
do
o="$o$i,"
done
o=${o%,} # Remove the trailing comma left by the loop
# Write the numbers with commas to the first line of the new file.
echo "$o" > "$output_csv_file"
# Append the whole of the original file to the new file.
cat "$input_csv_file" >> "$output_csv_file"
Output: a new file containing the header (comma-separated column numbers) followed by the original CSV file content, e.g.
1,2,3,4,5,6,7,8,9,10
a,b,c,d,e,f,g,h,i,j
a,b,c,d,e,f,g,h,i,j
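One thing to watch: the script builds the output name by putting "header_" and the timestamp in front of the whole argument, so it only works as intended when the argument is a bare file name in the current directory. A minimal sketch of one way to handle an argument such as data/input.csv is to split it with dirname and basename first. The function name header_output_path is hypothetical and not part of the script above.

```shell
# Hypothetical helper: build the output path so that the new file lands
# next to the input file, even when the input is given with a directory.
#   $1 = input CSV path, $2 = timestamp string
header_output_path() {
    local input=$1 timestamp=$2
    local dir base
    dir=$(dirname -- "$input")     # directory part, "." for a bare file name
    base=$(basename -- "$input")   # file name without the directory
    printf '%s/header_%s_%s\n' "$dir" "$timestamp" "$base"
}
```

In the script, the assignment could then become:
output_csv_file=$(header_output_path "$input_csv_file" "$timestamp")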
Upvotes: 1