pds

Reputation: 2582

How to add a generic header to a csv file in bash

I want a bash script to add a header line (with generic column names) to a CSV file.

My CSV file content:

a,b,c,d,e,f,g,h,i,j
a,b,c,d,e,f,g,h,i,j

Desired CSV file content:

1,2,3,4,5,6,7,8,9,10
a,b,c,d,e,f,g,h,i,j
a,b,c,d,e,f,g,h,i,j

I have been trying to convert between the CSV and ARFF file formats. The CSV2Arff.java code example from Weka requires the input CSV file to have a header row, but my CSV file has none.

Upvotes: 0

Views: 2381

Answers (2)

D Bro

Reputation: 563

This can be done in one line in the shell (bash). For example, given a file called "dat.csv":

$ cat dat.csv
a,b,c,d,e,f,g,h,i,j
a,b,c,d,e,f,g,h,i,j

then

$ cat <(seq -s, 1 $(( $(head -n 1 dat.csv | tr -dc "," | wc -c) + 1 ))) dat.csv
1,2,3,4,5,6,7,8,9,10
a,b,c,d,e,f,g,h,i,j
a,b,c,d,e,f,g,h,i,j

You can write the result to a new file like this:

$ cat <(seq -s, 1 $(( $(head -n 1 dat.csv | tr -dc "," | wc -c) + 1 ))) dat.csv > newfile.csv
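
If the one-liner is hard to read, the same steps can be spelled out in a small function. This is only a sketch: add_numeric_header is a made-up name, and it assumes GNU seq (BSD/macOS seq handles -s differently) and a CSV with no quoted fields containing commas.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a 1,2,...,N header followed by the file itself.
# Assumes GNU seq and no commas inside quoted fields.
add_numeric_header() {
    local file=$1
    local commas
    # Keep only the commas from the first line and count them.
    commas=$(head -n 1 "$file" | tr -dc ',' | wc -c)
    # Number of columns = commas + 1; seq -s, joins the numbers with commas.
    seq -s, 1 $(( commas + 1 ))
    # Emit the original content after the header.
    cat "$file"
}
```

Usage would be something like add_numeric_header dat.csv > newfile.csv.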

Upvotes: 1

pds

Reputation: 2582

Usage:

./add_header.sh "input.csv"

The bash script (add_header.sh) takes the CSV file name as its only argument.

#!/bin/bash
timestamp=$(date +"%Y-%m-%d_%H-%M")
input_csv_file=$1
output_csv_file="header_${timestamp}_${input_csv_file}"

o=""
# Count the commas in the first row; the column count is one more than that.
n=$(( $(head -n1 "$input_csv_file" | tr -dc ',' | wc -c) + 1 ))

for i in $(seq 1 "$n")  # One number per column
do
        o="$o${o:+,}$i"  # Put a comma before every number except the first
done

# Write the numbers with commas to the first line of the new file.
echo "$o" > "$output_csv_file"
# Append the whole of the original file to the new file.
cat "$input_csv_file" >> "$output_csv_file"

Output: a new file containing the header (comma-separated column numbers) followed by the original CSV file content, e.g.

1,2,3,4,5,6,7,8,9,10
a,b,c,d,e,f,g,h,i,j
a,b,c,d,e,f,g,h,i,j
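
For comparison, the loop and the separate column count can be folded into a single awk pass. This is a rough sketch rather than a drop-in replacement: add_header_awk is a hypothetical name, and like the script above it assumes no field contains a quoted comma.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prepend a 1..N header using a single awk pass.
add_header_awk() {
    awk -F, '
        NR == 1 {
            # NF is the number of comma-separated fields in the first row.
            for (i = 1; i <= NF; i++) printf "%s%s", i, (i < NF ? "," : "\n")
        }
        { print }   # Pass every input line through unchanged.
    ' "$1"
}
```

It could be called as add_header_awk input.csv > output.csv. Since awk separates the numbers itself, no trailing comma is produced and seq is not needed.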

Upvotes: 1
