Reputation: 1070
I am a Mac Newbie and need to import csv files into a mssql database. Large csv files. I am using the import wizard for the actual import but I am having a hard time determining the longest string for each column.
I have found the following command line which will show the values in the selected column:
cut -d, -f<column number> <path to csv file name>
Now I would like to output the longest length of that column. Props to anyone who could help me return the length for all columns.
Thanks in advance
Upvotes: 0
Views: 1714
Reputation: 571
as @jpw sugested
my answer:
why not use awk ?
I don't have a mac to test, but length() is a pretty standard function in awk, so this should work.
awk file:
{ for (i=1;i<=NF;i++) {
l=length($i) ;
if ( l > linesize[i] ) linesize[i]=l ;
}
}
END {
for (l in linesize) printf "Columen%d: %d\n",l,linesize[l] ;
}
then run
mybox$ awk -F, -f test.awk a.txt
Columen4: 4
Columen5: 3
Columen1: 6
Columen2: 7
Columen3: 4
To get columns sorted :
BEGIN{ maxi=-1 ; }
{ for (i=1;i<=NF;i++) {
l=length($i) ;
if ( l > linesize[i] ) linesize[i]=l ;
if ( i > maxi ) maxi = i ;
}
}
END {
for (i=1;i<=maxi;i++) printf "Columen%d: %d\n",i,linesize[i] ;
}
gives:
mybox$ awk -F, -f test.awk a.txt
Columen1: 6
Columen2: 7
Columen3: 4
Columen4: 4
Columen5: 3
Upvotes: 4
Reputation: 44891
The first part of your question is answered in the post flagged as duplicate, but to apply that for all columns this bash one-line script will give you the max length for all columns (based on the first line of the file):
for((i=1;i< `head -1 test.csv|awk '{print NF}' FS=,`+1 ;i++)); do echo | xargs echo -n "Column$i: " && cut -d, -f $i test.csv |wc -L ; done
To explain how it works, the for loop starts from 1 and uses awk
to determine the number of columns, and then does the cut -d, -f <column number>
with pipe (|) through wc -L
to return max length.
The output from my sample file with five columns:
Column1: 6
Column2: 7
Column3: 4
Column4: 4
Column5: 3
You could even put it in a shell-script that takes the filename of the csv file as argument:
#!/bin/bash
for((i=1;i< `head -1 $1|awk '{print NF}' FS=,`+1 ;i++));
do echo | xargs echo -n "Column$i: "
&& cut -d, -f $i $1 |wc -L ; done
Save this as something.sh
, then do chmod + x something.sh
and execute it with ./something.sh your_csv_file.csv
Upvotes: 2