Pepe S

Reputation: 65

How do I split a large CSV file into small parts while maintaining headers and file extension with Bash?

I'm using the script below to split a large CSV file using Bash. The file is split, and the header is kept in each output file.

csvheader=`head -1 largeFile.csv`
split -d -l500 largeFile.csv smallFile_split_
find .|grep smallFile_split_ | xargs sed -i "1s/^/$csvheader\n/"
sed -i '1d' smallFile_split_00

However, I would also like to maintain the .csv file extension on each split part.

Current output is smallFile_split_00, while I would like it to be smallFile_split_00.csv

I've tried using split -l 500 -d .csv largeFile.csv file but it doesn't seem to be working.

If you have any ideas it would be greatly appreciated.

Upvotes: 0

Views: 244

Answers (1)

tripleee

Reputation: 189317

Just rename after splitting.

The find seems superfluous; you just want

for file in smallFile_split_*; do
    case $file in
      smallFile_split_00)
        cat "$file" ;;
      *)
        sed "1s/^/$csvheader\n/" "$file";;
    esac >"$file.csv"
    rm "$file"
done
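
One caveat with the sed substitution: a header containing / or & is interpreted by sed, and \n in the replacement text is a GNU extension. Below is a minimal sketch of the same loop using printf instead of sed. The function name add_header_and_suffix and its arguments are illustrative, not part of the original script, and it assumes the first file in glob order is the one that already starts with the header.

```shell
# Sketch: add the header and a .csv suffix to each split part without sed.
# add_header_and_suffix, header and prefix are hypothetical names.
add_header_and_suffix() {
    local header=$1 prefix=$2 first=1 file
    for file in "$prefix"*; do
        # Skip files that already carry the suffix (e.g. from an earlier run).
        case $file in *.csv) continue ;; esac
        if [ "$first" -eq 1 ]; then
            # The first part still contains the original header line.
            cat "$file"
            first=0
        else
            printf '%s\n' "$header"
            cat "$file"
        fi > "$file.csv"
        rm -- "$file"
    done
}
```

It could be called as add_header_and_suffix "$csvheader" smallFile_split_ right after the split command.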

Besides being unnecessary here, find will also traverse any subdirectories, which clearly you don't want in this case.
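
If GNU split is available, its --filter option pipes each part through a shell command, so the header and the extension can be handled in a single pass with no renaming afterwards. The following is only a sketch under that assumption; split_csv and its parameters are hypothetical names, and the default of 500 rows mirrors the question.

```shell
# Sketch: split a CSV into parts of N data rows, each with the header
# and a .csv extension. Assumes GNU coreutils split (for --filter).
# split_csv, input, prefix and rows are hypothetical names.
split_csv() {
    local input=$1 prefix=$2 rows=${3:-500}
    local header
    header=$(head -n 1 "$input")
    # The filter command runs in a child shell, so the header must be exported.
    export header
    # Feed only the data rows to split; the filter writes the header first.
    tail -n +2 "$input" |
        split -d -l "$rows" \
            --filter='{ printf "%s\n" "$header"; cat; } > "$FILE.csv"' \
            - "$prefix"
}
```

For example, split_csv largeFile.csv smallFile_split_ 500 would produce smallFile_split_00.csv, smallFile_split_01.csv and so on, each starting with the header line followed by up to 500 data rows.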

Upvotes: 1
