Qubix

Reputation: 4353

Splitting CSV file into text files

I have a CSV file of the form:

1,frog
2,truck
3,truck
4,deer
5,automobile

and so on, for about 50 000 entries. I want to create 50 000 separate .txt files named with the number before the comma and containing the word after the comma, like so:

1.txt  contains: frog
2.txt  contains: truck
3.txt  contains: truck
4.txt  contains: deer
5.txt  contains: automobile

and so on.

This is the script I've written so far, but it does not work properly:

#!/bin/bash

folder=/home/data/cifar10

for file in $(find "$folder" -type f -iname "*.csv")
do
    name=$(basename "$file" .txt)

while read -r tag line; do
    printf '%s\n' "$line" >"$tag".txt
done <"$file"
rm "$file"

done 

Upvotes: 3

Views: 194

Answers (3)

Claes Wikner

Reputation: 1517

 awk 'BEGIN{FS=","} {print $1".txt  contains: "$2}' file

1.txt  contains: frog
2.txt  contains: truck
3.txt  contains: truck
4.txt  contains: deer
5.txt  contains: automobile

Upvotes: 0

grail

Reputation: 930

An awk alternative:

awk -F, '{print $2 > $1 ".txt"}' file.csv
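With about 50 000 output files, some awk implementations may run out of file descriptors if every output file stays open until the end. A minimal sketch, assuming the same two-column layout, closes each file right after writing it. The scratch directory and sample rows are made up for illustration:

```shell
# Work in a scratch directory so the demo does not touch real data.
workdir=$(mktemp -d)
printf '%s\n' '1,frog' '2,truck' '3,truck' > "$workdir/file.csv"

# Write field 2 to "<field 1>.txt", then close the file so that
# only one output file is open at any time.
(
  cd "$workdir" || exit 1
  awk -F, '{ out = $1 ".txt"; print $2 > out; close(out) }' file.csv
)

cat "$workdir/2.txt"
```

Because the file is reopened with > after each close, a repeated number in the first column would overwrite the earlier file rather than append to it.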

Upvotes: 2

codeforester

Reputation: 42999

The issue is in your inner loop:

while read -r tag line; do
  printf '%s\n' "$line" > "$tag".txt
done < "$file"

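By default, read splits on whitespace, so for a line such as 1,frog the whole line ends up in tag and line stays empty. Set IFS to a comma for the read command so that tag and line are parsed correctly: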

while IFS=, read -r tag line; do
    printf '%s\n' "$line" > "$tag".txt
done < "$file"
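To see the difference in isolation, here is a small sketch that reads the same sample line twice, once with the default IFS and once with IFS=, (the variable names default_tag, comma_tag and so on are only for this demo):

```shell
line_in='1,frog'

# Default IFS (whitespace): nothing to split on, so tag gets the whole line.
read -r tag line <<EOF
$line_in
EOF
default_tag=$tag
default_line=$line

# IFS=, is set only for this one read command.
IFS=, read -r tag line <<EOF
$line_in
EOF
comma_tag=$tag
comma_line=$line

printf 'default: tag=%s line=%s\n' "$default_tag" "$default_line"
printf 'comma:   tag=%s line=%s\n' "$comma_tag" "$comma_line"
```

Prefixing the assignment to read keeps the change local to that command, so the rest of the script still sees the normal IFS.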

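With Bash 4.0+, you can use shopt -s globstar instead of find. Unlike looping over the unquoted output of $(find ...), the glob is not subject to word splitting or further globbing, so file names containing spaces or wildcard characters are handled correctly: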

shopt -s globstar nullglob
for file in /home/data/cifar10/**/*.csv; do
  while IFS=, read -r tag line; do
    printf '%s\n' "$line" > "$tag".txt
  done < "$file"
done
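If globstar is not available, one possible alternative is to let find pass the file names to an inner shell as arguments, which also avoids word splitting. This is only a sketch: the scratch tree and labels.csv are hypothetical sample data, and writing each N.txt next to its CSV is an assumption (the original script writes to the current directory):

```shell
# Scratch tree with a space in a directory name (hypothetical sample data).
folder=$(mktemp -d)
mkdir -p "$folder/sub dir"
printf '%s\n' '4,deer' '5,automobile' > "$folder/sub dir/labels.csv"

# find hands the file names to the inner shell as arguments,
# so they are never word-split or glob-expanded.
find "$folder" -type f -iname '*.csv' -exec sh -c '
  for file do
    dir=${file%/*}   # assumption: write each N.txt next to its CSV
    while IFS=, read -r tag line; do
      printf "%s\n" "$line" > "$dir/$tag.txt"
    done < "$file"
  done
' sh {} +

ls "$folder/sub dir"
```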

Note that the variable set by name=$(basename "$file" .txt) is never used in your code, and the .txt suffix would not strip a .csv extension in any case.

Upvotes: 3
