Reputation: 625

To print FileName and selected rows:

I would like to print the first 2 rows of every file in a directory, along with the file name. All of the files have the .csv extension, and there are around 100 of them in that directory.

sample_jan.csv

10,Jan,100
30,Jan,300
50,Jan,500

sample_feb.csv

10,Feb,200
20,Feb,400
40,Feb,800
60,Feb,1200

Expected Output:

Filename:sample_jan.csv
10,Jan,100
30,Jan,300

Filename:sample_feb.csv
10,Feb,200
20,Feb,400

I tried listing 2 rows for individual files as shown below, but I don't know how to loop over all the files.

cat sample_jan.csv | head -2 >>output.csv
cat sample_feb.csv | head -2 >>output.csv

cat *.csv | head -2 >output.csv

Looking for your suggestions. I don't have Perl or Python available.

Upvotes: 1

Answers (5)

fab

Reputation: 1859

You can loop over all .csv files in the directory like this:

for f in *.csv; do YOUR_COMMAND; done

That can be combined with your commands:

for f in *.csv; do cat "$f" | head -2 >>output.csv ; done

(Untested - just for the idea)
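
As a minimal sketch, the same loop can also write the Filename: label the question asks for. The scratch directory and the `out` variable here are made up for illustration, and the sample data is copied from the question. The output is written outside the directory on the assumption that you don't want a later run to pick it up through *.csv.

```shell
# Scratch directory holding the two sample files from the question (illustrative only).
dir=$(mktemp -d)
printf '10,Jan,100\n30,Jan,300\n50,Jan,500\n' > "$dir/sample_jan.csv"
printf '10,Feb,200\n20,Feb,400\n40,Feb,800\n60,Feb,1200\n' > "$dir/sample_feb.csv"

# Hypothetical output path, kept outside $dir so it never matches *.csv.
out=$(mktemp)

cd "$dir" || exit 1
for f in *.csv; do
    printf 'Filename:%s\n' "$f"   # label line
    head -n 2 "$f"                # first two rows of this file
    echo                          # blank line after each file
done > "$out"                     # one redirection for the whole loop

cat "$out"
```

Redirecting once after `done` avoids reopening the output file on every iteration. The files come out in the shell's glob order, so sample_feb.csv is listed before sample_jan.csv.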

Upvotes: 0

ooga

Reputation: 15501

In awk:

awk '
  FNR == 1  {if(NR!=1)print""; printf("Filename:%s\n", FILENAME)}
  FNR < 3
' *.csv

Explanation

Recall that:

records default to lines
NR counts records starting at 1 and doesn't reset between files
FNR counts records starting at 1 and resets to 1 for each file.
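
To see the difference between NR and FNR in practice, here is a small sketch that prints both counters for every line of two throwaway files. The file names and contents are hypothetical and exist only for this demonstration.

```shell
# Two throwaway files (hypothetical names) to show how NR and FNR differ.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\nd\ne\n' > "$dir/two.txt"

# For every record, print the file name, NR, FNR and the record itself.
nr_fnr=$(cd "$dir" && awk '{ print FILENAME, NR, FNR, $0 }' one.txt two.txt)
printf '%s\n' "$nr_fnr"
```

NR runs from 1 to 5 across both files, while FNR goes back to 1 on the first line of two.txt. That reset is what the FNR == 1 test relies on to detect the start of each file.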

Script:

FNR == 1 {  # If it's the first record of the current file then:
    if (NR != 1) # If it's NOT the first record of all files
        print "";  #   then print an empty line
    printf("Filename:%s\n", FILENAME) # Print the filename
}

# If record number of current file is < 3 then
# perform default action (print the record).
FNR < 3

If you have too many files to fit on the command line (after the expansion of *.csv), then you could try this:

find -name '*.csv' -execdir awk '
  FNR == 1 {
    file = FILENAME
    sub(/^\.\//, "", file)
    printf("\nFilename:%s\n", file)
  }
  FNR < 3
' '{}' +

The find command above will execute awk with as large a list of filenames as will fit on the command line ('{}' + is replaced with this list), running awk as many times as necessary, but the minimum number of times.

The substitution in the awk script strips the ./ from the front of the filenames before they are printed.

Upvotes: 1

Kent

Reputation: 195039

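If you don't care about the exact Filename: label, you can just use head, which prints its own ==> name <== header for each file when given more than one: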

head -n2 * > output.csv
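
If the label does matter, one possible sketch is to pipe the output of head through sed and rewrite its ==> name <== headers. This assumes that no data line happens to look like such a header, and that at least two files are given (head prints no header for a single file). The scratch directory and the `labelled` variable are illustrative only.

```shell
# Scratch directory holding the two sample files from the question (illustrative only).
dir=$(mktemp -d)
printf '10,Jan,100\n30,Jan,300\n50,Jan,500\n' > "$dir/sample_jan.csv"
printf '10,Feb,200\n20,Feb,400\n40,Feb,800\n60,Feb,1200\n' > "$dir/sample_feb.csv"

# head emits "==> name <==" before each file; sed turns that into "Filename:name".
labelled=$(cd "$dir" &&
    head -n 2 sample_jan.csv sample_feb.csv |
    sed 's/^==> \(.*\) <==$/Filename:\1/')
printf '%s\n' "$labelled"
```

The blank line that head puts between files is left untouched, so the result has the same layout as the expected output in the question.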

Upvotes: 0

jaypal singh

Reputation: 77095

Using awk:

$ awk 'FNR==1{print "Filename:" FILENAME}FNR<3' *.csv
Filename:sample_jan.csv
10,Jan,100
30,Jan,300
Filename:sample_feb.csv
10,Feb,200
20,Feb,400

If you have GNU awk and your files are very big, then this could be an option, because nextfile skips the rest of each file once its first two lines have been printed:

$ gawk 'FNR==1{print "Filename:" FILENAME}FNR>2{nextfile}1' *.csv
Filename:sample_jan.csv
10,Jan,100
30,Jan,300
Filename:sample_feb.csv
10,Feb,200
20,Feb,400

Upvotes: 2

rojomoke

Reputation: 4015

for file in *
do
    echo "Filename:$file" >> output.csv
    head -2 "$file" >> output.csv
    echo >> output.csv
done

There's no need to cat and pipe the files into head. It can take the filename as a parameter.

Upvotes: 0
