I have 50 directories named Subj1, Subj2 .. Subj50, each containing 152 text files named in the following format:
regional_vol_GM_atlas1.txt
..
..
regional_vol_GM_atlas152.txt
Each file has data arranged in 4 rows and 2 columns, with the column values separated by a space:
667869 667869
580083 580083
316133 316133
9020 9020
I would like to export the fourth row of each text file, together with a header, into a CSV file, for all 50 directories.
I have written a script that exports the data from each text file along with a header and creates a CSV, but it copies all the data from each text file into the CSV instead of only the 4th row.
#!/bin/bash
# pasting the file name as column name,
for x in regional_vol_*.txt ; do
sed -i "1s/^/${x}\n/" ${x}
done
# Sorting the files and Subj1 directory name is file name of csv file
paste -d, $(ls -1v regional_vol*.txt ) >> subj1.csv
The figure below shows the output file. Subj1 is a directory name.
Upvotes: 0
Views: 577
Reputation: 8769
You can use find to recursively locate the desired files in all subdirectories and build the header row from the file names, then use tail to take the last row of each file and paste to join those rows into a new file.
The main commands that perform all the operations are:
$ echo "x" > temp
$ find . -type d -iname "sub*" | sed 's/^.*\///' >> temp
$ find sub1/* -type f -printf "%f\n" | paste -s -d , > data.csv
$ for dir in *; do paste -s -d ',' <(tail -q -n 1 "$dir"/regional_vol_*.txt) >> data.csv; done 2> /dev/null
$ paste -d , temp <(sed '/^\s*$/d' data.csv)
x,regional_vol_GM_atlas1.txt,regional_vol_GM_atlas2.txt
sub1,1 1,2 2
sub2,3 3,4 4
Here is a sample directory structure that I created:
$ ls -R
.:
sub1/ sub2/
./sub1:
regional_vol_GM_atlas1.txt regional_vol_GM_atlas2.txt
./sub2:
regional_vol_GM_atlas1.txt regional_vol_GM_atlas2.txt
$ cat sub1/* sub2/*
header1 header1
667869 667869
580083 580083
316133 316133
1 1
header2 header2
667869 667869
580083 580083
316133 316133
2 2
header3 header3
667869 667869
580083 580083
316133 316133
3 3
header4 header4
667869 667869
580083 580083
316133 316133
4 4
$ find sub1/* -type f -printf "%f\n" | paste -s -d , > data.csv
$ for dir in *; do paste -s -d ',' <(tail -q -n 1 "$dir"/regional_vol_*.txt) >> data.csv; done 2> /dev/null
$ cat data.csv
regional_vol_GM_atlas1.txt,regional_vol_GM_atlas2.txt
1 1,2 2
3 3,4 4
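The commands above take the last row of each file and rely on the shell's default glob order, which puts atlas10 before atlas2 once there are more than nine files. As a rough sketch of an alternative, the function below selects the row by number and lists directories and files in natural order. The name build_csv, its two arguments, and the "subject" header label are hypothetical and not part of the original commands. It assumes bash 4 or later (for mapfile), GNU sort (for -V), and names without newlines.

```shell
#!/bin/bash
# Sketch only: build_csv is a hypothetical helper, not part of the answer above.
# Usage: build_csv ROOT [ROW]
#   ROOT  directory that contains the subject directories
#   ROW   line number to extract from each file (defaults to 4)
build_csv() {
    local root=$1 row=${2:-4}
    local dirs=() files=() dir f header_done=0

    # Subject directories in natural order (Subj1, Subj2, ..., Subj10).
    mapfile -t dirs < <(printf '%s\n' "$root"/*/ | sort -V)

    for dir in "${dirs[@]}"; do
        # Atlas files in natural order (atlas1, atlas2, ..., atlas10).
        mapfile -t files < <(printf '%s\n' "$dir"regional_vol_*.txt | sort -V)
        # Skip directories where the pattern matched nothing.
        [ -e "${files[0]}" ] || continue

        if [ "$header_done" -eq 0 ]; then
            # Header line: a label column followed by the bare file names.
            printf 'subject'
            printf ',%s' "${files[@]##*/}"
            printf '\n'
            header_done=1
        fi

        # One line per subject: directory name, then line ROW of each file.
        printf '%s' "$(basename -- "$dir")"
        for f in "${files[@]}"; do
            # A file shorter than ROW lines yields an empty cell,
            # so the columns stay aligned.
            printf ',%s' "$(sed -n "${row}p" "$f")"
        done
        printf '\n'
    done
}
```

Running build_csv . 4 > data.csv from the directory that holds the subject directories should produce one header line and one line per subject. If a header line has already been prepended to each text file, as in the sample files above, the wanted row is line 5 rather than line 4.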
Upvotes: 0