user5558875

Bash script: export the fourth row of each text file to CSV

I have 50 directories named Subj1, Subj2, ..., Subj50. Each contains 152 text files named as follows:

regional_vol_GM_atlas1.txt
..
..
regional_vol_GM_atlas152.txt

Each file holds data in 4 rows and 2 columns, with the column values separated by a space:

667869 667869
580083 580083
316133 316133
9020 9020

I would like to export the fourth row of each text file, together with a header, into a CSV file, for all 50 directories.

I have written a script that takes the data from each text file, adds the file name as a header, and creates a CSV. However, it pastes the entire contents of each text file into the CSV instead of only the 4th row.

#!/bin/bash


# pasting the file name as column name,  
for x in regional_vol_*.txt ; do  


   sed -i "1s/^/${x}\n/" ${x}

done
# Sorting the files and Subj1 directory name is file name of csv file 
paste -d, $(ls -1v regional_vol*.txt ) >> subj1.csv

The figure below shows the resulting output file (Subj1 is the directory name).

[Image: Sub1output]

Upvotes: 0

Views: 577

Answers (1)

riteshtch

Reputation: 8769

You can use find to list the subject directories and the file names, tail to extract the last row of each file, and paste to join everything into a single CSV (sed is only used to strip path prefixes and blank lines).

The main commands that perform all the operations are:

$ echo "x" > temp
$ find . -type d -iname "sub*" | sed 's/^.*\///' >> temp
$ find sub1/* -type f -printf "%f\n" | paste -s -d , > data.csv
$ for dir in *; do paste -s -d ',' <(tail -q -n 1 "$dir"/regional_vol_*.txt) >> data.csv; done 2> /dev/null
$ paste -d , temp <(sed '/^\s*$/d' data.csv)
x,regional_vol_GM_atlas1.txt,regional_vol_GM_atlas2.txt
sub1,1 1,2 2
sub2,3 3,4 4
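The commands above take the last line of each file and rely on find and the shell glob listing files in the same order. A minimal sketch that instead picks line 4 explicitly with sed -n '4p' and sorts both the header and the rows with sort -V could look like the following. The function name export_fourth_rows and the output path are hypothetical, and the sketch assumes GNU find (-printf) and GNU sort (-V), directory names starting with Subj, and paths without newlines.

```shell
#!/bin/bash
# Sketch only: export_fourth_rows is a hypothetical helper, not part of the answer above.
# Assumes GNU find (-printf) and GNU sort (-V).
export_fourth_rows() {
    local root="$1"
    local out="$2"
    local dirs first files dir f

    # Subject directories in natural order: Subj1, Subj2, ..., Subj10.
    dirs=$(find "$root" -mindepth 1 -maxdepth 1 -type d -name 'Subj*' | sort -V)
    [ -n "$dirs" ] || { echo "no Subj* directories under $root" >&2; return 1; }

    # File names are taken from the first subject and reused for every row,
    # so header columns and data columns stay aligned.
    first=$(printf '%s\n' "$dirs" | head -n 1)
    files=$(find "$first" -maxdepth 1 -type f -name 'regional_vol_*.txt' -printf '%f\n' | sort -V)
    [ -n "$files" ] || { echo "no regional_vol_*.txt files in $first" >&2; return 1; }

    # Header row: "x" followed by the file names.
    {
        printf 'x'
        printf '%s\n' "$files" | while IFS= read -r f; do printf ',%s' "$f"; done
        printf '\n'
    } > "$out"

    # One row per subject: directory name, then line 4 of each file.
    # A file missing from a subject yields an empty cell.
    printf '%s\n' "$dirs" | while IFS= read -r dir; do
        printf '%s' "$(basename "$dir")"
        printf '%s\n' "$files" | while IFS= read -r f; do
            printf ',%s' "$(sed -n '4p' "$dir/$f" 2>/dev/null)"
        done
        printf '\n'
    done >> "$out"
}
```

It would be called as, for example, export_fourth_rows . all_subjects.csv from the directory that holds Subj1 ... Subj50. Each cell keeps both space-separated values of row 4, as in the output above.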

Here is the sample structure I used for testing:

$ ls -R
.:
sub1/  sub2/

./sub1:
regional_vol_GM_atlas1.txt  regional_vol_GM_atlas2.txt

./sub2:
regional_vol_GM_atlas1.txt  regional_vol_GM_atlas2.txt
$ cat sub1/* sub2/*
header1 header1
667869 667869
580083 580083
316133 316133
1 1
header2 header2
667869 667869
580083 580083
316133 316133
2 2
header3 header3
667869 667869
580083 580083
316133 316133
3 3
header4 header4
667869 667869
580083 580083
316133 316133
4 4
$ find sub1/* -type f -printf "%f\n" | paste -s -d , > data.csv
$ for dir in *; do paste -s -d ',' <(tail -q -n 1 "$dir"/regional_vol_*.txt) >> data.csv; done 2> /dev/null
$ cat data.csv 
regional_vol_GM_atlas1.txt,regional_vol_GM_atlas2.txt

1 1,2 2
3 3,4 4
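The empty line in data.csv most likely comes from the loop over *, which also visits plain files such as data.csv itself: tail fails on them (the error is hidden by 2> /dev/null) and paste still writes an empty row. A small sketch that loops over directories only, using the */ glob, avoids this. The function name last_rows_per_dir is hypothetical, and tail -q is assumed to be the GNU version.

```shell
# Sketch only: last_rows_per_dir is a hypothetical helper name.
# Looping over */ visits directories only, so plain files in the current
# directory (data.csv, temp) never reach tail.
last_rows_per_dir() {
    local dir
    for dir in */; do
        [ -d "$dir" ] || continue   # the glob stays literal when nothing matches
        # tail -q -n 1 prints the last line of every file without file-name banners;
        # paste -s joins those lines into one comma-separated row.
        # "$dir" already ends with a slash.
        tail -q -n 1 "$dir"regional_vol_*.txt | paste -s -d ','
    done
}
```

Used as last_rows_per_dir >> data.csv, this would replace the for loop above. Note that the glob expands in lexical order, so atlas10 sorts before atlas2 once there are more than nine files per directory.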

Upvotes: 0
