Reputation: 53
I have a file data.txt which contains 200 columns and 200 rows (a square matrix). I have been trying to split my file into 200 files, each of them containing one of the columns from the big data file. These were my two attempts, using cut and awk, but I don't understand why it is not working.
NM=`awk 'NR==1{print NF-2}' < file.txt`
echo $NM
for (( i=1; i = $NM; i++ ))
do
echo $i
cut -f ${i} file.txt > tmpgrid_0${i}.dat
#awk '{print '$i'}' file.txt > tmpgrid_0${i}.dat
done
Any suggestions?
EDIT: Thank you very much to all of you. All answers were valid, but I cannot vote for all of them.
Upvotes: 5
Views: 3715
Reputation: 7342
An alternative solution using tr and split:
< file.txt tr ' ' '\n' | split -nr/200
This assumes that the file is space-delimited, but the tr command can be tweaked as appropriate for any delimiter. Essentially this puts each entry on its own line, and then uses split's round-robin mode (-n r/200) to send every 200th line to the same file, so each output file ends up holding one column.
paste -d' ' x* | cmp - file.txt
verifies that it worked, assuming split writes files with the default x prefix.
I got this solution from Reuti on the coreutils mailing list.
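As a minimal illustration, here is the same pipeline on a made-up 3x3 matrix (file name and data are invented for the demo; the -n option requires GNU coreutils split):

```shell
# Build a small 3x3 space-delimited matrix (illustrative data):
printf '1 2 3\n4 5 6\n7 8 9\n' > small.txt

# One entry per line, then round-robin into 3 files (GNU split):
< small.txt tr ' ' '\n' | split -nr/3

# xaa now holds column 1 (1, 4, 7), xab column 2, xac column 3.
cat xaa
```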
Upvotes: 1
Reputation: 67211
awk '{for(i=1;i<=5;i++){name=FILENAME"_"i;print $i> name}}' your_file
Tested with 5 columns:
> cat temp
PHE 5 2 4 6
PHE 5 4 6 4
PHE 5 4 2 8
TRP 7 5 5 9
TRP 7 5 7 1
TRP 7 5 7 3
TYR 2 4 4 4
TYR 2 4 4 0
TYR 2 4 5 3
> nawk '{for(i=1;i<=5;i++){name=FILENAME"_"i;print $i> name}}' temp
> ls -1 temp_*
temp_1
temp_2
temp_3
temp_4
temp_5
> cat temp_1
PHE
PHE
PHE
TRP
TRP
TRP
TYR
TYR
TYR
>
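For the 200-column case, the loop bound can use NF instead of a hard-coded 5. A sketch, untested against the original data (note that POSIX awk requires parentheses around a concatenated redirection target):

```shell
# One pass over the file, one output file per column:
awk '{for(i=1;i<=NF;i++) print $i > (FILENAME "_" i)}' your_file
```

With 200 output files open at once, some awk implementations hit their open-file limit; gawk manages descriptors automatically, but other awks may need the files closed and reopened in append mode.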
Upvotes: 7
Reputation: 207385
To summarise my comments, I suggest something like this (untested as I have no sample file):
NM=$(awk 'NR==1{print NF-2}' file.txt)
echo $NM
for (( i=1; i <= $NM; i++ ))
do
echo $i
awk '{print $'$i'}' file.txt > tmpgrid_0${i}.dat
done
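A likely reason the original cut attempt failed: cut -f splits on tabs by default, so on a space-delimited file it returns each whole line unchanged. A sketch of that fix, assuming single-space separators (the i=2 value is just for illustration):

```shell
# cut needs an explicit delimiter for space-separated input;
# -f alone assumes tab-delimited fields.
i=2
cut -d' ' -f "$i" file.txt > "tmpgrid_0${i}.dat"
```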
Upvotes: 2