Reputation: 159
My tab-delimited file looks like this:
ID Pop snp1 snp2 snp3 snp4 snp5
AD62 1 0/1 1/1 . 1/1 0/.
AD75 1 0/0 1/1 . ./0 1/0
AD89 1 . 1/0 1/1 0/0 1/.
I want to split each column from column 3 onward so that the two values separated by the "/" character each get a column of their own. Some columns have a missing value (they contain only the "." character); I want these treated as though they were "./." so that the two "." characters also end up in separate columns. For example:
ID Pop snp1 snp2 snp3 snp4 snp5
AD62 1 0 1 1 1 . . 1 1 0 .
AD75 1 0 0 1 1 . . . 0 1 0
AD89 1 . . 1 0 1 1 0 0 1 .
Thanks
Upvotes: 1
Views: 1185
Reputation: 54392
A fairly robust way, using awk and a few if statements:
awk '{ for (i = 1; i <= NF; i++) if (i >= 3 && i < NF && NR == 1) printf "%s\t\t", $i; else if (i == NF && NR == 1) print $i; else if ($i == "." && i >= 3 && NR >= 2) printf ".\t.%s", (i == NF ? "\n" : "\t"); else { sub ("/", "\t", $i); if (i == NF) printf "%s\n", $i; else { printf "%s\t", $i; } } }' file.txt
Broken out on multiple lines:
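awk '{
    for (i = 1; i <= NF; i++) {
        if (i >= 3 && i < NF && NR == 1) printf "%s\t\t", $i;   # header: pad each snp name with an extra tab
        else if (i == NF && NR == 1) print $i;                   # last header field ends the line
        else if ($i == "." && i >= 3 && NR >= 2)                 # missing value: emit two dots
            printf ".\t.%s", (i == NF ? "\n" : "\t");
        else {
            sub ("/", "\t", $i);                                 # split the pair on "/"
            if (i == NF) printf "%s\n", $i;
            else printf "%s\t", $i;
        }
    }
}' file.txt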
HTH
Upvotes: 0
Reputation: 58351
This might work for you (GNU sed):
sed '1s/\t/&&/3g;s/\t\.\t/\t.\t.\t/g;y/\//\t/' file
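Note that a substitution of the form s/\t\.\t/.../g cannot match a bare "." that is the last field on a line (there is no trailing tab), and because matches cannot overlap it skips the second of two adjacent bare "." fields. A minimal sketch of one way around that, assuming the input really is tab-delimited: first rewrite every bare "." as "./.", then turn all "/" into tabs. The function name split_with_sed is hypothetical, and unlike the command above this version leaves the header line unpadded.

```shell
# split_with_sed: hypothetical helper name, not part of the original answer.
# Assumes tab-delimited input; reads the named files, or stdin if none are given.
split_with_sed() {
    tab=$(printf '\t')   # literal tab, so no GNU-only \t escape is needed
    # The first expression is applied twice because matches cannot overlap:
    # the second pass catches a bare "." that directly follows another one.
    # The third expression handles a bare "." in the last field.
    # The y command then converts every "/" into a tab.
    sed -e "s/${tab}\.${tab}/${tab}.\/.${tab}/g" \
        -e "s/${tab}\.${tab}/${tab}.\/.${tab}/g" \
        -e "s/${tab}\.\$/${tab}.\/./" \
        -e "y/\//${tab}/" "$@"
}
```

Usage would be something like split_with_sed file > out.txt.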
Upvotes: 0
Reputation: 1319
You can use sed:
sed -e 's/\t\.\t/\t.\t.\t/g' -e 's/\//\t/g' your_file
Upvotes: 1
Reputation: 13709
I tried this and it works well; you can tweak it to fit your requirements.
Assuming the data is in the file data.txt:
cat data.txt | sed 1d | tr '/' '\t'| sed 's/\./.\t./g'
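This splits the columns, but it has two problems you will need to work around: sed 1d drops the header line, and the final sed doubles every ".", including dots that were already part of a pair such as "0/.", so the tab-separated columns get out of alignment.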
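One possible workaround, sketched with awk instead of tr and assuming tab-delimited input: keep the header line as it is, and from column 3 onward expand only fields that are exactly "." before splitting on "/". The function name split_genotypes is hypothetical.

```shell
# split_genotypes: hypothetical helper name, not part of the original answer.
# Assumes tab-delimited input; reads the named files, or stdin if none are given.
split_genotypes() {
    awk 'BEGIN { FS = OFS = "\t" }
    NR == 1 { print; next }                # pass the header through unchanged
    {
        for (i = 3; i <= NF; i++) {
            if ($i == ".") $i = "./."      # a lone "." counts as a missing pair
            sub("/", "\t", $i)             # split the pair into two columns
        }
        print                              # $0 is rebuilt with tabs (OFS)
    }' "$@"
}
```

For example: split_genotypes data.txt > split.txt. The header keeps one name per SNP, so each data row ends up with more columns than the header, as in the example output in the question.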
Upvotes: 0