Reputation: 212
I have a huge table in which I am trying to rename some duplicated column names using sed's first-match replacement. For that, I am using an array of the duplicated column names, which I selected manually.
I first tried the sed code with one simple text string, and it worked:
sed '0,/AGE_032/ s//AGE_032.old/' combined.order.allfilter.abund.tsv | head -n1
Then I tried the same replacement with a single element of the array, and it does not work:
declare -a oldarr=("AGE_032" "MOLI_032" "OIA_013" "SH-108" "SH-16")
sed '0,/${oldarr[0]}/ s//${oldarr[0]}.old/' combined.order.allfilter.abund.tsv | head -n1
The expected output should be something like this:
AGE_023 AGE_024 AGE_025 AGE_026 AGE_027 AGE_028 AGE_029 AGE_030 AGE_031
AGE_032.old MOLI_029 MOLI_030 MOLI_031 MOLI_032 MOLI_033 SH-107 OIA_013
SH-108 SH-109 SH-110 SH-13 SH-15 SH-16 SH-17 AREN_36 AREN_38 AREN_39
AGE_032 MOLI_032 OIA_013 SH-108 SH-16
Note that AGE_032, MOLI_032, OIA_013, SH-108 and SH-16 appear twice, and only the first match of AGE_032 should be replaced with AGE_032.old.
Of course, any other code solution for solving the problem will be appreciated.
Clarification: the code must work for replacing the first match of every string inside the array.
Upvotes: 0
Views: 91
Reputation: 189457
Pulling the columns out into a Bash array seems like a very roundabout way of doing this. A simple Awk or Perl script can examine the column headers and write them out in one go. Here's a Perl one-liner to rename headers on the first line and write the result back to the original file name:
perl -i~ -pe 's/(?<!\S)(\S+)(?!\S)(?=.*(?<!\S)\1(?!\S))/$1.old/g if $.==1' combined.order.allfilter.abund.tsv
The regular expression successively finds tokens that occur at least twice on the first line of the file and replaces every occurrence except the last with the original token plus ".old".
In more detail, the lookarounds (?<!\S) and (?!\S) require whitespace or a line edge on either side, so \S+ always matches one complete label; this covers labels such as AGE_032 as well as SH-108. The parentheses capture the label, and the lookahead (?=...) checks whether it occurs again as a complete label further to the right; the \1 matches the captured string again.
The postfix condition if $.==1 limits this to the first line of the file.
The option -i~ will create a backup file with a tilde appended to its name; once you are confident that this works, you can drop the tilde (leaving just -i) if you don't want a backup file to be written.
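If you would rather stay with the sed approach from the question: the attempt fails because single quotes stop Bash from expanding ${oldarr[0]}, so sed receives that text literally. Below is a minimal sketch that uses double quotes and loops over the whole array. The function name rename_first_matches is made up for illustration. It assumes GNU sed (the 0,/re/ address is a GNU extension) and labels that contain no regex metacharacters or slashes. It also matches substrings, so SH-16 would match inside SH-160 if that came earlier in the header.

```bash
#!/usr/bin/env bash
# Sketch: append ".old" to the first occurrence of each given label.
# Assumes GNU sed and labels without regex metacharacters or "/".

rename_first_matches() {
    local file=$1
    shift
    # Nothing to rename: pass the file through unchanged.
    if (( $# == 0 )); then
        cat -- "$file"
        return
    fi
    local -a script=()
    local label
    for label in "$@"; do
        # Double quotes let Bash expand $label before sed sees the script.
        # 0,/re/ ends at the first line matching re (possibly line 1), and
        # s/// without the g flag changes only the first match on that line.
        script+=(-e "0,/${label}/ s/${label}/${label}.old/")
    done
    sed "${script[@]}" -- "$file"
}

declare -a oldarr=("AGE_032" "MOLI_032" "OIA_013" "SH-108" "SH-16")
# Example call (prints to stdout, does not modify the file):
# rename_first_matches combined.order.allfilter.abund.tsv "${oldarr[@]}" | head -n1
```

Unlike the Perl one-liner, this needs the list of duplicated labels up front, and it writes to standard output rather than editing the file in place.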
Upvotes: 1