msimmer92

Reputation: 397

Parameter expansion not working when used inside Awk on one of the column entries

System: Linux. Bash 4.

I have the following file, which will be read into a script as a variable:

/path/sample_A.bam A 1
/path/sample_B.bam B 1
/path/sample_C1.bam C 1
/path/sample_C2.bam C 2 

I want to append "_string" to the filename in the first column, just before the extension (.bam). It's a bit trickier because the filename includes the path at the beginning.

Desired output:

/path/sample_A_string.bam A 1
/path/sample_B_string.bam B 1
/path/sample_C1_string.bam C 1
/path/sample_C2_string.bam C 2 

My attempt: I wrote the following script (I ran: bash script.sh):

List=${1};
awk -F'\t' -vOFS='\t' '{ $1 = "${1%.bam}" "_string.bam" }1' < ${List} ;

And its output was:

${1%.bam}_string.bam
${1%.bam}_string.bam
${1%.bam}_string.bam
${1%.bam}_string.bam

Problem: I followed the idea of using awk for this substitution, as in this thread https://unix.stackexchange.com/questions/148114/how-to-add-words-to-an-existing-column , but the parameter expansion ${1%.bam} is clearly not being recognised by AWK as I intended. Does someone know the correct syntax for that part of the code? It was meant to mean "the whole first-column entry, except the trailing .bam". I used ${1%.bam} because it works in Bash, but AWK is another language and the syntax probably differs. Thank you!

Upvotes: 1

Views: 1398

Answers (4)

ctac_

Reputation: 2471

You can try it this way with awk:

awk -v a='_string' 'BEGIN{FS=OFS="."}{$1=$1 a}1' infile

Upvotes: 1

jasonmclose

Reputation: 1695

sed -i 's/\.bam/_string\.bam/g' myfile.txt

It's a single line with sed: just replace .bam with _string.bam. Note that -i edits myfile.txt in place.
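Since -i rewrites the file, it can help to preview the substitution first. A minimal sketch, assuming two made-up sample rows in place of the real file (the variable names input and result are only for illustration):

```shell
# Hypothetical sample rows standing in for myfile.txt.
input=$(printf '/path/sample_A.bam A 1\n/path/sample_C2.bam C 2\n')

# No -i here: sed reads stdin and writes the edited lines to stdout,
# so nothing on disk changes. Without the g flag, only the first
# .bam on each line is replaced.
result=$(printf '%s\n' "$input" | sed 's/\.bam/_string.bam/')

printf '%s\n' "$result"
```

Once the preview looks right, the same expression can be run with -i on the real file.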

Upvotes: 2

RavinderSingh13

Reputation: 133458

If I understood your requirement correctly, could you please try the following.

val="_string"
awk -v value="$val" '{sub(/\.bam/,value"&")} 1'  Input_file

Brief explanation: -v value="$val" passes the value of the shell variable val to the awk variable value. The sub function then replaces .bam with the contents of value followed by .bam itself, which is what & (the matched text) stands for. The trailing 1 prints every line, edited or not.

Why OP's attempt didn't work: Dear OP, in awk we can't use shell variables or shell expansions directly; they have to be passed in explicitly. In your attempt, "${1%.bam}" is not treated as a variable at all: awk takes it as a plain string and prints it as it is. My explanation above shows how to pass a shell variable to awk.

NOTE: In case a line has multiple occurrences of .bam, change sub to gsub in the above code. Also, in case your Input_file is TAB delimited, use awk -F'\t' in the above code.
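The tab-delimited case from the note can be sketched as follows. This is only an illustration, assuming the columns really are separated by tabs (as the -F'\t' in the question suggests); the sample rows and the names input, suffix and result are made up here.

```shell
# Hypothetical tab-separated sample rows mirroring the question's file.
input=$(printf '/path/sample_A.bam\tA\t1\n/path/sample_C2.bam\tC\t2\n')

suffix="_string"   # shell variable, handed to awk through -v

# The third argument of sub() restricts the substitution to column 1,
# and the $ anchor only matches a .bam at the very end of that field.
# OFS is set to a tab because modifying $1 rebuilds the line with OFS.
result=$(printf '%s\n' "$input" |
  awk -F'\t' -v OFS='\t' -v value="$suffix" '{ sub(/\.bam$/, value "&", $1) } 1')

printf '%s\n' "$result"
```

Anchoring on the first field means a .bam appearing elsewhere in the path or in a later column is left untouched.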

Upvotes: 2

Inian

Reputation: 85560

Note that the parameter expansion you applied on $1 won't work inside awk, because the entire body of the awk command is enclosed in '..', which passes its content literally without any shell parsing. Hence the string "${1%.bam}" is assigned as-is to the first column.
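The quoting effect can be seen in the shell alone. A small sketch, using a made-up variable f in place of a real path:

```shell
f="/path/sample_A.bam"   # hypothetical value, for illustration only

# Single quotes: the shell passes the text through untouched.
literal='${f%.bam}_string.bam'

# Double quotes: the shell strips the trailing .bam before appending.
expanded="${f%.bam}_string.bam"

printf '%s\n' "$literal" "$expanded"
```

Also note that even in a context where the shell did expand it, ${1%.bam} in the script would refer to the script's first argument (the list file), not to the first column of each line.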

You can do this completely in Awk:

awk -F'\t' 'BEGIN { OFS = FS }{ n=split($1, arr, "."); $1 = arr[1]"_string."arr[2] }1'  file

The code splits the content of $1 on the delimiter . into an array arr inside Awk. The part of the string up to the first . is stored in arr[1], and the subsequent pieces are stored in the following array indices. We then reconstruct the filename by concatenating the array entries, with _string inserted after the part without the extension.
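Splitting on . assumes the first column contains exactly one dot. If the path itself may contain dots, a closer Awk counterpart of Bash's ${1%.bam} is to cut off the last four characters only when they are .bam. A hedged sketch, with a made-up tab-separated row and illustrative variable names:

```shell
# Hypothetical row whose directory name contains a dot.
input=$(printf '/path/v1.2/sample_A.bam\tA\t1\n')

# length() and substr() play the role of ${1%.bam}:
# check the last 4 characters, then keep everything before them.
result=$(printf '%s\n' "$input" | awk -F'\t' 'BEGIN { OFS = FS } {
  n = length($1)
  if (substr($1, n - 3) == ".bam")
    $1 = substr($1, 1, n - 4) "_string.bam"
} 1')

printf '%s\n' "$result"
```

Lines whose first column does not end in .bam pass through unchanged.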

Upvotes: 3
