Reputation: 99
I am trying to replace the coordinates on a particular line of one file with the coordinates from a different file. Both files have a line containing a "code word", and that line is where the coordinates are found. The coordinates are also in the same set of columns, 33-54, if that helps. How can I store that part of the line of interest in a variable so I can use sed to substitute? This is what I have so far:
#!/bin/bash
FILE=$1
grep -i "ABC DEF" $FILE.pdb
# Somehow select the coordinates in the line with "ABC DEF" in $FILE.pdb and label it PDBcoords
PDBcoords=$unknownfunction1
# Somehow select the coordinates in the line with "ABC DEF" in reference.pdb and label it refcoords
grep -i "ABC DEF" reference.pdb
refcoords=$unknownfunction2
sed -i 's/$refcoords/$PDBcoords/'
wait
echo "Whole Command Done for $FILE"
The grep output looks like this:
ATOM 5103 ABC DEF A 100 5.817 2.502 -21.483 1.00 13.63 O
and I only want to select the coordinates
5.817 2.502 -21.483
However, these coordinates change for every file, so I need to label these columns as a variable. Same goes for the reference pdb.
EDIT: I came up with this solution:
#!/bin/bash
FILE=$1
PDB=$(grep -i "OXT ORN" $FILE.pdb | cut -c 33-54)
PDBcoords="$(echo "$PDB")"
echo $PDBcoords
echo Found PDB Coordinates for $FILE
pkaSH=$(grep -i "OXT ORN" pkaSH.pdb | cut -c 33-54)
pkaSHcoords="$(echo "$pkaSH")"
echo $pkaSHcoords
echo Found pkaSH Coordinates for $FILE
sed -i "s/$pkaSHcoords/$PDBcoords/" pkaSH.pdb
echo Command Done
My idea was to redirect the grep output to a temporary file, cut out the coordinate columns, and then define that as a variable with spaces preserved. I'm sure this was overcomplicated, but since it works I think I have my answer.
Upvotes: 0
Views: 1728
Reputation: 35366
Assumptions/Understandings ...
- the code word (I'm using code name) will only match a single row in each file ($FILE.pdb and reference.pdb)
Sample data (in place of $FILE.pdb I'm using codeword.pdb):
$ cat codeword.pdb
ATOM 5103 something else 23.219 12.880 -78.003 1.00 13.63 O
ATOM 5103 code name A 100 5.817 2.502 -21.483 1.00 13.63 O
ATOM 5103 not this line buddy 105.199 342.192 -1.423 1.00 13.63 O
One idea using grep and cut:
ptn="code name"
grep -i "${ptn}" codeword.pdb | cut -c33-56
This generates:
5.817 2.502 -21.483
Capturing the output to a variable:
PDBcoords="$(grep -i "${ptn}" codeword.pdb | cut -c33-56)"
echo ".${PDBcoords}." # decimals are added as visual delimiters
echo "${#PDBcoords}" # number of characters in variable
This generates:
. 5.817 2.502 -21.483.
24
NOTES:
- repeat the same steps against reference.pdb for storage in the $refcoords variable
- this relies on the coordinates occupying the same columns in $FILE.pdb and reference.pdb
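The capture step above depends on quoting, so here is a minimal sketch of it. The sample line is hypothetical: it is built with printf so that the column positions are known, and the 33-54 range is the one given in the question. Quoting the variable keeps the leading and repeated spaces. An unquoted expansion collapses them, which would break a later fixed-column substitution.

```shell
#!/bin/bash
# Hypothetical sample line; printf field widths put x, y, z in columns 31-54.
line="$(LC_ALL=C printf 'ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f' \
    5103 OXT ORN A 100 5.817 2.502 -21.483 1.00 13.63)"

# Capture columns 33-54; the double quotes preserve every space.
coords="$(printf '%s\n' "$line" | cut -c33-54)"
echo ".${coords}."      # dots are visual delimiters
echo "${#coords}"       # number of characters captured

# Unquoted expansion: word splitting collapses runs of spaces.
collapsed="$(echo $coords)"
echo ".${collapsed}."
```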
As for the sed portion of OP's code ...
- OP's sed command is incomplete (I'm assuming the sed target is $FILE.pdb)
- the substitution should only touch the line that contains both code name and $PDBcoords
One sed idea:
ptn="Code NAME" # mix it up, show case insensitivity
PDBcoords=" 5.817 2.502 -21.483"
refcoords=" 103.227 23.285 -1.223"
sed "/${ptn}/Is/${PDBcoords}/${refcoords}/" codeword.pdb
Where:
- /I - perform a case-insensitive match
- s/ .... / .... / - replace old coordinates with new coordinates (assumes the 2 variables (PDBcoords and refcoords) are of the same length in order to maintain column positions in the output)
This generates:
############## before image for sake of comparison:
ATOM 5103 something else 23.219 12.880 -78.003 1.00 13.63 O
ATOM 5103 code name A 100 5.817 2.502 -21.483 1.00 13.63 O
ATOM 5103 not this line buddy 105.199 342.192 -1.423 1.00 13.63 O
############## results of the `sed` command:
ATOM 5103 something else 23.219 12.880 -78.003 1.00 13.63 O
ATOM 5103 code name A 100 103.227 23.285 -1.223 1.00 13.63 O
ATOM 5103 not this line buddy 105.199 342.192 -1.423 1.00 13.63 O
NOTE: Once OP has confirmed this performs the desired modification the -i flag can be added to the sed command to allow for in place updating of $FILE.pdb.
Upvotes: 1
Reputation: 2096
You can use awk to select columns:
grep -i "code name" reference.pdb | awk '{print $7,$8,$9}'
or use cut:
grep -i "code name" reference.pdb | tr -s " " | cut -d" " -f 7-9
Both commands extract the seventh, eighth, and ninth columns, delimited by whitespace.
Edit
Reference: How to specify more spaces for the delimiter using cut?
Upvotes: 0
Reputation: 84642
Another option is:
tr -s ' ' | cut -d ' ' -f 7-9
Where tr -s is used to compress all multiple spaces into a single space and then cut -d ' ' -f 7-9 outputs the space delimited 7th-9th fields, e.g.
$ echo "ATOM 5103 code name A 100 5.817 2.502 -21.483 1.00 13.63 O" |
tr -s ' ' | cut -d ' ' -f 7-9
5.817 2.502 -21.483
Upvotes: 2
Reputation: 3344
I don't know if all files have the same type of "columns", but if so, awk might be what you need:
echo "ATOM 5103 code name A 100 5.817 2.502 -21.483 1.00 13.63 O" | awk '{ print $7, $8, $9 }'
# outputs: 5.817 2.502 -21.483
Upvotes: 0