Python: replace only one occurrence in a string

Question

I have some sample data which looks like:

ATOM    973  CG  ARG A  61     -21.593   8.884  69.770  1.00 25.13           C
ATOM    974  CD  ARG A  61     -21.610   7.433  69.314  1.00 23.44           C
ATOM    975  NE  ARG A  61     -21.047   7.452  67.937  1.00 12.13           N

I want to replace the value in the 6th column, and only the 6th column, with that value plus an offset, which in this case is 308.

61 + 308 = 369, so the 61 in the 6th column should be replaced by 369.

I can't str.split() the line as the line spacing is very important.

I have tried using str.replace(), but the values in column 2 can also overlap with column 6.

I did try reversing the line and using str.replace(), but the values in columns 7, 8, 9, 10 and 11 can overlap with the string to be replaced.

The ugly code I have so far is below. It partially works, but fails when the value also occurs in columns 7, 8, 9, 10 and/or 11:

with open('2kqx.pdb', 'r') as inf, open('2kqx_renumbered.pdb', 'w') as outf:
    for line in inf:
        if line.startswith('ATOM'):
            segs = line.split()
            if segs[4] == 'A':
                offset = 308
                number = segs[5][::-1]
                replacement = str((int(segs[5])+offset))[::-1]
                print number[::-1],replacement
                line_rev = line[::-1]
                replaced_line = line_rev.replace(number,replacement,1)
                print line
                print replaced_line[::-1]
                outf.write(replaced_line[::-1])

The code above produced the output below. As you can see, in the second line the 6th column is not changed; the change lands in column 7 instead. I thought that by reversing the string I could bypass the potential overlap with column 2, but I forgot about the other columns and I don't really know how to get around it.

ATOM    973  CG  ARG A  369     -21.593   8.884  69.770  1.00 25.13           C
ATOM    974  CD  ARG A  61     -21.3690   7.433  69.314  1.00 23.44           C
ATOM    975  NE  ARG A  369     -21.047   7.452  67.937  1.00 12.13           N

user707650 · Accepted Answer

data = """\
ATOM    973  CG  ARG A  61     -21.593   8.884  69.770  1.00 25.13           C
ATOM    974  CD  ARG A  61     -21.610   7.433  69.314  1.00 23.44           C
ATOM    975  NE  ARG A  61     -21.047   7.452  67.937  1.00 12.13           N"""

offset = 308
for line in data.split('\n'):
    line = line[:22] + "  {:<5d}  ".format(int(line[22:31]) + offset) + line[31:]
    print(line)

I haven't counted the whitespace exactly; the indices are a rough estimate. If you want more flexibility than having the numbers 22 and 31 scattered through your code, you'll need a way to determine the start and end index (though that contradicts my assumption that the data is in fixed-column format).
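
One possible way to determine the start and end index, rather than hard-coding them, is a regular expression anchored at the start of the line. The following is only a sketch: it assumes every line of interest has at least six whitespace-separated fields and that the sixth is a non-negative integer. The names SIXTH_FIELD and shift_sixth_column are made up for illustration and are not part of the original code.

```python
import re

# Five whitespace-separated fields (kept with their original spacing),
# then the sixth field as digits, then the padding that follows it.
SIXTH_FIELD = re.compile(r'^((?:\S+\s+){5})(\d+)(\s*)')

def shift_sixth_column(line, offset):
    """Return line with the integer in the sixth field increased by offset."""
    match = SIXTH_FIELD.match(line)
    if match is None:
        return line  # layout not recognised: leave the line untouched
    prefix, number, padding = match.groups()
    new_number = str(int(number) + offset)
    # Re-pad to the old width (digits + trailing spaces) so that the
    # following columns keep their positions whenever there is room.
    width = len(number) + len(padding)
    return prefix + new_number.ljust(width) + line[match.end():]

line = "ATOM    974  CD  ARG A  61     -21.610   7.433  69.314  1.00 23.44           C"
print(shift_sixth_column(line, 308))
```

Because the pattern is anchored at the start of the line and consumes exactly five fields first, digits in column 2 or in the coordinate columns are never candidates for replacement. If the new number is wider than the available padding, the line simply grows by the difference.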
