Reputation: 2680
What I am trying to do is plot two rows out of a file that looks like this:
number pair atom count shift error
1 ALA ALA CA 7624 1.35 0.13
1 ALA ALA HA 7494 19.67 11.44
38 ARG LYS CA 3395 35.32 9.52
38 ARG LYS HA 3217 1.19 0.38
38 ARG LYS CB 3061 0.54 1.47
39 ARG MET CA 1115 35.62 13.08
39 ARG MET HA 1018 1.93 0.20
39 ARG MET CB 976 1.80 0.34
What I want to do is plot the rows that contain atoms CA and CB, using their atom values. So basically I want to do:
atomtypemask_ca = data['atom'] == 'CA'
xaxis = np.array(data['shift'][atomtypemask_ca])
aa, atom = data['aa'][atomtypemask_ca], data['atom'][atomtypemask_ca]
atomtypemask_cb = data['atom'] == 'CB'
yaxis = np.array(data['shift'][atomtypemask_cb])
plot(xaxis, yaxis)
What is kind of ruining my day is that some entries don't have a CB row. How can I plot this kind of thing, ignoring entries that have only one of the two atom values set? I can of course program it explicitly, but I think this should be possible using masks, which would produce cleaner code.
Upvotes: 0
Views: 484
Reputation: 36725
I'm guessing the first column is the residue number. Use that. I don't know your data structure or what shift refers to, but you should be able to do something like this:
In : residues
Out: array([ 1, 1, 38, 38, 38, 39, 39, 39])
In : atom
Out:
array(['CA', 'HA', 'CA', 'HA', 'CB', 'CA', 'HA', 'CB'],
dtype='|S2')
In : shift
Out: array([7624, 7494, 3395, 3217, 3061, 1115, 1018, 976])
# rows with name 'CB'
In : cb = atom=='CB'
# rows with name 'CA' _and_ residues same as 'CB'
In : ca = numpy.logical_and(numpy.in1d(residues, residues[cb]), atom=='CA')
# or if in1d is not available
# ca = numpy.logical_and([(residue in residues[cb]) for residue in residues], atom=='CA')
In : shift[ca]
Out: array([3395, 1115])
In : shift[cb]
Out: array([3061, 976])
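The session above keeps CA rows only when the same residue also has a CB row. If a residue could also have a CB row without a CA row, the two arrays would no longer line up. A minimal sketch of a symmetric variant follows; `paired_shifts` is a hypothetical helper name, the sample arrays are copied from the question's table (using the shift column), and it assumes each residue number appears at most once per atom type:

```python
import numpy as np

# Sample data copied from the question's table; `shift` is the file's shift column.
residues = np.array([1, 1, 38, 38, 38, 39, 39, 39])
atom = np.array(['CA', 'HA', 'CA', 'HA', 'CB', 'CA', 'HA', 'CB'])
shift = np.array([1.35, 19.67, 35.32, 1.19, 0.54, 35.62, 1.93, 1.80])


def paired_shifts(residues, atom, shift, first='CA', second='CB'):
    """Hypothetical helper: shifts of `first` and `second` for residues having both.

    Assumes a residue number occurs at most once per atom type.
    """
    first_mask = atom == first
    second_mask = atom == second
    # Residue numbers present in both groups, plus their positions in each group.
    common, i_first, i_second = np.intersect1d(
        residues[first_mask], residues[second_mask], return_indices=True)
    return common, shift[first_mask][i_first], shift[second_mask][i_second]


common, ca_shift, cb_shift = paired_shifts(residues, atom, shift)
```

Here `ca_shift` and `cb_shift` have the same length and are ordered by residue number, so they can be passed directly to `plot(ca_shift, cb_shift)`. `return_indices` requires NumPy 1.15 or newer.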
Upvotes: 2