Reputation: 37
I have a data frame, for example:
df = ID aa_len aa_seq \
0 001 45 [M, R, S, R, Y, P, L, L, R, G, E, A, V, A, V, ...
1 002 45 [M, R, S, R, Y, P, L, L, R, G, E, A, V, A, V, ...
mut_position
0 [-1]
1 [5, 94, 95, 132]
The "mut_position" column can hold -1, a single non-negative number (e.g. 2, 3, 4), or a list of several numbers. For example, it is [-1] for 001 and a list of several numbers for 002, and it could also be a single number such as 4. I need to count the number of subjects that don't have -1.
I tried to do that by comparing to -1 and collecting the ones that are different, but it doesn't seem to work...
def count_mutations(df, ref_aa_len):
    nomis = -1
    mutation = (df['mut_position']) != nomis
    print (mutation)
What I get is True for both (ignore ref_aa_len, that should come later):
0 True
1 True
Upvotes: 1
Views: 32
Reputation: 863166
I think you need a list comprehension with a generator expression, summing the boolean True values:
df['non_negative'] = [sum(y != -1 for y in x) for x in df['mut_position']]
print (df)
mut_position non_negative
0 [-1] 0
1 [5, 94, 95, 132] 4
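As a minimal sketch of why the comparison in the question returns True for every row: the column holds Python list objects, so `!= -1` compares each whole list to the integer -1, and a list is never equal to an integer. The frame below is a shortened stand-in for the question's data, and `naive` and `per_row` are names made up for illustration.

```python
import pandas as pd

# Shortened stand-in for the question's data (only the relevant column).
df = pd.DataFrame({'mut_position': [[-1], [5, 94, 95, 132]]})

# Each cell is a list, so this compares list objects to the int -1.
# A list is never equal to an int, hence True for every row.
naive = df['mut_position'] != -1

# Looking inside each list gives the per-row count of values other than -1.
per_row = [sum(y != -1 for y in x) for x in df['mut_position']]

print(naive.tolist())
print(per_row)
```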
If scalars are also possible:
print (df)
mut_position
0 [-1]
1 [5,94,95,132]
2 6
3 -1
df['non_negative'] = [sum(y != -1 for y in x)
if isinstance(x, list)
else int(x != -1) for x in df['mut_position']]
print (df)
mut_position non_negative
0 [-1] 0
1 [5, 94, 95, 132] 4
2 6 1
3 -1 0
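Since the question asks for the number of subjects rather than a per-row count, one possible sketch wraps the list-or-scalar check in a helper and sums the results. `has_mutation` and `count_mutated` are hypothetical names, not pandas functions, and the sketch assumes each cell is either a list or a single number. A plain Python list stands in for the column here.

```python
def has_mutation(cell):
    """Return True if the cell holds at least one value other than -1."""
    values = cell if isinstance(cell, list) else [cell]
    return any(v != -1 for v in values)


def count_mutated(cells):
    """Count the cells (subjects) that have at least one mutation position."""
    return sum(has_mutation(cell) for cell in cells)


# Same values as the mixed example above; df['mut_position'] could be passed instead.
mut_position = [[-1], [5, 94, 95, 132], 6, -1]
print(count_mutated(mut_position))  # 2
```

Note that this treats a list such as [2, -1] as mutated because it contains a value other than -1. If any -1 should disqualify the row, `any` would need to become `all`.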
If you need to check whether the first value of each list is -1, filter by boolean indexing:
import pandas as pd
df = pd.DataFrame({'mut_position':[[-1], [5,94,95,132],[2,-1], [-1]]})
print (df)
mut_position
0 [-1]
1 [5, 94, 95, 132]
2 [2, -1]
3 [-1]
df1 = df[df['mut_position'].str[0] != -1 ]
print (df1)
mut_position
1 [5, 94, 95, 132]
2 [2, -1]
Detail: str[0] selects the first character of a string or the first value of an iterable:
print (df['mut_position'].str[0])
0 -1
1 5
2 2
3 -1
Name: mut_position, dtype: int64
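One caveat that may be worth checking, shown as a hedged sketch: for an empty list, `.str[0]` has no first element to return and gives NaN, and NaN != -1 evaluates to True, so such a row would pass the filter. The empty list below is an assumed edge case, not part of the original data.

```python
import pandas as pd

s = pd.Series([[-1], [5, 94, 95, 132], []])

first = s.str[0]      # first element of each list; NaN for the empty list
keep = first != -1    # NaN != -1 is True, so the empty row is kept

print(first.tolist())
print(keep.tolist())
```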
To check for -1 at any position, use all:
df1 = df[[all(y != -1 for y in x) for x in df['mut_position']]]
print (df1)
mut_position
1 [5, 94, 95, 132]
The list comprehension returns a boolean list:
print ([all(y != -1 for y in x) for x in df['mut_position']])
[False, True, False, False]
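To get the number of subjects rather than the filtered rows, the boolean list can be summed, since True counts as 1. A minimal sketch using the same four-row frame; `mask` and `n_subjects` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'mut_position': [[-1], [5, 94, 95, 132], [2, -1], [-1]]})

# True where no value in the list is -1.
mask = [all(y != -1 for y in x) for x in df['mut_position']]

# Summing booleans counts the True entries.
n_subjects = sum(mask)
print(n_subjects)  # 1
```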
Upvotes: 1