Reputation: 1
I have a scientific dataframe:
radius date spin atom
0 12,50 YYYY/MM 0 he
1 11,23 YYYY/MM 2 c
2 45,2 YYYY/MM 1 z
3 11,1 YYYY/MM 1 p
For each row, I want to select all the rows whose radius differs from that row's radius by less than, for example, 5.
I've defined a function to compute the difference (kept simple, it's an example):
def diff_radius(a, b):
    return a - b
Is it possible, for each row, to find the rows that satisfy the condition by calling an external function?
I tried the following, but it does not work:
for i in range(df.shape[0]):
    ....
    df_in_radius=df.apply(lambda x : diff_radius(df[i]['radius'],x['radius']))
Can you help me?
Upvotes: 0
Views: 126
Reputation: 1
I misspoke.
My dataframe is:
radius of my atom date spin atom
0 12.50 YYYY/MM 0 he
1 11.23 YYYY/MM 2 c
2 45.2 YYYY/MM 1 z
3 11.1 YYYY/MM 1 p
I loop over the rows and, for each row, run a special calculation on all the rows that satisfy the condition. Example:
import pandas as pd

def diff_radius(a, b):
    # difference between two radius values (simplified example)
    return a - b

df = pd.read_csv(csvfile, delimiter=";", names=('radius', 'date', 'spin', 'atom'))
# for each row of the original dataframe
for i in range(df.shape[0]):
    # first build a temporary dataframe with the rows whose radius
    # differs by less than 5 from df.iloc[i]['radius'] (current loop row)
    df_tmp = df[diff_radius(df.iloc[i]['radius'], df['radius']) < 5]
    ....
    # start of the special calculation, using df_tmp, which contains all the rows
    # within 5 of the current row **(i)**
Thank you sincerely for your answers.
Upvotes: 0
Reputation: 1841
I am assuming that the datatype of the radius column is a tuple. You can keep the diff_radius method like this:
def diff_radius(x):
    a, b = x
    return a - b
Then you can use the loc method in pandas to select the rows that match the condition of a radius difference less than 5.
df.loc[df.radius.apply(diff_radius) < 5]
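As a minimal, self-contained sketch of the selection above: the sample dataframe below is hypothetical and only illustrates the assumption that each radius value is a 2-tuple.

```python
import pandas as pd

def diff_radius(x):
    # Assumes each radius value is a 2-tuple (a, b)
    a, b = x
    return a - b

# Hypothetical sample data with the radius stored as tuples
df = pd.DataFrame({
    "radius": [(12, 50), (11, 23), (45, 2), (11, 1)],
    "atom": ["he", "c", "z", "p"],
})

# Boolean mask built by applying diff_radius to every radius value
selected = df.loc[df.radius.apply(diff_radius) < 5]
print(selected)
```

Only the rows where a - b is below 5 are kept; the other rows are dropped by the boolean mask.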
Edit #1
If the datatype of the radius column is a string, then split the values and typecast them. The logic goes in the diff_radius method. In the string case:
def diff_radius(x):
    x_split = x.split(',')
    a, b = int(x_split[0]), int(x_split[-1])
    return a - b
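If the strings are instead single numbers written with a decimal comma (so that "12,50" means 12.50), and the goal is to compare each row against every other row as in the question's follow-up, a rough sketch could look like the one below. The use of abs(), the exclusion of the current row, and the neighbours dictionary are assumptions made for illustration, not something stated in the question.

```python
import pandas as pd

def diff_radius(a, b):
    return a - b

# Sample rows from the question; radius kept as text with a decimal comma
df = pd.DataFrame({
    "radius": ["12,50", "11,23", "45,2", "11,1"],
    "spin": [0, 2, 1, 1],
    "atom": ["he", "c", "z", "p"],
})

# Assumption: "12,50" means 12.50, so convert the comma and cast to float
df["radius"] = df["radius"].str.replace(",", ".", regex=False).astype(float)

neighbours = {}  # hypothetical container: row position -> atoms within range
for i in range(df.shape[0]):
    current = df.iloc[i]["radius"]
    # Assumption: "difference under 5" is taken as an absolute difference
    close = diff_radius(current, df["radius"]).abs() < 5
    # Assumption: the current row itself is excluded from its own selection
    not_self = df.index != df.index[i]
    df_tmp = df[close & not_self]
    neighbours[i] = df_tmp["atom"].tolist()

print(neighbours)
```

Inside the loop, df_tmp plays the role of the temporary dataframe from the question, so the special calculation could be run on it in place of the final assignment.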
Upvotes: 1