oxthon

Reputation: 1

Select specific columns

I have a scientific dataframe:

     radius      date     spin  atom
0    12,50       YYYY/MM   0     he
1    11,23       YYYY/MM   2     c
2    45,2        YYYY/MM   1     z
3    11,1        YYYY/MM   1     p

For each row, I want to select all rows whose radius differs from that row's radius by less than, for example, 5.

I've defined a function to compute the difference (a simple one, it's just an example):

def diff_radius(a, b):
    return a - b

Is it possible, for each row, to find the rows that satisfy the condition by calling an external function?

I tried something like this, but it does not work:

for i in range(df.shape[0]):
     ....
     df_in_radius=df.apply(lambda x : diff_radius(df[i]['radius'],x['radius']))

Can you help me?

Upvotes: 0

Views: 126

Answers (2)

oxthon

Reputation: 1

I misspoke.

My dataframe is:

     radius of my atom      date     spin  atom
0    12.50                  YYYY/MM   0     he
1    11.23                  YYYY/MM   2     c
2    45.2                   YYYY/MM   1     z
3    11.1                   YYYY/MM   1     p

I loop over the rows and, for each row, run a special calculation on all the rows that satisfy the condition. Example:

def diff_radius(a, b):
    # a: radius of the current row; b: radius (or column of radii) to compare with
    return a - b

import pandas as pd

df = pd.read_csv(csvfile, delimiter=";", names=("radius", "date", "spin", "atom"))
# For each row of the original dataframe
for i in range(df.shape[0]):

    # First build a temporary dataframe with the rows whose radius
    # differs from df.iloc[i]['radius'] (the current row) by less than 5
    df_tmp = df[diff_radius(df.iloc[i]['radius'], df['radius']) < 5]
    ....
    # Start of the special calculation, using df_tmp, which contains all rows
    # within 5 of the current row **(i)**

Thank you sincerely for your answers.

Upvotes: 0

bumblebee

Reputation: 1841

I am assuming that the datatype of the radius column is a tuple. You can keep the diff_radius method like this:

def diff_radius(x):
    a, b = x
    return a-b

Then you can use the loc indexer in pandas to select the rows that match the condition of a radius difference less than 5.

df.loc[df.radius.apply(diff_radius) < 5]
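If the radius column instead holds one plain number per row, as in the corrected table posted by the asker, the same loc selection can compare every row against a chosen row. The following is only a minimal sketch under that assumption: the sample values are copied from that table, rows_within is a hypothetical helper name, and taking the absolute difference is my reading of "difference under 5", not something the question states.

```python
import pandas as pd

# Sample data copied from the corrected table (date column omitted).
df = pd.DataFrame({
    "radius": [12.50, 11.23, 45.2, 11.1],
    "spin": [0, 2, 1, 1],
    "atom": ["he", "c", "z", "p"],
})

def diff_radius(a, b):
    # Works for two scalars, and element-wise for a scalar and a Series.
    return a - b

def rows_within(frame, i, limit=5):
    # Hypothetical helper: rows whose radius is within `limit` of row i's radius.
    # Assumption: "difference" means the absolute difference.
    delta = diff_radius(frame.loc[i, "radius"], frame["radius"])
    return frame.loc[delta.abs() < limit]

# One filtered dataframe per row label; each one includes the row itself.
neighbours = {i: rows_within(df, i) for i in df.index}
```

Each entry of neighbours can then be passed to the per-row calculation. Because the comparison is done on the whole column at once, no inner apply is needed.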

Edit #1

If the datatype of the radius column is a string, split it and typecast the parts. That logic goes in the diff_radius method. In the case of a string:

def diff_radius(x):
    x_split = x.split(',')
    a,b = int(x_split[0]), int(x_split[-1])
    return a-b
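Another possible reading is that a string such as "12,50" is a single number written with a decimal comma rather than two values. In that case it may be simpler to convert the column to float once and then compare the numbers directly. A minimal sketch, assuming that interpretation (the sample strings are the ones from the question):

```python
import pandas as pd

# Radius values as they appear in the question, read as strings.
df = pd.DataFrame({"radius": ["12,50", "11,23", "45,2", "11,1"]})

# Assumption: the comma is a decimal separator, so "12,50" means 12.5.
df["radius"] = df["radius"].str.replace(",", ".", regex=False).astype(float)

# Rows whose radius is within 5 of the first row's radius (absolute difference).
close_to_first = df.loc[(df["radius"] - df.loc[0, "radius"]).abs() < 5]
```

When the data comes from a CSV file, passing decimal="," to pd.read_csv should give the same float column directly at load time.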

Upvotes: 1
