Reputation: 644
I have a dataframe as shown in the Excel file.
I want to find duplicate values by comparing the IDs in pairs, for example ID 0 and ID 1.
Their values in the nn_id column are 366 and 393,
which are not the same, so next we check ID 2 and ID 3.
Their values in the nn_id column are the same: 595 and 595.
So if the values are the same, then print the values from the columns nn_id, slice-0010-EDSR_x2_X and slice-0010-EDSR_x2_Y.
So the output would be in the form of a dictionary: {595:[(492,260),(491,248)]}
Next, check the values of ID 4 and ID 5,
which are 458 and 486.
These are not the same, so do nothing.
I am sorry if it's confusing, but I want to check the nn_id values of the IDs two at a time and, if they are the same, make a dictionary of the adjacent column values.
Upvotes: 0
Views: 133
Reputation: 545
Does this achieve what you're after? There might be more elegant ways to achieve the same. I've assumed you have a DataFrame df with your table.
import numpy as np

df_shift = df.shift(1)  # shift the DataFrame down by one row
same_idx = df['nn_id'] == df_shift['nn_id']  # True where nn_id equals the previous row's nn_id
# get column positions for columns of interest
# (the first name is kept with its trailing space as posted; drop it if your header has none)
col1_pos = df.columns.get_loc('slice-0010-EDSR_x2_X ')
col2_pos = df.columns.get_loc('slice-0010-EDSR_x2_Y')
nn_idx_pos = df.columns.get_loc('nn_id')
my_dict = {}  # define empty dict to store your results
for i in np.where(same_idx)[0]:  # for each row where nn_id matches the previous row
    # define the value that you're after
    my_value = [(df.iloc[i-1, col1_pos], df.iloc[i-1, col2_pos]),
                (df.iloc[i, col1_pos], df.iloc[i, col2_pos])]
    # and add element to dictionary
    my_dict[df.iloc[i, nn_idx_pos]] = my_value
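One thing to be aware of: the loop above compares every row with the one before it, so it would also pair ID 1 with ID 2 if their nn_id values happened to match. If only the non-overlapping pairs (0, 1), (2, 3), (4, 5) should be compared, as the question describes, something like the sketch below might be closer. The function name pairwise_duplicates is hypothetical, the column names are assumed to have no trailing space, and the sample table only borrows the nn_id values and the ID 2 / ID 3 coordinates from the question; the other coordinates are placeholders I made up.

```python
import pandas as pd

# Sample table: the nn_id values and the ID 2 / ID 3 coordinates come from the
# question; every other coordinate is a made-up placeholder.
df = pd.DataFrame({
    "nn_id": [366, 393, 595, 595, 458, 486],
    "slice-0010-EDSR_x2_X": [1, 2, 492, 491, 5, 6],
    "slice-0010-EDSR_x2_Y": [10, 20, 260, 248, 50, 60],
})


def pairwise_duplicates(frame, key="nn_id",
                        x_col="slice-0010-EDSR_x2_X",
                        y_col="slice-0010-EDSR_x2_Y"):
    """Compare rows (0, 1), (2, 3), ... and collect coordinates where key matches."""
    result = {}
    # Step through the rows two at a time; a trailing unpaired row is ignored.
    for i in range(0, len(frame) - 1, 2):
        first, second = frame.iloc[i], frame.iloc[i + 1]
        if first[key] == second[key]:
            # int() assumes the columns hold whole numbers, as in the question.
            result[int(first[key])] = [
                (int(first[x_col]), int(first[y_col])),
                (int(second[x_col]), int(second[y_col])),
            ]
    return result


my_dict = pairwise_duplicates(df)
print(my_dict)
```

As with the loop above, if two different pairs share the same nn_id, the later pair overwrites the earlier one in the dictionary.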
Upvotes: 1