HonzaB

Reputation: 7335

Pandas assign label based on index value

I have a dataframe with an index and multiple columns. I also have a few lists containing index values sampled according to certain criteria. Now I want to create columns with labels based on whether or not the index of a given row is present in a specified list.

There are two situations where I use this:

1) To create a column and give labels based on one list:

df['1_name'] = df.index.map(lambda ix: 'A' if ix in idx_1_model else 'B')

2) To create a column and give labels based on multiple lists:

def assignLabelsToSplit(ix_, random_m, random_y, model_m, model_y):
    if (ix_ in random_m) or (ix_ in model_m):
        return 'A'
    if (ix_ in random_y) or (ix_ in model_y):
        return 'B'
    else:
        return 'not_assigned'

df['2_name'] = df.index.map(lambda ix: assignLabelsToSplit(ix, idx_2_random_m, idx_2_random_y, idx_2_model_m, idx_2_model_y))

This works, but it is quite slow. Each call takes about 3 minutes, and since I have to execute the functions multiple times, it needs to be faster.

Thank you for any suggestions.

Upvotes: 2

Views: 2180

Answers (1)

jezrael

Reputation: 862511

I think you need a nested numpy.where with Index.isin, which tests the whole index at once instead of calling a Python function for every row:

df['2_name'] = np.where(df.index.isin(random_m + model_m), 'A',
               np.where(df.index.isin(random_y + model_y), 'B', 'not_assigned'))
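
For the first case in the question (one list, two labels), a single numpy.where should be enough. This is only a minimal sketch: idx_1_model is the name from the question, while the small dataframe and the list values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative data only; the real df and idx_1_model come from the question.
df = pd.DataFrame({'A': range(6)})
idx_1_model = [0, 2, 5]

# 'A' where the index value is in the list, 'B' everywhere else.
df['1_name'] = np.where(df.index.isin(idx_1_model), 'A', 'B')
print(df)
```

Rows 0, 2 and 5 get 'A' here, and all other rows get 'B'.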

Sample:

import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.randint(10, size=(10,1)), columns=['A'])
#print (df)

random_m = [0,1]
random_y =  [2,3]
model_m = [7,4]
model_y = [5,6]

print (type(random_m))
<class 'list'>

print (random_m + model_m)
[0, 1, 7, 4]

print (random_y + model_y)
[2, 3, 5, 6]

df['2_name'] = np.where(df.index.isin(random_m + model_m), 'A',
               np.where(df.index.isin(random_y + model_y), 'B', 'not_assigned'))
print (df)
   A        2_name
0  8             A
1  8             A
2  3             B
3  7             B
4  7             A
5  0             B
6  4             B
7  2             A
8  5  not_assigned
9  2  not_assigned
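
If more labels are added later, nesting numpy.where gets hard to read. A possible alternative, not part of the solution above, is numpy.select, which takes a list of conditions and a matching list of labels and uses the first condition that is true for each row. The sketch below assumes the same sample lists; the names conditions and choices are just illustrative.

```python
import numpy as np
import pandas as pd

# Same sample lists as above; the column values do not matter here.
df = pd.DataFrame({'A': range(10)})
random_m, model_m = [0, 1], [7, 4]
random_y, model_y = [2, 3], [5, 6]

# Conditions are checked in order, so 'A' wins if a row matches both.
conditions = [
    df.index.isin(random_m + model_m),
    df.index.isin(random_y + model_y),
]
choices = ['A', 'B']

# Rows matching no condition get the default label.
df['2_name'] = np.select(conditions, choices, default='not_assigned')
print(df)
```

The order of the conditions plays the same role as the order of the if statements in the original function.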

Upvotes: 2
