frank

Reputation: 3608

Pandas - replace column values based on a separate array index

I have:

idx Node1   Node2   Immediacy
0   a   C   5
1   a   B   5
2   B   D   3
3   B   E   3
4   B   F   3

and an array (verices):

array(['a', 'B', 'C', 'E', 'G', 'H', 'J', 'D', 'F', 'L', 'M', 'N', 'O',
       'P', 'Q', 'R', 'I', 'K'], dtype=object)

I want to add a new column/substitute all the letters in the data frame based on the index position of the array:

idx Node1   Node2   Immediacy
0   0   2   5
1   0   1   5
2   1   3   3
3   1   4   3
4   1   5   3

I found a way to find the index in the array with:

(verices=='B').argmax()

but I am not sure how to apply it to every value in the data frame to get the desired result.

Any suggestions are welcome.

Upvotes: 1

Views: 1075

Answers (2)

jezrael

Reputation: 863156

You can select only the string columns (object dtype) with DataFrame.select_dtypes, then use DataFrame.apply with Series.map. Values that have no match in the dictionary are replaced by NaN:

import numpy as np

a = np.array(['a', 'B', 'C', 'E', 'G', 'H', 'J', 'D', 'F', 'L', 'M', 'N', 'O',
       'P', 'Q', 'R', 'I', 'K'])

# build a lookup: letter -> position in the array
d = dict(zip(a, np.arange(len(a))))
# string columns only (Node1, Node2)
cols = df.select_dtypes(object).columns

df[cols] = df[cols].apply(lambda x: x.map(d))
print(df)
   idx  Node1  Node2  Immediacy
0    0      0      2          5
1    1      0      1          5
2    2      1      7          3
3    3      1      3          3
4    4      1      8          3
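
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same map approach. The DataFrame is rebuilt by hand from the question, and the columns are named explicitly rather than detected by dtype; both choices are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (assumed layout, idx as a column).
df = pd.DataFrame({
    "idx": [0, 1, 2, 3, 4],
    "Node1": ["a", "a", "B", "B", "B"],
    "Node2": ["C", "B", "D", "E", "F"],
    "Immediacy": [5, 5, 3, 3, 3],
})
a = np.array(['a', 'B', 'C', 'E', 'G', 'H', 'J', 'D', 'F', 'L', 'M', 'N', 'O',
              'P', 'Q', 'R', 'I', 'K'])

# Lookup table: letter -> position in the array (plain Python strings as keys).
d = {letter: pos for pos, letter in enumerate(a.tolist())}

# Columns named explicitly here instead of using select_dtypes.
cols = ["Node1", "Node2"]
df[cols] = df[cols].apply(lambda s: s.map(d))
print(df)
```

Building the dictionary once and mapping through it avoids calling (verices == x).argmax() separately for every cell.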

Alternative solution with DataFrame.applymap and dict.get (here non-matched values become None):

df[cols] = df[cols].applymap(d.get)
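
Note that DataFrame.applymap was deprecated in pandas 2.1 in favor of DataFrame.map. A small sketch that works with either name is below; the two-row frame and the short dictionary are made-up stand-ins for the real data, and the hasattr check is just one possible way to handle both versions.

```python
import pandas as pd

# Hypothetical small frame and lookup, standing in for df[cols] and d above.
df = pd.DataFrame({"Node1": ["a", "B"], "Node2": ["C", "B"]})
d = {"a": 0, "B": 1, "C": 2}

# Prefer DataFrame.map when this pandas version has it, else fall back to applymap.
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap
result = elementwise(d.get)
print(result)
```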

Upvotes: 1

anky

Reputation: 75100

Try with:

df.replace(dict(zip(pd.Series(a),pd.Series(a).index)))

    Node1 Node2  Immediacy
idx                       
0       0     2          5
1       0     1          5
2       1     7          3
3       1     3          3
4       1     8          3
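
One difference from the map approach that may matter: replace leaves values it cannot find in the dictionary unchanged, while map turns them into NaN. A quick sketch of that, using a made-up label "Z" that is not in the lookup (the Series and the short dictionary are assumptions for illustration only):

```python
import pandas as pd

# "Z" is a hypothetical label that is missing from the lookup.
s = pd.Series(["a", "B", "Z"], dtype=object)
lookup = {"a": 0, "B": 1}

mapped = s.map(lookup)        # unmatched value becomes NaN
replaced = s.replace(lookup)  # unmatched value is left as "Z"

print(mapped.isna().tolist())
print(replaced.tolist())
```

So if every letter in the frame is guaranteed to be in the array, both give the same result; otherwise pick whichever behavior for missing letters suits you.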

Upvotes: 2
