Reputation: 205
I have a pandas dataframe as shown below:
df = pd.DataFrame({'String':['JAIJAOD','ERJTD','AJDIDO','AJDIDO'],'Position':[5,2,np.nan,4]})
I am trying to add a third column containing the letter from the String column at the index given in the Position column. The dataframe should look something like this:
df = pd.DataFrame({'String':['JAIJAOD','ERJTD','AJDIDO','AJDIDO'],'Position':[5,2,np.nan,4],'Letter':['O','J',np.nan,'D']})
I have tried the following code, but the output is not exactly what I want: the third column of the final table contains some mistakes.
third = []
for i, n in zip(df['String'], df['Position']):
    if n > 0:  # I thought it was because the column Position has just floats
        third.append(i[int(n)])
    else:
        third.append(np.nan)
df['Third'] = pd.Series(third)
Upvotes: 1
Views: 1127
Reputation: 16683
You can apply a lambda function to both input columns at once by calling apply on the dataframe with axis=1, so that x is each row in turn. Rows where Position is null are filtered out first, so they end up as NaN in the new column. For each remaining row, the lambda indexes into the value in String at the corresponding value in the Position column:
df =pd.DataFrame({'String':['JAIJAOD','ERJTD','AJDIDO','AJDIDO'],'Position':[5,2,np.nan,4]})
df['Letter'] = df[df['Position'].notnull()].apply(lambda x: x['String'][int(x['Position'])],axis=1)
df
Out[1]:
    String  Position Letter
0  JAIJAOD       5.0      O
1    ERJTD       2.0      J
2   AJDIDO       NaN    NaN
3   AJDIDO       4.0      D
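If some positions might fall outside the string, the same row-wise apply can be written with a named function instead of a lambda. This is only a sketch: letter_at is a hypothetical helper name, and returning NaN for out-of-range positions is my assumption about what you would want, not something stated in the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'String': ['JAIJAOD', 'ERJTD', 'AJDIDO', 'AJDIDO'],
                   'Position': [5, 2, np.nan, 4]})

def letter_at(row):
    # Hypothetical helper: character of row['String'] at index row['Position'].
    pos = row['Position']
    if pd.isna(pos):
        return np.nan  # missing position -> missing letter
    pos = int(pos)     # Position is stored as float because of the NaN
    s = row['String']
    # Assumption: an out-of-range position yields NaN instead of raising IndexError.
    return s[pos] if 0 <= pos < len(s) else np.nan

df['Letter'] = df.apply(letter_at, axis=1)
```

Because apply runs on the whole dataframe here, no notnull() filter is needed; the NaN case is handled inside the function.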
Upvotes: 0
Reputation: 323326
Let us try
df['Letter'] = [x[int(y)] if y==y else np.nan for x , y in zip(df.String,df.Position) ]
['O', 'J', nan, 'D']
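The y==y test works because NaN is the only float that does not compare equal to itself. If that reads as too cryptic, a sketch of the same list comprehension with pd.notna may be easier to follow (the variable names s, p and letters are my own choice):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'String': ['JAIJAOD', 'ERJTD', 'AJDIDO', 'AJDIDO'],
                   'Position': [5, 2, np.nan, 4]})

# NaN never equals itself, which is what the y == y check relies on.
nan_equals_itself = (np.nan == np.nan)

# Same logic with an explicit missing-value check instead of y == y.
letters = [s[int(p)] if pd.notna(p) else np.nan
           for s, p in zip(df['String'], df['Position'])]
df['Letter'] = letters
```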
Upvotes: 2