Reputation: 27
I realize this is an incredibly inefficient way to code this, so I'm hoping someone will have suggestions on a more efficient method.
Essentially I'm trying to create a column ("freq") with values of 0 for NA and "Nothing" objects and 1 otherwise. Sample df:
i obj freq
0. Nothing 0
1. Something 1
2. NaN 0
3. Something 1
for i in range(len(df)):
    # str() turns a missing value into the string "nan"
    if str(df["obj"].iloc[i]) in ("Nothing", "nan"):
        # assign through df.loc to avoid chained assignment
        df.loc[df.index[i], "freq"] = 0
    else:
        df.loc[df.index[i], "freq"] = 1
Upvotes: 0
Views: 88
Reputation: 1388
In this case, it is not even necessary to use numpy:
df['freq'] = (~(df.obj.isnull() | (df.obj == 'Nothing'))) * 1
Note:
Is it actually useful to encode the result as 0 and 1? If the boolean values False and True are acceptable, the answer is simply:
df['freq'] = ~(df.obj.isnull() | (df.obj == 'Nothing'))
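For a quick check, the mask approach can be run against the sample data from the question. This is only a minimal sketch: the dataframe is rebuilt by hand from the question's table (an assumption about the real data), the name `is_empty` is made up for illustration, and `.astype(int)` is used here as an alternative to `* 1` for getting 0/1 values.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table (assumed, not the real df)
df = pd.DataFrame({"obj": ["Nothing", "Something", np.nan, "Something"]})

# True where the value is missing or is the literal string "Nothing"
is_empty = df["obj"].isna() | (df["obj"] == "Nothing")

# Invert the mask and cast the booleans to integers
df["freq"] = (~is_empty).astype(int)

print(df["freq"].tolist())  # [0, 1, 0, 1]
```

Dropping the `.astype(int)` call leaves a boolean column, which is often enough for filtering or summing.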
Upvotes: 0
Reputation: 3096
You can use np.where():
import pandas as pd
import numpy as np
df = pd.DataFrame({'obj': {0: 'Nothing', 1: 'Something', 2: np.nan, 3: 'Something'}})
df['freq'] = np.where((df['obj'] == 'Nothing') | (df['obj'].isnull()), 0, 1)
Upvotes: 2
Reputation: 622
Without a sample dataframe it is hard to check that this works, but it should:
indexer = (df['obj'] == 'Nothing') | df['obj'].isna()
df.loc[indexer, 'freq'] = 0
df.loc[~indexer, 'freq'] = 1
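To try this on the question's sample data, something like the following should do. It is a sketch under assumptions: the dataframe is reconstructed from the question's table, and the missing-value test uses `isna()` rather than a string comparison, since a comparison against the string 'NaN' does not match real missing values.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table (assumed, not the real df)
df = pd.DataFrame({"obj": ["Nothing", "Something", np.nan, "Something"]})

# Rows that should get 0: the string "Nothing" or a missing value
indexer = (df["obj"] == "Nothing") | df["obj"].isna()

df.loc[indexer, "freq"] = 0
df.loc[~indexer, "freq"] = 1

# The first .loc assignment leaves NaN in the other rows, so the column
# usually ends up as float; cast it once every row is filled
df["freq"] = df["freq"].astype(int)

print(df["freq"].tolist())  # [0, 1, 0, 1]
```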
Upvotes: 0