Reputation: 1174
def categorizeMainUrl(url):
    category = "other"
    if "/special/" in url:
        category = "special"
    return category

df["category"] = df["main_URL"].apply(lambda url: categorizeMainUrl(url))
When I run this part of the code, I keep getting the following exception:
"TypeError: argument of type 'float' is not iterable"
How can I select only the rows of the dataframe that contain float values?
(I would expect this column to contain only strings.)
Upvotes: 0
Views: 3227
Reputation: 30920
Use Series.fillna to fill the NaN values (NaN is a float, which is why the "in" test raises the TypeError), then you can use Series.str.contains with np.where or Series.map to create a new Series:
df["category"] = np.where(df['main_URL'].fillna('').str.contains('/special/'),
                          "special", "other")
or
df["category"] = (df['main_URL'].fillna('')
                                .str.contains('/special/')
                                .map({True: "special",
                                      False: "other"})
                 )
#df['main_URL'].fillna('').str.contains('/special/').replace({True: "special",
#                                                             False: "other"})
I recommend you see: when should I want to use apply
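To try this end to end, here is a minimal sketch on a made-up DataFrame (the URLs and the missing value are invented for illustration, not taken from the question). It also shows str.contains with na=False, which should be an alternative to fillna('') when you simply want missing values treated as non-matches; regex=False is added on the assumption that a plain substring test is intended.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the missing URL is stored as NaN, which is a float.
df = pd.DataFrame({"main_URL": ["/special/a", "/news/b", np.nan]})

# Option 1: replace NaN with an empty string before the substring test.
mask_fillna = df["main_URL"].fillna("").str.contains("/special/", regex=False)

# Option 2: let str.contains treat missing values as "no match".
mask_na = df["main_URL"].str.contains("/special/", regex=False, na=False)

# Either mask can drive the category column.
df["category"] = np.where(mask_fillna, "special", "other")
print(df)
```

Both masks agree on this sample, so which one to use is mostly a matter of taste.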
Upvotes: 1
Reputation: 1174
The following command selects only the rows containing one specific datatype (here float):
df[df["main_URL"].apply(lambda x: isinstance(x, float))]
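As a quick check of that selection, here is a small sketch on invented data (the column name main_URL comes from the question, the values are hypothetical). It assumes the floats in the column are NaN values from missing URLs, in which case isna() should pick out the same rows.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing URL (NaN is a float).
# dtype=object keeps the raw Python objects so isinstance sees the float.
df = pd.DataFrame({"main_URL": ["/special/a", np.nan, "/news/b"]}, dtype=object)

# Rows whose main_URL is a float object (here: the NaN row).
float_rows = df[df["main_URL"].apply(lambda x: isinstance(x, float))]

# If NaN is the only float in the column, isna() selects the same rows.
nan_rows = df[df["main_URL"].isna()]

print(float_rows.index.tolist())
print(nan_rows.index.tolist())
```

The isinstance version would also catch real float numbers that ended up in the column, while isna() only catches missing values.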
Upvotes: 0