Reputation: 373
I ran into some unexpected errors while trying .isin(). Here's the problem: I scraped a website and turned the result into a dataframe. Now I'd like to make changes so the data is more usable for my project. In the scraped data, one column contains all the features. It's a list in the JSON, but in pandas it shows up as a "non-null object":
"feature": ["Wi-Fi", "LAN", "LED"]
I'd like to create a new Boolean column based on each feature, which will be helpful down the road. It should look like this:
Product Wifi LAN LED
1 True True True
2 True False False
I've tried both str.contains and .isin(), but only got errors, such as:
TypeError: only list-like objects are allowed to be passed to isin(), you passed a [str]
ValueError: Length of values does not match length of index
What is a better way to tackle this problem?
Also, the original data is in Japanese, and I loaded the dataframe with encoding="utf-8". What is the best way to work with UTF-8 text in pandas? I'm using Notepad++ as my editor.
Upvotes: 4
Views: 3028
Reputation: 863791
Use apply with the in operator if you need to check whether a value is in each row's list:
import pandas as pd

df = pd.read_json('sample.json', lines=True, encoding="utf-8")
print(df)
access address feature hour name offday \
0 30 5-17-62 [Wi-Fi, LAN1, Non-smoking] 9:00〜22:00 CHEZ MADU -
1 30 5-17-62 [Wi-Fi, LAN2, Non-smoking] 9:00〜22:00 CHEZ MADU -
2 30 5-17-62 [Wi-Fi, LAN3, Non-smoking] 9:00〜22:00 CHEZ MADU -
tel web
0 042-465-3533 http://www.hakka-group.co.jp/shoplist/
1 042-465-3533 http://www.hakka-group.co.jp/shoplist/
2 042-465-3533 http://www.hakka-group.co.jp/shoplist/
# True where the row's feature list contains 'LAN1'
mask = df['feature'].apply(lambda x: 'LAN1' in x)
print(mask)
0 True
1 False
2 False
Name: feature, dtype: bool
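To get one Boolean column per feature, as the question asks, the same membership check can be repeated in a loop. This is a minimal sketch, assuming a small hand-made dataframe and a known list of feature names (both are illustrative, not taken from the original data):

```python
import pandas as pd

# Hypothetical sample data mirroring the question's "feature" column.
df = pd.DataFrame({
    "Product": [1, 2],
    "feature": [["Wi-Fi", "LAN", "LED"], ["Wi-Fi"]],
})

# Assumed list of features that should become columns.
features = ["Wi-Fi", "LAN", "LED"]

for name in features:
    # True where the row's list contains this feature.
    df[name] = df["feature"].apply(lambda items: name in items)

# Drop the original list column to leave only the Boolean flags.
result = df.drop(columns="feature")
print(result)
```

Because the check uses the in operator on each list, it matches whole feature names only, so "LAN" would not match an entry such as "LAN1".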
Upvotes: 4