Reputation: 359
Hi, I have a dataframe like this with 500+ rows.
company_url company tag_line product data
0 https://angel.co/billguard BillGuard The fastest smartest way to track your spendin... BillGuard is a personal finance security app t... New York City · Financial Services · Security ...
1 https://angel.co/tradesparq Tradesparq The world's largest social network for global ... Tradesparq is Alibaba.com meets LinkedIn. Trad... Shanghai · B2B · Marketplaces · Big Data · Soc...
2 https://angel.co/sidewalk Sidewalk Hoovers (D&B) for the social era Sidewalk helps companies close more sales to s... New York City · Lead Generation · Big Data · S...
3 https://angel.co/pangia Pangia The Internet of Things Platform: Big data mana... We collect and manage data from sensors embedd... San Francisco · SaaS · Clean Technology · Big ...
4 https://angel.co/thinknum Thinknum Financial Data Analysis Thinknum is a powerful web platform to value c... New York City · Enterprise Software · Financia...
What I want to do is find the null values in the "data" column and drop those rows from the dataframe. I wrote the code below, but I don't think it worked as expected since the number of rows didn't change. Could someone help me with this?
My code:
for item in bigdata_comp_dropped.iterrows():
    if item[1][4] == "":
        bigdata_comp_dropped.drop(item[1])
Upvotes: 0
Views: 1031
Reputation: 375635
You can keep only the rows where "data" is not null by using a boolean mask:
df = df[df["data"].notnull()]
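For instance, a minimal sketch with made-up data (the column names follow the question; the values are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({
    "company": ["BillGuard", "Tradesparq", "Sidewalk"],
    "data": ["New York City", None, "New York City"],
})

# notnull() returns a boolean Series; indexing with it keeps only True rows
df = df[df["data"].notnull()]
print(len(df))  # → 2
```

Note that this reassigns `df` rather than relying on an in-place drop, which is why the row count actually changes.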
Upvotes: 1
Reputation: 10970
Try
bigdata_filtered = bigdata_comp_dropped[~bigdata_comp_dropped['data'].isnull()]
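One caveat: since your loop compares against `""`, the missing values may be empty strings rather than real NaN, and `isnull()` does not treat `""` as null. A hedged sketch (with hypothetical data) that converts empty strings to NaN first:

```python
import numpy as np
import pandas as pd

bigdata_comp_dropped = pd.DataFrame({
    "company": ["Pangia", "Thinknum"],
    "data": ["", "New York City"],
})

# Empty strings are not considered null; replace them with NaN first
bigdata_comp_dropped = bigdata_comp_dropped.replace("", np.nan)
bigdata_filtered = bigdata_comp_dropped[~bigdata_comp_dropped["data"].isnull()]
print(len(bigdata_filtered))  # → 1
```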
Upvotes: 1