Jean Pierre Waked

Reputation: 123

IndexingError: Unalignable boolean Series provided as indexer (filtering rows in pandas using Python)

import pandas as pd
import csv
def load_source(filename):
    users = pd.read_csv(filename, encoding="utf8")
    return users

list_me = "Entrepreneur|Behold|=|Ã|±|Ã|®|Å|¥|ð|Ÿ|˜|‡|ð|à|¤|œ|à|¤|²"

users = load_source(latest_file)
filtered_followers_up = users[users.followersCount <= 1500]
filtered_followers_down = filtered_followers_up[filtered_followers_up.followersCount >= 0]

filtered_bio = filtered_followers_down[filtered_followers_down['bio'].dropna().str.contains(list_me)]
filtered_bio.to_csv(r'C:\Users\user\Downloads\test.csv', sep=',', encoding='utf-8')
print("Done!")


What I'm trying to do is filter my CSV file by removing every row whose bio contains any of these patterns: ("Entrepreneur|Behold|=|Ã|±|Ã|®|Å|¥|ð|Ÿ|˜|‡|ð|à|¤|œ|à|¤|²"). Instead, the code above raises the IndexingError from the title.

Upvotes: 1

Views: 58

Answers (1)

Laurent

Reputation: 13478

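The issue comes from calling dropna() inside the filtering expression. dropna() returns a shorter Series whose index no longer matches the index of filtered_followers_down, so pandas cannot align the boolean mask with the dataframe it is supposed to filter.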
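To see the mismatch for yourself, here is a minimal sketch. The dataframe and the single-word pattern are made up for illustration and are not your data; the exception is caught generically so the snippet does not depend on where your pandas version exposes IndexingError.

```python
import pandas as pd

# Hypothetical data: one bio is missing
df = pd.DataFrame({"bio": ["a", "Behold", pd.NA, "d"]})

# dropna() removes row 2, so the mask only has labels 0, 1 and 3
mask = df["bio"].dropna().str.contains("Behold")
print(list(mask.index))

# The dataframe still has label 2, which the mask cannot cover
error_name = None
try:
    df[mask]
except Exception as exc:
    error_name = type(exc).__name__
print(error_name)
```

The mask has three entries while the dataframe has four rows, which is what the "Unalignable boolean Series" message is complaining about.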

Instead, remove the NA values first, then use the bitwise NOT operator ~ to drop every row whose bio matches list_me:

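# Example dataframe (list_me is the pattern string defined in the question)
filtered_followers_down = pd.DataFrame({"bio": ["a", "Behold", pd.NA, "d", "Ã"]})

# Drop rows with a missing bio first, so the mask built below shares the dataframe's index
filtered_followers_down = filtered_followers_down.dropna(subset=["bio"])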

filtered_bio = filtered_followers_down[
    ~filtered_followers_down["bio"].str.contains(list_me)
]

print(filtered_bio)
# Output
  bio
0   a
3   d
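If you would rather keep the rows that have no bio at all, a possible alternative is to skip dropna() and pass na=False to str.contains, so missing values simply count as "no match". This is only a sketch: the dataframe and the shortened pattern below are stand-ins for your data and your full list_me.

```python
import pandas as pd

# Shortened stand-in for the question's list_me pattern
list_me = "Entrepreneur|Behold"

# Hypothetical data with one missing bio
df = pd.DataFrame({"bio": ["a", "Behold", pd.NA, "d", "Entrepreneur at X"]})

# na=False turns missing bios into False, so the mask keeps the full index
matches = df["bio"].str.contains(list_me, na=False)

# ~ inverts the mask: keep only the rows that do not match
filtered_bio = df[~matches]
print(filtered_bio)
```

Note the difference from the approach above: here the row with the missing bio stays in the result, because it is treated as not matching.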

Upvotes: 1
