afober

Reputation: 3

Creating a function to filter a pandas DataFrame

I'm learning Python and Pandas by implementing some projects.

I have multiple dataframes that contain information concerning certain companies. Each company is in a row of each dataframe. I need to delete some companies and therefore some rows from all dataframes.

I intend to create a function that deletes these rows from all the dataframes. Something like this:

import pandas as pd

my_data1 = pd.read_csv('file1.csv')
my_data2 = pd.read_csv('file2.csv')

Company   Inf1 Inf2
c0        1     2
c1        2     4
c2        3     6
c3        4     8
c4        5    10
c5        6    12


list_companies_to_mantain = ['c1', 'c4']

def delete(df):
    df = df[df['Company'].isin(list_companies_to_mantain)]

delete(my_data1)
delete(my_data2)


Company   Inf1 Inf2
c1        2     4
c4        5    10

The problem is that the rows are removed from df inside the function, but not from the dataframes outside it. If I print() the dataframe inside the function, it shows only the filtered rows. Outside the function, my_data1 and my_data2 are unchanged.

The point of doing this is to have my_data1 and my_data2, after applying delete(), without the companies I do not want.

I feel the answer to this is pretty obvious, but I simply cannot do it. If I was not clear, please let me know.

Thanks!

Upvotes: 0

Views: 28

Answers (1)

BENY

Reputation: 323306

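You need to return the filtered dataframe from the function and assign the result back. Inside the function, df = ... only rebinds the local name df to a new object; it does not change the dataframe the caller passed in.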

def delete(df):
    return df[df['Company'].isin(list_companies_to_mantain)]

my_data1 = delete(my_data1)
my_data2 = delete(my_data2)
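
For a version you can run without the CSV files, here is a minimal sketch of the return-and-assign pattern. The sample data is built inline from the table in the question. The names keep_companies and drop_other_companies_inplace are hypothetical helpers made up for this example, not pandas functions, and passing the list as a parameter is just one possible design. The second helper shows an in-place alternative using DataFrame.drop, in case you would rather not reassign.

```python
import pandas as pd


def keep_companies(df, companies):
    """Return a new DataFrame with only the rows whose 'Company' is in `companies`."""
    # Boolean indexing builds a new object; the caller must assign the result.
    return df[df["Company"].isin(companies)]


def drop_other_companies_inplace(df, companies):
    """Alternative: remove the unwanted rows from the caller's DataFrame itself."""
    # drop(..., inplace=True) mutates the object that was passed in,
    # so no return value or reassignment is needed.
    unwanted = df.index[~df["Company"].isin(companies)]
    df.drop(index=unwanted, inplace=True)


# Sample data copied from the question (stands in for the CSV files).
my_data1 = pd.DataFrame({
    "Company": ["c0", "c1", "c2", "c3", "c4", "c5"],
    "Inf1": [1, 2, 3, 4, 5, 6],
    "Inf2": [2, 4, 6, 8, 10, 12],
})
my_data2 = my_data1.copy()

list_companies_to_mantain = ["c1", "c4"]

# Option 1: return the filtered frame and assign it back.
my_data1 = keep_companies(my_data1, list_companies_to_mantain)

# Option 2: modify the frame in place.
drop_other_companies_inplace(my_data2, list_companies_to_mantain)

print(my_data1)
print(my_data2)
```

Both dataframes end up with only the c1 and c4 rows. The first option is usually easier to reason about, since the function has no side effects on its argument.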

Upvotes: 1
