Reputation: 3
I'm learning Python and Pandas by implementing some projects.
I have multiple dataframes that contain information concerning certain companies. Each company is in a row of each dataframe. I need to delete some companies and therefore some rows from all dataframes.
I intend to create a function that deletes these rows from all the dataframes. Something like this:
import pandas as pd
my_data1 = pd.read_csv('file1.csv')
my_data2 = pd.read_csv('file2.csv')
Company Inf1 Inf2
c0 1 2
c1 2 4
c2 3 6
c3 4 8
c4 5 10
c5 6 12
list_companies_to_mantain = ['c1', 'c4']
def delete(df):
    df = df[df['Company'].isin(list_companies_to_mantain)]
delete(my_data1)
delete(my_data2)
Company Inf1 Inf2
c1 2 4
c4 5 10
The problem is: the rows are deleted from the df inside the function, but not outside it. If I print() the dataframe inside the function, it shows only the companies I want. But outside the function, my_data1 and my_data2 still contain all the rows.
The point of doing this is to have my_data1 and my_data2, after applying delete(), without the companies I do not want.
I feel the answer to this is pretty obvious, but I simply cannot do it. If I was not clear, please let me know.
Thanks!
Upvotes: 0
Views: 28
Reputation: 323306
You need to return it and assign it back:
def delete(df):
    return df[df['Company'].isin(list_companies_to_mantain)]
my_data1 = delete(my_data1)
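To see why the original version appeared to do nothing: `df = ...` inside a function only rebinds the local name `df`, so the caller's dataframe is never touched. Here is a minimal sketch of both behaviours. The sample data is built by hand from the table in the question instead of being read from the CSV files, and `delete_rebind` and `delete_in_place` are hypothetical names used only for illustration.

```python
import pandas as pd

# Hand-built stand-ins for the CSV contents shown in the question.
def make_data():
    return pd.DataFrame({
        "Company": ["c0", "c1", "c2", "c3", "c4", "c5"],
        "Inf1": [1, 2, 3, 4, 5, 6],
        "Inf2": [2, 4, 6, 8, 10, 12],
    })

my_data1 = make_data()
my_data2 = make_data()
list_companies_to_mantain = ["c1", "c4"]

def delete_rebind(df):
    # Rebinds the local name only; the caller's dataframe is unchanged.
    df = df[df["Company"].isin(list_companies_to_mantain)]

def delete(df):
    # Returns a new, filtered dataframe; the caller must assign it back.
    return df[df["Company"].isin(list_companies_to_mantain)]

def delete_in_place(df):
    # Alternative: drop the unwanted rows from the object that was passed in.
    unwanted = df.index[~df["Company"].isin(list_companies_to_mantain)]
    df.drop(index=unwanted, inplace=True)

delete_rebind(my_data1)
rows_after_rebind = len(my_data1)   # all six rows are still there

my_data1 = delete(my_data1)         # return and assign back
delete_in_place(my_data2)           # or mutate the passed object directly

print(rows_after_rebind)
print(my_data1)
print(my_data2)
```

Returning and reassigning is usually the easier pattern to follow. The in-place variant is only there in case you really want a function that changes the dataframes without any assignment at the call site.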
Upvotes: 1