Reputation: 491
Following this post I tried this to delete two columns from a dataframe:
import pandas as pd
from io import StringIO
A_csv = """cases,population,country,year,type,count
745,19987071,Afghanistan,1999,population,19987071
2666,20595360,Afghanistan,2000,population,20595360
37737,172006362,Brazil,1999,population,172006362
80488,174504898,Brazil,2000,population,174504898
212258,1272915272,China,1999,population,1272915272
213766,1280428583,China,2000,population,1280428583"""
with StringIO(A_csv) as fp:
    A = pd.read_csv(fp)
print(A)
print()
dropcols = ["type", "count"]
A = A.drop(dropcols, axis = 1, inplace = True)
print(A)
Result:
    cases  population      country  year        type       count
0     745    19987071  Afghanistan  1999  population    19987071
1    2666    20595360  Afghanistan  2000  population    20595360
2   37737   172006362       Brazil  1999  population   172006362
3   80488   174504898       Brazil  2000  population   174504898
4  212258  1272915272        China  1999  population  1272915272
5  213766  1280428583        China  2000  population  1280428583
None
Is there something obvious that is escaping me?
Upvotes: 3
Views: 2455
Reputation: 403128
These solutions were mentioned in the comments. I'm just fleshing them out in this post.
When using drop, be aware of the two options you have.
One of them is to drop in place (inplace=True). The dataframe itself is modified, so the call on its own is sufficient and no assignment is needed:
A.drop(dropcols, axis=1, inplace=True)
A
    cases  population      country  year
0     745    19987071  Afghanistan  1999
1    2666    20595360  Afghanistan  2000
2   37737   172006362       Brazil  1999
3   80488   174504898       Brazil  2000
4  212258  1272915272        China  1999
5  213766  1280428583        China  2000
As the df.drop documentation specifies:
inplace : bool, default False
If True, do operation inplace and return None.
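Note that when drop is called in place, it returns None (the value Python returns from any function that has no explicit return value), and A will have already been updated. This is why the assignment A = A.drop(dropcols, axis=1, inplace=True) in the question leaves A set to None.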
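To see this in isolation, here is a minimal sketch. The small frame df and the name result are made up for illustration and are not from the question:

```python
import pandas as pd

# Hypothetical toy frame, just to show the return value of an in-place drop.
df = pd.DataFrame({
    "cases": [745, 2666],
    "year": [1999, 2000],
    "type": ["population", "population"],
})

# With inplace=True the frame is modified and the call returns None.
result = df.drop(["type"], axis=1, inplace=True)

print(result)            # None
print(list(df.columns))  # ['cases', 'year']
```

Binding result back to the original name is exactly what replaces the dataframe with None.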
The other option is to drop, but return a copy. This means that the original is not modified. So, you can now do:
B = A.drop(dropcols, axis=1)
B
    cases  population      country  year
0     745    19987071  Afghanistan  1999
1    2666    20595360  Afghanistan  2000
2   37737   172006362       Brazil  1999
3   80488   174504898       Brazil  2000
4  212258  1272915272        China  1999
5  213766  1280428583        China  2000
A
    cases  population      country  year        type       count
0     745    19987071  Afghanistan  1999  population    19987071
1    2666    20595360  Afghanistan  2000  population    20595360
2   37737   172006362       Brazil  1999  population   172006362
3   80488   174504898       Brazil  2000  population   174504898
4  212258  1272915272        China  1999  population  1272915272
5  213766  1280428583        China  2000  population  1280428583
Here, B and A exist separately.
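If you don't need the original, you can also assign the copy back to the same name. A rough sketch, assuming a cut-down version of the question's data (two rows only) and using the columns= keyword as an alternative to axis=1:

```python
import pandas as pd

# Cut-down stand-in for the question's dataframe (illustrative only).
A = pd.DataFrame({
    "cases": [745, 2666],
    "population": [19987071, 20595360],
    "country": ["Afghanistan", "Afghanistan"],
    "year": [1999, 2000],
    "type": ["population", "population"],
    "count": [19987071, 20595360],
})
dropcols = ["type", "count"]

# drop returns a new frame; A is left untouched.
B = A.drop(columns=dropcols)
print(list(B.columns))  # ['cases', 'population', 'country', 'year']
print(list(A.columns))  # still includes 'type' and 'count'

# Rebinding the name works because inplace is left at its default (False).
A = A.drop(dropcols, axis=1)
print(list(A.columns))  # ['cases', 'population', 'country', 'year']
```

The key point is to use either inplace=True or the assignment, never both together.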
Note that you are not saving any memory by working with inplace: both methods create a copy. However, in the former case, the copy is made behind the scenes and the changes are assigned back to the original object.
Upvotes: 3