Reputation: 2618
I have been working with Python pandas for quite a while, and I am now staring at the two commands below wondering what the difference between them is.
df1['Col1'] #Shows only the values of 'Col1' from df1 dataframe.
df1[['Col1','Col2']] #Shows the values of both 'Col1' and 'Col2' from df1 dataframe.
My question is: when we can access a column with single square brackets ('[ ]'), why can't we do the same to access multiple columns? I tried the command below and got an error.
df1['Col1','Col2'] #Encountered error
Upvotes: 1
Views: 224
Reputation: 30605
Usually pandas takes a single index value when selecting data with []. Either pass one column name, or pass a list of column names as that one value. When you pass two values, they are treated as a tuple, and pandas searches for that tuple as a column name in the dataframe. Tuples can be used as column names in some cases, which is why the lookup is attempted, and why it fails with a KeyError when no such column exists.
You can create a column with such a name, for example df['Col1','Col2'] = 'x', and then df['Col1','Col2'] will work. To avoid this kind of ambiguity, more than one column name has to be passed as a list.
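To make the tuple-versus-list distinction concrete, here is a minimal sketch. The dataframe contents and the variable names (tuple_lookup_failed, tuple_column, both_columns) are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Small illustrative dataframe; the values are arbitrary.
df = pd.DataFrame({"Col1": [1, 2], "Col2": [3, 4]})

# A tuple key is looked up as ONE column label, so this raises KeyError
# because no column is labelled ('Col1', 'Col2') yet.
try:
    df["Col1", "Col2"]
    tuple_lookup_failed = False
except KeyError:
    tuple_lookup_failed = True

# Assigning with a tuple key creates a single column whose label is the tuple.
df["Col1", "Col2"] = "x"

# The same lookup now succeeds and returns that one column as a Series.
tuple_column = df["Col1", "Col2"]

# A list key still means "select each of these columns".
both_columns = df[["Col1", "Col2"]]
```

After the assignment the frame has three columns: 'Col1', 'Col2', and one labelled with the tuple ('Col1', 'Col2'), which is exactly the ambiguity the list syntax avoids.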
Upvotes: 3
Reputation: 294488
Setup
import pandas as pd
df = pd.DataFrame([[1, 2], [3, 4]], columns=['col1', 'col2'])
In Python, [] is syntactic sugar for the __getitem__ method.
This:
df['col1']
0 1
1 3
Name: col1, dtype: int64
Is equivalent to:
df.__getitem__('col1')
0 1
1 3
Name: col1, dtype: int64
And this:
df[['col1', 'col2']]
col1 col2
0 1 2
1 3 4
Is the same as this:
df.__getitem__(['col1', 'col2'])
col1 col2
0 1 2
1 3 4
So when you do this:
df['col1', 'col2']
Python packs everything inside the brackets into a single argument, a tuple, so it's the same as:
df.__getitem__(('col1', 'col2'))
Which gets you
KeyError: ('col1', 'col2')
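You can see the packing without pandas at all. The sketch below uses a hypothetical KeyRecorder class (not part of pandas or the standard library) whose __getitem__ simply returns whatever key the brackets handed it.

```python
class KeyRecorder:
    """Hypothetical helper: returns the key exactly as __getitem__ received it."""

    def __getitem__(self, key):
        return key


recorder = KeyRecorder()

single = recorder["col1"]               # one string
as_list = recorder[["col1", "col2"]]    # one list holding two strings
as_tuple = recorder["col1", "col2"]     # comma-separated keys arrive as one tuple
```

In every case __getitem__ receives exactly one argument; it is the comma inside the brackets that builds the tuple, before pandas ever sees the key.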
Upvotes: 2