Is DataFrameA.columnA the same as DataFrameA.["columnA"] in pandas?

I'm considering changing all my DataFrameA.columnA to DataFrameA["columnA"] because the docs seem to use the bracket syntax quite often. It also looks like better practice because it offers the opportunity to pick a column dynamically based on a variable instead of hard-coding it. For example, you could do:

columnWanted="columnA"; 
DataFrameA[columnWanted] # Yield ColumnA, GOOD

With the other syntax,

columnWanted="columnA"; 
DataFrameA.columnWanted # Yields Nothing, BAD. No way of Evaluating Variable.

would not work, because it looks for an attribute named "columnWanted", and as far as I can tell there is no way in Python to say that columnWanted should be evaluated for its value.

https://pandas.pydata.org/docs/getting_started/intro_tutorials/03_subset_data.html

Upvotes: 1

Views: 44

Answers (2)

yakir0

Reputation: 184

First of all it should be DataFrameA["columnA"] and not DataFrameA.["columnA"].

Also, here:

columnWanted="columnA"; 
DataFrameA.columnWanted # Yields Nothing, BAD. No way of Evaluating Variable.

"No way of Evaluating Variable" is not exactly true. You can replace it with

columnWanted="columnA"; 
getattr(DataFrameA, columnWanted) # equivalent to DataFrameA.columnA

but you should not use this here. It's better practice to use DataFrameA["columnA"].
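To make the difference concrete, here is a minimal sketch. The example data is made up for illustration; only the names DataFrameA, columnA and columnWanted come from the question.

```python
import pandas as pd

# Hypothetical example data; the names mirror the question.
DataFrameA = pd.DataFrame({"columnA": [1, 2, 3], "columnB": [4, 5, 6]})
columnWanted = "columnA"

# Bracket notation evaluates the variable and looks up the column it names.
by_bracket = DataFrameA[columnWanted]

# getattr also evaluates the variable; this is the same as DataFrameA.columnA.
by_getattr = getattr(DataFrameA, columnWanted)

# Plain attribute access uses the literal name "columnWanted", which is not
# a column here, so pandas raises AttributeError rather than returning nothing.
try:
    DataFrameA.columnWanted
    attribute_lookup_failed = False
except AttributeError:
    attribute_lookup_failed = True

print(by_bracket.equals(by_getattr), attribute_lookup_failed)
```

Both lookups give the same values, but the bracket form reads more naturally when the column name lives in a variable.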

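As for the question itself: for an existing column whose name does not clash with a DataFrame attribute or method, both forms give you the same column. You can see this by running

DataFrameA.columnA is DataFrameA["columnA"]

The output is True for me, though an identity check like this can depend on the pandas version; the values are the same either way.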
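If you want a check that does not rely on object identity, comparing values is more robust. The sketch below uses made-up data, and the column named "count" is a hypothetical example chosen to show the one case where the two notations differ: a column name that clashes with an existing DataFrame method.

```python
import pandas as pd

# Hypothetical data; "count" is deliberately named like a DataFrame method.
df = pd.DataFrame({"columnA": [1, 2, 3], "count": [10, 20, 30]})

# For an ordinary column, both notations return the same values.
same_values = df.columnA.equals(df["columnA"])

# For a clashing name, attribute access finds the method, not the column.
count_attr_is_method = callable(df.count)

# Bracket notation still returns the column.
count_column = df["count"]

print(same_values, count_attr_is_method, list(count_column))
```

So "the same" holds only while the column name does not shadow something pandas already defines on the DataFrame.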

Upvotes: 0

Valdi_Bo

Reputation: 30971

You can use both above conventions as long as:

  • the column in question exists,
  • the column name is a string literal, not a variable holding a string.

There is, however, another limitation on the attribute notation: the column name must be a valid Python identifier (e.g. it cannot contain a space).

But if you create a new column, then the only choice is the bracket notation.
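A short sketch of these two limitations, using hypothetical column names ("col a", "b", "c", "d") that are not from the question:

```python
import warnings

import pandas as pd

# Hypothetical data with one column name that is not a valid identifier.
df = pd.DataFrame({"col a": [1, 2], "b": [3, 4]})

# Only bracket notation can reach a name containing a space;
# writing df.col a would be a SyntaxError.
with_space = df["col a"]

# Bracket assignment creates a new column.
df["c"] = df["b"] * 2

# Attribute assignment with a new name does NOT create a column: pandas
# sets a plain Python attribute and emits a UserWarning about it.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.d = [5, 6]

columns = list(df.columns)
print(columns)
```

After this runs, "c" is a real column while "d" is only an attribute on the object, which is easy to miss if the warning is suppressed.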

In my opinion, the attribute notation is used quite often, and there is no need to switch to the bracket notation just for the sake of using a single notation.

Note also that df.xxx is more concise than df['xxx'], so I tend to prefer the attribute notation.

Upvotes: 1
