Reputation: 13
I'm considering changing all my DataFrameA.columnA
to DataFrameA["columnA"]
because it looks like the docs use the bracket syntax quite often. It also looks like better practice because it offers the opportunity to pick a column dynamically based on a variable instead of hard-coding the name. For example, you could do:
columnWanted="columnA";
DataFrameA[columnWanted] # Yield ColumnA, GOOD
With the other syntax,
columnWanted="columnA";
DataFrameA.columnWanted # Yields Nothing, BAD. No way of Evaluating Variable.
would not work, because it looks for an attribute literally named "columnWanted", and there's no way to tell Python that you want columnWanted to be evaluated for its value.
https://pandas.pydata.org/docs/getting_started/intro_tutorials/03_subset_data.html
Upvotes: 1
Views: 44
Reputation: 184
First of all, it should be DataFrameA["columnA"] and not DataFrameA.["columnA"].
Also, here:
columnWanted="columnA";
DataFrameA.columnWanted # Yields Nothing, BAD. No way of Evaluating Variable.
"No way of Evaluating Variable" is not exactly true. You can replace it with
columnWanted="columnA";
getattr(DataFrameA, columnWanted) # equivalent to DataFrameA.columnA
but you should not use this here. It's better practice to use DataFrameA["columnA"].
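To make the comparison concrete, here is a minimal sketch. The DataFrame contents and the names df, column_wanted, by_bracket and by_getattr are made up for illustration; only the three access styles come from the discussion above.

```python
import pandas as pd

# Hypothetical data, just for illustration.
df = pd.DataFrame({"columnA": [1, 2, 3], "columnB": [4, 5, 6]})
column_wanted = "columnA"

# Bracket notation evaluates the variable and returns the column.
by_bracket = df[column_wanted]

# getattr also evaluates the variable, but is less readable here.
by_getattr = getattr(df, column_wanted)

# Attribute notation looks for a column literally named "column_wanted".
try:
    df.column_wanted
    attribute_lookup_failed = False
except AttributeError:
    attribute_lookup_failed = True
```

Note that the plain attribute lookup does not silently return nothing; it raises AttributeError because no column or attribute has that literal name.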
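About the question, for reading an existing column both give the same result. You can see this by running
DataFrameA.columnA is DataFrameA["columnA"]
The output is True on pandas versions that cache the column Series; DataFrameA.columnA.equals(DataFrameA["columnA"]) is True in either case.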
Upvotes: 0
Reputation: 30971
You can use both above conventions as long as:
There is, however, another limitation on the attribute notation: the column name must be a valid Python identifier (e.g. it cannot contain a space).
But if you create a new column, then the only choice is the bracket notation.
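A small sketch of these limitations, using made-up column names ("col a", "count", "new_attr", "new_col") that are assumptions for illustration only. The "count" case is an extra example I am adding: a column whose name matches an existing DataFrame method is shadowed by that method under attribute notation.

```python
import warnings

import pandas as pd

# Hypothetical data: one name with a space, one that matches a DataFrame method.
df = pd.DataFrame({"col a": [1, 2], "count": [10, 20]})

# A name with a space cannot be written as df.col a; brackets work.
spaced = df["col a"]

# Brackets return the column; the attribute is the DataFrame.count method.
count_column = df["count"]
count_attribute = df.count

# Assigning to a new attribute sets a plain Python attribute, not a column.
# pandas warns about this, so the warning is silenced for the demo.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.new_attr = [7, 8]

# Bracket assignment is what actually creates a column.
df["new_col"] = [7, 8]
```

After running this, df.columns contains "new_col" but not "new_attr".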
In my opinion, the attribute notation is used quite often, and there is no need to switch to the bracket notation just for the sake of using a single notation.
Note also that df.xxx is more concise than df['xxx'], so I prefer the attribute notation.
Upvotes: 1