Reputation: 795
Is there a commonly used naming convention for pandas DataFrame columns? Is PEP 8 recommended here (e.g. the convention for instance variables)?
I'm conscious that a lot of data is loaded from external sources that already have headers, but I'm curious what the correct approach is when I have to name or rename the columns myself.
Upvotes: 31
Views: 16766
Reputation: 1107
There is no clear guidance from the pandas founding fathers, so the choice is really between the already mentioned snake case and Pascal case (often loosely called camel case), i.e. df['my_column'] vs df['MyColumn'], and it is a matter of preference. A lot of R packages use snake case for data frames. I'd say snake case is more readable, while Pascal case requires fewer characters.
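If you settle on snake case and need to rename headers that arrive in mixed styles, something along these lines might work. It is only a minimal sketch: the helper name to_snake_case and the sample column names are made up for illustration, and the regular expressions cover just the simple cases shown here.

```python
import re

import pandas as pd


def to_snake_case(name: str) -> str:
    """Convert a header such as 'MyColumn' or 'Order ID' to snake_case.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    # Put an underscore between a lowercase letter or digit and an uppercase letter.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    # Replace each run of non-alphanumeric characters with a single underscore.
    name = re.sub(r"[^0-9a-zA-Z]+", "_", name)
    return name.strip("_").lower()


# Sample frame with Pascal case, spaced, and hyphenated headers (made-up data).
df = pd.DataFrame({"MyColumn": [1, 2], "Order ID": [10, 20], "unit-price": [1.5, 2.5]})

# DataFrame.rename accepts a callable that is applied to every column label.
df = df.rename(columns=to_snake_case)
print(list(df.columns))
```

Passing a function to rename keeps the convention in one place, so the same rule can be applied to every frame you load.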
Upvotes: 2
Reputation: 7065
Some people tend to use snake_case (lowercase with underscores) so that they can access the column using a period, like this: df.my_column
I tend to always access columns using the df['my_column'] syntax because it avoids confusion with DataFrame methods and properties, and it is easier to extend to slices and fancy indexing, so snake case is not necessary.
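To illustrate the confusion with DataFrame methods, here is a small sketch. The column names are made up; "count" is chosen on purpose because DataFrame.count is an existing method.

```python
import pandas as pd

# Made-up data; the "count" column shares its name with a DataFrame method.
df = pd.DataFrame({"my_column": [1, 2, 3], "count": [10, 20, 30]})

# Attribute access works when the name is a valid identifier and not already taken.
print(df.my_column.tolist())

# Here the attribute resolves to the method, not to the column.
print(callable(df.count))

# Bracket access always returns the column.
print(df["count"].tolist())

# Bracket syntax also extends to several columns and to label-based selection.
subset = df[["my_column", "count"]]
first_rows = df.loc[0:1, "count"]  # .loc slices include the end label
```

The same applies to any column whose name matches an existing attribute, or is not a valid Python identifier at all (for example a name containing a space).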
In short, I think you should use whatever is clearest to a potential reader.
Upvotes: 29
Reputation: 378
One more thing to keep in mind: if your application also makes use of relational databases, I would recommend keeping your pandas column names consistent with the column names of your relational database tables.
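As a rough sketch of what that consistency buys you, the example below uses an in-memory SQLite database from the standard library. The table and column names (orders, order_id, unit_price, total_price) are invented for illustration.

```python
import sqlite3

import pandas as pd

# In-memory database with a hypothetical table that uses snake_case columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, unit_price REAL)")
conn.execute("INSERT INTO orders VALUES (1, 9.5), (2, 12.0)")
conn.commit()

# The DataFrame inherits the database column names unchanged.
df = pd.read_sql_query("SELECT order_id, unit_price FROM orders", conn)

# A derived column follows the same convention, so it can be written back as-is.
df["total_price"] = df["unit_price"] * 2
df.to_sql("orders_with_total", conn, index=False)

stored = pd.read_sql_query("SELECT * FROM orders_with_total", conn)
conn.close()
print(list(stored.columns))
```

Because the names match on both sides, no mapping step is needed when reading from or writing to the database.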
Upvotes: 9