wmatt

Reputation: 795

Pandas DataFrame column naming conventions

Is there a commonly used naming convention for Pandas DataFrame columns? Is PEP8 recommended here (e.g. the style it gives for instance variables)?

I'm conscious that lots of data is loaded from external sources that already have headers, but I'm curious what the correct approach is when I have to name or rename the columns myself.

Upvotes: 31

Views: 16766

Answers (3)

gregV

Reputation: 1107

There is no clear guidance from the pandas founding fathers, so the choice is really between the already mentioned snake case and Pascal (upper camel) case, i.e. df['my_column'] vs df['MyColumn'], and it is a matter of preference. A lot of R packages use snake case for data frames. I'd say snake case is more readable, while Pascal case requires fewer characters.
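If you settle on snake case and need to rename columns yourself, something like the following sketch may help. Note that `to_snake_case` is a hypothetical helper written for this illustration, not part of pandas, and the sample column names are made up; only `DataFrame.rename(columns=...)` is the real pandas call.

```python
import re

import pandas as pd


def to_snake_case(name: str) -> str:
    """Hypothetical helper: convert 'MyColumn' or 'Other Value' to snake_case."""
    # Put an underscore between a lowercase letter or digit and a following capital.
    s = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    # Turn runs of whitespace or hyphens into a single underscore.
    s = re.sub(r"[\s\-]+", "_", s)
    return s.lower()


# Made-up example data with mixed naming styles.
df = pd.DataFrame({"MyColumn": [1, 2], "Other Value": [3, 4]})

# rename accepts a callable that is applied to every column label.
df = df.rename(columns=to_snake_case)
print(list(df.columns))
```

This simple regex does not try to handle acronyms such as "HTTPStatus", so treat it as a starting point.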

Upvotes: 2

blokeley

Reputation: 7065

Some people tend to use snake_case (lower case with underscores) so that they can access the column as an attribute, using a period, like this: df.my_column

I tend to always access columns using the df['my_column'] syntax because it avoids confusion with DataFrame methods and properties, and it is easier to extend to slices and fancy indexing, so snake case is not necessary.
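A small sketch of the difference, using made-up column names: one column is deliberately called "count" so that it collides with the DataFrame.count method, and another contains a space.

```python
import pandas as pd

# Made-up example data; "count" clashes with a DataFrame method name.
df = pd.DataFrame(
    {"my_column": [1, 2, 3], "count": [10, 20, 30], "my column": [7, 8, 9]}
)

# Attribute access works when the name is a valid identifier with no clash.
same = df.my_column.equals(df["my_column"])

# df.count is the DataFrame method here, not the "count" column.
count_is_method = callable(df.count)

# Bracket access always returns the column, whatever it is called.
count_column = df["count"]
spaced_column = df["my column"]

# Bracket access also extends naturally to selecting several columns.
subset = df[["my_column", "count"]]
```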

In short, I think you should use whatever is clearest to a potential reader.

Upvotes: 29

Keiron Stoddart

Reputation: 378

One more thing to keep in mind: if your application also makes use of relational databases, I would recommend keeping your Pandas column names consistent with the column names of your relational database tables.
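As a rough illustration of why this is convenient, here is a sketch using an in-memory SQLite database. The table name "orders" and its columns are invented for the example. Because the DataFrame uses the same column names as the table, rows can be read and appended without any renaming step.

```python
import sqlite3

import pandas as pd

# In-memory database with a hypothetical "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_name TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'Ada')")

# The DataFrame inherits the table's column names as-is.
df = pd.read_sql_query("SELECT * FROM orders", conn)

# New rows built with the same names can be appended directly.
new_rows = pd.DataFrame({"order_id": [2], "customer_name": ["Grace"]})
new_rows.to_sql("orders", conn, if_exists="append", index=False)

row_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
conn.close()
```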

Upvotes: 9
