Erin
Erin

Reputation: 505

Putting list of timezones strings in order by UTC to label plt.x-axis of datetime plot

So I have a pandas df column, which contains the following info:

67               America/Buenos_Aires
68     America/Argentina/Buenos_Aires
70     America/Argentina/Buenos_Aires
77     America/Argentina/Buenos_Aires
78                   Pacific/Auckland
79     America/Argentina/Buenos_Aires
80          America/Argentina/Tucuman
81                      Europe/Lisbon
82     America/Argentina/Buenos_Aires
83     America/Argentina/Buenos_Aires
84     America/Argentina/Buenos_Aires
85     America/Argentina/Buenos_Aires
86     America/Argentina/Buenos_Aires
87     America/Argentina/Buenos_Aires
88          America/Argentina/Cordoba
89     America/Argentina/Buenos_Aires
90          America/Argentina/Mendoza
91     America/Argentina/Buenos_Aires
92                      Europe/Madrid

In order to plot the distribution of timezones, I converted to a datetime.now obj and plotted the following: [![enter image description here][1]][1]

I want to label the x-axis with the appropriate time zone name, e.g. "America/Argentina/Buenos_Aires", or "Europe/Madrid", etc. But can't figure out how to match the datetime.now object to the timezone name. The column that's been converted to datetime.now objs looks like this:

0      2021-04-01 16:41:51.302270+02:00
2      2021-04-01 11:41:51.302270-03:00
3      2021-04-01 11:41:51.302270-03:00
4      2021-04-01 11:41:51.302270-03:00
10     2021-04-01 11:41:51.302270-03:00

Looping through a unique list of timezones looks like:

timezones = timezone_df['Timezone'].unique()
for tz in timezones:
    print(tz)

I get:

Europe/Madrid
America/Argentina/Buenos_Aires
America/Buenos_Aires
America/Montevideo
America/Argentina/Cordoba
Pacific/Auckland
America/Argentina/Tucuman
Europe/Lisbon
America/Argentina/Mendoza
America/Argentina/Salta
America/Santiago
America/New_York
America/Los_Angeles
America/Chicago
Poland
Europe/Rome
America/Cayman
America/Bogota
America/Argentina/Jujuy
America/Mexico_City
Asia/Aqtobe
America/Sao_Paulo
Europe/Berlin
Brazil/East
America/Cordoba

I've tried using strftime, mdates.DateFormatter, and tzname per the datetime documentation (https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes ) but nothing seems to be working.

Is there a way to put the list of strings ( so in this format -> "America/Sao_Paulo", "Europe/Berlin", "Brazil/East", "America/Cordoba" ) in order by UTC +01:00, +02:00, etc.? So in the following order:

(UTC +00:00) Western European Time',
'+01:00' => '(UTC +01:00) Central European Time',
'+02:00' => '(UTC +02:00) Eastern European Time',
'+03:00' => '(UTC +03:00) Further-Eastern European Time',
'+04:00' => '(UTC +04:00) Gulf Standard Time',
'+05:00' => '(UTC +05:00) Pakistan Standard Time',
'+05:30' => '(UTC +05:30) Indian Standard Time',
'+05:45' => '(UTC +05:45) Nepal Time',
'+06:00' => '(UTC +06:00)
```

I'm plotting it like this because I want the timezones to appear on the x-axis as they would if you were follow them on a map from west to east. Doing this:
[![enter image description here][2]][2]
means the timezones are ordered by distribution and not by how they'd appear on a map.


  [1]: https://i.sstatic.net/cxYnj.png
  [2]: https://i.sstatic.net/EPm8Q.png

Upvotes: 1

Views: 148

Answers (1)

FObersteiner
FObersteiner

Reputation: 25564

You can sort the df based on utcoffset (docs). An example to give you the idea:

import pandas as pd
# some sample data...
df = pd.DataFrame({'tzname': ["Europe/Madrid",
                              "America/Argentina/Buenos_Aires",
                              "Europe/London",
                              "America/Los_Angeles"]})

df['dt'] = df['tzname'].apply(lambda t: pd.Timestamp.now(t))
df['utcoffset'] = df['dt'].apply(lambda t: t.utcoffset())
df = df.sort_values('utcoffset', ascending=True).reset_index()

# df
# Out[87]: 
#    index  ...         utcoffset
# 0      3  ... -1 days +17:00:00
# 1      1  ... -1 days +21:00:00
# 2      2  ...   0 days 01:00:00
# 3      0  ...   0 days 02:00:00

# [4 rows x 4 columns]

Note that UTC offsets aren't constant for all time zones over the year since some have DST. So the order might be different during different periods of the year.

Now you can plot, e.g. like

# a quick way to adjust the margins:    
import matplotlib.pyplot as plt
plt.subplots_adjust(top=0.95, bottom=0.4, left=0.14, right=0.94)
title = f"UTC offsets on {pd.Timestamp.now('UTC').date().isoformat()} UTC"
ax = (df['utcoffset'].dt.total_seconds()//3600).plot(title=title, 
                                                     marker='*', markersize=8, 
                                                     color='b', linestyle='')
ax.set_ylabel('UTC offset [h]')
ax.set_xticks(df.index)
ax.set_xticklabels(df['tzname'], rotation=45)
ax.xaxis.grid()
ax.yaxis.grid()

enter image description here

Upvotes: 1

Related Questions