Reputation: 505
So I have a pandas df column, which contains the following info:
67 America/Buenos_Aires
68 America/Argentina/Buenos_Aires
70 America/Argentina/Buenos_Aires
77 America/Argentina/Buenos_Aires
78 Pacific/Auckland
79 America/Argentina/Buenos_Aires
80 America/Argentina/Tucuman
81 Europe/Lisbon
82 America/Argentina/Buenos_Aires
83 America/Argentina/Buenos_Aires
84 America/Argentina/Buenos_Aires
85 America/Argentina/Buenos_Aires
86 America/Argentina/Buenos_Aires
87 America/Argentina/Buenos_Aires
88 America/Argentina/Cordoba
89 America/Argentina/Buenos_Aires
90 America/Argentina/Mendoza
91 America/Argentina/Buenos_Aires
92 Europe/Madrid
In order to plot the distribution of timezones, I converted to a datetime.now obj and plotted the following: [![enter image description here][1]][1]
I want to label the x-axis with the appropriate time zone name, e.g. "America/Argentina/Buenos_Aires", or "Europe/Madrid", etc. But can't figure out how to match the datetime.now object to the timezone name. The column that's been converted to datetime.now objs looks like this:
0 2021-04-01 16:41:51.302270+02:00
2 2021-04-01 11:41:51.302270-03:00
3 2021-04-01 11:41:51.302270-03:00
4 2021-04-01 11:41:51.302270-03:00
10 2021-04-01 11:41:51.302270-03:00
Looping through a unique list of timezones looks like:
timezones = timezone_df['Timezone'].unique()
for tz in timezones:
print(tz)
I get:
Europe/Madrid
America/Argentina/Buenos_Aires
America/Buenos_Aires
America/Montevideo
America/Argentina/Cordoba
Pacific/Auckland
America/Argentina/Tucuman
Europe/Lisbon
America/Argentina/Mendoza
America/Argentina/Salta
America/Santiago
America/New_York
America/Los_Angeles
America/Chicago
Poland
Europe/Rome
America/Cayman
America/Bogota
America/Argentina/Jujuy
America/Mexico_City
Asia/Aqtobe
America/Sao_Paulo
Europe/Berlin
Brazil/East
America/Cordoba
I've tried using strftime, mdates.DateFormatter, and tzname per the datetime documentation (https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes ) but nothing seems to be working.
Is there a way to put the list of strings ( so in this format -> "America/Sao_Paulo", "Europe/Berlin", "Brazil/East", "America/Cordoba" ) in order by UTC +01:00, +02:00, etc.? So in the following order:
(UTC +00:00) Western European Time',
'+01:00' => '(UTC +01:00) Central European Time',
'+02:00' => '(UTC +02:00) Eastern European Time',
'+03:00' => '(UTC +03:00) Further-Eastern European Time',
'+04:00' => '(UTC +04:00) Gulf Standard Time',
'+05:00' => '(UTC +05:00) Pakistan Standard Time',
'+05:30' => '(UTC +05:30) Indian Standard Time',
'+05:45' => '(UTC +05:45) Nepal Time',
'+06:00' => '(UTC +06:00)
```
I'm plotting it like this because I want the timezones to appear on the x-axis as they would if you were follow them on a map from west to east. Doing this:
[![enter image description here][2]][2]
means the timezones are ordered by distribution and not by how they'd appear on a map.
[1]: https://i.sstatic.net/cxYnj.png
[2]: https://i.sstatic.net/EPm8Q.png
Upvotes: 1
Views: 148
Reputation: 25564
You can sort the df based on utcoffset
(docs). An example to give you the idea:
import pandas as pd
# some sample data...
df = pd.DataFrame({'tzname': ["Europe/Madrid",
"America/Argentina/Buenos_Aires",
"Europe/London",
"America/Los_Angeles"]})
df['dt'] = df['tzname'].apply(lambda t: pd.Timestamp.now(t))
df['utcoffset'] = df['dt'].apply(lambda t: t.utcoffset())
df = df.sort_values('utcoffset', ascending=True).reset_index()
# df
# Out[87]:
# index ... utcoffset
# 0 3 ... -1 days +17:00:00
# 1 1 ... -1 days +21:00:00
# 2 2 ... 0 days 01:00:00
# 3 0 ... 0 days 02:00:00
# [4 rows x 4 columns]
Note that UTC offsets aren't constant for all time zones over the year since some have DST. So the order might be different during different periods of the year.
Now you can plot, e.g. like
# a quick way to adjust the margins:
import matplotlib.pyplot as plt
plt.subplots_adjust(top=0.95, bottom=0.4, left=0.14, right=0.94)
title = f"UTC offsets on {pd.Timestamp.now('UTC').date().isoformat()} UTC"
ax = (df['utcoffset'].dt.total_seconds()//3600).plot(title=title,
marker='*', markersize=8,
color='b', linestyle='')
ax.set_ylabel('UTC offset [h]')
ax.set_xticks(df.index)
ax.set_xticklabels(df['tzname'], rotation=45)
ax.xaxis.grid()
ax.yaxis.grid()
Upvotes: 1