Reputation: 525
Trying to format data from an ics calendar file to any output, such as JSON or even Python print(). Looking for good ways to replace special characters without losing readability or ending up with ASCII-like escape sequences. Examples below. Any tips?
Summary field value in ics file
FORMULA 1 HEINEKEN GRANDE PRÉMIO DE PORTUGAL 2021 - Race
FORMULA 1 MYWORLD GROSSER PREIS VON ÖSTERREICH 2021 - Race
Summary key value in json file
FORMULA 1 HEINEKEN GRANDE PR\u00c3\u0089MIO DE PORTUGAL 2021 - Race
FORMULA 1 MYWORLD GROSSER PREIS VON \u00c3\u0096STERREICH 2021 - Race
Sample code to reproduce problem
import requests
import json
from icalendar import Calendar

## LOGIC HERE ##
def format_text(text):
    text = str(text)
    return text

url = "http://www.formula1.com/calendar/Formula_1_Official_Calendar.ics"
res = requests.get(url)
calendar = Calendar.from_ical(res.text)
events = [
    {
        "id": event["UID"].split("@")[-1].strip(),
        "startTime": event["DTSTART"].dt.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3],
        "summary": format_text(event["SUMMARY"])
    }
    for event in calendar.walk("VEVENT")
    if str(event["UID"]).split("@")[0].startswith("Race")
]
with open("events.json", "w") as f:
    json.dump(events, f, indent=2)
Upvotes: 0
Views: 277
Reputation: 177971
The data for the .ics file should not be decoded first; pass the raw bytes directly to .from_ical by using res.content instead of res.text. Calendar then decodes the data correctly as UTF-8 (probably part of the .ics spec), and print can display the resulting Unicode strings. For the JSON, open the file with utf8 encoding and pass ensure_ascii=False, as @JosefZ recommended, so the characters appear correctly there as well:
import requests
import json
from icalendar import Calendar

url = 'http://www.formula1.com/calendar/Formula_1_Official_Calendar.ics'
res = requests.get(url)
# Pass the raw bytes so Calendar does the decoding itself.
calendar = Calendar.from_ical(res.content)
events = [
    {
        'id': event['UID'].split('@')[-1].strip(),
        'startTime': event['DTSTART'].dt.strftime('%Y-%m-%dT%H:%M:%S.%f')[:-3],
        'summary': event['SUMMARY']
    }
    for event in calendar.walk('VEVENT')
    if str(event['UID']).split('@')[0].startswith('Race')
]
for event in events:
    print(event['summary'])
with open('events.json', 'w', encoding='utf8') as f:
    json.dump(events, f, ensure_ascii=False, indent=2)
print output:
FORMULA 1 GULF AIR BAHRAIN GRAND PRIX 2021 - Race
FORMULA 1 PIRELLI GRAN PREMIO DEL MADE IN ITALY E DELL'EMILIA ROMAGNA 2021 - Race
FORMULA 1 HEINEKEN GRANDE PRÉMIO DE PORTUGAL 2021 - Race
FORMULA 1 ARAMCO GRAN PREMIO DE ESPAÑA 2021 - Race
FORMULA 1 GRAND PRIX DE MONACO 2021 - Race
FORMULA 1 AZERBAIJAN GRAND PRIX 2021 - Race
FORMULA 1 HEINEKEN GRAND PRIX DU CANADA 2021 - Race
FORMULA 1 EMIRATES GRAND PRIX DE FRANCE 2021 - Race
FORMULA 1 MYWORLD GROSSER PREIS VON ÖSTERREICH 2021 - Race
FORMULA 1 PIRELLI BRITISH GRAND PRIX 2021 - Race
FORMULA 1 MAGYAR NAGYDÍJ 2021 - Race
FORMULA 1 ROLEX BELGIAN GRAND PRIX 2021 - Race
FORMULA 1 HEINEKEN DUTCH GRAND PRIX 2021 - Race
FORMULA 1 HEINEKEN GRAN PREMIO D’ITALIA 2021 - Race
FORMULA 1 VTB RUSSIAN GRAND PRIX 2021 - Race
FORMULA 1 SINGAPORE AIRLINES SINGAPORE GRAND PRIX 2021 - Race
FORMULA 1 JAPANESE GRAND PRIX 2021 - Race
FORMULA 1 ARAMCO UNITED STATES GRAND PRIX 2021 - Race
FORMULA 1 GRAN PREMIO DE LA CIUDAD DE MÉXICO 2021 - Race
FORMULA 1 HEINEKEN GRANDE PRÊMIO DE SÃO PAULO 2021 - Race
FORMULA 1 ROLEX AUSTRALIAN GRAND PRIX 2021 - Race
FORMULA 1 SAUDI ARABIAN GRAND PRIX 2021 - Race
FORMULA 1 ETIHAD AIRWAYS ABU DHABI GRAND PRIX 2021 - Race
events.json:
[
  {
    "id": "1064",
    "startTime": "2021-03-28T16:00:00.000",
    "summary": "FORMULA 1 GULF AIR BAHRAIN GRAND PRIX 2021 - Race"
  },
  {
    "id": "1065",
    "startTime": "2021-04-18T14:00:00.000",
    "summary": "FORMULA 1 PIRELLI GRAN PREMIO DEL MADE IN ITALY E DELL'EMILIA ROMAGNA 2021 - Race"
  },
  {
    "id": "1066",
    "startTime": "2021-05-02T15:00:00.000",
    "summary": "FORMULA 1 HEINEKEN GRANDE PRÉMIO DE PORTUGAL 2021 - Race"
  },
  {
    "id": "1086",
    "startTime": "2021-05-09T14:00:00.000",
    "summary": "FORMULA 1 ARAMCO GRAN PREMIO DE ESPAÑA 2021 - Race"
  },
  {
    "id": "1067",
    "startTime": "2021-05-23T14:00:00.000",
    "summary": "FORMULA 1 GRAND PRIX DE MONACO 2021 - Race"
  },
  {
    "id": "1068",
    "startTime": "2021-06-06T13:00:00.000",
    "summary": "FORMULA 1 AZERBAIJAN GRAND PRIX 2021 - Race"
  },
  {
    "id": "1069",
    "startTime": "2021-06-13T19:00:00.000",
    "summary": "FORMULA 1 HEINEKEN GRAND PRIX DU CANADA 2021 - Race"
  },
  {
    "id": "1070",
    "startTime": "2021-06-27T14:00:00.000",
    "summary": "FORMULA 1 EMIRATES GRAND PRIX DE FRANCE 2021 - Race"
  },
  {
    "id": "1071",
    "startTime": "2021-07-04T14:00:00.000",
    "summary": "FORMULA 1 MYWORLD GROSSER PREIS VON ÖSTERREICH 2021 - Race"
  },
  {
    "id": "1072",
    "startTime": "2021-07-18T15:00:00.000",
    "summary": "FORMULA 1 PIRELLI BRITISH GRAND PRIX 2021 - Race"
  },
  {
    "id": "1073",
    "startTime": "2021-08-01T14:00:00.000",
    "summary": "FORMULA 1 MAGYAR NAGYDÍJ 2021 - Race"
  },
  {
    "id": "1074",
    "startTime": "2021-08-29T14:00:00.000",
    "summary": "FORMULA 1 ROLEX BELGIAN GRAND PRIX 2021 - Race"
  },
  {
    "id": "1075",
    "startTime": "2021-09-05T14:00:00.000",
    "summary": "FORMULA 1 HEINEKEN DUTCH GRAND PRIX 2021 - Race"
  },
  {
    "id": "1076",
    "startTime": "2021-09-12T14:00:00.000",
    "summary": "FORMULA 1 HEINEKEN GRAN PREMIO D’ITALIA 2021 - Race"
  },
  {
    "id": "1077",
    "startTime": "2021-09-26T13:00:00.000",
    "summary": "FORMULA 1 VTB RUSSIAN GRAND PRIX 2021 - Race"
  },
  {
    "id": "1078",
    "startTime": "2021-10-03T13:00:00.000",
    "summary": "FORMULA 1 SINGAPORE AIRLINES SINGAPORE GRAND PRIX 2021 - Race"
  },
  {
    "id": "1079",
    "startTime": "2021-10-10T06:00:00.000",
    "summary": "FORMULA 1 JAPANESE GRAND PRIX 2021 - Race"
  },
  {
    "id": "1080",
    "startTime": "2021-10-24T20:00:00.000",
    "summary": "FORMULA 1 ARAMCO UNITED STATES GRAND PRIX 2021 - Race"
  },
  {
    "id": "1081",
    "startTime": "2021-10-31T19:00:00.000",
    "summary": "FORMULA 1 GRAN PREMIO DE LA CIUDAD DE MÉXICO 2021 - Race"
  },
  {
    "id": "1082",
    "startTime": "2021-11-07T17:00:00.000",
    "summary": "FORMULA 1 HEINEKEN GRANDE PRÊMIO DE SÃO PAULO 2021 - Race"
  },
  {
    "id": "1083",
    "startTime": "2021-11-21T06:00:00.000",
    "summary": "FORMULA 1 ROLEX AUSTRALIAN GRAND PRIX 2021 - Race"
  },
  {
    "id": "1085",
    "startTime": "2021-12-05T16:00:00.000",
    "summary": "FORMULA 1 SAUDI ARABIAN GRAND PRIX 2021 - Race"
  },
  {
    "id": "1084",
    "startTime": "2021-12-12T13:00:00.000",
    "summary": "FORMULA 1 ETIHAD AIRWAYS ABU DHABI GRAND PRIX 2021 - Race"
  }
]
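For anyone wondering where the \u00c3\u0089 in the question comes from, here is a small offline sketch of the likely mechanism. It assumes res.text decoded the UTF-8 bytes as ISO-8859-1 (my guess at what requests fell back to when the response declared no charset; I have not checked the server's headers). The sample string is just a shortened summary, not data fetched from the feed.

```python
import json

# The bytes a server would send for this summary, encoded as UTF-8.
raw = "GRANDE PRÉMIO DE PORTUGAL".encode("utf-8")

# Assumption: the wrong codec is ISO-8859-1. Each two-byte UTF-8
# sequence then becomes two separate characters (mojibake).
wrong = raw.decode("iso-8859-1")

# Decoding with the codec the bytes were actually written in.
right = raw.decode("utf-8")

# json.dumps escapes non-ASCII by default, which makes the damage visible.
print(json.dumps(wrong))   # "GRANDE PR\u00c3\u0089MIO DE PORTUGAL"
print(right)               # GRANDE PRÉMIO DE PORTUGAL

# The damage is reversible as long as the wrong codec was ISO-8859-1.
repaired = wrong.encode("iso-8859-1").decode("utf-8")
```

The escaped form matches what the question shows in events.json, which is why handing the raw bytes to Calendar.from_ical avoids the problem instead of patching it afterwards.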
Upvotes: 1
Reputation: 30153
with open("events.json", mode="w", encoding="utf-8") as f:
    json.dump(events, f, indent=2, ensure_ascii=False)
From the json.dump docs:
If ensure_ascii is true (the default), the output is guaranteed to have all incoming non-ASCII characters escaped. If ensure_ascii is false, these characters will be output as-is.
I passed encoding="utf-8" to open because the default encoding is platform dependent (whatever locale.getpreferredencoding() returns).
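A quick way to see the difference without touching any files is json.dumps on a single record. This is only an illustrative sketch; the event dict below is made up for the demo and is not read from the calendar.

```python
import json

# Hypothetical record, standing in for one entry of the events list.
event = {"summary": "GROSSER PREIS VON ÖSTERREICH"}

# Default behaviour: ensure_ascii=True escapes every non-ASCII character.
escaped = json.dumps(event)

# With ensure_ascii=False the character is written as-is.
readable = json.dumps(event, ensure_ascii=False)

print(escaped)    # {"summary": "GROSSER PREIS VON \u00d6STERREICH"}
print(readable)   # {"summary": "GROSSER PREIS VON ÖSTERREICH"}

# Both forms are valid JSON and load back to the same data.
same = json.loads(escaped) == json.loads(readable) == event
```

So the escaped output is not wrong, just harder to read. ensure_ascii=False only changes how the text is written, not what a JSON parser gets back.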
Upvotes: 1