Reputation: 6799
I have this string coming in from an HTTP request:
s = "{'id': 81, 'udate': datetime.datetime(2021, 2, 3, 7, 20, 5, 369376, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None)), 'cdate': datetime.datetime(2021, 3, 11, 9, 50, 0, 984521, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None)), 'screen_name': 'Hellas Utrecht', 'follower_id': '310489102', 'is_unfollow': True, 'user_id': 8, 'follower_description': 'Atletiekvereniging Hellas Utrecht heeft zo’n 1600 leden, verdeeld over de afd. jeugd-, weg- en baanatletiek, recreatie en triathlon.', 'follower_favourites_count': '675', 'follower_followers_count': '741', 'follower_listed_count': '9', 'follower_location': 'Utrecht', 'follower_screen_name': 'HellasUtrecht', 'follower_statuses_count': '904'}"
I need to convert it to a dictionary, but the values of the keys udate and cdate prevent this: they are written as function calls, i.e. datetime.datetime(2021, 2, 3, 7, 20, 5, 369376, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None)), which makes ast.literal_eval throw the error malformed node or string: <_ast.Call object at 0x109346890>.
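For reference, the error can be reproduced with a much shorter string. This is only a minimal sketch: the sample value below is made up for illustration. ast.literal_eval accepts literals only (strings, numbers, tuples, lists, dicts, sets, booleans, None), so any function call inside the string raises a ValueError.

```python
import ast

# Shortened, made-up sample: the value of 'udate' is a call, not a literal.
sample = "{'id': 81, 'udate': datetime.datetime(2021, 2, 3)}"

error = None
try:
    ast.literal_eval(sample)
except ValueError as exc:
    # literal_eval refuses the ast.Call node produced by datetime.datetime(...)
    error = str(exc)
    print(error)
```

The exact wording after "malformed node or string" differs between Python versions, but the cause is the same call node.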
Currently, my solution is to convert the string manually by slicing off its leading part (but this also drops the id key):
import ast
ast.literal_eval('{'+s[250:])
>>>{'screen_name': 'Hellas Utrecht', 'follower_id': '310489102', 'is_unfollow': True, 'user_id': 8, 'follower_description': 'Atletiekvereniging Hellas Utrecht heeft zo’n 1600 leden, verdeeld over de afd. jeugd-, weg- en baanatletiek, recreatie en triathlon.', 'follower_favourites_count': '675', 'follower_followers_count': '741', 'follower_listed_count': '9', 'follower_location': 'Utrecht', 'follower_screen_name': 'HellasUtrecht', 'follower_statuses_count': '904'}
But I am wondering whether there is a better way to do this using regex. I just need the keys udate and cdate, together with their values, to be removed.
Expected output:
"{'id': 81, 'screen_name': 'Hellas Utrecht', 'follower_id': '310489102', 'is_unfollow': True, 'user_id': 8, 'follower_description': 'Atletiekvereniging Hellas Utrecht heeft zo’n 1600 leden, verdeeld over de afd. jeugd-, weg- en baanatletiek, recreatie en triathlon.', 'follower_favourites_count': '675', 'follower_followers_count': '741', 'follower_listed_count': '9', 'follower_location': 'Utrecht', 'follower_screen_name': 'HellasUtrecht', 'follower_statuses_count': '904'}"
Upvotes: 2
Views: 56
Reputation: 626738
You can remove these two keys with their values using
s = re.sub(r"'[uc]date':\s*datetime\.datetime\([^()]+\([^()]*\)\)\s*,?", '', s)
Regex details:
'[uc]date': - 'udate': or 'cdate':
\s* - zero or more whitespace characters
datetime\.datetime\( - a literal datetime.datetime( string
[^()]+ - one or more chars other than ( and )
\( - a ( char
[^()]* - zero or more chars other than ( and )
\)\) - a )) string
\s* - zero or more whitespace characters
,? - an optional comma.
Then, you can use ast.literal_eval on the result:
import re, ast
s = "{'id': 81, 'udate': datetime.datetime(2021, 2, 3, 7, 20, 5, 369376, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None)), 'cdate': datetime.datetime(2021, 3, 11, 9, 50, 0, 984521, tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None)), 'screen_name': 'Hellas Utrecht', 'follower_id': '310489102', 'is_unfollow': True, 'user_id': 8, 'follower_description': 'Atletiekvereniging Hellas Utrecht heeft zo’n 1600 leden, verdeeld over de afd. jeugd-, weg- en baanatletiek, recreatie en triathlon.', 'follower_favourites_count': '675', 'follower_followers_count': '741', 'follower_listed_count': '9', 'follower_location': 'Utrecht', 'follower_screen_name': 'HellasUtrecht', 'follower_statuses_count': '904'}"
s = re.sub(r"'[uc]date':\s*datetime\.datetime\([^()]+\([^()]*\)\)\s*,?", '', s)
print( ast.literal_eval(s) )
=> {'id': 81, 'screen_name': 'Hellas Utrecht', 'follower_id': '310489102', 'is_unfollow': True, 'user_id': 8, 'follower_description': 'Atletiekvereniging Hellas Utrecht heeft zo’n 1600 leden, verdeeld over de afd. jeugd-, weg- en baanatletiek, recreatie en triathlon.', 'follower_favourites_count': '675', 'follower_followers_count': '741', 'follower_listed_count': '9', 'follower_location': 'Utrecht', 'follower_screen_name': 'HellasUtrecht', 'follower_statuses_count': '904'}
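If the set of keys varies, the same idea can be wrapped in a small helper. The following is only a sketch: the function name drop_datetime_keys and its keys parameter are hypothetical, and the pattern is a slightly relaxed variant of the one above, so that a datetime.datetime(...) call without a nested tzinfo=...(...) call is matched as well. It still assumes at most one level of nested parentheses inside the call.

```python
import ast
import re


def drop_datetime_keys(s, keys=("udate", "cdate")):
    """Remove 'key': datetime.datetime(...) pairs for the given keys (hypothetical helper)."""
    names = "|".join(re.escape(k) for k in keys)
    # (?:[^()]|\([^()]*\))* consumes the arguments, allowing one level
    # of nested parentheses such as FixedOffsetTimezone(...).
    pattern = rf"'(?:{names})':\s*datetime\.datetime\((?:[^()]|\([^()]*\))*\)\s*,?"
    return re.sub(pattern, "", s)


# Shortened sample: one value with a nested tzinfo call, one without.
s = ("{'id': 81, 'udate': datetime.datetime(2021, 2, 3, 7, 20, 5, 369376, "
     "tzinfo=psycopg2.tz.FixedOffsetTimezone(offset=0, name=None)), "
     "'cdate': datetime.datetime(2021, 3, 11), 'user_id': 8}")

cleaned = drop_datetime_keys(s)
print(ast.literal_eval(cleaned))  # {'id': 81, 'user_id': 8}
```

A removed key at the end of the dictionary leaves a trailing comma behind, which Python still accepts in a dict literal, so ast.literal_eval keeps working in that case.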
Upvotes: 1