Reputation: 5907
I have a list of strings:
query_var = ["VENUE_CITY_NAME == 'Bangalore' & EVENT_GENRE == 'ROMANCE' & count_EVENT_GENRE >= 1","VENUE_CITY_NAME == 'Jamshedpur' & EVENT_GENRE == 'HORROR' & count_EVENT_GENRE >= 1"]
len(query_var)  # output: 2
I want to modify this list to get:
query_var = ["df['VENUE_CITY_NAME'] == 'Bangalore' & df['EVENT_GENRE'] == 'ROMANCE' & df['count_EVENT_GENRE'] >= 1","df['VENUE_CITY_NAME'] == 'Jamshedpur' & df['EVENT_GENRE'] == 'HORROR' & df['count_EVENT_GENRE'] >= 1"]
This is my attempt:
import re

for res in query_var:
    res = [x for x in re.split('[&)]', res)]
    print(res)
    res = [x.strip() for x in res]
    print(res)
    res = [d.replace(d.split(' ', 1)[0], "df['" + d.split(' ', 1)[0] + "']") for d in res]
    print(res)
which produces the output:
["VENUE_CITY_NAME == 'Bangalore' ", " EVENT_GENRE == 'ROMANCE' ", ' count_EVENT_GENRE >= 1']
["VENUE_CITY_NAME == 'Bangalore'", "EVENT_GENRE == 'ROMANCE'", 'count_EVENT_GENRE >= 1']
["df['VENUE_CITY_NAME'] == 'Bangalore'", "df['EVENT_GENRE'] == 'ROMANCE'", "df['count_EVENT_GENRE'] >= 1"]
["VENUE_CITY_NAME == 'Jamshedpur' ", " EVENT_GENRE == 'HORROR' ", ' count_EVENT_GENRE >= 1']
["VENUE_CITY_NAME == 'Jamshedpur'", "EVENT_GENRE == 'HORROR'", 'count_EVENT_GENRE >= 1']
["df['VENUE_CITY_NAME'] == 'Jamshedpur'", "df['EVENT_GENRE'] == 'HORROR'", "df['count_EVENT_GENRE'] >= 1"]
This is as expected, but when I print query_var afterwards, it has not changed:
query_var
Out[47]:
["VENUE_CITY_NAME == 'Bangalore' & EVENT_GENRE == 'ROMANCE' & count_EVENT_GENRE >= 1","VENUE_CITY_NAME == 'Jamshedpur' & EVENT_GENRE == 'HORROR' & count_EVENT_GENRE >= 1"]
As you can see, my code does not produce the desired output. Is there a better way, for example with a list comprehension?
Upvotes: 2
Views: 81
Reputation: 78750
Here's a regex/list comprehension solution:
>>> [re.sub(r'(\w+)\s*(==|>=)', r"df['\1'] \2", s) for s in query_var]
["df['VENUE_CITY_NAME'] == 'Bangalore' & df['EVENT_GENRE'] == 'ROMANCE' & df['count_EVENT_GENRE'] >= 1", "df['VENUE_CITY_NAME'] == 'Jamshedpur' & df['EVENT_GENRE'] == 'HORROR' & df['count_EVENT_GENRE'] >= 1"]
Adjust the pattern as needed for more general data, e.g. to also permit '<='.
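As a rough sketch of that kind of adjustment, the operator alternation can be built from a list. The helper name prefix_columns, the OPERATORS tuple and the frame_name parameter are made up for illustration and are not part of the original answer; it also assumes column names consist only of \w characters.

```python
import re

# Assumed set of comparison operators; extend or trim as needed.
# Two-character operators come first so "<=" is not matched as "<".
OPERATORS = ("==", "!=", "<=", ">=", "<", ">")

# One word (the column name), optional whitespace, then any listed operator.
_PATTERN = re.compile(
    r"(\w+)\s*(" + "|".join(re.escape(op) for op in OPERATORS) + r")"
)

def prefix_columns(expr, frame_name="df"):
    """Rewrite every `NAME <op>` in expr as `df['NAME'] <op>`."""
    return _PATTERN.sub(
        lambda m: f"{frame_name}['{m.group(1)}'] {m.group(2)}", expr
    )

query_var = [
    "VENUE_CITY_NAME == 'Bangalore' & EVENT_GENRE == 'ROMANCE' & count_EVENT_GENRE >= 1",
    "VENUE_CITY_NAME == 'Jamshedpur' & EVENT_GENRE == 'HORROR' & count_EVENT_GENRE >= 1",
]
result = [prefix_columns(s) for s in query_var]
```

A quoted value that itself contains something like "x ==" would also be rewritten, so this is only suitable for simple conditions like the ones in the question.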
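Edit in response to the comment (this version wraps each comparison in parentheses, which matters in Python because & binds more tightly than == and >=):
[re.sub(r'(\w+)(\s*(==|>=).*?)(\s*&|$)', r"(df['\1']\2)\4", s) for s in query_var]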
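As for why the loop in the question leaves query_var unchanged: assigning to res inside the for loop only rebinds the loop variable to a new list; it never writes anything back into query_var. A minimal sketch of the same split/strip/prefix idea that does produce a new list is below. The function name add_df_prefix is hypothetical, and it assumes that no quoted value contains '&' and that each condition starts with the column name followed by a space.

```python
query_var = [
    "VENUE_CITY_NAME == 'Bangalore' & EVENT_GENRE == 'ROMANCE' & count_EVENT_GENRE >= 1",
    "VENUE_CITY_NAME == 'Jamshedpur' & EVENT_GENRE == 'HORROR' & count_EVENT_GENRE >= 1",
]

def add_df_prefix(expr):
    """Return expr with the first token of each '&'-separated condition wrapped in df['...']."""
    prefixed = []
    for part in expr.split("&"):
        # Separate the column name from the rest of the condition.
        column, rest = part.strip().split(" ", 1)
        prefixed.append(f"df['{column}'] {rest}")
    # Rejoin the conditions into one string, as in the desired output.
    return " & ".join(prefixed)

# Build a new list and rebind the name, instead of reassigning the loop variable.
query_var = [add_df_prefix(q) for q in query_var]
```

The key difference from the original attempt is the final line: the comprehension's result is assigned to query_var, and the pieces are joined back into a single string per entry.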
Upvotes: 3