Reputation: 93
I have a dataframe called products, and one column has the URL that I'm trying to split.
ID      URL
101015  https://defrost/kskswn_sourcefsr_3rt=forst23=rt5s_campaign10=new_york65007
I'm trying to see if it is possible to split the URL into multiple parts in order to understand where people click on the link and which campaign they click on.
The goal is to have everything between the equals signs '=' in its own column.
I tried this code and a bunch of others:
def parse_url(files_1):
    for index, row in files_1.iterrows():
        parsed = urlparse(str(row)).query
        parsed = parse_qs(parsed)
        for k, v in parsed.items():
            df.loc[index, k.strip()] = v[0].strip().lower()
    return files_1

parse_url(files_2['CLICK_URL'])
The goal is to have something like this:
ID COL1 COL2 COL3
101015 https://defrost/kskswn_sourcefsr_campaign_3rt rt5s_campaign10 new_york65007
Upvotes: 1
Views: 599
Reputation: 23099
I'm not 100% sure what your code is doing but, IIUC, you can split your URL
column into a list of lists and iterate over it to assign each piece to a column:
data = df['URL'].str.split('=').tolist()  # split each URL on '=' into a list of lists
for row_label, list_ in zip(df.index, data):  # pair each row label with its pieces
    for number, item in enumerate(list_, start=1):
        df.loc[row_label, f'COL {number}'] = item  # assign only to this row's cell
print(df)
ID URL COL 1 COL 2 COL 3 COL 4
0 101015 https://defrost/kskswn_sourcefsr_3rt=forst23=r... https://defrost/kskswn_sourcefsr_3rt forst23 rt5s_campaign10 new_york65007
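If you have many rows, a vectorised version may be easier to reason about than the loop. The following is only a sketch under a few assumptions: the frame is called df, the column is called URL, and the second row is a made-up URL added purely to show what happens when rows have different numbers of pieces.

```python
import pandas as pd

# Sample frame mirroring the question; the second row is hypothetical
df = pd.DataFrame({
    "ID": [101015, 101016],
    "URL": [
        "https://defrost/kskswn_sourcefsr_3rt=forst23=rt5s_campaign10=new_york65007",
        "https://defrost/abc=def",
    ],
})

# expand=True returns a DataFrame with one column per piece;
# rows with fewer pieces are padded with missing values
parts = df["URL"].str.split("=", expand=True)
parts.columns = [f"COL {i}" for i in range(1, parts.shape[1] + 1)]

# join aligns on the index, so each row keeps its own pieces
result = df.join(parts)
print(result)
```

Because the split is done per row and joined back on the index, one row's pieces cannot overwrite another's.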
Upvotes: 1