cesco

Reputation: 93

dataframe - extracting URL in pandas and creating new columns out of it

I have a dataframe called products, and one column holds a URL that I'm trying to split.

  ID        URL
  101015    https://defrost/kskswn_sourcefsr_3rt=forst23=rt5s_campaign10=new_york65007

I'm trying to see if it is possible to split the URL into multiple parts in order to understand where people click on the link and which campaign they click on.

The goal is to put each piece between the equals signs ('=') in its own column.

I tried this code and a bunch of other approaches:

 from urllib.parse import urlparse, parse_qs

 def parse_url(files_1):
     for index, row in files_1.iterrows():
         parsed = urlparse(str(row)).query
         parsed = parse_qs(parsed)
         for k, v in parsed.items():
             df.loc[index, k.strip()] = v[0].strip().lower()
     return files_1

 parse_url(files_2['CLICK_URL'])

The goal is to have something like this:

ID       COL1                                            COL2              COL3
101015   https://defrost/kskswn_sourcefsr_campaign_3rt   rt5s_campaign10   new_york65007

Upvotes: 1

Views: 599

Answers (1)

Umar.H

Reputation: 23099

I'm not 100% sure what your code is doing, but IIUC, you can split your URL column into a list of lists and iterate over it to assign the pieces to columns:

data = df['URL'].str.split('=').tolist() # pass urls to list with split method.

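Iterate over the list with enumerate, writing each piece into the row it came from:

for row_label, list_ in zip(df.index, data): # access individual lists (one per row)
    for number, item in enumerate(list_, start=1):
        df.loc[row_label, f'COL {number}'] = item # assign to this row's column only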

Result:

print(df)
       ID                                                URL                                 COL 1    COL 2            COL 3          COL 4
0  101015  https://defrost/kskswn_sourcefsr_3rt=forst23=r...  https://defrost/kskswn_sourcefsr_3rt  forst23  rt5s_campaign10  new_york65007
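
If the frame has many rows, a vectorized variant may be simpler than looping. The following is only a sketch under assumptions: it rebuilds a small sample frame like the one in the question, and the second row's URL is made up purely to show how rows with fewer '=' signs get padded. It relies on the expand=True option of str.split, which returns the pieces as a DataFrame instead of lists.

```python
import pandas as pd

# Sample data modeled on the question. The second URL is hypothetical,
# added only to show the padding behavior for shorter rows.
df = pd.DataFrame({
    "ID": [101015, 101016],
    "URL": [
        "https://defrost/kskswn_sourcefsr_3rt=forst23=rt5s_campaign10=new_york65007",
        "https://defrost/example_3rt=abc",
    ],
})

# expand=True yields one column per piece; shorter rows are filled with missing values.
parts = df["URL"].str.split("=", expand=True)

# Rename the integer column labels 0, 1, 2, ... to COL 1, COL 2, COL 3, ...
parts.columns = [f"COL {i}" for i in range(1, parts.shape[1] + 1)]

# Attach the new columns to the original frame; join aligns on the index.
result = df.join(parts)
print(result)
```

Each row keeps its own pieces here, and the number of COL columns follows the URL with the most '=' signs.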

Upvotes: 1
