Kaggle

Reputation: 123

How to add columns to pandas dataframe based on dictionary keys?

I have a pandas DataFrame with 12 columns. One column is a user-agent string, from which I want to extract information such as the OS and browser and add new columns to the DataFrame based on those values. The platform column does not exist in the current DataFrame, and I want to add it in place.

a   b  c       useragent
1   3  5   "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)"

a   b  c       useragent              os        platform
1   3  5       same as before         windows    Null

for i in range(len(df["useragent"])):
    try :
        df['platform'].iloc[i] = httpagentparser.detect(df["useragent"].iloc[i])['platform']['name']
    except :
        continue

I want to add os and platform columns to the DataFrame based on the values returned by the parser. The first problem is that the assignment inside the try block is never executed. I put the assignment in a try block because the dictionaries returned by the parser do not always have the same keys. For example, if the key os is missing from the returned dictionary, the new os column should be Null for that row. How can I do the whole process efficiently?

Upvotes: 0

Views: 415

Answers (1)

fernandezcuesta

Reputation: 2448

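The assignment never takes effect for two reasons. The platform column does not exist yet, so df['platform'] raises a KeyError, and the bare try/except silently swallows it. Even with the column in place, df['platform'].iloc[i] = ... is chained assignment, which may set the value on a copy of a slice of the DataFrame rather than on the DataFrame itself.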

You can safely do it in a single statement for all rows of your DataFrame; the chained .get calls return None when the platform key is missing:

df['platform'] = df.apply(
    lambda k: httpagentparser.detect(k['useragent']).get('platform', {}).get('name'),
    axis=1
)
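
If you also need the os column, the same .get pattern can be wrapped in a small helper. The following is only a minimal sketch: extract_names is a hypothetical helper name, and the sample dictionary is hand-written to mimic the nested {'platform': {'name': ...}} shape used in the question, not real parser output.

```python
def extract_names(parsed, fields=("os", "platform")):
    """Return {field: name} for each requested field, or None when missing.

    `parsed` is assumed to be a dict shaped like {"os": {"name": ...}, ...},
    matching the ['platform']['name'] access in the question.
    """
    # `or {}` covers both a missing key and a key whose value is None.
    return {field: (parsed.get(field) or {}).get("name") for field in fields}


# Hand-written stand-in for a parser result that has no "platform" key.
sample = {"os": {"name": "Windows"}, "browser": {"name": "MSIE"}}
names = extract_names(sample)
print(names)
```

To fill both columns at once, one option is to build one such dictionary per row, for example records = [extract_names(httpagentparser.detect(ua)) for ua in df['useragent']], and then join them back with df = df.join(pd.DataFrame(records, index=df.index)). Rows where a key is missing get None instead of raising an exception.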

Upvotes: 1
