Add column to Pandas DataFrame created by function?

Question

I have a CSV file containing the following URLs:

1;https://www.one.de 
2;https://www.two.de 
3;https://www.three.de
4;https://www.four.de
5;https://www.five.de

I load it into a pandas DataFrame df:

import pandas as pd
cols = ['nr', 'url']
df = pd.read_csv("listing.csv", sep=';', encoding="utf8", dtype=str, names=cols)

Then I would like to add another column, 'domain_name', holding the domain name for each nr.

def takedn(url):
    m = urlsplit(url)
    return m.netloc.split('.')[-2]

df['domain_name'] = takedn(df['url'].all())
print(df.head())

But it assigns the last domain name to every row.

Output:
  nr                   url domain_name
0  1    https://www.one.de        five
1  2    https://www.two.de        five
2  3  https://www.three.de        five
3  4   https://www.four.de        five
4  5   https://www.five.de        five

I am trying this to learn vectorization, but it does not work the way I expected. The domain_name in the first row should be one, in the second row two, and so on.

Ynjxsjmh · Accepted Answer

To operate on each element of the column, you can use apply(), which calls the function once per value.

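from urllib.parse import urlsplit

def takedn(url):
    # netloc is the host part, e.g. 'www.one.de'; take the label before the TLD
    m = urlsplit(url)
    return m.netloc.split('.')[-2]

# apply() calls takedn once for each value in the 'url' column
df['domain_name'] = df['url'].apply(takedn)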
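
For context, the original attempt fails because df['url'].all() reduces the whole column to a single value, so takedn runs only once and its one result is broadcast to every row. The sketch below is an illustration, not part of the original answer. It rebuilds the sample data inline (an assumption, so that it runs without listing.csv), applies takedn element-wise, and adds a second column, domain_name_str (a hypothetical name), computed with the pandas .str methods as one possible vectorized-style alternative.

```python
from urllib.parse import urlsplit

import pandas as pd

# Sample data from the question, built inline instead of read from listing.csv.
df = pd.DataFrame({
    "nr": ["1", "2", "3", "4", "5"],
    "url": [
        "https://www.one.de",
        "https://www.two.de",
        "https://www.three.de",
        "https://www.four.de",
        "https://www.five.de",
    ],
})


def takedn(url):
    """Return the second-to-last label of the host, e.g. 'one' for www.one.de."""
    return urlsplit(url).netloc.split(".")[-2]


# Element-wise: takedn is called once per URL.
df["domain_name"] = df["url"].apply(takedn)

# Alternative using pandas string methods: pull out the host after '://',
# split it on dots, and take the second-to-last label.
host = df["url"].str.extract(r"://([^/]+)", expand=False)
df["domain_name_str"] = host.str.split(".").str[-2]

print(df)
```

Both columns should agree for URLs shaped like the ones in the question. The regex-based variant assumes every URL has a scheme followed by '://' and a host with at least two dot-separated labels.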
