How can I select part of a string in a data frame's column according to the following conditions?
If the string contains www, extract the word after the first .
If the string does not contain www, extract the word after //
Example:
Column
https://www.test.com
https://train.co.uk
In the first case I should extract the word after the first full stop, i.e. test; in the second case, I should take the first word after //, i.e. train.
Upvotes: 0
Views: 2111
Reputation: 150825
Another option is to use a regex with a non-capturing group:
df.Column.str.extract(r'//(?:www\.)?([^.]*)')
Output:
       0
0   test
1  train
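To see what each part of the pattern does, here is a minimal sketch that applies the same regex with Python's standard re module instead of pandas. The helper name first_label is made up for illustration and is not part of the original answer.

```python
import re

# Same pattern as in the answer above, written as a raw string.
#   //           the double slash that follows the scheme
#   (?:www\.)?   an optional "www." prefix, matched but not captured
#   ([^.]*)      capture everything up to the next full stop
PATTERN = re.compile(r"//(?:www\.)?([^.]*)")


def first_label(url):
    """Hypothetical helper: return the captured word, or None if there is no '//'."""
    match = PATTERN.search(url)
    return match.group(1) if match else None


urls = ["https://www.test.com", "https://train.co.uk"]
results = [first_label(u) for u in urls]
print(results)  # ['test', 'train']
```

Note that the pattern only skips a leading www., so any other subdomain is returned as-is: http://www.blog.example.com yields blog, not example.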
Upvotes: 1
Reputation: 323396
This tries to extract the domain using the tldextract package:
import pandas as pd
import tldextract
df['domain'] = df.Column.map(lambda x: tldextract.extract(x).domain)
Upvotes: 1