user3704597

Reputation: 27

Extract subdomains from strings using split in python

I have a python function that outputs/prints the following:

['CN=*.something1.net', 'CN=*.something2.net', 'CN=*.something4.net', 'CN=something6.net', 'CN=something8.net', 'CN=intranet.something89.net', 'CN=intranet.something111.net', 'OU=PositiveSSL Multi-Domain, CN=something99.net', 'OU=Domain Control Validated, CN=intranet.something66.net',...etc]

I am trying to extract all the sub-domain names between "CN=" and the closing single quotation mark, using the split() method in Python. I've tried split('CN=', 1)[0], but I can't work out how to use it.

What I want to print out:

['something1.net', 'something2.net', 'something4.net', 'intranet.something111.net', 'intranet.something66.net']

Any help would be greatly appreciated :-)

Thanks, MJ

Upvotes: 0

Views: 302

Answers (3)

RoadRunner

Reputation: 26315

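If you just want to strip CN= from each string, remove it as an exact prefix with str.removeprefix() (Python 3.9+). Note that str.lstrip("CN=") would instead strip any leading C, N or = characters, which can eat into a name that starts with one of them:

subdomains = [item.removeprefix("CN=") for item in my_list]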
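
A quick check of how str.lstrip() behaves here: it treats its argument as a set of characters to drop, not as a prefix. The sketch below uses made-up sample names and a hypothetical strip_prefix helper (neither is from the question) to compare the two on any Python 3 version.

```python
def strip_prefix(text, prefix="CN="):
    """Return text without the exact leading prefix, if present (hypothetical helper)."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

# Made-up sample values chosen to expose the difference.
samples = ["CN=something6.net", "CN=Nexus.example.net", "CN=cdn.example.net"]

# lstrip("CN=") drops every leading 'C', 'N' or '=' character, not the prefix.
print([s.lstrip("CN=") for s in samples])
# ['something6.net', 'exus.example.net', 'cdn.example.net']

# The helper removes only the literal "CN=" prefix.
print([strip_prefix(s) for s in samples])
# ['something6.net', 'Nexus.example.net', 'cdn.example.net']
```

For the data in the question both give the same result, since none of the names start with an uppercase C or N.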

Upvotes: 0

yoss

Reputation: 97

Here is a more readable version that extracts the subdomain (the leftmost label) from each entry whose host has the form sub.domain.tld; @tzaman's code returns the full host names, which didn't really give me subdomains.

myDirtyDomains = ['CN=*.something1.net', 'CN=*.something2.net', 'CN=*.something4.net',
                  'CN=something6.net', 'CN=something8.net', 'CN=intranet.something89.net',
                  'CN=intranet.something111.net', 'OU=PositiveSSL Multi-Domain',
                  'CN=something99.net', 'OU=Domain Control Validated', 'CN=intranet.something66.net']

cleanSubDomainsList = []

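for item in myDirtyDomains:
    # Two dots means the entry has the form sub.domain.tld
    countDots = item.count(".")
    if countDots == 2:
        # Everything after 'CN=' is the host name
        host = item.partition('CN=')[2]
        # The leftmost label is the subdomain (e.g. 'intranet' or '*')
        subdomain = host.partition('.')[0]
        cleanSubDomainsList.append(subdomain)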

print(cleanSubDomainsList)
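
Counting dots works for the sample data, but it ties the result to hosts with exactly three labels. A rough alternative sketch splits the host into labels instead. It assumes, naively, that the registered domain is always the last two labels (which does not hold for suffixes such as co.uk), and subdomain_labels is a hypothetical name, not part of the code above.

```python
def subdomain_labels(entry):
    """Return the labels left of the last two, or None if there are none.

    Hypothetical helper; assumes entry looks like 'CN=host' and that the
    registered domain is the last two dot-separated labels.
    """
    prefix, sep, host = entry.partition("CN=")
    if not sep or prefix:          # 'CN=' is missing or not at the very start
        return None
    labels = host.split(".")
    if len(labels) <= 2:           # bare domain such as something6.net
        return None
    return ".".join(labels[:-2])

entries = ["CN=*.something1.net", "CN=something6.net",
           "CN=intranet.something89.net", "OU=Domain Control Validated"]
print([subdomain_labels(e) for e in entries])
# ['*', None, 'intranet', None]
```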

Upvotes: 0

tzaman

Reputation: 47790

The last single quote just marks the end of the string, so it seems you want everything after CN=. Assuming that's the case, you can just chop off the first three characters:

subdomains = [item[3:] for item in my_list if item.startswith('CN=')]
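
Some entries in the question combine several fields in one string, for example 'OU=Domain Control Validated, CN=intranet.something66.net', and a startswith('CN=') filter skips those. Here is a minimal sketch that picks out the CN field wherever it appears, assuming fields are separated by commas and that a leading '*.' wildcard should be dropped. The common_name function and its drop_wildcard parameter are hypothetical names for illustration.

```python
def common_name(entry, drop_wildcard=True):
    """Return the CN value from a 'KEY=value, KEY=value' string, or None."""
    for field in entry.split(","):
        key, sep, value = field.strip().partition("=")
        if sep and key == "CN":
            if drop_wildcard and value.startswith("*."):
                value = value[2:]   # remove the leading '*.'
            return value
    return None

my_list = [
    "CN=*.something1.net",
    "CN=something6.net",
    "OU=PositiveSSL Multi-Domain, CN=something99.net",
    "OU=Domain Control Validated, CN=intranet.something66.net",
]
names = [cn for cn in map(common_name, my_list) if cn is not None]
print(names)
# ['something1.net', 'something6.net', 'something99.net', 'intranet.something66.net']
```

As for the attempt in the question, split('CN=', 1)[0] returns the text before CN=; the value after it is the last element of the result, i.e. split('CN=', 1)[-1].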

Upvotes: 1
