Reputation: 2140
I would like to split the following tag <b size=5 alt=ref>
as follows:
Open tag: b
Parm: size=5
Parm: alt=ref
I tried the following code to split the tag into groups, but it didn't work:
temp = '<b size=5 alt=ref>'
matchObj = re.search(r"(\S*)\s*(\S*)", temp)
print 'Open tag: ' + matchObj.groups()
My plan is to split the tag into groups, then print the first group as the open tag and the rest as Parm. Can you suggest an approach that would help me solve this problem?
Note that I read the tags from an HTML file; here I show only one example of an open tag and the part of the code that I am stuck on.
Thanks
Upvotes: 0
Views: 215
Reputation: 1852
>>> import re
>>> temp = '<b size=5 alt=ref>'
>>> resList = re.findall(r"\S+", temp.replace("<","").replace(">",""))
>>> myDict = {}
>>> myDict["Open tag:"] = [resList[0]]
>>> myDict["Parm:"] = resList[1:]
>>> myDict
{'Open tag:': ['b'], 'Parm:': ['size=5', 'alt=ref']}
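To get the exact output the question asks for, the same idea can be wrapped in a small function. This is only a minimal sketch: `split_tag` is a hypothetical helper name, and it assumes a non-empty tag whose attribute values are unquoted and contain no spaces. As a side note, the original `print` line fails because `matchObj.groups()` returns a tuple, and a tuple cannot be concatenated to a string with `+`.

```python
import re

def split_tag(tag):
    """Split an open tag such as '<b size=5 alt=ref>' into (name, params).

    Hypothetical helper: assumes a non-empty tag with unquoted,
    space-free attribute values.
    """
    # Drop the surrounding angle brackets, then collect every run of
    # non-whitespace characters.
    parts = re.findall(r"\S+", tag.strip("<>"))
    # The first piece is the tag name, the rest are the parameters.
    return parts[0], parts[1:]

name, params = split_tag('<b size=5 alt=ref>')
print('Open tag: ' + name)
for parm in params:
    print('Parm: ' + parm)
```

Because the parameters come back as a list, this works for any number of them, not just two.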
Upvotes: 0
Reputation: 5658
import re

# One label per expected piece: the tag name, then two parameters.
tag_names = ["Open tag:", "Parm:", "Parm:"]
# Split on '<', '>' and spaces, then drop the empty strings at
# the start and at the end of the resulting list.
tags = re.split(r'[<> ]', '<b size=5 alt=ref>')[1:-1]
# Pair each label with the corresponding piece of the tag.
print(list(zip(tag_names, tags)))
[('Open tag:', 'b'), ('Parm:', 'size=5'), ('Parm:', 'alt=ref')]
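Since the tags are read from an HTML file, the standard library's `html.parser` module may be a sturdier alternative to splitting with a regex, because it also copes with quoted values and attributes that have no value. The following is a sketch under that assumption, not a drop-in replacement for the code above; the class name `OpenTagPrinter` and its `lines` attribute are made up for illustration. Note that `HTMLParser` lowercases tag and attribute names.

```python
from html.parser import HTMLParser

class OpenTagPrinter(HTMLParser):
    """Collect 'Open tag' / 'Parm' lines for every start tag fed in."""

    def __init__(self):
        super().__init__()
        self.lines = []  # hypothetical attribute holding the output lines

    def handle_starttag(self, tag, attrs):
        # Called once per opening tag; attrs is a list of (name, value) pairs.
        self.lines.append('Open tag: ' + tag)
        for name, value in attrs:
            # value is None for attributes written without '=value'.
            self.lines.append('Parm: ' + name + ('' if value is None else '=' + value))

parser = OpenTagPrinter()
parser.feed('<b size=5 alt=ref>')
parser.close()
print('\n'.join(parser.lines))
```

For a whole file, the file's contents could be passed to `feed()` instead of the single example tag.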
Upvotes: 2