Reputation: 21333
I'm parsing HTML and I need to get only the tags that match a selector like div.content.
For parsing I'm using HTMLParser. So far I've managed to get the list of a tag's attributes.
It looks something like this:
[('class', 'content'), ('title', 'source')]
The problem is that I don't know how to check that the first element of a tuple is class, and that the second element is content.
I know this is an easy question, but I'm quite new to Python as well. Thanks in advance for any advice!
Upvotes: 4
Views: 7064
Reputation: 1077
To check whether any tuple has a given value in a given position, you could use the filter function. (In Python 3, filter returns a lazy iterator that is always truthy, so wrap it in list() before testing it.)
tuples_list = [('class', 'content'), ('title', 'source')]
if list(filter(lambda a: a[0] == 'class', tuples_list)):
    pass  # your code goes here
if list(filter(lambda a: a[1] == 'content', tuples_list)):
    pass  # your code goes here
The filter gives you all tuples that match your condition:
values = list(filter(lambda a: a[1] == 'content', tuples_list))
# values == [('class', 'content')]
If you are sure that both values are in the same tuple:
if ('class', 'content') in tuples_list:
    pass  # your code goes here
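If attribute names don't repeat within a tag, another option is to turn the list of pairs into a dict and look the name up directly. This is only a sketch: has_class and is_content are names made up for illustration, and if a name does appear twice, dict() keeps the last value.

```python
tuples_list = [('class', 'content'), ('title', 'source')]

# Build a name -> value mapping; if a name repeats, the last value wins.
attrs = dict(tuples_list)

# Is there a 'class' attribute at all?
has_class = 'class' in attrs

# Is its value exactly 'content'? .get() returns None when the key is missing.
is_content = attrs.get('class') == 'content'
```

This reads a bit more naturally than indexing into tuples when you need to check several attributes of the same tag.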
Upvotes: 1
Reputation: 15944
It's worth noting that HTML 'class' attributes are allowed to be a space-separated list of CSS classes. E.g., you can do <span class='green big'>...</span>. It sounds like what you really want to know is whether a given HTML element has a specific CSS class (given a list of (attribute, value) pairs). In that case, I would use something like this:
element_attributes = [('class', 'content'), ('title', 'source')]
is_content = any(attr == 'class' and 'content' in val.split()
                 for (attr, val) in element_attributes)
Of course, if you know for certain that all elements you care about will have only one CSS class, then sr2222's answer is better/simpler.
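To tie this back to the div.content selector in the question, here is a rough sketch of how the check could sit inside a parser. It assumes Python 3 (where the module is html.parser; in Python 2 it is HTMLParser), and ContentDivFinder and matches are hypothetical names, not anything from the question.

```python
from html.parser import HTMLParser


class ContentDivFinder(HTMLParser):
    """Collects the attribute lists of <div> tags that carry the CSS class 'content'."""

    def __init__(self):
        super().__init__()
        self.matches = []  # hypothetical attribute, used only in this sketch

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for an
        # attribute written without a value, hence the `or ''`.
        has_class = any(name == 'class' and 'content' in (value or '').split()
                        for name, value in attrs)
        if tag == 'div' and has_class:
            self.matches.append(attrs)


finder = ContentDivFinder()
finder.feed('<div class="content big" title="source">x</div>'
            '<span class="content">y</span>')
finder.close()
```

After this runs, finder.matches holds only the attribute list of the div; the span is skipped because the tag name doesn't match, even though it has the same class.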
Upvotes: 2
Reputation: 4448
Try this:
l = [('class', 'content'), ('title', 'source')]
check = False
for item in l:
    if item[0] == 'class':
        check = True
        print(item[1])
print("List has a tuple whose first element is 'class': %s" % check)
Upvotes: 0
Reputation: 3889
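1st question)
def first_is_class(attrs):
    # True if the list is non-empty and its first tuple starts with 'class'
    return len(attrs) > 0 and attrs[0][0] == 'class'
2nd question)
def any_value_is_content(attrs):
    for elem in attrs:
        if elem[1] == 'content':
            return True
    return False
Note: the function names are just placeholders, and attrs is your list of tuples. As I understood it, the 2nd question means you want True if ANY tuple has 'content' as its second value.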
Upvotes: 0
Reputation: 26160
When looping through your elements:
if ('class', 'content') in element_attributes:
    pass  # do stuff
Upvotes: 9
Reputation: 213005
l = [('class', 'content'), ('title', 'source')]
('class', 'content') in l
returns True, because there is at least one tuple with 'class' as its first element and 'content' as its second.
You can now use it:
if ('class', 'content') in l:
    pass  # do something
Upvotes: 2