How to get specific item having same class name and attributes

Question

How can I get a specific item when several items share the same class name and attributes?

I need to get these 3 items:

April 14, 2013

580

Fort Pierce, FL


Joined: Apr 14, 2013

Messages: 580

Location: Fort Pierce, FL

Verbal_Kint · Accepted Answer

This is a good starting point:

In [18]: for a in response.css('.extraUserInfo'):
    ...:     print(a.css('*::text').extract())
    ...:     print('\n\n\n')
    ...:
['\n', '\n', '\n', '\n']  # <--this (and other outputs like this) is because there is an extra `extraUserInfo` class block above the desired info block if the user has a user group picture/avatar below their username




['\n', '\n', 'Joined:', '\n', 'Mar 24, 2013', '\n', '\n', '\n', 'Messages:', '\n', '6,747', '\n', '\n']




['\n', '\n', '\n', '\n']




['\n', '\n', 'Joined:', '\n', 'Mar 24, 2013', '\n', '\n', '\n', 'Messages:', '\n', '6,747', '\n', '\n']




['\n', '\n', 'Joined:', '\n', 'Apr 14, 2013', '\n', '\n', '\n', 'Messages:', '\n', '580', '\n', '\n', '\n', 'Location:', '\n', '\n', 'Fort Pierce, FL', '\n', '\n', '\n']




['\n', '\n', 'Joined:', '\n', 'Oct 20, 2012', '\n', '\n', '\n', 'Messages:', '\n', '2,476', '\n', '\n', '\n', 'Location:', '\n', '\n', 'Philadelphia, PA', '\n', '\n', '\n']




['\n', '\n', 'Joined:', '\n', 'Dec 11, 2012', '\n', '\n', '\n', 'Messages:', '\n', '2,938', '\n', '\n', '\n', 'Location:', '\n', '\n', 'Colorado', '\n', '\n', '\n']




['\n', '\n', 'Joined:', '\n', 'Sep 30, 2016', '\n', '\n', '\n', 'Messages:', '\n', '833', '\n', '\n', '\n', 'Location:', '\n', '\n', 'Indiana', '\n', '\n', '\n']


...

There are many ways to approach this, and a little fiddling will get the data formatted to your liking. The approach above is only a starting point because many of the output lists contain nothing but newline characters. That seems to happen because, when a user has a user-group image (like tesla of arizona), the extraUserInfo class is also used to wrap that block of HTML. There will be better ways to group this...
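One way to deal with the whitespace-only lists is to drop them before doing anything else. This is a minimal sketch in plain Python; `has_text` is a hypothetical helper name, and the sample lists are copied from the output shown earlier rather than fetched from the live page.

```python
def has_text(fragments):
    """Return True if any fragment contains something other than whitespace."""
    return any(fragment.strip() for fragment in fragments)


# Two of the lists printed by the loop: an avatar-only block and a real one.
blocks = [
    ['\n', '\n', '\n', '\n'],
    ['\n', '\n', 'Joined:', '\n', 'Mar 24, 2013', '\n', '\n', '\n',
     'Messages:', '\n', '6,747', '\n', '\n'],
]

# Keep only the blocks that actually carry user info.
useful_blocks = [block for block in blocks if has_text(block)]
print(len(useful_blocks))
```

In a spider, the same check could be applied to each `a.css('*::text').extract()` result inside the loop.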

Basically, response.css('.extraUserInfo') collects every block with the class extraUserInfo, which seem to be the blocks holding the user info you're looking for. From there, extract all the underlying text with the ::text pseudo-selector and parse the resulting lists.
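As a rough illustration of the "parse the lists" step, the sketch below turns one extracted list into a dict. `parse_user_info` is a hypothetical name, and the logic assumes that every label ends with a colon and that the fragments following it belong to that label, which holds for the output shown earlier but may not hold for every page.

```python
def parse_user_info(fragments):
    """Pair each 'Label:' fragment with the text fragments that follow it."""
    info = {}
    label = None
    for fragment in fragments:
        # Collapse internal whitespace so 'Fort \nPierce, FL' becomes one line.
        text = " ".join(fragment.split())
        if not text:
            continue  # skip whitespace-only fragments
        if text.endswith(":"):
            label = text[:-1]
            info[label] = ""
        elif label is not None:
            # Join values that are split across several text nodes.
            info[label] = (info[label] + " " + text).strip()
    return info


sample = ['\n', '\n', 'Joined:', '\n', 'Apr 14, 2013', '\n', '\n', '\n',
          'Messages:', '\n', '580', '\n', '\n', '\n',
          'Location:', '\n', '\n', 'Fort Pierce, FL', '\n', '\n', '\n']
print(parse_user_info(sample))
```

An avatar-only block yields an empty dict, so it can be filtered out with a simple truthiness check.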

There is definitely a better way to approach this: if you look carefully at the HTML structure, you can extract the data in a way that leaves you less processing work afterwards. Still, this should get you on the right track. The CSS selector and XPath documentation should be a great help.
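To show what "using the HTML structure" might look like, here is a sketch that pairs each label element with its value element instead of flattening all the text. The page's real markup is not shown in this thread, so the `dl`/`dt`/`dd` layout in `SAMPLE` is an assumption, and the example uses the standard library's `xml.etree.ElementTree` on a well-formed snippet so it runs without Scrapy. In a spider the same idea would be expressed with selectors relative to each `.extraUserInfo` block.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: the dl/dt/dd structure is assumed,
# not taken from the real page.
SAMPLE = """
<div>
  <div class="extraUserInfo"><img src="group.png" /></div>
  <div class="extraUserInfo">
    <dl><dt>Joined:</dt><dd>Apr 14, 2013</dd></dl>
    <dl><dt>Messages:</dt><dd>580</dd></dl>
    <dl><dt>Location:</dt><dd><a href="#">Fort Pierce, FL</a></dd></dl>
  </div>
</div>
"""


def extract_users(markup):
    """Return one dict per extraUserInfo block that holds label/value pairs."""
    root = ET.fromstring(markup.strip())
    users = []
    for block in root.iterfind(".//div[@class='extraUserInfo']"):
        info = {}
        for pair in block.iterfind("dl"):
            label = "".join(pair.find("dt").itertext()).strip().rstrip(":")
            value = " ".join("".join(pair.find("dd").itertext()).split())
            info[label] = value
        if info:  # avatar-only blocks have no pairs and are skipped
            users.append(info)
    return users


print(extract_users(SAMPLE))
```

Because each value is read relative to its own label, no whitespace-only fragments need to be cleaned up afterwards.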
