Find unknown tag containing given text

Question

My HTML looks like this:


  
<div>
  <span class="dfsdf">mytext</span>
</div>
<div>
  <h1> some random text</h1>
</div>

I want to find every tag whose text contains "text", along with its classes. In this case, I want:

span, "dfsdf"
h1, null

Next, I want to be able to navigate from the returned tags. For example, I want to find the parent div of each returned tag and that div's classes.

If I execute the following:

soupx.find_all(text=re.compile(".*text.*"))

it returns only the matched strings, not the tags that contain them:

['mytext', ' some random text']

Please help.

Jack Fleeting · Accepted Answer

You are probably looking for something along these lines:

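import re

# soup is the already-parsed BeautifulSoup document
ts = soup.find_all(string=re.compile("text"))  # string= replaces the deprecated text= argument
for t in ts:
    tag = t.parent              # the tag that directly contains the matched string
    classes = tag.get("class")  # list of class names, or None if the tag has no class
    print(tag.name, " ".join(classes) if classes else "null")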

Output:

span dfsdf
h1 null
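
For the second part of the question (navigating from each match up to its enclosing div), something like the following may work. This is only a sketch: the sample markup is an assumption reconstructed from the expected output in the question, the class `outer` on the first div is hypothetical and added so the parent lookup has something to show, and `tags_containing` is a helper name made up for illustration. It relies on `find_parent`, which walks up the tree to the nearest ancestor with the given tag name.

```python
import re
from bs4 import BeautifulSoup

# Assumed sample markup; the class "outer" is hypothetical.
html = """
<div class="outer">
  <span class="dfsdf">mytext</span>
</div>
<div>
  <h1> some random text</h1>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def tags_containing(soup, pattern):
    """Return (tag, classes, nearest div ancestor) for each string matching pattern."""
    results = []
    for text_node in soup.find_all(string=re.compile(pattern)):
        tag = text_node.parent           # tag that directly holds the matched string
        div = tag.find_parent("div")     # nearest enclosing <div>, or None if there is none
        results.append((tag, tag.get("class"), div))
    return results


for tag, classes, div in tags_containing(soup, "text"):
    div_name = div.name if div is not None else None
    div_classes = div.get("class") if div is not None else None
    print(tag.name, classes, "->", div_name, div_classes)
```

Because each result keeps the actual `Tag` objects rather than just the strings, you can keep navigating from them (`.parent`, `.find_parent(...)`, `.find_next_sibling(...)`, and so on). Note that `get("class")` returns a list of class names, or `None` when the attribute is absent.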
