Digital Farmer

Reputation: 2127

How do I add a not(contains()) condition to my XPath to exclude elements that contain an <img>?

This is an example of the page markup (https://futebolnatv.com.br/jogos-hoje/):

<tbody>
       <tr>
           <td>
               <div>
                    <b>
                       <i>
                          <img width="15" height="15" src="https://www.examplestack.com.br/static/paises/1565304830_h1fhplmdg9ahxzhygqvvpg_96x96.png"
                          > Campeonato Alemão - 2ª Divisão </i></b>
               <div>
               <div>
               <style>
               <xyx>
                    <xyx>
                         <b> ONEFOOTBALL (APP) </b>

If I use this XPath:

=IMPORTXML("https://futebolnatv.com.br/jogos-hoje/","//tbody/tr/td//b")

I will retrieve two values:

Campeonato Alemão - 2ª Divisão
ONEFOOTBALL (APP)

To get only the ONEFOOTBALL (APP) value, I tried these XPath expressions:

=IMPORTXML("https://futebolnatv.com.br/jogos-hoje/","//tbody/tr/td//b[not(contains(#src,'www.'))]")
=IMPORTXML("https://futebolnatv.com.br/jogos-hoje/","//tbody/tr/td//b[not(contains(@i,''))]")
=IMPORTXML("https://futebolnatv.com.br/jogos-hoje/","//tbody/tr/td//b[not(contains(@img,''))]")

I can't build the path around <xyx> because that tag name changes on every page; it could be <xwtrui> or some other random name.

How can I collect the ONEFOOTBALL (APP) value from the second <b> without also picking up the value of the first one, which in this example is Campeonato Alemão - 2ª Divisão?

Upvotes: 1

Views: 28

Answers (1)

player0

Reputation: 1

Try this (A16 is expected to contain the page URL):

=INDEX(REGEXREPLACE(SPLIT(INDEX(IMPORTHTML(A16, "table", 1),,2), CHAR(10)), " \*|\* |\*", ),,4)
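For the XPath route the question asks about, a predicate that tests for a descendant element rather than an attribute may be enough. This is a sketch based only on the markup quoted in the question, not verified against the live page. In the attempts above, `@i` and `@img` look for attributes named `i` and `img` on the `<b>` itself, `#src` is not valid XPath syntax, and `contains(..., '')` is always true, so `not(contains(..., ''))` can never match. `not(.//img)` instead keeps only the `<b>` elements that have no `<img>` anywhere inside them:

```
=IMPORTXML("https://futebolnatv.com.br/jogos-hoje/","//tbody/tr/td//b[not(.//img)]")
```

With the sample markup this should skip the first `<b>` (which wraps the `<img>`) and return ONEFOOTBALL (APP). It does not depend on the random `<xyx>` tag names, because `//b` matches at any depth under `td`.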

Upvotes: 1
