Robopatchers

Reputation: 11

Using XPATH to find text that is in a div within a div

<?xml version="1.0" encoding="UTF-8"?>

<div id="app" class="grid bg-local font-body justify-center" style="background-image: url(&quot;/img/picture.jpg&quot;);"> 
  <div data-v-f7g8b83d=" " data-fruit-code="**I WANT TO GET WHAT'S IN HERE**" class="note relative bg-background items-center select-none w-56 sm:w-64 pb-4" style="transform: rotate(6deg);"/>  
  <div data-v-f7g8b83d=" " data-fruit-code="**I WANT TO GET WHAT'S IN HERE 1**" class="note relative bg-background items-center select-none w-56 sm:w-64 pb-4" style="transform: rotate(6deg);"/>  
  <div data-v-f7g8b83d=" " data-fruit-code="**I WANT TO GET WHAT'S IN HERE 2**" class="note relative bg-background items-center select-none w-56 sm:w-64 pb-4" style="transform: rotate(6deg);"/>  
  <div data-v-f7g8b83d=" " data-fruit-code="**I WANT TO GET WHAT'S IN HERE 3**" class="note relative bg-background items-center select-none w-56 sm:w-64 pb-4" style="transform: rotate(6deg);"/>  
  <div data-v-f7g8b83d=" " data-fruit-code="**I WANT TO GET WHAT'S IN HERE 4**" class="note relative bg-background items-center select-none w-56 sm:w-64 pb-4" style="transform: rotate(6deg);"/> 
</div>

I'm trying to build a bot to scrape a specific website. I want to be able to get the text that's associated with "data-fruit-code".

I came up with this:

//*[@id="app"]/div[2]/div

and this:

//*[@data-fruit-code]

However, both only highlighted the entire div. I feel like I am missing something here. What can I add or how can I fix my existing XPATH command so that it only gets the "data-fruit-code" text?

I tried adding text() & word() but those did not work for me either.

Here are some of the references I used to get help with this.

https://devhints.io/xpath#class-check

https://developer.mozilla.org/en-US/docs/Web/XPath

Upvotes: 0

Views: 1905

Answers (2)

kjhughes

Reputation: 111726

Note that data-fruit-code is an attribute, not text, and an attribute is selected in XPath by prefixing its name with @.

There are lots of ways to select the targeted attributes. Here are two interesting possibilities:

  1. This XPath,

    //@data-fruit-code
    

    will select all of the data-fruit-code attributes in the document.

  2. This XPath,

    //div[@id="app"]/div/@data-fruit-code
    

    will select all data-fruit-code attributes on div elements whose parent is a div with an id attribute value of app.
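
For a scraping bot, the same idea can be sketched in Python. This is only a minimal illustration under some assumptions: the markup is a trimmed-down, well-formed copy of the question's structure, the attribute values ("apple", "banana") and the helper name fruit_codes are made up, and it uses the standard library's xml.etree.ElementTree. That module supports only a subset of XPath and cannot return attribute nodes, so an expression like //@data-fruit-code is not available there; instead the sketch selects the div elements that carry the attribute and reads the value from each one.

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for the markup in the question.
# The attribute values here are hypothetical placeholders.
SAMPLE = """
<div id="app">
  <div data-fruit-code="apple" class="note"/>
  <div data-fruit-code="banana" class="note"/>
  <div class="note"/>
</div>
"""


def fruit_codes(xml_text):
    """Return the data-fruit-code value of every div that has one."""
    root = ET.fromstring(xml_text)
    # ElementTree's XPath subset can filter on an attribute but cannot
    # select the attribute itself, so read it from each matched element.
    matches = root.findall(".//div[@data-fruit-code]")
    return [el.get("data-fruit-code") for el in matches]


if __name__ == "__main__":
    for code in fruit_codes(SAMPLE):
        print(code)
```

A library with full XPath 1.0 support can evaluate the attribute expressions above directly; the element-then-attribute approach is just what works with the standard library alone.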

Upvotes: 1

Jack Fleeting

Reputation: 24940

Try

//div[@data-fruit-code]/@data-fruit-code

Output

**I WANT TO GET WHAT'S IN HERE**
**I WANT TO GET WHAT'S IN HERE 1**
**I WANT TO GET WHAT'S IN HERE 2**
**I WANT TO GET WHAT'S IN HERE 3**
**I WANT TO GET WHAT'S IN HERE 4**

Upvotes: 0
