Reputation: 149
I'm using Xpath in Google docs to get the text inside <div>
.
I want to save the text inside <div id="job_description">
in one cell of Google doc spreadsheet, but it shows each <div>
in separate cell.
<div id="job_description">
<div>
<strong>
Basic Purpose:
</strong>
<br></br>
</div>
<div>
Work closely with developers, product owners and Q…
<br></br>
</div>
<div>
The Test Analyst is accountable for the developmen…
<br></br>
</div>
<div>
<strong>
Duties and Responsibilities:
</strong>
</div>
<ul>
<li></li>
<li></li>
</ul>
<div>
<strong>
Requirements:
</strong>
<br></br>
</div>
<ul>
<li></li>
<li></li>
</ul>
</div>
Image: https://i.sstatic.net/K0mAY.png
and this is the code I wrote:
=IMPORTXML(E4,"//div[@id='job_description']")
May you help me to put all of the text (including <div>
<ul>
...) inside the <div id="job_description">
in only one cell ?
Upvotes: 1
Views: 24600
Reputation: 5509
you can also use regexreplace to remove the line breaks, with
=REGEXREPLACE(IMPORTXML(E4,"//div[@id='job_description']"),"\n","")
this should wrap it all into one cell for you.
Upvotes: 1
Reputation: 22617
Using JOIN is a good start, but you can make it a single operation.
You did not show the URL to the page you're importing, so I can only give you an example with another page. For instance, if you are importing www.w3.org and looking for a div
where @class='event closed expand_block'
, use
=JOIN(CHAR(10),IMPORTXML("http://www.w3.org/","//div[@class='event closed expand_block']//text()"))
Notice that I also modified the XPath expression: //text()
makes sure only descendant text nodes are retrieved, that is, all the text.
EDIT: Responding to your comment:
May I know what is CHAR(10) referring to?
Yes, of course. CHAR
returns a character and takes a number as input. In the case of CHAR(10)
, a newline character is returned (I assume because of
).
In the formula, CHAR(10)
is used as the first argument of JOIN
, which is the delimiter of the objects that are to be joined.
Upvotes: 6
Reputation: 149
For now I found a solution , I'll put it here so that others can know my answer, but if there is any other solution please let us know
I used JOIN
to put the separate cells (L3:X3)
into one single cell
=Trim(JOIN(" ",L3:X3))
Upvotes: 1