M Faizan Farooq
M Faizan Farooq

Reputation: 369

Import xml for query of Google Sheet

I am getting no data when I apply class using importxml, I have reviewed the source code by using right click page source option and sentencs are there, I used ctrl+f to find the line and tried the xpath as mentioned below but it gives #NA, since new to import XML google sheets query, I need the result that is attached in the snap below for cell E55 the Japanese and English sentence in cell I55 & J55.

These sentences came from the following search the first sentence is used always in Japnese & English, here is the snap

I have tried a few combinations but in shows #NA;

  1. =IMPORTXML("https://tangorin.com/sentences?search=時","//div[@class='s-jp']")

  2. =IMPORTXML("https://tangorin.com/sentences?search=時","//div[@class='entry entry-border sentences undefined ']/dd[@class='s-jp']")

Can anyone please assist as can I cant copy for 2000 letters both Japnese & English sentences which I need for translation class, much appreciated thanks

Upvotes: 1

Views: 280

Answers (1)

E.Wiest
E.Wiest

Reputation: 5915

Output :

JapEng

3 formulas to get the data (url is in cell D2).

For Kana in D4 :

=TRANSPOSE(SPLIT(SUBSTITUTE(TEXTJOIN("",TRUE,IMPORTXML(D2,"//div[@class='results-main-container']//dt//text()[not(parent::rt)]|//dd[@class='s-en']/@class")),"s-en","¤"),"¤"))

For Romaji in E4 :

=ARRAYFORMULA(TRIM(TRANSPOSE(SPLIT(SUBSTITUTE(TEXTJOIN(" ",TRUE,IMPORTXML(D2,"//div[@class='results-main-container']//dt//rt|//dd[@class='s-en']/@class")),"s-en","¤"),"¤"))))

For English in F4 :

=ARRAYFORMULA(TRIM(TRANSPOSE(SPLIT(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(REGEXREPLACE(TEXTJOIN(" ",FALSE,IMPORTXML(D2,"//div[@class='results-main-container']//dd/span|(//dd[@class='s-en'])[not(position()=1)]/@class")),"(\w)(  )(\w)","$1'$3"),"s-en","¤")," , ",", ")," . ",".")," - ","-")," ( "," (")," ) ",") ")," !","!")," ?","?"),"¤"))))

To limit the array to 1 result, you can use something like :

=INDEX(one of the preceding formulas,1,1)

Output :

Index

EDIT : If you need something like this (word in a cell and first example retrieved. /!\ Limit the number of words to search. Each word = 3 IMPORTXML requests. So, for 20 words => 60 requests, leading to a slow sheet.)

Cell

In column B, copy-paste the words to search.

For Kana in cells C3,C4,C5,... the following formula :

=TEXTJOIN("",TRUE,IMPORTXML("https://tangorin.com/sentences?search="&B3,"(//div[@class='results-main-container']//dt)[1]//text()[not(parent::rt)]"))

For Romaji in cells D3,D4,D5,... the following formula :

=TEXTJOIN(" ",TRUE,IMPORTXML("https://tangorin.com/sentences?search="&B3,"(//div[@class='results-main-container']//dt)[1]//rt"))

For English in cells E3,E4,E5,... the following formula :

=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(REGEXREPLACE(TEXTJOIN(" ",FALSE,IMPORTXML("https://tangorin.com/sentences?search="&B3,"(//div[@class='results-main-container']//dd/span)[1]")),"(\w)(  )(\w)","$1'$3"),"s-en","¤")," ,  ",", ")," .",".")," - ","-")," ( "," (")," ) ",") ")," !","!")," ?","?"),". ",".")

Upvotes: 1

Related Questions