Reputation: 369
I am getting no data when I apply class using importxml, I have reviewed the source code by using right click page source option and sentencs are there, I used ctrl+f to find the line and tried the xpath as mentioned below but it gives #NA, since new to import XML google sheets query, I need the result that is attached in the snap below for cell E55 the Japanese and English sentence in cell I55 & J55.
These sentences came from the following search the first sentence is used always in Japnese & English, here is the snap
I have tried a few combinations but in shows #NA;
=IMPORTXML("https://tangorin.com/sentences?search=時","//div[@class='s-jp']")
=IMPORTXML("https://tangorin.com/sentences?search=時","//div[@class='entry entry-border sentences undefined ']/dd[@class='s-jp']")
Can anyone please assist as can I cant copy for 2000 letters both Japnese & English sentences which I need for translation class, much appreciated thanks
Upvotes: 1
Views: 280
Reputation: 5915
Output :
3 formulas to get the data (url is in cell D2).
For Kana in D4 :
=TRANSPOSE(SPLIT(SUBSTITUTE(TEXTJOIN("",TRUE,IMPORTXML(D2,"//div[@class='results-main-container']//dt//text()[not(parent::rt)]|//dd[@class='s-en']/@class")),"s-en","¤"),"¤"))
For Romaji in E4 :
=ARRAYFORMULA(TRIM(TRANSPOSE(SPLIT(SUBSTITUTE(TEXTJOIN(" ",TRUE,IMPORTXML(D2,"//div[@class='results-main-container']//dt//rt|//dd[@class='s-en']/@class")),"s-en","¤"),"¤"))))
For English in F4 :
=ARRAYFORMULA(TRIM(TRANSPOSE(SPLIT(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(REGEXREPLACE(TEXTJOIN(" ",FALSE,IMPORTXML(D2,"//div[@class='results-main-container']//dd/span|(//dd[@class='s-en'])[not(position()=1)]/@class")),"(\w)( )(\w)","$1'$3"),"s-en","¤")," , ",", ")," . ",".")," - ","-")," ( "," (")," ) ",") ")," !","!")," ?","?"),"¤"))))
To limit the array to 1 result, you can use something like :
=INDEX(one of the preceding formulas,1,1)
Output :
EDIT : If you need something like this (word in a cell and first example retrieved. /!\ Limit the number of words to search. Each word = 3 IMPORTXML
requests. So, for 20 words => 60 requests, leading to a slow sheet.)
In column B, copy-paste the words to search.
For Kana in cells C3,C4,C5,... the following formula :
=TEXTJOIN("",TRUE,IMPORTXML("https://tangorin.com/sentences?search="&B3,"(//div[@class='results-main-container']//dt)[1]//text()[not(parent::rt)]"))
For Romaji in cells D3,D4,D5,... the following formula :
=TEXTJOIN(" ",TRUE,IMPORTXML("https://tangorin.com/sentences?search="&B3,"(//div[@class='results-main-container']//dt)[1]//rt"))
For English in cells E3,E4,E5,... the following formula :
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(REGEXREPLACE(TEXTJOIN(" ",FALSE,IMPORTXML("https://tangorin.com/sentences?search="&B3,"(//div[@class='results-main-container']//dd/span)[1]")),"(\w)( )(\w)","$1'$3"),"s-en","¤")," , ",", ")," .",".")," - ","-")," ( "," (")," ) ",") ")," !","!")," ?","?"),". ",".")
Upvotes: 1