Reputation: 439
I'm crawling a webpage using XPath and I need to write the deposit as a number. The deposit needs to be ("monthly rent" x "amount of prepaid rent"), so the result should be 15450 in this case:
<table>
<tr>
<td>monthly rent: </td>
<td>5.150,00</td>
</tr>
<tr>
<td>deposit: </td>
<td>3 mdr.</td>
</tr>
</table>
I am currently using the following XPath to find the info:
//td[contains(.,'Depositum') or contains(.,'Husleje ')]/following-sibling::td/text()
But I don't know how to remove "mdr." from the deposit, or how to multiply the two numbers and return only one number to the database.
Upvotes: 3
Views: 575
Reputation: 13986
Pure XPath solution:
translate(
/table/tr/td[contains(., 'monthly rent')]/following-sibling::td[1],
',.',
'.'
)
*
substring-before(
/table/tr/td[contains(., 'deposit')]/following-sibling::td[1],
' mdr'
)
I ended up with a solution very similar to hek2mgl's correct answer, with two differences. There is no need to divide by 100, because translate() converts the comma to a dot and removes the dot. Also, the <td> elements containing the numeric data are selected with a positional predicate ([1]) to avoid matching additional elements if the actual table is not as simple as the given example. The XPath number format requires a dot as the decimal separator and no thousands separators.
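To make the character mapping concrete, here is a minimal Python sketch that mimics how XPath 1.0 translate() treats its second and third arguments. The helper name xpath_translate is hypothetical and exists only for illustration; it is not part of any XPath library.

```python
def xpath_translate(value, source, target):
    """Mimic XPath 1.0 translate(): the i-th character of `source` is
    replaced by the i-th character of `target`, or removed when `target`
    is too short to have an i-th character."""
    mapping = {}
    for i, ch in enumerate(source):
        if ord(ch) in mapping:
            continue  # for repeated characters, the first occurrence wins
        mapping[ord(ch)] = target[i] if i < len(target) else None
    return value.translate(mapping)


if __name__ == "__main__":
    # ',' maps to '.', and '.' has no counterpart, so it is removed.
    normalized = xpath_translate("5.150,00", ",.", ".")
    print(normalized)              # 5150.00
    print(float(normalized) * 3)   # 15450.0
```

With an empty third argument both characters are removed, which gives 515000 and explains the extra division by 100 in the other answer.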
Upvotes: 0
Reputation: 158040
You can use the following query which is compatible with XPath 1.0 and upwards:
substring-before(//td[contains(.,'deposit:')]/following-sibling::td/text(), ' mdr.') * translate(//td[contains(.,'monthly rent:')]/following-sibling::td/text(), ',.', '') div 100
Output:
15450
Step by Step Explanation:
// Get the deposit and remove mdr. from it using substring-before
substring-before(//td[contains(.,'deposit:')]/following-sibling::td/text(), ' mdr.')
// Arithmetic multiply operator
*
// The number format 5.150,00 can't be used in arithmetic calculations.
// Therefore we take the monthly rent and remove the . and , characters from it.
// Removing the decimal comma is the same as multiplying by 100, which is why
// we divide by 100 afterwards.
translate(//td[contains(.,'monthly rent:')]/following-sibling::td/text(), ',.', '')
// Divide by 100
div 100
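If the number has to be written to a database from a script anyway, the same steps can also be done outside XPath. The following is only a sketch in Python using the standard library, not the approach from the answer itself. It assumes the markup is well-formed XML like the snippet in the question (real HTML pages often are not), and the names SAMPLE, cell_after and deposit_amount are made up for this example.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# The table from the question, used here as sample input.
SAMPLE = """<table>
<tr><td>monthly rent: </td><td>5.150,00</td></tr>
<tr><td>deposit: </td><td>3 mdr.</td></tr>
</table>"""


def cell_after(table, label):
    """Return the text of the cell that follows the first cell containing `label`."""
    for row in table.iter("tr"):
        cells = row.findall("td")
        if len(cells) >= 2 and label in (cells[0].text or ""):
            return (cells[1].text or "").strip()
    raise ValueError(f"no row labelled {label!r}")


def deposit_amount(markup):
    table = ET.fromstring(markup)
    # "5.150,00" -> "5150.00": drop the thousands dot, turn the comma into a dot.
    rent = Decimal(cell_after(table, "monthly rent").replace(".", "").replace(",", "."))
    # "3 mdr." -> 3: keep only what precedes " mdr".
    months = int(cell_after(table, "deposit").split(" mdr")[0])
    return rent * months


if __name__ == "__main__":
    print(int(deposit_amount(SAMPLE)))  # 15450
```

Decimal is used instead of float so that the currency value is not subject to binary rounding.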
You can refer to the list of functions and operators supported by XPath 1.0 and 2.0.
Upvotes: 3