Reputation: 103
I need a SQL query to get the value between two known strings in a text
column.
The column name is d_info and the table name is Details.
The text is an XML
fragment, but stored as a text
value.
What I need is to get the value between the bookends <nettoeinkommen>
and </nettoeinkommen>
which is 718
in this example.
I also need the output to be saved in new column named income
with data type float(8)
.
land>DE</land></wohnanschrift><taetigkeit>rentner</taetigkeit><dkbkundenstatus><bestandskunde>false</bestandskunde></dkbkundenstatus><haushaltsangaben><einnahmen><einkommen><nettoeinkommen>718</nettoeinkommen></einkommen><kindergeld>0</kindergeld><vermietungverpachtungnetto>0</vermietungverpachtungnetto><elterngeld>0</elterngeld><rentenunbefristet>0</rentenunbefristet><unselbststaendigetaetigkeit>740</unselbststaendigetaetigkeit><geringfuegigebeschaeftigung>0</geringfuegigebeschaeftigung></einnahmen><ausgaben><warmmiete>550</warmmiete><ratenimmobilienfinanzierung>0</ratenimmobilienfinanzierung>
I tried this code:
SELECT cast(SUBSTRING(d_info, CHARINDEX('<nettoeinkommen>', d_info)
, CHARINDEX('</nettoeinkommen>', d_info) - CHARINDEX('<nettoeinkommen>', d_info)) as float(8)) as income
from dbo.Details
But it's returning an Error converting data type varchar to real.
When I remove the cast
function, the script works but it returns <nettoeinkommen>718
instead of only 718
.
Thanks.
Upvotes: 0
Views: 174
Reputation: 3131
As it seems, you are querieing plain xml data, for such purpose sql-server provides xquery functionality:
SELECT CAST(r.d_info AS XML).value('(/haushaltsangaben/einnahmen/einkommen/nettoeinkommen)[1]', 'decimal(19,2)')
FROM
(
SELECT '<taetigkeit>rentner</taetigkeit>
<dkbkundenstatus>
<bestandskunde>false</bestandskunde>
</dkbkundenstatus>
<haushaltsangaben>
<einnahmen>
<einkommen>
<nettoeinkommen>718</nettoeinkommen>
</einkommen>
</einnahmen>
</haushaltsangaben>' AS d_info
) AS r
If you intend to query more info from your source, you will end up with a bunch of stacked substring, patindex functions or even your own defined functions. This should be more readable and mantainable.
Using XQuery: https://learn.microsoft.com/en-us/sql/t-sql/xml/query-method-xml-data-type
As for your initial issue The SUBSTRING
function in SQL returns the subset from a string starting from a given index for a specific length. For example SELECT SUBSTRING('whatever',5,4)
returns 'ever'.
In case of CHARINDEX
it gives the index for the first found match of a given pattern within a string. Example SELECT CHARINDEX('ever','whatever')
should return 5, as 'ever' starts at the fifth position in 'whatever').
Now in your case you need to add the length of '<nettoeinkommen>'
to the starting charindex and substract the length of '</nettoeinkommen>'
from the length of the substring:
Also consider using decimal or numeric type instead of float, if you need to precise calculations: https://technet.microsoft.com/en-us/library/ms187912(v=sql.105).aspx
Upvotes: 1
Reputation: 70513
It is starting at the start of the tag not the end of it.
SELECT cast(
SUBSTRING(
d_info,
CHARINDEX('<nettoeinkommen>', d_info) + len('<nettoeinkommen>'),
CHARINDEX('</nettoeinkommen>', d_info) - (CHARINDEX('<nettoeinkommen>', d_info) + len('<nettoeinkommen>'))
) as float(8)) as income
from dbo.Details
you might even have these defined in variables:
SELECT cast(
SUBSTRING(
d_info,
CHARINDEX(@startTag, d_info) + len(@startTag),
CHARINDEX(@endTag, d_info) - (CHARINDEX(@startTag,d_info)+ len(@startTag))
) as float(8)) as income
from dbo.Details
I think the code is much easier to understand with the variables.
Upvotes: 2
Reputation: 477
You need to add the length of your opening tag from the start index and subtract from the length of your substring statement:
SUBSTRING(d_info, CHARINDEX('<nettoeinkommen>', d_info)+16,
CHARINDEX('</nettoeinkommen>', d_info) - CHARINDEX('<nettoeinkommen>', d_info)-16)
Upvotes: 1