Ray
Ray

Reputation: 103

SQL Query to select a value between two known strings

I need a SQL query to get the value between two known strings in a text column. The column name is d_info and the table name is Details.

The text is an XML fragment, but stored as a text value. What I need is to get the value between the bookends <nettoeinkommen> and </nettoeinkommen> which is 718 in this example. I also need the output to be saved in new column named income with data type float(8).

land>DE</land></wohnanschrift><taetigkeit>rentner</taetigkeit><dkbkundenstatus><bestandskunde>false</bestandskunde></dkbkundenstatus><haushaltsangaben><einnahmen><einkommen><nettoeinkommen>718</nettoeinkommen></einkommen><kindergeld>0</kindergeld><vermietungverpachtungnetto>0</vermietungverpachtungnetto><elterngeld>0</elterngeld><rentenunbefristet>0</rentenunbefristet><unselbststaendigetaetigkeit>740</unselbststaendigetaetigkeit><geringfuegigebeschaeftigung>0</geringfuegigebeschaeftigung></einnahmen><ausgaben><warmmiete>550</warmmiete><ratenimmobilienfinanzierung>0</ratenimmobilienfinanzierung>

I tried this code:

SELECT cast(SUBSTRING(d_info, CHARINDEX('<nettoeinkommen>', d_info) , CHARINDEX('</nettoeinkommen>', d_info) - CHARINDEX('<nettoeinkommen>', d_info)) as float(8)) as income from dbo.Details

But it's returning an Error converting data type varchar to real.

When I remove the cast function, the script works but it returns <nettoeinkommen>718 instead of only 718.

Thanks.

Upvotes: 0

Views: 174

Answers (3)

Raul
Raul

Reputation: 3131

As it seems, you are querieing plain xml data, for such purpose sql-server provides xquery functionality:

SELECT CAST(r.d_info AS XML).value('(/haushaltsangaben/einnahmen/einkommen/nettoeinkommen)[1]', 'decimal(19,2)')
FROM
(
    SELECT '<taetigkeit>rentner</taetigkeit>
<dkbkundenstatus>
    <bestandskunde>false</bestandskunde>
</dkbkundenstatus>
<haushaltsangaben>
    <einnahmen>
        <einkommen>
            <nettoeinkommen>718</nettoeinkommen>
        </einkommen>
    </einnahmen>
</haushaltsangaben>' AS d_info
) AS r

If you intend to query more info from your source, you will end up with a bunch of stacked substring, patindex functions or even your own defined functions. This should be more readable and mantainable.

Using XQuery: https://learn.microsoft.com/en-us/sql/t-sql/xml/query-method-xml-data-type

As for your initial issue The SUBSTRING function in SQL returns the subset from a string starting from a given index for a specific length. For example SELECT SUBSTRING('whatever',5,4) returns 'ever'. In case of CHARINDEX it gives the index for the first found match of a given pattern within a string. Example SELECT CHARINDEX('ever','whatever') should return 5, as 'ever' starts at the fifth position in 'whatever').

Now in your case you need to add the length of '<nettoeinkommen>' to the starting charindex and substract the length of '</nettoeinkommen>' from the length of the substring:

Also consider using decimal or numeric type instead of float, if you need to precise calculations: https://technet.microsoft.com/en-us/library/ms187912(v=sql.105).aspx

Upvotes: 1

Hogan
Hogan

Reputation: 70513

It is starting at the start of the tag not the end of it.

 SELECT cast(
   SUBSTRING(
     d_info, 
     CHARINDEX('<nettoeinkommen>', d_info) + len('<nettoeinkommen>'),
     CHARINDEX('</nettoeinkommen>', d_info) - (CHARINDEX('<nettoeinkommen>', d_info) + len('<nettoeinkommen>'))
   ) as float(8)) as income
 from dbo.Details

you might even have these defined in variables:

 SELECT cast(
   SUBSTRING(
     d_info, 
     CHARINDEX(@startTag, d_info) + len(@startTag),
     CHARINDEX(@endTag, d_info) - (CHARINDEX(@startTag,d_info)+ len(@startTag))
   ) as float(8)) as income
 from dbo.Details

I think the code is much easier to understand with the variables.

Upvotes: 2

Obie
Obie

Reputation: 477

You need to add the length of your opening tag from the start index and subtract from the length of your substring statement:

SUBSTRING(d_info, CHARINDEX('<nettoeinkommen>', d_info)+16,  
CHARINDEX('</nettoeinkommen>', d_info) - CHARINDEX('<nettoeinkommen>', d_info)-16)

Upvotes: 1

Related Questions