Reputation: 179
I'm aiming to retrieve the position of chars in a string, plus the length of a string.
The value of the notes
field in the internal_notes
table, for the row with ticket_id
equal to 1679467247 is literally 'this is a test note'.
When I use the functions against the literal stirngs they work, but when I retrieve the info directly from the table column, the values are just wrong.
Any ideas as to what might be happening?
select notes,
LENGTH(notes),
INSTR(notes,' ')
FROM internal_notes
where ticket_id = 1679467247
union
select 'this is a test note',
LENGTH('this is a test note'),
INSTR('this is a test note',' ')
from dual
This returns the following:
NOTES LENGTH(NOTES) INSTR(NOTES,' ')
------------------- ------------- ----------------
this is a test note 32 11
this is a test note 19 5
Upvotes: 0
Views: 822
Reputation: 191275
You can get this apparent inconsistency if you have zero-width characters in the value; for example:
create table internal_notes(ticket_id number, notes varchar2(32 char));
insert into internal_notes(ticket_id, notes)
values (1679467247, unistr('\200c\200cthis is a test note\200c\200c\200c\200c\200c\200c\200c\200c\200c\200c\200c'));
insert into internal_notes(ticket_id, notes)
values (1679467248, unistr('\200c\200cthis is a test note'));
insert into internal_notes(ticket_id, notes)
values (1679467249, 'this is a test note');
select notes,
LENGTH(notes),
INSTR(notes,' ')
FROM internal_notes
where ticket_id = 1679467247;
NOTES LENGTH(NOTES) INSTR(NOTES,'')
-------------------------------- ------------- ---------------
this is a test note 32 7
I said 'apparent inconsistency' because those numbers are correct; they just don't look it if you can't see some of the characters. Invisible characters still count.
As @MTO suggested you can use the dump()
function to see exactly what is stored in the table, in decimal or hex representation, or mixed with 'normal' characters which can be a bit easier to interpret:
select notes,
LENGTH(notes),
INSTR(notes,' '),
dump(notes, 1000) as dmp
FROM internal_notes;
NOTES LENGTH(NOTES) INSTR(NOTES,'')
-------------------------------- ------------- ---------------
DMP
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
this is a test note 32 7
Typ=1 Len=58: e2,80,8c,e2,80,8c,t,h,i,s, ,i,s, ,a, ,t,e,s,t, ,n,o,t,e,e2,80,8c,e2,80,8c,e2,80,8c,e2,80,8c,e2,80,8c,e2,80,8c,e2,80,8c,e2,80,8c,e2,80,8c,e2,80,8c,e2,80,
8c
this is a test note 21 7
Typ=1 Len=25: e2,80,8c,e2,80,8c,t,h,i,s, ,i,s, ,a, ,t,e,s,t, ,n,o,t,e
this is a test note 19 5
Typ=1 Len=19: t,h,i,s, ,i,s, ,a, ,t,e,s,t, ,n,o,t,e
db<>fiddle - though that is showing the zero-width characters as question marks, unlike SQL Developer and SQL*Plus.
Other zero-width characters are available (space, non-joiner, joiner), and you might see something different in your dump - it just has to be something your client doesn't display at all. Whatever is in there, if it affects all rows and not just that single ticket, then how and why is probably down to whatever front-end/application populates the table - possibly from a character set mismatch, but it could be intentional. If it is just that ticket then that note is an interesting test...
Upvotes: 4