Reputation:
could someone explain this case? example, i have a dump table with data like this:
TGL
19810909
19761026
19832529
when i execute with this query:
SELECT to_date(tgl,'YYYYMMDD') tgl
FROM
(
SELECT tgl
FROM tmpx
WHERE
SUBSTR(tgl,5,2) BETWEEN '01' AND '12'
AND length(tgl) = 8
)
WHERE to_date(tgl,'YYYYMMDD') < to_date('19811231','YYYYMMDD')
result: no error
TGL
09/09/1981
26/10/1976
but, when i execute with this query:
SELECT to_date(tgl,'YYYYMMDD') tgl
FROM
(
SELECT tgl
FROM tmpx
WHERE
SUBSTR(tgl,5,2) IN ('01','02','03','04','05','06','07','08','09','10','01','12')
AND length(tgl) = 8
)
WHERE to_date(tgl,'YYYYMMDD') < to_date('19811231','YYYYMMDD')
result: error
ORA-01843: not a valid month
why the row number third (19832529) include in selection that causes an error? whereas if I execute the following query:
SELECT tgl
FROM tmpx
WHERE
SUBSTR(tgl,5,2) IN ('01','02','03','04','05','06','07','08','09','10','11','12')
AND length(tgl) = 8
the result is like this (wihtout row number 3)
TGL
19810909
19761026
thank you.
Upvotes: 0
Views: 56
Reputation: 191570
If you look at the execution plans for both queries you can see how they are being handled by the optimiser. For the first one:
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TMPX | 1 | 6 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(LENGTH("TGL")=8 AND SUBSTR("TGL",5,2)>='01' AND
SUBSTR("TGL",5,2)<='12' AND TO_DATE("TGL",'YYYYMMDD')<TO_DATE('
1981-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
And for the second:
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 6 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TMPX | 1 | 6 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(LENGTH("TGL")=8 AND TO_DATE("TGL",'YYYYMMDD')<TO_DATE('
1981-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
(SUBSTR("TGL",5,2)='01' OR SUBSTR("TGL",5,2)='02' OR
SUBSTR("TGL",5,2)='03' OR SUBSTR("TGL",5,2)='04' OR
SUBSTR("TGL",5,2)='05' OR SUBSTR("TGL",5,2)='06' OR
SUBSTR("TGL",5,2)='07' OR SUBSTR("TGL",5,2)='08' OR
SUBSTR("TGL",5,2)='09' OR SUBSTR("TGL",5,2)='10' OR
SUBSTR("TGL",5,2)='12'))
Notice the order that the filters are applied. In the first one it's looking at the substring first, and only the values that pass that filter will then be converted to a date for the 1998 comparison.
In the second one the date check is being done first, so it tries to convert the invalid value before it filters it out.
The real problem here is storing dates as string, which allows invalid data to be entered. If you're stuck with that then another approach is to use a function to attempt to convert the string to a date and ignore the error(s) thrown, which still isn't ideal but would ignore the same values you already are. There are lots of examples of this, including this one of mine. With something like that you could do:
SELECT safe_to_date(tgl) tgl
FROM tmpx
WHERE safe_to_date(tgl) < date '1981-12-31';
or if you prefer:
SELECT tgl
FROM (
SELECT safe_to_date(tgl) tgl
FROM tmpx
)
WHERE tgl < date '1981-12-31';
Your function could only look for YYYYMMDD format strings, or you could pass in the format you want to check, if you don't want it to be flexible.
Upvotes: 1