Reputation:
The data from a table in our database is full of errors. I am trying to code a case, when, then clause that could potentially replace the errors with data that is correct, and from another table. Keep in mind that I have pretty basic privileges so I cant code anything more complex than case, when and then. The code I have so far is shown below:
select distinct t.*,
case
--** t.Corridor_RB is full of bad entries
when t.Corridor_RB <> s.Corridor_RB
then s.Corridor_RB --** what I want the data to be
else s.Corridor_RB
end as CorrectRb
from PROJECT_CORRS_RB_MILES t left join
corridors_grouped_tis s on
t.Corridor_RB = s.Corridor_RB
where t.Direction <> 5
order by t.Corridor_RB
Whats happening right now is the added column CorrectRb is still full of nulls as shown in the screen shot below:
So why is this happening? Why for example are the C000001N not correcting to C000001E from s.Corridor_RB? Shouldnt the empty spaces be data from s.Corridor_RB? The join itself is fine, its a one to many.
Here is data from corridors_grouped_tis:
TCR_CORRIDOR_ID CORRIDOR_RB SLICED
C000001 C000001E 1
C000002 C000002E 2
C000003 C000003N 3
C000004 C000004N 4
C000005 C000005N 5
C000006 C000006E 6
C000007 C000007N 7
C000008 C000008E 8
C000009 C000009N 9
C000010 C000010N 10
C000011 C000011N 11
C000012 C000012E 12
Output from projects_corrs_rb_miles is shown below:
FED_ST_PRJ_NBR CONT_ID ROUTE_NBR DIRECTION CORRIDOR_RB BEG_TERMINI END_TERMINI
1 HSIP-MT 1-6(76)384 11713 N-1 3 C000001E 384.0 386.6
2 MT-NH 1-3(71)255 21216 N-1 3 C000001E 254.6 256.0
3 HSIP-MT 1-6(76)384 11713 N-1 3 C000001E 384.0 386.6
4 HSIP-MT 1-6(76)384 11713 N-1 3 C000001E 384.0 386.6
5 MT-NH 1-3(71)255 21216 N-1 3 C000001E 254.6 256.0
6 MT-NH 1-3(71)255 21216 N-1 3 C000001E 254.6 256.0
7 HSIP 1-2(147)136 07C11 N-1 1 C000001N
8 NH 1-1(93)16 08913 N-1 1 C000001N 16.0 44.8
9 CBI-NHTSA-MT 1-1(79)45 03812 N-1 1 C000001N 44.9 48.6
10 CBI-NHTSA-MT 1-1(79)45 03812 N-1 1 C000001N 44.9 48.6
Upvotes: 0
Views: 37
Reputation: 191315
At the moment your case expression is a bit pointless since both branches evaluate to the same thing, s.Corridor_RB
.
But that will be null if there is no matching row in s
for a t
row, and because you've included join condition
t.Corridor_RB = s.Corridor_RB
you will only get a match if they are the same, i.e. you will go in to the 'else' branch. For everything else there is no match on s, so s.Corridor_RB
will be null. (Which will also go into the 'else', as you can't compare null to anything else with in/equality operators).
Essentially, it can only ever be null, or exactly the same as t.Corridor_RB
.
You can probably just remove that condition, and it might be OK as an inner join; but then you will see that you have no other join conditions between s
and t
, so you will get a cartesian product (cross join) between the two tables, which is unlikely to be what you want.
You need to know how to identify a row in s
which will have the correct value, based on some other column(s) in t
. You can't base it on the column you know has bad entries - unless s
only has one row, or has mappings between bad and good values, and you join on or selected the wrong column. As we can't see that table structure or data it's impossible to tell what you need that join to really be.
Based on the data from corridors_grouped_tis
you added, it looks like you want to match based on the root, and just change the last character:
from PROJECT_CORRS_RB_MILES t left join
corridors_grouped_tis s on
s.TCR_CORRIDOR_ID = substr(t.Corridor_RB, 1, length(t.Corridor_RB) - 1)
if the TCR_CORRIDOR_ID
does what it appears to; or perhaps
substr(s.Corridor_RB, 1, length(s.Corridor_RB) - 1)
= substr(t.Corridor_RB, 1, length(t.Corridor_RB) - 1)
or similar, though this is making assumptions about your patterns and how to match still.
Applying a function to a column value will generally prevent any index on that column being used, but if this is a small look-up table it probably won't matter and you're likely to hit a large proportion of rows anyway, making a full scan preferable. If there was a lot of data to work with you could add an indexed virtual column that does the substring, but it seems like overkill here.
Demo with your sample data as CTEs:
with corridors_grouped_tis(TCR_CORRIDOR_ID, CORRIDOR_RB, SLICED) as (
select 'C000001', 'C000001E', 1 from dual
union all select 'C000002', 'C000002E', 2 from dual
union all select 'C000003', 'C000003N', 3 from dual
union all select 'C000004', 'C000004N', 4 from dual
union all select 'C000005', 'C000005N', 5 from dual
union all select 'C000006', 'C000006E', 6 from dual
union all select 'C000007', 'C000007N', 7 from dual
union all select 'C000008', 'C000008E', 8 from dual
union all select 'C000009', 'C000009N', 9 from dual
union all select 'C000010', 'C000010N', 10 from dual
union all select 'C000011', 'C000011N', 11 from dual
union all select 'C000012', 'C000012E', 12 from dual
),
project_corrs_rb_miles (FED_ST_PRJ_NBR, CONT_ID, ROUTE_NBR, DIRECTION, CORRIDOR_RB, BEG_TERMINI, END_TERMINI) as (
select 'HSIP-MT 1-6(76)384', '11713', 'N-1', 3, 'C000001E', 384.0, 386.6 from dual
union all select 'MT-NH 1-3(71)255', '21216', 'N-1', 3, 'C000001N', 254.6, 256.0 from dual
union all select 'HSIP-MT 1-6(76)384', '11713', 'N-1', 3, 'C000001E', 384.0, 386.6 from dual
union all select 'HSIP-MT 1-6(76)384', '11713', 'N-1', 3, 'C000001E', 384.0, 386.6 from dual
union all select 'MT-NH 1-3(71)255', '21216', 'N-1', 3, 'C000001E', 254.6, 256.0 from dual
union all select 'MT-NH 1-3(71)255', '21216', 'N-1', 3, 'C000001E', 254.6, 256.0 from dual
union all select 'HSIP 1-2(147)136', '07C11', 'N-1', 1, 'C000001N', null, null from dual
union all select 'NH 1-1(93)16', '08913', 'N-1', 1, 'C000001N', 16.0, 44.8 from dual
union all select 'CBI-NHTSA-MT 1-1(79)45', '03812', 'N-1', 1, 'C000001N', 44.9, 48.6 from dual
union all select 'CBI-NHTSA-MT 1-1(79)45', '03812', 'N-1', 1, 'C000001N', 44.9, 48.6 from dual
)
select t.*, s.Corridor_RB as CorrectRb
from PROJECT_CORRS_RB_MILES t left join
corridors_grouped_tis s on
s.TCR_CORRIDOR_ID = substr(t.Corridor_RB, 1, length(t.Corridor_RB) - 1)
where t.Direction <> 5
order by t.Corridor_RB
/
FED_ST_PRJ_NBR CONT_ ROU DIRECTION CORRIDOR BEG_TERMINI END_TERMINI CORRECTR
---------------------- ----- --- ---------- -------- ----------- ----------- --------
MT-NH 1-3(71)255 21216 N-1 3 C000001E 254.6 256 C000001E
HSIP-MT 1-6(76)384 11713 N-1 3 C000001E 384 386.6 C000001E
HSIP-MT 1-6(76)384 11713 N-1 3 C000001E 384 386.6 C000001E
MT-NH 1-3(71)255 21216 N-1 3 C000001E 254.6 256 C000001E
HSIP-MT 1-6(76)384 11713 N-1 3 C000001E 384 386.6 C000001E
MT-NH 1-3(71)255 21216 N-1 3 C000001N 254.6 256 C000001E
HSIP 1-2(147)136 07C11 N-1 1 C000001N C000001E
NH 1-1(93)16 08913 N-1 1 C000001N 16 44.8 C000001E
CBI-NHTSA-MT 1-1(79)45 03812 N-1 1 C000001N 44.9 48.6 C000001E
CBI-NHTSA-MT 1-1(79)45 03812 N-1 1 C000001N 44.9 48.6 C000001E
10 rows selected.
Upvotes: 0