Reputation: 49
i have one interesting question, and could not resolve it. Please help!!!!
They are tables
t_employee
ID NUMBER,
DEPARTMENT_ID NUMBER,
CHIEF_ID NUMBER,
NAME VARCHAR2(100 BYTE),
SALARY NUMBER,
BIRTH_DATE DATE,
ADDRESS VARCHAR2(200 BYTE),
STATUS VARCHAR2(1 BYTE)
and
t_department
ID NUMBER,
NAME VARCHAR2(100 BYTE)
Need display the number of employees in each region - in Adress column (if they are now region then = 'No' area). Names of areas converted to uppercase.
What is problem? Problem is that adress collumn has unstructured data for example: Adress:
Country,REGION,city,...
So REGION always must be beetween first (,) and second (,) AND must include word (reg) For example:
Russia(Country), reg Moskovskay , Moscow(city), Lenina, (street) .... or
Russia(Country), Moskovskay reg , Moscow(city), Lenina, (street) .... or
separator is (,) position is - second
Many thanks!
Upvotes: 1
Views: 106
Reputation: 8816
WITH t_employee AS (
SELECT 1 AS id, 10 AS department_id, 'a' AS name, 'kws, aaa reg, skdir, 23049' AS address FROM dual
UNION ALL SELECT 2, 10, 'b', 'slkx, aaa reg, lskdj, 902349' FROM dual
UNION ALL SELECT 3, 20, 'c', 'lskj, bbb reg, lskdi, 489308' FROM dual
UNION ALL SELECT 4, 10, 'd', 'lskj, aaa reg, lskdi, 489308' FROM dual
UNION ALL SELECT 5, 20, 'e', 'lskj, ccc reg, lskdi, 489308' FROM dual
UNION ALL SELECT 6, 30, 'f', 'lskj, bbb reg, lskdi, 489308' FROM dual
)
, t_region AS (
SELECT id,
TRIM (
REPLACE (
SUBSTR (address,
INSTR (address, ',', 1) + 1,
INSTR (address, ',', INSTR(address, ',', 1) + 1)
- INSTR(address, ',', 1) - 1),
'reg',
'')
)
AS region
FROM t_employee e
)
SELECT r.region, count(*) AS employees
FROM t_region r
GROUP BY r.region
;
Upvotes: 0
Reputation: 181077
Free form strings are seldom a good idea in databases, this query won't be able to use indexes which will most likely make it a slow performer;
WITH a AS ( SELECT TRIM(
REPLACE(
UPPER(
REGEXP_SUBSTR(ADDRESS, ',([^,]*),', 1, 1, 'i', 1)
),
' REG ', ''
)
) REGION
FROM t_employee)
SELECT REGION, COUNT(*) cnt FROM a GROUP BY REGION
Upvotes: 1
Reputation: 10703
Try the below:
SELECT regexp_substr(address, ',(.*?reg.*?),', 1, 1, null, 1) AS region, COUNT(*)
FROM t_employee
GROUP BY regexp_substr(address, ',(.*?reg.*?),', 1, 1, null, 1);
I would strongly advise however to refactor the schema and break the address into separate fields for street, city, region, etc. before or during table load, if only you have the possibility to do so.
Upvotes: 1
Reputation: 122
At the outset , please explore to see and change the database design to separate these fields.It will help you in the long run.If you would still want to stick on to this structure , you could manage the data in the required format while inserting itself.Hope that helps !
Upvotes: 0