BaFouad
BaFouad

Reputation: 87

Oracle PLSQL : How to remove duplicate data in string

Step 01 : I have a column A in table tab_T contains that strings :

SELECT A FROM tab_T;


    ((<123>+<123>+<123>)(*<213>+<213>+<213>+<354>+<354>+<354>+1)(*<985>))(+<654>+<654>+1)

    (<599>*<592>*<591>)

    (<10945>)

    (<736>+<736>+1)

    (<216>*<518>)

    (<598>*<593>)(*<594>+<594>+<594>+<597>+<595>+<595>+<595>)
    ...
    ...

I want to get :

((<123>)(*<213>+<354>+1)(*<985>))(+<654>+1)

(<599>*<591>)

(<10945>)

(<736>)

(<216>*<518>)

(<598>*<593>)(*<594>+<597>+<595>)
...
...

Step 02 : Then i will replace '+' by 'AND' and '*' by 'OR' and delete the number '1' from my string

this is my query (it works good and i share it with you if you need a help)

SELECT RTRIM(RTRIM(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
       REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(A,'+','AND'),'*','OR'),'(OR','OR('),'(AND','AND('),'(1)','') 
                            ,'OR1',''),'AND1',''),'1OR',''),'1AND',''),'ANDAND','AND'),'OROR','OR'),'AND'),'OR') AS logic   
FROM tab_T 

Result :

((<123>AND<123>AND<123>)OR(<213>AND<213>AND<213>AND<354>AND<354>AND<354>)OR(<985>))OR(<654>AND<654>)

(<599>OR<592>OR<591>)

(<10945>)

(<736>AND<736>)

(<216>OR<518>)

(<598>OR<593>)OR(<594>AND<594>AND<594>AND<597>AND<595>AND<595>AND<595>)
...
...

so when i apply step 01 and step 2 i will have this result

((<123>)OR(<213>AND<354>)OR(<985>))AND(<654>)

(<599>OR<591>)

(<10945>)

(<736>)

(<216>OR<518>)

(<598>OR<593>)OR(<594>AND<597>AND<595>)
...
...

I need a help or an idea for the step 01 please? Thx

Upvotes: 0

Views: 1844

Answers (3)

Gary_W
Gary_W

Reputation: 10360

This will preserve the plus signs in-between the bracketed numbers:

select A original, regexp_replace(A, '(<\d+>)(\+?\1){1,}', '\1') fixed
from tab_T;

The regex can be read as: Remember a group of one or more digits inside of brackets when followed by a group of one or more of the SAME group of remembered numbers preceded by an optional plus sign. When this group is encountered, replace it with the first remembered group.

EDIT: For the sake of completeness, here's the whole thing done with successive CTE's breaking the replaces into logical groupings. This way it's a complete answer and I believe reduced the number of REPLACE() calls. You could do it as a bunch of nested REPLACE's, but I think this is arguably cleaner and easier to understand and maintain down the road.

with tab_T(A) as (
  select '((<123>+<123>+<123>)(*<213>+<213>+<213>+<354>+<354>+<354>+1)(*<985>))(+<654>+<654>+1)' from dual union all
  select '(<599>*<592>*<591>)' from dual union all
  select '(<10945>)' from dual union all
  select '(<736>+<736>+1)' from dual union all
  select '(<216>*<518>)' from dual union all
  select '(<598>*<593>)(*<594>+<594>+<594>+<597>+<595>+<595>+<595>)' from dual 
),
-- Remove dups and '+1'
pass_1(original, fixed) as (
  select A original, replace(regexp_replace(A, '(<\d+>)(\+?\1){1,}', '\1'), '+1') fixed
  from tab_T
),
replace_ors(original, fixed) as (
  select original, replace(replace(fixed, '(*', 'OR('), '*', 'OR')
  from pass_1
),
replace_ands(original, fixed) as (
  select original, replace(replace(fixed, '(+', 'AND('), '+', 'AND')
  from replace_ors
)  
select original, fixed
from replace_ands
;

Upvotes: 1

BaFouad
BaFouad

Reputation: 87

I found it :)

select REGEXP_REPLACE
   (A, 
   '(<[^>]+>)(\+|\*?\1)*', 
   '\1') as logic   
FROM tab_T

Thank you anyway ;)

Upvotes: 0

mehmet sahin
mehmet sahin

Reputation: 812

I know this is not full answer for your question. But maybe it can help you:

with t as (select '((<123>+<123>+<123>)(*<213>+<213>+<213>+<354>+<354>+<354>+1)(*<985>))(+<654>+<654>+1)' as exp from dual)
, t1 as ( select distinct regexp_substr(exp, '[^+]+', 1, level) names
                  from t
                  connect by level <= length(regexp_replace(exp, '[^*+]'))+1
                  ) 
SELECT 
   RTrim(listagg(t1.names,'+') WITHIN GROUP (order by names desc))  string
 from t1

Upvotes: 0

Related Questions