Matheus Vinicius
Matheus Vinicius

Reputation: 49

Dynamically concatenating strings using a loop in Oracle PL/SQL

I've been trying to standardize the "Address" field of a table with 400k lines. Problem is, this field was made to be a free form field, meaning that users (upon registering) could just enter anything they wanted into it. I've already managed to split this address field into 14 others with each word of the adress line in order.

Now i need to concatenate some of these fields back together into different fields such as:

I wish it could be as easy as grabbing the first 3 fields and calling that the street name and the 4th field as House number but due to different street name lengths, the field with the "house number" is usually fields #4, #5 and rarely ever #3 or #6.

I've already thought of an approach that could work in this case and that would be concatenating these fields in a loop and using the first occurence of a field where its first digit is a number as a break point.(due to some houses being numbered "10C", "1A", "1B" and so on)

Due to not being not very good at PL/SQL, i don't know how to put this idea into code.

All that i've managed to do so far was writing a function to check if a string starts with a digit or not, so it could be used in an IF function.

How can i dynamically "traverse" the fields in a loop using PL/SQL? Would i use an array? Is it even possible?

EDIT: Examples of what this address field contains (in portuguese):

Avenida Doutor Theomário Pinto da Costa 450 Condominio Renaissance, rua 1, casa 1

Rua Álvaro Peres Filho 60 Casa azul em frente ao orelhão

Travessa Delegado Zé Lima 61 antiga Praça Rio Branco

Rua Finlândia 28 Qd 111

Alameda Áustria 107 Condomí­nio Jardim Europa I

There is a clear pattern in the data, which leads me to believe that this address data was colected in a segmented format, but then concatenated into a single table field. What i need to do is the reverse, basically.

The pattern goes:

Avenida(1) Doutor Theomário Pinto da Costa (2) 450(3) Condominio Renaissance, rua 1, casa 1(4)

1 = Street type (avenue, street and etc)

2 = Street Name

3 = House Number

4 = Reference (usually used to help with directions, this field is usually open by design. Things like Floor, building and Condo name goes here.)

As you can see from the examples above, the streets usually have widly different name lenghts, which makes setting a specific number of fields as the street name impossible.

Upvotes: 1

Views: 542

Answers (1)

alvalongo
alvalongo

Reputation: 571

A posible solution:

Declare
  /* Dynamically concatenating strings using a loop in Oracle PL/SQL */
  --
  type tyArray_parsed
  is table of varchar2(2000)
  index by binary_integer;
  --
  arAddress_parsed   tyArray_parsed;
  --
  cnuSTREET_TYPE   constant integer:=1;
  cnuSTREET_NAME   constant integer:=2;
  cnuHOUSE_NUMBER  constant integer:=3;
  cnuREFERENCE     constant integer:=4;
  --
  Procedure print(p_isbTexto varchar2)
  is
    nuIzq_print  integer;
    nuDer_print    integer;
  Begin
    nuDer_print:=0;
    Loop
       nuIzq_print:=nuDer_print+1;
       nuDer_print:=instr(p_isbTexto,chr(10),nuIzq_print);
       exit when nvl(nuDer_print,0)=0;
       dbms_output.put_line(substr(p_isbTexto, nuIzq_print, nuDer_print-nuIzq_print));
    End loop;
    dbms_output.put_line(substr(p_isbTexto, nuIzq_print));
  End print;
  --
  procedure pAddress_split(isbAddress_string      varchar2,
                           oarAddress_parsed  out tyArray_parsed
                          )
  is
    type tyArray_string
    is table of varchar2(2000)
    index by binary_integer;
    --
    arFields  tyArray_string;
    nuLeft_pos    integer;
    nuRight_pos  integer;
    nuMy_number   number;
    blIs_number   boolean;
    nuStep        integer;
  Begin
    print('---------------------------------------');
    print('isbAddress_string=['||isbAddress_string||']');
    --
    nuRight_pos:=0;
    Loop
       nuLeft_pos:=nuRight_pos+1;
       nuRight_pos:=instr(isbAddress_string,' ',nuLeft_pos);
       exit when nvl(nuRight_pos,0)=0;
       arFields(arFields.COUNT+1):=substr(isbAddress_string,
                                          nuLeft_pos,
                                          nuRight_pos-nuLeft_pos
                                         );
    End loop;
    arFields(arFields.COUNT+1):=substr(isbAddress_string,
                                       nuLeft_pos
                                      );
    print('arFields.COUNT='||arFields.COUNT);
    --
    For nuI in arFields.FIRST..arFields.LAST loop
        print(lpad(nuI,4)||'['||arFields(nui)||']');
    End loop;
    --
    nuStep:=cnuSTREET_TYPE;
    For nuI in arFields.FIRST..arFields.LAST loop
       if nuStep=cnuSTREET_TYPE then
          oarAddress_parsed(cnuSTREET_TYPE):=arFields(nui);
          nuStep:=cnuSTREET_NAME;
       Elsif nuStep=cnuSTREET_NAME then
             blIs_number:=false;
             Begin
               nuMy_number:=to_number(arFields(nui));
               blIs_number:=true;
             Exception
               when others then
                     blIs_number:=false;
             End;
             if blIs_number then
                oarAddress_parsed(cnuHOUSE_NUMBER):=arFields(nui);
                nuStep:=cnuREFERENCE;
             Else
                if not(oarAddress_parsed.exists(cnuSTREET_NAME)) then
                   oarAddress_parsed(cnuSTREET_NAME):=arFields(nui);
                Else
                   oarAddress_parsed(cnuSTREET_NAME):=oarAddress_parsed(cnuSTREET_NAME)||' '||arFields(nui);
                End if;
             End if;
      Elsif nuStep=cnuREFERENCE then
            if not(oarAddress_parsed.exists(cnuREFERENCE)) then
               oarAddress_parsed(cnuREFERENCE):=arFields(nui);
            Else
               if arFields(nui)=',' then
                  oarAddress_parsed(cnuREFERENCE):=oarAddress_parsed(cnuREFERENCE)||arFields(nui);
               Else
                  oarAddress_parsed(cnuREFERENCE):=oarAddress_parsed(cnuREFERENCE)||' '||arFields(nui);
               end if;
            End if;
      End if;
    End loop;
    --
    for nuI in oarAddress_parsed.FIRST..oarAddress_parsed.LAST loop
        print(lpad(nuI,2)||'|['||oarAddress_parsed(nui)||']');
    End loop;
  End pAddress_split;
Begin
  dbms_application_info.set_module('Split','START-'||to_char(sysdate,'yyyy-mm-dd hh24:mi:ss'));
  --
  pAddress_split('Avenida Doutor Theomário Pinto da Costa 450 Condominio Renaissance, rua 1, casa 1',arAddress_parsed);
  pAddress_split('Rua Álvaro Peres Filho 60 Casa azul em frente ao orelhão',arAddress_parsed);
  pAddress_split('Travessa Delegado Zé Lima 61 antiga Praça Rio Branco',arAddress_parsed);
  pAddress_split('Rua Finlândia 28 Qd 111',arAddress_parsed);
  pAddress_split('Alameda Áustria 107 Condomínio Jardim Europa I',arAddress_parsed);
  --
  --
  dbms_application_info.set_module('Split','END-'||to_char(sysdate,'yyyy-mm-dd hh24:mi:ss'));
End;
/

Answer:

---------------------------------------
isbAddress_string=[Avenida Doutor Theomário Pinto da Costa 450 Condominio Renaissance, rua 1, casa 1]
arFields.COUNT=13
   1[Avenida]
   2[Doutor]
   3[Theomário]
   4[Pinto]
   5[da]
   6[Costa]
   7[450]
   8[Condominio]
   9[Renaissance,]
  10[rua]
  11[1,]
  12[casa]
  13[1]
 1|[Avenida]
 2|[Doutor Theomário Pinto da Costa]
 3|[450]
 4|[Condominio Renaissance, rua 1, casa 1]
---------------------------------------
isbAddress_string=[Rua Álvaro Peres Filho 60 Casa azul em frente ao orelhão]
arFields.COUNT=11
   1[Rua]
   2[Álvaro]
   3[Peres]
   4[Filho]
   5[60]
   6[Casa]
   7[azul]
   8[em]
   9[frente]
  10[ao]
  11[orelhão]
 1|[Rua]
 2|[Álvaro Peres Filho]
 3|[60]
 4|[Casa azul em frente ao orelhão]
---------------------------------------
isbAddress_string=[Travessa Delegado Zé Lima 61 antiga Praça Rio Branco]
arFields.COUNT=9
   1[Travessa]
   2[Delegado]
   3[Zé]
   4[Lima]
   5[61]
   6[antiga]
   7[Praça]
   8[Rio]
   9[Branco]
 1|[Travessa]
 2|[Delegado Zé Lima]
 3|[61]
 4|[antiga Praça Rio Branco]
---------------------------------------
isbAddress_string=[Rua Finlândia 28 Qd 111]
arFields.COUNT=5
   1[Rua]
   2[Finlândia]
   3[28]
   4[Qd]
   5[111]
 1|[Rua]
 2|[Finlândia]
 3|[28]
 4|[Qd 111]
---------------------------------------
isbAddress_string=[Alameda Áustria 107 Condomínio Jardim Europa I]
arFields.COUNT=7
   1[Alameda]
   2[Áustria]
   3[107]
   4[Condomínio]
   5[Jardim]
   6[Europa]
   7[I]
 1|[Alameda]
 2|[Áustria]
 3|[107]
 4|[Condomínio Jardim Europa I]

Upvotes: 1

Related Questions