Aaron H

Reputation: 29

Separating an address line into House Number, Street name, and Apartment in Java or COBOL

I am trying to figure out the best way to take an address line and split it into three fields for a file: house number, street name, and apartment number. Thankfully, the city, state, and zip are already in separate columns, so those three fields are all I have to parse out, but even that is proving difficult. My initial hope was to do this in COBOL using SQL, but I don't think I can use the PATINDEX example someone posted on a separate question thread; I kept getting SQL code -440. My second thought was to do this in Java, treating the strings as arrays and checking them for numbers, then letters, then comparing against "Apt" or something to that effect. This is what I have so far to test the idea, but I am getting an out-of-bounds exception for the array.

class AddressTest{
    public static void main (String[] arguments){
       String adr1 = "100 village rest court";
       String adr2 = "1000 Arbor lane Apt. 21-D";
       String[] HouseNbr = new String[9];
       String[] Street = new String[20];
       String[] Apt = new String[5];

       for(int i = 0; i < adr1.length();i++){
           String[] forloop = new String[] {adr1};
           if (forloop[i].substring(0,1).matches("[0-9]")){
               if(forloop[i+1].substring(0,1).matches("[0-9]")){
                   HouseNbr[i] = forloop[i];
               }
               else if(forloop[i+1].substring(0,1).matches(" ")){
               }
               else if(forloop[i].substring(0,1).matches(" ")){
               }
               else{
                   Street[i] = forloop[i];
               }
           }
       }

       for(int j = 0; j < HouseNbr.length; j++){
               System.out.println(HouseNbr[j]);
       }
       for(int k = 0; k < Street.length; k++){
           System.out.println(Street[k]);
       }
    }   
}

Any other thoughts would be extremely helpful.

Upvotes: 1

Views: 8448

Answers (3)

Aaron H

Reputation: 29

I am still working on it, but for anyone who may need to do this in the future:

import java.util.Arrays;
import java.util.StringTokenizer;
import org.apache.commons.lang3.*;

class AddressTest{
public static void main (String[] arguments){
   String adr1 = "100 village rest court";
   //String adr2 = "1000 Arbor lane Apt. 21-D";
   String reader = new String();
   String holder = new String();
   StringTokenizer a1 = new StringTokenizer(adr1);
   String[] HouseNbr = new String[9];
   String[] StreetName = new String[20];
   String[] Apartment = new String[5];
   int counter = 0;

   while(a1.hasMoreElements()){
       reader = a1.nextElement().toString();
       System.out.println("Reader: " + reader);
       if(StringUtils.isNumeric(reader)){
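           // Copy each digit into the house-number array. charAt is used because
           // split("") returns a leading empty string on some Java versions only.
           for(int i = 0; i < reader.length() && i < HouseNbr.length; i++){
               System.out.println("HNBR:_" + reader.charAt(i));
               HouseNbr[i] = String.valueOf(reader.charAt(i));
           }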
       }
       else if(StringUtils.startsWith(reader, "Apt.")){
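           // The apartment number is the token after "Apt."; guard against
           // "Apt." being the last token and against overflowing the array.
           if(a1.hasMoreElements()){
               holder = a1.nextElement().toString();
               for(int j = 0; j < holder.length() && j < Apartment.length; j++){
                   Apartment[j] = String.valueOf(holder.charAt(j));
               }
           }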

       }
       else{
           // Append this word's characters to the street name, stopping once the array is full.
           for(int k = 0; k < reader.length() && counter < StreetName.length; k++){
               StreetName[counter] = String.valueOf(reader.charAt(k));
               counter++;
           }
           if((counter < StreetName.length) && a1.hasMoreElements()){
               StreetName[counter] = " ";
               counter++;
           }
       }

   }
   System.out.println(Arrays.toString(HouseNbr) + " " + Arrays.toString(StreetName)                
       + " " + Arrays.toString(Apartment));
    }   
}

Upvotes: 1

Joe Zitzelberger

Reputation: 4263

If you leverage the freely available U.S. Postal Service zip code finder (https://tools.usps.com/go/ZipLookupAction!input.action), you can get back an address in standardized format. The valid options on that format are documented by the USPS and will make it easier to write a very complicated regex, or a number of simple regexes, to read the standard form.
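As a rough illustration of the "number of simple regexes" idea, here is a minimal sketch in Java. It does not call the USPS tool and does not implement the USPS standard format. It only assumes the line already looks like "<digits> <street words> [Apt. <unit>]", which matches the two sample addresses in the question. The class name AddressRegexSketch, the method name parse, and the pattern itself are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AddressRegexSketch {
    // Assumed layout: house number, street words, optional "Apt"/"Apt." plus a unit.
    private static final Pattern LINE = Pattern.compile(
            "\\s*(\\d+)\\s+(.+?)(?:\\s+Apt\\.?\\s+(\\S+))?\\s*",
            Pattern.CASE_INSENSITIVE);

    // Returns {houseNumber, streetName, apartment}, or null if the line does not fit.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        // Group 3 is null when no apartment part was present.
        String apt = (m.group(3) == null) ? "" : m.group(3);
        return new String[] { m.group(1), m.group(2), apt };
    }

    public static void main(String[] args) {
        String[] parts = parse("1000 Arbor lane Apt. 21-D");
        System.out.println(parts[0] + " | " + parts[1] + " | " + parts[2]);
    }
}
```

Lines that do not start with a plain numeric house number (for example "12B Main St" or a PO box) return null here and would need their own patterns.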

Upvotes: 1

Pat B

Reputation: 1955

I would consider removing the unnecessary arrays and using a StringTokenizer...

// Requires: import java.util.StringTokenizer;
//           import org.apache.commons.lang3.StringUtils;
public static void main(String[] args) {

    String number = null;
    String address = null;    // street name; not filled in by this example
    String aptNumber = null;

    String str = "This is String , split by StringTokenizer";
    StringTokenizer st = new StringTokenizer(str);

    System.out.println("---- Split by space ------");
    while (st.hasMoreElements()) {
        String s = st.nextElement().toString();
        System.out.println(s);

        // An all-digit token is taken as the house number.
        if (StringUtils.isNumeric(s)) {
            number = s;
            continue;
        }

        // indexOf returns -1 when "Apt" is absent, so compare explicitly.
        if (s.indexOf("Apt") >= 0) {
            aptNumber = s.substring(s.indexOf("Apt"), s.length() - 1);
            continue;
        }
    }

    System.out.println("---- Split by comma ',' ------");
    StringTokenizer st2 = new StringTokenizer(str, ",");

    while (st2.hasMoreElements()) {
        System.out.println(st2.nextElement());
    }
}
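Applied to the addresses from the question, the tokenizer idea might look like the sketch below. It is one possible reading of the suggestion, not a tested solution: it assumes the house number is a leading all-digit token, that the apartment is the token following "Apt" or "Apt.", and that everything else belongs to the street name. The class name AddressTokenSketch and the method name split are made up for the example.

```java
import java.util.StringTokenizer;

class AddressTokenSketch {
    // Returns {houseNumber, streetName, apartment}; missing parts are empty strings.
    static String[] split(String line) {
        String number = "";
        StringBuilder street = new StringBuilder();
        String apt = "";

        StringTokenizer st = new StringTokenizer(line);   // splits on whitespace
        while (st.hasMoreTokens()) {
            String tok = st.nextToken();
            if (number.isEmpty() && street.length() == 0 && tok.matches("\\d+")) {
                // Only a leading all-digit token counts as the house number.
                number = tok;
            } else if (tok.equalsIgnoreCase("Apt") || tok.equalsIgnoreCase("Apt.")) {
                // Assumption: the unit is the very next token, if there is one.
                if (st.hasMoreTokens()) {
                    apt = st.nextToken();
                }
            } else {
                if (street.length() > 0) {
                    street.append(' ');
                }
                street.append(tok);
            }
        }
        return new String[] { number, street.toString(), apt };
    }

    public static void main(String[] args) {
        String[] parts = split("1000 Arbor lane Apt. 21-D");
        System.out.println(parts[0] + " | " + parts[1] + " | " + parts[2]);
    }
}
```

Using plain strings and a StringBuilder avoids the fixed-size arrays and the index bookkeeping that caused the out-of-bounds exception in the question.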

Upvotes: 1
