prodigitalson

Reputation: 60413

Is there a best practice for converting US location strings into individual parts?

Is there a best practice or common algorithm for converting a natural search string for a location (US only) into its separate components?

For example:

City Name, ST 00000

TO

city => City Name
state => ST
zipcode => 00000

This is for a form, so I don't need to handle every possible permutation. I can restrict the format to something like: city, st 00000. However, I need to handle omission of any of the segments within that format, so each one is optional to some extent. Some examples of supported combinations (case-insensitive):

00000 // zipcode
00000-0000 // zipcode (ZIP+4)
city, st // city and state - comma separated
city st // city and state - space separated
city, st 00000 // city state zip
st 00000 // state and zip - though i only really need the zip
city 00000 // city and zip - though i only really need the zip

I can also use a static set of state abbreviations, so those could be matched against to validate a state segment if needed.

Upvotes: 1

Views: 468

Answers (2)

prodigitalson

Reputation: 60413

While I was researching, I found some other code referenced in another SO question, which I used while I was waiting. I modified the code here to support getting the zipcode as well as the city and state: http://www.eotz.com/2008/07/parsing-location-string-php

Others might also find this useful.

@delphist: Thanks. Once I have time to compare accuracy and performance, I may switch to your code if it's better. It's certainly simpler and shorter! If I do, I'll mark it as the accepted answer.

Upvotes: 0

delphist

Reputation: 4549

<?php
    function uslocation($string)
    {
        // Known state abbreviations; fill in the full list (strpos below is case-sensitive)
        $states = array('D.C.', 'D.C', 'DC', 'TX', 'CA', 'ST');

        // Extract state
        $state = '';
        foreach($states as $st)
        {
            $statepos = strpos(' '.$string, $st);
            if($statepos > 0)
            {
                $state = substr($string, $statepos-1, strlen($st));
                $string = substr_replace($string, '', $statepos-1, strlen($st));
            }
        }

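        // Require a leading digit so a lone hyphen (e.g. in "Winston-Salem") is not taken as the zipcode
        if(preg_match('/(\d[\d\-]*)/', $string, $zipcode))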
        {
            $zipcode = $zipcode[1];
            $string = str_replace($zipcode, '', $string);
        }
        else
        {
            $zipcode = '';
        }

        return array(
            'city' => trim(str_replace(',', '', $string)),
            'state' => $state,
            'zipcode' => $zipcode,
        );
    }

    // Some tests
    $check = array(
        'Washington D.C.',
        'City Name TX',
        'City Name, TX',
        'City Name, ST, 0000',
        'NY 7445',
        'TX 23423',
    );

    echo '<pre>';
    foreach($check as $chk)
    {
        echo $chk . ": \n";
        print_r(uslocation($chk));
        echo "\n";
    }
    echo '</pre>';
?>
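
The function above matches state abbreviations anywhere in the string and is case-sensitive. Since the question restricts input to the "city, st 00000" shape, a stricter variant is possible. What follows is a minimal sketch, not a drop-in replacement: it peels the zipcode off the end, then a two-letter state, and treats the remainder as the city. The name parse_us_location and the $states parameter are made up for illustration, and the sketch assumes PHP 7+ and that the caller passes the full list of uppercase abbreviations.

```php
<?php
// Hypothetical helper (not from the answer above): parses input restricted to
// the "city, st 00000" shape, where any segment may be omitted.
function parse_us_location(string $input, array $states): array
{
    $result = ['city' => '', 'state' => '', 'zipcode' => ''];
    $rest = trim($input);

    // 1. Peel a trailing ZIP (5 digits, optional -4 digits) off the end.
    if (preg_match('/(?:^|[\s,])(\d{5}(?:-\d{4})?)$/', $rest, $m)) {
        $result['zipcode'] = $m[1];
        $rest = rtrim(substr($rest, 0, strlen($rest) - strlen($m[1])), " ,\t");
    }

    // 2. Peel a trailing two-letter token, accepted only if it is a known
    //    state. The comparison is case-insensitive via strtoupper().
    if (preg_match('/(?:^|[\s,])([A-Za-z]{2})$/', $rest, $m)
        && in_array(strtoupper($m[1]), $states, true)) {
        $result['state'] = strtoupper($m[1]);
        $rest = rtrim(substr($rest, 0, strlen($rest) - 2), " ,\t");
    }

    // 3. Whatever is left is treated as the city.
    $result['city'] = $rest;

    return $result;
}

// Usage with a deliberately tiny state list (assumption: real code passes all of them).
$states = ['TX', 'NC'];
print_r(parse_us_location('Austin, TX 78701', $states));
print_r(parse_us_location('winston-salem nc', $states));
```

One trade-off: a two-letter city entered on its own that happens to equal a state abbreviation would be read as the state. For a form with a documented input format that may be acceptable.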

Upvotes: 1
