Steve Robbins

Reputation: 13812

Guess city, state, and zip from single string

So I have a location search field that I want to accept pretty much everything (city, state and zip), examples:

And any combination thereof...

From that I split up the words into an array with

$inputs = preg_split("/[\s,\/-]+/", $input);

Which gives me something like

array(5) {
    [0]=> string(4) "Some"
    [1]=> string(4) "City"
    [2]=> string(3) "New"
    [3]=> string(4) "York"
    [4]=> string(5) "88888"
}

Then I pick out the zip code first

foreach ($inputs as $key => $value) {

    // ctype_digit() rejects values like "1e100" or "-1234" that is_numeric() accepts
    if (ctype_digit($value) && strlen($value) == 5) {

        $zip = $value;              
        unset($inputs[$key]);
    }
}

Notice the unset()

Now I need to match the state name to my database of states. The dilemma is that some states have multiple words in the name (North Carolina, New York).

How can I match my $inputs to state names and abbreviations, then remove the matched criteria from my array (I have to do the same thing for cities next)?


I was thinking of trying

// Escape each token before it goes into the query
$inputString = "'" . implode("','", array_map('mysql_real_escape_string', $inputs)) . "'";

$result = mysql_query("SELECT state_name
                      FROM states
                      WHERE state_name IN ({$inputString})
                      OR state_abbrev IN ({$inputString})");

But that doesn't tell me which tokens matched, and it doesn't work for multi-word states.

Edit:

To the haters, I would rather not have 3 separate fields. I think this complicates the user experience. I would rather have the server do the thinking instead of them, to best guess the location they were trying to convey. I'll have an "advanced" search as well, which will have these fields, but all those fields take up too much space for the site design.

Upvotes: 2

Views: 3278

Answers (4)

Clark T.

Reputation: 1470

A possible solution would be to request just a zip code from the user and use the API at http://www.zippopotam.us/ to get the state and city. I'm not sure this fits the UX design you're seeking, but I've done it with jQuery against their API, filling in two fields with the returned values:

   $("#text-4edcd39ecca23").keyup(function (event) {
        if (this.value.length === 5) {
            var $citywrap = $("#fm-item-text-4edcd393cb50f");
            var $city = $("#text-4edcd38744891");
            var $statewrap = $("#fm-item-text-4edcd38744891");
            var $state = $("#text-4edcd393cb50f");
            var $zip = $('#text-4edcd39ecca23');

            $.ajax({
                url:"http://zippo-zippopotamus.dotcloud.com/us/" + $zip.val(),
                cache:false,
                dataType:"json",
                type:"GET",
                data:"us/" + $zip.val(),
                success:function (result, success) {
                    // Remove the error message if one is present
                    $zip.parent().find('small').remove();
                    // US Zip Code Records Officially Map to only 1 Primary Location
                    var places = result['places'][0];
                    $city.val(places['place name']);
                    $state.val(places['state']);
                    $citywrap.slideDown();
                    $statewrap.slideDown();
                },
                error:function (result, success) {
                    $citywrap.slideUp();
                    $statewrap.slideUp();
                    $city.val('');
                    $state.val('');
                    $zip.parent().find('br').remove();
                    $zip.parent().find('small').remove();
                    $zip.after('<br /><small class="error">Sorry, your zip code was not recognized. Please try again.</small>');
                }
            });
        }
    });
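
If the lookup runs on the server instead, the same two fields can be read from the decoded response in PHP. This is only a sketch: it assumes the response has the shape the jQuery code above relies on (`places[0]['place name']` and `places[0]['state']`), the function name `placeFromZipResponse` is made up for illustration, and fetching the body is left out.

```php
<?php
/**
 * Extract city and state from a Zippopotam-style JSON response body.
 * Returns null when the body is not valid JSON or lists no places.
 * (Hypothetical helper, not part of any library.)
 */
function placeFromZipResponse(string $json): ?array
{
    $data = json_decode($json, true);
    if (!is_array($data) || !is_array($data['places'][0] ?? null)) {
        return null;
    }

    // Same fields the jQuery success handler reads.
    $place = $data['places'][0];
    return [
        'city'  => $place['place name'] ?? null,
        'state' => $place['state'] ?? null,
    ];
}

// Sample body trimmed to the two fields used above.
$sample = '{"places":[{"place name":"Beverly Hills","state":"California"}]}';
$place  = placeFromZipResponse($sample);
```

Returning null for an unusable body lets the caller fall back to asking for city and state explicitly, which mirrors the error branch in the jQuery version.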

Upvotes: 0

davesnitty

Reputation: 1850

I completely agree with your idea of making it easy for the user by having all the address info in a single input box. However, each user may enter the information somewhat differently, and it will be very hard to come up with an algorithm that covers every case. The best bet is to see if someone has done this already, and as you mention, Google has. Luckily, they have an API for just such a problem.

If you use the Google Maps Geocoder (https://developers.google.com/maps/documentation/geocoding/#GeocodingRequests), you can basically pass it anything that reasonably looks like an address, and it will return a well-structured address result.

Google's example: https://google-developers.appspot.com/maps/documentation/javascript/examples/geocoding-simple

Another example, looking up the White House: put this URL in your browser: http://maps.googleapis.com/maps/api/geocode/json?address=1600%20pennsylvania%20ave%20washongton%20dc&sensor=false (note that I intentionally misspelled "washington" to show the API's tolerance).

The API call returns a very useful JSON object:

{
   "results" : [
      {
         "address_components" : [
            {
               "long_name" : "1600",
               "short_name" : "1600",
               "types" : [ "street_number" ]
            },
            {
               "long_name" : "Pennsylvania Ave NW",
               "short_name" : "Pennsylvania Ave NW",
               "types" : [ "route" ]
            },
            {
               "long_name" : "Washington",
               "short_name" : "Washington",
               "types" : [ "locality", "political" ]
            },
            {
               "long_name" : "District of Columbia",
               "short_name" : "DC",
               "types" : [ "administrative_area_level_1", "political" ]
            },
            {
               "long_name" : "United States",
               "short_name" : "US",
               "types" : [ "country", "political" ]
            },
            {
               "long_name" : "20502",
               "short_name" : "20502",
               "types" : [ "postal_code" ]
            }
         ],
         "formatted_address" : "1600 Pennsylvania Ave NW, Washington, DC 20502, USA",
         "geometry" : {
            "location" : {
               "lat" : 38.89767770,
               "lng" : -77.03651700000002
            },
            "location_type" : "ROOFTOP",
            "viewport" : {
               "northeast" : {
                  "lat" : 38.89902668029149,
                  "lng" : -77.03516801970852
               },
               "southwest" : {
                  "lat" : 38.89632871970850,
                  "lng" : -77.03786598029153
               }
            }
         },
         "partial_match" : true,
         "types" : [ "street_address" ]
      },
      {
         "address_components" : [
            {
               "long_name" : "1600",
               "short_name" : "1600",
               "types" : [ "street_number" ]
            },
            {
               "long_name" : "Pennsylvania Ave NW",
               "short_name" : "Pennsylvania Ave NW",
               "types" : [ "route" ]
            },
            {
               "long_name" : "Washington",
               "short_name" : "Washington",
               "types" : [ "locality", "political" ]
            },
            {
               "long_name" : "District of Columbia",
               "short_name" : "DC",
               "types" : [ "administrative_area_level_1", "political" ]
            },
            {
               "long_name" : "United States",
               "short_name" : "US",
               "types" : [ "country", "political" ]
            },
            {
               "long_name" : "20500",
               "short_name" : "20500",
               "types" : [ "postal_code" ]
            }
         ],
         "formatted_address" : "1600 Pennsylvania Ave NW, Washington, DC 20500, USA",
         "geometry" : {
            "location" : {
               "lat" : 38.89871490,
               "lng" : -77.03765550
            },
            "location_type" : "ROOFTOP",
            "viewport" : {
               "northeast" : {
                  "lat" : 38.90006388029150,
                  "lng" : -77.03630651970849
               },
               "southwest" : {
                  "lat" : 38.89736591970851,
                  "lng" : -77.03900448029150
               }
            }
         },
         "partial_match" : true,
         "types" : [ "street_address" ]
      },
      {
         "address_components" : [
            {
               "long_name" : "1600",
               "short_name" : "1600",
               "types" : [ "street_number" ]
            },
            {
               "long_name" : "Pennsylvania Ave NW",
               "short_name" : "Pennsylvania Ave NW",
               "types" : [ "route" ]
            },
            {
               "long_name" : "Washington",
               "short_name" : "Washington",
               "types" : [ "locality", "political" ]
            },
            {
               "long_name" : "District of Columbia",
               "short_name" : "DC",
               "types" : [ "administrative_area_level_1", "political" ]
            },
            {
               "long_name" : "United States",
               "short_name" : "US",
               "types" : [ "country", "political" ]
            },
            {
               "long_name" : "20004",
               "short_name" : "20004",
               "types" : [ "postal_code" ]
            }
         ],
         "formatted_address" : "1600 Pennsylvania Ave NW, Washington, DC 20004, USA",
         "geometry" : {
            "location" : {
               "lat" : 38.89549710,
               "lng" : -77.03008090000002
            },
            "location_type" : "ROOFTOP",
            "viewport" : {
               "northeast" : {
                  "lat" : 38.89684608029150,
                  "lng" : -77.02873191970852
               },
               "southwest" : {
                  "lat" : 38.89414811970850,
                  "lng" : -77.03142988029153
               }
            }
         },
         "partial_match" : true,
         "types" : [ "street_address" ]
      },
      {
         "address_components" : [
            {
               "long_name" : "1600",
               "short_name" : "1600",
               "types" : [ "street_number" ]
            },
            {
               "long_name" : "Pennsylvania Ave SE",
               "short_name" : "Pennsylvania Ave SE",
               "types" : [ "route" ]
            },
            {
               "long_name" : "Hill East",
               "short_name" : "Hill East",
               "types" : [ "neighborhood", "political" ]
            },
            {
               "long_name" : "Washington",
               "short_name" : "Washington",
               "types" : [ "locality", "political" ]
            },
            {
               "long_name" : "District of Columbia",
               "short_name" : "DC",
               "types" : [ "administrative_area_level_1", "political" ]
            },
            {
               "long_name" : "United States",
               "short_name" : "US",
               "types" : [ "country", "political" ]
            },
            {
               "long_name" : "20003",
               "short_name" : "20003",
               "types" : [ "postal_code" ]
            }
         ],
         "formatted_address" : "1600 Pennsylvania Ave SE, Washington, DC 20003, USA",
         "geometry" : {
            "bounds" : {
               "northeast" : {
                  "lat" : 38.87865290,
                  "lng" : -76.98170180
               },
               "southwest" : {
                  "lat" : 38.87865220,
                  "lng" : -76.98170229999999
               }
            },
            "location" : {
               "lat" : 38.87865290,
               "lng" : -76.98170180
            },
            "location_type" : "RANGE_INTERPOLATED",
            "viewport" : {
               "northeast" : {
                  "lat" : 38.88000153029150,
                  "lng" : -76.98035306970850
               },
               "southwest" : {
                  "lat" : 38.87730356970850,
                  "lng" : -76.98305103029151
               }
            }
         },
         "partial_match" : true,
         "types" : [ "street_address" ]
      }
   ],
   "status" : "OK"
}    
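
Pulling city, state and zip out of that structure in PHP could look roughly like this. It is a sketch based only on the response shown above: it matches components by their `types` entries (`locality`, `administrative_area_level_1`, `postal_code`), and the function name `locationFromGeocode` is made up for illustration.

```php
<?php
/**
 * Pull city, state and zip out of one geocoder result by looking at the
 * "types" of each address component. (Hypothetical helper.)
 */
function locationFromGeocode(array $result): array
{
    // Component type => key in our own result array.
    $wanted = [
        'locality'                    => 'city',
        'administrative_area_level_1' => 'state',
        'postal_code'                 => 'zip',
    ];
    $location = ['city' => null, 'state' => null, 'zip' => null];

    foreach ($result['address_components'] ?? [] as $component) {
        foreach ($wanted as $type => $key) {
            if (in_array($type, $component['types'] ?? [], true)) {
                // short_name gives "DC" rather than "District of Columbia".
                $location[$key] = $component['short_name'] ?? null;
            }
        }
    }
    return $location;
}

// First result from the response above, trimmed to the relevant components.
$json = '{"results":[{"address_components":[
  {"long_name":"Washington","short_name":"Washington","types":["locality","political"]},
  {"long_name":"District of Columbia","short_name":"DC","types":["administrative_area_level_1","political"]},
  {"long_name":"20502","short_name":"20502","types":["postal_code"]}
]}],"status":"OK"}';

$response = json_decode($json, true);
$location = locationFromGeocode($response['results'][0]);
```

Since the API can return several candidates (four in the example), you would still have to decide whether to take the first result or let the user pick.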

Upvotes: 0

Steve Robbins

Reputation: 13812

This is what I'm using currently, but there are so many loops and queries that I doubt it's efficient or "guesses" very accurately:

function getLocations($input) {

    $state = NULL;
    $zip = NULL;

    $input = strtoupper(trim($input));

    $inputs = preg_split("/[^a-zA-Z0-9]+/", $input);

    // Resolve zip code
    foreach ($inputs as $key => $value) {

        // ctype_digit() rejects values like "1e100" or "-1234" that is_numeric() accepts
        if (ctype_digit($value) && strlen($value) == 5) {

            $zip = $value;              
            unset($inputs[$key]);
        }
    }

    $inputs = array_reverse($inputs);

    $result = mysql_query("SELECT state_name, state_abbrev FROM states");

    // Resolve state (one worded)
    while ($row = mysql_fetch_assoc($result)) {

        foreach ($inputs as $key => $value) {

            if ($row['state_abbrev'] == $value || $row['state_name'] == $value) {

                $state = $row['state_abbrev'];
                unset($inputs[$key]);

                return array(
                    'city' => "'" . implode(" ", array_reverse($inputs)) . "'",
                    'state' => "'" . $state . "'",
                    'zip' => "'" . $zip . "'"
                );
            }
        }
    }

    // Resolve state (2/3 worded)
    for ($i = 0; $i < count($inputs) - 1; $i++) {

        $duoValue = @$inputs[$i + 1] . " " . @$inputs[$i];

        if (count($inputs) > $i + 2) {

            $trioValue = $inputs[$i + 2] . " " . $duoValue;
        }

        $result2 = mysql_query("SELECT state_name, state_abbrev FROM states") or die (mysql_error());

        while ($row = mysql_fetch_assoc($result2)) {

            if ($row['state_abbrev'] == $duoValue || $row['state_name'] == $duoValue) {

                $state = $row['state_abbrev'];
                unset($inputs[$i], $inputs[$i + 1]);

                return array(
                    'city' => "'" . implode(" ", array_reverse($inputs)) . "'",
                    'state' => "'" . $state . "'",
                    'zip' => "'" . $zip . "'"
                );
            }
            else if ($i < count($inputs) - 2) {

                if ($row['state_abbrev'] == $trioValue || $row['state_name'] == $trioValue) {

                    $state = $row['state_abbrev'];
                    unset($inputs[$i], $inputs[$i + 1], $inputs[$i + 2]);

                    return array(
                        'city' => "'" . implode(" ", array_reverse($inputs)) . "'",
                        'state' => "'" . $state . "'",
                        'zip' => "'" . $zip . "'"
                    );
                }
            }
        }
    }

    return array(
        'city' => "'" . implode(" ", array_reverse($inputs)) . "'",
        'state' => "'" . $state . "'",
        'zip' => "'" . $zip . "'"
    );
}
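
One way to cut down the loops and queries, sketched under a few assumptions: the states table is small enough to load once into an array keyed by uppercase name and abbreviation, and the state usually comes after the city. Scanning from the right and trying the longest phrase first also avoids a problem in the function above, where single words are checked before multi-word names, so "WEST VIRGINIA" would resolve to Virginia. The names `guessLocation` and `$states` are made up for illustration.

```php
<?php
/**
 * Split a free-form location string into city, state and zip.
 * $states maps UPPERCASE state names and abbreviations to abbreviations.
 * (Hypothetical helper; values are returned raw, without SQL quoting.)
 */
function guessLocation(string $input, array $states): array
{
    $tokens = preg_split('/[^A-Z0-9]+/', strtoupper($input), -1, PREG_SPLIT_NO_EMPTY);
    $zip    = null;
    $state  = null;

    // Treat any 5-digit token as the zip; if there are several, the last wins.
    foreach ($tokens as $i => $token) {
        if (preg_match('/^\d{5}$/', $token)) {
            $zip = $token;
            unset($tokens[$i]);
        }
    }
    $tokens = array_values($tokens);

    // Walk end positions from the right; at each one try 3-, 2-, then 1-word phrases.
    for ($end = count($tokens); $end >= 1 && $state === null; $end--) {
        for ($len = min(3, $end); $len >= 1; $len--) {
            $phrase = implode(' ', array_slice($tokens, $end - $len, $len));
            if (isset($states[$phrase])) {
                $state = $states[$phrase];
                array_splice($tokens, $end - $len, $len); // drop the matched words
                break;
            }
        }
    }

    return ['city' => implode(' ', $tokens), 'state' => $state, 'zip' => $zip];
}

// Hypothetical lookup table; in practice it would be built once from the states table.
$states = [
    'NEW YORK' => 'NY', 'NY' => 'NY',
    'VIRGINIA' => 'VA', 'VA' => 'VA',
    'WEST VIRGINIA' => 'WV', 'WV' => 'WV',
    'DISTRICT OF COLUMBIA' => 'DC', 'DC' => 'DC',
];

$a = guessLocation('Some City, New York 88888', $states);
$b = guessLocation('Charleston West Virginia', $states);
```

It is still a guess: "Virginia Beach" with no state given would come back as city "BEACH" in VA, so a city lookup on the remaining words is needed to catch cases like that.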

Upvotes: 1

RandomSeed

Reputation: 29769

You could add a column to your address table that contains the concatenation of City name, State name, Zip code, and so on. Then set a FULLTEXT index on it and run a full text search of your whole input string on it.

Not sure how well this performs, though.
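
A rough sketch of what that could look like with MySQL's natural-language full-text search. The table and column names (`locations`, `search_text`, `city`, `state_abbrev`, `zip`) and the function name `buildLocationSearch` are placeholders, not anything from the question. One caveat: MySQL's full-text indexes skip short words by default (`ft_min_word_len` for MyISAM, `innodb_ft_min_token_size` for InnoDB), so two-letter abbreviations such as NY may not match unless that setting is lowered.

```php
<?php
/**
 * Build a ranked full-text lookup for a free-form location string.
 * Assumes an index created along the lines of:
 *   ALTER TABLE locations ADD FULLTEXT INDEX (search_text);
 * (Table, column and function names are placeholders.)
 */
function buildLocationSearch(string $input): array
{
    // Normalise separators to single spaces.
    $terms = trim(preg_replace('/[^A-Za-z0-9]+/', ' ', $input));

    // MATCH ... AGAINST appears twice: once to rank, once to filter.
    $sql = 'SELECT city, state_abbrev, zip,
                   MATCH(search_text) AGAINST (?) AS score
              FROM locations
             WHERE MATCH(search_text) AGAINST (?)
             ORDER BY score DESC
             LIMIT 5';

    return [$sql, [$terms, $terms]];
}

[$sql, $params] = buildLocationSearch('Some City, New York 88888');
// With a PDO connection: $stmt = $pdo->prepare($sql); $stmt->execute($params);
```

Returning the top few rows by score rather than a single row leaves room to show the user a "did you mean" list when the best guess is weak.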

Upvotes: 2
