jonatr

Reputation: 380

Splitting an address string into numeric and text parts without using regular expressions

I'm working in C# and I have lots of strings containing addresses, such as:

10 Downing Street
Birch Lane 7 
Palm Creek 8 Street
84 Chancellor Place
Battle on the Somme 56

and so on.

I need to split these strings into a numeric part (such as "10" or "7") and a textual part (such as "Downing Street" or "Birch Lane").

Oh, and I was asked not to use RegEx.

I've already tried splitting them on spaces, like this:

string s = "84 Chancellor place";
string[] words = s.Split(' ');

The problem is that (of course) this doesn't split every string the same way, so I can't reliably separate the number from the rest of the text: I don't know that the number is always in words[0], for example, and the textual part ends up spread over several array elements instead of joined together.

I would much appreciate help finding a way to extract the digits.

Edit: Desired output for each example:

string1=10  string2=Downing Street
string1=7   string2=Birch Lane 
string1=8   string2=Palm Creek Street
string1=84  string2=Chancellor Place
string1=56  string2=Battle on the Somme

Upvotes: 2

Views: 3012

Answers (1)

Tim Schmelter

Reputation: 460108

You could use this loop to initialize a List<Address> with the help of string.Split and int.TryParse:

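// Requires: using System; using System.Collections.Generic; using System.Linq;
// `strings` is assumed to be the collection of address strings from the question.
List<Address> addresses = new List<Address>();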
foreach (string str in strings)
{
    Address addr = new Address();
    addresses.Add(addr);
    int num, numIndex = int.MinValue;
    string[] tokens = str.Split(new[]{' '}, StringSplitOptions.RemoveEmptyEntries);
    for (int i = 0; i < tokens.Length; i++)
    {
        if (int.TryParse(tokens[i], out num))
        {
            addr.Number = num;
            numIndex = i;
            break;
        }
    }
    if (addr.Number.HasValue)
    {
        // join the remaining tokens with spaces to form the street name, skipping the number
        addr.Street = string.Join(" ", tokens.Where((s, i) => i != numIndex));
    }
    else
    {
        addr.Street = str;
    }
}

It uses this small class:

class Address
{
    public int? Number { get; set; } // nullable, so HasValue is false when no number was found
    public string Street { get; set; }
}
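
For reference, here is a minimal sketch of the same Split/TryParse approach wrapped in a method that handles one string at a time. The names AddressParser and Parse are hypothetical, and Number is assumed to be declared as int? so that a missing number can be represented as null:

```csharp
using System;
using System.Linq;

class Address
{
    public int? Number { get; set; }   // null when no purely numeric token exists
    public string Street { get; set; }
}

static class AddressParser
{
    // Hypothetical helper: treats the first token that parses as an int as the number
    // and joins every other token into the street name.
    public static Address Parse(string str)
    {
        string[] tokens = str.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // Index of the first numeric token, or -1 if there is none.
        int numIndex = Array.FindIndex(tokens, t => int.TryParse(t, out _));
        if (numIndex < 0)
            return new Address { Street = str };

        return new Address
        {
            Number = int.Parse(tokens[numIndex]),
            Street = string.Join(" ", tokens.Where((t, i) => i != numIndex))
        };
    }
}
```

With this sketch, AddressParser.Parse("Palm Creek 8 Street") yields Number 8 and Street "Palm Creek Street", and a string without any numeric token keeps Number as null and Street as the original input.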


Disclaimer: note that this is not fail-safe at all if the input is arbitrary. There are many streets in the world whose names contain numbers as well, and there are also house numbers with letters, like "17a".
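
If tokens like "17a" need to be recognized, one possible regex-free direction is to read the leading digits of a token and treat the remaining letters as a suffix. This is only a sketch: HouseNumberToken and TrySplit are hypothetical names, and it does nothing about street names that themselves contain numbers:

```csharp
using System;
using System.Linq;

static class HouseNumberToken
{
    // Hypothetical helper: splits a token such as "17a" into 17 and "a".
    // Returns false if the token does not start with digits, if the digits do not
    // fit in an int, or if anything other than letters follows the digits.
    public static bool TrySplit(string token, out int number, out string suffix)
    {
        number = 0;
        suffix = "";

        string digits = new string(token.TakeWhile(c => char.IsDigit(c)).ToArray());
        if (digits.Length == 0 || !int.TryParse(digits, out number))
            return false;

        string rest = token.Substring(digits.Length);
        if (!rest.All(c => char.IsLetter(c)))
        {
            number = 0;
            return false;
        }

        suffix = rest;
        return true;
    }
}
```

A check like this could take the place of the plain int.TryParse test in the token loop; whether a suffix such as "a" belongs to the number or to the street name depends on the data at hand.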

Upvotes: 1
