Reputation: 31
I have an array of alphanumeric postal codes that could look something like this:
PostalCodes(0) = "AA000-DD130"
PostalCodes(1) = "DD131-DD150"
PostalCodes(2) = "DD151-EE180"
PostalCodes(3) = "EE300-EE600"
PostalCodes(4) = "EE450-EE700"
PostalCodes(5) = "EE800"
PostalCodes(6) = "EE810"
PostalCodes(7) = "EE811"
PostalCodes(8) = "EE812"
PostalCodes(9) = "EE813"
PostalCodes(10) = "EE814"
PostalCodes(11) = "EE815"
And I want it to optimize to something like this:
PostalCodes(0) = "AA000-EE180"
PostalCodes(1) = "EE300-EE700"
PostalCodes(2) = "EE800"
PostalCodes(3) = "EE810-EE815"
As you can see ranges can overlap or there might be gaps, its ok, I only want to optimize(reduce) the postal codes as much as I can. I already have a code using For loops but I would like to know if there is a way to use Linq to do this task faster and improve performance?
I'm using vb.net.
Thanks in advance.
Upvotes: 0
Views: 81
Reputation: 117124
This pretty much does it:
Dim PostalCodes(11) As String
PostalCodes(0) = "AA000-DD130"
PostalCodes(1) = "DD131-DD150"
PostalCodes(2) = "DD151-EE180"
PostalCodes(3) = "EE300-EE600"
PostalCodes(4) = "EE450-EE700"
PostalCodes(5) = "EE800"
PostalCodes(6) = "EE810"
PostalCodes(7) = "EE811"
PostalCodes(8) = "EE812"
PostalCodes(9) = "EE813"
PostalCodes(10) = "EE814"
PostalCodes(11) = "EE815"
Dim splits = _
PostalCodes _
.Select(Function (x) If(x.Contains("-"), x.Split("-"c), { x, x })) _
.Select(Function (ps) ps.Select(Function (p) New With _
{ _
.Prefix = p.Substring(0, 2), _
.Value = Integer.Parse(p.Substring(2)) _
}).ToArray()) _
.ToArray()
Dim results = _
splits _
.Skip(1) _
.Aggregate( _
splits.Take(1).ToList(), _
Function (a, x)
Dim l = a.Last()
If x(0).Prefix = l(1).Prefix AndAlso x(0).Value <= l(1).Value + 1 Then
a.RemoveAt(a.Count - 1)
a.Add( _
{ _
New With _
{ _
.Prefix = l(0).Prefix, _
.Value = l(0).Value _
}, _
New With _
{ _
.Prefix = l(1).Prefix, _
.Value = x(1).Value _
} _
})
Else
a.Add(x)
End If
Return a
End Function) _
.Select(Function (xs) String.Format("{0}{1:000}-{2}{3:000}", xs(0).Prefix, xs(0).Value, xs(1).Prefix, xs(1).Value)) _
.ToArray()
It gives:
AA000-DD180 EE300-EE700 EE800-EE800 EE810-EE815
A small bit of work remaining to get rid of the double "EE800-EE800"
.
Upvotes: 1
Reputation: 26927
Using an extension method to convert ranges into tuples with StartPrefix, Start, EndPrefix, End values:
<Extension()>
Public Function RangeToTuple(ByVal strRange As String) As (StartPrefix As String, Start As Integer, EndPrefix As String, [End] As Integer
If strRange.Contains("-") Then
Dim twoRanges = strRange.Split("-")
Dim intRanges = twoRanges.Select(Function(rs) rs.Substring(2).ToInteger()).ToList()
Return (twoRanges(0).Substring(0, 2), intRanges(0), twoRanges(1).Substring(0, 2), intRanges(1))
Else
Dim intRange = strRange.Substring(2).ToInteger()
Return (strRange.Substring(0, 2), intRange, strRange.Substring(0, 2), intRange)
End If
End Function
Which uses a simple extension for converting String to Integer:
<Extension()>
Public Function ToInteger(s As String) As Integer
Return Convert.ToInt32(s)
End Function
Using a custom extension method that is a variation of the APL scan operator that works like Aggregate
but returns the intermediate values, and uses a ValueTuple
to return the original values with the intermediate calculations:
<Extension()>
Public Iterator Function ScanPair(Of T, TKey)(src As IEnumerable(Of T), seedKey As TKey, combine As Func(Of (Key As TKey, Value As T), T, TKey)) As IEnumerable(Of (Key As TKey, Value As T))
Using srce = src.GetEnumerator()
If srce.MoveNext() Then
Dim prevkv = (seedKey, srce.Current)
While srce.MoveNext()
Yield prevkv
prevkv = (combine(prevkv, srce.Current), srce.Current)
End While
Yield prevkv
End If
End Using
End Function
And an extension method that groups based on a predicate:
<Extension()>
Public Function GroupByWhile(Of T, TRes)(src As IEnumerable(Of T), test As Func(Of T, T, Boolean), result As Func(Of T, TRes)) As IEnumerable(Of IGrouping(Of Integer, TRes))
Return src.ScanPair(1, Function(kvp, cur) If(test(kvp.Value, cur), kvp.Key, kvp.Key + 1)) _
.GroupBy(Function(kvp) kvp.Key, Function(kvp) result(kvp.Value))
End Function
<Extension()>
Public Function GroupByWhile(Of T)(src As IEnumerable(Of T), test As Func(Of T, T, Boolean)) As IEnumerable(Of IGrouping(Of Integer, T))
Return src.GroupByWhile(test, Function(e) e)
End Function
Then you can use LINQ to process the tuples:
Dim combinedRanges = input.Select(Function(rs) rs.RangeToTuple()) _
.GroupByWhile(Function(prev, cur) prev.EndPrefix = cur.StartPrefix And cur.Start <= prev.End+1 And prev.End <= cur.End) _
.Select(Function(kvpg) (kvpg.First().StartPrefix, kvpg.First().Start, kvpg.Last().EndPrefix, kvpg.Last().End)) _
.Select(Function(tp) If(tp.Start = tp.End, $"{tp.StartPrefix}{tp.Start}", _
$"{tp.StartPrefix}{tp.Start}-{tp.EndPrefix}{tp.End}"))
NOTE: After writing this answer, I realized I could generalize my sequence grouping constructs to group on a predicate, and make it more accessible.
Converted to VB per comment.
Upvotes: 0