Reputation: 1616
I'm wondering if anyone can come up with a way to implement an array of numbers in a more memory efficient manner that will auto-organise itself into ranges. Example;
List testList = new List{1,2,3,4,5,6,7...};
vs
List<Range> testList = new List<Range>{1-3000,3002,4000-5000...};
Previously, I have asked a question just to confirm about whether or not this would in fact be a more memory efficient alternative. This question however pertains to actual application, how to implement this range list solution.
I imagine this would perhaps need to be a custom list solution that would be a mix of ints and ranges. I'm picturing being able to .Add([int]) to the list, at which point it would determine if the value would cause a range to be added or to simply add the int value to the list.
Example
RangeList rangeList = new RangeList{1, 4, 7-9};
rangeList.Add(2);
//rangeList -> 1-2, 4, 7-9
rangeList.Add(3);
//rangeList -> 1-3, 4, 7-9
Details specific to my implementation
In my particular case, I'm analysing a very large document, line by line. Lines that meet a certain criteria need to be identified and then the overall list of line indexes need to be presented to the user.
Obviously displaying "Lines 33-32019 identified" is preferable to "Lines 33,34,35...etc". For this case, numbers will always be positive.
Upvotes: 2
Views: 328
Reputation: 136239
The first thing I would do is make a class which represents your range. You can provide some convenience like formatting as a string, and having an implicit cast from an int (This helps later implementation of the range list)
public class Range
{
public int Start{get; private set;}
public int End{get; private set;}
public Range(int startEnd) : this(startEnd,startEnd)
{
}
public Range(int start, int end)
{
this.Start = start;
this.End = end;
}
public static implicit operator Range(int i)
{
return new Range(i);
}
public override string ToString()
{
if(Start == End)
return Start.ToString();
return String.Format("{0}-{1}",Start,End);
}
}
You can then begin a simple implementation of the RangeList
. By providing an Add
method you can use a list initializer similar to List<T>
:
public class RangeList : IEnumerable<Range>
{
private List<Range> ranges = new List<Range>();
public void Add(Range range)
{
this.ranges.Add(range);
}
public IEnumerator<Range> GetEnumerator()
{
return this.ranges.GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator(){
return this.GetEnumerator();
}
}
At this point you can write some test code:
var rangeList = new RangeList(){
new Range(1,10),
15
};
foreach(var range in rangeList)
Console.WriteLine(range);
// Outputs:
// 1-10
// 15
Live example at this point: http://rextester.com/NCZSA71850
The next thing to do is provide an overload of Add
which takes an int and finds the right range or adds a new one. A naive implemntation might look like the below (Assuming the addition of an Update
method on range)
public void Add(int i)
{
// is it within or contiguous to an existing range
foreach(var range in ranges)
{
if(i>=range.Start && i<=range.End)
return; // already in a range
if(i == range.Start-1)
{
range.Update(i,range.End);
return;
}
if(i == range.End + 1)
{
range.Update(range.Start,i);
return;
}
}
// not in any ranges
ranges.Add(i);
}
Live example at this point: http://rextester.com/CHX64125
However this suffers from a few deficiencies
Add(11)
)Add(7)
this will be at the end not in the middle.You can solve both problems by applying a sort after each addition, and some logic to determine if you should merge ranges
private void SortAndMerge()
{
ranges.Sort((a,b) => a.Start - b.Start);
var i = ranges.Count-1;
do
{
var start = ranges[i].Start;
var end = ranges[i-1].End;
if(end == start-1)
{
// merge and remove
ranges[i-1].Update(ranges[i-1].Start,ranges[i].End);
ranges.RemoveAt(i);
}
} while(i-- >1);
}
This needs to be called after every change to the list.
public void Add(Range range)
{
this.ranges.Add(range);
SortAndMerge();
}
public void Add(int value)
{
// is it within or contiguous to an existing range
foreach(var range in ranges)
{
if(value>=range.Start && value<=range.End)
return; // already in a range
if(value == range.Start-1)
{
range.Update(value,range.End);
SortAndMerge();
return;
}
if(value == range.End + 1)
{
range.Update(range.Start,value);
SortAndMerge();
return;
}
}
// not in any ranges
ranges.Add(value);
SortAndMerge();
}
Live example here: http://rextester.com/SYLARF47057
There are still some possible edge cases with this, which I urge you to work through.
UPDATE
The below will get this working as expected. This will merge up any added ranges/ints as you would expect and returns them correctly sorted. I've only changed the Add(Range) method, I think this is a fairly clean way of doing this.
public void Add(Range rangeToAdd)
{
var mergableRange = new List<Range>();
foreach (var range in ranges)
{
if (rangeToAdd.Start == range.Start && rangeToAdd.End == range.End)
return; // already exists
if (mergableRange.Any())
{
if (rangeToAdd.End >= range.Start - 1)
{
mergableRange.Add(range);
continue;
}
}
else
{
if (rangeToAdd.Start >= range.Start - 1
&& rangeToAdd.Start <= range.End + 1)
{
mergableRange.Add(range);
continue;
}
if (range.Start >= rangeToAdd.Start
&& range.End <= rangeToAdd.End)
{
mergableRange.Add(range);
continue;
}
}
}
if (!mergableRange.Any()) //Standalone range
{
ranges.Add(rangeToAdd);
}
else //merge overlapping ranges
{
mergableRange.Add(rangeToAdd);
var min = mergableRange.Min(x => x.Start);
var max = mergableRange.Max(x => x.End);
foreach (var range in mergableRange) ranges.Remove(range);
ranges.Add(new Range(min, max));
}
SortAndMerge();
}
Finally, we need if (ranges.Count > 1)
in the SortAndMerge()
method to prevent an index error when the first range is added.
And with that, I think this fully satisfies my question.
Upvotes: 3