Vilx-
Vilx-

Reputation: 106912

Need an algorithm to split a series of numbers

After a few busy nights my head isn't working so well, but this needs to be fixed yesterday, so I'm asking the more refreshed community of SO.

I've got a series of numbers. For example:

1, 5, 7, 13, 3, 3, 4, 1, 8, 6, 6, 6

I need to split this series into three parts so the sum of the numbers in all parts is as close as possible. The order of the numbers needs to be maintained, so the first part must consist of the first X numbers, the second - of the next Y numbers, and the third - of whatever is left.

What would be the algorithm to do this?

(Note: the actual problem is to arrange text paragraphs of differing heights into three columns. Paragraphs must maintain order (of course) and they may not be split in half. The columns should be as equal of height as possible.)

Upvotes: 11

Views: 1102

Answers (5)

Zaphood
Zaphood

Reputation: 2647

Lets say p is your array of paragraph heights;

int len= p.sum()/3;   //it is avarage value
int currlen=0;
int templen=0;
int indexes[2]; 
int j = 0;
for (i=0;i<p.lenght;i++)
{
    currlen = currlen + p[i];
    if (currlen>len)
    {
        if ((currlen-len)<(abs((currlen-p[i])-len))
        { //check which one is closer to avarege val
            indexes[j++] = i;
            len=(p.sum()-currlen)/2         //optional: count new avearege height from remaining lengths
            currlen = 0;
        }
        else
        {
            indexes[j++] = i-1;
            len=(p.sum()-currlen)/2
            currlen = p[i];
        }
    }
    if (j>2)
        break;
}

You will get starting index of 2nd and 3rd sequence. Note its kind of pseudo code :)

Upvotes: 4

Lior Kogan
Lior Kogan

Reputation: 20608

First, we'll need to define the goal better:

Suppose the partial sums are A1,A2,A3, We are trying to minimize |A-A1|+|A-A2|+|A-A3|. A is the average: A=(A1+A2+A3)/3.

Therefore, we are trying to minimize |A2+A3-2A1|+|A1+A3-2A2|+|A1+A2-2A3|.

Let S denote the sum (which is constant): S=A1+A2+A3, so A3=S-A1-A2.

We're trying to minimize:

|A2+S-A1-A2-2A1|+|A1+S-A1-A2-2A2|+|A1+A2-2S+2A1+2A2|=|S-3A1|+|S-3A2|+|3A1+SA2-2S|

Denoting this function as f, we can do two loops O(n^2) and keep track of the minimum:

Something like:

for (x=1; x<items; x++)
{
    A1= sum(Item[0]..Item[x-1])
    for (y=x; y<items; y++)
    {
        A2= sum(Item[x]..Item[y-1])
        calc f, if new minimum found -keep x,y
    }
}

Upvotes: 6

Ricky Bobby
Ricky Bobby

Reputation: 7608

Following Aasmund Eldhuset answer, I previously answerd this question on SO.

Word wrap to X lines instead of maximum width (Least raggedness)

This algo doesn't rely on the max line size but just gives an optimal cut.

I modified it to work with your problem :

L=[1,5,7,13,3,3,4,1,8,6,6,6]

def minragged(words, n=3):


P=2
cumwordwidth = [0]
# cumwordwidth[-1] is the last element
for word in words:
    cumwordwidth.append(cumwordwidth[-1] + word)
totalwidth = cumwordwidth[-1] + len(words) - 1  # len(words) - 1 spaces
linewidth = float(totalwidth - (n - 1)) / float(n)  # n - 1 line breaks

print "number of words:", len(words)
def cost(i, j):
    """
    cost of a line words[i], ..., words[j - 1] (words[i:j])
    """
    actuallinewidth = max(j - i - 1, 0) + (cumwordwidth[j] - cumwordwidth[i])
    return (linewidth - float(actuallinewidth)) ** P

"""
printing the reasoning and reversing the return list
"""
F={} # Total cost function

for stage in range(n):
    print "------------------------------------"
    print "stage :",stage
    print "------------------------------------"
    print "word i to j in line",stage,"\t\tTotalCost (f(j))"
    print "------------------------------------"


    if stage==0:
        F[stage]=[]
        i=0
        for j in range(i,len(words)+1):
            print "i=",i,"j=",j,"\t\t\t",cost(i,j)
            F[stage].append([cost(i,j),0])
    elif stage==(n-1):
        F[stage]=[[float('inf'),0] for i in range(len(words)+1)]
        for i in range(len(words)+1):
                j=len(words)
                if F[stage-1][i][0]+cost(i,j)<F[stage][j][0]: #calculating min cost (cf f formula)
                    F[stage][j][0]=F[stage-1][i][0]+cost(i,j)
                    F[stage][j][1]=i
                    print "i=",i,"j=",j,"\t\t\t",F[stage][j][0]            
    else:
        F[stage]=[[float('inf'),0] for i in range(len(words)+1)]
        for i in range(len(words)+1):
            for j in range(i,len(words)+1):
                if F[stage-1][i][0]+cost(i,j)<F[stage][j][0]:
                    F[stage][j][0]=F[stage-1][i][0]+cost(i,j)
                    F[stage][j][1]=i
                    print "i=",i,"j=",j,"\t\t\t",F[stage][j][0]

print 'reversing list'
print "------------------------------------"
listWords=[]
a=len(words)
for k in xrange(n-1,0,-1):#reverse loop from n-1 to 1
    listWords.append(words[F[k][a][1]:a])
    a=F[k][a][1]
listWords.append(words[0:a])
listWords.reverse()

for line in listWords:
    print line, '\t\t',sum(line)

return listWords

THe result I get is :

[1, 5, 7, 13]       26
[3, 3, 4, 1, 8]         19
[6, 6, 6]       18
[[1, 5, 7, 13], [3, 3, 4, 1, 8], [6, 6, 6]]

Hope it helps

Upvotes: 2

vikas368
vikas368

Reputation: 1468

find sum and cumulative sum of series.

get a= sum/3

then locate nearest a, 2*a in the cumulative sum which divides your list into three equal parts.

Upvotes: 4

Aasmund Eldhuset
Aasmund Eldhuset

Reputation: 37950

I believe that this can be solved with a dynamic programming algorithm for line breaking invented by Donald Knuth for use in TeX.

Upvotes: 3

Related Questions