thekevinscott
thekevinscott

Reputation: 5413

Algorithm recommendations for breaking up array of strings in most balanced way

I have an array of strings, of arbitrary length (say, 30-45), that I want to reformat to fit on a certain number of pages (say, 15).

I'd like to distribute the strings among the pages as evenly as possible, so that all pages are as close to each other's total character length as possible, regardless of total number of strings per page. I also need to preserve string order (so I can't rearrange the array).

Are there any particular algorithms you'd recommend for tackling this? Or vague approaches you'd take? Thanks!

Upvotes: 4

Views: 225

Answers (4)

LiKao
LiKao

Reputation: 10658

You could use the algorithm for generating euclidian rythms. Euclidian rythms are rythms which are spread out as evenly as possible over a number of beats. So if you have four beats that you want to spread out over 10 positions you would get:

..x.x..x.x

Now if you have 10 strings and you would want to spread them out on four pages, you would just add a page break after each string marked with x:

string1
string2
string3

string4
string5

string6
string7
string8

string9
string10

That way you achieve an almost constant number of strings per page and the shorter pages also get spread out evenly among all pages.

The algorithm is fairly simple and based on the euclidian algorithm for calculating the gcd and can be implemented in a few lines. Also it should be fairly fast even with a large number of pages and elements.

Upvotes: 0

AShelly
AShelly

Reputation: 35540

Isn't this as simple as dividing the total characters by the total pages and adding sentences until you get close to the target chars per page? Eventually you get to a sentence which would span pages if it could break in the middle. If the majority of that sentence fits on the current page, place it, else postpone it for the next page.

chars_left = 0
chars_per_page = total_chars / total_pages
for i = 0 .. total_pages
   chars_left += chars_per_page
   while (chars_left > 0)
       s = get_next_sentence
       if s.length/2 > chars_left then break
       page.add( s)
       chars_left -= s.length
   endwhile
endfor

Upvotes: 0

mcdowella
mcdowella

Reputation: 19601

One way would be to format your text with http://en.wikipedia.org/wiki/TeX - it's line breaking algorithm is optimal, and based on dynamic programming. Unfortunately its page breaking algorithm is not optimal, although I expect it is as good as you will find easily.

If you can model each page as having space for a fixed number of characters, then there is indeed a dynamic programming solution. You need to find a way to put 14 page-breaks at the optimal place. Work from left to right, and at each possible place for a page break, work out an total unevenness penalty for the best possible way of inserting k-1 page breaks in the previous text, terminating with the possible place the k-th page break. Do this for k = 1..14. You can use the information to the left that you have previously calculated in working out the total penalty at the new place.

When you get to the end of the text, you can use the calculations so far to work out the uneven-ness penalty for the best possible way to insert 14 page breaks to the left. If you have kept records of calculations to the left, you can also work out where the rightmost of the 14 page breaks should go. You can go back to there and work out where the 13th page break should be, and so on until you have found the locations of all the page breaks.

Upvotes: 2

rossum
rossum

Reputation: 15685

I would approach this is two stages, first build an approximate solution and then improve that solution.

First go through your list of strings and assign each string in turn to the page with the most space left. You may want to check if there is enough space to redistribute the last page strings onto the earlier pages, so reducing the number of pages needed.

Second, pick the pages with most and least space left. Swap a shorter string from one with a longer string from the other so the space left on the two pages is closer. Repeat (making sure not to get into an infinite loop) until you have something reasonably balanced across all the pages.

This is an approximate solution, not an exact one, but it should be able to produce a reasonable result fairly quickly.

Upvotes: 1

Related Questions