David Mulder

Reputation: 27010

Split text into pages and present separately (HTML5)

Let's say we have a long text like Romeo & Juliet and we want to present this in a simple ereader (no animations, only pages and custom font-size). What approaches exist to get this?

What I have come up with so far:

Yet none of these seems acceptable: the first didn't give enough control to even get it to work, the second isn't supported yet, the third is hard and loses text selection, and the fourth gives a ridiculous overhead. So are there any good approaches I haven't thought of yet, or ways to solve one or more disadvantages of the mentioned methods? (Yes, I am aware this is a fairly open question, but the more open it is, the higher the chance of producing relevant answers.)

Upvotes: 27

Views: 27935

Answers (6)

alphakevin

Reputation: 544

Another idea is to use CSS columns to split the HTML content. The reflow is done by the browser itself, so it is very fast. The next step is inserting each page's content into the DOM. I did this by duplicating the whole column layout for every page and shifting each copy so that the right page shows through a cropped window. See the CodePen example:

https://codepen.io/alphakevin/pen/eXqbQP

const pageWidth = 320;
const content = document.getElementById('content');
const totalWidth = content.scrollWidth;
const totalPages = Math.ceil(totalWidth / pageWidth); // round up so a partly filled last page counts as a page
console.log('totalPages', totalPages);

let contentVisible = true;
const button = document.getElementById('btn-content');
const buttonText = document.getElementById('btn-content-text');
const showHideContent = () => {
  contentVisible = !contentVisible;
  content.style.display = contentVisible ? 'block' : 'none';
  buttonText.innerText = contentVisible ? 'Hide' : 'Show';
}
button.addEventListener('click', showHideContent);

const html = content.innerHTML;
const container = document.getElementById('container');
// console.log('content', content);
for (let p = 0; p < totalPages; p++) {
  const page = document.createElement('div');
  page.innerHTML = html;
  page.className = 'page';
  page.style.cssText = `
    width: ${totalWidth}px;
    transform: translateX(-${p * pageWidth}px);
  `;
  const pageClip = document.createElement('div');
  pageClip.className = 'page-clip';
  pageClip.appendChild(page);
  const pageWrapper = document.createElement('div');
  pageWrapper.className = 'page-wrapper';
  pageWrapper.appendChild(pageClip);
  container.appendChild(pageWrapper);
}

showHideContent();

This works well for content with only a few pages, but not for large content: every page holds a full copy of the content, so you get a lot of wasted DOM elements that will never be shown.

But I think there must be better ideas, such as combining this with the other answers and using JavaScript to help split the column result.
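One way to avoid the duplicated DOM, as a rough sketch rather than a tested reader, is to keep a single copy of the column content inside one clipping window and shift it per page. The helper names `pageCount`, `pageOffset` and `showPage` are made up for illustration. The sketch assumes the content's parent has `overflow: hidden` and a width of one page, and that there is no column gap (a gap would have to be added to the step).

```javascript
// Hypothetical helpers, not part of the CodePen above.

// Number of pages needed to show `totalWidth` pixels of column content.
function pageCount(totalWidth, pageWidth) {
  return Math.max(1, Math.ceil(totalWidth / pageWidth));
}

// Horizontal shift (in px) that brings page `index` into the clipping window.
// Out-of-range indexes are clamped to the first/last page.
function pageOffset(index, totalWidth, pageWidth) {
  const last = pageCount(totalWidth, pageWidth) - 1;
  const clamped = Math.min(Math.max(index, 0), last);
  return clamped * pageWidth;
}

// Shift one content element instead of cloning it once per page.
// Assumes `content` sits inside a parent with `overflow: hidden`.
function showPage(content, index, pageWidth) {
  const offset = pageOffset(index, content.scrollWidth, pageWidth);
  content.style.transform = `translateX(-${offset}px)`;
  return offset;
}
```

With the 320px page width from the snippet above, `showPage(content, 2, 320)` would shift the content by 640px, so only one copy of the text ever lives in the DOM.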

For reference, see this paged media solution:

https://codepen.io/julientaq/pen/MBryxr

Upvotes: 6

Eric

Reputation: 97631

See my answer to Wrap text every 2500 characters in a for pagination using PHP or javascript. I ended up with http://jsfiddle.net/Eric/WTPzn/show

Quoting the original post:

Just set your HTML to:

<div id="target">...</div>

Add some css for pages:

#target {
    white-space: pre-wrap; /* respect line breaks */
}
.individualPage {
    border: 1px solid black;
    padding: 5px;    
}

And then use the following code:

var contentBox = $('#target');
//get the text as an array of word-like things
var words = contentBox.text().split(' ');

function paginate() {
    //create a div to build the pages in
    var newPage = $('<div class="individualPage" />');
    contentBox.empty().append(newPage);

    //start off with no page text
    var pageText = null;
    for(var i = 0; i < words.length; i++) {
        //add the next word to the pageText
        var betterPageText = pageText ? pageText + ' ' + words[i]
                                      : words[i];
        newPage.text(betterPageText);

        //Check if the page is too long
        if(newPage.height() > $(window).height()) {
            //revert the text
            newPage.text(pageText);

            //and insert a copy of the page at the start of the document
            newPage.clone().insertBefore(newPage);

            //start a new page
            pageText = null;
        } else {
            //this longer text still fits
            pageText = betterPageText;             
        }
    }    
}

$(window).resize(paginate).resize();

Upvotes: 10

RoboticRenaissance

Reputation: 1187

I don't have enough rep to make a comment yet, but I just wanted to say that Eric's answer works beautifully. I'm creating an eReader, except that it reads HTML files, and you can use it for not-ready-for-publication text. There are two pages that can be seen and they resize only when you press a button.

I made many modifications. There was only one small flaw that I found, though. When you check whether the last word falls off the edge of the page, and it does, you need to add that word back to the list. Simply put, in the first branch of the if statement, add the line i--; in order to go back and put that word on the next page.
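The carry-over fix can be sketched without the DOM. This is only an illustration: `paginateWords` and the `fits` callback are hypothetical names, and `fits` stands in for the height check in Eric's code. Instead of `i--`, the sketch starts the next page with the word that did not fit. That has the same effect, and it also avoids looping forever when a single word is too large for a page.

```javascript
// Hypothetical helper: split `words` into pages, where `fits(text)` reports
// whether `text` still fits on one page (a stand-in for the height check).
function paginateWords(words, fits) {
  const pages = [];
  let pageText = null;
  for (let i = 0; i < words.length; i++) {
    const candidate = pageText === null ? words[i] : pageText + ' ' + words[i];
    if (pageText !== null && !fits(candidate)) {
      // The page is full: store it and carry the current word over
      // to the next page instead of dropping it.
      pages.push(pageText);
      pageText = words[i];
    } else {
      // Either the longer text fits, or the page was empty and has to
      // take the word anyway (an oversized word gets a page of its own).
      pageText = candidate;
    }
  }
  if (pageText !== null) pages.push(pageText);
  return pages;
}

// Example with a stand-in check: at most 11 characters per page.
const pages = paginateWords('to be or not to be'.split(' '), t => t.length <= 11);
console.log(pages); // → ['to be or', 'not to be']
```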

Here are my modifications:

  1. made it all into a function, with the arguments (content,target).
  2. added a variable backUpContent, for reuse when I resize the pages.
  3. changed newPage to an invisible testPage and added an array page[i], containing the contents of each page, for easily going back and forth after ordering the pages.
  4. added the line "pC++;", a pagecounter, to the first part of the else statement.
  5. changed .text to .html, so that it wouldn't count the tags as their text equivalents.
  6. I designed it around 1 or 2 div's with changing content, rather than many, many divs that hide and show.
  7. There are many more inserts I haven't gotten around to yet.

If you wanted to keep something like whole paragraphs on the same page, change the line

pageText + ' ' + words[i]

to

pageText + '</p><p>' + words[i]

and the line

words = content.split(' ');

to

words = content.split('</p><p>');

But you should only use that if you're sure that each of those elements is small enough to fit on one page.

Eric's solution is exactly the piece I was missing. I was going to ask my own question, but I finally found this page in the suggestions after typing almost all of my question. The wording of the question is a bit confusing, though.

Thank You Eric!

Upvotes: 3

Friedrich

Reputation: 2290

I have a solution with quite simple, changeable CSS markup and 3 pretty short JS functions.

First I created two div elements: one is hidden but contains the whole text, and the other is displayed but still empty. The HTML looks like this:

<div id="originalText">
some text here
</div>
<div id="paginatedText"></div>

The CSS for these two is:

#originalText{
    display: none; /* hides the container */
}

#paginatedText{
    width: 300px;
    height: 400px;
    background: #aaa;
}

I also prepared the CSS for a class named page, which looks like this:

.page{
    padding: 0;
    width: 298px;
    height: 398px; /* important to define this one */
    border: 1px solid #888;
}

The really important part is to define the height, because otherwise the pages will just get stretched when we fill in the words later on.


Now comes the important part: the JavaScript functions. The comments should speak for themselves.

function paginateText() {
    var text = document.getElementById("originalText").innerHTML; // gets the text, which should be displayed later on
    var textArray = text.split(" "); // makes the text to an array of words
    createPage(); // creates the first page
    for (var i = 0; i < textArray.length; i++) { // loops through all the words
        var success = appendToLastPage(textArray[i]); // tries to fill the word in the last page
        if (!success) { // checks if word could not be filled in last page
            createPage(); // create new empty page
            appendToLastPage(textArray[i]); // fill the word in the new last element
        }
    }
}

function createPage() {
    var page = document.createElement("div"); // creates new html element
    page.setAttribute("class", "page"); // appends the class "page" to the element
    document.getElementById("paginatedText").appendChild(page); // appends the element to the container for all the pages
}

function appendToLastPage(word) {
    var page = document.getElementsByClassName("page")[document.getElementsByClassName("page").length - 1]; // gets the last page
    var pageText = page.innerHTML; // remembers the current text of the last page so it can be restored
    page.innerHTML += word + " "; // tentatively appends the word to the last page
    if (page.offsetHeight < page.scrollHeight) { // checks if the page overflows (more words than space)
        page.innerHTML = pageText; //resets the page-text
        return false; // returns false because page is full
    } else {
        return true; // returns true because word was successfully filled in the page
    }
}

At the end I just called the paginateText function with

paginateText();

This whole script should work for any plain text and for any style of the pages.

So you can change the font and the font size and even the size of the pages.

I also have a jsfiddle with everything in there.

If I have forgotten anything or you have a question feel free to comment and make suggestions or ask questions.

Upvotes: 5

markE

Reputation: 105035

SVG might be a good fit for your text pagination

  • SVG text is actually text -- unlike canvas which displays just a picture of text.

  • SVG text is readable, selectable, searchable.

  • SVG text does not auto-wrap natively, but this is easily remedied using javascript.

  • Flexible page sizes are possible because page formatting is done in javascript.

  • Pagination does not rely on browser dependent formatting.

  • Text downloads are small and efficient. Only the text for the current page needs to be downloaded.

Here are the details of how SVG pagination can be done and a Demo:

http://jsfiddle.net/m1erickson/Lf4Vt/


Part 1: Efficiently fetch about a page worth of words from a database on the server

Store the entire text in a database with 1 word per row.

Each row (word) is sequentially indexed by the word's order (word#1 has index==1, word#2 has index==2, etc).

For example this would fetch the entire text in proper word order:

// select the entire text of Romeo and Juliet
// “order by wordIndex” causes the words to be in proper order

Select word from RomeoAndJuliet order by wordIndex

If you assume a page contains about 250 words when formatted, then this database query will fetch the first 250 words of text for page#1

// select the first 250 words for page#1

Select top 250 word from RomeoAndJuliet order by wordIndex

Now the good part!

Let’s say page#1 used 212 words after formatting. Then when you’re ready to process page#2 you can fetch 250 more words starting at word#213. This results in quick and efficient data fetches.

// select 250 more words for page#2
// “where wordIndex>212” causes the fetched words
// to begin with the 213th word in the text

Select top 250 word from RomeoAndJuliet where wordIndex>212 order by wordIndex
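To make the fetch bookkeeping concrete, here is a small sketch that stands in for the database with an in-memory array. The names `buildRows`, `fetchWords` and `nextStart` are hypothetical and not part of the demo. A real reader would run the queries above on the server instead.

```javascript
// Stand-in for the database table: one row per word, indexed from 1.
function buildRows(text) {
  return text.split(/\s+/).filter(Boolean)
    .map((word, i) => ({ wordIndex: i + 1, word }));
}

// Equivalent of "top `count` words with wordIndex > afterIndex,
// ordered by wordIndex".
function fetchWords(rows, afterIndex, count) {
  return rows
    .filter(row => row.wordIndex > afterIndex)
    .sort((a, b) => a.wordIndex - b.wordIndex)
    .slice(0, count)
    .map(row => row.word);
}

// Index of the last word actually placed on the page; the next fetch
// starts just after it.
function nextStart(afterIndex, wordsUsed) {
  return afterIndex + wordsUsed;
}

const rows = buildRows('But soft what light through yonder window breaks');
const first = fetchWords(rows, 0, 3);    // ['But', 'soft', 'what']
const start = nextStart(0, 2);           // the page only had room for 2 words
const second = fetchWords(rows, start, 3); // ['what', 'light', 'through']
```

The point of the design is that over-fetching is harmless: words that did not fit on the page are simply fetched again at the start of the next request.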

Part 2: Format the fetched words into lines of text that fit into the specified page width

Each line of text must contain enough words to fill the specified page width, but not more.

Start line#1 with a single word and then add words 1-at-a-time until the next word would no longer fit in the specified page width.

After the first line is fitted, we move down by a line-height and begin line#2.

Fitting the words on the line requires measuring each additional word added on a line. When the next word would exceed the line width, that extra word is moved to the next line.

A word can be measured using the HTML canvas context's measureText method.

This code will take a set of words (like the 250 words fetched from the database) and will format as many words as possible to fill the page size.

maxWidth is the maximum pixel width of a line of text.

maxLines is the maximum number of lines that will fit on a page.

function textToLines(words,maxWidth,maxLines,x,y){

    var lines=[];

    while(words.length>0 && lines.length<=maxLines){
        var line=getOneLineOfText(words,maxWidth);
        words=words.splice(line.index+1);
        lines.push(line);
        wordCount+=line.index+1;
    }

    return(lines);
}

function getOneLineOfText(words,maxWidth){
    var line="";
    var space="";
    for(var i=0;i<words.length;i++){
        var testWidth=ctx.measureText(line+" "+words[i]).width;
        if(testWidth>maxWidth){return({index:i-1,text:line});}
        line+=space+words[i];
        space=" ";
    }
    return({index:words.length-1,text:line});
}
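To try the wrapping logic outside a browser, the measuring function can be passed in. This is a hypothetical variant of the code above, not the demo's code: `wrapWords` and `measure` are names chosen for this sketch, and in the browser `measure` would be something like `text => ctx.measureText(text).width`. Unlike the original, it always places at least one word per line, and it returns the number of words used instead of updating a global counter.

```javascript
// Greedy line fitting with an injected measuring function.
function wrapWords(words, maxWidth, maxLines, measure) {
  const lines = [];
  let used = 0; // number of words consumed so far
  while (used < words.length && lines.length < maxLines) {
    let line = words[used]; // always take one word, even if it is too wide
    let next = used + 1;
    // Keep adding words while the extended line still fits.
    while (next < words.length &&
           measure(line + ' ' + words[next]) <= maxWidth) {
      line += ' ' + words[next];
      next++;
    }
    lines.push(line);
    used = next;
  }
  return { lines, used };
}

// Stand-in measure: every character is 10 units wide.
const result = wrapWords('the quick brown fox jumps'.split(' '), 100, 2,
                         t => t.length * 10);
console.log(result); // → { lines: ['the quick', 'brown fox'], used: 4 }
```

The `used` count is what the next page's fetch would start from.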

Part 3: Display the lines of text using SVG

The SVG Text element is a real DOM element whose content can be read, selected and searched.

Each individual line of text in the SVG Text element is displayed using an SVG Tspan element.

This code takes the lines of text which were formatted in Part#2 and displays the lines as a page of text using SVG.

function drawSvg(lines,x){
    var svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');
    var sText = document.createElementNS('http://www.w3.org/2000/svg', 'text');
    sText.setAttributeNS(null, 'font-family', 'verdana');
    sText.setAttributeNS(null, 'font-size', "14px");
    sText.setAttributeNS(null, 'fill', '#000000');
    for(var i=0;i<lines.length;i++){
        var sTSpan = document.createElementNS('http://www.w3.org/2000/svg', 'tspan');
        sTSpan.setAttributeNS(null, 'x', x);
        sTSpan.setAttributeNS(null, 'dy', lineHeight+"px");
        sTSpan.appendChild(document.createTextNode(lines[i].text));
        sText.appendChild(sTSpan);
    }
    svg.appendChild(sText);
    $page.append(svg);
}

Here is complete code just in case the Demo link breaks:

<!doctype html>
<html>
<head>
<link rel="stylesheet" type="text/css" media="all" href="css/reset.css" /> <!-- reset css -->
<script type="text/javascript" src="http://code.jquery.com/jquery.min.js"></script>
<style>
    body{ background-color: ivory; }
    .page{border:1px solid red;}
</style>
<script>
$(function(){

    var canvas=document.createElement("canvas");
    var ctx=canvas.getContext("2d");
    ctx.font="14px verdana";

    var pageWidth=250;
    var pageHeight=150;
    var pagePaddingLeft=10;
    var pagePaddingRight=10;
    var approxWordsPerPage=500;        
    var lineHeight=18;
    var maxLinesPerPage=parseInt(pageHeight/lineHeight)-1;
    var x=pagePaddingLeft;
    var y=lineHeight;
    var maxWidth=pageWidth-pagePaddingLeft-pagePaddingRight;
    var text="Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.";

    // # words that have been displayed 
    //(used when ordering a new page of words)
    var wordCount=0;

    // size the div to the desired page size
    var $pages=$(".page");
    $pages.width(pageWidth);
    $pages.height(pageHeight);


    // Test: Page#1

    // get a reference to the page div
    var $page=$("#page");
    // use html canvas to word-wrap this page
    var lines=textToLines(getNextWords(wordCount),maxWidth,maxLinesPerPage,x,y);
    // create svg elements for each line of text on the page
    drawSvg(lines,x);

    // Test: Page#2 (just testing...normally there's only 1 full-screen page)
    var $page=$("#page2");
    var lines=textToLines(getNextWords(wordCount),maxWidth,maxLinesPerPage,x,y);
    drawSvg(lines,x);

    // Test: Page#3 (just testing...normally there's only 1 full-screen page)
    var $page=$("#page3");
    var lines=textToLines(getNextWords(wordCount),maxWidth,maxLinesPerPage,x,y);
    drawSvg(lines,x);


    // fetch the next page of words from the server database
    // (since we've specified the starting point in the entire text
    //  we only have to download 1 page of text as needed)
    function getNextWords(nextWordIndex){
        // Eg: select top 500 word from romeoAndJuliet 
        //     where wordIndex>=nextwordIndex
        //     order by wordIndex
        //
        // But here for testing, we just hardcode the entire text 
        var testingText="Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.";
        var testingWords=testingText.split(" ");
        var words=testingWords.splice(nextWordIndex,approxWordsPerPage);

        // 
        return(words);    
    }


    function textToLines(words,maxWidth,maxLines,x,y){

        var lines=[];

        while(words.length>0 && lines.length<=maxLines){
            var line=getLineOfText(words,maxWidth);
            words=words.splice(line.index+1);
            lines.push(line);
            wordCount+=line.index+1;
        }

        return(lines);
    }

    function getLineOfText(words,maxWidth){
        var line="";
        var space="";
        for(var i=0;i<words.length;i++){
            var testWidth=ctx.measureText(line+" "+words[i]).width;
            if(testWidth>maxWidth){return({index:i-1,text:line});}
            line+=space+words[i];
            space=" ";
        }
        return({index:words.length-1,text:line});
    }

    function drawSvg(lines,x){
        var svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');
        var sText = document.createElementNS('http://www.w3.org/2000/svg', 'text');
        sText.setAttributeNS(null, 'font-family', 'verdana');
        sText.setAttributeNS(null, 'font-size', "14px");
        sText.setAttributeNS(null, 'fill', '#000000');
        for(var i=0;i<lines.length;i++){
            var sTSpan = document.createElementNS('http://www.w3.org/2000/svg', 'tspan');
            sTSpan.setAttributeNS(null, 'x', x);
            sTSpan.setAttributeNS(null, 'dy', lineHeight+"px");
            sTSpan.appendChild(document.createTextNode(lines[i].text));
            sText.appendChild(sTSpan);
        }
        svg.appendChild(sText);
        $page.append(svg);
    }

}); // end $(function(){});
</script>
</head>
<body>
    <h4>Text split into "pages"<br>(Selectable & Searchable)</h4>
    <div id="page" class="page"></div>
    <h4>Page 2</h4>
    <div id="page2" class="page"></div>
    <h4>Page 3</h4>
    <div id="page3" class="page"></div>
</body>
</html>

Upvotes: 16

Bergi

Reputation: 664876

That's simple, and no JavaScript is needed. The paged media type has been supported since CSS2. See http://www.w3.org/TR/CSS21/page.html (or the current CSS3 module) for supported properties.

Upvotes: -5
