Reputation: 169543

Truncate Markdown?

I have a Rails site where the content is written in Markdown. I want to display a snippet of each article, with a "Read more.." link.

How do I go about this? Simply truncating the raw text will not work, for example..

>> "This is an [example](http://example.com)"[0..25]
=> "This is an [example](http:"

Ideally I want to allow the author to (optionally) insert a marker that specifies what to use as the "snippet". If no marker is present, it would take the first 250 words and append "...". For example..

This article is an example of something or other.

This segment will be used as the snippet on the index page.

^^^^^^^^^^^^^^^

This text will be visible once clicking the "Read more.." link

The marker could be thought of like an EOF marker (which can be ignored when displaying the full document)
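
For the full-document view, dropping the marker could be a single substitution. This is only a sketch, assuming the marker is a line made up of nothing but "^" characters as in the example above; `strip_excerpt_marker` and `EXCERPT_MARKER_LINE` are made-up names, not part of maruku:

```ruby
# Assumption: the marker is a line containing only "^" characters
# (optionally followed by spaces or tabs), as in the example above.
EXCERPT_MARKER_LINE = /^\^+[ \t]*$\n*/

# Hypothetical helper: removes the first marker line, plus the newlines that
# follow it, so the full article can be passed to the Markdown parser as usual.
def strip_excerpt_marker(markdown_source)
  markdown_source.sub(EXCERPT_MARKER_LINE, "")
end

article = "Intro paragraph.\n\n^^^^^^^^^^^^^^^\n\nRest of the article.\n"
puts strip_excerpt_marker(article)
```

A document without a marker passes through unchanged, so the helper can be applied unconditionally.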

I am using maruku for the Markdown processing (RedCloth is very biased towards Textile, BlueCloth is extremely buggy, and I wanted a native-Ruby parser which ruled out peg-markdown and RDiscount)

Alternatively (since the Markdown is translated to HTML anyway) truncating the HTML correctly would be an option - although it would be preferable to not markdown() the entire document, just to get the first few lines.

So, the options I can think of are (in order of preference)..

1. Add a "truncate" option to the maruku parser, which will only parse the first x words, or up to the "excerpt" marker.
2. Write/find a parser-agnostic Markdown truncate'r.
3. Write/find an intelligent HTML truncating function.

Upvotes: 10

Answers (7)

ChrisEstanol

Reputation: 121

A simpler option that just works:

truncate(markdown(item.description), length: 100, escape: false)

Upvotes: -1

wondersz1

Reputation: 905

Not sure if it applies to this case, but I am adding the solution below for the sake of completeness. You can use the strip_tags method if you are truncating Markdown-rendered content:

truncate(strip_tags(markdown(article.contents)), length: 50)

Sourced from: http://devblog.boonecommunitynetwork.com/rails-and-markdown/
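
To illustrate what that helper chain does to the rendered HTML, here is a rough plain-Ruby approximation. `naive_strip_tags` and `naive_truncate` are hypothetical stand-ins written for this example, not the Rails implementations, and the regex-based tag stripping is an assumption that only holds for simple markup:

```ruby
# Hypothetical stand-in for strip_tags: removes anything that looks like a tag.
def naive_strip_tags(html)
  html.gsub(/<[^>]*>/, "")
end

# Hypothetical stand-in for truncate: the omission counts toward the length.
def naive_truncate(text, length:, omission: "...")
  return text if text.length <= length
  keep = [length - omission.length, 0].max
  text[0, keep] + omission
end

html = '<p>This is an <a href="http://example.com">example</a> of a post.</p>'
puts naive_truncate(naive_strip_tags(html), length: 20)
# prints "This is an exampl..."
```

Note the trade-off: the snippet is plain text, so links and emphasis from the Markdown are lost.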

Upvotes: -1

Elland

Reputation: 207

I have to agree with the "two inputs" approach. The content writer need not worry about it, since you can change the background logic to merge the two inputs into one when showing the full content.

full_content = input1 + input2 # perhaps with some complementary HTML for better formatting
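
A slightly more careful version of that merge, as a sketch only. The parameter names `blurb` and `body` are assumptions. The two parts are joined with a blank line so that Markdown treats them as separate paragraphs:

```ruby
# Hypothetical helper: `blurb` is the index-page snippet, `body` is the rest.
# Empty parts are skipped so a missing body does not leave trailing blank lines.
def full_content(blurb, body)
  [blurb, body].map(&:strip).reject(&:empty?).join("\n\n")
end

puts full_content("Opening blurb.\n", "The main guts of the article.")
```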

Upvotes: 0

diclophis

Reputation: 2442

Rather than trying to truncate the text, why not have two input boxes, one for the "opening blurb" and one for the main "guts"? That way your authors will know exactly what is being shown and when, without having to rely on some sort of funky EOF marker.

Upvotes: 1

nicholaides

Reputation: 19489

Here's a solution that works for me with Textile.

Convert it to HTML
Truncate it.
Remove any HTML tags that got cut in half with
```
html_string.gsub(/<[^>]*$/, "")
```
Then use Hpricot to clean it up and close any unclosed tags
```
html_string = Hpricot( html_string ).to_s 
```

I do this in a helper, and with caching there's no performance issue.
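
The steps above can be sketched without any dependencies. In this version a small tag stack stands in for the Hpricot clean-up step, which is a substitution on my part and not what the answer uses. `rough_truncate_html` and `VOID_TAGS` are made-up names, and the tag regex assumes simple markup with no ">" inside attribute values:

```ruby
# Elements that never take a closing tag (a partial list, for illustration).
VOID_TAGS = %w[br hr img input meta link].freeze

# Hypothetical helper: cut the HTML string, drop a tag that was cut in half,
# then close whatever elements are still open.
def rough_truncate_html(html, limit)
  cut = html[0, limit].gsub(/<[^>]*$/, "") # truncate, then remove a half-cut tag
  open_tags = []
  cut.scan(%r{<(/)?([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/)?>}) do |closing, name, self_closing|
    name = name.downcase
    next if self_closing || VOID_TAGS.include?(name)
    if closing
      # Pop back to the matching opener; stray closing tags are ignored.
      idx = open_tags.rindex(name)
      open_tags.slice!(idx..-1) if idx
    else
      open_tags.push(name)
    end
  end
  cut + open_tags.reverse.map { |tag| "</#{tag}>" }.join
end

puts rough_truncate_html("<p>Some <b>bold text</b> here</p>", 15)
# prints "<p>Some <b>bold</b></p>"
```

Note that `limit` counts markup characters as well as visible text, so the visible snippet is shorter than the limit. A real HTML parser is the safer choice for anything beyond simple markup.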

Upvotes: 2

csexton

Reputation: 24783

You could use a regular expression to find a line consisting of nothing but "^" characters:

markdown_string = <<-eos
This article is an example of something or other.

This segment will be used as the snippet on the index page.

^^^^^^^^^^^^^^^

This text will be visible once clicking the "Read more.." link
eos

preview = markdown_string[0...(markdown_string =~ /^\^+$/)]
puts preview
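
The one-liner above has no explicit fallback when the marker is missing (depending on the Ruby version, `0...nil` either raises or returns the whole string). Below is a sketch that adds the word limit from the question. `markdown_excerpt` and `max_words` are hypothetical names. Cutting only at blank lines is my assumption for avoiding half-cut links or emphasis, and it can still split a fenced code block or a list that contains blank lines:

```ruby
# A line consisting only of "^" characters, as in the question.
EXCERPT_MARKER = /^\^+[ \t]*$/

# Hypothetical helper: returns the Markdown source to use as the snippet.
def markdown_excerpt(source, max_words: 250)
  marker_at = source =~ EXCERPT_MARKER
  return source[0...marker_at].rstrip if marker_at

  # No marker: keep whole paragraphs until the word budget is used up.
  # The first paragraph is always kept, even if it exceeds the budget.
  paragraphs = source.strip.split(/\n\s*\n/)
  kept = []
  words = 0
  paragraphs.each do |para|
    count = para.split.size
    break if kept.any? && words + count > max_words
    kept << para
    words += count
  end
  excerpt = kept.join("\n\n")
  excerpt += "\n\n..." if kept.size < paragraphs.size
  excerpt
end

puts markdown_excerpt("one two three\n\nfour five six\n\nseven eight", max_words: 6)
```

The result is still Markdown, so it can be passed to whichever parser is in use.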

Upvotes: 1

dbr

Reputation: 169543

Write/find an intelligent HTML truncating function

The following, taken from http://mikeburnscoder.wordpress.com/2006/11/11/truncating-html-in-ruby/ with some modifications, correctly truncates HTML and makes it easy to append a string before the closing tags.

>> puts "<p><b><a href=\"hi\">Something</a></p>".truncate_html(5, "...")
=> <p><b><a href="hi">Somet...</a></b></p>

The modified code:

require 'rexml/parsers/pullparser'

class String
  def truncate_html(len = 30, at_end = nil)
    p = REXML::Parsers::PullParser.new(self)
    tags = []
    new_len = len
    results = ''
    while p.has_next? && new_len > 0
      p_e = p.pull
      case p_e.event_type
      when :start_element
        tags.push p_e[0]
        results << "<#{tags.last}#{attrs_to_s(p_e[1])}>"
      when :end_element
        results << "</#{tags.pop}>"
      when :text
        # Exclusive range: keep at most new_len characters of this text node.
        results << p_e[0][0...new_len]
        new_len -= p_e[0].length
      else
        results << "<!-- #{p_e.inspect} -->"
      end
    end
    # Append the caller-supplied suffix (e.g. "...") before closing the open tags.
    results << at_end if at_end
    tags.reverse.each do |tag|
      results << "</#{tag}>"
    end
    results
  end

  private

  def attrs_to_s(attrs)
    if attrs.empty?
      ''
    else
      ' ' + attrs.to_a.map { |attr| %{#{attr[0]}="#{attr[1]}"} }.join(' ')
    end
  end
end

Upvotes: 6
