Steph Rose

Reputation: 2136

How do I grab the last two words of text; Regex, or split string based on whitespace and length?

I have a summary paragraph I'm splitting into two parts based on length.

I'm truncating the string using:

summary = truncate(body_content, :length => 500, :separator => ' ')

But I need the truncated content to put into a hidden paragraph. So, because I don't know a better way to do it, I was trying to get the last two words of the summary, find their index, and then get the text from that point on and output that variable.

Unless someone can tell me a better way to split a string with a space as a separator and have the split parts in an array or two variables for output (this would be preferred), can someone please help me with my regex for this?

So, the idea is that the last two words of a truncated summary would be something like 'the dog...'

So I was thinking it'd be like:

summary.match(/\s[\w\d\W]*){2}\.\.\.) { |m| last_words = m[0] }

But this is definitely not working.

Upvotes: 0

Views: 3244

Answers (3)

iwasrobbed

Reputation: 46703

This answer doesn't really use either of your suggested methods, but hopefully it's somewhat helpful.

Assuming we had this paragraph (which is in a variable called lorem):

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed faucibus lacinia dignissim. Duis a vestibulum nulla. Aliquam erat volutpat. In turpis nisl, mollis at faucibus eget, posuere tincidunt sem. Nunc nulla arcu, sodales ac sodales id, dictum vulputate dolor. Nunc mollis suscipit lectus, in imperdiet purus molestie eget. Maecenas eget risus sem. Praesent lectus nunc, consectetur eget accumsan et, dapibus quis ante. Praesent tellus velit, posuere quis imperdiet ornare, malesuada in est. Nullam sit amet risus quam, ac imperdiet mi.

Then we could do:

first_part, second_part = truncate(lorem, :length => 100), lorem[97..lorem.length]

This would give you two variables containing the first part and second part of the paragraph. The 97 is the 100-character limit minus the three characters of the "..." that truncate appends:

> first_part 
=> "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed faucibus lacinia dignissim. Duis a v..."

> second_part
=> "estibulum nulla. Aliquam erat volutpat. In turpis nisl, mollis at faucibus eget, posuere tincidunt sem. Nunc nulla arcu, sodales ac sodales id, dictum vulputate dolor. Nunc mollis suscipit lectus, in imperdiet purus molestie eget. Maecenas eget risus sem. Praesent lectus nunc, consectetur eget accumsan et, dapibus quis ante. Praesent tellus velit, posuere quis imperdiet ornare, malesuada in est. Nullam sit amet risus quam, ac imperdiet mi."

Although this works, I would also just recommend using this jQuery plugin to easily do this and handle the slicing & text expansion animation for you: https://github.com/kswedberg/jquery-expander
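If you want both halves without hard-coding the 97, and without cutting a word in half, here is a minimal plain-Ruby sketch. split_summary is a hypothetical helper name, not part of Rails, and the shortened lorem string is only sample data. It looks for the last space at or before the character budget and splits there:

```ruby
# Hypothetical helper (not part of Rails): returns [summary, remainder].
# `limit` is the maximum summary length, including the omission.
def split_summary(text, limit, omission: "...")
  return [text, ""] if text.length <= limit

  budget = limit - omission.length          # room left for real text
  cut = text.rindex(" ", budget) || budget  # last space at or before the budget
  [text[0, cut] + omission, text[cut..-1].lstrip]
end

# Shortened sample paragraph, for illustration only.
lorem = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. " \
        "Sed faucibus lacinia dignissim. Duis a vestibulum nulla."

first_part, second_part = split_summary(lorem, 100)
puts first_part
puts second_part
```

With this sample the summary ends at "Duis a..." and second_part starts at "vestibulum", so nothing is lost or duplicated between the two parts.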

Upvotes: 0

fge

Reputation: 121710

This regex probably does what you want:

/\s(\w+)\s+(\w+)\s*\Z/

Then read the first and second capture groups.

However, please note it cannot really do anything about truncated words, since you operate on a truncated input...
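As a rough illustration of how the regex above could be applied: because it expects a word right before the end of the string, a trailing "..." has to be removed first. The sample summary below is made up, and the variable names are only for this sketch:

```ruby
# Made-up truncated summary, for illustration only.
summary = "Once upon a time there was the dog..."

# Drop the trailing omission so the last word sits at the end of the string.
trimmed = summary.sub(/\.{3}\z/, "")

if (m = trimmed.match(/\s(\w+)\s+(\w+)\s*\Z/))
  last_two = "#{m[1]} #{m[2]}"  # the two captured words
  start_index = m.begin(1)      # index where the second-to-last word starts
  puts last_two
  puts start_index
end
```

m.begin(1) gives the position of the second-to-last word in the trimmed summary, which is the index the question was trying to find.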

Maybe another solution would be to scan your original input repeatedly for \s+ and grab the index of each match, while keeping the index of the previous match: this way, you'll be able to build a substring that ends on a whole word, and would only have to apply this regex to it:

/(\w+)\s+(\w+)\Z/

to grab the last two words.
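One possible reading of that idea, sketched in plain Ruby. This is my interpretation rather than a definitive implementation; whole_word_cut is a hypothetical helper and the sample sentence is made up. It walks the whitespace runs of the full text, remembers the last one that starts at or before the limit, and cuts there:

```ruby
# Hypothetical helper: index of the last whitespace run that starts at or
# before `limit`, so that text[0...index] ends on a complete word.
def whole_word_cut(text, limit)
  return text.length if text.length <= limit

  cut = nil
  text.scan(/\s+/) do
    run = Regexp.last_match        # MatchData for the current whitespace run
    break if run.begin(0) > limit  # this run lies past the limit, stop
    cut = run.begin(0)             # remember the most recent usable run
  end
  cut || limit                     # no whitespace found: hard cut
end

# Made-up sample text, for illustration only.
body = "The quick brown fox jumps over the lazy dog and keeps running"

cut      = whole_word_cut(body, 40)
head     = body[0...cut]             # ends on a whole word
tail     = body[cut..-1].lstrip      # everything after the cut
last_two = head[/(\w+)\s+(\w+)\Z/]   # last two words of the head

puts head
puts tail
puts last_two
```

Because head always ends on a complete word, the simpler regex without the leading \s is enough here.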

Upvotes: 3

the Tin Man

Reputation: 160551

Here's how I've done something similar. This isn't the actual code, but you'll get the idea:

#
# set up the part of Rails needed for this example.
#
# This isn't needed in a normal Rails app because it will have already been done.
#
require 'active_support'
require 'action_view'
include ActionView::Helpers::TextHelper

text = <<EOT
Lorum ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua.
EOT

# preprocess the text to make it look more like a long paragraph
text.gsub!("\n", " ")

# we're ready to go. This would be in your Rails app.
LENGTH_OF_SUMMARY = 50

summary = truncate(text, :length => LENGTH_OF_SUMMARY, :separator => ' ')
leftover = text.sub(summary[0..-4], '').strip

puts text
puts '-----------------'
puts summary
puts '-----------------'
puts leftover

Which outputs:

Lorum ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. 
-----------------
Lorum ipsum dolor sit amet, consectetur...
-----------------
adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

The first line is the full text for reference.

The second is the summary.

The third is the text after stripping out the summary and leading/trailing whitespace. Stripping the whitespace might not be important, but I'm that way. summary[0..-4] drops the trailing "..." that truncate appended, a simple sub then removes that prefix from the original text, leaving you with the left-overs, and strip cleans up the result so there are no leading or trailing spaces.
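An alternative sketch that gets the same leftover by position instead of by searching: since the summary is a prefix of the text plus the omission, its length tells you where the leftover starts. leftover_after is a hypothetical helper, and the summary string is hard-coded to the value printed above so the snippet runs without Rails:

```ruby
# Hypothetical helper: the part of `text` that `summary` did not cover.
# Assumes `summary` is a prefix of `text` followed by `omission`,
# or equal to `text` when nothing was truncated.
def leftover_after(text, summary, omission = "...")
  return "" if summary == text             # nothing was cut off
  kept = summary.length - omission.length  # characters of text in the summary
  text[kept..-1].to_s.strip
end

text = "Lorum ipsum dolor sit amet, consectetur adipisicing elit, sed do " \
       "eiusmod tempor incididunt ut labore et dolore magna aliqua. "

# Hard-coded to the summary shown in the output above.
summary = "Lorum ipsum dolor sit amet, consectetur..."

leftover = leftover_after(text, summary)
puts leftover
```

This avoids scanning the text for the summary, at the cost of assuming the omission string is known.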

Upvotes: 1
