blackghost

Reputation: 713

Split string into chunks (of different size) without breaking words

I am trying to create a method that, given a string, returns three strings: title, description1, and description2.

I found a related question, "Split a string into chunks of specified size without breaking words", but my chunks are of different sizes.

Title needs to be max 25 characters.

Description1 needs to be max 35 characters.

Description2 needs to be max 35 characters.

The question would be:

How can I split a string into at most three entities, where the first has a maximum of 25 characters and the other two have a maximum of 35 characters each? (If the whole string fits in the first entity, that is OK; I don't need to return all three.) The method should be clever enough to take words (and maybe punctuation) into account, so that it doesn't return words cut in half.

I have done the following:

def split_text_to_entities(big_string)
  title = big_string[0..24]
  description1 = big_string[25..59]
  description2 = big_string[60..94]
  [title, description1, description2]
end

But the problem with this approach is that if the input is "Buy our new brand shoes from our store. Best discounts in town and 40% off for first purchase.", the results would be:

title = "Buy our new brand shoes f"
description1 = "rom our store. Best discounts in to"
description2 = "wn and 40% off for first purchase."

And ideally they would be:

title = "Buy our new brand shoes"
description1 = "from our store. Best discounts in"
description2 = "town and 40% off for first"

So, the idea is to split by character count while keeping whole words intact.

Upvotes: 2

Views: 1906

Answers (3)

Cary Swoveland

Reputation: 110725

To cover all the bases, I would do the following.

Code

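def divide_text(str, max_chars)
  max_chars.map do |n|
    # Work on a stripped copy so the caller's string is not mutated.
    str = str.lstrip
    # Longest prefix of at most n characters that is followed by whitespace
    # or by the end of the string; '' if there is no such prefix.
    s = str[/^.{,#{n}}(?=\s|\z)/] || ''
    str = str[s.size..-1]
    s
  end
end

(?=\s|\z) is a (zero-width) positive lookahead that requires the match to be followed by whitespace or by the end of the string, so a chunk never ends in the middle of a word.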
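
As a quick illustration of how the lookahead decides where a chunk ends, the sketch below compares a word-boundary lookahead with one that requires whitespace or the end of the string. The variable names are made up for this example only.

```ruby
text = "Buy our new brand shoes from our store."

# Word-boundary lookahead: the boundary just before "from" qualifies,
# so the greedy match keeps the trailing space.
with_boundary = text[/^.{,25}(?=\b)/]
#=> "Buy our new brand shoes "

# Whitespace-or-end lookahead: the match has to stop right before a space
# (or at the end of the string), so it ends on the last whole word.
with_whitespace = text[/^.{,25}(?=\s|\z)/]
#=> "Buy our new brand shoes"
```

The second form also treats trailing punctuation as part of the word, because a period followed by the end of the string still satisfies the lookahead.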

Examples

max_nbr_chars = [25,35,35]

str = "Buy our new brand shoes from our store. Best discounts in " +
      "town and 40% off for first purchase."    
divide_text(str, max_nbr_chars)
  #=> ["Buy our new brand shoes",
  #    "from our store. Best discounts in",
  #    "town and 40% off for first"]

str = "Buy our new brand shoes from our store."
divide_text(str, max_nbr_chars)
  #=> ["Buy our new brand shoes", "from our store.", ""]

str = "Buy our new"
divide_text(str, max_nbr_chars)
  #=> ["Buy our new", "", ""]

str = ""
divide_text(str, max_nbr_chars)
  #=> ["", "", ""]

str = "Buyournewbrandshoesfromourstore."
divide_text(str, max_nbr_chars)
  #=> ["", "Buyournewbrandshoesfromourstore.", ""]

str = "Buyournewbrandshoesfromourstoreandshoesfromourstore."
divide_text(str, max_nbr_chars)
  #=> ["", "", ""]

Note that if ^ were omitted from the regex:

str = "Buyournewbrandshoesfromourstore."
divide_text(str, max_nbr_chars)
  #=> ["ewbrandshoesfromourstore.", "rstore.", ""]

Upvotes: 2

Nobita

Reputation: 23713

Doesn't this do the trick?

def get_chunks(str, n = 3)
  str.scan(/^.{1,25}\b|.{1,35}\b/).first(n).map(&:strip)
end
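
For reference, here is a sketch of what this returns for the sample string from the question. The method is repeated so the snippet runs on its own, and the names `sample` and `chunks` are only for illustration.

```ruby
def get_chunks(str, n = 3)
  str.scan(/^.{1,25}\b|.{1,35}\b/).first(n).map(&:strip)
end

sample = "Buy our new brand shoes from our store. Best discounts in " \
         "town and 40% off for first purchase."
chunks = get_chunks(sample)
#=> ["Buy our new brand shoes",
#    "from our store. Best discounts in",
#    "town and 40% off for first purchase"]
```

Because each alternative ends in `\b`, the last chunk stops at the word boundary just before the final period, so it differs slightly from the ideal output in the question.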

Upvotes: 1

sawa

Reputation: 168209

s = "Buy our new brand shoes from our store. Best discounts in town and 40% off for first purchase."

s =~ /\b(.{,25})\W+(.{,35})\W+(.{,35})\b/
[$1, $2, $3] # =>
# [
#   "Buy our new brand shoes",
#   "from our store. Best discounts in",
#   "town and 40% off for first purchase"
# ]

Upvotes: 0
