Reputation: 601
I have been trying to split a sentence down to a meaningful set of whole words that fits under a specific length.
string1 = "Alice is in wonderland"
string2 = "Bob is playing games on his computer"
I want a regex that matches the leading whole words of each string, such that the result is no longer than 20 characters.
new_string1 = "Alice is in"
new_string2 = "Bob is playing games"
Is it possible to do this with a regex?
Upvotes: 1
Views: 80
Reputation: 22294
This is not a good use case for regular expressions. However, the textwrap.shorten function achieves exactly that.
import textwrap
string1 = "Alice is in wonderland"
string2 = "Bob is playing games on his computer"
new_string1 = textwrap.shorten(string1, 20, placeholder="")
new_string2 = textwrap.shorten(string2, 20, placeholder="")
print(new_string1) # Alice is in
print(new_string2) # Bob is playing games
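Note that the width passed to textwrap.shorten is an inclusive limit: "Bob is playing games" is exactly 20 characters long and is still accepted. If you really need the result to be strictly shorter than 20 characters, a minimal sketch is to pass 19 instead (the variable names below are just for illustration):

```python
import textwrap

string2 = "Bob is playing games on his computer"

# width=20 allows results of up to and including 20 characters
at_most_20 = textwrap.shorten(string2, 20, placeholder="")
# width=19 forces the result to be strictly shorter than 20 characters
under_20 = textwrap.shorten(string2, 19, placeholder="")

print(repr(at_most_20), len(at_most_20))  # 'Bob is playing games' 20
print(repr(under_20), len(under_20))      # 'Bob is playing' 14
```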
The main downside of textwrap.shorten is that it collapses runs of whitespace into single spaces. If you want to preserve the original spacing, you can implement your own function.
def shorten(s, max_chars):
    # Special case: the string already fits within the allowed length
    if len(s) <= max_chars:
        return s.rstrip()
    # If no whitespace is found in range, the result is an empty string
    stop = 0
    for i in range(max_chars + 1):
        # Always remember the position of the last whitespace seen so far
        if s[i].isspace():
            stop = i
    # Cut at the last whitespace and drop any trailing whitespace
    return s[:stop].rstrip()
string1 = "Alice is in wonderland"
string2 = "Bob is playing games on his computer"
new_string1 = shorten(string1, 20)
new_string2 = shorten(string2, 20)
print(new_string1) # Alice is in
print(new_string2) # Bob is playing games
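For completeness, since the question asks about regex specifically: it can be done, although I would still consider it harder to read than the versions above. The sketch below is one possible approach, not the only one. The function name regex_shorten is made up for this example. It greedily matches up to max_chars characters and uses a lookahead to require that the match ends right before whitespace or at the end of the string, so a word is never cut in half. Like the hand-written loop, it keeps the internal spacing intact.

```python
import re

def regex_shorten(s, max_chars):
    # (?s) lets "." match newlines too.
    # .{0,N} is greedy, so it takes as many characters as it can and then
    # backtracks until the lookahead sees whitespace or the end of the string.
    pattern = r"(?s).{0,%d}(?=\s|\Z)" % max_chars
    m = re.match(pattern, s)
    # No match means the first word alone is longer than max_chars
    if m is None:
        return ""
    return m.group(0).rstrip()

string1 = "Alice is in wonderland"
string2 = "Bob is playing games on his computer"

print(regex_shorten(string1, 20))  # Alice is in
print(regex_shorten(string2, 20))  # Bob is playing games
```

If the very first word is longer than the limit, this returns an empty string, which is the same result the loop-based version gives.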
Upvotes: 1