AntoineDotDot

Reputation: 9

How to cut text without cutting words?

I have a text file and I am trying to cut off text after 25 characters (including whitespace) without cutting words in half (the maximum word length is 15). I have seen some PHP-based solutions; however, I'd like a solution that uses regular (bash) commands.

I figure I would do the initial cut with cut -c -25, but I have no idea how to prevent words from being split in half.

-clarification-

Input: This sentence contains 54 characters, spaces included.

***Bash magic***

Desired output 1:

This sentence contains 54 #25 characters per line

characters, spaces #18 characters per line (placing "included" here would break the 25 limit)

included. #9 characters per line

Upvotes: 0

Views: 446

Answers (1)

Guest

Reputation: 124

One solution is to walk through the string one character at a time, keeping track of how many characters have been copied and whether the current character is a space.

In this example we're using three variables: $inputvar, which holds the whole string and is not changed; $resultvar, which will hold the cut section and should be empty before the loop starts; and $i, the loop counter, whose previous value is overwritten.

for ((i=0;i>-1;i++)); do #Start $i at zero and increment every loop; the loop only ends via break.
    if [[ $i -le $((25 - 2)) ]]; then #If $i is less than or equal to 23*
        resultvar="$resultvar""${inputvar:$i:1}" #Add character at position $i to $resultvar
    else #If $i is greater than 23
        if [[ ${inputvar:$i:1} == " " || -z ${inputvar:$i:1} ]]; then #If character is a space, or we have run past the end of the string
            break #Exit loop; we're done here.
        else #Otherwise
            resultvar="$resultvar""${inputvar:$i:1}" #Add to $resultvar like before
        fi
    fi
done

*The length check is against $((25 - 2)) rather than 25. This is because $i starts at zero, and because the whitespace check only runs after the unconditional copy. Either of these would otherwise return a longer string than intended, so two is subtracted from the cut length to compensate.
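To see what the offset does in practice, here is a minimal sketch that wraps the same loop in a function and runs it on two inputs. The function name cut_after_word and the second test sentence are made up for illustration; the logic is meant to mirror the loop above, not replace it. It assumes bash, since it relies on substring expansion.

```bash
#!/usr/bin/env bash

# cut_after_word TEXT LIMIT (hypothetical helper name)
# Copies the first LIMIT-1 characters unconditionally, then keeps copying
# until the next space or the end of the string, as the loop above does.
cut_after_word() {
    local inputvar=$1 limit=$2 resultvar="" i char
    for ((i = 0; i < ${#inputvar}; i++)); do
        char=${inputvar:i:1}
        # Past the unconditional part, stop at the first space.
        if (( i > limit - 2 )) && [[ $char == " " ]]; then
            break
        fi
        resultvar+=$char
    done
    printf '%s\n' "$resultvar"
}

# The sentence from the question: the cut lands exactly on the limit.
cut_after_word "This sentence contains 54 characters, spaces included." 25
# prints: This sentence contains 54

# A word that straddles the limit is finished, not dropped.
cut_after_word "An extraordinarily long word" 10
# prints: An extraordinarily
```

The second call shows the trade-off of this approach: the loop completes the word it is in, so the result (18 characters here) can be longer than the requested limit when a word crosses it.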

The loop can be condensed into a single line as well:

for ((i=0;i>-1;i++)); do if [[ $i -le $((25 - 2)) ]]; then resultvar="$resultvar""${inputvar:$i:1}"; else if [[ ${inputvar:$i:1} == " " || -z ${inputvar:$i:1} ]]; then break; else resultvar="$resultvar""${inputvar:$i:1}"; fi; fi; done

It's not an elegant solution by any means, but it should get the job done.
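If the goal is the multi-line output shown in the question, where no line goes over the limit, a word-by-word approach may be easier to reason about than a character-by-character one. The following is only a sketch under a few assumptions: the function name wrap_words is made up, the text is a single line, words are separated by spaces, and no word is longer than the limit (the question states a maximum word length of 15). A longer word would simply be printed on a line of its own.

```bash
#!/usr/bin/env bash

# wrap_words TEXT LIMIT (hypothetical helper name)
# Prints TEXT as lines of at most LIMIT characters, breaking only between words.
wrap_words() {
    local text=$1 limit=$2 line="" word
    local -a words
    # Split the text into words on whitespace (first line of TEXT only).
    read -r -a words <<< "$text"
    for word in "${words[@]}"; do
        if [[ -z $line ]]; then
            # First word on a fresh line.
            line=$word
        elif (( ${#line} + 1 + ${#word} <= limit )); then
            # The word and its separating space still fit.
            line+=" $word"
        else
            # The word would overflow: emit the line and start a new one.
            printf '%s\n' "$line"
            line=$word
        fi
    done
    # Emit whatever is left over.
    if [[ -n $line ]]; then
        printf '%s\n' "$line"
    fi
}

wrap_words "This sentence contains 54 characters, spaces included." 25
# prints:
# This sentence contains 54
# characters, spaces
# included.
```

To process a file, the function could be called once per line, for example with while IFS= read -r l; do wrap_words "$l" 25; done < file.txt, where file.txt stands in for the actual file name.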

Upvotes: 1
