Reputation: 9
I have a text file and I am trying to cut off text after 25 characters (including whitespace) without cutting words in half (the maximum word length is 15). I have seen some PHP-based solutions; however, I'd like a solution that uses regular bash commands.
I figure I would do the initial cut with cut -c -25, but I have no idea how to prevent words from being split in half.
-clarification-
Input: This sentence contains 54 characters, spaces included.
****Bash magic***
Desired output 1:
This sentence contains 54
#25 characters per line
characters, spaces
#18 characters per line (placing "included" here would break the 25 limit)
included.
#9 characters per line
Upvotes: 0
Views: 446
Reputation: 124
One solution would be to walk through the string one character at a time, keeping track of both the number of characters checked and whether the current character is a space.
In this example we're using three variables: $inputvar, which should hold the whole string and is not changed; $resultvar, which will hold the cut section (and should start out empty, since the loop appends to it); and $i, which is the loop counter, so any existing value will be overwritten.
for ((i=0;i>-1;i++)); do #Start $i at zero, increment every loop, and never exit on the loop condition.
    if [[ $i -le $((25 - 2)) ]]; then #If $i is at most 23, i.e. 25 - 2*
        resultvar="$resultvar""${inputvar:$i:1}" #Add character at position $i to $resultvar
    else #If $i is past that limit
        if [[ ${inputvar:$i:1} == " " || -z ${inputvar:$i:1} ]]; then #If character is a space, or we ran off the end of the string
            break #Exit loop; we're done here.
        else #Otherwise
            resultvar="$resultvar""${inputvar:$i:1}" #Add to $resultvar like before
        fi
    fi
done
*I have the length check against $((25 - 2)). This is due to the fact that we're starting with $i=0, and also to the order in which the loop checks for a whitespace character. Each of these would cause us to return a longer string than intended, so they are offset by subtracting two from the cut length.
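To make the loop easier to try out, here is a minimal sketch of the same logic wrapped in a function. The function name cut_at_word and its optional second argument (the cut length, defaulting to 25) are my own additions for illustration, not part of the loop above; the loop bound also stops at the end of the string instead of relying on an empty-character check.

```shell
#!/usr/bin/env bash
# Hypothetical helper: same idea as the loop above, packaged as a function.
# $1 = input string, $2 = cut length (assumed default: 25)
cut_at_word() {
    local inputvar=$1 cutlength=${2:-25} resultvar="" i char
    for ((i = 0; i < ${#inputvar}; i++)); do
        char=${inputvar:i:1}
        # Once past index (cutlength - 2), stop at the first space we meet
        if (( i > cutlength - 2 )) && [[ $char == " " ]]; then
            break
        fi
        resultvar+=$char    # Otherwise keep copying characters
    done
    printf '%s\n' "$resultvar"
}

cut_at_word "This sentence contains 54 characters, spaces included."
```

For the sample input this prints "This sentence contains 54". Note that a word straddling the limit is completed rather than dropped, so the result can be longer than the cut length: cut_at_word "hello wonderful world" 8 prints "hello wonderful".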
The loop above can also be condensed into a single line:
for ((i=0;i>-1;i++)); do if [[ $i -le $((25 - 2)) ]]; then resultvar="$resultvar""${inputvar:$i:1}"; else if [[ ${inputvar:$i:1} == " " || -z ${inputvar:$i:1} ]]; then break; else resultvar="$resultvar""${inputvar:$i:1}"; fi; fi; done
It's not an elegant solution, by any means, but should get the job done.
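If every line of the desired output is needed, and not just the first one, a different technique may fit better: a greedy word wrap that works word by word instead of character by character. The sketch below is only one possible approach, assuming single-line input with words separated by whitespace; the function name wrap_words and its width argument are hypothetical names chosen for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: greedy word wrap in plain bash.
# $1 = text (assumed to be a single line), $2 = maximum line width (assumed default: 25)
wrap_words() {
    local text=$1 width=${2:-25} line="" word
    local -a words
    read -r -a words <<< "$text"    # Split the text into words on whitespace
    for word in "${words[@]}"; do
        if [[ -z $line ]]; then
            line=$word                       # First word on a new line
        elif (( ${#line} + 1 + ${#word} <= width )); then
            line+=" $word"                   # Word still fits, space included
        else
            printf '%s\n' "$line"            # Line is full: emit it
            line=$word                       # Start the next line with this word
        fi
    done
    if [[ -n $line ]]; then
        printf '%s\n' "$line"                # Emit the last, partially filled line
    fi
}

wrap_words "This sentence contains 54 characters, spaces included."
```

On the sample sentence from the question this prints three lines of 25, 18 and 9 characters: "This sentence contains 54", "characters, spaces" and "included.". A single word longer than the width would still be printed on its own line unbroken, which should not come up here given the stated maximum word length of 15.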
Upvotes: 1