Gernot

Reputation: 31

Bash: How to shorten long lines of a log file whilst keeping a fixed number of characters from beginning and end of each line?

I have a long log file (ASCII text) whose lines vary in length from a few characters to many thousands of characters. How can I shorten each long line with bash/Linux commands? Is it possible to replace the removed text with something like "... N characters removed ..."?

My goal is to leave all lines of up to 100 characters untouched. For every line longer than 100 characters, keep the first 40 characters and the last 40 characters, and insert "... N characters removed ..." in the middle where characters have been cut out (with N replaced by the number of removed characters).

Is this too complicated to do with bash/Linux commands? Any help would be appreciated.

Upvotes: 2

Views: 416

Answers (2)

SiegeX

Reputation: 140427

You can do this with awk:

awk '
(length > 100) {
    l=length
    # substr() is 1-based: characters 1..40, then the last 40 starting at l-39
    $0 = substr($0,1,40) "..." (l-80) " Characters Removed..." substr($0,l-39)
}1' ./infile

Proof of Concept

$ cat ./infile
|- These are the first 40 characters --|0123456789012345678901234567890|-- These are the last 40 characters --|
12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
9012345678901234567890
2345678901234567890
12345678901234567asdfasd9as98jf-a9jfa9uhf0sd9uhfas0dfadfa890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
123456789012345678901234567890123456789012345678901234567aisfjds9dafa908sfj9asdjf9asdf89012345678901234567890123456789012345678901234567890
12345678901234567890123456789012345678901234567890123456789012345678901234567890123456asf9jasf-asjf0as8789012345678901234567890

$ awk '
    (length > 100) {
        l=length
        $0 = substr($0,1,40) "..." (l-80) " Characters Removed..." substr($0,l-39)
    }1' ./infile
|- These are the first 40 characters --|...31 Characters Removed...|-- These are the last 40 characters --|
1234567890123456789012345678901234567890...30 Characters Removed...1234567890123456789012345678901234567890
0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
9012345678901234567890
2345678901234567890
12345678901234567asdfasd9as98jf-a9jfa9uh...70 Characters Removed...1234567890123456789012345678901234567890
1234567890123456789012345678901234567890...59 Characters Removed...1234567890123456789012345678901234567890
1234567890123456789012345678901234567890...47 Characters Removed...sf9jasf-asjf0as8789012345678901234567890
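
If the limits need to change, the same idea can be wrapped in a small shell function. This is only a sketch: the function name shorten_lines and the awk variables max and keep are made up for illustration, and the extra length check is my own guard so the removed count never goes negative when MAX is smaller than 2 * KEEP.

```shell
# Hypothetical helper: shorten_lines FILE [MAX] [KEEP]
#   MAX  = longest line that is left untouched (default 100)
#   KEEP = number of characters kept at each end (default 40)
shorten_lines() {
    awk -v max="${2:-100}" -v keep="${3:-40}" '
    length($0) > max && length($0) > 2 * keep {
        n = length($0)
        # head, marker with the removed count, tail
        $0 = substr($0, 1, keep) "..." (n - 2 * keep) " Characters Removed..." substr($0, n - keep + 1)
    }
    { print }' "$1"
}
```

Calling shorten_lines ./infile should behave like the one-liner above, and shorten_lines ./infile 200 60 would keep 60 characters at each end of every line longer than 200.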

Upvotes: 4

Dominique

Reputation: 17555

awk to the rescue :-)

I believe your problem is: when you do something like cat ... | cut -c..., how can you append something to this?

Let me give you an example: I have a file, test.txt, which looks as follows:

Prompt>cat test.txt
version = 1.203
RAM/ABC/INDIA
RAJ/XYZ/DELHI
VIRAJ/FDS/

I can show different parts of lines, one after the other, like this:

Prompt>cat test.txt | awk '{print substr($1,1,1) "..." substr($1,3,1)}'

This prints the first character of the first field, a constant string in between, and the third character.

This gives the following result:

v...r
R...M
R...J
V...R

So, concatenating the pieces in a single awk print statement, like {print <beginning> <middle> <end>}, gets the job done.
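
To make that concrete, here is a rough sketch of the pattern applied to whole lines of the test.txt data. The limits (max=12, keep=4) and the function name abbreviate are just demo values I picked so the short sample lines trigger the cut; for the log file in the question you would presumably use max=100 and keep=40.

```shell
# Hypothetical demo limits: lines longer than 12 characters keep 4 at each end.
abbreviate() {
    awk -v max=12 -v keep=4 '
    {
        n = length($0)
        if (n > max) {
            # <beginning> <middle> <end>
            print substr($0, 1, keep) "... " (n - 2 * keep) " characters removed ..." substr($0, n - keep + 1)
        } else {
            print
        }
    }'
}

printf '%s\n' 'version = 1.203' 'RAM/ABC/INDIA' 'RAJ/XYZ/DELHI' 'VIRAJ/FDS/' | abbreviate
# vers... 7 characters removed ....203
# RAM/... 5 characters removed ...NDIA
# RAJ/... 5 characters removed ...ELHI
# VIRAJ/FDS/
```

With limits this small the marker is longer than the text it replaces, so the output only illustrates the mechanics; the last line is short enough to pass through unchanged.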

Upvotes: 0
