Geo Fery

Reputation: 33

How to cut string from pattern until pattern using sed/awk in bash

I can't figure out how to use sed or awk to trim a string in a file. I've been searching for many hours without getting the desired result. I have lines like this:

c=one, o=roll, root ca          valid until: date
c=one, o=roll, root ca          Located: location
c=two roll, root ca             valid until: date
c=two roll, root ca             Located: location

My desired output, aligned in columns:

c=one            valid until: date
c=one            Located: location
c=two roll       valid until: date
c=two roll       Located: location

I'm using a sed command like this one (and many others), but it does not work:

sed 's/,.*\^([valid])//g' file.txt

I can't figure out how to express the second condition, matching up to the word "valid", or how to handle both "valid" and "Located" in the same command. Thanks a lot!

Upvotes: 3

Views: 1873

Answers (4)

Sundeep

Reputation: 23667

$ sed -E 's/,.*(valid|Located)/ \1/' ip.txt
c=one valid until: date
c=one Located: location

You can put the terms valid and Located (and any other such terms you want) inside a capture group as alternatives, then use the backreference \1 in the replacement section to put the matched term back.

To align the output, you could use a character such as | that doesn't occur in the input as a delimiter, and then pipe the result through the column command:

$ sed -E 's/,.*(valid|Located)/|\1/' ip.txt | column -t -s'|'
c=one       valid until: date
c=one       Located: location
c=two roll  valid until: date
c=two roll  Located: location
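If the column command is not available, the same two-step idea (insert a sentinel, then pad the left column) can be sketched with awk alone. This is only an illustration: align_cols is a hypothetical helper name, and the sketch assumes that | never occurs in the input and that the whole input fits in memory.

```shell
#!/bin/sh
# align_cols is a hypothetical helper name, not part of the original answer.
# It reads lines in the format shown in the question on stdin.
align_cols() {
  # Step 1: replace the unwanted middle part with a "|" sentinel.
  sed -E 's/,.*(valid|Located)/|\1/' |
  # Step 2: buffer both columns, then pad the left one to its widest entry.
  awk -F'|' '
    { left[NR] = $1; right[NR] = $2
      if (length($1) > w) w = length($1) }
    END {
      fmt = "%-" w "s  %s\n"
      for (i = 1; i <= NR; i++) printf fmt, left[i], right[i]
    }'
}

printf '%s\n' \
  'c=one, o=roll, root ca          valid until: date' \
  'c=two roll, root ca             Located: location' | align_cols
```

Building the format string from the measured width keeps the sketch independent of any fixed column size.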

Upvotes: 4

Wiktor Stribiżew

Reputation: 626748

You do not have to rely on valid or Located. With the command below, any word can appear there:

sed 's/,.*[^ ]\(   *[^ ][^:]*:\)/\1/'  file

Or, you may force the match to extend up to valid or Located using

sed -E 's/,.*[^[:space:]]([[:space:]]{2,}(valid|Located))/\1/' file
sed -E 's/,.*\S(\s{2,}(valid|Located))/\1/' file     # If \s and \S are supported

where (valid|Located) matches either the valid or the Located character sequence. Note that for the OR | operator to work, you need to either escape it as \| in a BRE pattern (a GNU extension, not part of POSIX BRE), or enable the POSIX ERE syntax with the -E option, as shown above.
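As a small illustration of the two syntaxes, the sketch below runs the same substitution once in ERE and once in BRE. It assumes GNU sed for the BRE form, since \| is not guaranteed elsewhere; the variable names are made up for the example.

```shell
#!/bin/sh
line='c=one, o=roll, root ca          valid until: date'

# ERE (-E): ( ) group and | alternates without backslashes.
ere=$(printf '%s\n' "$line" | sed -E 's/,.*(valid|Located)/ \1/')

# BRE (default): \( \) group; \| for alternation is a GNU extension.
bre=$(printf '%s\n' "$line" | sed 's/,.*\(valid\|Located\)/ \1/')

printf '%s\n%s\n' "$ere" "$bre"
```

Both variables should hold the same text, c=one valid until: date, when the BRE extension is supported.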

See the online sed demo #1 and demo #2.

Details:

  • , - match the first comma
  • .* - match any 0 or more chars
  • [^ ] - then find a non-space char
  • \(   *[^ ][^:]*:\) - captures into Group 1 (\1) 2 or more spaces, followed by a non-space char ([^ ]), then 0 or more chars other than : ([^:]*), and then a :.

You may replace the space with \s (if supported) or [[:space:]] to match any whitespace, and [^ ] with [^[:space:]] or \S (if supported).
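For instance, substituting the POSIX character classes into the first command gives a variant that also handles tab-separated columns. This is a sketch only: trim_middle is a hypothetical name, and the tab-separated sample line is invented for the demonstration.

```shell
#!/bin/sh
# trim_middle is a hypothetical helper name used only for this illustration.
trim_middle() {
  # [[:space:]][[:space:]][[:space:]]* means "2 or more whitespace chars",
  # mirroring the "   *" (2 or more spaces) of the original command.
  sed 's/,.*[^[:space:]]\([[:space:]][[:space:]][[:space:]]*[^[:space:]][^:]*:\)/\1/'
}

# Invented sample: the columns are separated by two tabs instead of spaces.
printf 'c=one, o=roll, root ca\t\tvalid until: date\n' | trim_middle
```

The whitespace run between the columns is kept as-is, because it is part of Group 1.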

Upvotes: 1

KamilCuk

Reputation: 140960

I would remove everything up until multiple consecutive spaces, assuming there are no double spaces in the left column:

sed 's/,.*     /     /' file
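A minimal sketch of how this substitution behaves on one of the question's sample lines; cut_to_gap is a hypothetical wrapper name. Because .* is greedy, the match always ends at the last five spaces of the gap, so every line ends up with a gap of exactly five spaces rather than the original column alignment.

```shell
#!/bin/sh
# cut_to_gap is a hypothetical wrapper around the substitution shown above.
cut_to_gap() {
  # Delete from the first comma through the run of spaces,
  # leaving exactly five spaces before the right-hand column.
  sed 's/,.*     /     /'
}

printf '%s\n' 'c=one, o=roll, root ca          valid until: date' | cut_to_gap
```

Lines without a comma, or without a run of at least five spaces after it, pass through unchanged.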

Upvotes: 0

RavinderSingh13

Reputation: 133458

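You could try the following awk command, written and tested against the sample input. Using only the comma as the field separator keeps a multi-word first field such as c=two roll intact:

awk -F',' 'match($0,/ +valid.*| +Located.*/){print $1,substr($0,RSTART,RLENGTH)}' Input_file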
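The same match()/substr() idea can also produce fixed-width columns directly with printf. The sketch below is an assumption-laden variant, not the answer's command: trim_awk is a hypothetical name, and the width of 17 is chosen only because it resembles the spacing in the question's desired output.

```shell
#!/bin/sh
# trim_awk is a hypothetical helper; the width 17 is an assumed column size.
trim_awk() {
  awk -F',' 'match($0, /(valid|Located).*/) {
    # $1 is everything before the first comma;
    # RSTART and RLENGTH delimit the text that match() found.
    printf "%-17s%s\n", $1, substr($0, RSTART, RLENGTH)
  }'
}

printf '%s\n' \
  'c=one, o=roll, root ca          valid until: date' \
  'c=two roll, root ca             Located: location' | trim_awk
```

Lines where match() finds neither word are skipped, just as in the command above.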

Upvotes: 1
