user1356163

Reputation: 407

Extracting a pattern and a certain field from the line above it, preferably using awk and grep

I have a text file like this:

********** time1 **********
line of text1
line of text1.1
line of text1.2
********** time2 **********
********** time3 **********
********** time4 **********
line of text2.1
line of text2.2
********** time5 **********
********** time6 **********
line of text3.1

I want to extract each line of text together with the time (without the stars) above it and store the result in a file. Times with no line of text beneath them have to be ignored. I want to do this preferably with grep and awk. For example, my output for the above input should be:

time1 : line of text1
time1 : line of text1.1
time1 : line of text1.2
time4 : line of text2.1
time4 : line of text2.2
time6 : line of text3.1

How do I go about it?

Upvotes: 1

Views: 612

Answers (7)

William Pursell

Reputation: 212218

$ uniq -f 2 input-file | awk '{getline n; print $2 " : " n}'

If your timestamp has spaces in it, change the argument to the -f option so that uniq compares only the final string of *. E.g., use -f X, where X-2 is the number of spaces in the timestamp. If there are spaces in the timestamp, the awk command also needs to change. Either of these will work:

$ uniq -f 3 input-file | awk -F ' *[*]+ *' '{getline n; print $2 " : " n}'
$ uniq -f 3 input-file | awk '{getline n; $1=""; $NF=""; print $0 ": " n }'
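
To see what the uniq stage contributes on its own, here is a small sketch that runs `uniq -f 2` against the sample from the question. The temporary file and the variable names are only for illustration. Since uniq keeps the first line of each run, the marker that survives a run of consecutive time lines should be the earliest one (time2 rather than time4 in this sample), which is worth checking against the output you expect.

```shell
# Sample input copied from the question, written to a temporary file
# (the file and the variable names are illustrative only).
input=$(mktemp)
cat > "$input" <<'EOF'
********** time1 **********
line of text1
line of text1.1
line of text1.2
********** time2 **********
********** time3 **********
********** time4 **********
line of text2.1
line of text2.2
********** time5 **********
********** time6 **********
line of text3.1
EOF

# -f 2 skips the first two blank-separated fields of every line, so the
# marker lines are compared only on their trailing run of stars and
# consecutive markers collapse into the first one of the run.
collapsed=$(uniq -f 2 "$input")
printf '%s\n' "$collapsed"
rm -f "$input"
```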

Upvotes: 0

Tiksi

Reputation: 471

Works with spaces in the time:

awk '/^\*/ { x = $0; gsub(/^\*+ *| *\*+$/, "", x); next } { print x " : " $0 }' data.txt

Upvotes: 2

Derek Schrock

Reputation: 346

awk '{ if( $0 ~ /^\*+ time[0-9]+ \*+$/ ) { time = $2 } else { print time " : " $0 } }' file

Upvotes: 0

Gilles Quénot

Reputation: 185005

In awk:

#!/bin/bash

awk '
    BEGIN{
        t=0
    }
    {
        if ($0 ~ " time[0-9]+ ") {
            v=$2
            t=1
        }
        else if ($0 ~ "line of text") {
            if (t==1) {
                printf("%s : %s\n", v, $0)
            } else {
               t=0;
            }
        }
    }
' FILE

Just replace FILE with your filename.

Upvotes: 1

potong

Reputation: 58371

This might work for you (GNU sed):

sed '/^\*\+ \S\+.*/!d;s/[ *]//g;$!N;/\n[^*]/!D;s/\n/ : /' file

Explanation:

  • Delete any line that does not begin with *'s. /^\*\+ \S\+.*/!d
  • This is a time line. Delete the *'s and spaces (leaving the time). s/[ *]//g
  • Append the next line. $!N
  • If the second line begins with *'s, delete the first line and start again. /\n[^*]/!D
  • Otherwise the intended pattern has been found; replace \n with a spaced : and print. s/\n/ : /

Upvotes: 0

Dennis Williamson

Reputation: 359955

This assumes that there are no spaces in the time and that there is only one (or zero) line of text after each time marker.

awk '$1 ~ /\*+/ {prev = $2} $1 !~ /\*+/ {print prev, ":", $0}' inputfile
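
As a quick check, the sketch below runs that command on the sample from the question (the temporary file and variable names are illustrative only). Because prev is overwritten only on marker lines, it appears to cope with several text lines after one marker as well, at least for this input.

```shell
# Sample input copied from the question, written to a temporary file
# (the file and the variable names are illustrative only).
inputfile=$(mktemp)
cat > "$inputfile" <<'EOF'
********** time1 **********
line of text1
line of text1.1
line of text1.2
********** time2 **********
********** time3 **********
********** time4 **********
line of text2.1
line of text2.2
********** time5 **********
********** time6 **********
line of text3.1
EOF

# Marker lines update prev; every other line is printed behind the
# most recent marker, separated by " : " (awk joins with OFS, a space).
result=$(awk '$1 ~ /\*+/ {prev = $2} $1 !~ /\*+/ {print prev, ":", $0}' "$inputfile")
printf '%s\n' "$result"
rm -f "$inputfile"
```

For this input the six printed lines match the output requested in the question, with `line of text3.1` as the last text line.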

Upvotes: 2

Zsolt Botykai

Reputation: 51593

You can do it like this with vim:

:%s_\*\+ \(YOUR TIME PATTERN\) \*\+\_.\(\[^*\].*\)$_\1 : \2_ | g_\*\+ YOUR TIME PATTERN \*\+_d

This searches for TIME PATTERN lines and captures the time pattern and the next line, provided that line does not start with *, then builds the new line from the two captures. Finally, it deletes every remaining TIME PATTERN line.

Note that this assumes the time pattern lines end with *, etc.

With awk (gensub requires GNU awk):

awk '/\*+ YOUR TIME PATTERN \*+/ { time=gensub(/\*+ (YOUR TIME PATTERN) \*+/,"\\1","g") }
     ! /\*+ YOUR TIME PATTERN \*+/ { print time " : " $0 }' INPUTFILE

And there are other ways to do it.
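
One such alternative, as a minimal sketch that avoids gensub and so should also run in non-GNU awks: treat each run of stars, plus any spaces around it, as the field separator, so the timestamp ends up in field 2 even when it contains spaces. The function name `extract` and the sample timestamps are made up for illustration, and the sketch assumes text lines never start with *.

```shell
# extract is a hypothetical helper name; pass it one or more input files.
extract() {
  awk -F ' *[*]+ *' '
    /^[*]/ { time = $2; next }   # marker line: text between the star runs is field 2
    { print time " : " $0 }      # any other line: prefix it with the latest marker
  ' "$@"
}

# Illustrative input with spaces inside the timestamps (not from the question).
sample=$(mktemp)
cat > "$sample" <<'EOF'
********** Mon 10:15 **********
first line
********** Mon 10:20 **********
********** Mon 10:25 **********
second line
third line
EOF

extracted=$(extract "$sample")
printf '%s\n' "$extracted"
rm -f "$sample"
```

To store the result in a file, as the question asks, redirect the output, e.g. `extract input.txt > output.txt`.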

Upvotes: 1
