Harshu

Reputation: 91

Unix: Modify a line in a file only if it is preceded by a particular line

I have a text file that looks like this -

Random text
Some more random text ...

TEXT_CATEGORY_A(
    SOME_INFO, A,
    "Some random text.",
    "Other info.",
    )
TEXT_CATEGORY_B(
    SOME_INFO, B,
    "Some random text.",
    "Other info.",
    )
TEXT_CATEGORY_C(
    SOME_INFO, C,
    "Some random text.",
    "Other info.",
    )

and so on ... I want to remove the trailing comma from the last line of each TEXT_CATEGORY container, i.e. from -

"Other info.",

So the final form of the file should look like this -

Random text
Some more random text ...

TEXT_CATEGORY_A(
    SOME_INFO, A,
    "Some random text.",
    "Other info."
    )
TEXT_CATEGORY_B(
    SOME_INFO, B,
    "Some random text.",
    "Other info."
    )
TEXT_CATEGORY_C(
    SOME_INFO, C,
    "Some random text.",
    "Other info."
    )

If I could tell that the next line contains only the ) character, I could solve this problem. I don't see how to do that with sed, since it processes the file one line at a time. Is there a way to look at the contents of the next line, or is there some other way to solve this?

Upvotes: 1

Views: 54

Answers (3)

Harshu

Reputation: 91

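This can be solved using sed as follows -

sed -E '/,$/N; s/",(\n\s*\))$/"\1/' file

This is a slight modification of the solution provided by @RomanPerekhrest: the newline and the indentation in front of ) are captured and put back, so the closing line keeps its original indentation.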

Upvotes: 0

RavinderSingh13

Reputation: 133590

1st solution: Could you please try the following, using tac + awk. Reading the file in reverse turns "the next line is )" into "the previous line was )", which awk can remember in a variable.

tac Input_file | awk 'p{sub(/,$/,"")} {p=($0~/^ *\)$/)} 1' | tac

Explanation: Adding explanation for the above code.

tac Input_file          ##Using tac to print Input_file in reverse order.
awk '                   ##Passing tac command output to awk program from here.
p{sub(/,$/,"")}         ##If the previous (reversed) line was a lone ), removing the trailing comma from the current line.
{p=($0~/^ *\)$/)}       ##Setting p to 1 when the current line contains only spaces and ), else to 0.
1                       ##Mentioning 1 will print edited/non-edited line here.
' | tac                 ##Passing awk command output to tac again to restore the original order.


2nd solution: With GNU awk, reading the whole file as a single record.

awk -v RS='^$' '
{
  printf "%s", gensub(/,(\n[[:space:]]*\))/, "\\1", "g")
}
' Input_file

Explanation: Adding explanation for the above code.

awk -v RS='^$' '                                            ##Starting awk program and setting RS(record separator) to a regex that never matches inside the data, so the whole Input_file becomes one record (GNU awk).
{                                                           ##Starting main BLOCK here.
  printf "%s", gensub(/,(\n[[:space:]]*\))/, "\\1", "g")    ##Removing every comma that is followed by a newline, optional spaces and ), keeping the newline, spaces and ) through \\1.
}                                                           ##Closing main BLOCK here.
' Input_file                                                ##Mentioning Input_file name here.
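
The same "look at the neighbouring line" idea can also be done in a single forward pass by holding each line back by one. The following is only a sketch in plain awk, not part of the original answer: the temporary file, its sample contents, and the variable names prev and result are made up for illustration.

```shell
# Hypothetical sample input written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'Random text' '' 'TEXT_CATEGORY_A(' \
    '    SOME_INFO, A,' '    "Other info.",' '    )' > "$tmp"

# Hold each line back by one. When the current line is a lone ")",
# strip the trailing comma from the held line before printing it.
result=$(awk '
    NR > 1 {
        if ($0 ~ /^[ \t]*\)[ \t]*$/)
            sub(/,[ \t]*$/, "", prev)
        print prev
    }
    { prev = $0 }                      # remember the current line
    END { if (NR > 0) print prev }     # flush the last held line
' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Because only one line is buffered, this does not need tac or any GNU extension, and other commas in the file are left alone.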

Upvotes: 1

RomanPerekhrest

Reputation: 92854

Flexibly with sed command:

sed -E '/,$/N; s/([^,]+),\s+\)$/\1\n)/' file
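  • /,$/ - match a line that ends with ,
  • N - append the next line to the pattern space (separated by a newline)
  • ([^,]+),\s+\)$ - text without commas, then a comma followed only by whitespace (including the newline) and a closing )
  • \1 - the 1st captured group (points to ([^,]+))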

The output:

Random text
Some more random text ...

TEXT_CATEGORY_A(
    SOME_INFO, A,
    "Some random text.",
    "Other info."
)
TEXT_CATEGORY_B(
    SOME_INFO, B,
    "Some random text.",
    "Other info."
)
TEXT_CATEGORY_C(
    SOME_INFO, C,
    "Some random text.",
    "Other info."
)
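
Note that /,$/N joins lines in pairs, so whether "Other info.", ends up in the same pattern space as ) depends on how many comma-terminated lines come before it. A sliding two-line window avoids that dependency. This is a minimal sketch rather than a tested drop-in replacement: the temporary file, its contents, and the variable name fixed are assumptions for illustration.

```shell
# Hypothetical sample input: block A has three comma-terminated lines,
# block B has two, so pairwise joining would not line up for both.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
TEXT_CATEGORY_A(
    SOME_INFO, A,
    "Some random text.",
    "Other info.",
    )
TEXT_CATEGORY_B(
    SOME_INFO, B,
    "Other info.",
    )
EOF

# $!N  append the next line to the pattern space (except on the last line)
# s    drop a comma that is followed by a newline and a lone ")"
# P    print the first line of the pattern space
# D    delete that first line and restart with the remainder
fixed=$(sed -e '$!N' \
            -e 's/,\(\n[[:space:]]*)[[:space:]]*\)$/\1/' \
            -e 'P' -e 'D' "$tmp")

printf '%s\n' "$fixed"
rm -f "$tmp"
```

Every line is looked at together with its successor, so the comma before ) is removed in both blocks, and the indentation of ) is kept because it is part of the captured group.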

Upvotes: 2
