Reputation: 91
I have a text file that looks like this -
Random text
Some more random text ...
TEXT_CATEGORY_A(
SOME_INFO, A,
"Some random text.",
"Other info.",
)
TEXT_CATEGORY_B(
SOME_INFO, B,
"Some random text.",
"Other info.",
)
TEXT_CATEGORY_C(
SOME_INFO, C,
"Some random text.",
"Other info.",
)
and so on ... I want to remove the trailing comma from the last line of each TEXT_CATEGORY block, i.e. from -
"Other info.",
So the final form of the file should look like this -
Random text
Some more random text ...
TEXT_CATEGORY_A(
SOME_INFO, A,
"Some random text.",
"Other info."
)
TEXT_CATEGORY_B(
SOME_INFO, B,
"Some random text.",
"Other info."
)
TEXT_CATEGORY_C(
SOME_INFO, C,
"Some random text.",
"Other info."
)
If I can somehow find out that the next line contains only the ) character then I can solve this problem.
I could not solve this with sed, as it reads the file line by line. Is there some way to look at the contents of the next line, or is there some other way to solve this?
Upvotes: 1
Views: 54
Reputation: 91
This is solved using sed as follows -
sed -E '/,$/N; s/",\n\)$/"\n)/' file
This is a slight modification to the solution provided by @RomanPerekhrest.
Upvotes: 0
Reputation: 133590
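1st solution: Could you please try the following, using tac + awk. Reading the file in reverse turns "look at the next line" into "remember the previous line": the line that follows a lone ) in the reversed input is the one whose trailing comma has to go.
tac Input_file | awk 'prev==")"{sub(/,$/,"")} {prev=$0} 1' | tac
Explanation: Adding explanation for above code.
tac Input_file | ##Using tac to print Input_file in reverse order.
awk ' ##Passing tac command output to awk program from here.
prev==")"{ ##Checking if previous line (the next line in original order) contains only ) here.
sub(/,$/,"") ##Using sub to substitute trailing comma of current line with NULL here.
} ##Closing BLOCK here.
{prev=$0} ##Saving current line in prev for the check on the following line.
1 ##Mentioning 1 will print edited/non-edited line here.
' | tac ##Passing awk command output to tac again to bring it back to its normal order.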
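2nd solution: With GNU awk. This assumes the blocks are separated by blank lines, since RS="" makes awk read each blank-line-separated paragraph as one record; if there are no blank lines, the whole file is a single record and only its last comma is removed.
awk -v RS="" -v ORS="\n\n" '
match($0,/.*,/){
$0=substr($0,RSTART,RLENGTH-1) substr($0,RSTART+RLENGTH)
}
1
' Input_file
Explanation: Adding explanation for above code.
awk -v RS="" -v ORS="\n\n" ' ##Starting awk program from here, setting RS(record separator) as NULL to read paragraphs and ORS to keep a blank line between them in output.
match($0,/.*,/){ ##Using match function of awk to match a regex till last occurrence of comma in current record.
$0=substr($0,RSTART,RLENGTH-1) substr($0,RSTART+RLENGTH) ##Rebuilding record from substring RSTART till RLENGTH-1 (everything before last comma) and substring from RSTART+RLENGTH till end of record (everything after it).
} ##Closing BLOCK for match condition here.
1 ##Mentioning 1 will print every record, edited or not.
' Input_file ##Mentioning Input_file name here.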
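Since the question asks whether the next line can be inspected, here is a rough single-pass sketch that does it by holding each line back by one: a line is only printed once the line after it has been read. It assumes the closing line is exactly ) with no surrounding whitespace; the temporary file and its contents are made up for the demonstration.

```shell
# Build a small sample file (hypothetical contents modelled on the question).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Random text
TEXT_CATEGORY_A(
SOME_INFO, A,
"Some random text.",
"Other info.",
)
TEXT_CATEGORY_B(
SOME_INFO, B,
"Other info.",
)
EOF

# prev holds the line that has been read but not yet printed.
# When the current line is a lone ")", strip the trailing comma from prev.
result=$(awk '
NR > 1 {
    if ($0 == ")") sub(/,$/, "", prev)
    print prev
}
{ prev = $0 }
END { if (NR) print prev }
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

This needs neither tac nor blank lines between the blocks, and it keeps only one line in memory at a time.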
Upvotes: 1
Reputation: 92854
Flexibly with sed command:
sed -E '/,$/N; s/([^,]+),\s+\)$/\1\n)/' file
/,$/ - match a line that ends with ,
N - append the next line to the pattern space
\1 - the 1st captured group (points to ([^,]+))
The output:
Random text
Some more random text ...
TEXT_CATEGORY_A(
SOME_INFO, A,
"Some random text.",
"Other info."
)
TEXT_CATEGORY_B(
SOME_INFO, B,
"Some random text.",
"Other info."
)
TEXT_CATEGORY_C(
SOME_INFO, C,
"Some random text.",
"Other info."
)
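One caveat with the /,$/N approach: it consumes comma-terminated lines in pairs, so the last element only shares the pattern space with ) when an even number of comma-terminated lines precedes it in the block (as in the sample, where there are two). A possible sliding-window variant using N, P and D is sketched below; it looks at every adjacent pair of lines, so the count no longer matters. The temporary file and the two-element block in it are made up to exercise that case.

```shell
# Build a small sample file (hypothetical contents; block A has only two
# comma-terminated lines, which the pairwise N approach would miss).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Random text
TEXT_CATEGORY_A(
SOME_INFO, A,
"Other info.",
)
TEXT_CATEGORY_B(
SOME_INFO, B,
"Some random text.",
"Other info.",
)
EOF

# $!N  append the next line to the pattern space (except on the last line)
# s    drop a comma that is directly followed by a line holding only ")"
# P    print the first line of the pattern space
# D    delete that first line and restart the cycle with what is left
result=$(sed -E '$!N; s/,(\n\))$/\1/; P; D' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

The newline is kept through the capture group rather than written as \n in the replacement, which should make the command less dependent on GNU sed extensions.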
Upvotes: 2