Reputation: 68426
I am in the process of porting an existing database schema to Postgresql.
I need to replace occurrences of the word 'go' with a semi comma.
I have noticed that the word 'go' appears in the text, in the following pattern:
I want to replace the above pattern with the following one:
I am trying to build a regex expression which I can use with sed, to perform the replacement described above - but I am relatively new to regex.
For the purpose of clarity, I have included sample text BEFORE and AFTER the substitution I want to achieve:
-- Original File contents below -------
go
CREATE TABLE foobar
(
f1 INT,
f2 INT,
f3 FLOAT,
f4 VARCHAR(32) NOT NULL,
f5 INT,
f6 datetime,
f7 smallint
)
go
GRANT UPDATE, INSERT, DELETE, SELECT ON foobar TO dbusr
go
CREATE UNIQUE INDEX idxu_foobar ON foobar (f1, f2)
go
--- REPLACED FILE CONTENTS -----------
go
CREATE TABLE foobar
(
f1 INT,
f2 INT,
f3 FLOAT,
f4 VARCHAR(32) NOT NULL,
f5 INT,
f6 datetime,
f7 smallint
);
GRANT UPDATE, INSERT, DELETE, SELECT ON foobar TO dbusr;
CREATE UNIQUE INDEX idxu_foobar ON foobar (f1, f2);
Can anyone help with the expression to use to achieve this, so I can execute:
sed -i 's/original_match_expr/replacement_expr/g' myfile.sql
Upvotes: 1
Views: 453
Reputation: 36262
Try following solution using the GNU
version of sed:
sed -ne ':a; $! { N; ba }; s/\([^[:space:]]\)[[:space:]]*go/\1;/g; p' infile
It reads the whole file to a buffer and replace all go
words and all blanks that precede it with a semicolon. It yields:
go
CREATE TABLE foobar
(
f1 INT,
f2 INT,
f3 FLOAT,
f4 VARCHAR(32) NOT NULL,
f5 INT,
f6 datetime,
f7 smallint
);
GRANT UPDATE, INSERT, DELETE, SELECT ON foobar TO dbusr;
CREATE UNIQUE INDEX idxu_foobar ON foobar (f1, f2);
EDIT to add an explanation (see comments):
It's not as hard as it seems.
:a; $! { N; ba }
is a loop that reads every line of input to a buffer.
[[:space:]]
matches any whitespace character and [^[:space:]]
negates it. So the substitution command replaces from last non-whitespace character until the word go
. If there is only whitespace before the go
word as in the first case, the substitution doesn't match and does replace nothing.
Upvotes: 1
Reputation: 23364
With gawk
awk -v RS='\\s*go' '{print $0""(RT ~ /go/? ";\n\n": "")}' file.txt
The record separator RS
is set to 0 or more space characters followed by go
. GNU awk then treats the block of text between two successive instances of record separators as a record. So print the record followed by a custom record separator (;
followed by two newlines)
Upvotes: 1