Reputation: 4378
I have a file containing queries, each query ends with a ";", I want to add a "COMMIT;BEGIN;" every 100 queries. Queries can take more than 1 line.
For instance:
INSERT INTO table
VALUES(...);
DELETE FROM table WHERE ...;
UPDATE table
SET ...;
So I want to replace every 100th ";" with "COMMIT;BEGIN;" (I know I have to add a BEGIN; at the beginning of the file and do something at the end but it's easy)
I need to do that in a shell script but I am not a Linux expert, is it better to use sed or awk (also the file size can be huge like 4GB)? I know the basics for those commands but I don't know if I can do what I want here...
Thanks!
Upvotes: 2
Views: 993
Reputation: 58381
This might work for you (GNU sed);
sed ':a;$!{N;ba};s/^\([^;]*\(;[^\n][^;]*\)*;\s*$\)\{100\}/&\nCOMMIT;BEGIN;/mg' file
Upvotes: 0
Reputation: 27050
I would recommend not to try to replace the semicolons. Instead, "parse" all the command and put BEGIN
and COMMIT
around the bunch of commands you want to commit. This is fairly easy with sed. If you have this file:
$ cat my.sql
INSERT INTO table VALUES (1);
INSERT INTO table VALUES (2);
INSERT INTO table VALUES (3);
INSERT INTO table VALUES (4);
INSERT INTO table VALUES (5);
INSERT INTO table VALUES (6);
INSERT INTO table VALUES (7);
INSERT INTO table VALUES (8);
INSERT INTO table VALUES (9);
INSERT INTO table VALUES (10);
just run this command:
$ sed -n 'H;${x;s/\([^;]*;\)\{,3\}/BEGIN;&\nCOMMIT;\n\n/g;p}' my.sql
BEGIN;
INSERT INTO table VALUES (1);
INSERT INTO table VALUES (2);
INSERT INTO table VALUES (3);
COMMIT;
BEGIN;
INSERT INTO table VALUES (4);
INSERT INTO table VALUES (5);
INSERT INTO table VALUES (6);
COMMIT;
BEGIN;
INSERT INTO table VALUES (7);
INSERT INTO table VALUES (8);
INSERT INTO table VALUES (9);
COMMIT;
BEGIN;
INSERT INTO table VALUES (10);
COMMIT;
(Here I am using 3 as the size of the block for clarity. For "embracing" blocks of 100 commands, replace \{,3\}
with \{,100\}
)
What does it do?
First, we inhibit the printing of lines with -n
. Now sed will only print lines when we order it to print them explicitly.
Now, for each line, we append the line to the hold space with H
.
At the last line (address $
) we execute a block of commands (beginning at {
and ending at }
). The first command, x
, swaps the content of the hold space (which now contains all the file) and the pattern space.
After this, we replace with s///
n (0 < n <= 3) series of characters (none being ;
) followed by ;
with the string BEGIN;
, the matched block of commands (represented by &
) and the string \nCOMMIT;\n\n
(with newlines for better readability.
At last, we print the content of the pattern space with p
.
Upvotes: 1
Reputation: 359985
If the semicolons are at the end of the line:
awk '{print} /;$/ && ! (count++%100) {print "COMMIT; BEGIN;"}' inputfile
If it's important that precisely 100 semicolons are counted and they may be anywhere in a line it can be done but it gets a bit more complicated.
Upvotes: 0
Reputation: 212238
If you can guarantee that the ;
is at the end of the line, or if you don't really care about lines with multiple ;
's, the easy solution (untested) is:
awk '/;/{ count+=1 } {print} count==100 { print "COMMIT; BEGIN"; count=0 }'
Upvotes: 1