Sal00m

Reputation: 2916

Delete lines from file using sed containing different patterns

I have a dump from a production DB, but I want to remove the data of some tables (messages, messages_files, etc.) because it is not useful for debugging or programming locally.

I have been using these commands to remove the lines containing this kind of data:

sed -i '/CREATE DATABASE/d' $current_main_db.sql &&
sed -i '/USE \`okn/d' $current_main_db.sql && 
sed -i '/INSERT INTO \`messages\`/ d' $current_main_db.sql && 
sed -i '/INSERT INTO \`messages_email_cron\`/ d' $current_main_db.sql &&
sed -i '/INSERT INTO \`messages_users\`/ d' $current_main_db.sql &&
sed -i '/INSERT INTO \`messages_files\`/ d' $current_main_db.sql &&
sed -i '/INSERT INTO \`messages_mail_list\`/ d' $current_main_db.sql &&
sed -i '/INSERT INTO \`messages_sms_cron\`/ d' $current_main_db.sql &&
sed -i '/INSERT INTO \`messages_tags\`/ d' $current_main_db.sql &&
sed -i '/INSERT INTO \`messages_temp_receivers\`/ d' $current_main_db.sql &&
sed -i '/INSERT INTO \`messages_threads\`/ d' $current_main_db.sql;

It works well but is very slow, so I tried to combine all the patterns into one sed command. I read the manual and found this:

regexp1\|regexp2

Matches either regexp1 or regexp2. Use parentheses to use complex alternative regular expressions. The matching process tries each alternative in turn, from left to right, and the first one that succeeds is used. It is a GNU extension.

So I tried this:

sed -i '/CREATE DATABASE\|USE \`okn\|INSERT INTO \`messages\`\|INSERT INTO \`messages_email_cron\`\|INSERT INTO \`messages_users\`\|INSERT INTO \`messages_files\`\|INSERT INTO \`messages_mail_list\`\|INSERT INTO \`messages_sms_cron\`\|INSERT INTO \`messages_tags\`\|INSERT INTO \`messages_temp_receivers\`\|INSERT INTO \`messages_threads\`/ d' $current_main_db.sql;

But it does not work. I also tried wrapping every pattern in parentheses, without any luck:

sed -i '/(CREATE DATABASE\|USE \`okn)\|(INSERT INTO \`messages\`)\|(INSERT INTO \`messages_email_cron\`)\|(INSERT INTO \`messages_users\`)\|(INSERT INTO \`messages_files\`)\|(INSERT INTO \`messages_mail_list\`)\|(INSERT INTO \`messages_sms_cron\`)\|(INSERT INTO \`messages_tags\`)\|(INSERT INTO \`messages_temp_receivers\`)\|(INSERT INTO \`messages_threads\`)/d'

Am I doing something wrong?

I searched SO and found some similar questions, but their answers did not work for me.

Upvotes: 0

Views: 436

Answers (3)

Michael Vehrs

Reputation: 3363

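Your attempt is slow because you start a new sed instance for every command, and each instance reads and rewrites the whole dump. Your combined regexp is complicated because it tries to handle all the expressions at once. There is a compromise, however: give a single sed several commands separated by semicolons: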

sed '/pattern1/d; /pattern2/d; ...'

Also note that you can simplify your regular expression as demonstrated by @CasimirEtHippolyte.
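
As a rough sketch of the single-process form, here it is applied to a tiny made-up dump. The temporary file, its contents, and the variable names `dump` and `result` are assumptions for illustration only. The backticks are written unescaped because the shell leaves them alone inside single quotes.

```shell
# Hypothetical sample dump; file name and rows are invented for this sketch.
dump=$(mktemp)
cat > "$dump" <<'EOF'
CREATE DATABASE `okn_main`;
USE `okn_main`;
INSERT INTO `messages` VALUES (1,2);
INSERT INTO `messages_files` VALUES (1,2);
INSERT INTO `users` VALUES (1,2);
EOF

# One sed process, one pass over the file, four delete commands.
result=$(sed '/CREATE DATABASE/d; /USE `okn/d; /INSERT INTO `messages`/d; /INSERT INTO `messages_files`/d' "$dump")

# Only the row for the table we want to keep should remain.
printf '%s\n' "$result"
rm -f "$dump"
```

On a real dump you would add `-i` (or redirect to a new file) once the output looks right.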

Upvotes: 1

Casimir et Hippolyte

Reputation: 89557

grep should suffice:

grep -vE '^(INSERT INTO `messages(_email_cron|_users|_files|_mail_list|_sms_cron|_tags|_temp_receivers|_threads)?`|CREATE DATABASE|USE `okn)' file
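
Since grep cannot edit a file in place, the filtered result has to go to a new file. A minimal sketch, assuming a made-up dump and a hypothetical table `messages_keep` that should survive. The `?` after the group makes the suffix optional, so the plain `messages` table is matched too, and the closing backtick stops the pattern from matching other tables that merely start with `messages`.

```shell
# Hypothetical sample dump and output file; names and rows are invented.
dump=$(mktemp)
filtered=$(mktemp)
cat > "$dump" <<'EOF'
CREATE DATABASE `okn_main`;
USE `okn_main`;
INSERT INTO `messages` VALUES (1);
INSERT INTO `messages_files` VALUES (2);
INSERT INTO `messages_keep` VALUES (3);
INSERT INTO `users` VALUES (4);
EOF

# -v inverts the match (print non-matching lines), -E enables ( | ) ? without backslashes.
grep -vE '^(INSERT INTO `messages(_files|_users)?`|CREATE DATABASE|USE `okn)' "$dump" > "$filtered"

remaining=$(cat "$filtered")
printf '%s\n' "$remaining"
rm -f "$dump" "$filtered"
```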

Upvotes: 1

Julien Cassette

Reputation: 11

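Escape the parentheses as well: in sed's default basic regular expressions, grouping is written \( and \). Leave the backticks unescaped, because inside single quotes the shell does not touch them, and GNU sed reads \` as an anchor for the start of the pattern space rather than as a literal backtick. The command also needs the file name:

sed -i '/\(CREATE DATABASE\)\|\(USE `okn\)\|\(INSERT INTO `messages`\)\|\(INSERT INTO `messages_email_cron`\)\|\(INSERT INTO `messages_users`\)\|\(INSERT INTO `messages_files`\)\|\(INSERT INTO `messages_mail_list`\)\|\(INSERT INTO `messages_sms_cron`\)\|\(INSERT INTO `messages_tags`\)\|\(INSERT INTO `messages_temp_receivers`\)\|\(INSERT INTO `messages_threads`\)/d' "$current_main_db.sql"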
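
For comparison, here is a small sketch of the same kind of alternation in both regex flavours, assuming GNU sed (`\|` and `\?` in basic regular expressions are GNU extensions). The sample dump and the variable names `dump`, `bre` and `ere` are made up for illustration. Backticks are left unescaped inside the single quotes.

```shell
# Hypothetical sample dump; file name and rows are invented for this sketch.
dump=$(mktemp)
cat > "$dump" <<'EOF'
CREATE DATABASE `okn_main`;
INSERT INTO `messages` VALUES (1);
INSERT INTO `messages_tags` VALUES (2);
INSERT INTO `users` VALUES (3);
EOF

# BRE (sed's default): operators need backslashes: \( \) \| \?
bre=$(sed '/\(CREATE DATABASE\)\|\(INSERT INTO `messages\(_tags\)\?`\)/d' "$dump")

# ERE (-E): ( ) | ? are operators as written, so the pattern is easier to read.
ere=$(sed -E '/CREATE DATABASE|INSERT INTO `messages(_tags)?`/d' "$dump")

printf 'BRE: %s\nERE: %s\n' "$bre" "$ere"
rm -f "$dump"
```

Both variants delete the same lines. The `-E` form avoids most of the backslashes, which is what made the original pattern hard to get right.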

Upvotes: 1
