Emrah
Emrah

Reputation: 29

Exclude some characters in string

In quotes I try to exclude the ; characters using regex in Node.js.

For example i have an sql like below:

update codes set character='.' where character = ';' and write_date < now()-5 ;

I want to find effected rows before execute the statement. I wrote below regex but it not work correctly if there is ; character in quotes as above

const regexp = /update\s+((.|\n)[^;]*?)\s+set\s+((.|\n)[^;]*?)\s+where\s+((.|\n)[^;]*?);/ig;
regexp.exec(str)

Expected output:

table: codes
where: character = ';' and write_date < now()-5

But I get:

table: codes
where: character = ';

Upvotes: 0

Views: 705

Answers (2)

Wiktor Stribiżew
Wiktor Stribiżew

Reputation: 626835

You can use

update\s+([^;]*?)\s+set\s(?:[^;]*?\s)?where\s+((?:'[^']*'|[^;])*?)\s*;

See the regex demo. Details:

  • update - a word
  • \s+ - one or more whitespaces
  • ([^;]*?) - Group 1: zero or more but as few as possible chars other than ;
  • \s+ - one or more whitespaces
  • set - a word
  • \s - a whitespace
  • (?:[^;]*?\s)? - an optional sequence of any chars other than ; as few as possible, and then a whitespace
  • where - a word
  • \s+ - one or more whitespaces
  • ((?:'[^']*'|[^;])*?) - Group 2: zero or more (as few as possible) sequences of ', zero or more non-'s, and then ', or any single char other than a ;
  • \s* - zero or more whitespaces
  • ; - a ; char.

Upvotes: 2

RoToRa
RoToRa

Reputation: 38400

First of, I'm not sure what (.|\n) is for, so I'm ignoring that.

I believe there are two problems with your regexp, changing either will probably solve your problem, but I'd change both, just to be sure.

  1. The ? after the * makes the * non-greedy, which means the regex will match as little as possible, so that the final ; in the regexp will match the first possible ; it finds, not the last possible. So I'd leave the ? out.

  2. The regexp doesn't use $ to anchor to the end of string. Add $ after ; at the end (possibly \s*$ if you expect additional white space at the end of the string). If you do this, you actually don't need to exclude ;. And it may be a good idea, to add ^ (or ^\s*) at the beginning to anchor to the beginning of the string, too.

So the resulting regexp is

const regexp = /^\s*update\s+((.|\n).*)\s+set\s+((.|\n).*)\s+where\s+((.|\n).*);\s*$/ig;

Finally some conceptional ideas: Why are you doing this in the first place? Instead of starting with the UPDATE SQL, why don't you start out with the structure:

{ 
  table: "codes",
  where: "character = ';' and write_date < now()-5"
}

and build both the UPDATE and the SELECT SQLs from that?

Or if you only have the UPDATE SQL, instead of using a regular expression, there are SQL parser libraries (example) which would probably be more reliable.

Upvotes: 0

Related Questions