jpganz18
jpganz18

Reputation: 5858

how can I make a Regex to match prefix and multiple sufix at same time?

Ive encounter this problem.

I need to clean some corrupted data that has the following pattern:

VALUES (  1', NULL, 'Smith', 'Jackson',...
VALUES (  2', 'jon@doe', 'John', 'Doe',...

Basically, I need to remove the extra ', for that I first need to match it...

So I came up with this

(?<=VALUES \(  \d+)(.*')(?=(, '[a-zA-Z])||(^NULL,$))

Unfortunately, it's not matching anything, the \d+ is specially problematic.

Any idea how I can isolate the 1', 2', ... (only the ' after the number that can have N digits) and be able to match the prefix for 'string or NULL

Upvotes: 2

Views: 851

Answers (2)

The fourth bird
The fourth bird

Reputation: 163207

One option is to use

\b(VALUES\s*\(\s*\d+)'(?=,\s*(?:NULL|'[a-zA-Z]))

Explanation

  • \b A word boundary
  • ( Capture group 1 (to be used in the replacement keeping this part)
    • VALUES\s*\(\s*\d+ Match VALUES, ( and then match 1+ digits
  • ) Close group 1
  • ' Match the ' that you want to remove
  • (?=,\s*(?:NULL|'[a-zA-Z]) Positive lookahead to make sure that directly to the right is a comma followed by NULL or ' and a char a-z

Regex demo

In the replacement use group 1.

Output

VALUES (  1, NULL, 'Smith', 'Jackson',...
VALUES (  2, 'jon@doe', 'John', 'Doe',...

Upvotes: 2

Tim Biegeleisen
Tim Biegeleisen

Reputation: 520878

Assuming the pattern/trend you showed continues for the entire VALUES clause, I would recommend the following find and replace, in regex mode:

Find:   (VALUES\s+\(\s*\d+)',
Replace: $1,

Demo

Note: Another possibility, perhaps simpler than above, might be available if the stray single quote appearing after the numbers also happens to appear in a single column. If so, then you could do a block select of that column across the entire VALUES clause and just delete it away.

Upvotes: 2

Related Questions