dummly

Reputation: 9

regex code: does or does not contain a character

I can't figure this out. I want to capture the string inside the square brackets, whether or not there are any characters in it.

[5123512], [412351, 1235123, 5125123], [12312-AA] and []

I want to convert the square brackets into double quotes:

[5123512] ==> "5123512"
[412351, 1235123, 5125123] ==> "412351, 1235123, 5125123"

[12312-AA] ==> "12312-AA"
[] ==> ""

I tried \[\d+\], but it is not working.

This is my sample data, in JSON format. Square brackets inside the description should not be changed, only those in the attributes.

{"results":
[{"listing": 4613456431,"sku": [5123512],"category":[412351, 1235123, 
5125123],"subcategory": "12312-AA", "description":"This is [123sample]"} 
{"listing": 121251,"sku":[],"category": [412351],"subcategory": "12312-AA", 
"description": "product sample"}]}

TIA

Upvotes: 0

Views: 141

Answers (2)

Martti Heikkilä

Reputation: 23

First off, regex is a horrible tool for parsing JSON-formatted data. I'm sure you'll find plenty of tools in vb.net that simply read your JSON and let you manipulate it in simpler ways than treating it as text. For example: How to parse json and read in vb.net
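To illustrate the "parse it instead" route, here is a minimal sketch in Python rather than vb.net (only the idea carries over, not the API). It assumes a corrected copy of the sample with a comma added between the two objects, since the data as posted would not parse, and the helper name lists_to_strings is made up for this example.

```python
import json

# Assumption: corrected copy of the sample from the question. A comma was
# added between the two objects so that the text is valid JSON.
raw = '''
{"results": [
  {"listing": 4613456431, "sku": [5123512], "category": [412351, 1235123, 5125123],
   "subcategory": "12312-AA", "description": "This is [123sample]"},
  {"listing": 121251, "sku": [], "category": [412351],
   "subcategory": "12312-AA", "description": "product sample"}
]}
'''


def lists_to_strings(record):
    """Return a copy of record with every list value joined into one string."""
    return {
        key: ", ".join(str(item) for item in value) if isinstance(value, list) else value
        for key, value in record.items()
    }


data = json.loads(raw)
data["results"] = [lists_to_strings(record) for record in data["results"]]

# Serialize back to JSON text; list attributes are now quoted strings.
converted = json.dumps(data)
print(converted)
```

Because only values that are real JSON arrays get converted, the square brackets inside the description string are left alone without any special handling.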

Original answer (edited slightly):

You're almost there, but here are a few things you need to change:

  • in your regex pattern, escape the square brackets: \[ and \]
  • if you only want to capture all characters in the brackets, then . is a good way to go
  • the plus sign + means "at least one" — if you want to match empty brackets too, use *? instead
    • the question mark means "lazy" — it explicitly tells the regex to match the shortest sequence of characters possible (instead of going over to the next square bracket...)
  • wrap the .*? in parentheses so that you can refer to that part later when substituting
  • finally, the output value / pattern to substitute with is \1 or $1, depending on the context
    • or "\1" or "$1" if you really need the double quotes in the output — maybe you just need a string variable?

All in all this becomes:

Find this: \[(.*?)\]

Replace with: \1
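As a quick check of that find/replace pair, here is a small sketch using Python's re module (an assumption; the question is about vb.net, where the replacement reference would be written $1). It uses the quoted form "\1" of the replacement, since the question asks for double quotes in the output.

```python
import re

# Lazy match of anything between an escaped [ and the next ].
pattern = re.compile(r"\[(.*?)\]")

samples = ["[5123512]", "[412351, 1235123, 5125123]", "[12312-AA]", "[]"]

# Replace each bracketed group with the same text wrapped in double quotes.
converted = [pattern.sub(r'"\1"', sample) for sample in samples]
for before, after in zip(samples, converted):
    print(before, "==>", after)

# Caveat: the pattern knows nothing about JSON, so brackets inside a
# description string are rewritten as well.
description = pattern.sub(r'"\1"', "This is [123sample]")
print(description)
```

Note that . does not match a line break by default in most regex flavours, so an array that wraps across two lines, like category in the sample data, would need a single-line/DOTALL option or a character class instead of the dot.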

Upvotes: 1

Aaron

Reputation: 24802

Your regex doesn't work for three reasons:

  1. [ is a meta-character that opens a character class. To match a literal [, you need to escape it with a backslash. ] also is a meta-character when it follows the [ meta-character, but if you escape the [ you shouldn't need to escape the ] (not that it hurts to do so).

  2. \d only matches decimal digits, but your sample contains the letter A. If that is a hexadecimal digit, you will probably want to use [\dA-F] instead of \d, or [\dA-Fa-f] if the digits can also appear in lowercase. If it can be any letter, you could use [\dA-Z], or [\dA-Za-z] if you also need to match lowercase letters.

  3. + means "one or more occurrences", so it wouldn't match an empty []. Use the * quantifier ("zero or more occurrences") instead.

Additionally, you probably need to capture the sequence of digits in a (capturing group) in order to be able to reference it in your replacement pattern.
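The points above can be tried out step by step. This is a rough sketch in Python's re module (an assumption; the thread targets vb.net). The final character class also allows commas, spaces and hyphens, which is my own addition so that the multi-value and 12312-AA samples match.

```python
import re

text = "[5123512], [412351, 1235123, 5125123], [12312-AA] and []"

# The pattern from the question: one or more digits and nothing else.
only_digits = re.findall(r"\[\d+\]", text)

# Swapping + for * lets the empty [] match too.
digits_or_empty = re.findall(r"\[\d*\]", text)

# Assumption: widen the class to letters, commas, spaces and a hyphen
# (the hyphen goes last so it is taken literally), and capture the content
# so the replacement can refer to it as \1.
widened = re.sub(r"\[([\dA-Za-z, -]*)\]", r'"\1"', text)

print(only_digits)
print(digits_or_empty)
print(widened)
```

An explicit character class also stops at the closing bracket on its own, because ] is not in the class, so no lazy quantifier is needed here.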

However, as Andrew Morton suggests, it looks like you should be able to use a plain text search/replace.

Upvotes: 1
