Briskstar Technologies

Reputation: 2253

Extract json data from string using regex

I have a data string like the one below:

....
data=[{"CaseNo":1863,"CaseNumber":"RD14051315","imageFormat":"jpeg","ShiftID":241,"City":"Riyadh","ImageTypeID":2,"userId":20}]
--5Qf7xJyP8snivHqYCPKMDJS-ZG0qde4OqIyIG
Content-Disposition: form-data
.....

I want to extract the JSON data from the string above. How can I use a regex to find that part of the string? I tried locating it with indexOf("data=[") and indexOf("}]"), but that does not work reliably and does not seem like the proper way to do it.

Upvotes: 2

Views: 12743

Answers (3)

Shazi

Reputation: 1569

I just want to add a somewhat more complex regex that can handle multiple JSON objects (with nested objects) in a text. I found many regexes that use recursion, but recursion is not an option in .NET, and I found no answer anywhere that provided a solution, so I thought I'd share one here.

(?<json>{(?:[^{}]|(?<Nested>{)|(?<-Nested>}))*(?(Nested)(?!))})

Regex101 example

Instead of the recursion that many other regex engines offer, we can use a balancing group. We basically count the opening braces against the closing braces within the JSON object, and if any opening braces remain without a matching closing brace, we reject the match (via the conditional (?(Nested)(?!)), whose empty negative lookahead always fails).
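As a rough sketch of how the pattern might be used, the snippet below runs it over a text and collects every match. The names JsonFinder and FindObjects are made up for this illustration and are not part of any library. Note that the pattern only counts braces, so a brace inside a JSON string value would throw the count off.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class JsonFinder
{
    // The balancing-group pattern from this answer, unchanged.
    private static readonly Regex JsonObject = new Regex(
        @"(?<json>{(?:[^{}]|(?<Nested>{)|(?<-Nested>}))*(?(Nested)(?!))})");

    // Returns every top-level {...} block found in the text, in order.
    public static List<string> FindObjects(string text)
    {
        var results = new List<string>();
        foreach (Match m in JsonObject.Matches(text))
            results.Add(m.Groups["json"].Value);
        return results;
    }
}
```

For a text such as a {"x":{"y":1}} b {"z":2} c, this should yield two strings: the nested object as a whole, then the second object.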

Upvotes: 2

Tyrrrz

Reputation: 2611

A somewhat more resilient way, in case of nested data, would be to use a regex to find the beginning of the JSON and then match opening and closing braces until you find the end of it.

Something like this:

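string ExtractJson(string source)
{
    var buffer = new StringBuilder();
    var depth = 0;
    var inString = false;

    // We trust that the source starts with valid JSON ('{' or '['); we just need to find its end.
    // To do it, we match braces and brackets outside string literals until we even out.
    for (var i = 0; i < source.Length; i++)
    {
        var ch = source[i];
        buffer.Append(ch);

        if (inString)
        {
            // Copy the character after a backslash as-is, so \" does not end the string
            if (ch == '\\' && i + 1 < source.Length)
                buffer.Append(source[++i]);
            else if (ch == '"')
                inString = false;
            continue;
        }

        // Match braces and brackets
        if (ch == '"')
            inString = true;
        else if (ch == '{' || ch == '[')
            depth++;
        else if (ch == '}' || ch == ']')
            depth--;

        // Break when evened out
        if (depth == 0)
            break;
    }

    return buffer.ToString();
}


// ...

var input = "...";

// Find the first '[' or '{' right after "data="; the JSON starts there
var start = Regex.Match(input, @"(?<=data=)[\[{]");

var json = ExtractJson(input.Substring(start.Index));

var jsonParsed = JToken.Parse(json);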

This handles situations where there might be multiple json blobs in the input, or some other content that also contains braces.

Upvotes: 0

KaraokeStu

Reputation: 768

I'm not entirely certain there isn't a better way to do this; however, the following regex should get you the data you need:

// Define the Regular Expression, including the "data="
// but put the latter part (the part we want) in its own group
Regex regex = new Regex(
    @"data=(\[{.*}\])",
    RegexOptions.Multiline
);

// Run the regular expression on the input string
Match match = regex.Match(input);

// Now, if we've got a match, grab the first group from it
if (match.Success)
{
    // Now get our JSON string
    string jsonString = match.Groups[1].Value;

    // Now do whatever you need to do (e.g. de-serialise the JSON)
    ...
}
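For illustration, here is a self-contained sketch of the same pattern with the captured text handed to a JSON parser. System.Text.Json is my choice for the example, since the answer does not name a library, and DataExtractor and ReadCaseNumber are hypothetical names. Because . does not match a newline by default, the greedy .* stops at the end of the data= line.

```csharp
using System;
using System.Text.Json;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class DataExtractor
{
    // Returns the CaseNumber of the first element in the "data=" array,
    // or null when the input contains no "data=[{...}]" part.
    public static string ReadCaseNumber(string input)
    {
        Match match = Regex.Match(input, @"data=(\[{.*}\])");
        if (!match.Success)
            return null;

        // Group 1 holds the JSON array without the "data=" prefix.
        using (JsonDocument doc = JsonDocument.Parse(match.Groups[1].Value))
        {
            return doc.RootElement[0].GetProperty("CaseNumber").GetString();
        }
    }
}
```

Called with the sample string from the question, ReadCaseNumber should return RD14051315.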

Upvotes: 2
