Extract content of code which starts with a curly bracket and ends with a curly bracket followed by a closing parenthesis

Question

I'm badly out of practice with regular expressions right now. I'm writing a Node script that goes through a bunch of JS files; each file calls a function, and one of its arguments is a JSON object. The aim is to collect all those JSON arguments and place them in one file. The problem I'm facing at the moment is extracting the argument from the code. Here is the function-call part of that string:

$translateProvider.translations('de', {
        WASTE_MANAGEMENT: 'Abfallmanagement',
        WASTE_TYPE_LIST: 'Abfallarten',
        WASTE_ENTRY_LIST: 'Abfalleinträge',
        WASTE_TYPE: 'Abfallart',
        TREATMENT_TYPE: 'Behandlungsart',
        TREATMENT_TYPE_STATUS: 'Status Behandlungsart',
        DUPLICATED_TREATMENT_TYPE: 'Doppelte Behandlungsart',
        TREATMENT_TYPE_LIST: 'Behandlungsarten',
        TREATMENT_TARGET_LIST: 'Ziele Behandlungsarten',
        TREATMENT_TARGET_ADD: 'Ziel Behandlungsart hinzufügen',
        SITE_TARGET: 'Gebäudeziel',
        WASTE_TREATMENT_TYPES: 'Abfallbehandlungsarten',
        WASTE_TREATMENT_TARGETS: '{{Abfallbehandlungsziele}}',
        WASTE_TREATMENT_TYPES_LIST: '{{Abfallbehandlungsarten}}',
        WASTE_TYPE_ADD: 'Abfallart hinzufügen',
        UNIT_ADD: 'Einheit hinzufügen'
})

So I'm trying to write a regular expression that matches the segment of the JS code which starts with "'de', {" and ends with "})", with any characters allowed in between (single and double curly brackets included). I tried something like this \'de'\s*,\s*{([^}]*)})\ , but that doesn't work. The furthest I got was with this \'de'\s*,\s*{([^})]*)}\ , but it ends at the first closing curly bracket within the JSON, which is not what I want. It seems I've completely forgotten even the regular-expression concepts I understood before. Any help is much appreciated.

damonholden · Accepted Answer

This can be done with lookahead, lookbehind, and boundary-type assertions:

/(?<=^\$translateProvider\.translations\('de', {)[\s\S]*(?=}\)$)/

(?<=^\$translateProvider\.translations\('de', {) is a lookbehind assertion that checks for '$translateProvider.translations('de', {' at the beginning of the string.
(?=}\)$) is a lookahead assertion that checks for '})' at the end of the string.
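[\s\S]* matches any sequence of characters between the two assertions, newlines included: the character class [\s\S] covers both whitespace and non-whitespace, and * repeats it greedily, so inner curly brackets such as '{{...}}' are part of the match.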
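
As a rough sketch of how the pattern above might be used from a Node script: the helper name extractTranslationBody and the shortened sample string are made up for illustration, and the code assumes a JavaScript engine with lookbehind support, which current Node versions have.

```javascript
// Hypothetical helper (not part of any library): returns the text between
// "$translateProvider.translations('de', {" and the final "})",
// or null when the string is not exactly one such call.
function extractTranslationBody(source) {
  const pattern = /(?<=^\$translateProvider\.translations\('de', {)[\s\S]*(?=}\)$)/;
  const match = source.match(pattern);
  return match ? match[0] : null;
}

// Shortened sample based on the call shown in the question.
const sample = `$translateProvider.translations('de', {
        WASTE_MANAGEMENT: 'Abfallmanagement',
        WASTE_TREATMENT_TARGETS: '{{Abfallbehandlungsziele}}',
        UNIT_ADD: 'Einheit hinzufügen'
})`;

// The match keeps the inner '{{...}}' because [\s\S]* is greedy and only
// the '})' at the very end of the string satisfies the lookahead.
const body = extractTranslationBody(sample);
console.log(body);
```

Because of the ^ and $ anchors, the string has to begin with the call and end with '})'. A trailing semicolon or any surrounding code makes the helper return null, so the call would need to be isolated first (or the anchors relaxed) when scanning whole files.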

Here is the regex101 link for you to test

Hope this helps.
