efox29

Reputation: 141

RegEx syntax for multiline

This is a snippet of a file I am trying to parse.

typedef union {
    struct {
        unsigned RC1REG                 :8;
    };
} RC1REGbits_t;
extern volatile RC1REGbits_t RC1REGbits __at(0x119);
// bitfield macros
#define _RC1REG_RC1REG_POSN                                 0x0
#define _RC1REG_RC1REG_POSITION                             0x0
#define _RC1REG_RC1REG_SIZE                                 0x8
#define _RC1REG_RC1REG_LENGTH                               0x8
#define _RC1REG_RC1REG_MASK                                 0xFF
// alias bitfield definitions
typedef union {
    struct {
        unsigned RC1REG                 :8;
    };
} RCREGbits_t;

extern volatile RCREGbits_t RCREGbits __at(0x119);
// bitfield macros
#define _RCREG_RC1REG_POSN                                  0x0
#define _RCREG_RC1REG_POSITION                              0x0
#define _RCREG_RC1REG_SIZE                                  0x8
#define _RCREG_RC1REG_LENGTH                                0x8
#define _RCREG_RC1REG_MASK                                  0xFF
typedef union {
    struct {
        unsigned RC1REG                 :8;
    };
} RCREG1bits_t;

I am trying to extract the typedef declarations in this file but only the typedef.

I thought I had it with typedef\s+union\s+\{(\n.+)+bits_t; but unfortunately it's not quite right. In the sample provided above, all of the text is highlighted, which is not what I want. However, if there is an empty line before the extern, it works. But I can't guarantee that there will always be an empty line there.

The desired output would be similar to

typedef union {
    struct {
        unsigned RC1REG                 :8;
    };
} RC1REGbits_t;

typedef union {
    struct {
        unsigned RC1REG                 :8;
    };
} RCREGbits_t;

typedef union {
    struct {
        unsigned RC1REG                 :8;
    };
} RCREG1bits_t;

How can I bound my search to typedef union .... bits_t; ?

Bonus points if it's in Python syntax (using the re module), since that's the language I am writing this in.

Upvotes: 0

Views: 38

Answers (1)

CertainPerformance

Reputation: 370659

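Your (\n.+)+ group is greedy: it consumes as many consecutive non-empty lines as it can and then backtracks only as far as the last bits_t; in that run, so several typedefs get merged into one match. Instead, you can lazily repeat any character until the first bits_t; is reached: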

typedef\s+union\s+{[\s\S]*?bits_t;

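For Python syntax, something like the following, where text is a string holding the file's contents (the name input is avoided because it shadows a built-in):

re.findall(r'typedef\s+union\s+{[\s\S]*?bits_t;', text)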
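As a rough, self-contained sketch of how this could be used, the snippet below runs the lazy pattern over a shortened version of the sample from the question. The names SAMPLE, TYPEDEF_RE and extract_typedefs are made up for illustration, and the opening brace is escaped as \{ (as in the question's own pattern) purely for readability.

```python
import re

# Shortened stand-in for the header file shown in the question.
SAMPLE = """typedef union {
    struct {
        unsigned RC1REG                 :8;
    };
} RC1REGbits_t;
extern volatile RC1REGbits_t RC1REGbits __at(0x119);
// bitfield macros
#define _RC1REG_RC1REG_MASK                                 0xFF
typedef union {
    struct {
        unsigned RC1REG                 :8;
    };
} RCREGbits_t;

extern volatile RCREGbits_t RCREGbits __at(0x119);
typedef union {
    struct {
        unsigned RC1REG                 :8;
    };
} RCREG1bits_t;
"""

# [\s\S] matches any character, including newlines; the lazy *? makes
# each match stop at the first "bits_t;" that follows the opening brace.
TYPEDEF_RE = re.compile(r'typedef\s+union\s+\{[\s\S]*?bits_t;')


def extract_typedefs(text):
    """Return every 'typedef union { ... bits_t;' block found in text."""
    return TYPEDEF_RE.findall(text)


typedefs = extract_typedefs(SAMPLE)
print("\n\n".join(typedefs))
```

Because the match ends at the first bits_t; it finds, this approach assumes every typedef union in the file really does close with a name ending in bits_t; otherwise a match could run on into the next declaration.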

Upvotes: 1
