Nick Adams

Reputation: 199

Extracting dictionary variable names in a Python file using regex

I need advice on how to attack this problem. I am stumped and don't really know where to start. I don't want the code; I just need advice.

The question is as follows:

Use regular expressions to extract all of the variable names which get assigned to dictionary or set literals from a Python program located in the file code.py. Python variables match the regular expression \w+ in Python (see the documentation about \w). Dictionary and set literals start with an open squiggly brace ({).

For example, given this Python program:

code.py

# Here is a Python file.
names = {}
foo = []
names[0] = 12
christmas ={
  'tree': 'green',
  'candy cane': 'red and white'
}

def yes():
  return "yes"

your program should produce this output:

names
christmas

The variable names should be printed out in the order that they appear in the file. The Python code will be syntactically valid, but not necessarily executable without error.

So I have code.py, which contains the code above, and a new file called program.py, which has to read the contents of code.py and output the names of all variables assigned to dictionaries. If code.py changes and many more dictionaries are added, program.py should still output all of them in order.

Explanations of what each part does would help greatly!

Upvotes: 0

Views: 425

Answers (1)

Chaker

Reputation: 1207

You can use this regex:

(\w*)\s*=\s*{(?:.|\n)*?}

How

(\w*)      # match the variable name (letters, digits and _) and capture it
\s*=\s*    # match '=' with any amount of whitespace around it
{          # match the opening '{'
(?:.|\n)*? # lazily match the body of the literal, which may span several lines
}          # match the first closing '}'

You can check it here
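As a rough illustration of how a pattern like this could be applied from Python, here is a minimal sketch. It uses a simplified variant of the regex rather than the exact one above: since only the name is needed, it stops at the opening brace, uses \w+ so an empty name is never captured, and anchors at the start of a line. The function name brace_assigned_names and the sample string are made up for the example.

```python
import re

# Simplified variant (an assumption, not the exact regex from the answer):
# a name at the start of a line, then '=', then an opening brace.
# re.MULTILINE makes '^' match at the start of every line.
ASSIGN_TO_BRACE = re.compile(r"^[ \t]*(\w+)[ \t]*=[ \t]*\{", re.MULTILINE)


def brace_assigned_names(source):
    """Return the names assigned to a literal starting with '{', in file order."""
    # With a single capture group, findall returns just the captured names.
    return ASSIGN_TO_BRACE.findall(source)


# Hypothetical stand-in for the contents of code.py.
sample = '''# Here is a Python file.
names = {}
foo = []
names[0] = 12
christmas ={
  'tree': 'green',
  'candy cane': 'red and white'
}

def yes():
  return "yes"
'''

for name in brace_assigned_names(sample):
    print(name)
# prints:
# names
# christmas
```

To run it against the real file, the sample string could be replaced with something like open("code.py").read(). Being regex-based, this is only an approximation: it would also pick up a matching line that sits inside a multi-line string, or a keyword argument written on its own line.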

Upvotes: 1
