Reputation: 4850
I'm trying to figure out what's causing this error. It is reported at line 60 and only arises on the first line of the input:
UnboundLocalError: local variable 'name' referenced before assignment
Code:
import re
import json
import jsonpickle
from nameparser import HumanName
from pprint import pprint
import csv
from string import punctuation, whitespace
def parse_ieca_gc(s):
    ########################## HANDLE NAME ELEMENT ###############################
    degrees = ['M.A.T.','Ph.D.','MA','J.D.','Ed.M.', 'M.A.', 'M.B.A.', 'Ed.S.', 'M.Div.', 'M.Ed.', 'RN', 'B.S.Ed.', 'M.D.']
    degrees_list = []
    # check whether the name string has an area / has a comma
    if ',' in s['name']:
        # separate area of practice from name and degree and bind this to var 'area'
        split_area_nmdeg = s['name'].split(',')
        area = split_area_nmdeg.pop()
        print 'split area nmdeg'
        print area
        print split_area_nmdeg
        # Split the name and deg by spaces. If there's a deg, it will match one of the elements and be stored in the deg list. The deg is removed from the name_deg list and all that's left is the name.
        split_name_deg = re.split('\s',split_area_nmdeg[0])
        for word in split_name_deg:
            for deg in degrees:
                if deg == word:
                    degrees_list.append(split_name_deg.pop())
                    name = ' '.join(split_name_deg)
    # if the name string does not contain a comma, just parse as normal string
    else:
        area = []
        split_name_deg = re.split('\s',s['name'])
        for word in split_name_deg:
            for deg in degrees:
                if deg == word:
                    degrees_list.append(split_name_deg.pop())
                    name = ' '.join(split_name_deg)
    # area of practice
    category = area
    # name
    name = HumanName(name)
    first_name = name.first
    middle_name = name.middle
    last_name = name.last
    title = name.title
    full_name = dict(first_name=first_name, middle_name=middle_name, last_name=last_name, title=title)
    # degrees
    degrees = degrees_list
    # website
    website = s.get('website','')
    gc_ieca = dict(
        name = name,
        website = website,
        degrees = degrees,
    ),

myjson = [] # myjson = list of dictionaries where each dictionary
with(open("ieca_first_col_fake_text.txt", "rU")) as f:
    sheet = csv.DictReader(f,delimiter="\t")
    for row in sheet:
        myjson.append(row)
for i in range(4):
    s = myjson[i]
    a = parse_ieca_gc(s)
    pprint(a)
example data (made up data):
name phone email website
Diane Grant Albrecht M.S.
"Lannister G. Cersei M.A.T., CEP" 111-222-3333 [email protected] www.got.com
Argle D. Bargle Ed.M.
Sam D. Man Ed.M. 000-000-1111 [email protected] www.daManWithThePlan.com
D G Bamf M.S.
Amy Tramy Lamy Ph.D.
Last login: Tue Jul 2 15:33:31 on ttys000
/var/folders/jv/9_sy0bn10mbdft1bk9t14qz40000gn/T/Cleanup\ At\ Startup/ieca_first_col-394486416.142.py.command ; exit;
Samuel-Finegolds-MacBook-Pro:~ samuelfinegold$ /var/folders/jv/9_sy0bn10mbdft1bk9t14qz40000gn/T/Cleanup\ At\ Startup/ieca_first_col-394486416.142.py.command ; exit;
range 4
split area nmdeg
CEP
['Lannister G. Cersei M.A.T.']
({'additionaltext': '',
'bio': '',
'category': ' CEP',
'certifications': [],
'company': '',
'counselingoptions': [],
'counselingtype': [],
'datasource': {'additionaltext': '',
'linktext': '',
'linkurl': '',
'logourl': ''},
'degrees': ['M.A.T.'],
'description': '',
'email': {'emailtype': [], 'value': '[email protected]'},
'facebook': '',
'languages': 'english',
'linkedin': '',
'linktext': '',
'linkurl': '',
'location': {'address': '',
'city': '',
'country': 'united states',
'geo': {'lat': '', 'lng': ''},
'loc_name': '',
'locationtype': '',
'state': '',
'zip': ''},
'logourl': '',
'name': {'first_name': u'Lannister',
'last_name': u'Cersei',
'middle_name': u'G.',
'title': u''},
'phone': {'phonetype': [], 'value': '1112223333'},
'photo': '',
'price': {'costrange': [], 'costtype': []},
'twitter': '',
'website': ''},)
({'additionaltext': '',
'bio': '',
'category': [],
'certifications': [],
'company': '',
'counselingoptions': [],
'counselingtype': [],
'datasource': {'additionaltext': '',
'linktext': '',
'linkurl': '',
'logourl': ''},
'degrees': ['Ed.M.'],
'description': '',
'email': {'emailtype': [], 'value': ''},
'facebook': '',
'languages': 'english',
'linkedin': '',
'linktext': '',
'linkurl': '',
'location': {'address': '',
'city': '',
'country': 'united states',
'geo': {'lat': '', 'lng': ''},
'loc_name': '',
'locationtype': '',
'state': '',
'zip': ''},
'logourl': '',
'name': {'first_name': u'Argle',
'last_name': u'Bargle',
'middle_name': u'D.',
'title': u''},
'phone': {'phonetype': [], 'value': ''},
'photo': '',
'price': {'costrange': [], 'costtype': []},
'twitter': '',
'website': ''},)
({'additionaltext': '',
'bio': '',
'category': [],
'certifications': [],
'company': '',
'counselingoptions': [],
'counselingtype': [],
'datasource': {'additionaltext': '',
'linktext': '',
'linkurl': '',
'logourl': ''},
'degrees': ['Ed.M.'],
'description': '',
'email': {'emailtype': [], 'value': '[email protected]'},
'facebook': '',
'languages': 'english',
'linkedin': '',
'linktext': '',
'linkurl': '',
'location': {'address': '',
'city': '',
'country': 'united states',
'geo': {'lat': '', 'lng': ''},
'loc_name': '',
'locationtype': '',
'state': '',
'zip': ''},
'logourl': '',
'name': {'first_name': u'Sam',
'last_name': u'Man',
'middle_name': u'D.',
'title': u''},
'phone': {'phonetype': [], 'value': '0000001111'},
'photo': '',
'price': {'costrange': [], 'costtype': []},
'twitter': '',
'website': ''},)
({'additionaltext': '',
'bio': '',
'category': [],
'certifications': [],
'company': '',
'counselingoptions': [],
'counselingtype': [],
'datasource': {'additionaltext': '',
'linktext': '',
'linkurl': '',
'logourl': ''},
'degrees': ['M.S.'],
'description': '',
'email': {'emailtype': [], 'value': ''},
'facebook': '',
'languages': 'english',
'linkedin': '',
'linktext': '',
'linkurl': '',
'location': {'address': '',
'city': '',
'country': 'united states',
'geo': {'lat': '', 'lng': ''},
'loc_name': '',
'locationtype': '',
'state': '',
'zip': ''},
'logourl': '',
'name': {'first_name': u'D',
'last_name': u'Bamf',
'middle_name': u'G',
'title': u''},
'phone': {'phonetype': [], 'value': ''},
'photo': '',
'price': {'costrange': [], 'costtype': []},
'twitter': '',
'website': ''},)
logout
[Process completed]
Upvotes: 0
Views: 580
Reputation: 14211
I thought about leaving a comment, but I guess this actually qualifies as an answer. You seem to like programming (or at least to be serious about it), so please take my answer positively: not as another piece of criticism, but as advice on how to avoid similar errors and problems in the future.
These are just a few points that I came up with after reading your code:
The code is messy, which makes it difficult to find and follow the main line of your thinking (the program logic). Since you are not just prototyping or experimenting but writing a functioning program, you should really add an entry point. In Python, one first defines the module with all its elements (mainly imports, constants and functions), and only then sets the entry point with the section if __name__ == '__main__': at the bottom of the module.
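A minimal sketch of that layout, written in Python 3 syntax. SAMPLE, load_rows and main are illustrative names that do not appear in the question's code, and the in-memory sample stands in for the real tab-separated file:

```python
import csv
import io

# Stand-in for the tab-separated input file (hypothetical sample data).
SAMPLE = "name\twebsite\nSam D. Man Ed.M.\twww.daManWithThePlan.com\n"


def load_rows(f):
    """Read a tab-separated file object into a list of row dictionaries."""
    return list(csv.DictReader(f, delimiter="\t"))


def main():
    # All top-level work lives here, not scattered through the module.
    for row in load_rows(io.StringIO(SAMPLE)):
        print(row["name"])


if __name__ == '__main__':
    main()
```

With this shape the module can be imported (for testing, say) without running anything, and reading the program starts at main().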
The program is not that big, but because you are trying to do too much in just a few lines (very quick-n-dirty), it becomes dangerous. Your code grew organically and very fast, which exposes it to errors like this one. Please take your time and learn how to break your code into functions, which are the basic building blocks of each module. Try to define many small, self-contained functions in your module and call them from the main part of the program. If you manage to give them proper names, your code will be very readable, especially starting from the __main__ part.
Treat each function as a small program (divide and conquer). Keep each function small in number of lines (<= 20) and compact in number of arguments (<= 5-7). It has many advantages. For one, each function can be tested on its own: from the __main__ part, with doctests, or with unittests. Like this, you will always have full control of your program, even before (or without) applying sophisticated debugging techniques.
Progressing slowly allows you to keep a constant overview of your idea while writing the program. Even if the code ends up uglier than you would wish, any incremental change to it should be traceable (you know, or can observe, how much code is added with each step). You can also start using version control, even locally (just for yourself); that will allow you to progress slowly by keeping your commits atomic and self-contained.
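As a hedged illustration of a small function checked by doctests: split_area is a hypothetical helper, not part of the question's code, and it assumes that an empty string (rather than the original's empty list) stands for "no area of practice".

```python
import doctest


def split_area(raw_name):
    """Separate the area of practice (text after the last comma) from the name.

    >>> split_area("Lannister G. Cersei M.A.T., CEP")
    ('Lannister G. Cersei M.A.T.', 'CEP')
    >>> split_area("Argle D. Bargle Ed.M.")
    ('Argle D. Bargle Ed.M.', '')
    """
    if ',' not in raw_name:
        # Assumption: no comma means no area of practice was given.
        return raw_name.strip(), ''
    name_part, area = raw_name.rsplit(',', 1)
    return name_part.strip(), area.strip()


if __name__ == '__main__':
    # Runs the examples in the docstring and reports any mismatch.
    doctest.testmod()
```

The examples in the docstring double as documentation and as a regression check that costs one line to run.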
If you still feel like you have gone too far or written too much code without running it, you end up in a situation similar to yours now. Another trick is to put an exit() call in the middle of, or just before, the newly written code that breaks (check the line number in the exception info). In most cases, printing out variables and checking whether their values are as expected helps to find the problem. Otherwise, just comment out a section of your program in order to take a few steps back (cutting it down until it is so small that whatever is still "on" works).
Avoid too many nested loops and conditional constructions. Try to use no more than 2-3 nested blocks per function. This is an important matter. Use tools like pylint and PEP8 to check the quality of your code. You will be surprised how many complaints those tools can raise about code that looks decent. E.g. there is a lot of motivation for having an 80-character limit per line of code: it really does prevent writing too much hanging and nested code. Ideally code is always compact: each function is not too wide and not too tall.
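One possible way to flatten the for/for/if nesting from the question is a membership test against a set. This is only a sketch: DEGREES copies the list from the question, while split_degrees is a made-up name.

```python
# Same degree strings as in the question, stored as a set for fast lookup.
DEGREES = {'M.A.T.', 'Ph.D.', 'MA', 'J.D.', 'Ed.M.', 'M.A.', 'M.B.A.',
           'Ed.S.', 'M.Div.', 'M.Ed.', 'RN', 'B.S.Ed.', 'M.D.'}


def split_degrees(words):
    """Return (name_words, degree_words) for a list of whitespace-split tokens."""
    name_words = [w for w in words if w not in DEGREES]
    degree_words = [w for w in words if w in DEGREES]
    return name_words, degree_words
```

Unlike the original loop, this does not modify the list while iterating over it, and both results are defined even when no token is a known degree.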
Finally, try to avoid reusing one variable name for two different things, as in:
name = HumanName(name)
If you write a line of code that takes you too long to think afterwards, consider correcting it. If you write a function that you later don't understand, consider throwing it away. If you do everything right and you get an error that you don't understand, consider going to sleep.
PS
Don't forget to smoke the famous
>>> import this
Hope some of the points are useful.
GL!
Upvotes: 1
Reputation: 1122502
You are using the local variable 'name' here:
name = HumanName(name)
You do set 'name' before that point, but only if certain conditions match. When those conditions do not match, 'name' is never assigned to and the exception is thrown.
For example, in the first if branch, the loop is:
for word in split_name_deg:
    for deg in degrees:
        if deg == word:
            degrees_list.append(split_name_deg.pop())
            name = ' '.join(split_name_deg)
If deg == word never matches, then 'name' is never set either.
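A stripped-down sketch of the same situation, with one possible fix. The functions broken and fixed are illustrative only and simplify the question's loop; the point is that assigning a default before the loop guarantees the variable exists.

```python
def broken(words, degrees):
    for word in words:
        if word in degrees:
            # Only assignment to 'name': skipped when nothing matches.
            name = ' '.join(words[:-1])
    return name  # raises UnboundLocalError if the 'if' body never ran


def fixed(words, degrees):
    # Default assigned up front, so 'name' always exists.
    name = ' '.join(words)
    for word in words:
        if word in degrees:
            name = ' '.join(words[:-1])
    return name
```

Calling broken with the first data row ("Diane Grant Albrecht M.S.") and the question's degree list fails in the same way as the original, because 'M.S.' is not in that list.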
Your function also doesn't return anything, so the line a = parse_ieca_gc(s) will only ever assign None to a. You need to use the return keyword to set a return value for your function.
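A small sketch of what returning the result could look like. build_record is a hypothetical helper that mirrors the dict built at the end of the question's function.

```python
def build_record(name, website, degrees):
    record = dict(
        name=name,
        website=website,
        degrees=degrees,
    )  # note: writing 'dict(...),' with a trailing comma would make a 1-tuple
    return record
```

The trailing comma after the closing parenthesis in the question's code is also worth removing: it wraps the dictionary in a one-element tuple, which matches the ({...},) shape of the printed output.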
Last but not least, you only pass the first row from your CSV file to the function, and that first row has no website associated with it:
Diane Grant Albrecht M.S.
Upvotes: 4