goldisfine

Reputation: 4850

Not appending to list

I'm trying to create a list of dictionaries where each dictionary key is a job and each value is a list of abilities associated with that job.

Ex:

[{'clerk': ['math ability','writing ability',...etc]},{'salesman':['charisma','writing ability','etc']}]

This is the data that I'm working with:

O*NET-SOC Code  Element ID  Element Name    Scale ID    Data Value  N   Standard Error  Lower CI Bound  Upper CI Bound  Recommend Suppress  Not Relevant    Date    Domain Source
11-1011.00  1.A.1.a.1   Oral Comprehension  IM  4.5 8   0.19    4.13    4.87    N   n/a Jun-06  Analyst
11-1011.00  1.A.1.a.1   Oral Comprehension  LV  4.75    8   0.25    4.26    5.24    N   N   Jun-06  Analyst
11-1011.00  1.A.1.a.2   Written Comprehension   IM  4.38    8   0.18    4.02    4.73    N   n/a Jun-06  Analyst

And this is what I've done so far:

First I create a list of dictionaries, each representing a row in the data above, with keys equal to the column names and values equal to the column values. Sample:

OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.19'), ('Element ID', '1.A.1.a.1'), ('N', '8'), ('Scale ID', 'IM'), ('Not Relevant', 'n/a'), ('Element Name', 'Oral Comprehension'), ('Lower CI Bound', '4.13'), ('Date', '06/2006'), ('Data Value', '4.50'), ('Upper CI Bound', '4.87'), ('O*NET-SOC Code', '11-1011.00')]), OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.25'), ('Element ID', '1.A.1.a.1'), ('N', '8'), ('Scale ID', 'LV'), ('Not Relevant', 'N'), ('Element Name', 'Oral Comprehension'), ('Lower CI Bound', '4.26'), ('Date', '06/2006'), ('Data Value', '4.75'), ('Upper CI Bound', '5.24'), ('O*NET-SOC Code', '11-1011.00')]), OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.18'), ('Element ID', '1.A.1.a.2'), ('N', '8'), ('Scale ID', 'IM'), ('Not Relevant', 'n/a'), ('Element Name', 'Written Comprehension'), ('Lower CI Bound', '4.02'), ('Date', '06/2006'), ('Data Value', '4.38'), ('Upper CI Bound', '4.73'), ('O*NET-SOC Code', '11-1011.00')]), OrderedDict([('Domain Source', 'Analyst'), ('Recommend Suppress', 'N'), ('Standard Error', '0.32'), ('Element ID', '1.A.1.a.2'), ('N', '8'), ('Scale ID', 'LV'),

Then I try to merge these into fewer dictionaries, where each key is a job code and each value is a list of the abilities associated with that job.

def add_abilites(abilites_m_l):
    jobs_list = []
    for ind, dict in enumerate(abilites_m_l):
        activities_list = []
        if abilities_m_l[ind-1]['O*NET-SOC Code'] == abilities_m_l[ind]['O*NET-SOC Code']: 
            if abilities_m_l[ind]['Element Name'] != abilities_m_l[ind-1]['Element Name']:
                activities_list.append(abilities_m_l[ind]['Element Name'])
            else: pass
        else: list.append({abilities_m_l[ind]['O*NET-SOC Code']:activities_list})        
    return jobs_list
a_l_with_abilities = add_abilites(abilities_m_l)
print a_l_with_abilities

I get the following output:

[{'11-1011.00': []}, {'11-1021.00': []}, {'11-2011.00': []}, {'11-2021.00': []}, {'11-2022.00': []}, {'11-2031.00': []}, {'11-3011.00': []}, {'11-3021.00': []}, {'11-3031.01': []}, {'11-3031.02': []}, {'11-3051.00': []}, {'11-3051.01': []}, {'11-3051.02': []}, {'11-3051.04': []}, {'11-3061.00': []}, {'11-3071.01': []}, {'11-3071.02': []}, {'11-3071.03': []}, {'11-3111.00': []}, {'11-3121.00': []}, {'11-3131.00': []}, {'11-9013.01': []}, {'11-9013.03': []}, {'11-9021.00': []}, {'11-9031.00': []}, {'11-9032.00': []}, {'11-9033.00': []}, {'11-9041.00': []}, {'11-.....

In other words, my lists aren't being filled.

Upvotes: 0

Views: 85

Answers (1)

Peter DeGlopper

Reputation: 37319

The core problem is that you're reassigning activities_list to an empty list on every iteration over abilities_m_l. So when you detect a changed 'O*NET-SOC Code' value, you append the empty list you just created.
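If you want to keep the shape of your loop, a minimal sketch of the fix might look like the following. It assumes Python 3 and that all rows for a given code are contiguous, as in your sample; the function name add_abilities_loop and the sample rows are mine, trimmed to the two columns that matter.

```python
def add_abilities_loop(rows):
    """Group abilities per job code, assuming rows for one code are contiguous."""
    jobs_list = []
    current_code = None
    activities = None
    for row in rows:
        code = row['O*NET-SOC Code']
        if code != current_code:
            # New job code: create the list once and register it right away,
            # instead of recreating it on every iteration.
            current_code = code
            activities = []
            jobs_list.append({code: activities})
        name = row['Element Name']
        if name not in activities:
            activities.append(name)
    return jobs_list


# Rows trimmed to two columns; the last row is made up for illustration.
sample_rows = [
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Oral Comprehension'},
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Oral Comprehension'},
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Written Comprehension'},
    {'O*NET-SOC Code': '11-1021.00', 'Element Name': 'Oral Comprehension'},
]

print(add_abilities_loop(sample_rows))
```

The list is appended to jobs_list as soon as a new code is seen and then filled in place, so there is no special handling needed for the last job in the data.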

Here's a cleaner way to do this:

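from collections import OrderedDict

def add_abilities(abilities_m_l):
    jobs_dict = OrderedDict()
    for data_dict in abilities_m_l:
        o_code = data_dict['O*NET-SOC Code']
        activity = data_dict['Element Name']
        # Fetch the activities seen so far for this code, creating them on first sight.
        activities_so_far = jobs_dict.setdefault(o_code, OrderedDict())
        # The OrderedDict keys act as an ordered set; the value is unused.
        activities_so_far[activity] = True
    return [{o_code: activities.keys()} for o_code, activities in jobs_dict.iteritems()]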

Or if you're on Python 3, where keys, values and items return view objects rather than lists and iteritems no longer exists, replace the last line with:

    return [{o_code: list(activities.keys())} for o_code, activities in jobs_dict.items()]

Or, if you don't need the order of the activities preserved, use a set for the activities. That's preferable, but Python unfortunately does not have a native OrderedSet so I approximated it above with an OrderedDict containing True for the activities found for a code.

from collections import OrderedDict

def add_abilities(abilities_m_l):
    jobs_dict = OrderedDict()
    for data_dict in abilities_m_l:
        o_code = data_dict['O*NET-SOC Code']
        activity = data_dict['Element Name']
        # Pass a new set instance, set(), not the set type itself.
        activities_so_far = jobs_dict.setdefault(o_code, set())
        activities_so_far.add(activity)
    # On Python 3, use jobs_dict.items() instead of iteritems().
    return [{o_code: list(activities)} for o_code, activities in jobs_dict.iteritems()]

The point is to let Python's dictionaries gather the information about the shared keys, and to maintain uniqueness of the activities for each code.
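For what it's worth, on Python 3.7 and later plain dicts preserve insertion order, so the same idea can be sketched without OrderedDict. This is only an illustration under that assumption; group_abilities and the sample rows are names and data I made up, trimmed to the two columns used.

```python
def group_abilities(rows):
    """Group unique 'Element Name' values by 'O*NET-SOC Code' in first-seen order."""
    jobs = {}  # plain dicts keep insertion order on Python 3.7+
    for row in rows:
        code = row['O*NET-SOC Code']
        # The inner dict acts as an ordered set: keys are abilities, values are unused.
        jobs.setdefault(code, {})[row['Element Name']] = True
    return [{code: list(abilities)} for code, abilities in jobs.items()]


# Rows trimmed to two columns; the last row is made up for illustration.
sample_rows = [
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Oral Comprehension'},
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Oral Comprehension'},
    {'O*NET-SOC Code': '11-1011.00', 'Element Name': 'Written Comprehension'},
    {'O*NET-SOC Code': '11-1021.00', 'Element Name': 'Oral Comprehension'},
]

print(group_abilities(sample_rows))
# [{'11-1011.00': ['Oral Comprehension', 'Written Comprehension']}, {'11-1021.00': ['Oral Comprehension']}]
```

Unlike a loop that compares each row with the previous one, this does not depend on the rows being sorted by code.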

Upvotes: 1
