Reputation: 131

pairing up three different lists

I have the following list of dicts:

 authorvals= [
        {
            "author": "author1",
            "year": [
                "2016"
            ],
            "value1": 4.0
        },
        {
            "author": "author2",
            "year": [
                "2016"
            ],
            "value1": 2.0
        },
        {
            "author": "author1",
            "year": [
                "2016"
            ],
            "value3": 1.0
        },
        {
            "author": "author1",
            "year": [
                "2016"
            ],
            "value2": 4.0
        },
        {
            "author": "author2",
            "year": [
                "2016"
            ],
            "value2": 2.0
        }]

Now I want to build the following lists from these dicts:

val_list=["value1","value2","value3"]
num_list=[[4,2],[4,2],[1,0]]
auth_list=["author1","author2"]

That is, I want to split the data into three separate lists:

The first list holds the keys of the form "value"+x that occur in the dicts.
The second list holds, for each of those keys, the values for author1 and author2.
The third list is just the list of authors.

I have tried the following code:

num_list=[]
auth_list=[]
val_list=[]
for item in authors_dict: 
        if item['author'] not in auth_list: 
            auth_list.append(item['author']) 
            for k in item.keys(): 
                if k.startswith("value") and k not in val_list: 
                    val_list.append(k) 
                    val_list.sort() 
                    for v in val_list:
                        temp_val_list = [] 
                        for i in authors_dict: 
                            try: 
                                val = i[v] 
                                temp_val_list.append(val) 
                            except: 
                                pass
                        if len(temp_val_list) > 0: 
                            num_list.append(temp_val_list) 
                            print(val_list) 
                            print(num_list) 
                            print(auth_list)

But this does not produce what I want. The 0 in the last sublist of num_list is there because author2 has no value3. Whenever an author has no value for a key, 0 should be used instead.

Upvotes: 0

Answers (3)

darkash

Reputation: 161

auth_list = {x['author'] for x in authorvals}  # wrap in list(...) if you need to access it by index
indexed = {}  # maps each value key to the list of numbers collected for it

for auth in authorvals:
    # drop 'author' and 'year'; the single remaining key is the value key
    filtered = next(iter(auth.keys() - {'author', 'year'}))
    # create the list the first time the key is seen, then append this record's number
    indexed.setdefault(filtered, []).append(auth[filtered])

val_list = list(indexed.keys())
num_list = [indexed[key] for key in val_list]

Note that num_list may differ from your example: its sublists are not padded with 0, so they can have different lengths. You can always post-process them afterwards.
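
One possible way to do that post-processing, as a minimal sketch: right-pad every sublist with zeros up to the length of the longest one. The helper name pad_to_longest and the sample ragged list are illustrative, and the sketch assumes the missing values belong at the end of each sublist.

```python
def pad_to_longest(lists, fill=0):
    """Return copies of the sublists, right-padded with `fill` to equal length."""
    longest = max((len(sub) for sub in lists), default=0)
    return [sub + [fill] * (longest - len(sub)) for sub in lists]

# Same shape as the num_list built by the loop above for the sample data.
ragged = [[4.0, 2.0], [1.0], [4.0, 2.0]]
padded = pad_to_longest(ragged)
print(padded)  # [[4.0, 2.0], [1.0, 0], [4.0, 2.0]]
```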

Upvotes: 0

Martin Kleiven

Reputation: 7303

Collect authors in a set
Collect keys and values in a defaultdict
Post-process the values by padding each list up to the maximum length.

from collections import defaultdict

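# Position of the value key in each record. This relies on every record
# listing its keys as author, year, valueN (dict insertion order, Python 3.7+).
DATA_INDEX = 2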

def collect(records):
    vals = defaultdict(list)
    authors = set()
    for record in records:
        for i, (k, v) in enumerate(record.items()):
            if k == 'author':
                authors.add(v)
            elif i == DATA_INDEX:
                vals[k].append(int(v))

    return (list(authors),
            list(vals.keys()),
            list(pad_by_max_len(vals.values())))



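def pad_by_max_len(lol):
    # lol is a collection of lists; pad each one to the length of the longest.
    # max(*lengths) would fail when there is only one list, so pass the iterable itself.
    padlength = max(map(len, lol), default=0)
    return [pad(l, padlength) for l in lol]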

def pad(l, padlength):
    return (l + [0] * padlength)[:padlength]

print(collect(authorvals))

Giving (the order of the authors may vary, because a set is unordered):

(
    ['author2', 'author1'],
    ['value1', 'value3', 'value2'],
    [[4, 2], [1, 0], [4, 2]]
)

Upvotes: 1

Saad Hussain

Reputation: 80

I wasn't entirely clear on two things, so I made these assumptions:

Ordering of values doesn't matter
Every value key should end up with as many entries as the key that occurs most often. Shorter lists in num_list are padded with zeros.

The following code should work to that end:

val_list=[]
num_list=[]
auth_list=[]
max_values = 0

for d in authorvals:
    if d["author"] not in auth_list:
        auth_list.append(d["author"])
    for key in d:
        if key.startswith("value"):
            if key not in val_list:
                val_list.append(key)
                num_list.append([d[key]])
                max_values = max(max_values, 1)
            else:
                idx = val_list.index(key)
                num_list[idx].append(d[key])
                max_values = max(max_values, len(num_list[idx]))

for sublist in num_list:
    if len(sublist) != max_values:
        padding = [0] * (max_values - len(sublist))
        sublist.extend(padding)

print(val_list)  # ['value1', 'value3', 'value2']
print(num_list)  # [[4.0, 2.0], [1.0, 0], [4.0, 2.0]]
print(auth_list) # ['author1', 'author2']
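
Padding at the end does not record which author is missing a value. If the zeros need to line up with the positions in auth_list, one option is to index the numbers by (key, author) first and then look each pair up with a default of 0. This is only a sketch: it assumes each author has at most one entry per value key, and the lookup name is illustrative.

```python
authorvals = [
    {"author": "author1", "year": ["2016"], "value1": 4.0},
    {"author": "author2", "year": ["2016"], "value1": 2.0},
    {"author": "author1", "year": ["2016"], "value3": 1.0},
    {"author": "author1", "year": ["2016"], "value2": 4.0},
    {"author": "author2", "year": ["2016"], "value2": 2.0},
]

auth_list = []  # authors in first-seen order
lookup = {}     # (value key, author) -> number

for record in authorvals:
    author = record["author"]
    if author not in auth_list:
        auth_list.append(author)
    for key, number in record.items():
        if key.startswith("value"):
            lookup[(key, author)] = number

# Sorted value keys, then one row per key with one column per author.
val_list = sorted({key for key, _ in lookup})
num_list = [[lookup.get((key, author), 0) for author in auth_list]
            for key in val_list]

print(val_list)   # ['value1', 'value2', 'value3']
print(num_list)   # [[4.0, 2.0], [4.0, 2.0], [1.0, 0]]
print(auth_list)  # ['author1', 'author2']
```

With this layout, num_list[i][j] is always the number for val_list[i] and auth_list[j].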

Upvotes: 0
