Sam

Reputation: 131

pairing up three different lists

I have the following list of dicts:

 authorvals= [
        {
            "author": "author1",
            "year": [
                "2016"
            ],
            "value1": 4.0
        },
        {
            "author": "author2",
            "year": [
                "2016"
            ],
            "value1": 2.0
        },
        {
            "author": "author1",
            "year": [
                "2016"
            ],
            "value3": 1.0
        },
        {
            "author": "author1",
            "year": [
                "2016"
            ],
            "value2": 4.0
        },
        {
            "author": "author2",
            "year": [
                "2016"
            ],
            "value2": 2.0
        }]

Now I want lists from the dict as follows:

val_list=["value1","value2","value3"]
num_list=[[4,2],[4,2],[1,0]]
auth_list=["author1","author2"]

I want to split the data into three separate lists:

  1. The first list holds the keys of the form "value"+x found in the dicts
  2. The second list holds, for each of those keys, the values for author1 and author2
  3. The third list is just the list of authors

I have tried the following code:

num_list=[]
auth_list=[]
val_list=[]
for item in authorvals:
        if item['author'] not in auth_list:
            auth_list.append(item['author'])
            for k in item.keys():
                if k.startswith("value") and k not in val_list:
                    val_list.append(k)
                    val_list.sort()
                    for v in val_list:
                        temp_val_list = []
                        for i in authorvals:
                            try:
                                val = i[v]
                                temp_val_list.append(val)
                            except KeyError:
                                pass
                        if len(temp_val_list) > 0:
                            num_list.append(temp_val_list)
                            print(val_list)
                            print(num_list)
                            print(auth_list)

But this does not produce what I want. The 0 in the last list of num_list is there because author2 has no value for value3. If there is no value, 0 should be used instead.

Upvotes: 0

Views: 90

Answers (3)

darkash

Reputation: 161

auth_list = {x['author'] for x in authorvals}  # convert to a list if you need index access
indexed = {}  # maps each value key to the list of its values

for auth in authorvals:
    # drop 'author' and 'year' from the keys and take the one key that remains
    filtered = next(iter(auth.keys() - {'author', 'year'}))
    # create the list the first time a key is seen, then append this record's value
    indexed.setdefault(filtered, []).append(auth[filtered])

val_list = list(indexed.keys())
num_list = [indexed[key] for key in val_list]

Note that num_list may differ from your example: the sublists are not padded to a fixed length, so a key that is missing for some author gives a shorter list. You can always pad them afterwards.
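The "afterwards" step could look roughly like this. It is only a sketch: `pad_lists` and its `fill` parameter are made-up names, and the `num_list` literal is a stand-in for what the code above builds from the question's data.

```python
def pad_lists(lists, fill=0):
    """Return copies of the lists, right-padded with `fill` to the longest length."""
    longest = max((len(item) for item in lists), default=0)
    return [item + [fill] * (longest - len(item)) for item in lists]


# Stand-in for the num_list built from the question's data.
num_list = [[4.0, 2.0], [1.0], [4.0, 2.0]]
padded = pad_lists(num_list)
print(padded)  # [[4.0, 2.0], [1.0, 0], [4.0, 2.0]]
```

This only pads at the end of each list, so it does not know which author a missing value belongs to.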

Upvotes: 0

Martin Kleiven

Reputation: 7303

  1. Collect authors in a set
  2. Collect keys and values in a defaultdict
  3. Post-process the values by padding them with zeros up to the maximum length.

from collections import defaultdict

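# Position of the value key in each record; this assumes every dict
# lists its keys in the order author, year, value.
DATA_INDEX = 2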

def collect(records):
    vals = defaultdict(list)
    authors = set()
    for record in records:
        for i, (k, v) in enumerate(record.items()):
            if k == 'author':
                authors.add(v)
            elif i == DATA_INDEX:
                vals[k].append(int(v))

    return (list(authors),
            list(vals.keys()),
            list(pad_by_max_len(vals.values())))



def pad_by_max_len(lol):
    lol = list(lol)  # materialize so the input can be iterated twice
    padlength = max(map(len, lol))  # length of the longest sublist
    return [pad(l, padlength) for l in lol]

def pad(l, padlength):
    return (l + [0] * padlength)[:padlength]

print(collect(authorvals))

This gives (the order of the authors may vary, since they come from a set):

(
    ['author2', 'author1'],
    ['value1', 'value3', 'value2'],
    [[4, 2], [1, 0], [4, 2]]
)

Upvotes: 1

Saad Hussain

Reputation: 80

Two things weren't clear to me, so I made these assumptions:

  1. The ordering of the values doesn't matter
  2. Every value key should have as many entries as the key that occurs most often. If it has fewer, zeros are added to its list in num_list.

The following code should work to that end:

val_list=[]
num_list=[]
auth_list=[]
max_values = 0

for d in authorvals:
    if d["author"] not in auth_list:
        auth_list.append(d["author"])
    for key in d:
        if key.startswith("value"):
            if key not in val_list:
                val_list.append(key)
                num_list.append([d[key]])
                max_values = max(max_values, 1)
            else:
                idx = val_list.index(key)
                num_list[idx].append(d[key])
                max_values = max(max_values, len(num_list[idx]))

for sublist in num_list:
    if len(sublist) != max_values:
        padding = [0] * (max_values - len(sublist))
        sublist.extend(padding)

print(val_list)  # ['value1', 'value3', 'value2']
print(num_list)  # [[4.0, 2.0], [1.0, 0], [4.0, 2.0]]
print(auth_list) # ['author1', 'author2']
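Padding at the end matches the expected output here only because the missing entry belongs to the last author. If the zeros should line up with the author who has no value, one option is to look values up per (key, author) pair. A minimal sketch, assuming each pair occurs at most once (a later record would overwrite an earlier one); `split_by_author`, `prefix` and `fill` are hypothetical names, not from the question:

```python
authorvals = [
    {"author": "author1", "year": ["2016"], "value1": 4.0},
    {"author": "author2", "year": ["2016"], "value1": 2.0},
    {"author": "author1", "year": ["2016"], "value3": 1.0},
    {"author": "author1", "year": ["2016"], "value2": 4.0},
    {"author": "author2", "year": ["2016"], "value2": 2.0},
]


def split_by_author(records, prefix="value", fill=0):
    """Return (val_list, num_list, auth_list) with num_list aligned to auth_list."""
    auth_list = []
    lookup = {}  # (key, author) -> value
    for record in records:
        author = record["author"]
        if author not in auth_list:
            auth_list.append(author)  # keep first-seen order
        for key, value in record.items():
            if key.startswith(prefix):
                lookup[(key, author)] = value
    # Plain string sort: "value10" would come before "value2".
    val_list = sorted({key for key, _ in lookup})
    num_list = [[lookup.get((key, author), fill) for author in auth_list]
                for key in val_list]
    return val_list, num_list, auth_list


val_list, num_list, auth_list = split_by_author(authorvals)
print(val_list)   # ['value1', 'value2', 'value3']
print(num_list)   # [[4.0, 2.0], [4.0, 2.0], [1.0, 0]]
print(auth_list)  # ['author1', 'author2']
```

With this layout, a key that only author2 has would give `[0, x]` rather than `[x, 0]`.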

Upvotes: 0
