Reputation: 103
I have data in the format
from to
Location1 Location2
Location1 Location3
Location1 Location4
Location1 Location5
Location2 Location1
Location2 Location3
Location3 Location1
Location3 Location2
Location3 Location4
In a csv file. The data maps bike trips from one station to another and is taken from the website of a bike rental company in Chicago.
Right now I have basic code that takes each row and adds it to a list, but it does not create a dictionary in the second index as I am wishing for. My script looks like:
import csv
li = []
with open('Desktop/test_Q4_trips.csv', newline='') as csvfile:
reader = csv.reader(csvfile, delimiter=',')
for name, imports in reader:
li.append({
"name": name,
"imports": imports,
})
del li[0]
Here is the output,
[{"from": "Location1", "to": "Location2"}, {"from": "Location1", "to": "Location3"},
{"from": "Location1", "to": "Location4"}, {"from": "Location1", "to": "Location5"},
...]
I want to convert this data into a this format,
[{"from": "Location1", "to": ["Location2", "Location3", "Location4", "Location5"]},
{"from": "Location2", "to": ["Location1", "Location3"]},
{"from": "Location3", "to": ["Location1", "Location2", "Location4"]}, ...
].
In words, I would like to create a list of dictionaries where each dictionary has a single value in the first index and a list of (variably many) values in the second index. In particular, the output should list all of the stations that are on the receiving end of a bike rental trip in a list in the second index. To do this, I imagine I will have to create a script with a for loop that loops through the "from" values on the left and appends every "to" location that corresponds to each "from" one to a list.
I would like my data to be in the particular form I am mentioning in order to work with data visualization code that I have. I am sure that creating the format that I would like needs a leap in thought, but I am not sure exactly what to do to satisfy this. I am also not sure if the output type I need should be a list or an array, and would appreciate a clarification about this.
Please help me solve this, Thanks in advance.
Upvotes: 1
Views: 61
Reputation: 2726
This should work I think
import numpy as np
l = [{"from": "Location1", "to": "Location2"}, {"from": "Location1", "to": "Location3"},
{"from": "Location1", "to": "Location4"}, {"from": "Location1", "to": "Location5"}]
from_to = np.array(([d['from'] for d in l],[d['to'] for d in l])).T
froms = set(from_to[:,0])
out = []
for f in froms:
d = {}
mask = from_to[:,0]==f
d['from']=f
d['to'] = from_to[:,1][mask]
out.append(d)
Upvotes: 0
Reputation: 1597
The collections.defaultdict
might be a good approach to solve this problem.
from collections import defaultdict
d = defaultdict(list)
a = [{"from": "Location1", "to": "Location2"}, {"from": "Location1", "to": "Location3"},
{"from": "Location1", "to": "Location4"}, {"from": "Location1", "to": "Location5"}]
for o in a:
d[o['from']].append(o['to'])
print(d)
Upvotes: 2