Reputation: 2492
I'd like to write an API that reads a CSV from disk (with x, y coordinates) and outputs the points as JSON to be rendered by a web front end. The issue is that there are a lot of data points (on the order of 30k), so converting the numpy arrays of x and y values into JSON is really slow.
This is my current function for getting the data into JSON format. Is there any way to speed it up? It seems very redundant to build such a large data structure for each 2D point.
def to_json(xdata, ydata):
    data = []
    for x, y in zip(xdata, ydata):
        data.append({"x": x, "y": y})
    return data
Upvotes: 1
Views: 1675
Reputation:
You could use a list comprehension:
def to_json(xdata, ydata):
    return [{"x": x, "y": y} for x, y in zip(xdata, ydata)]
This eliminates the unnecessary intermediate variable and is cleaner.
You can also use a generator expression:
def to_json(xdata, ydata):
    return ({"x": x, "y": y} for x, y in zip(xdata, ydata))
Generators are created very quickly and are light on memory, since they produce values lazily. That advantage lasts until you do something that consumes them, like converting the generator to a list.
Since the objects are just x-y coordinates, I'd use a generator of (x, y) tuples, which are also created faster, like so:
def to_json(xdata, ydata):
    return ((x, y) for x, y in zip(xdata, ydata))
Edit: You could replace the tuples with lists ([]); they're valid JSON arrays.
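Note that json.dumps won't accept a generator directly, so you have to materialize it before serializing. Here's a quick sketch of how that fits together (the sample data is made up); the tuples come out as JSON arrays:

import json

def to_json(xdata, ydata):
    return ((x, y) for x, y in zip(xdata, ydata))

xdata = [1.0, 2.0, 3.0]
ydata = [4.0, 5.0, 6.0]

pairs = to_json(xdata, ydata)
# json.dumps raises TypeError on a generator, so wrap it in a list first
print(json.dumps(list(pairs)))  # [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]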
Upvotes: 1
Reputation: 3682
Your method seems reasonable enough. Here are a few changes I might make to it. The itertools module has lots of handy tools that can make your life easier. I used izip, which you can read up on in the itertools documentation (note that izip exists only in Python 2; in Python 3 the built-in zip is already lazy).
import json
from itertools import izip

def to_json(xdata, ydata):
    data = []
    for x, y in izip(xdata, ydata):  # izip is more memory efficient than Python 2's zip
        data.append({"x": x, "y": y})
    return json.dumps(data)  # convert the list into a JSON string
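For completeness, here's a rough usage sketch with made-up numpy arrays. It assumes Python 3, so it swaps izip for the built-in zip; it also casts the values to plain floats, since numpy integer scalars aren't JSON serializable:

import json
import numpy as np

def to_json(xdata, ydata):
    data = []
    for x, y in zip(xdata, ydata):  # built-in zip is already lazy on Python 3
        data.append({"x": float(x), "y": float(y)})  # cast numpy scalars to plain Python floats
    return json.dumps(data)

xdata = np.array([1.0, 2.0])
ydata = np.array([3.0, 4.0])
print(to_json(xdata, ydata))  # [{"x": 1.0, "y": 3.0}, {"x": 2.0, "y": 4.0}]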
Upvotes: 0