Reputation: 20862
What is the easiest way to save and load data in python, preferably in a human-readable output format?
The data I am saving/loading consists of two vectors of floats. Ideally, these vectors would be named in the file (e.g. X and Y).
My current save() and load() functions use file.readline(), file.write() and string-to-float conversion. There must be something better.
Upvotes: 15
Views: 80978
Reputation: 601519
There are several options -- I don't know exactly what you're looking for. If the two vectors have the same length, you could use numpy.savetxt() to save your vectors, say x and y, as columns:
import numpy

# saving:
f = open("data", "w")
f.write("# x y\n")                       # column names
numpy.savetxt(f, numpy.array([x, y]).T)  # write the two vectors as columns
f.close()

# loading:
x, y = numpy.loadtxt("data", unpack=True)
If you are dealing with larger vectors of floats, you should probably use NumPy anyway.
Upvotes: 9
Reputation: 478
Here is an example of the encoder you would probably want to write for your Body class:
# add this to your code
import json
import numpy as np

class BodyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        if hasattr(obj, '__jsonencode__'):
            return obj.__jsonencode__()
        if isinstance(obj, set):
            return list(obj)
        return obj.__dict__
# Here you construct your way to load your data for each instance
# you need to customize this function
def deserialize(data):
    bodies = [Body(d["name"], d["mass"], np.array(d["p"]), np.array(d["v"]))
              for d in data["bodies"]]
    axis_range = data["axis_range"]
    timescale = data["timescale"]
    return bodies, axis_range, timescale
# Here you construct your way to dump your data for each instance
# you need to customize this function
def serialize(data):
    with open(FILE_NAME, 'w+') as file:
        json.dump(data, file, cls=BodyEncoder, indent=4)
    print("Dumping Parameters of the Latest Run")
    print(json.dumps(data, cls=BodyEncoder, indent=4))
Here is an example of the class I want to serialize:
class Body(object):
    # you do not need to change your class structure
    def __init__(self, name, mass, p, v=(0.0, 0.0, 0.0)):
        # init variables like normal
        self.name = name
        self.mass = mass
        self.p = p
        self.v = v
        self.f = np.array([0.0, 0.0, 0.0])

    def attraction(self, other):
        # not important functions that I wrote...
        pass
Here is how to serialize:
# you need to customize this function
def serialize_everything():
    bodies, axis_range, timescale = generate_data_to_serialize()
    data = {"bodies": bodies, "axis_range": axis_range, "timescale": timescale}
    serialize(data)
Here is how to load the dump back:
def dump_everything():
    with open(FILE_NAME, "r") as file:
        data = json.loads(file.read())
    return deserialize(data)
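Note that __jsonencode__ is not part of the standard json module; it is just a hook that BodyEncoder.default() above checks for. As a rough illustration, a hypothetical class opting into that hook could look like this (Vector3 is made up for the example):
class Vector3(object):
    # hypothetical class providing the __jsonencode__ hook used by BodyEncoder
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def __jsonencode__(self):
        # return something the json module can serialize directly
        return {"x": self.x, "y": self.y, "z": self.z}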
Upvotes: 1
Reputation: 538
As I commented in the accepted answer, using numpy this can be done with a simple one-liner:
Assuming you have numpy imported as np (which is common practice),
np.savetxt('xy.txt', np.array([x, y]).T, fmt="%.3f", header="x y")
will save the data in the (optional) format and
x, y = np.loadtxt('xy.txt', unpack=True)
will load it.
The file xy.txt will then look like:
# x y
1.000 1.000
1.500 2.250
2.000 4.000
2.500 6.250
3.000 9.000
Note that the format string fmt=... is optional, but if the goal is human readability it may prove quite useful. If used, it is specified using the usual printf-like codes (in my example: a floating-point number with 3 decimals).
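For completeness, here is a sketch of the full round trip that would produce a file like the one above; the sample values are made up to match the example output:
import numpy as np

# sample data chosen to match the file shown above
x = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
y = x ** 2

# save the two vectors as named, formatted columns ...
np.savetxt('xy.txt', np.array([x, y]).T, fmt="%.3f", header="x y")

# ... and load them back into two separate arrays
x, y = np.loadtxt('xy.txt', unpack=True)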
Upvotes: 0
Reputation: 500257
Since we're talking about a human editing the file, I assume we're talking about relatively little data.
How about the following skeleton implementation? It simply saves the data as key=value pairs and works with lists, tuples and many other things.
def save(fname, **kwargs):
    f = open(fname, "wt")
    for k, v in kwargs.items():
        print("%s=%s" % (k, repr(v)), file=f)
    f.close()

def load(fname):
    ret = {}
    for line in open(fname, "rt"):
        k, v = line.strip().split("=", 1)
        ret[k] = eval(v)
    return ret

x = [1, 2, 3]
y = [2.0, 1e15, -10.3]

save("data.txt", x=x, y=y)
d = load("data.txt")
print(d["x"])
print(d["y"])
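One caveat: eval() will execute whatever is in the file, so this loader should only be used on files you trust. A safer substitute, assuming the values are plain Python literals such as lists of numbers, is ast.literal_eval; here is a sketch of the load() side with that swap:
import ast

def load_safe(fname):
    ret = {}
    for line in open(fname, "rt"):
        k, v = line.strip().split("=", 1)
        ret[k] = ast.literal_eval(v)  # parses literals only, never executes code
    return ret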
Upvotes: 0
Reputation: 172219
If it should be human-readable, I'd also go with JSON. Unless you need to exchange it with enterprise-type people, they like XML better. :-)
If it should be human editable and isn't too complex, I'd probably go with some sort of INI-like format, for example configparser (see the sketch below).
If it is complex, and doesn't need to be exchanged, I'd go with just pickling the data, unless it's very complex, in which case I'd use ZODB.
If it's a LOT of data, and needs to be exchanged, I'd use SQL.
That pretty much covers it, I think.
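For the INI-like option, a minimal sketch with configparser, assuming the two vectors can be stored as comma-separated strings (the file and section names are made up):
import configparser

x = [1.0, 2.5, 3.7]
y = [4.0, 5.5, 6.7]

# saving: one section, one comma-separated entry per vector
config = configparser.ConfigParser()
config["vectors"] = {"x": ", ".join(str(v) for v in x),
                     "y": ", ".join(str(v) for v in y)}
with open("data.ini", "w") as f:
    config.write(f)

# loading: split the strings and convert back to floats
config = configparser.ConfigParser()
config.read("data.ini")
x = [float(v) for v in config["vectors"]["x"].split(",")]
y = [float(v) for v in config["vectors"]["y"].split(",")]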
Upvotes: 8
Reputation: 5095
The simplest way to get human-readable output is to use a serialization format such as JSON. Python ships with a json library you can use to serialize data to and from a string. Like pickle, you can use this with an IO object to write it to a file.
import json

file = open('/usr/data/application/json-dump.json', 'w+')
data = {"x": 12153535.232321, "y": 35234531.232322}
json.dump(data, file)
file.close()
If you want to get a simple string back instead of dumping it to a file, you can use json.dumps() instead:
import json

print(json.dumps({"x": 12153535.232321, "y": 35234531.232322}))
Reading back from a file is just as easy:
import json

file = open('/usr/data/application/json-dump.json', 'r')
print(json.load(file))
file.close()
The json library is full-featured, so I'd recommend checking out the documentation to see what sorts of things you can do with it.
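One detail worth knowing if readability is the goal: json.dump() and json.dumps() accept an indent argument that pretty-prints the output. A small sketch with made-up sample vectors:
import json

data = {"x": [1.0, 2.5, 3.7], "y": [4.0, 5.5, 6.7]}  # made-up sample vectors
print(json.dumps(data, indent=2))  # pretty-printed, one element per line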
Upvotes: 27
Reputation: 838096
A simple serialization format that is easy for both humans and computers to read is JSON.
You can use the json Python module.
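For the two named vectors from the question, a minimal sketch (with made-up values and file name) might look like this:
import json

# saving two named vectors (values are made up)
with open("vectors.json", "w") as f:
    json.dump({"X": [1.0, 2.0, 3.0], "Y": [4.0, 5.0, 6.0]}, f, indent=2)

# loading them back
with open("vectors.json") as f:
    data = json.load(f)
X, Y = data["X"], data["Y"]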
Upvotes: 2