Reputation: 20862
What is the easiest way to save and load data in python, preferably in a human-readable output format?
The data I am saving/loading consists of two vectors of floats. Ideally, these vectors would be named in the file (e.g. X and Y).
My current save() and load() functions use file.readline(), file.write() and string-to-float conversion. There must be something better.
Upvotes: 15
Views: 80978
Reputation: 601519
There are several options -- I don't know exactly what you're looking for. If the two vectors have the same length, you could use numpy.savetxt() to save your vectors, say x and y, as columns:
import numpy

# saving:
f = open("data", "w")
f.write("# x y\n")                       # column names
numpy.savetxt(f, numpy.array([x, y]).T)  # write the two vectors as columns
f.close()

# loading:
x, y = numpy.loadtxt("data", unpack=True)
If you are dealing with larger vectors of floats, you should probably use NumPy anyway.
Upvotes: 9
Reputation: 478
Here is an example of the encoder you would probably want to write for your Body class:
# add this to your code
import json
import numpy as np

class BodyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        if hasattr(obj, '__jsonencode__'):
            return obj.__jsonencode__()
        if isinstance(obj, set):
            return list(obj)
        return obj.__dict__
# Here you construct your way to load your data for each instance
# you need to customize this function
def deserialize(data):
    bodies = [Body(d["name"], d["mass"], np.array(d["p"]), np.array(d["v"]))
              for d in data["bodies"]]
    axis_range = data["axis_range"]
    timescale = data["timescale"]
    return bodies, axis_range, timescale
# Here you construct your way to dump your data for each instance
# you need to customize this function
def serialize(data):
    with open(FILE_NAME, 'w+') as file:
        json.dump(data, file, cls=BodyEncoder, indent=4)
    print("Dumping Parameters of the Latest Run")
    print(json.dumps(data, cls=BodyEncoder, indent=4))
Here is an example of the class I want to serialize:
class Body(object):
    # you do not need to change your class structure
    def __init__(self, name, mass, p, v=(0.0, 0.0, 0.0)):
        # init variables like normal
        self.name = name
        self.mass = mass
        self.p = p
        self.v = v
        self.f = np.array([0.0, 0.0, 0.0])

    def attraction(self, other):
        # not important functions that I wrote...
        pass
Here is how to serialize:
# you need to customize this function
def serialize_everything():
    bodies, axis_range, timescale = generate_data_to_serialize()
    data = {"bodies": bodies, "axis_range": axis_range, "timescale": timescale}
    serialize(data)
Here is how to load the dump back:
def dump_everything():
    with open(FILE_NAME, "r") as file:
        data = json.loads(file.read())
    return deserialize(data)
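Note that __jsonencode__ is not part of the standard json module; it is just a hook that BodyEncoder.default() above checks for. As a rough illustration, a hypothetical class opting into that hook could look like this (Vector3 is made up for the example):
class Vector3(object):
    # hypothetical class providing the __jsonencode__ hook used by BodyEncoder
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def __jsonencode__(self):
        # return something the json module can serialize directly
        return {"x": self.x, "y": self.y, "z": self.z}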
Upvotes: 1
Reputation: 538
As I commented in the accepted answer, using numpy this can be done with a simple one-liner:
Assuming you have numpy imported as np (which is common practice),
np.savetxt('xy.txt', np.array([x, y]).T, fmt="%.3f", header="x y")
will save the data in the (optional) format and
x, y = np.loadtxt('xy.txt', unpack=True)
will load it.
The file xy.txt will then look like:
# x y
1.000 1.000
1.500 2.250
2.000 4.000
2.500 6.250
3.000 9.000
Note that the format string fmt=... is optional, but if the goal is human readability it may prove quite useful. If used, it is specified using the usual printf-like codes (in my example: a floating-point number with 3 decimals).
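For completeness, here is a sketch of the full round trip that would produce a file like the one above; the sample values are made up to match the example output:
import numpy as np

# sample data chosen to match the file shown above
x = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
y = x ** 2

# save the two vectors as named, formatted columns ...
np.savetxt('xy.txt', np.array([x, y]).T, fmt="%.3f", header="x y")

# ... and load them back into two separate arrays
x, y = np.loadtxt('xy.txt', unpack=True)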
Upvotes: 0
Reputation: 500257
Since we're talking about a human editing the file, I assume we're talking about relatively little data.
How about the following skeleton implementation? It simply saves the data as key=value pairs and works with lists, tuples and many other things.
def save(fname, **kwargs):
    f = open(fname, "wt")
    for k, v in kwargs.items():
        print("%s=%s" % (k, repr(v)), file=f)
    f.close()

def load(fname):
    ret = {}
    for line in open(fname, "rt"):
        k, v = line.strip().split("=", 1)
        ret[k] = eval(v)
    return ret

x = [1, 2, 3]
y = [2.0, 1e15, -10.3]

save("data.txt", x=x, y=y)
d = load("data.txt")
print(d["x"])
print(d["y"])
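One caveat: eval() will execute whatever is in the file, so this loader should only be used on files you trust. A safer substitute, assuming the values are plain Python literals such as lists of numbers, is ast.literal_eval; here is a sketch of the load() side with that swap:
import ast

def load_safe(fname):
    ret = {}
    for line in open(fname, "rt"):
        k, v = line.strip().split("=", 1)
        ret[k] = ast.literal_eval(v)  # parses literals only, never executes code
    return ret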
Upvotes: 0
Reputation: 172219
If it should be human-readable, I'd also go with JSON. Unless you need to exchange it with enterprise-type people, they like XML better. :-)
If it should be human editable and isn't too complex, I'd probably go with some sort of INI-like format, for example configparser (see the sketch below).
If it is complex, and doesn't need to be exchanged, I'd go with just pickling the data, unless it's very complex, in which case I'd use ZODB.
If it's a LOT of data, and needs to be exchanged, I'd use SQL.
That pretty much covers it, I think.
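For the INI-like option, a minimal sketch with configparser, assuming the two vectors can be stored as comma-separated strings (the file and section names are made up):
import configparser

x = [1.0, 2.5, 3.7]
y = [4.0, 5.5, 6.7]

# saving: one section, one comma-separated entry per vector
config = configparser.ConfigParser()
config["vectors"] = {"x": ", ".join(str(v) for v in x),
                     "y": ", ".join(str(v) for v in y)}
with open("data.ini", "w") as f:
    config.write(f)

# loading: split the strings and convert back to floats
config = configparser.ConfigParser()
config.read("data.ini")
x = [float(v) for v in config["vectors"]["x"].split(",")]
y = [float(v) for v in config["vectors"]["y"].split(",")]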
Upvotes: 8
Reputation: 5095
The simplest way to get human-readable output is to use a serialization format such as JSON. Python ships with a json library you can use to serialize data to and from a string. Like pickle, you can use this with an IO object to write it to a file.
import json

file = open('/usr/data/application/json-dump.json', 'w+')
data = {"x": 12153535.232321, "y": 35234531.232322}
json.dump(data, file)
file.close()
If you want to get a simple string back instead of dumping it to a file, you can use json.dumps() instead:
import json

print(json.dumps({"x": 12153535.232321, "y": 35234531.232322}))
Reading back from a file is just as easy:
import json

file = open('/usr/data/application/json-dump.json', 'r')
print(json.load(file))
file.close()
The json library is full-featured, so I'd recommend checking out the documentation to see what sorts of things you can do with it.
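One detail worth knowing if readability is the goal: json.dump() and json.dumps() accept an indent argument that pretty-prints the output. A small sketch with made-up sample vectors:
import json

data = {"x": [1.0, 2.5, 3.7], "y": [4.0, 5.5, 6.7]}  # made-up sample vectors
print(json.dumps(data, indent=2))  # pretty-printed, one element per line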
Upvotes: 27
Reputation: 838096
A simple serialization format that is easy for both humans and computers to read is JSON.
You can use the json Python module.
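For the two named vectors from the question, a minimal sketch (with made-up values and file name) might look like this:
import json

# saving two named vectors (values are made up)
with open("vectors.json", "w") as f:
    json.dump({"X": [1.0, 2.0, 3.0], "Y": [4.0, 5.0, 6.0]}, f, indent=2)

# loading them back
with open("vectors.json") as f:
    data = json.load(f)
X, Y = data["X"], data["Y"]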
Upvotes: 2