user3657953

Reputation: 33

Populate large files into sqlite from Python

I am new to Python and SQLite. I run large neural network simulations and store the spikes as an ASCII file (here named spkTimes.csv) with two columns: the first column is the spike time and the second is the neuron ID. Each simulation run has a different parameter (call it theta). I want to populate the database such that I can query by neuron ID and theta to get all the spike times for that neuron. This is what I am doing; it works, but it is extremely slow since I am looping through every spike time. Could anybody provide suggestions to make it faster? Thanks in advance.

from peewee import *
from numpy import *

spkDB = SqliteDatabase('simData.db')

class SimData(Model):
    neuronId = IntegerField()

    class Meta:
        database = spkDB

class SpikeTimes(Model):
    spkNeuronId = ForeignKeyField(SimData, related_name = 'neuron')
    theta = DoubleField()
    spkTimes = DoubleField()

    class Meta:
        database = spkDB

st = loadtxt('spkTimes.csv')
curTheta = 0
SimData.create_table()
SpikeTimes.create_table()
for k in unique(st[:, 1]):
    # create() already saves the row, so a separate save() is not needed
    tmp = SimData.create(neuronId = k)
    for m in st[st[:, 1] == k, 0]:
        SpikeTimes.create(spkNeuronId = tmp, theta = curTheta, spkTimes = m)

print('done')

Upvotes: 0

Views: 161

Answers (1)

coleifer

Reputation: 26225

Use a transaction, so SQLite commits once at the end instead of once per insert.

with spkDB.transaction():
    for k in unique(st[:, 1]):
        tmp = SimData.create(neuronId = k)
        for m in st[st[:, 1] == k, 0]:
            SpikeTimes.create(spkNeuronId = tmp, theta = curTheta, spkTimes = m)

See also: http://peewee.readthedocs.org/en/latest/peewee/cookbook.html#bulk-inserts
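The linked cookbook section covers bulk inserts. The same batching idea can be shown with the standard library's `sqlite3` module as a minimal sketch (the table and column names here are illustrative, not the peewee schema above): a single transaction plus `executemany` avoids one commit per row.

```python
import sqlite3

# Illustrative rows of (neuron_id, theta, spike_time); in the real script
# these would come from the array loaded out of spkTimes.csv.
rows = [(1, 0.0, 12.5), (1, 0.0, 13.1), (2, 0.0, 7.9)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spike_times (neuron_id INTEGER, theta REAL, spk_time REAL)"
)

# The connection as a context manager wraps the block in one transaction:
# executemany issues every INSERT, and a single commit writes them together.
with conn:
    conn.executemany(
        "INSERT INTO spike_times (neuron_id, theta, spk_time) VALUES (?, ?, ?)",
        rows,
    )

count = conn.execute("SELECT COUNT(*) FROM spike_times").fetchone()[0]
print(count)  # 3
```

The same pattern applies through peewee, since it sits on top of the same SQLite connection; the cookbook link above shows peewee's own bulk-insert API.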

Upvotes: 1
