Reputation: 23634
I want to consolidate my logging data into a single StatisticStore model. Right now that data is scattered across 3 models, which is a mess. What would be the best way to iterate over all the records of those 3 models and create a copy of each in the new StatisticStore model?
Upvotes: 0
Views: 49
Reputation: 15143
You haven't described many constraints, so I assume it's just a simple copy operation you're after. "Best way" is kinda vague since I don't know what you're comparing against. The one thing you'd want to be careful about is to do the actual work of creating the new entity, copying the data over, and deleting the old entity inside a single transaction. That is simple to do and prevents you from creating duplicates if something goes wrong partway through.
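For illustration, here is a minimal sketch of such a transactional copy using the old db API. The model and property names (OldRequestLog-style kinds, event, value, created) are placeholders for whatever your real log models look like, and it assumes the High Replication Datastore so a cross-group (xg) transaction can span the old and new entity groups:

```python
from google.appengine.ext import db


class StatisticStore(db.Model):
    # Placeholder properties; use whatever your real StatisticStore defines.
    event = db.StringProperty()
    value = db.IntegerProperty()
    created = db.DateTimeProperty()


def copy_and_delete(old_key):
    """Copy one old log entity into StatisticStore and delete the original,
    all inside a single cross-group transaction."""
    def txn():
        old = db.get(old_key)
        if old is None:  # already migrated on a previous (restarted) run
            return
        StatisticStore(event=old.event,
                       value=old.value,
                       created=old.created).put()
        old.delete()

    # xg=True lets the transaction touch both the old and the new entity group.
    db.run_in_transaction_options(db.create_transaction_options(xg=True), txn)
```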
The remote API shell is definitely the least-coding-effort way to do it. You can write simple Python functions to do your transactional copy and run them in the shell; you don't need to write any extra handlers, and you don't even need to deploy a new version of your app. The drawback of the remote shell is that it's probably 100x slower at accessing your datastore, so it could take a long time. If you let it run overnight it could stop partway through when your internet connection hiccups, but since each entity is copied in a transaction that isn't a huge problem: you just restart the operation. As a reference, I recently ran an operation via the remote API that uploaded 6000 entities, and it took maybe 5 minutes. If you're ok with letting the operation run overnight, this is probably the way to go unless you have > 100K entities.
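Roughly, a session could look like the following; the app id, module name, and old model name are made up here, and you'd repeat the loop for each of your 3 old models:

```python
# $ remote_api_shell.py -s your-app-id.appspot.com
from migrate import copy_and_delete   # the transactional helper sketched above
from models import OldRequestLog      # one of the 3 old log models (placeholder name)

# Iterate over keys only and copy/delete one entity per transaction.
for key in OldRequestLog.all(keys_only=True):
    copy_and_delete(key)
```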
The mapreduce API method will run faster, since the load is spread across a number of instances. It's a bit more effort to get mapreduce set up: you'll have to deploy a new version of your app with the migration code, kick off the job, wait until it finishes, and then possibly clean out the code as well as the logging entities that mapreduce automatically generates.
Upvotes: 0
Reputation: 1839
If you only have a few thousand entities per model, I would simply iterate over each of the three models using a datastore fetch and store each record as a new StatisticStore entity. You might even be able to do this using the remote API.
If you have many thousands of entities per model, check out the MapReduce framework. With it, you would write a pipeline definition for each of your three models and three map functions that each take an entity and store it in your StatisticStore. The "reduce" part should be unnecessary in your case.
The answers to this SO question might also provide further inspiration.
Upvotes: 1