Reputation: 1
I have a problem updating all documents in a collection. I need to iterate through ~2 million docs, load each doc into memory, parse the HTML in one of its fields, and save the doc back to the DB.
I tried Take/Skip paging, with and without indexes, by Id, etc., but some records still remain unchanged (I even tested with 1000 records and a page size of 128). No new records are inserted while the update runs. Simple patching (the patching API) does not work here because the update I need to perform is quite complex.
Please help with this. Thanks.
Code:
public static int UpdateAll<T>(DocumentStore docDB, Action<T> updateAction)
{
    return UpdateAll(0, docDB, updateAction);
}

public static int UpdateAll<T>(int startFrom, DocumentStore docDB, Action<T> updateAction)
{
    using (var session = docDB.OpenSession())
    {
        int queryCount = 0;
        int start = startFrom;
        while (true)
        {
            var current = session.Query<T>().Take(128).Skip(start).ToList();
            if (current.Count == 0)
                break;
            start += current.Count;
            foreach (var doc in current)
            {
                updateAction(doc);
            }
            session.SaveChanges();
            queryCount += 2;
            if (queryCount >= 30)
            {
                return UpdateAll(start, docDB, updateAction);
            }
        }
    }
    return 1;
}
Upvotes: 0
Views: 106
Reputation: 99
Move your session.SaveChanges(); call outside the while loop.
By design, a Raven session allows only 30 requests to the database (the default limit) during the lifetime of a single session instance.
If you refactor your code to call SaveChanges() only once (or very few times) per using block, it should work.
For more information, check out the Raven docs: Understanding The Session Object - RavenDB
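As a rough illustration of "very few SaveChanges() calls per using block", here is a minimal sketch that opens a fresh session for every page, so each session issues one query and one SaveChanges(). The class name BatchUpdater, the method name UpdateAllInBatches, and the pageSize parameter are made up for this example, and the Raven.Client.Document namespace is an assumption that depends on your client version. It only addresses the per-session request limit; it does not by itself guarantee a stable page order between queries.

```csharp
using System;
using System.Linq;
using Raven.Client.Document; // assumption: where DocumentStore lives in older client versions

public static class BatchUpdater
{
    // Hypothetical helper (not part of RavenDB): pages through all documents of type T
    // and opens a new session per page, keeping each session at two requests.
    public static int UpdateAllInBatches<T>(DocumentStore docDB, Action<T> updateAction, int pageSize = 128)
    {
        int start = 0;    // number of documents already paged past
        int updated = 0;  // number of documents passed to updateAction

        while (true)
        {
            using (var session = docDB.OpenSession())
            {
                // Request 1 of this session: load one page.
                var current = session.Query<T>().Skip(start).Take(pageSize).ToList();
                if (current.Count == 0)
                    break; // no more documents; the session is disposed on exit

                foreach (var doc in current)
                    updateAction(doc);

                // Request 2 of this session: persist the changes tracked for this page.
                session.SaveChanges();

                start += current.Count;
                updated += current.Count;
            }
        }

        return updated;
    }
}
```

A call would then look like BatchUpdater.UpdateAllInBatches<MyDoc>(store, d => d.Html = Clean(d.Html)); where MyDoc, store, and Clean are placeholders for your own document type, document store, and parsing routine. Disposing the session after each page also keeps it from tracking every document it has ever loaded.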
Upvotes: 0