Reputation: 9065
We have an ASP.NET project with Entity Framework and SQL Azure.
A big part of our data only needs to be updated a few times a day, so we cache it in memory; the rest of the data is very volatile.
So far, so good.
Then we introduced a bug that linked one of these cached objects to the volatile data and called SaveChanges().
Well, that was quite a mess.
The whole data tree was added again on every update, filling the database with duplicated data.
As a complete hack I added an arbitrary column with a unique constraint and some gibberish data to one of the root tables, hoping that SaveChanges() will fail the next time we introduce such a bug because it will violate the unique constraint.
But it is of course hacky, and I'm still pretty scared ;P Are there any better ways to prevent whole trees of cached objects from ending up in the database?
More information
EntityCache.Instance.LolCats = new DbContext().LolCats.AsNoTracking().ToList();
I dependency-inject this cache into my controllers.
Upvotes: 2
Views: 1458
Reputation: 39014
You can solve it like this:
1) Create an interface like this:
public interface IIsReadOnly
{
    bool IsReadOnly { get; set; }
}
2) Implement this interface in all of the entities that can be cached. When you read and cache them, set the IsReadOnly property to true. This flag will be used when SaveChanges is invoked. Remember to decorate this property with the [NotMapped] attribute, or use any other means to make EF ignore it.
public class ACacheableEntitySample : IIsReadOnly
{
    [NotMapped]
    public bool IsReadOnly { get; set; }
    // define the "regular" entity properties
}
NOTE: you can include the property directly in the class definition (if using Code First), or use partial classes (for Db First, Model First, or Code First).
NOTE: alternatively you can make EF ignore the IsReadOnly property using the Fluent API, or even better a custom convention (EF 6+).
3) Override the SaveChanges method in your class that inherits from DbContext. In the overridden method, review all the entries with pending changes, and if they are read-only, change their state to Unchanged:
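public override int SaveChanges()
{
    // Entries<IIsReadOnly>() only returns tracked entities that are cacheable
    foreach (var entry in ChangeTracker.Entries<IIsReadOnly>())
    {
        if (entry.Entity.IsReadOnly) // it was marked as read-only when caching
        {
            // reset the state so that it's neither inserted nor updated
            entry.State = EntityState.Unchanged;
        }
    }
    return base.SaveChanges();
}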
NOTE: This is sample code to explain what you need to do. In your final implementation you can do it with a single LINQ query that gets all the IIsReadOnly entities that have IsReadOnly set to true and sets their state to Unchanged.
You can use the IIsReadOnly entities in another DbContext and manipulate them in the usual way. For example, if you get one of these entities, update it, and call SaveChanges, the changes will be saved because IsReadOnly will have its default value of false. But you'll easily avoid accidentally saving changes to cached data, simply by setting the IsReadOnly property to true when caching.
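Setting the flag when the cache is filled can be done in one pass. This is only a minimal sketch: the ToReadOnlyList helper and the LolCat class are hypothetical names (LolCat is borrowed from the question), and the interface is repeated so the snippet stands on its own.

```csharp
using System.Collections.Generic;
using System.Linq;

public interface IIsReadOnly
{
    bool IsReadOnly { get; set; }
}

// Hypothetical cached entity, named after the LolCats set in the question.
public class LolCat : IIsReadOnly
{
    public int Id { get; set; }
    public string Name { get; set; }
    public bool IsReadOnly { get; set; } // would carry [NotMapped] in a real EF model
}

public static class ReadOnlyMarker
{
    // Materializes the sequence and flags every item as read-only
    // before the list is handed to the cache.
    public static List<T> ToReadOnlyList<T>(this IEnumerable<T> source)
        where T : class, IIsReadOnly
    {
        var list = source.ToList();
        foreach (var item in list)
        {
            item.IsReadOnly = true;
        }
        return list;
    }
}
```

With a helper like this, the cache from the question could presumably be filled by ending the AsNoTracking() query with ToReadOnlyList() instead of ToList(), so no cached entity is ever stored without the flag.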
Upvotes: 1
Reputation: 785
Original answer deleted because it was a waste of time.
Your post and the comments that follow it are a perfect example of the XY Problem.
You say:
I really need a solution for the problem, not for the architecture
A caching solution you implemented that violates at least half a dozen best practices has (surprise!) blown up in your face. You've managed to stop it from blowing up again via a spectacular (not in a good way) hack, but you want to know how to do it in a way that won't require such a hack.
You needed to cache some data because it was getting too expensive to hit the database for every request.
This is a perfectly valid answer and, surprise, a best practice. Navigation properties can change any time you regenerate the code in your Entity Data Model and are often ambiguous. With a bit of effort you could have used this and never had to worry about EF's handling of object relationships again.
Another valid answer, and one that requires the least amount of actual work. MVC applications usually require some redundancy between viewmodels and entity objects and if you ever write a proper multi-tier application you'll practically drown in redundant objects. And nobody will accidentally add these objects to a DbContext ever again - because they can't.
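The suggestion being discussed is not quoted here, so the following is only a sketch under the assumption that it means caching plain copies instead of the entity objects themselves. LolCat, LolCatCacheItem and LolCatCacheMapper are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity class, as EF would materialize it.
public class LolCat
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Plain cache model: not part of the EF model and immutable after construction.
public sealed class LolCatCacheItem
{
    public LolCatCacheItem(int id, string name)
    {
        Id = id;
        Name = name;
    }

    public int Id { get; private set; }
    public string Name { get; private set; }
}

public static class LolCatCacheMapper
{
    // Copies the values out of the entities so the cache never holds entity objects.
    public static List<LolCatCacheItem> ToCacheItems(IEnumerable<LolCat> entities)
    {
        return entities.Select(e => new LolCatCacheItem(e.Id, e.Name)).ToList();
    }
}
```

Because LolCatCacheItem is a different type from the entity, it cannot be assigned to a navigation property that expects a LolCat, which is what keeps a cached tree from being attached by accident.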
You have offered up very little useful information. From what I can tell your approach from the get-go was wrong.
Firstly, dumping whole tables into memory at App_Start is at best a temporary solution. If the table was too big to hit on every request, it's too big to hit on App_Start. What happens if something important breaks while people are using your application and you need to deploy a bug fix ASAP? What happens when your tables get really big and you start getting timeouts from EF while trying to dump them into memory? What happens if 95% of your users only really ever need 10% of that big table you've dumped into memory? Is the memory on your web/cache server going to be enough to accommodate the increasing size of your tables? For how long?
Secondly, no Entity object should remain anywhere after its originating DbContext is disposed. Entity objects behave in a convenient way while their DbContext is in scope and become troublesome POCOs when it's out of scope. I say troublesome because the 'magic' DbContext does with change tracking tends to fool people unfamiliar with the inner workings of EF into thinking that an Entity object is directly connected to a table row in the database. The problem you had illustrates this point perfectly.
Thirdly, it looks like you need to delete and re-dump a whole table to memory, even if you only update a single column in a single row. That's immensely wasteful to both the memory and CPU on your web server, and to your Azure SQL instance(s). What happens when a small bit of data comes in wrong and needs to be updated in a hurry? What if one of your nightly update jobs fails but you need fresh data in the morning?
You may not worry about any of this stuff now, but your solution blowing up in your face should have at the very least raised some red flags. I've had to deal with a lot of caching in projects I've worked on in the past few years, and everything I say here comes from experience.
If you've put a little effort into organizing your code, all of your CRUD operations on the database should be in specialized helper classes which I call repositories. Your controller calls its specialized repository (StuffController - StuffRepository), receives a model and binds that model to a view, kinda like this:
public class StuffController : Controller
{
    private MyDbContext _db;
    private StuffRepository _repo;

    public StuffController()
    {
        // One context per controller instance, shared with the repository
        _db = new MyDbContext();
        _repo = new StuffRepository(_db);
    }

    // ...

    public ActionResult Details(int id)
    {
        var model = _repo.ReadDetails(id);
        // ...
        return View(model);
    }

    protected override void Dispose(bool disposing)
    {
        // Only release the managed context on an explicit Dispose call
        if (disposing) _db.Dispose();
        base.Dispose(disposing);
    }
}
What on-demand caching would do is wrap that call to the repository in such a way that if the result of that method was already in the cache and it was not stale, it would return it from the cache. Otherwise it would hit the database.
Here's a simplified (and probably nonfunctional) example of a CacheWrapper class so you can understand what it does, using HttpRuntime.Cache:
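public static class CacheWrapper
{
    private static readonly object _sync = new object();
    private static readonly List<string> _keys = new List<string>();

    // Snapshot of the keys that have been cached so far
    public static List<string> Keys
    {
        get { lock (_keys) { return _keys.ToList(); } }
    }

    public static T Fetch<T>(string key, Func<T> dlgt, bool refresh = false) where T : class
    {
        var result = HttpRuntime.Cache.Get(key) as T;
        if (result != null && !refresh) return result;

        // Lock on a private object; HttpRuntime.Cache is visible to all other code
        lock (_sync)
        {
            // Another thread may have filled the entry while we waited for the lock
            result = HttpRuntime.Cache.Get(key) as T;
            if (result != null && !refresh) return result;

            result = dlgt();
            // Insert overwrites an existing entry; Add would keep the old one on refresh
            HttpRuntime.Cache.Insert(key, result, /* some other params */);

            lock (_keys)
            {
                if (!_keys.Contains(key)) _keys.Add(key);
            }
        }
        return result;
    }
}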
And the new way to call things from the controller:
public ActionResult Details(int id)
{
    var model = CacheWrapper.Fetch("StuffDetails_" + id, () => _repo.ReadDetails(id));
    // ...
    return View(model);
}
A slightly more complex version of this is in production on a public web application as we speak and working quite well.
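The wrapper above leaves the expiration parameters out, so the "not stale" part is not visible. As a rough, self-contained illustration of the same fetch-or-load idea with a time-to-live, here is a sketch that does not use HttpRuntime.Cache at all. It is not the production code mentioned above; OnDemandCache, loader and ttl are made-up names.

```csharp
using System;
using System.Collections.Concurrent;

public static class OnDemandCache
{
    private sealed class Entry
    {
        public object Value;
        public DateTime ExpiresUtc;
    }

    private static readonly ConcurrentDictionary<string, Entry> _entries =
        new ConcurrentDictionary<string, Entry>();

    // Returns the cached value unless it is missing, stale, or a refresh is forced;
    // in those cases the loader runs and its result replaces the cached entry.
    public static T Fetch<T>(string key, Func<T> loader, TimeSpan ttl, bool refresh = false)
        where T : class
    {
        Entry entry;
        if (!refresh
            && _entries.TryGetValue(key, out entry)
            && entry.ExpiresUtc > DateTime.UtcNow)
        {
            var cached = entry.Value as T;
            if (cached != null) return cached;
        }

        var result = loader();
        if (result != null)
        {
            // Store the fresh value together with its expiry time
            _entries[key] = new Entry { Value = result, ExpiresUtc = DateTime.UtcNow.Add(ttl) };
        }
        return result;
    }
}
```

In this sketch two requests that miss at the same moment may both run the loader; that keeps the code short, but an expensive query would probably want a per-key lock on top.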
Upvotes: 1