Reputation: 235
Rarely and apparently randomly, entity framework will insert many duplicate records. Can anyone explain why this behaviour occurs? This is the second project i've seen this in:
protected void btnAddQual_Click(object sender, EventArgs e)
{
QEntities ds = new QEntities();
Qualification qual = new Qualification();
qual.PersonID = ds.Persons.Where(x => x.Username == User.Identity.Name).Single().PersonID;
qual.QualificationName = txtQualAddName.Text;
qual.QualificationProvider = txtQualAddProvider.Text;
qual.QualificationYear = txtQualAddYear.Text;
qual.Inactive = false;
qual.LastUpdatedBy = User.Identity.Name;
qual.LastUpdatedOn = DateTime.Now;
ds.Qualifications.Add(qual);
ds.SaveChanges();
}
Qualifications Table:
public partial class Qualification
{
public int QualificationID { get; set; }
public int PersonID { get; set; }
public string QualificationName { get; set; }
public string QualificationProvider { get; set; }
public string QualificationYear { get; set; }
public bool Inactive { get; set; }
public string LastUpdatedBy { get; set; }
public Nullable<System.DateTime> LastUpdatedOn { get; set; }
public virtual Persons Persons { get; set; }
}
I've seen it create from three to 32 records in one button click, and when it does, the timestamps which can be spread across a good period of time (last time was 28 records, all identical apart from the primary key and timestamps, unevenly distributed over 23 minutes)
I've previously put this down to user or browser based behaviour, but last night it happened with me using the machine. I didn't notice anything unusual at the time, but its infrequent occurance makes it a devil to track down. Can anyone suggest a cause?
Edit with additional information:
This is with .net framework 4.5.2 and EF 6.1.3
Edit to explain the bounty:
I've just seen this occur in the following code:
using(exEntities ds = new exEntities())
{
int initialStations;
int finalStations;
int shouldbestations = numStations * numSessions * numRotations * numBlock;
initialStations = ds.Stations.Count();
for(int b = 1; b <= numBlock; b++)
{
for (int se = 1; se <= numSessions; se++)
{
for (int r = 1; r <= numRotations; r++)
{
for (int st = 1; st <= numStations; st++)
{
Stations station = new Stations();
station.EID = eID;
station.Block = b;
station.Rotation = r;
station.Session = se;
station.StationNum = st;
station.LastUpdatedBy = User.Identity.Name + " (Generated)";
station.LastUpdatedOn = DateTime.Now;
station.Inactive = false;
ds.Stations.Add(station);
}
}
}
}
ds.SaveChanges();
In this instance, the number of iterations of each of the loops were: 1, 2, 6, 5 respectively.
This one click (same timestamp) has duplicated the complete set of records
Upvotes: 0
Views: 388
Reputation: 34908
This is a case where you need to add logging into your application. Based on your code, I do not believe Entity Framework is duplicating your data, rather that your code is being triggered in ways that you are not catering for. Where I have seen EF duplicate records has been due to developers passing entities loaded from one DBContext and then associating them to entities created in a second DBContext without checking the DbContext and attaching them first. EF will treat them as "new", and re-insert them. From the code that you have provided, I do not believe this is the case, but it is something to watch out for.
Firstly, when dealing with web applications in particular, you should write any event handler or POST/PATCH API method to treat every call as untrustworthy. Under normal use these methods should do what you expect, but under abusive use or hacking attempts they can be called when they shouldn't, or carry payloads that they shouldn't. For example, you may expect that an event handler with a record ID of 1234 would only be fired when record 1234 is updated and the user pressed the "Save" button (once), however it is possible to:
Trust nothing, verify and log everything, and if something is out of place, terminate the session. (Force log-out)
For logging, beyond the standard exception logging, I would recommend adding Information traces with an additional compiler constant for a production debug build to monitor one of these cases where this event is getting tripped more than once. Personally I use Diagnostics.Trace
and then hook a logging handler like Log4Net into it.
I would recommend adding something like the following:
#if DEBUG
Trace.TraceInformation(string.Format("Add Stations called. (eIS: {0})", eID));
Trace.TraceInformation(Environment.StackTrace);
#endif
Then when you do your count check and find a problem:
Trace.TraceWarning(string.Format("Add Station count discrepancy. Expected {0}, Found {1}", shouldBeStations, finalStations));
I put the compiler condition because Environment.StackTrace will incur a cost that you normally do not want in a production system. You can replace "DEBUG" with another custom compilation constant that you enable for a deployment to inspect this issue. Leave it running in the wild and monitor your trace output (database, file, etc.) until the problem manifests. When the warning appears you can go back through the Information traces to see when and where the code is being triggered. You can also put similar tracing in calls that loaded the screen where this event would be triggered to record IDs and User details to see if/how an event might have been triggered for the wrong eID via a code bug or some environmental condition or hack attempt.
Alternatively, for logging you can also consider adding a setting to your application configuration to turn on and off logging events based on logging levels or flags to start/stop capturing scenarios without re-deploying.
These kinds of issues are hard to diagnose and fix over something like StackOverflow, but hopefully logging will help highlight something you hadn't considered. Worst case, consider bringing in someone experienced with EF and your core tech stack short-term to have a look over the system as a second pair of eyes over the entire workings may also help point out potential causes.
One small tip. Rather than something like:
qual.PersonID = ds.Persons.Where(x => x.Username == User.Identity.Name).Single().PersonID;
use:
qual.PersonID = ds.Persons.Where(x => x.Username == User.Identity.Name).Select(x => x.PersonID).Single();
The first statement executes a "SELECT * FROM tblPersons WHERE..." where the entity isn't already cached and pulls back all columns, where only PersonID is needed. The second executes a "SELECT PersonID FROM tblPersons WHERE..."
Upvotes: 1