JensB

Reputation: 6850

How to achieve good import performance with CRM online 2015

Scenario:

Create or update records in CRM from an external database, based on whether the record key already exists in CRM or not.

Platform: CRM 2015 Online.

Method: SSIS, with a .NET 4 Script Component

Problem: No matter what I do, I cannot get more than around 1 record per second. I really hope I'm doing something wrong.

What I'm doing:

  1. Get a list of all records from CRM with the Guid and Key fields.
  2. Divide the records into multiple tasks.
  3. Create an ExecuteMultipleRequest within each task, with Create or Update requests depending on whether the record exists in the list from (1).
  4. Create one OrganizationService for each thread (reused between successive threads round-robin style; see code below).
  5. Run the execute multiple.
  6. Cry over speed.

The result does not seem to vary with the number of parallel tasks or with batch size. It's basically ALWAYS around 0.9-1.5 records per second.

I've tried everything from 1 thread with 1000 records per batch to 16 threads with 1 record each.

This post, claiming that 200-300 records per second is possible, is taunting me something fierce: evil taunting blog post with 300 records per second to CRM online

Code snippets from the project (tried to only copy what might be relevant):

The context creation.

 // Build a pool of OrganizationService instances (2x the max thread count)
 // so each worker thread can be handed its own connection.
 foreach (int i in Enumerable.Range(1, _MaxThreads * 2))
 {
    var crmConnection = CrmConnection.Parse(connectionString);
    var organisationservice = new OrganizationService(crmConnection);
    _OrgServiceList.Add(organisationservice);
 }

How I'm creating my tasks:

private void ImportNewBatch(List<Customer> dataSet)
{
    // Pick the next service from the pool, round-robin style.
    var service = _OrgServiceList[_CurrentServicePosition];
    _CurrentServicePosition++;

    if (_CurrentServicePosition >= _OrgServiceList.Count)
        _CurrentServicePosition = 0;

    // Start a new task for this batch and keep track of it.
    var aTask = new Task(() => WorkerThread(dataSet, service), TaskCreationOptions.None);
    aTask.Start();
    _RunningThreads.Add(aTask);

    // Throttle: once _MaxThreads tasks are in flight, wait for at least one to finish.
    if (_RunningThreads.Count >= _MaxThreads)
        Task.WaitAny(_RunningThreads.ToArray());

    _RunningThreads.RemoveAll(t => t.IsCompleted);
}

The ExecuteMultiple request:

var requestWithResults = new ExecuteMultipleRequest()
{
    Settings = new ExecuteMultipleSettings()
    {
        ContinueOnError = true,
        ReturnResponses = true
    },
    Requests = new OrganizationRequestCollection()
};
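
For completeness, this is roughly how each batch gets filled and sent; the existingIdsByKey dictionary and the MapToEntity helper below are simplified stand-ins for the actual mapping code, not copied from the project:

// Sketch only: decide Create vs Update per record based on the key lookup from step (1).
foreach (var customer in dataSet)
{
    var entity = MapToEntity(customer);   // hypothetical mapping helper

    Guid existingId;
    if (existingIdsByKey.TryGetValue(customer.Key, out existingId))
    {
        entity.Id = existingId;
        requestWithResults.Requests.Add(new UpdateRequest { Target = entity });
    }
    else
    {
        requestWithResults.Requests.Add(new CreateRequest { Target = entity });
    }
}

// One round trip to CRM for the whole batch.
var response = (ExecuteMultipleResponse)service.Execute(requestWithResults);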

Update 1: I got a semi-unofficial answer regarding CRM Online from Microsoft, which might be interesting to others as well:

Upvotes: 1

Views: 5191

Answers (4)

user5278381

Reputation: 11

I must disagree with Daryl's comment about 2 threads; in on-premise CRM the 2-thread limit does not apply, not per company or anything. I have used ExecuteMultiple with up to 50 threads for the same company. But you have to be careful not to send multiple ExecuteMultiple requests through the same OrganizationProxy. As the MSDN documentation states, organization proxies are not thread-safe. The easiest way to make sure you are on separate proxies is to build an organization proxy pool. Perhaps the trick here is that you then have multiple web server sessions.

Using this technique I can saturate the CPU of a 16-core CRM machine and push the inserts per second until my limitation is the disk of the SQL server. I did this using a CRM local entity, i.e. a new entity created in CRM taking all the defaults. Using this technique I achieved 3,900 inserts per second using two 16-core CRM front-end (FE) machines. This was without tweaking the .NET thread pool.

The neglected whitepaper, Microsoft Dynamics CRM 2011 Data Load Performance and Scalability Case Study, discusses the use of 10 threads per server. But I have used more than 10 threads with ExecuteMultiple against the same server and the same organization, to the point where I had 97% CPU use on multiple servers.
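
A minimal sketch of such a pool, assuming one service instance per worker task; connectionString, batches and the ExecuteBatch helper are placeholders for illustration, not code from this answer:

// Sketch: one service per worker, so a proxy is never shared between threads.
var services = Enumerable.Range(0, workerCount)
    .Select(_ => new OrganizationService(CrmConnection.Parse(connectionString)))
    .ToList();

// One long-running task per service; each task works through its own share of the batches,
// which guarantees a given service is only ever used by one thread at a time.
var tasks = services.Select((svc, worker) =>
    Task.Factory.StartNew(() =>
    {
        for (int b = worker; b < batches.Count; b += services.Count)
            ExecuteBatch(svc, batches[b]);   // hypothetical helper that sends one ExecuteMultipleRequest
    },
    CancellationToken.None, TaskCreationOptions.LongRunning, TaskScheduler.Default)).ToArray();

Task.WaitAll(tasks);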

Upvotes: 1

Nicknow

Reputation: 7224

You will always be limited by the performance that Dynamics CRM Online makes available to your instance. I know that in on-premise scenarios I've been able to achieve tens of thousands of inserts per second (normal records with no plugins/workflows running).

I wouldn't try to multi-thread inside your code; I would use the Balanced Data Distributor to achieve the desired functionality. It's a bit of a pain, because you have to duplicate the destination, but it works.

You can read some highlights about using the Balanced Data Distributor with CRM at Sonoma Partners' blog. I would copy it here, but there is not much code and it is mostly images.

Upvotes: 4

Daryl

Reputation: 18895

You're limited to two concurrent ExecuteMultipleRequests within CRM, so there is no point in attempting to use more than two threads.

Make sure you maximize the number of requests that are performed in a single ExecuteMultipleRequest call. From your explanation of what you're doing, it sounds like you are only sending a single update/create request to CRM at a time.

I'd also check to make sure there aren't any plugins being triggered as a result of your import.

I'm also a little unsure about your creating the Organization Service connections beforehand. If you put 1000 updates in a single request (at 1 per second), it'll take almost 17 minutes for that single request to complete. Creating a new service takes sub-second time, so you might as well create a new service for each request, just to make sure you don't have a multi-threading issue, since the OrganizationService is not thread-safe.
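
A rough sketch of that idea, assuming a hypothetical BuildRequests helper that turns a chunk of source rows into create/update requests (ExecuteMultiple accepts at most 1000 requests per call); allCustomers and connectionString are placeholders:

// Sketch: pack up to 1000 requests per ExecuteMultipleRequest and use a fresh service per batch.
const int batchSize = 1000;

for (int offset = 0; offset < allCustomers.Count; offset += batchSize)
{
    var chunk = allCustomers.Skip(offset).Take(batchSize).ToList();

    // A new service per batch avoids sharing a non-thread-safe proxy between threads.
    var service = new OrganizationService(CrmConnection.Parse(connectionString));

    var batchRequest = new ExecuteMultipleRequest
    {
        Settings = new ExecuteMultipleSettings { ContinueOnError = true, ReturnResponses = false },
        Requests = BuildRequests(chunk)   // hypothetical helper returning an OrganizationRequestCollection
    };

    service.Execute(batchRequest);
}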

Upvotes: 3

VMAtm

Reputation: 28355

I can't easily find information about multithreaded usage of the Script Component in SSIS, so this is only an attempt that might help you.

This is how you do the Task creation:

var aTask = new Task(() => WorkerThread(dataSet, service), TaskCreationOptions.None);
aTask.Start();

Right here you only provide the delegate to execute and no other information. The Script Component may be single-threaded by default, so all of your tasks may end up executing right on your main thread, without using the ThreadPool.

Consider creating the Task with TaskScheduler.Default provided explicitly, as it will use all the resources it can to achieve the result quickly:

Task.Factory.StartNew(() => WorkerThread(dataSet, service), CancellationToken.None, TaskCreationOptions.None, TaskScheduler.Default);

Other issues in your code:

if (_RunningThreads.Count >= _MaxThreads)
    Task.WaitAny(_RunningThreads.ToArray());

This is bad practice. By default, the ThreadPool knows better than you whether it should start another task or not.

aTask.Start();

Consider using async/await here, as it will be more efficient for your code.
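
One possible shape of that, as a sketch only; batches and ProcessBatch are hypothetical names, and this has to live inside an async method:

// Sketch: let the ThreadPool schedule the work and await completion of all batches,
// instead of starting tasks manually and blocking on Task.WaitAny.
var tasks = batches
    .Select(batch => Task.Run(() => ProcessBatch(batch)))   // ProcessBatch is a hypothetical worker
    .ToList();

await Task.WhenAll(tasks);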


Update: I think that the ThreadPool won't start more simultaneous tasks than the number of processors. You can easily check it on your machine simply by examining its internal properties, but as far as I remember, it is around 4 at any given moment.

According to MSDN:

The number of operations that can be queued to the thread pool is limited only by available memory; however, the thread pool limits the number of threads that can be active in the process simultaneously.
Beginning with the .NET Framework 4, the default size of the thread pool for a process depends on several factors, such as the size of the virtual address space. A process can call the GetMaxThreads method to determine the number of threads.
You can control the maximum number of threads by using the GetMaxThreads and SetMaxThreads methods.
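
If you want to check those limits on your own machine, a quick way (not part of the MSDN quote) is:

// Query the current ThreadPool limits for this process.
int maxWorkers, maxIo;
ThreadPool.GetMaxThreads(out maxWorkers, out maxIo);

int minWorkers, minIo;
ThreadPool.GetMinThreads(out minWorkers, out minIo);

Console.WriteLine("Max worker threads: {0}, max I/O threads: {1}", maxWorkers, maxIo);
Console.WriteLine("Min worker threads: {0}, min I/O threads: {1}", minWorkers, minIo);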

Upvotes: 1
