I had a problem with a piece of software I wrote quite a long time ago. When it was first implemented data volumes were low and it worked fine, but recently the volume of data it is expected to process has grown beyond all expectations, so it was time to revisit the code and see if I could speed things up.
Basically all the software does is send a travel alert email out to people on a mailing list twice a day, at a time they specify in 30 minute blocks. The old software worked in a pretty sequential way: first get a list of all the people expecting the email during that time slot, then work through the list, generating each email from a template, moving it to the correct sending queue and sending it, before moving on to the next one. Unfortunately the volume of traffic now meant that some 30 minute slots took longer than 30 minutes to get all the emails out, so people in the next slots were getting theirs much later than they wanted, at which point they were no longer useful.
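Just to make the shape of the problem clearer, the old process looked roughly like this (the method names here are only illustrative, not the real ones):

// rough sketch of the original sequential process (illustrative names only)
foreach (var recipient in GetRecipientsForSlot(currentSlot))
{
    var email = BuildEmailFromTemplate(recipient); // generate the email from the template
    MoveToSendingQueue(email);                     // move it to the correct sending queue
    Send(email);                                   // send it before touching the next recipient
}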
My first idea was to split the process in two: one process to generate the emails and a second to send them. My first cut didn't really show much of a speed increase though; instead of an email roughly every 4 seconds it was now an email every 3 seconds, so while it might speed things up a bit it wasn't great.
Next I thought about multithreading, and fortunately in .NET 4 and above this is really easy. When I implemented it on a test VM the throughput went from 20 emails a minute to 20 emails in 4 seconds, which to me is a big enough speed increase to make it worthwhile.
Here is my final code:-
using (_serviceProxy = new OrganizationServiceProxy(crmURI, null, clientCredentials, null))
{
    _serviceProxy.EnableProxyTypes();
    _service = (IOrganizationService)_serviceProxy;

    // do some stuff here like get a list of records to process

    int maxThread = 10;                               // decide on how many simultaneous tasks I want
    SemaphoreSlim sem = new SemaphoreSlim(maxThread); // initialise a semaphore with that many slots

    foreach (_toprocess emailtosend in ReadWriteCRM.RecordsToProcess)
    {
        sem.Wait(); // if all my slots are in use, wait until one becomes available

        // then start a new task, call the MakeMessage function with 2 parameters,
        // and release the semaphore slot when the function completes
        Task.Factory.StartNew(() => MakeMessage(emailtosend, _service))
            .ContinueWith(t => sem.Release());
    }

    // this is the important bit: because of the using statement, if this is omitted each
    // remaining task will fail because the IOrganizationService will no longer be available
    while (sem.CurrentCount != maxThread)
    {
        // CurrentCount is the number of free slots, so once all the tasks have finished
        // it will equal the number you set in maxThread earlier
        Thread.Sleep(2); // let the rest of the system catch up
    }
}

private void MakeMessage(_toprocess emailtosend, IOrganizationService _service)
{
    // do stuff here
    ReadWriteCRM.CreateNewEmailFromTemplate(_service, emailtosend);
}
Using the SemaphoreSlim class makes the whole process painless, as it easily allows you to decide in advance how many simultaneous tasks you want to run. In the final code I moved this value into the configuration file so I can tweak it until I am happy with the balance during final testing.
int maxThread = 10; // decide on how many threads I want to process
SemaphoreSlim sem = new SemaphoreSlim(maxThread); // initialise a semaphore
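In the final version I read this value from the app.config instead. Something along these lines works, where the "MaxSendThreads" key name is just my placeholder:

// requires a reference to System.Configuration; "MaxSendThreads" is an assumed key name
int maxThread;
if (!int.TryParse(ConfigurationManager.AppSettings["MaxSendThreads"], out maxThread))
{
    maxThread = 10; // fall back to a sensible default if the setting is missing
}
SemaphoreSlim sem = new SemaphoreSlim(maxThread);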
Next, inside the actual processing loop, I added a Wait call; this pauses the loop until a free task slot becomes available.
sem.Wait();
Then, once a slot is available, I use Task.Factory.StartNew to create a new copy of the function that performs all the work.
Task.Factory.StartNew(() => MakeMessage(emailtosend, _service)).ContinueWith(t => sem.Release());
This starts the function, passes 2 parameters to it, and then releases the semaphore slot when the function is done so it can be used again by another copy.
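As a variation (just a sketch, not what ended up in my final code) you can put the Release inside a try/finally in the task body, which makes it a bit more obvious that the slot is freed even if MakeMessage throws. The ContinueWith version above also runs when the task faults, so both behave the same way:

Task.Factory.StartNew(() =>
{
    try
    {
        MakeMessage(emailtosend, _service); // do the work
    }
    finally
    {
        sem.Release(); // the slot is freed even if MakeMessage throws
    }
});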
Initially, once I had this working and was testing it, it threw errors telling me the IOrganizationService had been disposed.
Cannot access a disposed object.
Object name: 'System.ServiceModel.ChannelFactory`1[Microsoft.Xrm.Sdk.IOrganizationService]'.
This took a little bit of head scratching, as sometimes it would run with no errors and other times it would fail. Eventually I realised that, because I was creating the IOrganizationService inside a using statement, a lot of the time the tasks would still be running silently in the background when the using{} block ended and disposed of the service proxy, especially for the final batch of tasks. I could have removed the using{} statement altogether and relied on the C# clean-up to get rid of it, but instead I added the following at the end:
while (sem.CurrentCount != maxThread) { Thread.Sleep(2); }
This waits until all the threads are finished before continuing. sem.CurrentCount shows the number of slots available out of the original number you set in maxThread, so if you set a pool size of 10 you just have to wait until sem.CurrentCount == 10 again before you let the using{} statement scope close.
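An alternative to polling CurrentCount is to keep hold of the tasks and wait on them directly; this is just a sketch of the same idea using Task.WaitAll:

var tasks = new List<Task>();
foreach (_toprocess emailtosend in ReadWriteCRM.RecordsToProcess)
{
    sem.Wait();
    tasks.Add(Task.Factory.StartNew(() => MakeMessage(emailtosend, _service))
                  .ContinueWith(t => sem.Release()));
}
Task.WaitAll(tasks.ToArray()); // blocks until every email task has completed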
So far in testing this has provided a huge speed increase with very little effort.