BartAtRanch

Reputation: 127

C#, DataTables, Primary Keys, and Select performance

I have a C# application that holds a large DataTable in memory. I need to call DataTable.Select with a search filter to subset the data. Would adding a primary key to the table improve retrieval performance? Are there other tricks for getting the fastest performance when subsetting a DataTable?

This is what I am doing currently:

using (DataTable datThisProduction = datProduction.Select("IDColumn = '" + strThisID + "'", "DateColumn ASC").CopyToDataTable())

I need to search on the IDColumn and then sort by DataColumn. Should the primary key include both columns?

I looked for other postings about DataTable primary keys, and there seemed to be uncertainty about whether primary keys help.

Upvotes: 2

Views: 1387

Answers (2)

Gottfried Lesigang

Reputation: 67311

According to your comment you deal with "lots of data". Believe me, "1000 items" with about 200 sub-items each is not "lots of data" :-)

If performance is an issue with so little data, your problems are somewhere else...

You are mixing two concepts

Talking about DataTable and primary keys is really a database topic. This is best done within your RDBMS (SQL Server, as you write). Create a clean structure there, implement the right indexes, and you won't have any performance problems...

One should not try to do the database's job in application code...

Multi-Layer

It is something else if you transfer your datarows into business objects. There are very specialised collections (e.g. KeyedCollection) which are created to implement exactly this need. But you should - for cleanness! - have a clean cut between your data layer and your business objects. Normally we do not speak about objects from System.Data-namespace there...
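As a rough illustration of the KeyedCollection idea, here is a minimal sketch. The ProductionItem class, its properties, and the sample ID "A-123" are hypothetical and only stand in for whatever business object the rows would be mapped to; the ID is assumed to be a string because the question's filter quotes it.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical business object: one item plus the dates of its sub-items.
public sealed class ProductionItem
{
    public string Id { get; }
    public List<DateTime> Dates { get; } = new List<DateTime>();

    public ProductionItem(string id) { Id = id; }
}

// KeyedCollection derives the key from the item itself and keeps an internal
// dictionary, so a lookup by key does not have to scan every item.
public sealed class ProductionItemCollection : KeyedCollection<string, ProductionItem>
{
    protected override string GetKeyForItem(ProductionItem item)
    {
        return item.Id;
    }
}

public static class KeyedCollectionDemo
{
    // Builds a tiny sample collection (made-up data).
    public static ProductionItemCollection Build()
    {
        var items = new ProductionItemCollection();

        var first = new ProductionItem("A-123");
        first.Dates.Add(new DateTime(2016, 1, 15));
        first.Dates.Add(new DateTime(2016, 3, 1));
        items.Add(first);

        items.Add(new ProductionItem("B-456"));
        return items;
    }
}
```

With this in place, `items.Contains("A-123")` checks for a key and `items["A-123"]` returns the matching item, while the collection still behaves like an ordered list.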

Scalability and multi-user-behaviour

From this point of view, your approach (load everything into the application's memory and deal with the data there) is by far the worst...

Conclusion

Whether "reading data on demand" or a business object layer with permanent objects is better depends on your needs.

You can solve your problem easily with Linq (but Linq is not fast, it's using Reflection), with a specialised collection and even with a primary key within your application - but this is not the best approach.
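If you do stay inside the application, one possible variant is sketched below. It swaps the primary key for a DataView sorted on IDColumn, because the ID repeats across sub-items and so cannot be a unique key on its own. The column names follow the question; the sample rows and the IndexedLookupDemo class are made up. The view is meant to be created once and reused, so that its sort index is not rebuilt for every search.

```csharp
using System;
using System.Data;
using System.Linq;

public static class IndexedLookupDemo
{
    // Small stand-in for the question's datProduction table (made-up rows).
    public static DataTable BuildTable()
    {
        var table = new DataTable("Production");
        table.Columns.Add("IDColumn", typeof(string));
        table.Columns.Add("DateColumn", typeof(DateTime));
        table.Rows.Add("A", new DateTime(2016, 3, 1));
        table.Rows.Add("B", new DateTime(2016, 1, 1));
        table.Rows.Add("A", new DateTime(2016, 1, 15));
        return table;
    }

    // Looks up all rows for one ID through a view sorted on IDColumn,
    // then orders only the (small) matching subset by date.
    public static DateTime[] DatesFor(DataView byId, string id)
    {
        return byId.FindRows(id)
                   .Select(rowView => (DateTime)rowView["DateColumn"])
                   .OrderBy(date => date)
                   .ToArray();
    }
}
```

Usage would look like `var view = new DataView(table) { Sort = "IDColumn" };` followed by `IndexedLookupDemo.DatesFor(view, "A")`. Note that FindRows expects as many key values as there are sort columns, which is why the view is sorted on the ID alone and the date ordering is applied afterwards.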

There is good support in typed datasets, EF and all ORM-tools with relations. In this case your items and your sub-data would be in 1:n-related tables. The generated objects will have methods like "MyParentItem" or "MyChildrenList".
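To make the 1:n idea concrete without a code generator, here is a sketch using an untyped DataSet with a DataRelation. The table names, column names, and the relation name "Item_SubItems" are hypothetical; a typed dataset or an ORM would generate friendlier members on top of the same structure.

```csharp
using System;
using System.Data;

public static class RelationDemo
{
    // Builds a parent table (Items) and a child table (SubItems) linked 1:n.
    public static DataSet Build()
    {
        var ds = new DataSet();

        DataTable items = ds.Tables.Add("Items");
        items.Columns.Add("ID", typeof(string));
        items.PrimaryKey = new[] { items.Columns["ID"] };

        DataTable subItems = ds.Tables.Add("SubItems");
        subItems.Columns.Add("ItemID", typeof(string));
        subItems.Columns.Add("Date", typeof(DateTime));

        // The relation links each sub-item to its parent item by ID.
        ds.Relations.Add("Item_SubItems", items.Columns["ID"], subItems.Columns["ItemID"]);

        items.Rows.Add("A");
        items.Rows.Add("B");
        subItems.Rows.Add("A", new DateTime(2016, 1, 15));
        subItems.Rows.Add("A", new DateTime(2016, 3, 1));
        subItems.Rows.Add("B", new DateTime(2016, 2, 1));
        return ds;
    }

    // Finds the parent by primary key, then follows the relation to its children.
    public static DataRow[] ChildrenOf(DataSet ds, string id)
    {
        DataRow parent = ds.Tables["Items"].Rows.Find(id);
        return parent == null ? new DataRow[0] : parent.GetChildRows("Item_SubItems");
    }
}
```

Here the primary key sits on the parent table, where the ID really is unique, and the sub-items are reached through the relation instead of a filter string.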

Upvotes: 0

jdweng

Reputation: 34421

Pretty simple with Linq

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;

namespace ConsoleApplication57
{
    class Program
    {
        static void Main(string[] args)
        {
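            // Needs a reference to System.Data.DataSetExtensions
            // (AsEnumerable, Field<T>, CopyToDataTable).
            DataTable ThisProduction = new DataTable();
            ThisProduction.Columns.Add("IDColumn", typeof(int));
            ThisProduction.Columns.Add("DataColumn", typeof(int));
            // Sample rows: CopyToDataTable throws if the query returns no rows.
            ThisProduction.Rows.Add(123, 2);
            ThisProduction.Rows.Add(123, 1);
            int strThisID = 123;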

            DataTable datThisProduction = ThisProduction.AsEnumerable()
                .Where(x => x.Field<int>("IDColumn") == strThisID)
                .OrderBy(y => y.Field<int>("DataColumn"))
                .CopyToDataTable();

        }
    }
}

Upvotes: 1
