ManInMoon

Reputation: 7005

Why is the second approach slower when looping through a DataTable column?

I have a very large DataTable (~4 million rows).

I need to calculate columns in the table. If I process the entire column in one method (Go1), it is faster than Go2, where I loop through the rows and call a method for each row.

I need to use the Go2 approach because later I need to add more rows to the table and update all columns.

But why is the Go2 approach slower? Is it just the overhead of calling ProcessRow() each time?

Is there a workaround?

public static void AddSignal()
{
    foreach (DataRow row in Data.Rows)
    {
        row[x] = (invertSignal ? -1:1)*Math.Sign(row.Field<double>(y) - row.Field<double>(y));
    }
}

public class ByRowAddSignal
{
    DataRow row;

    public ByRowAddSignal()
    {

    }

    public void ProcessRow(int r)
    {
        row = Data.Rows[r];
        row[x] = (invertSignal ? -1 : 1) * Math.Sign(row.Field<double>(y) - row.Field<double>(y));
    }
}

public static DataTable Data;

public void Go1()
{
    Data = LoadData();

    AddSignal();
}

public void Go2()
{
    Data = LoadData();

    ByRowAddSignal byRowAddSignal = new ByRowAddSignal();

    for (int r = 0; r < Data.Rows.Count; r++)
    {
        byRowAddSignal.ProcessRow(r);
    }
}

Upvotes: 0

Views: 236

Answers (1)

Titian Cernicova-Dragomir

Reputation: 249606

Looking at the code for DataRowCollection we find the following:

public DataRow this[int index]
{
    get
    {
        return ((RBTree<DataRow>)this.list)[index];
    }
}

RBTree<K> is actually a tree, not an array-backed list, so indexing into it is not a constant-time operation: every indexer call has to walk down from the root to the appropriate element. The code from RBTree<K> shows this:

public K this[int index]
{
    get
    {
        return this.Key(this.GetNodeByIndex(index).NodeID);
    }
}
private NodePath GetNodeByIndex(int userIndex)
{
    int num;
    int mainTreeNodeID = default(int);
    if (this._inUseSatelliteTreeCount == 0)
    {
        num = this.ComputeNodeByIndex(this.root, userIndex + 1);
        mainTreeNodeID = 0;
    }
    else
    {
        num = this.ComputeNodeByIndex(userIndex, out mainTreeNodeID);
    }
    if (num == 0)
    {
        if (TreeAccessMethod.INDEX_ONLY == this._accessMethod)
        {
            throw ExceptionBuilder.RowOutOfRange(userIndex);
        }
        throw ExceptionBuilder.InternalRBTreeError(RBTreeError.IndexOutOFRangeinGetNodeByIndex);
    }
    return new NodePath(num, mainTreeNodeID);
}
private int ComputeNodeByIndex(int x_id, int index)
{
    while (x_id != 0)
    {
        int num = this.Left(x_id);
        int num2 = this.SubTreeSize(num) + 1;
        if (index < num2)
        {
            x_id = num;
        }
        else
        {
            if (num2 >= index)
            {
                break;
            }
            x_id = this.Right(x_id);
            index -= num2;
        }
    }
    return x_id;
}

Note: code decompiled with ILSpy.
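
As for a workaround, one option that follows from the above is to keep the per-row method but hand it the DataRow itself instead of an index, so that Data.Rows[r] is never called inside the loop. The following is only a minimal sketch under assumptions: the class name SignalDemo, the column names "a", "b" and "signal", and the ProcessRow signature are all made up for illustration and are not from the question. It is worth timing against your real table before relying on it.

```csharp
using System;
using System.Data;

public static class SignalDemo
{
    // Hypothetical per-row processor: it receives the DataRow directly,
    // so no index lookup into DataRowCollection is needed.
    public static void ProcessRow(DataRow row, string target, string a, string b, bool invertSignal)
    {
        row[target] = (invertSignal ? -1 : 1)
            * Math.Sign(row.Field<double>(a) - row.Field<double>(b));
    }

    public static void Main()
    {
        // Tiny stand-in for the real table (assumed column names).
        var data = new DataTable();
        data.Columns.Add("a", typeof(double));
        data.Columns.Add("b", typeof(double));
        data.Columns.Add("signal", typeof(int));
        data.Rows.Add(2.0, 1.0, 0);
        data.Rows.Add(1.0, 3.0, 0);

        // Enumerate the rows instead of indexing with Rows[r].
        foreach (DataRow row in data.Rows)
        {
            ProcessRow(row, "signal", "a", "b", false);
        }

        // Rows added later can be processed the same way, because
        // Rows.Add returns the new DataRow.
        DataRow added = data.Rows.Add(5.0, 5.0, 0);
        ProcessRow(added, "signal", "a", "b", false);

        foreach (DataRow row in data.Rows)
        {
            Console.WriteLine(row["signal"]);
        }
    }
}
```

This keeps the row-at-a-time structure of Go2, so rows appended later can go through the same ProcessRow call without a second pass over the indexer.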

Upvotes: 1
