Broken
Broken

Reputation: 31

Fastest way to update all rows to have same value in one column in datatable, without loop in C#

I have datatable "users" and column "is_published" in it. I have about 100k rows. What is the fastest way to update value in the column, so the whole rows in column have same value = 1.

I try with classic foreach loop and it't slow, also I try with LINQ :

dsData.Tables["users"].Select().ToList().ForEach(x => x["is_published"] = 1;);

and it still isn't fast enough.

Also variant wit Expression doesn't work for me, because after that fields is ReadOnly and I can't change value again.

This is C#.

Upvotes: 3

Views: 5649

Answers (3)

Broken
Broken

Reputation: 31

Finaly! I speed up update so it takes 2-3 sec. I added BeginLoadData() and EndLoadData()

    DataTable dt = ToDataSet().Tables["users"];
    var sb = new StringBuilder();
    var xmlWriter = XmlWriter.Create(sb);
    dt.WriteXml(xmlWriter);
    var xml = XDocument.Parse(sb.ToString());
    xml.Descendants("is_published").ToList().ForEach(e => e.Value = "1");
    dt.Clear();
    dt.BeginLoadData();
    dt.ReadXml(xml.CreateReader());
    dt.EndLoadData();

Upvotes: 0

keenthinker
keenthinker

Reputation: 7830

Separate the select and the update into two operations. Skip the ToList() operation and instead iterate afterwards over the IEnumerable collection using forEach and update the value:

var rows = dsData.Tables["users"].Select();
forEach(var row in rows)
{
    row["is_published"] = 1;
}

The ToList forces an immediate query evaluation which in this case acts as a copy of all items from the IEnumerable collection, so you can gain some speed here. I ran some tests and the result in this case is (using your code and the modification): ToList is 3 times slower than iterating over IEnumerable and single update!

IMO 40 seconds is an awful lot for 100K items. If your DataTable is bound to a DataGridView or some other UI control, i believe that the update of the GUI is taking so long and not the update of the values itself. In my tests the update using ToList took fractions of a second (on my simple Lenovo netbook with AMD E-450 processor, and i assume you are not using a 386 machine). Try suspending the UI bevor updating and refreshing the values and then enable it again - example in this SO post.


My original post (as i can see you gained some speed using the code - interesting):

More an experiment for my part, but it is possible to:

  • convert the table to XML
  • fetch all elements that should be changed
  • change them
  • write the changed XML back to the table

The code:

// temp table
var dataTable = new DataTable("Table 1");
dataTable.Columns.Add("title", typeof(string));
dataTable.Columns.Add("number", typeof(int));
dataTable.Columns.Add("subnum1", typeof(int));
dataTable.Columns.Add("subnum2", typeof(int));
// add temp data
Enumerable.Range(1, 100000).ToList().ForEach(e =>
{
    dataTable.Rows.Add(new object[] { "A", 1, 2, 3 });
});
// "bulk update"!
var sb = new StringBuilder();
var xmlWriter = XmlWriter.Create(sb);
dataTable.WriteXml(xmlWriter);
var xml = XDocument.Parse(sb.ToString());
// take column to change
var elementsToChange = xml.Descendants("title").ToList();
// the list is referenced to the XML, so the XML is changed too!
elementsToChange.ForEach(e => e.Value = "Z");
// clear current table
dataTable.Clear();
// write changed data back to table
dataTable.ReadXml(xml.CreateReader());

The table is updated. IMO the parts that make this solution slow are the

  • convertion from and to XML
  • and the fill of the StringBuilder

The other way around the pure update of the list is probably faster than the table update.

Upvotes: 0

sihirbazzz
sihirbazzz

Reputation: 718

when you create your table you can simply push a default value to your column..

 DataTable dt = new DataTable();
dt.Columns["is_published"].DataType = System.Int32;
dt.Columns["is_Published"].DefaultValue = 1;

then when you need to change the rows to default value ( or will you need? )

// Say your user selects the row which its index is 2..
// The ItemArray gives the selectedRow's cells as object..
// And say your columns index no is 5..
dt.Rows[2].ItemArray[5] = default ;

or

dt.Rows[2].ItemArray[5] = dt.Columns["is_published"].DefaultValue;

Upvotes: 1

Related Questions