Reputation: 2443
(Please see the answer I wrote for more understanding of the situation.)
Below is a query which works great, operating on selected rows of table STUDENTS. Then one edit destroys the query
variable. What's wrong here?
students
is rows selected from an import Datatable defined in part by:
importTable.Columns.Add("SECTION", typeof(string));
importTable.Columns.Add("NUMBER", typeof(string));
importTable.Columns.Add("ID", typeof(string));
(Because the DataTable is untyped, I need to cast the data into string to use the fields).
Then called by:
IEnumerable<DataRow> s = importTable.AsEnumerable();
IEnumerable<DataRow> t = s
.OrderBy(r => r["HALL"]);
IEnumerable<DataRow> sortedTable = t
.OrderBy(r =>
{ //if (r["ID"] is DBNull)
// return "";
//else
return r["ID"]; // ERROR
});
IEnumerable<DataRow> tue = sortedTable.Where(r => r["DAY"].Equals("TUE"));
IEnumerable<DataRow> wed = sortedTable.Where(r => r["DAY"].Equals("WED"));
AssignSections(tue);
AssignSections(wed);
Here is the query:
public void AssignSections(IEnumerable<DataRow> students)
{
IEnumerable<IEnumerable<DataRow>> query = from e in students.AsEnumerable()
orderby (e["SHORTSCHOOL"] as string).PadRight(30) + e["SEED"] as string
group e by new { DAY=e["DAY"], GRADE=e["GRADE"] } into g
orderby g.Key.GRADE as string
select g.AsEnumerable();
var queryList = query.ToList(); // ArgumentException during "WED" call
foreach (var grade in query)
foreach (var student in grade)
if (student["ID"] == DBNull.Value)
{
student["SECTION"] = "S";
student["ID"] = "ID1";
}
}
Assigning SECTION works, NO PROBLEM. Assigning ID causes query
to look like:
query
now appears invalid. Future uses of query
also prove to be invalid (though the foreach finishes fine).
For what it's worth, grade
is just fine, but students
is also invalidated through the original table seems to be fine as well.
Upvotes: 2
Views: 244
Reputation: 205629
No magic here. It's a combination of LINQ query Deferred Execution and the usage of the DBNull
, which cannot be compared to/from other types.
The deferred execution has been explained many times, so I'm not going to spend time on it. Shorty, the query is executed the only (but anytime) when enumerated. Enumerating means foreach
, ToList
etc. and technically speaking happens when the GetEnumerator of the enumerable (or the first MoveNext of the enumerator) is called.
All you need to remember from the above is that the IEnumerable<T>
(or IQueryable<T>
) returning LINQ queries are not executed (evaluated) at the time you define them, but every time you enumerate them (directly or indirectly). This should explain the "The answer surprisingly to me is that LINQ reorders code" part from your own answer. No, LINQ does not reorder the code, it's your code which is doing that by reevaluting the LINQ queries at certain points which are different from the place where you define your query variables. If you want to evaluate them just once at specific point, then do that by adding ToList
, ToArray
and similar methods which enumerate the query and store the result in some in memory collection and use that collection for further processing. It still be IEnumerable<T>
, but further enumerations would enumerate the query result rather than reevaluate the query.
The main issue is the DBNull
. From your explanations looks like initially all the ID
values are DBNull
, so the first query runs fine (DBNull
knows how to compare to itself :). Once the source contains at least one value which is not DBNull
, any further query that uses OrderBy
that column with the default IComparer
will fail.
It can easily be reproduced w/o data tables with the following simple code:
var data = new[]
{
new { Id = (object)DBNull.Value },
new { Id = (object)DBNull.Value }
};
var query = data.OrderBy(e => e.Id);
query.ToList(); // Success
data[1] = new { Id = (object)"whatever" };
query.ToList(); // Fail
showing the deferred query execution and reevaluation, or directly (to prove that the problem is not with editing):
new[]
{
new { Id = (object)DBNull.Value },
new { Id = (object)"whatever" }
}
.OrderBy(e => e.Id)
.ToList(); // Fail
The solution is to avoid DBNull
at all. The easiest (and much better than as string
or ToString()
) with DataTable
is to use DataRowExtensions.Field extension methods instead of object
returning indexer, which besides providing strongly typed access to the columns also automatically handle DBNull
s for you (converts them to null
when you request string
or nullable type), so you won't experience such issues.
It can be proved by changing your problematic code to
.OrderBy(r => r.Field<string>("ID"))
and the problem will be gone. I strongly recommend doing that for other column accessors as well.
Upvotes: 4
Reputation: 7618
DBNull and null is not the same...
As your original error message says "Object must be of type string" (to be assigned to a string)
DBNull can't be cast to a string,it is a class...
You need to handle this case in your code.
See this link for a simple helper method:
Unable to cast object of type 'System.DBNull' to type 'System.String
using System;
namespace ConsoleApp1
{
class Program
{
static void Main(string[] args)
{
DBNull dbNull = DBNull.Value;
Console.WriteLine(typeof(string).IsAssignableFrom(typeof(DBNull)));//False
Console.WriteLine(dbNull is string); //False
//Console.WriteLine((string)dbNull); // compile time error
//Console.WriteLine(dbNull as string); // compile time error
Console.ReadLine();
}
}
}
Also, make sure you read how "Lazy Loading"/"Deferred Execution" works on LINQ/IEnumerable.
You don't have to use IEnumerable all the time,especially if you are not sure how it works.
Upvotes: 0
Reputation: 2443
The answer surprisingly to me is that LINQ reorders code. The context was this:
IEnumerable<DataRow> s = importTable.AsEnumerable();
IEnumerable<DataRow> t = s
.OrderBy(r => r["HALL"]);
IEnumerable<DataRow> sortedTable = t
.OrderBy(r =>
{ //if (r["ID"] is DBNull)
// return "";
//else
return r["ID"]; // ERROR
});
IEnumerable<DataRow> tue = sortedTable.Where(r => r["DAY"].Equals("TUE"));
IEnumerable<DataRow> wed = sortedTable.Where(r => r["DAY"].Equals("WED"));
AssignSections(tue);
AssignSections(wed);
The 3 commented lines indicate the fault. And what happened: sortedTable was partially initialized in order to feed the Where clause for initializing tue. But then the sortedTable was completed to initialize wed AFTER the call to assign wed appeared in the code, but just in time to use wed in the query constructed in AssignSections!
So the ERROR arose during AssignSections, when the code detoured to completing the initializing of sortedTable, and I could detect this by adding the 3 disabled lines and setting a breakpoint on the "return "";
Magic?
Upvotes: 0