Soony
Soony

Reputation: 913

Using First() get the 2nd item of LINQ result?

I'm new to Linq and EntityFramework. This is a sample program I met while learning them.

The data in table is like this:

BlogId  Title
1       Hello Blog
2       New Blog
3       New Blog

I have the following Linq code, trying to read the first blog id(expected to be 2):

var name = "New Blog";
var blogs = (from b in db.Blogs
                where b.Title == name
                orderby b.Title
                select b);//.ToList();

Console.Write("The first id: ");
Console.WriteLine(blogs.First().BlogId);

The result comes out to be 3.

Then I use ToList():

var blogs = (from b in db.Blogs
                where b.Title == name
                orderby b.Title
                select b).ToList();
Console.Write("The first id: ");
Console.WriteLine(blogs.First().BlogId);

The result comes out to be 2.

Can anyone help to explain this? Or is this a bug?

Thanks.

//////////////////////// UPDATE /////////////////////////////

I just deleted the data in the database and inserted some new items. Now the table is like this:

BlogId  Title
5       New Blog
6       New Blog
7       New Blog
8       New Blog

Then I ran the program above(Not with ToList()), the First() method returns the id 6 So I assume the method always returns the 2nd item in the situation above. And it doesn't seem to have anything to do with the RDBMS. Can anyone explain?

Thanks.

/////////////////////////////////////////////////////

FYI, the following is the whole .cs file:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

using System.Data.Entity;
using System.ComponentModel.DataAnnotations;

namespace SampleNew
{
    class Program
    {
        public class Blog
        {
            [Key]
            public Int32 BlogId { get; set; }
            public String Title { get; set; }
            public virtual List<Post> Posts { get; set; } 
        }

        public class Post
        {
            [Key]
            public Int32 PostId { get; set; }
            public String Title{ get; set; }

            public String Content { get; set; }
        }

        public class BlogContext : DbContext
        {
            public DbSet<Blog> Blogs{ get; set; }

            public DbSet<Post> Posts { get; set; }
        }

        static void Main(string[] args)
        {
            using (var db = new BlogContext())
            {
                // Create and save a new Blog 
                // Console.Write("Enter a name for a new Blog: ");
                var name = "New Blog";

                //var blog = new Blog { Title = name };
                var blogs = (from b in db.Blogs
                             where b.Title == name
                             orderby b.Title
                             select b).ToList();

                Console.Write("The first id: ");
                Console.WriteLine(blogs.First().BlogId);

                Console.WriteLine(blogs.Count());

                Blog blog = null;
                foreach (Blog b in blogs)
                {
                    blog = b;
                    Console.WriteLine(blog.BlogId);
                }
                Console.WriteLine(blog.BlogId);

                Console.WriteLine(blogs.First().BlogId);
                Console.WriteLine(blogs.First().BlogId);

                Console.WriteLine(blogs.Last().BlogId);
                Console.WriteLine(blogs.Last().BlogId);


                blog.Posts = new List<Post>();
                var post = new Post { Content = "Test Content2", Title = "Test Title2"};
                blog.Posts.Add(post);

                db.Posts.Add(post);
                db.SaveChanges();

                // Display all Blogs from the database 
                var query = from b in db.Blogs
                            orderby b.Title
                            select b;

                Console.WriteLine("All blogs in the database:");
                foreach (var item in query)
                {
                    Console.WriteLine(item.Title);
                }

                Console.WriteLine("Press any key to exit...");
                Console.ReadKey();
            } 
        }
    }
}

Upvotes: 1

Views: 1421

Answers (2)

Sergey Kalinichenko
Sergey Kalinichenko

Reputation: 726479

You've got two identical titles there, but with different IDs. Your RDBMS has the flexibility of returning the rows that correspond to your 'New Blog' in any order that it wishes, because your code does not specify anything beyond the requirement to order by the title. Moreover, it is not even required to return results in the same order each time that you run the same query.

If you would like predictable results, add a "then by" to your LINQ statement to force the ordering that you wish to have:

var query = from b in db.Blogs
    orderby b.Title, b.BlogId
    select b;

EDIT :

When I ran the program above, the First() method returns the id 6 so I assume the method always returns the 2nd item in the situation above. And it doesn't seem to have anything to do with the RDBMS. Can anyone explain?

That's also happening in RDBMS, and it is reproducible without LINQ. Here is a small demo (link to sqlfiddle):

create table blogs(blogid int,title varchar(20));
insert into blogs(blogid,title) values (5,'New blog');
insert into blogs(blogid,title) values (6,'New blog');
insert into blogs(blogid,title) values (7,'New blog');
insert into blogs(blogid,title) values (8,'New blog');
SELECT * FROM Blogs ORDER BY Title

This query produces results in "natural" order:

BLOGID  TITLE
------  --------
5       New blog
6       New blog
7       New blog
8       New blog

However, this query, which is what EF runs to get the First() item in RDBMS

SELECT TOP 1 * FROM Blogs ORDER BY Title

returns the second row in natural order:

BLOGID  TITLE
------  --------
6       New blog

It does not mean that it is going to return the same row in other RDBMSs (link to a demo with MySQL returning a different row for the same query), or even in the same RDBMS. It simply demonstrates that LINQ relies on RDBMS for the selection of the row, and the RDBMS returns an arbitrarily selected row.

Upvotes: 10

Rick Love
Rick Love

Reputation: 12780

I suspect the difference comes in the optimizations that are taken by First() without the ToList().

When you call ToList(), the entire ordered list must be created. So it will order everything using an efficient sort algorithm.

However, with First(), it only needs to find the min value. So it can use a much more effecient algorithm that basically goes through the enumerable once and stores the current min object value. (So it will result in the first object of the min value).

This is a different algorithm then sorting the entire list and hence gets a different result.

Update: Also, this being a database, it may be using linq to sql which will produce a different query based on the above description (getting a sorted list vs getting the first with the min value).

Upvotes: 0

Related Questions