Reputation: 2402
I have merged DataTable that consist duplicates in both columns.
DataTable structure: "Worknumber Username"
1234 John
1235 Mike
1235 Mike
1236 Donald
1236 Jack
I need to catch
1236 Donald
1236 Jack
and preferably output it in format
1236 Donald Jack
I have been trying to create Linq query for this purpose for some time now (quite long), but I can't figure out the right way.
Here is what I have so far. This works, I just tested that column names are correct with dataRow.Field<string>("WORKNUMBER")
:
public static DataTable ProjectsInLiinos = new DataTable();
...
MainProcess.MergedTable();
foreach (DataRow dataRow in MainProcess.ProjectsInLiinos.Rows)
{
Console.WriteLine(dataRow.Field<string>("WORKNUMBER") + dataRow.Field<string>("USERNAME"));
}
Here is Linq query:
var queryCustomers = from data in MainProcess.ProjectsInLiinos.AsEnumerable()
where data.Field<string>("WORKNUMBER").Count() > 1 && data.Field<string>("USERNAME").Count() < 1
select data;
It does not produce anything. I can't figure out how I can make statement where first column duplicates are compared to second column duplicates?
Otherwise I could have grouped dataRow.Field<string>("WORKNUMBER") + dataRow.Field<string>("USERNAME")
and then check list for duplicates, but it will produce then:
1235Mike
1235Mike
and in reverse
1234John
1236Donald
1236Jack
Upvotes: 2
Views: 1136
Reputation: 1646
Assuming you have
class MyClass : IEquatable<MyClass> // IEquatable is required to perform Dictinct()
{
public string Worknumber { get; set; }
public string Username { get; set; }
public bool Equals(MyClass other)
{
if (this.Username != other.Username)
return false;
if (this.Worknumber != other.Worknumber)
return false;
return true;
}
public override int GetHashCode() // required for objects comparison
{
int hashWorknumber = Worknumber == null ? 0 : Worknumber.GetHashCode();
int hashUsername = Username == null ? 0 : Username.GetHashCode();
return hashWorknumber ^ hashUsername;
}
}
and a mock like this
List<MyClass> myclasses = new List<MyClass>();
myclasses.Add(new MyClass { Worknumber = "1234", Username = "John" });
myclasses.Add(new MyClass { Worknumber = "1235", Username = "Mike" });
myclasses.Add(new MyClass { Worknumber = "1235", Username = "Mike" });
myclasses.Add(new MyClass { Worknumber = "1236", Username = "Donald" });
myclasses.Add(new MyClass { Worknumber = "1236", Username = "Jack" });
we make a simple lambda query
var result = myclasses
.Distinct()
.GroupBy(e => e.Worknumber) // grouping
.Select(e => new
{
Worknumber = e.Key,
Count = e.Count(),
Users = string.Join(" ", e.Select(p => p.Username)) // joining names
})
.Where(e => e.Count > 1) // filtering duplicates only
.ToList();
Upvotes: 1
Reputation: 23732
as already suggested use grouping by the first column first. Then you can find the different usernames by a second grouping in the Where filter:
void Main()
{
string column1 = "Worknumber";
string column2 = "Username";
DataTable table = new DataTable();
table.Columns.Add(column1, typeof(int));
table.Columns.Add(column2, typeof(string));
table.Rows.Add(1234, "john");
table.Rows.Add(1235, "Mike");
table.Rows.Add(1235, "Mike");
table.Rows.Add(1236, "Donald");
table.Rows.Add(1236, "Jack");
table.Rows.Add(1237, "Val");
table.Rows.Add(1237, "Nick");
var result = table.AsEnumerable()
.GroupBy(x => x.Field<int>(column1))
.Where(g => g.Count() > 1 && g.GroupBy(x => x.Field<string>(column2)).Count() > 1)
.Select(x => new {WorkNumber = x.Key, AllUsers = string.Join(" ", x.Select(u => u.Field<string>(column2)))}).Dump();
}
I used an anonymous type here, but you can easily write a custom class with an appropriate constructor to fit in the user name values.
Upvotes: 1
Reputation: 23797
You can use group by:
void Main()
{
DataTable tbl = new DataTable();
tbl.Columns.Add("Id", typeof(int));
tbl.Columns.Add("Name", typeof(string));
tbl.Rows.Add(1234, "John");
tbl.Rows.Add(1235, "Mike");
tbl.Rows.Add(1235, "Mike");
tbl.Rows.Add(1236, "Donald");
tbl.Rows.Add(1236, "Jack");
var dupes = tbl.AsEnumerable()
.GroupBy(t => t.Field<int>("Id"))
.Where(x => x.Select(y => y.Field<string>("Name")).Distinct().Count() > 1)
.Select(x => new
{
Id = x.Key,
Names = string.Join(",", x.Select(y => y.Field<string>("Name")))
});
foreach (var dupe in dupes)
{
Console.WriteLine($"{dupe.Id}:{dupe.Names}");
}
}
I think it would much better to use the source anyway, rather than a DataTable in between.
Upvotes: 1
Reputation: 30565
check this simple demo
public class C
{
public int Worknumber { get; set; }
public string Username { get; set; }
public C(int worknumber, string username)
{
Worknumber = worknumber;
Username = username;
}
}
void Main()
{
var lst = new List<C>(){
new C(1234, "Mike"),
new C(1235, "Mike"),
new C(1235, "Mike"),
new C(1236, "Donald"),
new C(1236, "Jake")
};
var p = lst.GroupBy(t => t.Worknumber)
.Select(t => new { Key = t.Key, l = t.Select(x => x.Username).ToList() })
.Where(x => x.l.Distinct().Count() > 1)
.Select(x => String.Format("{0} {1}", x.Key, string.Join(' ', x.l)))
.Dump();
}
Upvotes: 1
Reputation: 34421
Try following :
DataTable dt = new DataTable();
dt.Columns.Add("Worknumber",typeof(int));
dt.Columns.Add("Username",typeof(string));
dt.Rows.Add(new object[] {1234, "John"});
dt.Rows.Add(new object[] {1235, "Mike"});
dt.Rows.Add(new object[] {1235, "Mike"});
dt.Rows.Add(new object[] {1236, "Donald"});
dt.Rows.Add(new object[] {1236, "Jack"});
var distinct = dt.AsEnumerable()
.GroupBy(x => x.Field<int>("Worknumber"))
.Select(x => new { workNumber = x.Key, user = x.Select(y => y.Field<string>("Username")).Distinct().ToList() })
.Where(x => x.user.Count > 1)
.Select(x => new { workNumber = x.workNumber, user = string.Join(" ", x.user) })
.ToList();
Upvotes: 2