Reputation: 8681
I am trying to prevent adding an item to a list when it already exists there, in C#. The code below loops through the rows of a DataTable. As you can see, rows is of type List<CubeReportRow>.
The DataTable's rows do contain duplicates, so I need to check whether the rowName from the DataTable is already in the rows object of type List<CubeReportRow>.
Please see the condition that I have set in the foreach loop. When I try to check by rowName, the compiler says it cannot convert string to type CubeReportRow. If I check with if (!rows.Contains(row[0])), there is no compile error, but it doesn't work. How do I check for its existence in the rows collection?
Class CubeReportRow
public class CubeReportRow
{
    public string RowName { get; set; }
    public string RowParagraph { get; set; }
    public int ReportSection { get; set; }
}
C# Method
public virtual IList<CubeReportRow> TransformResults(CubeReport report, DataTable dataTable)
{
    if (dataTable.Rows.Count == 0 || dataTable.Columns.Count == 0)
        return new List<CubeReportRow>();
    var rows = new List<CubeReportRow>();
    var columns = columnTransformer.GetColumns(dataTable);
    foreach (DataRow row in dataTable.Rows)
    {
        var rowName = row[0].ToString();
        if (!rows.Contains(rowName))
        {
            var values =
                cubeReportValueFactory.CreateCubeReportValuesForRow(dataTable, row, rowName, columns, report);
            var reportRow = new CubeReportRow(row[3].ToString(), row[2].ToString(), row[1].ToString(), values);
            rows.Add(reportRow);
        }
    }
    return rows;
}
Upvotes: 0
Views: 1232
Reputation: 81493
This is not really an answer, as I believe Guru Stron's answer is sufficient.
However, there are several ways to do this, and their performance and complexity will differ depending on your data and its duplicate ratio (the list below is not exhaustive).
Dictionary
var rows = new Dictionary<string, CubeReportRow>();
foreach (var dataRow in _data)
    if (!rows.ContainsKey(dataRow.RowName))
        rows.Add(dataRow.RowName, dataRow);
return rows.Values.ToList();
HashSet
var hashSet = new HashSet<string>(_data.Length);
return _data.Where(x => hashSet.Add(x.RowName)).ToList();
GroupBy
return _data.GroupBy(x => x.RowName).Select(x => x.First()).ToList();
IEqualityComparer
public class SomeComparer : IEqualityComparer<CubeReportRow> {
    public bool Equals(CubeReportRow x, CubeReportRow y) {
        return x.RowName == y.RowName;
    }
    public int GetHashCode(CubeReportRow obj) {
        return obj.RowName.GetHashCode();
    }
}
...
return _data.Distinct(new SomeComparer()).ToList();
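As a possible further variant, not part of the benchmark in this answer: on .NET 6 or later, Enumerable.DistinctBy takes a key selector directly, so no comparer class is needed. The sketch below is only an illustration; the DistinctBySketch class and Deduplicate method are hypothetical names, and it was not measured here (the benchmark ran on .NET Core 5.0, where DistinctBy is not available).

```csharp
using System.Collections.Generic;
using System.Linq;

public class CubeReportRow
{
    public string RowName { get; set; }
    public string RowParagraph { get; set; }
    public int ReportSection { get; set; }
}

// Hypothetical helper class, named for illustration only.
public static class DistinctBySketch
{
    // Keeps one element per distinct RowName.
    // Requires .NET 6 or later, where Enumerable.DistinctBy was introduced.
    public static List<CubeReportRow> Deduplicate(IEnumerable<CubeReportRow> data)
    {
        return data.DistinctBy(x => x.RowName).ToList();
    }
}
```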
Config
BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19041.746 (2004/?/20H1)
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=5.0.102
[Host] : .NET Core 5.0.2 (CoreCLR 5.0.220.61120, CoreFX 5.0.220.61120), X64 RyuJIT
.NET Core 5.0 : .NET Core 5.0.2 (CoreCLR 5.0.220.61120, CoreFX 5.0.220.61120), X64 RyuJIT
Job=.NET Core 5.0 Runtime=.NET Core 5.0
Results
Method | Mean | Error | StdDev |
---|---|---|---|
Dictionary | 205.3 us | 4.06 us | 5.69 us |
HashSet | 237.6 us | 4.73 us | 10.19 us |
Distinct | 299.4 us | 5.24 us | 4.90 us |
GroupBy | 451.3 us | 5.28 us | 4.68 us |
Full Test Code
[SimpleJob(RuntimeMoniker.NetCoreApp50)]
public class Test
{
    private CubeReportRow[] _data;

    public class CubeReportRow
    {
        public string RowName { get; set; }
        public string RowParagraph { get; set; }
        public int ReportSection { get; set; }
    }

    [GlobalSetup]
    public void Setup()
    {
        var r = new Random(32);
        _data = new CubeReportRow[10000];
        for (int i = 0; i < 10000; i++)
            _data[i] = new CubeReportRow() {RowName = r.Next(100).ToString()};
    }

    [Benchmark]
    public List<CubeReportRow> Dictionary()
    {
        var rows = new Dictionary<string, CubeReportRow>();
        foreach (var dataRow in _data)
            if (!rows.ContainsKey(dataRow.RowName))
                rows.Add(dataRow.RowName, dataRow);
        return rows.Values.ToList();
    }

    [Benchmark]
    public List<CubeReportRow> HashSet()
    {
        var hashSet = new HashSet<string>(_data.Length);
        return _data.Where(x => hashSet.Add(x.RowName)).ToList();
    }

    public class SomeComparer : IEqualityComparer<CubeReportRow>
    {
        public bool Equals(CubeReportRow x, CubeReportRow y)
        {
            return x.RowName == y.RowName;
        }
        public int GetHashCode(CubeReportRow obj)
        {
            return obj.RowName.GetHashCode();
        }
    }

    [Benchmark]
    public List<CubeReportRow> Distinct()
    {
        return _data.Distinct(new SomeComparer()).ToList();
    }

    [Benchmark]
    public List<CubeReportRow> GroupBy()
    {
        return _data.GroupBy(x => x.RowName).Select(x => x.First()).ToList();
    }
}
Note: If you are interested in performance, run these benchmarks yourself with realistic data.
Upvotes: 2
Reputation: 141990
You can use Dictionary<string, CubeReportRow> for your rows variable and check whether the key (rowName) exists with ContainsKey:
var rows = new Dictionary<string, CubeReportRow>();
if (!rows.ContainsKey(rowName))
{
    // ...
    rows.Add(rowName, reportRow);
}
// ...
return rows.Values.ToList();
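For completeness, here is a minimal self-contained sketch of how that check might sit inside the foreach loop from the question. The DictionaryDedupSketch class and Transform method are hypothetical names, and only RowName is populated, since the question's constructor and value factory are not shown in full. Note that the enumeration order of Dictionary values is documented as unspecified, so do not rely on it if the output order matters.

```csharp
using System.Collections.Generic;
using System.Data;
using System.Linq;

public class CubeReportRow
{
    public string RowName { get; set; }
    public string RowParagraph { get; set; }
    public int ReportSection { get; set; }
}

// Hypothetical helper class, named for illustration only.
public static class DictionaryDedupSketch
{
    // Keeps the first DataRow seen for each name found in column 0.
    public static List<CubeReportRow> Transform(DataTable dataTable)
    {
        var rows = new Dictionary<string, CubeReportRow>();
        foreach (DataRow row in dataTable.Rows)
        {
            var rowName = row[0].ToString();
            if (!rows.ContainsKey(rowName))
            {
                // Assumption: only RowName is set here; the question builds
                // the full object from other columns and a value factory.
                rows.Add(rowName, new CubeReportRow { RowName = rowName });
            }
        }
        return rows.Values.ToList();
    }
}
```

The key lookup is a hash lookup, so the cost of each check does not grow with the number of rows already collected.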
Upvotes: 2
Reputation: 9499
LINQ is perfect for this (in terms of easy-to-read code)
At the top of the file:
using System.Linq;
Then:
if (!rows.Any(r => r.RowName == rowName))
(this replaces if (!rows.Contains(rowName)))
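A minimal self-contained sketch of that check in a loop, assuming the row names arrive as plain strings (in the question they come from row[0].ToString()). The AnyDedupSketch class and KeepFirstByName method are hypothetical names, and CubeReportRow is trimmed down to the one property used here.

```csharp
using System.Collections.Generic;
using System.Linq;

// Trimmed-down stand-in for the question's class (assumption: only RowName matters here).
public class CubeReportRow
{
    public string RowName { get; set; }
}

// Hypothetical helper class, named for illustration only.
public static class AnyDedupSketch
{
    // Adds a row only when no row with the same RowName has been added yet.
    public static List<CubeReportRow> KeepFirstByName(IEnumerable<string> rowNames)
    {
        var rows = new List<CubeReportRow>();
        foreach (var rowName in rowNames)
        {
            if (!rows.Any(r => r.RowName == rowName))
                rows.Add(new CubeReportRow { RowName = rowName });
        }
        return rows;
    }
}
```

Any scans the list on every iteration, so this reads well but does more work per check as the list grows; for large inputs a keyed collection is likely the better fit.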
Upvotes: 1