Reputation: 3311
I need to know the current maximum length of each column of a DataTable (using VB.NET).
I need the maximum .ToString.Length for each column.
I found the C# code below here, but I wasn't able to translate it to VB.NET:
List<int> maximumLengthForColumns =
    Enumerable.Range(0, dataTable.Columns.Count)
        .Select(col => dataTable.AsEnumerable()
            .Select(row => row[col]).OfType<string>()
            .Max(val => val.Length)).ToList();
EDIT
I was finally able to translate the code into more readable VB.NET, but not to adapt it to my needs:
maximumLengthForColumns = Enumerable.Range(0, DT.Columns.Count).
    Select(Function(col)
               Return DT.AsEnumerable().Select(Function(row)
                                                   Return row(col)
                                               End Function).OfType(Of String)().Max(Function(v)
                                                                                         Return v.Length
                                                                                     End Function)
           End Function).ToList()
Upvotes: 1
Views: 6373
Reputation: 38915
The posted self-answer iterates over all columns and treats them like string columns even if they are not. That is, it measures and collects the .ToString length of data that is not a string (which seems not to be what's desired).
The non-string DataColumns can be omitted this way:
Dim MaxColLen As New Dictionary(Of String, Integer)
For Each dc As DataColumn In dtSample.Columns
    If dc.DataType Is GetType(String) Then
        MaxColLen.Add(dc.ColumnName, 0)
        For Each dr As DataRow In dtSample.Rows
            If dr.Field(Of String)(dc.ColumnName).Length > MaxColLen(dc.ColumnName) Then
                MaxColLen(dc.ColumnName) = dr.Field(Of String)(dc.ColumnName).Length
            End If
        Next
    End If
Next
Note that it uses For Each to reduce clutter in the code and to allow the use of DataRow extensions such as Field<T>(). Personally, I think Field(Of T)(Col) is more readable than DT.Rows(x)(Col).ToString, although if you actually do want to measure non-string data, calling Field(Of String) on a non-text column will surely crash.
Note that the loop skips over non-string columns. To find the longest text in 715,000 rows, the original takes ~34 ms, while the above takes ~9 ms.
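One caveat with the loop above: Field(Of String) returns Nothing for a DBNull cell, so calling .Length on the result would throw a NullReferenceException if the table contains nulls. A minimal sketch of a null-tolerant variant follows. The module name, the helper name LongestText, and the sample table are made up for illustration, and on .NET Framework it assumes a reference to System.Data.DataSetExtensions.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data

Module NullSafeSketch
    ' Hypothetical helper: longest text per String column, counting DBNull as length 0.
    Function LongestText(dt As DataTable) As Dictionary(Of String, Integer)
        Dim result As New Dictionary(Of String, Integer)
        For Each dc As DataColumn In dt.Columns
            ' Skip every column that is not declared as String.
            If dc.DataType IsNot GetType(String) Then Continue For
            Dim longest As Integer = 0
            For Each dr As DataRow In dt.Rows
                ' Field(Of String) yields Nothing for DBNull, so guard before reading Length.
                Dim s As String = dr.Field(Of String)(dc)
                If s IsNot Nothing AndAlso s.Length > longest Then longest = s.Length
            Next
            result(dc.ColumnName) = longest
        Next
        Return result
    End Function

    Sub Main()
        ' Illustrative data only.
        Dim dt As New DataTable()
        dt.Columns.Add("Name", GetType(String))
        dt.Columns.Add("Qty", GetType(Integer))
        dt.Rows.Add("apple", 5)
        dt.Rows.Add(DBNull.Value, 7)

        For Each kvp In LongestText(dt)
            Console.WriteLine("{0}: {1}", kvp.Key, kvp.Value)
        Next
    End Sub
End Module
```

Keeping the running maximum in a local variable also avoids a dictionary lookup per row, though no timing was measured for this variant.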
A linqy version of the same dictionary approach (with comments explaining the steps):
' a) look at cols as cols
' b) just the string ones
' c) get the name and initialized (zero) value into an anonymous type
' d) convert to a dictionary of String, Int to store the longest
Dim txtCols = dtSample.Columns.Cast(Of DataColumn).
    Where(Function(c) c.DataType = GetType(String)).
    Select(Function(q) New With {.Name = q.ColumnName, .Length = 0}).
    ToDictionary(Of String, Int32)(Function(k) k.Name, Function(v) v.Length)
' get keys into an array to iterate
' collect the max length for each
For Each colName As String In txtCols.Keys.ToArray
    txtCols(colName) = dtSample.AsEnumerable().
        Max(Function(m) m.Field(Of String)(colName).Length)
Next
This form takes ~12 ms for the same 715k rows. Extension methods are almost always slower, but none of these differences are worth worrying about.
Upvotes: 0
Reputation: 3311
I was forced to do as @Putonix said and loop over the DataTable, for two reasons:
1) I wasn't able to use the translated C# code, because it gives me the error "The sequence contains no elements" even if all cells have a value, and also because it seems to be written only for string fields.
At the moment my knowledge isn't enough to successfully edit this code to adapt it to my needs.
2) The two answers that suggest using MaxLength don't give me what I need, because I need the current length of each column and not the maximum allowed length.
Thanks to all for helping
So here's the code I used:
Dim MaxColLen As New Dictionary(Of String, Integer)
For y As Integer = 0 To DT.Columns.Count - 1
    Dim Col As String = DT.Columns(y).ColumnName
    MaxColLen.Add(Col, 0)
    For x As Integer = 0 To DT.Rows.Count - 1
        If DT.Rows(x)(Col).ToString.Length > MaxColLen(Col) Then
            MaxColLen(Col) = DT.Rows(x)(Col).ToString.Length
        End If
    Next
Next
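For reference, the "sequence contains no elements" error mentioned above is what Max throws when it is given an empty sequence: OfType(Of String) discards every value in a non-string column (and every DBNull), so for such a column nothing is left to measure. A hedged sketch of a LINQ form that measures the .ToString length of every column and falls back to 0 for an empty table is shown below. The module name, the helper name MaxToStringLengths, and the sample data are assumptions for illustration, and on .NET Framework AsEnumerable needs a reference to System.Data.DataSetExtensions.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module ToStringLengthSketch
    ' Hypothetical helper: longest ToString length per column, for columns of any type.
    Function MaxToStringLengths(dt As DataTable) As Dictionary(Of String, Integer)
        Dim result As New Dictionary(Of String, Integer)
        For Each dc As DataColumn In dt.Columns
            Dim col As DataColumn = dc ' local copy captured by the lambda
            ' DBNull.ToString() is an empty string, so null cells count as length 0.
            ' DefaultIfEmpty(0) keeps Max from throwing when the table has no rows.
            result(col.ColumnName) = dt.AsEnumerable().
                Select(Function(r) r(col).ToString().Length).
                DefaultIfEmpty(0).
                Max()
        Next
        Return result
    End Function

    Sub Main()
        ' Illustrative data only.
        Dim dt As New DataTable()
        dt.Columns.Add("Name", GetType(String))
        dt.Columns.Add("Qty", GetType(Integer))
        dt.Rows.Add("apple", 5)
        dt.Rows.Add("kiwi", 12345)
        dt.Rows.Add(DBNull.Value, 7)

        For Each kvp In MaxToStringLengths(dt)
            Console.WriteLine("{0}: {1}", kvp.Key, kvp.Value)
        Next
    End Sub
End Module
```

With this sample data both columns should report 5 ("apple" and "12345"). The result is keyed by column name like the loop version, so the two can be swapped without changing the calling code.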
Upvotes: 0
Reputation: 5413
Non-LINQ answer...
Dim maximumLengthForColumns As New List(Of Integer)
For i As Integer = 0 To dtb.Columns.Count - 1
    maximumLengthForColumns.Add(dtb.Columns(i).MaxLength)
Next i
If the size of the column is unlimited, then the MaxLength property returns -1.
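To make the distinction concrete, here is a small sketch showing that MaxLength reflects the column definition rather than the stored data. The column names and values are invented for the example.

```vbnet
Imports System
Imports System.Data

Module MaxLengthDemo
    Sub Main()
        Dim dtb As New DataTable()
        ' "Free" keeps the default MaxLength; "Capped" gets an explicit limit.
        Dim free As DataColumn = dtb.Columns.Add("Free", GetType(String))
        Dim capped As DataColumn = dtb.Columns.Add("Capped", GetType(String))
        capped.MaxLength = 50

        dtb.Rows.Add("a fairly long value", "short")

        ' Neither number depends on the text actually stored in the rows.
        Console.WriteLine(free.MaxLength)   ' -1 (no limit set)
        Console.WriteLine(capped.MaxLength) ' 50
    End Sub
End Module
```

So MaxLength is only useful when the allowed size is what you are after; the length of the data currently in the table has to be measured from the rows.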
Upvotes: 0
Reputation: 559
A DataTable exposes a Columns property, which is a collection of column definitions. Each item (which is really a DataColumn) exposes the maximum allowable length. The code sample that you found only looks at the data as stored in the table. That is, it returns the current maximum length used by the data, not the maximum supported by the column.
To retrieve the maximum allowed length, simply scan through the Columns property of the supplied DataTable object and use the MaxLength property.
Here's a snippet using LINQ syntax:
Dim maximumLengthForColumns = From c in dataTable.Columns.Cast(Of DataColumn)
Order By c.Ordinal
Select c.MaxLength
The actual type of this isn't exactly a List. It's IEnumerable(Of Integer). You can use .ToList() to force the enumeration and conversion instead of letting it sit idle until you actually need the results. You could just leave it as an IEnumerable if you only need to enumerate over the results.
I didn't need to include an Order By clause. It will probably slow down the actual execution. But, if you have so many columns in your data table that this becomes a real bottleneck, you need to be taken out back and given some other remedial instruction.
Why didn't I add a filtering clause (Where)? The MaxLength property is exposed for all columns, not just string types. Also, a simple enumeration of the results should probably match up with the number of columns in your original data table. If not, feel free to add this clause to the LINQ statement:
Where c.DataType = GetType(String)
Upvotes: 2
Reputation: 6542
You have to also translate those lambdas:
Dim maximumLengthForColumns As List(Of Integer) = Enumerable.Range(0, dataTable.Columns.Count).Select(Function(col) dataTable.AsEnumerable().Select(Function(row) row(col)).OfType(Of String)().Max(Function(val) val.Length)).ToList()
Upvotes: 0