Reputation: 27
1 - I load one file to PQ from a specific folder. 2 - There is always just one file but it could be .txt .csv .xls .xlsx 3 - The size of the table may vary each time (# of columns)
I need the transformed table to show each column's data type.
The data is here: https://drive.google.com/file/d/1cHZsszVlwsXIeC3AjxgMemjd-leGDqH0/view?usp=sharing (not that it would matter a lot but just in case ...)
I have found this wonderful [function ] (https://gist.github.com/ImkeF/6af3d67c91b81d9eb0adceba0261a252) which does detect the data types for each column dynamically, when the table is loaded from the current workbook (table, named range).
Here is the code of the Invoked Function, when the table comes from a table in a worksheet:
let
Source = TransformColumnTypesDynamically(Data, null)
in
Source
... and works just fine.
However, for some reason when it is loaded from a text file (same data), it fails:
let
Source = TransformColumnTypesDynamically(#"PC Card DAta Q1 22-23", null)
in
Source
Also, if the file is loaded from a folder it fails again (here invoked to the Transform Sample File:
let
Source = TransformColumnTypesDynamically(#"Transform Sample File", null)
in
Source
... or if invoked for the file from Other Queries>
let
Source = TransformColumnTypesDynamically(Input, null)
in
Source
Thank you,
G
Upvotes: 0
Views: 1075
Reputation: 26
When looking at the TransformColumnTypesDynamically
function you are trying to use in the GitHub Link you provided, it does not detect the column data type, the function is just capable to deduct it using the already detected types of its elements.
Indeed, in the following code snippet of the function, the Value.Type(i)
simply retrieves the type of the value in the column.
Types = List.Transform(Columns,
each List.Distinct(List.Transform(List.Select(_,
(ListItem)=>ListItem<>null), (i)=>Value.Type(i)))),
To be clearer, Value.Type("1")
will return type text
even though "1" can be converted to a number.
As a consequence,
When you load a table from excel, the data types are already being interpreted. You can observe that the TRANSACTION_AMOUNT column elements are interpreted as numbers (which are in italics in the interface) even though the column is not explicitly converted to a number column. Then, Value.Type
will be able to return type number
in this context.
Table when loading it from an excel
When you load the data from a file, all columns are being directly interpreted as text without attempt to convert them. You can observe this behaviour by looking at the ABC symbols in the column headers. As a consequence, Value.Type
in the TRANSACTION_AMOUNT column will not be able to return the type number
in this context.
Table when loading it from a file or a folder
To convert data types dynamically, you can use the following function, that I found in this Microsoft Power BI forum post:
(tbl as table) as table =>
let
fn = (tbl as table, col as text, numberofrecords as number, marginforerror as number) as type =>
let
LijstmetValues = List.FirstN( Table.Column(tbl, col),numberofrecords),
Env = Record.Combine({[L=LijstmetValues],[DTF = #shared[DateTime.From]], [DF = #shared[Date.From]], [NF = #shared[Number.From]], [TF = #shared[Text.From]], [LT = #shared[List.Transform]], [LS = #shared[List.Select]], [LC = #shared[List.Count]] }),
NumberOfErrors = List.Transform({"DTF", "DF", "NF", "TF"}, each Expression.Evaluate("
LC(LS(
LT(L, each try " & _ & "(_) otherwise ""Error""), each _ = ""Error""))", Env)),
CheckWithinMargin = List.Transform(NumberOfErrors, each _ <= numberofrecords * marginforerror),
typenr = List.PositionOf(CheckWithinMargin, true),
FirstTypeWithinMargin = {"datetime", "date", "number", "text"}{typenr},
CheckType = if List.Distinct(LijstmetValues){0} = null or FirstTypeWithinMargin = -1 then 4 else FirstTypeWithinMargin,
result = Record.Field([number = type number,date = type date,datetime = type datetime,text = type text, any = type any],CheckType)
in result,
Columnsto2Type = Table.TransformColumnTypes(tbl, List.Transform(Table.ColumnNames(tbl), each {_, fn(tbl,_, 7, 0.1)}))
in
Columnsto2Type
In this code snippet, the function really tries to cast each column to a datetime, a date or a number and count the number of errors to determine the final column data type.
Please note that this function may not work depending on your regional settings. For example, in France, the decimal separator is a comma (,). So Number.From("1.1")
returns a DataFormat.Error
for French regional settings. So, for French people, a column with the dot as a decimal separator will be interpreted as text.
You can change this behaviour by:
changing the function to replace dots by commas (Text.Replace("1.1", ".", ",")
) before trying to convert columns to numbers.
changing your regional settings. To do this, go to File > Options and settings > Query Options. In the Regional Settings section, you can set the regional settings only for your workbook. Changing the regional settings option
Upvotes: 1