Reputation: 43
After running an API GET request method I get a set of data whose snippet is shown by the below screenshot:
However when I run a Merge Queries
of the Contract
column with another table so I can get each correspondent translation to Portuguese it'll only match a few items, as shown below:
After checking the other table's contents every correspondent Contract
was there, but then after copying the entire Contract
column from the API dataset and pasting it to a blank Excel spreadsheet I realized that the spaces displayed on Power Query are actually some unrecognizable characters displayed as "?" inside a diamond:
I've already tried the Clean
tool but it won't work. I also tried other think such as Trim
and Replace Values...
space by something else and no glory.
Something tells me I need to figure out which character is actually being imported instead of space so I can try to replace it on Power Query, but the question is: how can I do that? How can I turn those "?" into readable characters in Excel?!
Leonardo
Upvotes: 0
Views: 1040
Reputation: 7511
�
?There's a special character named �
Replacement Char which is the codepoint u+fffd
or in power query: #(fffd)
If you see one, it means text is either encoded or decoded using the wrong encoding, which causes errors.
You say it's a web api, so its almost always UTF-8. However the default UI doesn't usually pick that.
1252
which is TextEncoding.Windows
65001
which is TextEncoding.Utf8
�
?This is why I used the term code-unit
instead of code-point
. Text.ToList
converts text into a list of 2-byte [char]
s.
It's just like Dotnet calling String.ToCharArray() verses String.EnumerateRunes()
[char]
represents a single utf-16 code-unit
, that is not the same thing as a codepoint
How do I enumerate characters in a string?
= Text.ToList("hi world")
How do I tell what a character is?
= Character.ToNumber("x")
https://learn.microsoft.com/en-us/powerquery-m/character-tonumber
let
// helper function,
Csv = (source as list) as text =>
Text.Combine(
List.Transform( source, (item) =>
try Text.From(item)
catch (e) => "<error>"// e[Message]
),
", "
),
ReplaceSymbol = "#(fffd)",
Source = "hi world #(0001f412)",
Letters = Text.ToList( Source ),
CodeUnits = List.Transform( Letters,
(char) =>
try Character.ToNumber( char )
catch (e) => "invalid" meta [ Char = char ]
),
CodeUnits2 = List.Transform( Letters,
(char) =>
try Character.ToNumber( char )
catch (e) => ReplaceSymbol meta [ Char = char ]
),
Summary = [
Source = Source,
CodeUnit = Csv( CodeUnits ),
CodeUnit2 = Csv( CodeUnits2 ),
Letters = Csv( Letters )
]
in
Summary
Upvotes: 1
Reputation: 60224
try something like:
=UNICODE(MID(A4, SEQUENCE(,LEN(A4)),1))
and fill down. You should see codes for all the characters in the adjacent row. (Be sure the row is clear or you will the #SPILL!
error)
Upvotes: 2