Reputation: 10959
I am using MLOAD to pull data from a Oracle into Teradata. It has proven to be a nightmare because the Oracle table contains all sorts of characters that Teradata Unicode columns do not accept. When I query the resulting ET_
table after MLOAD runs to see what columns failed, unsupported characters end up being turned into �
. When I track these characters down on the Oracle side using the ASCII()
function, I get numbers like 160 (apparently a non-breaking space), 15712189 (replacement character), and 15556979 (doesn't show up at all anywhere in Google).
There are so many unsupported characters that I can't possibly hardcode for them all. I need to somehow have my MLOAD script strip out any character that is not supported by a Unicode column.
Upvotes: 3
Views: 11260
Reputation: 7786
No longer applies as KANJI was not the issue:
This may be a situation where you have to use the TRANSTLATE() function:
SELECT TRANSLATE({colA}
USING {KANJISJIS_TO_UNICODE|KANJI1_SBC_TO_UNICODE|KANJI1_KANJISJIS_TO_UNICODE
|KANJI1_KanjiEUC_TO_UNICODE|KANJI1_KanjiEBCDIC_TO_UNICODE}
);
Unsupported Characters: If you are on Teradata 14 you may be able to use one of the Regular Expression functions to replace data that is out of LATIN or UNICODE range with an empty string:
SELECT REGEXP_REPLACE({colA}, '[^[print]]', '') FROM {MyDB}.{MyTable};
You also have at your disposal in Teradata 14 the CHR() function which can also be used to weed out the offending data:
SELECT CASE WHEN POSITION(CHR(26) IN {ColA}) = 0
THEN {ColA}
ELSE SUBSTRING({ColA} FROM 1 FOR
(POSITION(CHR(26) IN {ColA}) - 1))
|| '' || SUBSTRING({ColA} FROM
(POSITION(CHR(26) IN {ColA}) + 1))
END AS Test_
FROM {MyDB}.{MyTable};
In the above example it works for a single occurrence of the "bad" data. If you have repeating occurrences in a single string then you may have to play with OREPLACE()
combined with CHR()
to clean it up.
Lastly, you could try to preparse the file in UNIX (if that is your source environment) to strip out the bad data using something like AWK and gensub()
.
Upvotes: 1