James R
James R

Reputation: 151

Excel file contains invalid hidden characters that can't be removed

I have a peculiar problem with hidden characters in an Excel spreadsheet which uses VBA to create a text file. I've attached a link to a test version of the file, and I'll explain as best I can the issue.

The file creates a plain txt file that can be used to feed data into a System we use. It works well normally, however we've been supplied approximately 15,000 rows of data, and at random points throughout there are hidden characters.

In the test file, there's 1 row and it's cell B11 that has hidden characters at the beginning and end of the value. If you put your cursor at the end of it, and press the backspace key, it will look as if nothing has happened, but actually you've just deleted one of the characters.

As far as Excel is concerned, those hidden characters are question marks, but they're not, as text stream would parse those, but it doesn't, and instead throws up an invalid procedure call error.

I've tried using Excel's CLEAN formula, I've tried the VBA equivalent, tried using 'Replace', but nothing seems to recognise those characters. Excel is convinced they're just question marks, even an ASCII character call gives me the same answer (63), but replace doesn't replace them as question marks, it just omits them!

Any help on this, even if it's just a formula I could apply would be appreciated. In the interests of data protection the data in the file is fake by the way, it's nobody's real NI number.

The excel file with vba code is here

Upvotes: 0

Views: 1643

Answers (1)

user4039065
user4039065

Reputation:

This VBA macro could be run on its own or in conjunction with the ClearFormatting macro. It did strip out the rogue unichars from the sample.

Sub strip_Rogue_Unichars()
    Dim uc As Long
    With Cells(11, 1).CurrentRegion
        For uc = 8000 To 8390
            .Replace what:=ChrW(uc), replacement:=vbNullString, lookat:=xlPart
            DoEvents
        Next uc
    End With
End Sub

There's probably a better way to do this and being able to restrict the scope of the Unicode characters to search and replace would obviously speed things up. Turning off .EnableEvents, .ScreenUpdating, etc would likewise help. I believe the calculation was already at manual. I intentionally left a DoEvents in the loop as my first run was several thousand different unichars.

Upvotes: 1

Related Questions