Steven T. Snyder

Reputation: 6177

How do I handle Python unicode strings with null-bytes the 'right' way?

Question

It seems that PyWin32 is comfortable with giving null-terminated unicode strings as return values. I would like to deal with these strings the 'right' way.

Let's say I'm getting a string like: u'C:\\Users\\Guest\\MyFile.asy\x00\x00sy'. This appears to be a C-style null-terminated string hanging out in a Python unicode object. I want to trim this bad boy down to a regular ol' string of characters that I could, for example, display in a window title bar.

Is trimming the string off at the first null byte the right way to deal with it?

I didn't expect to get a return value like this, so I wonder if I'm missing something important about how Python, Win32, and unicode play together... or if this is just a PyWin32 bug.

Background

I'm using the Win32 file chooser function GetOpenFileNameW from the PyWin32 package. According to the documentation, this function returns a tuple containing the full filename path as a Python unicode object.

When I open the dialog with an existing path and filename set, I get a strange return value.

For example, I had the default set to: C:\\Users\\Guest\\MyFileIsReallyReallyReallyAwesome.asy

In the dialog I changed the name to MyFile.asy and clicked save.

The full path part of the return value was: u'C:\\Users\\Guest\\MyFile.asy\x00wesome.asy'

I expected it to be: u'C:\\Users\\Guest\\MyFile.asy'

The function is returning a recycled buffer without trimming off the terminating bytes. Needless to say, the rest of my code wasn't set up for handling a C-style null-terminated string.

Demo Code

The following code demonstrates the null-terminated string in the return value from GetSaveFileNameW.

Directions: In the dialog change the filename to 'MyFile.asy' then click Save. Observe what is printed to the console. The output I get is u'C:\\Users\\Guest\\MyFile.asy\x00wesome.asy'.

import win32con
import win32gui

if __name__ == "__main__":
    initial_dir = 'C:\\Users\\Guest'
    initial_file = 'MyFileIsReallyReallyReallyAwesome.asy'
    # Filter entries are null-separated: display name, then pattern.
    filter_string = 'All Files\0*.*\0'
    # The call returns a (filename, customfilter, flags) tuple.
    filename, customfilter, flags = win32gui.GetSaveFileNameW(
        InitialDir=initial_dir, Flags=win32con.OFN_EXPLORER,
        File=initial_file, DefExt='txt', Title="Save As",
        Filter=filter_string, FilterIndex=0)
    # repr() makes the embedded null character visible.
    print(repr(filename))

Note: If you don't shorten the filename enough (for example, if you try MyFileIsReally.asy) the string will be complete without a null byte.

Environment

Windows 7 Professional 64-bit (no service pack), Python 2.7.1, PyWin32 Build 216

UPDATE: PyWin32 Tracker Artifact

Based on the comments and answers I have received so far, this is likely a pywin32 bug so I filed a tracker artifact.

UPDATE 2: Fixed!

Mark Hammond reported in the tracker artifact that this is indeed a bug. A fix was checked in to rev f3fdaae5e93d, so hopefully that will make the next release.

I think Aleksi Torhamo's answer below is the best solution for versions of PyWin32 before the fix.

Upvotes: 10

Views: 5844

Answers (3)

tzot

Reputation: 95991

I seem to recall having this issue some years ago. I then discovered that such Win32 filename-dialog functions return a sequence of the form 'filename1\0filename2\0...filenameN\0\0', possibly including garbage characters, depending on the buffer that Windows allocated.

Now, you might prefer a list instead of the raw return value, but that would be an RFE, not a bug.
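
If the return value really does follow the layout described above, a list could be recovered along these lines. This is only a sketch: the helper name split_null_separated is hypothetical, and it assumes that entries are separated by single nulls and that the sequence ends at the first double null, with anything after it treated as garbage.

```python
def split_null_separated(raw):
    """Split a null-separated, double-null-terminated buffer into entries."""
    # Assumption: the first double null marks the end of the list and
    # whatever follows is leftover buffer content.
    end = raw.find(u'\x00\x00')
    if end != -1:
        raw = raw[:end]
    # Drop empty pieces, e.g. the one produced by a single trailing null.
    return [part for part in raw.split(u'\x00') if part]


names = split_null_separated(u'a.txt\x00b.txt\x00\x00junk')
print(names)
```

Note that the longer example from the question (the one ending in \x00wesome.asy) contains no double null, so this approach would return the leftover 'wesome.asy' as a second entry. For that value, trimming at the first null is the safer reading.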

PS: When I had this issue, I quite understood why one would expect GetOpenFileName to possibly return a list of filenames, but I couldn't imagine why GetSaveFileName would. Perhaps this is a matter of API uniformity. Who am I to know, anyway?

Upvotes: 0

Aleksi Torhamo

Reputation: 6632

I'd say it's a bug. The right way to deal with it would probably be fixing pywin32, but in case you aren't feeling adventurous enough, just trim it.

You can get everything before the first '\x00' with filename.split('\x00', 1)[0].
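
As a minimal sketch, the same trim can be wrapped in a small helper. The name trim_at_null is made up for illustration and is not part of PyWin32.

```python
def trim_at_null(text):
    """Return the part of text before the first null character."""
    # split() with maxsplit=1 stops at the first u'\x00'; if there is
    # none, the whole string comes back unchanged.
    return text.split(u'\x00', 1)[0]


# The value reported in the question.
raw = u'C:\\Users\\Guest\\MyFile.asy\x00wesome.asy'
print(repr(trim_at_null(raw)))
```

Strings without a null pass through unchanged, so the helper should be harmless to leave in place once a fixed PyWin32 build is installed.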

Upvotes: 6

Nicholas Riley

Reputation: 44331

This doesn't happen on the version of PyWin32/Windows/Python I tested; I don't get any nulls in the returned string even if it's very short. You might investigate whether a newer version of one of the above fixes the bug.

Upvotes: 2
