Reputation: 11612
I was reading through a library of Python code, and I'm stumped by this statement:
struct.pack( "<ii%ds"%len(value), ParameterTypes.String, len(value), value.encode("UTF8") )
I understand everything but %d, and I'm not sure why the length of value is being packed in twice.
As I understand it, the structure will have little-endian encoding (<) and will contain two integers (ii) followed by %d, followed by a string (s).
What is the significance of %d?
Upvotes: 2
Views: 3399
Reputation: 19413
It is an ordinary string formatting operation that is being used to build the struct format string.
Try reading it first as an ordinary string (forget struct for the moment):
"<ii%ds" % len(value)
If, for example, the length of value is 4, then the string will be <ii4s. This is then passed to struct.pack, ready to pack two integers followed by a string of four bytes taken from value.
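To make the two stages concrete, here is a minimal sketch in Python 3. The value "spam" and the constant PARAM_TYPE_STRING are assumptions made up for illustration; the original code uses ParameterTypes.String, whose actual value is not shown in the question.

```python
import struct

# Hypothetical stand-in for ParameterTypes.String (real value unknown).
PARAM_TYPE_STRING = 1

value = "spam"  # example value: 4 ASCII characters

# Stage 1: plain string formatting, nothing struct-specific yet.
fmt = "<ii%ds" % len(value)

# Stage 2: the finished format string drives struct.pack.
packed = struct.pack(fmt, PARAM_TYPE_STRING, len(value), value.encode("UTF8"))

print(fmt)
print(packed)
```

With an ASCII-only value the character count and the byte count agree, so nothing is truncated here.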
Upvotes: 1
Reputation: 392050
The %d means this works in two steps.
Step 1. "<ii%ds" % len(value) creates the struct format string "<ii...some number...s".
Step 2. The resulting format string is applied to three values: ParameterTypes.String, len(value), value.encode("UTF8").
Upvotes: 0
Reputation: 83032
Aarrrgh the mind boggles ....
@S.Lott: """I don't think the number is particularly important, since Python will tend to pack correctly without it.""" -1. Don't think; investigate. Omitting the number merely means that the count defaults to 1. Tends to pack correctly??? Perhaps you think that struct.pack("s", foo) works the same way as "%s" % foo? It doesn't; docs say """For the 's' format character, the count is interpreted as the size of the string, not a repeat count like for the other format characters; for example, '10s' means a single 10-byte string, while '10c' means 10 characters. For packing, the string is truncated or padded with null bytes as appropriate to make it fit."""
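The behaviour quoted from the docs is easy to check. The following is a small sketch in Python 3 (where the 's' code takes bytes); the sample data b"hello" is an arbitrary choice for illustration.

```python
import struct

data = b"hello"

# No count: the size defaults to 1, so only the first byte survives.
no_count = struct.pack("s", data)

# Count smaller than the data: the data is truncated.
too_short = struct.pack("3s", data)

# Count larger than the data: the result is padded with null bytes.
too_long = struct.pack("8s", data)

# For 'c' the number is a repeat count: ten separate 1-byte chars.
ten_chars_size = struct.calcsize("10c")
ten_byte_string_size = struct.calcsize("10s")

print(no_count, too_short, too_long)
```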
@Brendan: -1. value is not an array (whatever that is); it is patently obviously intended to be a unicode string ... lookee here: value.encode("UTF8")
@Matt Ellen: The line of code that you quote is severely broken. If there are any non-ASCII characters in value, data will be lost.
Let's break it down:
struct.pack("<ii%ds"%len(value), ParameterTypes.String, len(value), value.encode("UTF8"))
Reduce the problem space by removing the first item:
struct.pack("<i%ds"%len(value), len(value), value.encode("UTF8"))
Now let's suppose that value is u'\xff\xff', so len(value) is 2.
Let v8 = value.encode('UTF8'), i.e. '\xc3\xbf\xc3\xbf'.
Note that len(v8) is 4. Is the penny dropping yet?
So what we now have is
struct.pack("<i2s", 2, v8)
The number 2 is packed as 4 bytes, 02 00 00 00. The 4-byte string v8 is TRUNCATED (by the length 2 in "2s") to length two. DATA LOSS. FAIL.
The correct way to do what is presumably wanted is:
v8 = value.encode('UTF8')
struct.pack("<ii%ds" % len(v8), ParameterTypes.String, len(v8), v8)
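The difference can be verified with a round trip. This is a sketch in Python 3 using the reduced form without the first integer; read_string is a hypothetical helper written for this illustration, not part of the library in question.

```python
import struct

value = "\xff\xff"          # two non-ASCII characters
v8 = value.encode("UTF8")   # four bytes: b"\xc3\xbf\xc3\xbf"

# Broken: the size comes from the character count (2), so v8 is truncated.
broken = struct.pack("<i%ds" % len(value), len(value), v8)

# Fixed: both the format size and the packed length use the byte count (4).
fixed = struct.pack("<i%ds" % len(v8), len(v8), v8)


def read_string(buf):
    """Hypothetical reader: a 4-byte little-endian length, then that many bytes."""
    (n,) = struct.unpack_from("<i", buf, 0)
    (raw,) = struct.unpack_from("%ds" % n, buf, 4)
    return raw.decode("UTF8")


print(repr(read_string(broken)))  # one character has been lost
print(repr(read_string(fixed)))   # round-trips to the original value
```

Here the truncation happens to fall on a character boundary. In general it can cut a multi-byte sequence in half, in which case decoding would fail outright.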
Upvotes: 2
Reputation: 13191
The significance of %d is that it's a formatting parameter for strings; see "String Formatting Operations" in the Python documentation.
When broken apart, "<ii%ds" % len(value) is a bit easier to understand. It replaces the %d conversion specifier in the string with the return value of len(value), formatted as a decimal integer.
>>> fmt = "<ii%ds"
>>> fmt % 5
'<ii5s'
>>> fmt % 3
'<ii3s'
Upvotes: 1
Reputation: 40879
It's used to specify that a string (value) occupying len(value) bytes is to be packed after those two integers.
If, for instance, value contained "boo", then the actual format specifier for pack would be "<ii3s".
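As a quick check of the "boo" case, the sketch below (Python 3) shows the size of the resulting record and unpacks it again. The type code 1 is a made-up placeholder, since the value of ParameterTypes.String is not given in the question.

```python
import struct

value = "boo"
fmt = "<ii%ds" % len(value)

# Two 4-byte integers plus a 3-byte string; "<" means no alignment padding.
record_size = struct.calcsize(fmt)

# 1 is a hypothetical placeholder for ParameterTypes.String.
packed = struct.pack(fmt, 1, len(value), value.encode("UTF8"))
type_code, length, raw = struct.unpack(fmt, packed)

print(fmt, record_size, type_code, length, raw)
```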
Upvotes: 0