brainysmurf

Reputation: 646

Trouble understanding how to process C string

I'm trying to use Mac OS X's listxattr C function and turn its result into something useful in Python. The man page tells me that the function fills a buffer with attribute names, which are "simple NULL-terminated UTF-8 strings and are returned in arbitrary order. No extra padding is provided between names in the buffer."

In my C file, I seem to have it set up correctly (I hope):

  char buffer[size];
  res = listxattr("/path/to/file", buffer, size, options);

But when I go to print it, I get only the first attribute, which is two characters long, even though its size is 25. So I manually set buffer[3] = 'z', and lo and behold, when I print buffer again I get the first two attributes.

I think I understand what is going on. The buffer is a sequence of NULL-terminated strings, and printing stops as soon as it reaches the first NULL character. But then how am I supposed to unpack the entire sequence into all of the attributes?

I'm new to C and using it to figure out the mechanics of extending Python with C, and ran into this doozy.

Upvotes: 0

Views: 283

Answers (4)

brainysmurf

Reputation: 646

Actually, since I'm going to send it to Python, I don't have to process it C-style after all. Just use Py_BuildValue, passing it the format string s#, which knows what to do with it. You'll also need the size.

return Py_BuildValue("s#", buffer, size);

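You can process it into a list on Python's end using split('\x00'). Because every name, including the last one, ends with a NUL, the resulting list ends with an empty string that you will probably want to drop. I found this after trial and error, but I'm glad to have learned something about C.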

Upvotes: 0

bmargulies

Reputation: 100161

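  1. char *p = buffer;
  2. if p has reached buffer + res (res being the byte count listxattr returned), stop. Otherwise get the length with strlen(p); if the length is 0, stop.
  3. process the chunk that p points to.
  4. p = p + length + 1;
  5. back to step 2.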
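
The steps above can be sketched as a small C function. This is only a sketch under stated assumptions: split_names is a made-up name, the length argument is assumed to be the non-negative byte count that listxattr returned, and memchr stands in for strlen so that the scan never reads past that count.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: records a pointer to the start of each
   NUL-terminated name packed into buffer[0 .. length-1].
   Stores at most max_names pointers in names[], but keeps counting.
   Returns the number of complete names found. */
size_t split_names(const char *buffer, size_t length,
                   const char **names, size_t max_names)
{
    if (buffer == NULL || length == 0)
        return 0;

    const char *p = buffer;
    const char *end = buffer + length;
    size_t count = 0;

    while (p < end) {
        /* Find the terminator of the current name without leaving the buffer. */
        const char *nul = memchr(p, '\0', (size_t)(end - p));
        if (nul == NULL)
            break;                 /* unterminated tail: ignore it */
        if (count < max_names)
            names[count] = p;      /* p is already a valid C string */
        count++;
        p = nul + 1;               /* step past the NUL to the next name */
    }
    return count;
}
```

After a successful listxattr call, this might be used as split_names(buffer, (size_t)res, names, 16), with names declared as an array of 16 const char pointers. The pointers refer into buffer itself, so they are only valid for as long as buffer is.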

Upvotes: 3

sshannin

Reputation: 2825

So you guessed pretty much right.

The listxattr function returns a bunch of null-terminated strings packed in next to each other. Since strings (and arrays) in C are just blobs of memory, they don't carry around any extra information with them (such as their length). The convention in C is to use a null character ('\0') to represent the end of a string.
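
To make that concrete, here is a small sketch that contrasts what the string functions see with what the buffer really holds. The name describe_buffer is made up for illustration, and length is assumed to be the byte count returned by listxattr.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints the "string view" and the "byte view"
   of the same memory, and returns the number of NUL terminators found. */
size_t describe_buffer(const char *buffer, size_t length)
{
    /* String functions stop at the first NUL; find where that is. */
    const char *nul = memchr(buffer, '\0', length);
    size_t visible = nul ? (size_t)(nul - buffer) : length;

    size_t terminators = 0;
    for (size_t i = 0; i < length; i++) {
        if (buffer[i] == '\0')
            terminators++;         /* one terminator per packed string */
    }

    printf("string functions see %zu bytes: \"%.*s\"\n",
           visible, (int)visible, buffer);
    printf("the buffer holds %zu bytes and %zu terminators\n",
           length, terminators);
    return terminators;
}
```

The count of terminators is the number of attribute names, which is why the length has to be carried around separately from the pointer.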

Here's one way to traverse the list, in this case changing it to a comma-separated list.

int i = 0;
for (; i < res; i++)
   if (buffer[i] == '\0' && i != res -1) //we're in between strings
       buffer[i] = ',';

Of course, you'll want to make these into Python strings rather than just substituting in commas, but that should give you enough to get started.

Upvotes: 1

Chriszuma

Reputation: 4568

It looks like listxattr returns the size of the buffer it has filled, so you can use that to help you. Here's an idea:

for(int i=0; i<res-1; i++)
{
    if( buffer[i] == 0 )
        buffer[i] = ',';
}

Now, instead of being separated by null characters, the attributes are separated by commas.

Upvotes: 0
