Python Linux read process memory - int too large to convert to C long

Question

I have this snippet that reads a process's memory on Linux and searches it for a string. It works fine on some distros, but on others I get an error:

  maps_file = open("/proc/%s/maps"%pid, 'r')
  mem_file = open("/proc/%s/mem"%pid, 'r')
  for line in maps_file.readlines():  # for each mapped region
      m = re.match(r'([0-9A-Fa-f]+)-([0-9A-Fa-f]+) ([-r])', line)
      if m.group(3) == 'r':  # if this is a readable region
          start = int(m.group(1), 16)
          end = int(m.group(2), 16)
          mem_file.seek(start)  # seek to region start
          chunk = mem_file.read(end - start)  # read region contents
          #print chunk,  # dump contents to standard output
          mem_dump = open(working_dir+"/%s.bin"%pid, "ab")
          mem_dump.write(chunk)
          mem_dump.close()
  maps_file.close()
  mem_file.close()

The error:

scan process: 491
Traceback (most recent call last):
  File "./dump.py", line 106, in <module>
    MainDump(pid)
  File "./dump.py", line 79, in MainDump
    mem_file.seek(start)  # seek to region start
OverflowError: Python int too large to convert to C long

The problem lines are:

start = int(m.group(1), 16)

and

mem_file.seek(start)

Should I declare it as a float? Any ideas?

I also tried long(), with the same error.

EDIT: I forgot to mention that I get the error on an "x64" system.

abarnert · Accepted Answer

The problem is that you've got the address 0xffffffffff600000L. A (signed) C long can only hold values from -0x8000000000000000 to 0x7fffffffffffffff. So, this address is indeed "too large to convert to C long".
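
As a quick way to check that range, here is a minimal sketch. The helper name fits_in_c_long64 is made up for illustration, it assumes a 64-bit C long (as on x86-64 Linux), and it is written for Python 3, where plain int replaces long and the L suffix is gone:

```python
# Bounds of a signed 64-bit C long (assumption: LP64 platform such as x86-64 Linux).
LONG_MIN = -(1 << 63)
LONG_MAX = (1 << 63) - 1


def fits_in_c_long64(value):
    """Return True if value can be stored in a signed 64-bit C long."""
    return LONG_MIN <= value <= LONG_MAX


print(fits_in_c_long64(0x7fffffffffffffff))  # True: the largest value that fits
print(fits_in_c_long64(0xffffffffff600000))  # False: the address from the traceback
```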

If you look at the source, you can see that the problem is most likely that for some reason, when Python was configured on the non-working distro, it couldn't detect that fseeko and off_t existed. But unless you want to rebuild Python, that isn't going to help you.

So, how can you work around the problem? There are a few things to try.

The first possibility is to seek from the end instead of the start.

mem_len = os.fstat(mem_file.fileno()).st_size

if start >= 1<<63L:
    mem_file.seek(start - mem_len, os.SEEK_END)  # SEEK_END offsets are added to the file size
else:
    mem_file.seek(start)

You can also try this horrible hack:

if start >= 1<<63L:
    start -= 1<<64L

This will convert your 0xffffffffff600000L to -0xa00000, which fits just fine into a long… and then hopefully, that long is actually being cast to some unsigned 64-bit type inside the C layer, meaning it seeks to 0xffffffffff600000L as you'd hoped.
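
The arithmetic behind that hack can be sketched as a pair of helpers. The names to_signed64 and to_unsigned64 are hypothetical, and the code is written for Python 3 (no L suffix). It only shows the two's-complement reinterpretation; whether seek then lands where you want still depends on what the C layer does with the value:

```python
MASK64 = (1 << 64) - 1


def to_signed64(address):
    """Reinterpret an unsigned 64-bit value as a signed 64-bit value."""
    if address >= 1 << 63:
        address -= 1 << 64
    return address


def to_unsigned64(value):
    """Inverse of to_signed64: map a signed 64-bit value back to unsigned."""
    return value & MASK64


print(hex(to_signed64(0xffffffffff600000)))  # -0xa00000
print(hex(to_unsigned64(-0xa00000)))         # 0xffffffffff600000
```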

You may also be able to get around this by using mmap to map the page(s) you want, instead of seek and read.

If worst comes to worst, you can use ctypes (or cffi or whatever you prefer) to call fseeko directly on your file handle.

Finally, make sure you actually want to read this region. I may be wrong, but I seem to remember that Linux reserves the upper region for kernel pages mapped into userspace. If I'm right, the strings you're looking for aren't going to be there, so you can just skip it…

To skip processing a region, you can either move the processing inside an if:

start = int(m.group(1), 16)
end = int(m.group(2), 16)
if start <= sys.maxint:
    mem_file.seek(start)  # seek to region start
    chunk = mem_file.read(end - start)  # read region contents
    # ...

… or use a continue statement to skip to the next iteration of the loop:

start = int(m.group(1), 16)
end = int(m.group(2), 16)
if start > sys.maxint:
    continue
mem_file.seek(start)  # seek to region start
chunk = mem_file.read(end - start)  # read region contents
# ...

If you know the regions are always in sorted order, you can use break instead of continue (because the rest of the regions will also be out of range).
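
The skip logic above can be tried without a live process by running it over sample lines in the /proc/<pid>/maps format. This is only a sketch: the function name, the MAX_C_LONG constant (which assumes a 64-bit signed long, standing in for Python 2's sys.maxint), and the sample lines are all made up for illustration, and the code is written for Python 3:

```python
import re

MAX_C_LONG = (1 << 63) - 1  # assumption: 64-bit signed C long
MAPS_LINE = re.compile(r'([0-9A-Fa-f]+)-([0-9A-Fa-f]+) ([-r])')


def seekable_readable_regions(maps_lines):
    """Yield (start, end) for readable regions whose start fits in a C long."""
    for line in maps_lines:
        m = MAPS_LINE.match(line)
        if m is None or m.group(3) != 'r':
            continue  # malformed line or unreadable region
        start = int(m.group(1), 16)
        end = int(m.group(2), 16)
        if start > MAX_C_LONG:
            continue  # out of range for seek, e.g. the high kernel-mapped page
        yield start, end


# Illustrative lines only, not taken from a real process.
sample = [
    "00400000-0040c000 r-xp 00000000 08:01 1234 /bin/cat\n",
    "7ffd1c000000-7ffd1c021000 ---p 00000000 00:00 0\n",
    "ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]\n",
]
regions = list(seekable_readable_regions(sample))
```

With the sample above, only the first line survives: the second is not readable and the third starts beyond the largest signed 64-bit value.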

But I think the best solution is to just try it, and handle errors. There are other reasons this seek and read could fail—for example, if the process you're looking at unmaps a region before you get to it, or exits—and you'd rather skip the error and continue on than just exit, right?

So:

if m.group(3) == 'r':  # if this is a readable region
    start = int(m.group(1), 16)
    end = int(m.group(2), 16)
    try:
        mem_file.seek(start)  # seek to region start
        chunk = mem_file.read(end - start)  # read region contents
    except Exception as e:
        print('Skipping region {:#018x} because of error {}'.format(start, e))
        continue
    mem_dump = open(working_dir+"/%s.bin"%pid, "ab")
    # ...
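
Putting the pieces together, the whole loop might look something like the sketch below. It is written for Python 3 and takes already-opened file objects so it can be tried on in-memory data; the function name and the decision to return the skipped regions instead of printing them are my own choices, not part of the original script:

```python
import io
import re

MAPS_LINE = re.compile(r'([0-9A-Fa-f]+)-([0-9A-Fa-f]+) ([-r])')


def dump_readable_regions(maps_lines, mem_file, out_file):
    """Copy every readable region from mem_file to out_file.

    Returns a list of (start, error) pairs for regions that were skipped.
    """
    skipped = []
    for line in maps_lines:
        m = MAPS_LINE.match(line)
        if m is None or m.group(3) != 'r':
            continue  # malformed line or unreadable region
        start = int(m.group(1), 16)
        end = int(m.group(2), 16)
        try:
            mem_file.seek(start)                # seek to region start
            chunk = mem_file.read(end - start)  # read region contents
        except (OverflowError, OSError, ValueError) as e:
            skipped.append((start, e))
            continue
        out_file.write(chunk)
    return skipped


# Tiny in-memory stand-ins for /proc/<pid>/maps and /proc/<pid>/mem.
fake_mem = io.BytesIO(b"0123456789abcdef")
fake_maps = [
    "00000002-00000006 r--p 00000000 00:00 0\n",
    "00000008-0000000a ---p 00000000 00:00 0\n",
    "0000000c-00000010 rw-p 00000000 00:00 0\n",
]
out = io.BytesIO()
skipped = dump_readable_regions(fake_maps, fake_mem, out)
```

For a real process you would pass open("/proc/%s/maps" % pid), open("/proc/%s/mem" % pid, "rb") and an output file opened in "ab" mode, assuming you have permission to read that process's memory.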
