Rashid

Reputation: 51

lseek SEEK_DATA appears to be not working

Running on Tumbleweed with kernel 6.2.12-1-default

Using the _GNU_SOURCE define should allow the use of SEEK_DATA as the whence value for lseek.

According to the man page this should work for a range of file systems. I have tested this code on btrfs and ext4 with the same result.

#define _GNU_SOURCE
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

void test(char *filepath) {
    int write_value = 0x22;

    /* O_CREAT requires a mode argument; without it the permissions are undefined */
    int fh = open(filepath, O_CREAT | O_TRUNC | O_RDWR, 0644);

    int os    = lseek(fh, 10, SEEK_SET);
    int wret  = write(fh, &write_value, sizeof(int));
    int os2   = lseek(fh, 0, SEEK_DATA);
    printf("os: %d os2: %d wret %d\n", os, os2, wret);
    close(fh);
}

Given the above code, I would expect SEEK_DATA to find the first written value at offset 10, and that os2 == os, yet lseek returns 0.

The file is indeed written as expected and od -x gives the following output:

0000000 0000 0000 0000 0000 0000 0022 0000

Any suggestions? Have I made an incorrect assumption about the expected behaviour?

Answers (1)

dimich

Reputation: 1843

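Sparse files are implementation-specific, and SEEK_DATA reports allocated ranges, not the exact bytes that were written. We can assume hole granularity depends on the filesystem I/O block size: a write of even a few bytes typically allocates the whole block that contains them.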

I modified your program to make some tests:

#define _GNU_SOURCE
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <stdint.h>

static void test(off_t offset)
{
    int write_value = 0x22;

    int fh = open("dummy", O_CREAT | O_TRUNC | O_RDWR, 0644);

    off_t os     = lseek(fh, offset, SEEK_SET);
    ssize_t wret = write(fh, &write_value, sizeof(write_value));
    off_t os2    = lseek(fh, 0, SEEK_DATA);

    struct stat statbuf;
    int ret = fstat(fh, &statbuf);

    printf("% 6jd | % 6jd | % 6zd | % 6jd\n",
        (intmax_t)os, (intmax_t)os2, wret,
        (intmax_t)(ret == 0 ? statbuf.st_blocks*512 : -1));

    close(fh);

    unlink("dummy");
}

int main(int argc, char **argv)
{
    (void)argc;  /* unused; argv is walked until its terminating NULL */
    printf("   os  |   os2  |  wret  |  used \n"
           "----------------------------------\n");

    while (*++argv)
    {
        test(atol(*argv));
    }

    return 0;
}

The filesystem block size is the usual 4096 bytes:

$ stat -f -c "%s" .
4096

Results:

$ ./holes 10 4095 4096 8191 8192 8193
   os  |   os2  |  wret  |  used 
----------------------------------
    10 |      0 |      4 |   4096
  4095 |      0 |      4 |   8192
  4096 |   4096 |      4 |   4096
  8191 |   4096 |      4 |   8192
  8192 |   8192 |      4 |   4096
  8193 |   8192 |      4 |   4096

Analysis:

Write offset 10: | data |

0 blocks of hole, 1 block of data. Data block starts at the beginning: os2 == 0.

Write offset 4095: | data | data |

0 blocks of hole, 2 blocks of data. Data block starts at the beginning: os2 == 0.

Write offset 4096: | hole | data |

1 block of hole, 1 block of data. Data starts at the second block: os2 == 4096.

Write offset 8191: | hole | data | data |

1 block of hole, 2 blocks of data. Data starts at the second block: os2 == 4096.

Write offset 8192: | hole | hole | data |

2 blocks of hole, 1 block of data. Data starts at the third block: os2 == 8192.

Write offset 8193: the same as 8192.
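
The block arithmetic behind this analysis can be sketched in C. This is only a sketch under the assumption that allocation is strictly block-granular, which a filesystem does not guarantee; expected_data_start and expected_bytes_used are hypothetical helper names, not part of any API.

```c
#include <assert.h>
#include <sys/types.h>

/* Offset that lseek(fd, 0, SEEK_DATA) is expected to return after a single
 * write at `offset` into an otherwise empty file, ASSUMING that holes and
 * data are tracked in whole blocks of `block_size` bytes. */
static off_t expected_data_start(off_t offset, off_t block_size)
{
    assert(offset >= 0 && block_size > 0);
    return (offset / block_size) * block_size;  /* round down to block start */
}

/* Bytes expected to be allocated for a write of `len` bytes at `offset`,
 * under the same assumption: every block the write touches is allocated. */
static off_t expected_bytes_used(off_t offset, off_t len, off_t block_size)
{
    assert(offset >= 0 && len > 0 && block_size > 0);
    off_t first = offset / block_size;              /* first block touched */
    off_t last  = (offset + len - 1) / block_size;  /* last block touched  */
    return (last - first + 1) * block_size;
}
```

With a block size of 4096 and a 4-byte write, these helpers give the same os2 and used values as the table above, for example 0 and 8192 for offset 4095.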

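The observed results match the expected ones. In the question, offset 10 lies inside the first block (assuming the same 4096-byte block size), so that whole block counts as data and SEEK_DATA from offset 0 correctly returns 0.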
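
If the goal is the offset of the first byte that actually holds a non-zero value, SEEK_DATA can only narrow the search; the data region still has to be scanned. A minimal sketch follows, where first_nonzero_byte is a hypothetical helper name. It cannot tell a written zero from one that was never written, and it assumes that falling back to a scan from offset 0 is acceptable when SEEK_DATA is unavailable.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for SEEK_DATA */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

/* Returns the offset of the first non-zero byte in fd,
 * or -1 if there is none or a read error occurs. */
static off_t first_nonzero_byte(int fd)
{
    assert(fd >= 0);

    /* Let the filesystem skip a leading hole, if it tracks one. */
    off_t pos = lseek(fd, 0, SEEK_DATA);
    if (pos < 0) {
        if (errno == ENXIO)
            return -1;   /* no data at or after offset 0 */
        pos = 0;         /* assumption: SEEK_DATA unavailable, scan everything */
    }

    unsigned char buf[4096];
    for (;;) {
        ssize_t n = pread(fd, buf, sizeof buf, pos);
        if (n <= 0)
            return -1;   /* end of file or read error */
        for (ssize_t i = 0; i < n; i++) {
            if (buf[i] != 0)
                return pos + i;
        }
        pos += n;        /* later holes simply read back as zeros */
    }
}
```

For a file like the one in the question, this returns 10 even though SEEK_DATA alone reports 0.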
