user12577723

How to read files in a directory in lexical order?

Before asking this question, I looked at this thread: How can I read the files in a directory in sorted order? However, that was a Perl-related question, and I couldn't extract the information I needed to solve my problem.

Here is the function I wrote:

SEQUENCE *init_TSEQ(int nseq)
{
    DIR *D=opendir("sequences");
    struct dirent *entry;
    SEQUENCE *TSEQ=malloc(sizeof(SEQUENCE)*nseq);
    FILE *F;
    int i=0;

    chdir("sequences");

    while(((entry=readdir(D))!=NULL) && (i<nseq))
    {
        if(entry->d_type==DT_REG)
        {
            char seq[MAXSIZE];

            F=fopen(entry->d_name, "r");
            fscanf(F, "%s", seq);
            TSEQ[i].lenght=strlen(seq);

            for (int j=0; j<TSEQ[i].lenght; j++)
            {
                fscanf(F, "%c", seq);
                TSEQ[i].c[j]=seq[j];
            }

            fclose(F);
            i++;
        }
    }

    closedir(D);

    return TSEQ;
}

It seems that this function reads the files in whatever order the directory stores them, but I would like to read them in lexical order (their names are seq1, seq2, etc.). How can I do that? This matters because the sequences are stored in the TSEQ variable in the order in which they are read.

EDIT: Based on weston's and Shawn's tips, I wrote this function.

SEQUENCE *init_TSEQ(int nseq)
{
    SEQUENCE *TSEQ=malloc(sizeof(SEQUENCE)*nseq);
    struct dirent **namelist;
    FILE *F;
    char seq[MAXSIZE];
    int n;

    chdir("sequences");

    n=scandir("sequences", &namelist, 0, alphasort);

    if(n>=0)
    {
        for(int i=0; i<n; i++)
        {
            F=fopen(namelist[i]->d_name, "r");
            fscanf(F, "%s", seq);
            TSEQ[i].lenght=strlen(seq);

            for (int j=0; j<TSEQ[i].lenght; j++)
            {
                fscanf(F, "%c", seq);
                TSEQ[i].c[j]=seq[j];
            }
            fclose(F);
            free(namelist[i]);
        }
        free(namelist);
    }

    return TSEQ;
}

However, when I try to display a sequence (stored with TSEQ), valgrind says that TSEQ.lenght is uninitialized.

Upvotes: 0

Views: 315

Answers (2)

user12577723

I didn't manage to cycle through all the files in alphabetical order, but since all my files are named seq1, seq2, etc., I did it another way. Here is how:

SEQUENCE *init_TSEQ(int nseq)
{
    SEQUENCE *TSEQ=malloc(sizeof(SEQUENCE)*(nseq+1));
    char seq[MAXSIZE];
    FILE *F;

    chdir("sequences");

    for(int i=1; i<=nseq; i++)
    {
        char buf[0x100];

        snprintf(buf, sizeof(buf), "seq%d.txt", i);
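        F=fopen(buf, "r");
        if(F==NULL || fscanf(F, "%s", seq)!=1)
        {
            /* missing or empty file: mark the sequence as empty */
            if(F!=NULL) fclose(F);
            TSEQ[i].lenght=0;
            continue;
        }
        TSEQ[i].lenght=strlen(seq);

        /* copy the sequence just read into the structure */
        memcpy(TSEQ[i].c, seq, TSEQ[i].lenght);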

        fclose(F);
    }

    return TSEQ;
}

Upvotes: 0

weston

Reputation: 54801

  1. Read file names into a list.
  2. Sort list to taste.
  3. Now loop over this list and process each file as before.
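
The first two steps above could be sketched roughly as follows. This is only an illustration, not code from the question: the names list_sorted and cmp_natural are made up for the example, and it assumes a POSIX system with <dirent.h>. Plain lexical order (strcmp, or alphasort with scandir) puts seq10 before seq2, so the comparator here compares runs of digits by value instead. That is one possible reading of "to taste".

```c
#include <ctype.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical qsort comparator: orders names so that runs of digits
   compare by numeric value ("seq2.txt" sorts before "seq10.txt"). */
int cmp_natural(const void *a, const void *b)
{
    const char *s = *(const char *const *)a;
    const char *t = *(const char *const *)b;

    while (*s && *t) {
        if (isdigit((unsigned char)*s) && isdigit((unsigned char)*t)) {
            char *es, *et;
            unsigned long x = strtoul(s, &es, 10);
            unsigned long y = strtoul(t, &et, 10);
            if (x != y)
                return x < y ? -1 : 1;
            s = es;              /* numbers equal: skip past both */
            t = et;
        } else {
            if (*s != *t)
                return (unsigned char)*s < (unsigned char)*t ? -1 : 1;
            s++;
            t++;
        }
    }
    /* the shorter name sorts first when one is a prefix of the other */
    return (*s != '\0') - (*t != '\0');
}

/* Hypothetical helper for steps 1 and 2: copies the entry names of
   `path` into a malloc'd array and sorts it with `cmp`.
   Returns the number of names, or -1 on error. The caller frees
   each name and then the array itself. */
int list_sorted(const char *path, char ***out,
                int (*cmp)(const void *, const void *))
{
    DIR *d = opendir(path);
    char **names = NULL;
    int n = 0, cap = 0;
    struct dirent *e;

    if (d == NULL)
        return -1;

    while ((e = readdir(d)) != NULL) {
        char *copy;

        if (e->d_name[0] == '.')    /* skip ".", ".." and hidden files */
            continue;
        if (n == cap) {             /* grow the array when it is full */
            int ncap = cap ? cap * 2 : 16;
            char **tmp = realloc(names, ncap * sizeof *tmp);
            if (tmp == NULL)
                goto fail;
            names = tmp;
            cap = ncap;
        }
        copy = malloc(strlen(e->d_name) + 1);
        if (copy == NULL)
            goto fail;
        strcpy(copy, e->d_name);
        names[n++] = copy;
    }
    closedir(d);

    if (n > 1)
        qsort(names, n, sizeof *names, cmp);
    *out = names;
    return n;

fail:
    while (n > 0)
        free(names[--n]);
    free(names);
    closedir(d);
    return -1;
}
```

Step 3 is then an ordinary loop from 0 to n-1 that opens names[i], processes it as before, and frees it. The names are relative to the listed directory, and a relative path is resolved against the current working directory, so after chdir("sequences") the directory to list is "." rather than "sequences".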

Upvotes: 4
