Mauro

Reputation: 23

Issues filling a 2D array from a text file (CSV)

I am working on filling a 2D array by reading from a text file in which the items are separated by commas. I have tried two approaches and I am having issues with both.

first approach:

I am using strtok with a comma as the delimiter (I've read that I should avoid it, so I use strcpy to copy the original string that was read in to another one first). The first problem is that the program crashes unless I add extra spaces between the words I'm reading in. With the spaces added it works: it reads everything, and when I print each value as it is added to the 2D array, it looks correct. After the array has been filled, I print it with a nested for loop, and for some reason everything in the 2D array has been replaced by the last thing read from the text file. So my questions are: how do I make strtok not require the extra spaces, and why is the array getting overwritten when it appeared to be filled correctly at first?

#include <string.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    FILE *fp;
    char text[20], *token;
    char word[20];
    const char delimiters[] = ",";
    char *table[8][8];
    int i = 0;
    int j = 0;

    fp = fopen("board.txt", "r");
    if (fp == NULL)
    {
        printf("Error opening");
    }
    printf("\n\n");
    while (fscanf(fp, "%15s", text) != EOF)
    {
        strcpy(word, text);
        token = strtok(word, delimiters);

        table[i][j] = token;
        //print table values as they get added
        printf("table[%d][%d] = %s ", i, j, table[i][j]);

        //ghetto nested for loop
        j++;
        if (j >= 8)
        {
            i++;
            j = 0;
            printf("\n");
        }
    }

    printf("\n\n\ntable[0][3] = %s|", table[0][3]);
    printf("\n");

    for (i = 0; i < 8; i++)
    {
        //printf("\n");
        for (j = 0; j < 8; j++)
        {
            printf("table[%d][%d] = %s|", i, j, table[i][j]);
        }
        printf("\n");
    }
    return 0;
}

This is the data I'm reading from the text file:

-4,-2,-3,-5,-6,-3,-2,-4
-1,-1,-1,-1,-1,-1,-1,-1
 0, 0, 0, 0, 0, 0, 0, 0
 0, 0, 0, 0, 0, 0, 0, 0
 0, 0, 0, 0, 0, 0, 0, 0
 0, 0, 0, 0, 0, 0, 0, 0
+1,+1,+1,+1,+1,+1,+1,+1
+4,+2,+3,+5,+6,+3,+2,+100

But unless I add spaces, like this, it crashes:

-4, -2, -3, -5, -6, -3, -2, -4
-1, -1, -1, -1, -1, -1, -1, -1
 0, 0, 0, 0, 0, 0, 0, 0
 0, 0, 0, 0, 0, 0, 0, 0
 0, 0, 0, 0, 0, 0, 0, 0
 0, 0, 0, 0, 0, 0, 0, 0
+1, +1, +1, +1, +1, +1, +1, +1
+4, +2, +3, +5, +6, +3, +2, +100

second approach:

I am reading the text file one character at a time. When a comma is detected, all the preceding characters are added as a string, then it moves on to the next character and repeats until EOF. With this method I don't need the extra spaces, but whenever it gets to the end of a row it adds 2 items instead of one, so everything after that point is shifted. This happens at the end of every row, so when it's all done I am missing nRows items.

With this approach I also get the same issue as with the first approach: everything seems to be overwritten with the last value read from the text file. One more small issue: since a word is only recognized when the comma after it is detected, the last value in the file is not written to the array unless I add a trailing comma. I'm working around it by adding a comma, but it's not part of the file so I shouldn't rely on it.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    FILE *fp;
    char text[20];
    char *table[8][8] = {0};
    char word[30];
    //char *table[8][8];
    int i = 0;
    int j = 0;

    fp = fopen("board.txt", "r");
    if (fp == NULL)
    {
        printf("Error opening");
    }

    int word_i = 0;
    int c;
    while ((c = fgetc(fp)) != EOF)
    {
        if (c == ',')
        {
            //separate words with commas
            if (word_i > 0)
            {                                        
                text[word_i] = '\0';

                // strcpy(word, text);
                // table[i][j] = word;

                table[i][j] = text;
                printf("table[%d][%d] = %s |\t", i, j, table[i][j]);
                j++;

                if (j >= 8)
                {
                    i++;
                    j = 0;
                }
            }
            word_i = 0;
        }
        else
        {
            text[word_i] = c;
            ++word_i;
        }
    }

    printf("\n\n");
    //want to check that i manually modified table[0][0]=124
    for (i = 0; i < 8; i++)
    {
        //printf("\n");
        for (j = 0; j < 8; j++)
        {
            printf("table[%d][%d] = %s|", i, j, table[i][j]);
        }
        printf("\n");
    }
    return 0;
}

With this code I have to add a comma at the end of the text file so it reads the last value:

-4,-2,-3,-5,-6,-3,-2,-4
-1,-1,-1,-1,-1,-1,-1,-1
 0, 0, 0, 0, 0, 0, 0, 0
 0, 0, 0, 0, 0, 0, 0, 0
 0, 0, 0, 0, 0, 0, 0, 0
 0, 0, 0, 0, 0, 0, 0, 0
+1,+1,+1,+1,+1,+1,+1,+1
+4,+2,+3,+5,+6,+3,+2,+100,

I can post the output I'm getting if it's needed.

Any help would be greatly appreciated, thank you.

Upvotes: 1

Views: 87

Answers (1)

David C. Rankin

Reputation: 84551

Continuing on from the comment by @JonathanLeffler: using a line-oriented input function such as fgets() or POSIX getline() to read a line of data at a time ensures you consume a full line of input with each read from your file. You then simply parse the comma-separated values from the buffer holding that line.

There are several ways to separate each of the comma-separated values (and each will have variants depending on whether you want to preserve or discard the whitespace surrounding a field). You can always use a start_pointer and an end_pointer: move end_pointer to locate the next ',', copy the characters (the token) from start_pointer up to end_pointer, then set start_pointer = ++end_pointer and repeat until you reach the end of the buffer.

If you have no empty-fields (meaning your data doesn't have adjacent ',' delimiters, e.g. -4,-2,,-5,...) then using strtok() is a simple way to split the buffer into tokens. If you have empty-fields, then if your compiler provides BSD strsep() it will handle empty-fields, or simply using a combination of strcspn() and strspn() (or in the case of a single ',' delimiter using strchr() instead) will allow you to automate walking a pair of pointers through the buffer.

A very simple implementation with strtok() to separate each line into tokens (reading your file from stdin) would be:

#include <stdio.h>
#include <string.h>

#define MAXC 1024

int main (void) {

    char buf[MAXC];                         /* buffer to hold each line */

    while (fgets (buf, MAXC, stdin)) {      /* read each line into buf */
        /* split buf into tokens using strtok */
        for (char *tok = strtok (buf, ","); tok; tok = strtok (NULL, ",")) {
            tok[strcspn (tok, "\n")] = 0;   /* trim '\n' from end tok */
            /* output board (space before if not 1st) */
            printf (tok != buf ? " %s" : "%s", tok);
        }
        putchar ('\n');
    }
}

(note: with printf a simple ternary operator is used to put a space before all fields except the first -- you can change the output formatting to anything you like. Also note that checking if strlen(buf) + 1 == MAXC && buf[MAXC-2] != '\n' to validate that the entire line fit in buf was intentionally omitted and left to you to implement)

The for loop above is just a condensed way of combining the call that gets the first token (where the first parameter to strtok is the string itself) with the calls that get each subsequent token (where the first parameter to strtok is NULL), while checking tok != NULL to validate that strtok returned a token. It can also be written as a while() loop if that is easier to read, e.g.

        /* split buf into tokens using strtok */
        char *tok = strtok (buf, ",");      /* separate 1st token */
        while (tok) {                       /* validate tok != NULL */
            tok[strcspn (tok, "\n")] = 0;   /* trim '\n' from end tok */
            /* output board (space before if not 1st) */
            printf (tok != buf ? " %s" : "%s", tok);
            tok = strtok (NULL, ",");       /* get next token */
        }

(both are equivalent loops for separating the comma-separated tokens from buf)

Example Input File

$ cat dat/board-8x8.txt
-4,-2,-3,-5,-6,-3,-2,-4
-1,-1,-1,-1,-1,-1,-1,-1
 0, 0, 0, 0, 0, 0, 0, 0
 0, 0, 0, 0, 0, 0, 0, 0
 0, 0, 0, 0, 0, 0, 0, 0
 0, 0, 0, 0, 0, 0, 0, 0
+1,+1,+1,+1,+1,+1,+1,+1
+4,+2,+3,+5,+6,+3,+2,+100

Example Use/Output

Outputting the data simply separating each token with a space yields:

$ ./bin/strtok_board_csv < dat/board-8x8.txt
-4 -2 -3 -5 -6 -3 -2 -4
-1 -1 -1 -1 -1 -1 -1 -1
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
+1 +1 +1 +1 +1 +1 +1 +1
+4 +2 +3 +5 +6 +3 +2 +100

Allocating Storage for Each Pointer in table

When you declare char *table[ROW][COL]; you have declared a 2D array of pointers to char. In order to use the pointers, you must either assign the address for a valid existing block of memory to each pointer, or you must allocate a new block of memory sufficient to hold tok and assign the starting address for each such block to each of your pointers in turn. You can't simply assign, e.g. table[i][j] = tok; due to tok pointing to an address within buf that will be overwritten with something new each time a new line is read.

Instead you need to allocate sufficient memory to hold the contents of tok (e.g. strlen(tok) + 1 bytes), assign the resulting new block of memory to your table[i][j] pointer, and then copy tok to that new block of memory. You can do that similar to:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ROW     8       /* if you need a constant, #define one (or more) */
#define COL   ROW
#define MAXC 1024

int main (void) {

    char buf[MAXC],                         /* buffer to hold each line */
        *table[ROW][COL] = {{NULL}};        /* 2D array of pointers */
    size_t row = 0;
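    while (row < ROW && fgets (buf, MAXC, stdin)) { /* read up to ROW lines */
        size_t col = 0;
        /* split buf into tokens using strtok */
        for (char *tok = strtok (buf, ","); tok; tok = strtok (NULL, ",")) {
            size_t len;
            if (col == COL) {   /* protect table bounds if too many tokens */
                fprintf (stderr, "error: too many columns, row %zu\n", row);
                exit (EXIT_FAILURE);
            }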
            tok[strcspn (tok, "\n")] = 0;   /* trim '\n' from end tok */
            len = strlen (tok);
            if (!(table[row][col] = malloc (len + 1))) {  /* allocate/validate */
                perror ("malloc-table[row][col]");
                exit (EXIT_FAILURE);
            }
            memcpy (table[row][col++], tok, len + 1);   /* copy tok to table */
        }
        if (col != COL) {   /* validate COL tokens read from buf */
            fprintf (stderr, "error: insufficient columns, row %zu\n", row);
            exit (EXIT_FAILURE);
        }
        row++;  /* increment row counter */
    }

    for (size_t i = 0; i < row; i++) {      /* loop rows */
        for (size_t j = 0; j < COL; j++) {  /* loop COLS */
            /* output board from table (space before if not 1st) */
            printf (j > 0 ? " %s" : "%s", table[i][j]);
            free (table[i][j]);             /* free allocated memory */
        }
        putchar ('\n');
    }
}

(example input and output are the same)

Memory Use/Error Check

In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.

It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.

For Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use; just run your program through it.

$ valgrind ./bin/strtok_board_table_csv < dat/board-8x8.txt
==3469== Memcheck, a memory error detector
==3469== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==3469== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==3469== Command: ./bin/strtok_board_table_csv
==3469==
-4 -2 -3 -5 -6 -3 -2 -4
-1 -1 -1 -1 -1 -1 -1 -1
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
+1 +1 +1 +1 +1 +1 +1 +1
+4 +2 +3 +5 +6 +3 +2 +100
==3469==
==3469== HEAP SUMMARY:
==3469==     in use at exit: 0 bytes in 0 blocks
==3469==   total heap usage: 66 allocs, 66 frees, 5,314 bytes allocated
==3469==
==3469== All heap blocks were freed -- no leaks are possible
==3469==
==3469== For counts of detected and suppressed errors, rerun with: -v
==3469== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Always confirm that you have freed all memory you have allocated and that there are no memory errors.

Let me know if you have any further questions.

Upvotes: 2
