Reputation: 27
I have a problem with parsing a .csv file. I have a struct world
defined like this:
typedef struct world
{
char worldName[30];
int worldId;
char *message;
char **constellationArray;
struct world *next;
} tWorld;
And I have a .csv file laid out like this (here the 'c' in csv stands for 'semicolon'):
worldId;worldName;message;constellationArray
1;K'tau;Planeta pod ochranou Freyra;Aquarius;Crater;Orion;Sagittarius;Cetus;Gemini;Earth
2;Martin's homeworld;Znicena;Aries;Sagittarius;Monoceros;Serpens;Caput;Scutum;Hydra;Earth
3;...
The task seems simple: write a function loadWorlds(char *file) that loads the file and parses it. The number of constellations per world is not guaranteed. Each new line signals a new world, and I have to create a linked list of these worlds. I have a rough idea of how to do this, but I can't make it work. I have a function called tWorld *createWorld(), which is implemented as such:
tWorld *createWorld() {
tWorld *world;
world = (tWorld *)malloc(sizeof(tWorld));
return world;
}
I have to use this method inside my loadWorlds(char *file). Plus I have to serialize them into the linked list with this:
if (*lastWorld == NULL){
*lastWorld = nextWorld;
}else{
(*actualWorld)->next = nextWorld;
}
*actualWorld = nextWorld;
But I don't know when to use it. This is my rough sketch of loadWorlds(char *file):
void loadWorlds(char *file)
{
FILE *f;
char text[30];
char letter;
tWorld *lastWorld = NULL, *actualWorld = NULL, *world;
//f = fopen(file, "r");
if(!(f = fopen(file, "r")))
{
printf("File does not exist! \n");
while(!kbhit());
}
else
{
while(!(feof(f)) && (letter = fgetc(f))!= '\n')
{
if((letter = fgetc(f)) != ';')
{
}
}
}
}
I would be grateful for any ideas to make this work.
Reputation: 84559
The question "How do I parse this file?... (Plus I have to serialize them into the linked list)" is a non-trivial undertaking when taken as a whole. "How do I parse this file?" is a question in its own right. The second part, regarding the linked list, is a separate issue that is not explained in enough detail, though it appears you are referring to a singly linked list. There are as many different ways to approach this as there are labels of wine. I'll attempt to provide an example of one approach to help you along.
In the example below, rather than keeping a single static character array worldName within a tWorld struct where all other strings are dynamically allocated, I've changed worldName to a character pointer as well. If you must use a static array of chars, that can be changed easily, but as long as you are allocating the remainder of the strings, it makes sense to allocate worldName as well.
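If worldName does have to stay a fixed char worldName[30] as in the question, the parsed field can be copied into it with truncation instead of being duplicated on the heap. The following is only a minimal sketch: copy_field is a hypothetical helper (it is not part of the code further down), and WNAME simply mirrors the size 30 from the question's struct.

```c
#include <stddef.h>
#include <string.h>

#define WNAME 30    /* assumed: size of worldName in the question's struct */

/* Copy src into the fixed buffer dst of dstsz bytes, truncating if needed.
 * dst is always nul-terminated when dstsz > 0.
 * Returns 1 if the whole string fit, 0 if it was truncated or on bad input. */
int copy_field (char *dst, size_t dstsz, const char *src)
{
    if (!dst || !src || dstsz == 0) return 0;

    size_t len = strlen (src);
    size_t n = (len < dstsz) ? len : dstsz - 1;  /* leave room for '\0' */

    memcpy (dst, src, n);
    dst[n] = '\0';

    return len < dstsz;
}
```

With a node that keeps the array member, a call such as copy_field (node->worldName, sizeof node->worldName, p) would take the place of the strdup call used for the pointer version, and free_node would then no longer free worldName.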
As to the parsing part of the question, you can use any number of library functions identified in the comments, or you can simply use a couple of pointers and step through each line, parsing each string as required. Either approach is fine. The only benefit of using simple pointers (aside from the learning aspect) is avoiding repetitive function calls, which in some cases can be a bit more efficient. One note when parsing data from a dynamically allocated line: make sure you preserve the starting address of the buffer to ensure the allocated memory can be properly tracked and freed. Some of the library functions clobber the original buffer (e.g. strtok, etc.), which can cause interesting errors if you pass the buffer itself without, in some way, preserving the original start address.
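To illustrate the point about clobbered buffers, here is a small sketch using strtok. The helper name count_fields is made up for this illustration. It tokenizes a heap copy of the line, so the caller's string is left untouched, and it keeps the start address of the copy in buf so that the same pointer can be handed to free.

```c
#include <stdlib.h>
#include <string.h>

/* Count the ';'-separated fields in line without modifying line itself.
 * strtok writes '\0' over each delimiter it finds, so it is run on a copy.
 * Returns the number of non-empty fields, or -1 on allocation failure. */
int count_fields (const char *line)
{
    size_t len = strlen (line);
    char *buf = malloc (len + 1);       /* start address, kept for free() */
    if (!buf) return -1;
    memcpy (buf, line, len + 1);        /* copy including the final '\0' */

    int n = 0;
    /* tok moves through the copy; buf itself is never changed */
    for (char *tok = strtok (buf, ";"); tok; tok = strtok (NULL, ";"))
        n++;

    free (buf);                         /* free the original start address */
    return n;
}
```

Note that strtok treats a run of delimiters as one, so an empty field such as the middle one in "a;;b" is silently skipped. If empty fields matter in your data, stepping through the line with pointers is the safer choice.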
The function read_list_csv below parses each line read from the csv file (actually semicolon-separated values) into the members of the tWorld struct using a pair of character pointers. read_list_csv then calls ins_node_end to insert each filled and allocated tWorld node into a singly linked circular list. The parsing is commented to help explain the logic, but in summary it sets a starting pointer p to the beginning of the line, then advances an ending pointer ep one character at a time until a semicolon ; is found, temporarily sets the ; to \0 (nul), and reads the string pointed to by p. The temporary \0 is then replaced with the original ; and the process repeats beginning with the following character, until the line has been completely parsed.
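The start-pointer/end-pointer idea can be isolated into a small helper. This is a sketch of a variation on the approach, not the exact code used in read_list_csv: next_field is a hypothetical name, it copies each field by length rather than temporarily overwriting the ;, and it also stops at the end of the string so a short line cannot run the pointer past the buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of the field that starts at *pp and ends at the next
 * ';' or at the end of the string, then advance *pp past the delimiter.
 * *pp is set to NULL once the last field has been returned.
 * Returns NULL when no fields are left or on allocation failure. */
char *next_field (char **pp)
{
    if (!pp || !*pp) return NULL;

    char *p = *pp;                      /* start of the field */
    char *ep = p;                       /* end pointer */
    while (*ep && *ep != ';') ep++;     /* stop at ';' or at the final nul */

    size_t len = (size_t)(ep - p);
    char *field = malloc (len + 1);
    if (!field) return NULL;
    memcpy (field, p, len);
    field[len] = '\0';

    *pp = (*ep == ';') ? ep + 1 : NULL; /* NULL signals end of line */
    return field;
}
```

A caller would set char *p = line; once and then call next_field (&p) repeatedly: the first three results are worldId (to be converted with atoi or strtol), worldName and message, and every result after that is a constellation. Each returned string must be freed by the caller.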
The linked-list part of your question is somewhat more involved. It is complicated by many linked-list examples being only partially explained and often only partially correct. Further, a linked list is of little use unless you can add to it, read from it, remove from it, and get rid of it without leaking memory like a sieve. When you look at examples, note there are two primary forms linked lists take: HEAD/TAIL lists and circular lists. Both can be either singly or doubly linked. HEAD/TAIL lists generally use separate pointers for the list start, or HEAD, and the list end, or TAIL, node (whose next pointer is generally set to NULL). Circular lists simply have the end node's next pointer point back to the beginning of the list. Both have their uses. The primary benefit of the circular list is that you can traverse the list from any node to any other node, regardless of where you start in the list (since there is no end node, you can iterate through all nodes starting from any node).
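The difference between the two forms can be shown with a stripped-down node. This is only a sketch: node_t, append_tail and count_from are names made up for the illustration and are not the tWorld functions used in the example. append_tail follows the same pattern as the lastWorld/actualWorld snippet in the question, and count_from shows that a circular list can be walked completely from any starting node.

```c
#include <stddef.h>

/* Minimal node for illustration only (hypothetical, not the tWorld struct) */
typedef struct node {
    int id;
    struct node *next;
} node_t;

/* HEAD/TAIL form: append n using separate head and tail pointers.
 * The last node's next pointer is NULL. */
void append_tail (node_t **head, node_t **tail, node_t *n)
{
    n->next = NULL;
    if (*head == NULL)          /* empty list: n becomes the first node */
        *head = n;
    else                        /* otherwise hook n after the current last node */
        (*tail)->next = n;
    *tail = n;                  /* n is now the last node */
}

/* Circular form: count the nodes starting from ANY node in the ring. */
size_t count_from (const node_t *start)
{
    if (!start) return 0;

    size_t n = 0;
    const node_t *iter = start;
    do {
        n++;
        iter = iter->next;
    } while (iter != start);    /* back at the starting node: full circle */

    return n;
}
```

With the HEAD/TAIL form, appending costs the same no matter how long the list is, because the tail pointer is kept up to date, and that is the step the question's snippet performs once per parsed line. The circular form in the example below trades that for not needing a second pointer, at the cost of walking to the end on each insert.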
The example below is a singly linked circular list. It provides functions for creating nodes, inserting them into the list, counting the nodes, printing the entire list, removing nodes from the list, and deleting the list. Importantly, it frees all memory allocated to the list.
Go through both the parsing part of the example and the linked-list part of the example and let me know if you have questions. While the list implementation should be fairly solid, there may be some undiscovered issues. The datafile used for testing as well as the sample output is shown following the code. The code expects the datafile as the first argument and an optional (zero-based) node to delete as a second argument (default: node 2):
#define _POSIX_C_SOURCE 200809L  /* strdup() is POSIX, not ISO C99/C11 */

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXL 256
// #define wName 30

typedef struct world
{
    // char worldName[wName];
    char *worldName;
    int worldId;
    char *message;
    char **constellationArray;
    struct world *next;
} tWorld;
/* allocate & populate node */
tWorld *create_node (int wid, char *wnm, char *msg, char **ca);
/* insert node into list */
tWorld *ins_node_end (tWorld **list, int wid, char *wnm, char *msg, char **ca);
/* read data from file fname and add to list */
tWorld *read_list_csv (tWorld **list, char *fname);
/* return the number of nodes in list */
size_t getszlist (tWorld *list);
/* print all nodes in list */
void print_list (tWorld *list);
/* free memory allocated to tWorld list node */
void free_node (tWorld *node);
/* (zero-based) delete of nth node */
void delete_node (tWorld **list, int nth);
/* delete tWorld list & free allocated memory */
void delete_list (tWorld *list);
int main (int argc, char **argv)
{
    if (argc < 2) {
        fprintf (stderr, "error: insufficient input. Usage: %s <filename> [del_row]\n", argv[0]);
        return 1;
    }

    char *fname = argv[1];
    tWorld *myworld = NULL;             /* create pointer to struct world */

    read_list_csv (&myworld, fname);    /* read fname and fill linked list */

    /* getszlist returns size_t, so the matching conversion is %zu */
    printf ("\n Read '%zu' records from file: %s\n\n", getszlist (myworld), fname);

    print_list (myworld);               /* simple routine to print list */

    int nth = (argc > 2) ? atoi (argv[2]) : 2;
    printf ("\n Deleting node: %d\n\n", nth);
    delete_node (&myworld, nth);        /* delete a node from the list */

    print_list (myworld);               /* simple routine to print list */

    delete_list (myworld);              /* free memory allocated to list */

    return 0;
}
/* allocate & populate node */
tWorld *create_node (int wid, char *wnm, char *msg, char **ca)
{
    tWorld *node = NULL;

    node = malloc (sizeof *node);
    if (!node) return NULL;

    node-> worldId = wid;
    node-> worldName = wnm;
    node-> message = msg;
    node-> constellationArray = ca;

    return node;
}
/* insert node into list */
tWorld *ins_node_end (tWorld **list, int wid, char *wnm, char *msg, char **ca)
{
    tWorld *node = NULL;
    if (!(node = create_node (wid, wnm, msg, ca))) return NULL;

    if (!*list) {                       /* if empty, create first node */
        node-> next = node;
        *list = node;
    } else {                            /* insert as new end node */
        if (*list == (*list)-> next) {  /* second node, no need to iterate */
            (*list)-> next = node;
        }
        else                            /* iterate to end node & insert */
        {
            tWorld *iter = *list;       /* second copy to iterate list */
            for (; iter->next != *list; iter = iter->next) ;
            iter-> next = node;         /* insert node at end of list */
        }
        node-> next = *list;            /* set next pointer to list start */
    }

    return *list;                       /* provides return as confirmation */
}
/* read list from file fname and add to list */
tWorld *read_list_csv (tWorld **list, char *fname)
{
    FILE *fp = fopen (fname, "r");
    if (!fp) {
        fprintf (stderr, "%s() error: file open failed for '%s'\n", __func__, fname);
        return NULL;
    }

    /* allocate and initialize all variables */
    char *line = calloc (MAXL, sizeof *line);
    char *p = NULL;
    char *ep = NULL;
    char *wnm = NULL;
    int wid = 0;
    int lcnt = 0;
    char *msg = NULL;
    char **ca = NULL;
    size_t idx = 0;

    if (!line) {                            /* validate the line buffer */
        fprintf (stderr, "%s() error allocation failed for 'line'.\n", __func__);
        fclose (fp);
        return NULL;
    }

    while (fgets (line, MAXL, fp))          /* for each line in file */
    {
        if (lcnt++ == 0) continue;          /* skip header row */

        p = line;
        idx = 0;

        size_t len = strlen (line);         /* get line length */
        if (len && line[len-1] == '\n')     /* strip newline from end */
            line[--len] = 0;

        /* skip blank or short lines: the scans below rely on finding
           at least three ';' and would otherwise run past the buffer */
        if (!(ep = strchr (p, ';')) || !(ep = strchr (ep + 1, ';')) ||
            !strchr (ep + 1, ';'))
            continue;
        ep = p;

        while (*ep != ';') ep++;            /* parse worldId */
        *ep = 0;
        wid = atoi (p);
        *ep++ = ';';
        p = ep;

        while (*ep != ';') ep++;            /* parse worldName */
        *ep = 0;
        wnm = strdup (p);
        *ep++ = ';';
        p = ep;

        while (*ep != ';') ep++;            /* parse message */
        *ep = 0;
        msg = strdup (p);
        *ep++ = ';';
        p = ep;

        ca = calloc (MAXL, sizeof *ca);     /* allocate constellationArray */
        if (!ca) {
            fprintf (stderr, "%s() error allocation failed for 'ca'.\n", __func__);
            free (wnm);                     /* release what this line allocated */
            free (msg);
            free (line);
            fclose (fp);
            return NULL;
        }

        while (*ep)                         /* parse ca array elements */
        {
            if (*ep == ';')
            {
                *ep = 0;
                ca[idx++] = strdup (p);
                *ep = ';';
                p = ep + 1;
                /* if (idx == MAXL) reallocate ca */
            }
            ep++;
        }
        if (*p) ca[idx++] = strdup (p);     /* add last element in line */

        ins_node_end (list, wid, wnm, msg, ca);     /* add to list */
    }

    /* close file & free line */
    fclose (fp);
    free (line);

    return *list;
}
/* return the number of nodes in list */
size_t getszlist (tWorld *list)
{
    const tWorld *iter = list;  /* pointer to iterate list */
    size_t cnt = 0;             /* same type as the return value */

    if (iter == NULL) {
        fprintf (stdout,"%s(), The list is empty\n",__func__);
        return 0;
    }

    for (; iter; iter = (iter->next != list ? iter->next : NULL)) {
        cnt++;
    }

    return cnt;
}
/* print all nodes in list */
void print_list (tWorld *list)
{
    const tWorld *iter = list;  /* pointer to iterate list */
    register int idx = 0;
    char *stub = " ";

    if (iter == NULL) {
        fprintf (stdout,"%s(), The list is empty\n",__func__);
        return;
    }

    for (; iter; iter = (iter->next != list ? iter->next : NULL)) {
        printf (" %2d %-20s %-20s\n",
                iter-> worldId, iter-> worldName, iter-> message);
        idx = 0;
        while ((iter-> constellationArray)[idx])
            printf ("%38s %s\n", stub, (iter-> constellationArray)[idx++]);
    }
}
/* free memory allocated to tWorld list node */
void free_node (tWorld *node)
{
    if (!node) return;

    free (node-> worldName);            /* free (NULL) is a no-op */
    free (node-> message);

    if (node-> constellationArray) {    /* check the array before indexing it */
        for (size_t i = 0; node-> constellationArray[i]; i++)
            free (node-> constellationArray[i]);
        free (node-> constellationArray);
    }

    free (node);
}
/* (zero-based) delete of nth node */
void delete_node (tWorld **list, int nth)
{
    /* test that list exists */
    if (!*list) {
        fprintf (stdout,"%s(), The list is empty\n",__func__);
        return;
    }

    /* get list size */
    int szlist = getszlist (*list);

    /* validate node to delete */
    if (nth >= szlist || nth < 0) {
        fprintf (stderr, "%s(), error: delete out of range (%d). allowed: (0 <= nth <= %d)\n",
                 __func__, nth, szlist-1);
        return;
    }

    /* create node pointers */
    tWorld *victim = *list;
    tWorld *prior = victim;

    /* if nth 0, prior is last, otherwise node before victim */
    if (nth == 0) {
        for (; prior->next != *list; prior = prior->next) ;
    } else {
        while (nth-- && victim-> next != *list) {
            prior = victim;
            victim = victim-> next;
        }
    }

    /* non-self-reference node, rewire next */
    if (victim != victim->next) {
        prior-> next = victim-> next;
        /* if deleting node 0, change list pointer address */
        if (victim == *list)
            *list = victim->next;
    } else {    /* if self-referenced, last node, delete list */
        *list = NULL;
    }

    free_node (victim);     /* free memory associated with node */
}
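/* delete tWorld list */
void delete_list (tWorld *list)
{
    if (!list) return;

    tWorld *iter = list-> next;     /* start after the first node */
    while (iter != list) {          /* first node marks the end of the ring */
        tWorld *next = iter-> next; /* save next before the node is freed */
        free_node (iter);
        iter = next;
    }
    free_node (list);               /* finally free the first node */
}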
input test data file:
$ cat dat/struct.csv
worldId;worldName;message;constellationArray
1;K'tau;Planeta pod ochranou Freyra;Aquarius;Crater;Orion;Sagittarius;Cetus;Gemini;Earth
2;Martin's homeworld;Znicena;Aries;Sagittarius;Monoceros;Serpens;Caput;Scutum;Hydra;Earth
3;Martin's homeworld2;Znicena2;Aries2;Sagittarius2;Monoceros2;Serpens2;Caput2;Scutum2;Hydra2;Earth2
4;Martin's homeworld3;Znicena3;Aries3;Sagittarius3;Monoceros3;Serpens3;Caput3;Scutum3;Hydra3;Earth3
output:
$ ./bin/struct_ll_csv dat/struct.csv 1
Read '4' records from file: dat/struct.csv
1 K'tau Planeta pod ochranou Freyra
Aquarius
Crater
Orion
Sagittarius
Cetus
Gemini
Earth
2 Martin's homeworld Znicena
Aries
Sagittarius
Monoceros
Serpens
Caput
Scutum
Hydra
Earth
3 Martin's homeworld2 Znicena2
Aries2
Sagittarius2
Monoceros2
Serpens2
Caput2
Scutum2
Hydra2
Earth2
4 Martin's homeworld3 Znicena3
Aries3
Sagittarius3
Monoceros3
Serpens3
Caput3
Scutum3
Hydra3
Earth3
Deleting node: 1
1 K'tau Planeta pod ochranou Freyra
Aquarius
Crater
Orion
Sagittarius
Cetus
Gemini
Earth
3 Martin's homeworld2 Znicena2
Aries2
Sagittarius2
Monoceros2
Serpens2
Caput2
Scutum2
Hydra2
Earth2
4 Martin's homeworld3 Znicena3
Aries3
Sagittarius3
Monoceros3
Serpens3
Caput3
Scutum3
Hydra3
Earth3