Reputation: 11
I am asked to extract rows of a csv file into a linkedlist, every node of the linkedlist will store one row of the data, where the "data" pointer inside the linkedlist points towards a struct, which has every columns of information (like double heading1; char *heading2 etc.) There may be quotes around values that are separated by commas and they should be treated as a whole as the data of that particular column. For example, one field of heading 3 may contain data such as "1, 2, 3" where "1, 2, 3" should be all displayed under heading3, instead of as 3 separate columns.
What I tried is to use two while loops:
//defining data structure here
typedef struct node{
suburb_t *data;
struct node *next;
}Node_t;
//function definition here...
while(fscanf("%d,%d,%d", &x, &y, &z) == 3){
Node_t node = (Node_t *)malloc(sizeof(Node_t));
node->data = (suburb_t *)malloc(sizeof(suburb_t));
node->data->content1 = x
//and so on
while(fscanf(.....) == ...){
//similar to first line
}
//and here I can not figure out how to keep creating new nodes, moving the current
//node to the next node until the outer while loop terminates.
}
The reason why I used double while loop is due to the existence of comma separated values quoted inside quotation marks, which I can then use the second while loop to read the quotation mark, store it inside a variable and read the content as a whole, until the end quotation markk is encountered. We can assume value with quotation marks will not be the first few values...
And my question is how do I create nodes, link it with the previous one iteratively until the first while loop terminates, that is either an EOF is reached or lesser than 3 values are read? My question is mainly towards the end of the code snippet I provided, I think I need to add something within that area but I just don't know what to add....
Upvotes: 1
Views: 106
Reputation: 149105
A csv file is logically a container of rows, each row containing fields. The hard part in writing a general csv parser is that a field is allowed to contain separator and newline characters provided it is enclosed in quotation marks.
This is far beyond what the scanf
family can do...
If you know the structure of the csv file (number and type of all fields) and you know that no field can contain a comma nor a new line, then it can be done through a single loop. Pseudocode:
read line by line until end of file
per line
alloc a new node
link it to previous one
parse the line into the node
Still that pseudo-code contains a number of caveats:
suburb_t
structure)In the general case, you have to read the file one character at a time and build a list of logical rows (each can use more than one line...) each containing a list of fields. Not really complex but way too long for a SO answer.
TL/DR: If you have no problem in parsing a row, your code could become:
// put here the maximum size of a line of your csv file + 1 (or slightly more if you want to be super safe
#define MAXLINE 256
typedef struct node{
suburb_t data; // no need for indirection here...
struct node *next;
}Node_t;
//function definition here...
FILE* fd = fopen(...); // open the csv file
// tests for open error omitted for brievety
char line[MAXLINE];
Node_t *first, **last = &first
while(NULL != fgets(line, sizeof(line), fd)) {
// test for empty line omitted for brievety
Node_t* node = malloc(sizeof(Node_t)); // never cast malloc in C!
node->next = NULL;
// parse the line into the data member of *node
...
*last = node;
last = &(node->next);
}
fclose(fd);
But beware, the above code has not been tested and can contain a number of typos or other stupid mistakes...
Upvotes: 0
Reputation: 2528
Since no details are provided for suburb_t
this is just an example of linking a list by adding the new node to the end of the list.
Pointers are needed to keep track of the first and last nodes.
#include <stdio.h>
#include <stdlib.h>
#define SIZE 15
typedef struct Node node;
struct Node {
int val;
node *next;
};
node *newnode ( int value) {
node *item = NULL;
if ( NULL == ( item = malloc ( sizeof *item))) {
return NULL;
}
item->val = value;
item->next = NULL; // sentinel to terminate list
return item;
}
void addnode ( node **list, node **last, node *add) {
if ( NULL == *list) { // empty list
*list = add;
*last = add;
}
else {
(*last)->next = add;
*last = add;
}
}
void showlist ( node *list) {
while ( list) {
printf ( "%2d ", list->val);
list = list->next;
}
printf ( "\n");
}
node *freelist ( node *list) {
while ( list) {
node *tmp = list;
list = list->next;
free ( tmp);
}
return list;
}
int main ( void) {
node *list = NULL;
node *last = NULL;
node *add = NULL;
for ( int each = 0; each < SIZE; ++each) {
if ( ( add = newnode ( each))) { // create node
addnode ( &list, &last, add); // add created node to end of list
}
}
showlist ( list);
list = freelist ( list);
return 0;
}
Upvotes: 1