ceno980
ceno980

Reputation: 2011

Understanding code for creating a singly linked list using double pointer in C

I am trying to understand how the code below for creating a singly linked list works using a double pointer.

#include <stdio.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node* next;
};

void push(struct Node** headRef, int data) {

    struct Node* newNode = (struct Node*)malloc(sizeof(struct Node));
    newNode->data = data;   
    newNode->next = *headRef;
    *headRef = newNode;
}

//Function to implement linked list from a given set of keys using local references
struct Node* constructList(int keys[], int n) {
    struct Node *head = NULL; 
    struct Node **lastPtrRef = &head; 
    int i, j;
    for(i = 0; i < n; i++) {
        push(lastPtrRef, keys[i]);
        lastPtrRef = &((*lastPtrRef)->next); //this line
        if((*lastPtrRef) == NULL) {
            printf("YES\n");
        }
    }

    return head;
}

int main() {
    int keys[] = {1, 2, 3, 4};
    int n = sizeof(keys)/sizeof(keys[0]);

    //points to the head node of the linked list
    struct Node* head = NULL;

    head = constructList(keys, n); //construct the linked list

    struct Node *temp = head;

    while(temp != NULL) { //print the linked list
        printf(" %d -> ", temp->data);
        temp = temp->next;
    }
}

I understand the purpose of using the double pointer in the function push(), it allows you to change what the pointer headRef is pointing to inside the function. However in the function constructList(), I don't understand how the following line works:

lastPtrRef = &((*lastPtrRef)->next);

Initially lastPtrRef would be pointing to head which points to NULL. In the first call to push(), within the for loop in constructList(), the value that head points to is changed (it points to the new node containing the value 1). So after the first call to push(), lastPtrRef will be pointing to head which points to a node with the value of 1. However, afterwards the following line is executed:

lastPtrRef = &((*lastPtrRef)->next);

Whereby lastPtrRef is given the address of whatever is pointed to by the next member of the newly added node. In this case, head->next is NULL.

I am not really sure what the purpose of changing lastPtrRef after the call to push(). If you want to build a linked list, don't you want lastPtrRef to have the address of the pointer which points to the node containing 1, since you want to push the next node (which will containing 2) onto the head of the list (which is 1)?

In the second call to push() in the for loop in constructList, we're passing in lastPtrRef which points to head->next (NULL) and the value 2. In push() the new node is created, containing the value 2, and newNode->next points to head->next which is NULL. headRef in push gets changed so that it points to newNode (which contains 2).

Maybe I'm understanding the code wrong, but it seems that by changing what lastPtrRef points to, the node containing 1 is getting disregarded. I don't see how the linked list is created if we change the address lastPtrRef holds.

I would really appreciate any insights as to how this code works. Thank you.

Upvotes: 1

Views: 112

Answers (1)

WhozCraig
WhozCraig

Reputation: 66234

This uses a technique called forward-chaining, and I believe you already understand that (using a pointer-to-pointer to forward-chain a linked list construction).

This implementation is made confusing by the simple fact that the push function seems like it would be designed to stuff items on the head of a list, but in this example, it's stuffing them on the tail. So how does it do it?

The part that is important to understand is this seemingly trivial little statement in push:

newNode->next = *headRef

That may not seem important, but I assure you it is. The function push, in this case, does grave injustice to what this function really does. In reality it is more of a generic insert. Some fact about that function

  • It accepts a pointer-to-pointer headRef as an argument, as well as some data to put in to the linked list being managed.
  • After allocating a new node and saving the data within, it sets the new node's next pointer to whatever value is currently stored in the dereferenced headRef pointer-to-pointer (so.. a pointer) That's what the line I mentioned above accomplishes.
  • It then stores the new node's address at the same place it just pulled the prior address from; i.e. *headRef
  • Interestingly, it has no return value (it is void) further making this somewhat confusing. Turns out it doesn't need one.

Upon returning to the caller, at first nothing may seem to have changed. lastPtrRef still points to some pointer (in fact the same pointer as before; it must, since it was passed by value to the function). But now that pointer points to the new node just allocated. Further, that new node's next pointer points to whatever was in *lastPtrRef before the function call (i.e. whatever value was in the pointer pointed to by lastPtrRef before the function call).

That's important. That is what that line of code enforces, That means if you invoke this with lastPtrRef addressing a pointer pointing to NULL (such as head on initial loop entry), that pointer will receive the new node, and the new node's next pointer will be NULL. If you then change the address in lastPtrRef to point to the next pointer of the last-inserted node (which points to NULL; we just covered that), and repeat the process, it will hang another node there, setting that node's next pointer to NULL, etc. With each iteration, lastPtrRef addresses the last-node's next pointer, which is always NULL.

That's how push is being used to construct a forward linked list. One final thought. What would you get for a linked list if you had this:

#include <stdio.h>
#include <stdlib.h>

struct Node
{
    int data;
    struct Node* next;
};

void push(struct Node** headRef, int data)
{
    struct Node* newNode = (struct Node*)malloc(sizeof(struct Node));
    newNode->data = data;
    newNode->next = *headRef;
    *headRef = newNode;
}

int main()
{
    //points to the head node of the linked list
    struct Node* head = NULL;
    push(&head, 1);
    push(&head->next, 2);
    push(&head->next, 3);

    for (struct Node const *p = head; p; p = p->next)
        printf("%p ==> %d\n", p, p->data);
}

This seemingly innocent example amplifies why I said push is more of a generic insert than anything else. This just populates the initial head node.

push(&head, 1);

Then this appends to that node by using the address of the new node's next pointer as the first argument, similar to what your constructList is doing, but without the lastPtrRef variable (we don't need it here):

push(&head->next, 2);

But then this:

push(&head->next, 3);

Hmmm. Same pointer address as the prior call, so what will it do? Hint: remember what that newNode->next = *headRef line does (I droned on about it forever; I hope something stuck).

The output of the program above is this (obviously the actual address values will be different, dependent to your instance and implementation):

0x100705950 ==> 1
0x10073da90 ==> 3
0x100740b90 ==> 2

Hope that helps.

Upvotes: 1

Related Questions