Sarah

Reputation: 516

Stored value disappears when setting a struct pointer to null in C++

I'm writing a C++ application to do a word search across a large database of song lyrics. To start, I'm taking each word and putting it into a Word struct that looks like this:

struct Word{
    char* clean;
    int size;
    int position;
    SongId id;
    Word* same;
    Word* diff;
};

I have a "makeNode" function that does the following:

  1. takes in each word
  2. creates a new Word struct and adds the word to it
  3. creates a Word* called node which points to the new word
  4. stores the pointer in a hash table.

In my makeNode function, I set node->clean to my "clean" word. I can print the word by sending node->clean to cout. But when I set node->same to NULL, I lose node->clean. I don't lose node->position or node->size. If I remove the line where I assign NULL to node->same, I do not lose node->clean.

char* clean = cleanse(word);
Word* node = new Word;
node->size = strlen(word);
node->clean = clean;
cout<<"MADE NODE FOR "<<node->clean<<endl;
node->position = position;
cout<<"4 node clean: "<<node->clean<<endl;
node->id = id;
cout<<"5 node clean: "<<node->clean<<endl;
node->same = NULL;
cout<<"6 node clean: "<<node->clean<<endl;
cout<<"node position: "<<node->position<<endl;
cout<<"node size: "<<node->size<<endl;
node->diff = NULL;

yields the following output:

MADE NODE FOR again
4 node clean: again
5 node clean: again
6 node clean:
node position: 1739
node size: 6
0 node clean:
1 node clean:
3 node clean: 

Can anyone help me get past this error? If you need more info, let me know. Thanks in advance!

EDIT: here is the cleanse function.

char* SongSearch::cleanse(char* dirty)
{

string clean;
int iter = 0;
while (!isalnum(dirty[iter]))
{
    iter++;
}
while(dirty[iter]!='\0')
{
    clean += dirty[iter];
    iter++;
}

int backiter = clean.length() - 1;
while(!isalnum(clean[backiter]))
{
    clean.erase(backiter, 1);
    backiter--;
}


char c;
  for (int i = 0; i<clean.length(); i++)
{
    c = tolower(clean[i]);
    clean[i] = c;
}

char* toReturn = (char*)(clean.c_str());
return toReturn;
}

Upvotes: 3

Views: 1842

Answers (4)

Greg Domjan

Reputation: 14125

Aside from new and cout this might as well be C.

Some other reading
What are the differences between struct and class in C++?
char * Vs std::string
Remove spaces from std::string in C++
tolower function for C++ strings
How can I negate a functor in C++ (STL)?

Try the following alternative (uncompiled sample)

#include <algorithm>
#include <cctype>
#include <cstddef>
#include <iostream>
#include <string>

typedef int SongId;

class Word{
public:
  // Declared first: members are initialized in declaration order
  const std::string clean;

private:
    int position;
    SongId id;
    Word* same;
    Word* diff;

public:
  int size() const { return static_cast<int>(clean.length()); }

  Word( const std::string& word_, const int position_ = 1739, const SongId id_ = 0 )
    : clean( cleanse(word_) )
    , position( position_ )
    , id( id_ )
    , same( NULL )
    , diff( NULL )
  {
    std::cout<<"MADE NODE FOR "<< word_ << "\n"
      <<"node clean: "<< clean << "\n"
      <<"node position: "<< position << "\n"
      <<"node size: "<< size() << std::endl;
  }

  // Wrappers that cast to unsigned char, as the <cctype> functions require
  static bool notAlnum( char c ) { return !std::isalnum( static_cast<unsigned char>(c) ); }
  static char toLower( char c ) { return static_cast<char>( std::tolower( static_cast<unsigned char>(c) ) ); }

  static std::string cleanse( const std::string& dirty)
  {
    std::string clean( dirty );

// Remove anything that's not alphanumeric (anywhere in the word, not just at the ends)
    clean.erase(std::remove_if(clean.begin(), clean.end(), notAlnum ), clean.end());
// make it lower case
    std::transform( clean.begin(), clean.end(), clean.begin(), toLower);  // or boost::to_lower(str);

    return clean;
  }
};

const char *word = "again ";

int main() {
    Word* node = new Word(word);
    delete node;
}

Upvotes: 0

Steve Jessop

Reputation: 279395

The problem is probably that in cleanse, you return clean.c_str().

That pointer value ceases to be valid when clean ceases to exist, which is when the function exits. It is no longer guaranteed to point to anything, so it's pure luck that you're ever seeing the string "again" as expected.

What I suspect happens is that the memory that used to hold the data for the string clean in cleanse has been reused for the Word structure, but is not immediately overwritten. It just so happens that the byte that used to hold the first a now holds part of the same member of your struct. So when you write a null pointer to node->same, it has the effect of writing a 0 byte to the location pointed to by node->clean. Thereafter, node->clean appears to point to an empty string.
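One way to avoid the dangling pointer is to have cleanse return a std::string by value and let the struct own its characters. The following is only a sketch under a few assumptions: cleanse is shown as a free function rather than a member of SongSearch, SongId is replaced by a plain int, and the makeNode signature is a guess, since the original was not posted.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical replacement for SongSearch::cleanse: trims leading and
// trailing non-alphanumeric characters, then lowercases what is left.
// Returning std::string by value hands the caller its own copy.
std::string cleanse(const std::string& dirty)
{
    std::size_t first = 0;
    while (first < dirty.size() &&
           !std::isalnum(static_cast<unsigned char>(dirty[first])))
        ++first;

    std::size_t last = dirty.size();
    while (last > first &&
           !std::isalnum(static_cast<unsigned char>(dirty[last - 1])))
        --last;

    std::string clean = dirty.substr(first, last - first);
    for (std::size_t i = 0; i < clean.size(); ++i)
        clean[i] = static_cast<char>(
            std::tolower(static_cast<unsigned char>(clean[i])));
    return clean;
}

struct Word {
    std::string clean;  // owns its characters, so nothing can dangle
    int size;
    int position;
    int id;             // stand-in for SongId (assumption)
    Word* same;
    Word* diff;
};

// Assumed signature; the caller is responsible for deleting the node.
Word* makeNode(const std::string& word, int position, int id)
{
    Word* node = new Word;
    node->clean = cleanse(word);
    node->size = static_cast<int>(word.size());  // length of the uncleaned word, as in the question
    node->position = position;
    node->id = id;
    node->same = NULL;
    node->diff = NULL;
    assert(node->clean.size() <= word.size());   // cleaning never lengthens a word
    return node;
}
```

If the struct has to keep a char* for some reason, the alternative is to allocate a copy (for example with new char[] and strcpy) and delete[] it when the node is destroyed, but then the ownership has to be managed by hand.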

Upvotes: 2

Steve Jessop

Reputation: 279395

You need to reduce your code to a minimal example which displays the problem, and post that.

The following code fails to display the problem. The contents of main and the definition of Word are copied from your code, then I have added code as necessary to get it to compile:

#include <iostream>
#include <cstring>
using namespace std;

typedef int SongId;

struct Word{
    char* clean;
    int size;
    int position;
    SongId id;
    Word* same;
    Word* diff;
};

char *cleanse(const char *w) {
    return (char *)w;
}
const char *word = "again ";
const int position = 1739;
const int id = 0;

int main() {
    char* clean = cleanse(word);
    Word* node = new Word;
    node->size = strlen(word);
    node->clean = clean;
    cout<<"MADE NODE FOR "<<node->clean<<endl;
    node->position = position;
    cout<<"4 node clean: "<<node->clean<<endl;
    node->id = id;
    cout<<"5 node clean: "<<node->clean<<endl;
    node->same = NULL;
    cout<<"6 node clean: "<<node->clean<<endl;
    cout<<"node position: "<<node->position<<endl;
    cout<<"node size: "<<node->size<<endl;
    node->diff = NULL;
}

Output is:

MADE NODE FOR again 
4 node clean: again 
5 node clean: again 
6 node clean: again 
node position: 1739
node size: 6

Upvotes: 2

Charlie Martin

Reputation: 112414

Okay, we'd need to actually see the code for some of these to be sure, but here's what the bug is telling you: at some point, you're assigning to something that overwrites or deletes your clean. Since you declare it as a char *, I'm guessing you use it as a pointer to an array of characters, and the odds are good that one array is being aliased to two "clean" pointers in two different Words.
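To illustrate the kind of aliasing I mean, here is a small sketch. It is not taken from the posted code: the shared scratch buffer, intoSharedBuffer, and both struct names are hypothetical, made up just to show two "clean" pointers ending up on the same array.

```cpp
#include <cassert>
#include <cstring>
#include <string>

struct AliasedWord { const char* clean; };  // hypothetical: borrows storage
struct OwningWord  { std::string clean; };  // hypothetical: owns a copy

// Copies text into one shared scratch buffer and returns a pointer to it.
// Every call hands back the same address, so earlier results get overwritten.
const char* intoSharedBuffer(const char* text)
{
    static char buffer[64];
    std::strncpy(buffer, text, sizeof(buffer) - 1);
    buffer[sizeof(buffer) - 1] = '\0';
    return buffer;
}

void demonstrateAliasing()
{
    AliasedWord a = { intoSharedBuffer("again") };
    AliasedWord b = { intoSharedBuffer("never") };
    assert(a.clean == b.clean);                  // both point at the same array
    assert(std::strcmp(a.clean, "never") == 0);  // a's word was silently replaced

    OwningWord c = { intoSharedBuffer("again") };
    OwningWord d = { intoSharedBuffer("never") };
    assert(c.clean == "again");                  // each std::string kept its own copy
    assert(d.clean == "never");
}
```

Storing a std::string (or an explicitly allocated copy) in each Word means a later write to the shared array can't change a word that was stored earlier.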

Upvotes: 0
