Wlerin
Wlerin

Reputation: 260

What unexpected behaviour can returning a pointer to a char array member cause?

Okay, so. I've been working on a class project (we haven't covered std::string and std::vector yet though obviously I know about them) to construct a time clock of sorts. The main portion of the program expects time and date values as formatted c-strings (e.g. "12:45:45", "12/12/12" etc.), and I probably could have kept things simple by storing them the same way in my basic class. But, I didn't.

Instead I did this:

class UsageEntry {
public:
    ....

    typedef     time_t  TimeType;
    typedef     int     IDType;

    ...
    // none of these getters are thread safe
    // furthermore, the char* the getters return should be used immediately
    // and then discarded: its contents will be modified on the next call
    // to any of these functions.

    const char*         getUserID();
    const char*         getDate();
    const char*         getTimeIn();
    const char*         getTimeOut();

private:
    IDType      m_id;
    TimeType    m_timeIn;
    TimeType    m_timeOut;

    char        m_buf[LEN_MAX];
};

And one of the getters (they all do basically the same thing):

const char* UsageEntry::getDate()
{
    strftime(m_buf, LEN_OF_DATE, "%D", localtime(&m_timeIn));
    return m_buf;
}

And here is a function that uses this pointer:

// ==== TDataSet::writeOut ====================================================
// writes an entry to the output file
void TDataSet::writeOut(int index, FILE* outFile)
{
    // because of the m_buf kludge, this cannot be a single
    // call to fprintf
    fprintf(outFile, "%s,", m_data[index].getUserID());
    fprintf(outFile, "%s,", m_data[index].getDate());
    fprintf(outFile, "%s,", m_data[index].getTimeIn());
    fprintf(outFile, "%s\n", m_data[index].getTimeOut());
    fflush(outFile);
} // end of TDataSet::writeOut

How much trouble will this cause? Or to look at it from another angle, what other sorts of interesting and !!FUN!! behaviour can this cause? And, finally, what can be done to fix it (besides the obvious solution of using strings/vectors instead)?

Somewhat related: How do the C++ library functions that do similar things handle this? e.g. localtime() returns a pointer to a struct tm object, which somehow survives the end of that function call at least long enough to be used by strftime.

Upvotes: 1

Views: 180

Answers (3)

IdeaHat
IdeaHat

Reputation: 7881

The main problem here, as the main problem with most pointer based code, is the issue of ownership. The problem is the following:

  const char* val;
  {
     UsageEntry ue;
     val = ue.getDate();
  }//ue goes out of scope
  std::cout << val << std::endl;//SEGFAULT (maybe, really nasal demons)

Because val is actually owned by ue, you shoot yourself in the foot if they exist in different scopes. You COULD document this, but it is oh-so-much simpler to pass the buffer in as an argument (just like the strftime function does).

(Thanks to odedsh below for pointing this one out)

Another issue is that subsequent calls will blow away the info gained. The example odesh used was

fprintf(outFile, "%s\n%s",ue.getUserID(), ue.getDate());

but the problem is more pervasive:

const char* id = ue.getUserID();
const char* date = ue.getDate();//Changes id!

This violates the "Principal of Least Astonishment" becuase...well, its weird.

This design also breaks the rule-of-thumb that each class should do exactly one thing. In this case, UsageEntry both provides accessors to get the formatted time as a string, AND manages that strings buffer.

Upvotes: 2

Jason C
Jason C

Reputation: 40366

There is not enough information to determine if it will cause trouble because you do not show how you use it. As long as you document the caveats and keep them in mind when using your class, there won't be issues.

There are some common gotchas to watch out for, but hopefully these are common sense:

  • Deleting the UsageEntry will invalidate the pointers returned by your getters, since those buffers will be deleted too. (This is especially easy to run into if using locally declared UsageEntrys, as in MadScienceDream's example.) If this is a risk, callers should create their own copy of the string. Document this.

  • It does not look like m_timeIn is const, and therefore it may change. Calling the getter will modify the internal buffer and these changes will be visible to anything that has that pointer. If this is a risk, callers should create their own copy of the string. Document this.

  • Your getters are neither reentrant nor thread-safe. Document this.

It would be safer to have the caller supply a destination buffer and length as a parameter. The function can return a pointer to that buffer for convenience. This is how e.g. read works.

A strong API can avoid issues. Failing that, good documentation and common sense can also reduce the chance of issues. Behavior is only unexpected if nobody expects it, this is why documentation about the behavior is important: It generally eliminates unexpected behavior.

Think of it like the "CAUTION: HOT SURFACE" warning on top of a toaster oven. You could design the toaster oven with insulation on top so that an accident can't happen. Failing that, the least you can do is put a warning label on it and there probably won't be an accident. If there's neither insulation nor a warning, eventually somebody will burn themselves.


Now that you've edited your question to show some documentation in the header, many of the initial risks have been reduced. This was a good change to make.


Here is an example of how your usage would change if user-supplied buffers were used (and a pointer to that buffer returned):

// ==== TDataSet::writeOut ====================================================
// writes an entry to the output file
void TDataSet::writeOut(int index, FILE* outFile)
{
    char userId[LEN_MAX], date[LEN_MAX], timeIn[LEN_MAX], timeOut[LEN_MAX];
    fprintf(outFile, "%s,%s,%s,%s\n",
        m_data[index].getUserID(userId, sizeof(userId)),
        m_data[index].getDate(date, sizeof(date)),
        m_data[index].getTimeIn(timeIn, sizeof(timeIn)),
        m_data[index].getTimeOut(timeOut, sizeof(timeOut))
    );
    fflush(outFile);
} // end of TDataSet::writeOut

Upvotes: 2

Drax
Drax

Reputation: 13288

How much trouble will this cause? Or to look at it from another angle, what other sorts of interesting and !!FUN!! behaviour can this cause? And, finally, what can be done to fix it (besides the obvious solution of using strings/vectors instead)?

Well there is nothing very FUN here, it just means that the results of your getter cannot outlive the corresponding instance of UsageEntry or you have a dangling pointer.

How do the C++ library functions that do similar things handle this? e.g. localtime() returns a pointer to a struct tm object, which somehow survives the end of that function call at least long enough to be used by strftime.

The documentation of localtime says:

Return value

pointer to a static internal std::tm object on success, or NULL otherwise. The structure may be shared between std::gmtime, std::localtime, and std::ctime, and may be overwritten on each invocation.

Upvotes: 2

Related Questions