advance c++: 16 January 2011

Tuesday, January 18, 2011

Exception dangers and downsides

As with almost everything that has benefits, there are some potential downsides to exceptions as well. This article is not meant to be comprehensive, but just to point out some of the major issues that should be considered when using exceptions (or deciding whether to use them).
Cleaning up resources
One of the biggest problems that new programmers run into when using exceptions is the issue of cleaning up resources when an exception occurs. Consider the following example:

try
{
    OpenFile(strFilename);
    WriteFile(strFilename, strData);
    CloseFile(strFilename);
}
catch (FileException &cException)
{
    cerr << "Failed to write to file: " << cException.what() << endl;
}

What happens if WriteFile() fails and throws a FileException? At this point, we’ve already opened the file, and now control flow jumps to the FileException handler, which prints an error and exits. Note that the file was never closed! This example should be rewritten as follows:

try
{
    OpenFile(strFilename);
    WriteFile(strFilename, strData);
    CloseFile(strFilename);
}
catch (FileException &cException)
{
    // Make sure file is closed
    CloseFile(strFilename);
    // Then write error
    cerr << "Failed to write to file: " << cException.what() << endl;
}

This kind of error often crops up in another form when dealing with dynamically allocated memory:

try
{
    Person *pJohn = new Person("John", 18, E_MALE);
    ProcessPerson(pJohn);
    delete pJohn;
}
catch (PersonException &cException)
{
    cerr << "Failed to process person: " << cException.what() << endl;
}

If ProcessPerson() throws an exception, control flow jumps to the catch handler. As a result, pJohn is never deallocated! This example is a little trickier than the previous one: because pJohn is local to the try block, it goes out of scope when the try block exits. That means the exception handler cannot access pJohn at all (it’s been destroyed already), so there’s no way for it to deallocate the memory.
However, there are two relatively easy ways to fix this. First, declare pJohn outside of the try block so it does not go out of scope when the try block exits:

Person *pJohn = NULL;
try
{
    pJohn = new Person("John", 18, E_MALE);
    ProcessPerson(pJohn);
    delete pJohn;
}
catch (PersonException &cException)
{
    delete pJohn;
    cerr << "Failed to process person: " << cException.what() << endl;
}

Because pJohn is declared outside the try block, it is accessible both within the try block and the catch handlers. This means the catch handler can do cleanup properly.
The second way is to use a local variable of a class that knows how to clean itself up when it goes out of scope. The standard library provides a class called std::auto_ptr that can be used for this purpose. std::auto_ptr is a template class that holds a pointer, and deallocates it when it goes out of scope. (Note that std::auto_ptr was deprecated in C++11 and removed in C++17; std::unique_ptr is its replacement in newer code.)

#include <memory> // for std::auto_ptr

try
{
    Person *pJohn = new Person("John", 18, E_MALE);
    std::auto_ptr<Person> pxJohn(pJohn); // pxJohn now owns pJohn

    ProcessPerson(pJohn);

    // when pxJohn goes out of scope, it will delete pJohn
}
catch (PersonException &cException)
{
    cerr << "Failed to process person: " << cException.what() << endl;
}

Note that std::auto_ptr should not be set to point to arrays. This is because it uses the delete operator to clean up, not the delete[] operator. In fact, there is no array version of std::auto_ptr! It turns out, there isn’t really a need for one. In the standard library, if you want to do dynamically allocated arrays, you’re supposed to use the std::vector class, which will deallocate itself when it goes out of scope.
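For readers on a C++11 or newer compiler, the same idea can be sketched with std::unique_ptr, which took over from std::auto_ptr. This is only an illustration under stated assumptions: the Person struct, its s_nAlive counter, and the always-failing ProcessPerson() below are hypothetical stand-ins, not the classes used elsewhere in this article.

```cpp
#include <cassert>
#include <memory>     // std::unique_ptr
#include <stdexcept>  // std::runtime_error
#include <string>

// Hypothetical stand-in for the article's Person class.
// The static counter exists only so cleanup can be observed.
struct Person
{
    static int s_nAlive;
    std::string m_strName;

    explicit Person(const std::string &strName) : m_strName(strName) { ++s_nAlive; }
    ~Person() { --s_nAlive; }
};
int Person::s_nAlive = 0;

// Hypothetical processing step that always fails, to force the error path.
void ProcessPerson(const Person &cPerson)
{
    throw std::runtime_error("could not process " + cPerson.m_strName);
}

// Returns true if the exception was caught. No delete appears anywhere:
// pxJohn's destructor releases the Person while the stack unwinds.
bool ProcessJohn()
{
    try
    {
        std::unique_ptr<Person> pxJohn(new Person("John"));
        ProcessPerson(*pxJohn);
    }
    catch (const std::runtime_error &)
    {
        return true;
    }
    return false;
}
```

After ProcessJohn() returns, Person::s_nAlive is back to zero, which suggests the object was released even though the normal path through the try block never completed.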
Exceptions and destructors
Unlike constructors, where throwing exceptions can be a useful way to indicate that object creation failed, exceptions should not be thrown in destructors.
The problem occurs when an exception is thrown from a destructor during the stack unwinding process. If that happens, the program is put in a situation where it cannot both continue unwinding the stack and handle the new exception. The end result is that std::terminate() is called and your program is terminated immediately.
Consequently, the best course of action is just to abstain from using exceptions in destructors altogether. Write a message to a log file instead.
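One way to follow this advice is to wrap any cleanup that might throw in a try block inside the destructor and log the failure. The sketch below assumes a hypothetical Writer class, a hypothetical FlushBuffer() step that can fail, and an in-memory g_vLog vector standing in for a real log file.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory "log file" so the sketch needs no real file I/O.
std::vector<std::string> g_vLog;

// Hypothetical cleanup step that can fail.
void FlushBuffer(bool bFail)
{
    if (bFail)
        throw std::runtime_error("flush failed");
}

class Writer
{
    bool m_bFailOnFlush;
public:
    explicit Writer(bool bFailOnFlush) : m_bFailOnFlush(bFailOnFlush) {}

    ~Writer()
    {
        try
        {
            FlushBuffer(m_bFailOnFlush);
        }
        catch (const std::exception &cException)
        {
            // Record the failure and swallow it: nothing escapes the destructor.
            g_vLog.push_back(cException.what());
        }
    }
};

// Destroys a Writer whose cleanup fails and reports whether any
// exception leaked out of the destructor.
bool DestructorLeaksException()
{
    try
    {
        Writer cWriter(true);
    }
    catch (...)
    {
        return true;
    }
    return false;
}
```

Calling DestructorLeaksException() should return false and leave a single "flush failed" entry in g_vLog. The trade-off is that the caller never learns about the failure directly, so the log is the only record of it.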
Performance concerns
Exceptions do come with a small performance price to pay. They increase the size of your executable, and they may also cause it to run slower due to the additional checking that has to be performed. However, the main performance penalty for exceptions happens when an exception is actually thrown. In this case, the stack must be unwound and an appropriate exception handler found, which is a relatively expensive operation. Consequently, exception handling should only be used for truly exceptional cases and catastrophic errors.

Exceptions, classes, and inheritance

Up to this point in the tutorial, you’ve only seen exceptions used in non-member functions. However, exceptions are equally useful in member functions, and even more so in overloaded operators. Consider the following overloaded [] operator as part of a simple integer array class:

int IntArray::operator[](const int nIndex)
{
    return m_nData[nIndex];
}

Although this function will work great as long as nIndex is a valid array index, this function is sorely lacking in some good error checking. We could add an assert statement to ensure the index is valid:

int IntArray::operator[](const int nIndex)
{
    assert (nIndex >= 0 && nIndex < GetLength());
    return m_nData[nIndex];
}

Now if the user passes in an invalid index, the program will stop with an assertion failure. While this is useful to indicate to the user that something went wrong, sometimes the better course of action is to avoid aborting the program and instead let the caller know something went wrong so they can deal with it as appropriate.
Unfortunately, because overloaded operators have specific requirements as to the number and type of parameter(s) they can take and return, there is no flexibility for passing back error codes or boolean values to the caller. However, since exceptions do not change the signature of a function, they can be put to great use here. Here’s an example:

int IntArray::operator[](const int nIndex)
{
    if (nIndex < 0 || nIndex >= GetLength())
        throw nIndex;

    return m_nData[nIndex];
}

Now, if the user passes in an invalid index, operator[] will throw an int exception.
When constructors fail
Constructors are another area of classes in which exceptions can be very useful. If a constructor fails, simply throw an exception to indicate the object failed to create. The object’s construction is aborted and its destructor is never executed.
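A small sketch can make this visible. It assumes a hypothetical Widget class with a hypothetical Member data member, and records events in a global string rather than printing them. One detail it also illustrates: members that were fully constructed before the throw do get destroyed, even though the enclosing object's own destructor does not run.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Records the order of events so the behaviour can be inspected.
std::string g_strTrace;

// Hypothetical member type that reports its own construction and destruction.
struct Member
{
    Member()  { g_strTrace += "Member ctor;"; }
    ~Member() { g_strTrace += "Member dtor;"; }
};

// Hypothetical class whose constructor always fails.
class Widget
{
    Member m_cMember; // fully constructed before Widget's constructor body runs
public:
    Widget()
    {
        g_strTrace += "Widget ctor throws;";
        throw std::runtime_error("Widget failed to initialize");
    }
    ~Widget() { g_strTrace += "Widget dtor;"; } // never runs in this sketch
};

// Attempts to create a Widget and returns the error message that was caught.
std::string TryCreateWidget()
{
    std::string strError;
    try
    {
        Widget cWidget;
    }
    catch (const std::runtime_error &cException)
    {
        strError = cException.what();
    }
    return strError;
}
```

After TryCreateWidget() runs, g_strTrace contains the Member constructor and destructor entries but no "Widget dtor;" entry. This is why resources acquired directly in a constructor body (as opposed to in self-cleaning members) need care if a later step in that constructor can throw.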
Exception classes
One of the major problems with using basic data types (such as int) as exception types is that they are inherently vague. An even bigger problem is disambiguation of what an exception means when there are multiple statements or function calls within a try block.

try
{
    int *nValue = new int(anArray[nIndex1] + anArray[nIndex2]);
}
catch (int nValue)
{
    // What are we catching here?
}

In this example, if we were to catch an int exception, what does that really tell us? Was one of the array indexes out of bounds? Did operator+ cause integer overflow? Did operator new fail because it ran out of memory? Unfortunately, in this case, there’s just no easy way to disambiguate. While we can throw char* exceptions to solve the problem of identifying WHAT went wrong, this still does not provide us the ability to handle exceptions from various sources differently.
One way to solve this problem is to use exception classes. An exception class is just a normal class that is designed specifically to be thrown as an exception. Let’s design a simple exception class to be used with our IntArray class:

class ArrayException
{
private:
    std::string m_strError;

    ArrayException() {} // not meant to be called
public:
    ArrayException(const std::string &strError)
        : m_strError(strError)
    {
    }

    std::string GetError() const { return m_strError; }
};

Here’s our overloaded operator[] throwing this class:

int IntArray::operator[](const int nIndex)
{
    if (nIndex < 0 || nIndex >= GetLength())
        throw ArrayException("Invalid index");

    return m_nData[nIndex];
}

And a sample usage of this class:

try
{
    int nValue = anArray[5];
}
catch (ArrayException &cException)
{
    cerr << "An array exception occurred (" << cException.GetError() << ")" << endl;
}

Using such a class, we can have the exception return a description of the problem that occurred, which provides context for what went wrong. And since ArrayException is its own unique type, we can specifically catch exceptions thrown by the array class and treat them differently from other exceptions if we wish.
Note that exception handlers should catch class exception objects by reference instead of by value. This prevents the compiler from making a copy of the exception, which can be expensive when the exception is a class object. It also avoids slicing when the handler names a base class but the thrown object is of a derived class. Catching exceptions by pointer should generally be avoided unless you have a specific reason to do so.
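The difference between the two catch styles shows up most clearly with a class hierarchy. The sketch below uses a hypothetical BaseError/IndexError pair (not part of the IntArray code) and throws the derived type in both functions; only the form of the handler changes. Some compilers warn about the by-value handler, which is the point of the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical two-level exception hierarchy used only for this sketch.
class BaseError
{
public:
    virtual ~BaseError() {}
    virtual std::string Describe() const { return "BaseError"; }
};

class IndexError : public BaseError
{
public:
    virtual std::string Describe() const { return "IndexError"; }
};

// Catching by value copies only the BaseError part of the thrown object.
std::string CatchByValue()
{
    std::string strResult;
    try
    {
        throw IndexError();
    }
    catch (BaseError cError) // by value: a sliced BaseError copy
    {
        strResult = cError.Describe();
    }
    return strResult;
}

// Catching by reference binds to the original IndexError object.
std::string CatchByReference()
{
    std::string strResult;
    try
    {
        throw IndexError();
    }
    catch (const BaseError &cError)
    {
        strResult = cError.Describe();
    }
    return strResult;
}
```

CatchByValue() reports "BaseError" because the handler receives a copy of the base part only, while CatchByReference() reports "IndexError".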
std::exception
The C++ standard library comes with an exception class that is used by many of the other standard library classes. The class is almost identical to the ArrayException class above, except the GetError() function is named what():

try
{
    // do some stuff with the standard library here
}
catch (std::exception &cException)
{
    cerr << "Standard exception: " << cException.what() << endl;
}

We’ll talk more about std::exception in a moment.
Exceptions and inheritance
Since it’s possible to throw classes as exceptions, and classes can be derived from other classes, we need to consider what happens when we use inherited classes as exceptions. As it turns out, exception handlers will not only match classes of a specific type, they’ll also match classes derived from that specific type as well! Consider the following example:

class Base
{
public:
    Base() {}
};

class Derived: public Base
{
public:
    Derived() {}
};

int main()
{
    try
    {
        throw Derived();
    }
    catch (Base &cBase)
    {
        cerr << "caught Base";
    }
    catch (Derived &cDerived)
    {
        cerr << "caught Derived";
    }

    return 0;
}

In the above example we throw an exception of type Derived. However, the output of this program is:

caught Base

What happened?
First, as mentioned above, derived classes will be caught by handlers for the base type. Because Derived is derived from Base, Derived is-a Base (they have an is-a relationship). Second, when C++ is attempting to find a handler for a raised exception, it does so sequentially. Consequently, the first thing C++ does is check whether the exception handler for Base matches the Derived exception. Because Derived is-a Base, the answer is yes, and it executes the catch block for type Base! The catch block for Derived is never even tested in this case.
In order to make this example work as expected, we need to flip the order of the catch blocks:

class Base
{
public:
    Base() {}
};

class Derived: public Base
{
public:
    Derived() {}
};

int main()
{
    try
    {
        throw Derived();
    }
    catch (Derived &cDerived)
    {
        cerr << "caught Derived";
    }
    catch (Base &cBase)
    {
        cerr << "caught Base";
    }

    return 0;
}

This way, the Derived handler will get first shot at catching objects of type Derived (before the handler for Base can). Objects of type Base will not match the Derived handler (Derived is-a Base, but Base is not a Derived), and thus will “fall through” to the Base handler.
Rule: Handlers for derived exception classes should be listed before those for base classes.
The ability to use a handler to catch exceptions of derived types using a handler for the base class turns out to be exceedingly useful.
Let’s take a look at this using std::exception. There are many classes derived from std::exception, such as std::bad_alloc, std::bad_cast, std::runtime_error, and others. When the standard library has an error, it can throw a derived exception correlating to the appropriate specific problem it has encountered.
Most of the time, we probably won’t care whether the problem was a bad allocation, a bad cast, or something else. We just care that we got an exception from the standard library. In this case, we just set up an exception handler to catch std::exception, and we’ll end up catching std::exception and all of the derived exceptions together in one place. Easy!

try
{
    // code using standard library goes here
}
// This handler will catch std::exception and all the derived exceptions too
catch (std::exception &cException)
{
    cerr << "Standard exception: " << cException.what() << endl;
}

However, sometimes we’ll want to handle a specific type of exception differently. In this case, we can add a handler for that specific type, and let all the others “fall through” to the base handler. Consider:

try
{
    // code using standard library goes here
}
// This handler will catch std::bad_alloc (and any exceptions derived from it) here
catch (std::bad_alloc &cException)
{
    cerr << "You ran out of memory!" << endl;
}
// This handler will catch std::exception (and any exception derived from it) that fall
// through here
catch (std::exception &cException)
{
    cerr << "Standard exception: " << cException.what() << endl;
}

In this example, exceptions of type std::bad_alloc will be caught by the first handler and handled there. Exceptions of type std::exception and all of the other derived classes will be caught by the second handler.
Such inheritance hierarchies allow us to use specific handlers to target specific derived exception classes, or to use base class handlers to catch the whole hierarchy of exceptions. This allows us a fine degree of control over what kind of exceptions we want to handle while ensuring we don’t have to do too much work to catch “everything else” in a hierarchy.
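The same technique works for your own exception classes. As a sketch, the ArrayException below is a variant that derives from std::runtime_error (itself derived from std::exception) instead of standing alone, and IntArray is a minimal hypothetical stand-in built on std::vector. The two Read functions are illustrative names, not part of any library.

```cpp
#include <cassert>
#include <stdexcept> // std::runtime_error, which derives from std::exception
#include <string>
#include <vector>

// Variant of ArrayException that joins the standard exception hierarchy.
class ArrayException : public std::runtime_error
{
public:
    explicit ArrayException(const std::string &strError)
        : std::runtime_error(strError)
    {
    }
};

// Minimal stand-in for the article's IntArray class.
class IntArray
{
    std::vector<int> m_vData;
public:
    explicit IntArray(int nLength) : m_vData(nLength) {}
    int GetLength() const { return static_cast<int>(m_vData.size()); }

    int &operator[](int nIndex)
    {
        if (nIndex < 0 || nIndex >= GetLength())
            throw ArrayException("Invalid index");
        return m_vData[nIndex];
    }
};

// Has a dedicated ArrayException handler ahead of the general one.
std::string ReadSpecific(IntArray &anArray, int nIndex)
{
    std::string strResult = "ok";
    try
    {
        anArray[nIndex] = 1;
    }
    catch (const ArrayException &cException)
    {
        strResult = std::string("array: ") + cException.what();
    }
    catch (const std::exception &cException)
    {
        strResult = std::string("standard: ") + cException.what();
    }
    return strResult;
}

// Only knows about std::exception, yet still catches ArrayException.
std::string ReadGeneric(IntArray &anArray, int nIndex)
{
    std::string strResult = "ok";
    try
    {
        anArray[nIndex] = 1;
    }
    catch (const std::exception &cException)
    {
        strResult = std::string("standard: ") + cException.what();
    }
    return strResult;
}
```

With a three-element array, index 5 is reported as "array: Invalid index" by ReadSpecific() and as "standard: Invalid index" by ReadGeneric(). Code that has never heard of ArrayException can still catch it through the base class.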

Uncaught exceptions, catch-all handlers, and exception specifiers

By now, you should have a reasonable idea of how exceptions work. In this lesson, we’ll cover a few more interesting exception cases.
Uncaught exceptions:
In the past few examples, there are quite a few cases where a function assumes its caller (or another function somewhere up the call stack) will handle the exception. In the following example, MySqrt() assumes someone will handle the exception that it throws — but what happens if nobody actually does?
Here’s our square root program again, minus the try block in main():

#include <cmath> // for sqrt() function
#include <iostream>
using namespace std;

// A modular square root function
double MySqrt(double dX)
{
    // If the user entered a negative number, this is an error condition
    if (dX < 0.0)
        throw "Can not take sqrt of negative number"; // throw exception of type const char*

    return sqrt(dX);
}

int main()
{
    cout << "Enter a number: ";
    double dX;
    cin >> dX;

    // Look ma, no exception handler!
    cout << "The sqrt of " << dX << " is " << MySqrt(dX) << endl;
}

Now, let’s say the user enters -4, and MySqrt(-4) raises an exception. MySqrt() doesn’t handle the exception, so the program looks for a handler in its caller, main(). But there’s no exception handler there either. With no handler anywhere on the call stack, the runtime calls std::terminate(), and our application ends abruptly!
When main() terminates with an unhandled exception, the operating system will generally notify you that an unhandled exception error has occurred. How it does this depends on the operating system, but possibilities include printing an error message, popping up an error dialog, or simply crashing. Some operating systems are less graceful than others. Generally this is something you want to avoid altogether!
Catch-all handlers
And now we find ourselves in a conundrum: functions can potentially throw exceptions of any data type, and if an exception is not caught, it will propagate to the top of your program and cause it to terminate. Since it’s possible to call functions without knowing how they are even implemented, how can we possibly prevent this from happening?
Fortunately, C++ provides us with a mechanism to catch all types of exceptions. This is known as a catch-all handler. A catch-all handler works just like a normal catch block, except that instead of using a specific type to catch, it uses an ellipsis (...) as the type to catch. If you recall from lesson 7.14 on ellipses and why to avoid them, ellipses were previously used to pass arguments of any type to a function. In this context, they represent exceptions of any data type. Here’s a simple example:

try
{
    throw 5; // throw an int exception
}
catch (double dX)
{
    cout << "We caught an exception of type double: " << dX << endl;
}
catch (...) // catch-all handler
{
    cout << "We caught an exception of an undetermined type" << endl;
}

Because there is no specific exception handler for type int, the catch-all handler catches this exception. This example produces the following result:

We caught an exception of an undetermined type

The catch-all handler must be placed last in the catch block chain. This ensures that exceptions can be caught by exception handlers tailored to specific data types if those handlers exist. The C++ standard requires this ordering, so a conforming compiler will reject a catch-all handler that is followed by other handlers.
Often, the catch-all handler block is left empty:

catch(...) {} // ignore any unanticipated exceptions

This will catch any unanticipated exceptions and prevent them from stack unwinding to the top of your program, but does no specific error handling.
Using the catch-all handler to wrap main()
One interesting use for the catch-all handler is to wrap the contents of main():

int main()
{

    try
    {
        RunGame();
    }
    catch(...)
    {
        cerr << "Abnormal termination" << endl;
    }

    SaveState(); // Save user's game
    return 1;
}

In this case, if RunGame() or any of the functions it calls throws an exception that is not caught, that exception will unwind up the stack and eventually get caught by this catch-all handler. This will prevent main() from terminating, and gives us a chance to print an error of our choosing and then save the user’s state before exiting. This can be useful to catch and handle problems that may be unanticipated.
Exception specifiers
This subsection should be considered optional reading because exception specifiers are rarely used in practice, are not well supported by compilers, and Bjarne Stroustrup (the creator of C++) considers them a failed experiment.
Exception specifiers are a mechanism that allows us to use a function declaration to specify whether a function may or will not throw exceptions. This can be useful in determining whether a function call needs to be put inside a try block or not.
There are three types of exception specifiers, all of which use what is called the throw (…) syntax.
First, we can use an empty throw statement to denote that a function does not throw any exceptions outside of itself:

int DoSomething() throw(); // does not throw exceptions

Note that DoSomething() can still use exceptions as long as they are handled internally. Any function that is declared with throw() is supposed to cause the program to terminate immediately if it does try to throw an exception outside of itself, but implementation is spotty.
Second, we can use a specific throw statement to denote that a function may throw a particular type of exception:

int DoSomething() throw(double); // may throw a double

Finally, we can use a catch-all throw statement to denote that a function may throw an unspecified type of exception (note that this form is a Microsoft Visual C++ extension rather than standard C++):

int DoSomething() throw(...); // may throw anything

Due to the incomplete compiler implementation, the fact that exception specifiers are more like statements of intent than guarantees, some incompatibility with template functions, and the fact that most C++ programmers are unaware of their existence, I recommend you do not bother using exception specifiers.
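For context on newer compilers: the throw(type) form of exception specifier was deprecated in C++11 and removed in C++17, and C++11 introduced the noexcept keyword for the "does not throw" case. The sketch below assumes a C++11 compiler; SafeDivide() and CheckedDivide() are hypothetical functions chosen only to contrast the two declarations.

```cpp
#include <cassert>
#include <stdexcept>

// Promises never to let an exception escape. If one did, the runtime
// would call std::terminate().
int SafeDivide(int nNumerator, int nDenominator) noexcept
{
    if (nDenominator == 0)
        return 0; // report failure through the return value instead of throwing
    return nNumerator / nDenominator;
}

// No specifier: callers must assume this function may throw.
int CheckedDivide(int nNumerator, int nDenominator)
{
    if (nDenominator == 0)
        throw std::invalid_argument("division by zero");
    return nNumerator / nDenominator;
}

// The noexcept operator lets code query the promise at compile time.
static_assert(noexcept(SafeDivide(1, 1)), "SafeDivide is declared noexcept");
static_assert(!noexcept(CheckedDivide(1, 1)), "CheckedDivide may throw");
```

Unlike the older specifiers, noexcept only distinguishes "never throws" from "might throw"; it makes no attempt to list which types can be thrown.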

Exceptions, functions, and stack unwinding

In the previous lesson on basic exception handling, we explained how throw, try, and catch work together to enable exception handling. This lesson is dedicated to showing more examples of exception handling at work in various cases.
Exceptions within functions
In all of the examples in the previous lesson, the throw statements were placed directly within a try block. If this were a necessity, exception handling would be of limited use.
One of the most useful properties of exception handling is that the throw statements do NOT have to be placed directly inside a try block due to the way exceptions propagate when thrown. This allows us to use exception handling in a much more modular fashion. We’ll demonstrate this by rewriting the square root program from the previous lesson to use a modular function.

#include <cmath> // for sqrt() function
#include <iostream>
using namespace std;

// A modular square root function
double MySqrt(double dX)
{
    // If the user entered a negative number, this is an error condition
    if (dX < 0.0)
        throw "Can not take sqrt of negative number"; // throw exception of type const char*

    return sqrt(dX);
}

int main()
{
    cout << "Enter a number: ";
    double dX;
    cin >> dX;

    try // Look for exceptions that occur within try block and route to attached catch block(s)
    {
        cout << "The sqrt of " << dX << " is " << MySqrt(dX) << endl;
    }
    catch (const char* strException) // a thrown string literal is a const char*
    {
        cerr << "Error: " << strException << endl;
    }
}

In this program, we’ve taken the code that checks for an exception and calculates the square root and put it inside a modular function called MySqrt(). We’ve then called this MySqrt() function from inside a try block. Let’s verify that it still works as expected:

Enter a number: -4
Error: Can not take sqrt of negative number

It does!
The most interesting part of this program is the MySqrt() function, which potentially raises an exception. However, you will note that this exception is not inside of a try block! This essentially means MySqrt is willing to say, “Hey, there’s a problem!”, but is unwilling to handle the problem itself. It is, in essence, delegating that responsibility to its caller (the equivalent of how using a return code passes the responsibility of handling an error back to a function’s caller).
Let’s revisit for a moment what happens when an exception is raised. First, the program looks to see if the exception can be handled immediately (which means it was thrown inside a try block). If not, it immediately terminates the current function and checks to see if the caller will handle the exception. If not, it terminates the caller and checks the caller’s caller. Each function is terminated in sequence until a handler for the exception is found, or until main() terminates. This process is called unwinding the stack (see the lesson on the stack and the heap if you need a refresher on what the call stack is).
Now, let’s take a detailed look at how that applies to this program when MySqrt(-4) is called and an exception is raised.
First, the program checks to see if we’re immediately inside a try block within the function. In this case, we are not. Then, the stack begins to unwind. First, MySqrt() terminates, and control returns to main(). The program now checks to see if we’re inside a try block. We are, and there’s a handler that matches the C-style string exception, so the exception is handled by the catch block attached to the try block in main(). To summarize, MySqrt() raised the exception, but the try/catch block in main() was the one that caught and handled it.
At this point, some of you are probably wondering why it’s a good idea to pass errors back to the caller. Why not just make MySqrt() handle its own error? The problem is that different applications may want to handle errors in different ways. A console application may want to print a text message. A Windows application may want to pop up an error dialog. In one application, this may be a fatal error, and in another application it may not. By passing the error back up the stack, each application can handle an error from MySqrt() in the way that is most appropriate for its context! Ultimately, this keeps MySqrt() as modular as possible, and the error handling can be placed in the less-modular parts of the code.
Another stack unwinding example
Here’s another example showing stack unwinding in practice, using a larger stack. Although this program is long, it’s pretty simple: main() calls First(), First() calls Second(), Second() calls Third(), Third() calls Last(), and Last() throws an exception.

#include <iostream>
using namespace std;

void Last() // called by Third()
{
    cout << "Start Last" << endl;
    cout << "Last throwing int exception" << endl;
    throw -1;
    cout << "End Last" << endl;

}

void Third() // called by Second()
{
    cout << "Start Third" << endl;
    Last();
    cout << "End Third" << endl;
}

void Second() // called by First()
{
    cout << "Start Second" << endl;
    try
    {
        Third();
    }
    catch(double)
    {
         cerr << "Second caught double exception" << endl;
    }
    cout << "End Second" << endl;
}

void First() // called by main()
{
    cout << "Start First" << endl;
    try
    {
        Second();
    }
    catch (int)
    {
         cerr << "First caught int exception" << endl;
    }
    catch (double)
    {
         cerr << "First caught double exception" << endl;
    }
    cout << "End First" << endl;
}

int main()
{
    cout << "Start main" << endl;
    try
    {
        First();
    }
    catch (int)
    {
         cerr << "main caught int exception" << endl;
    }
    cout << "End main" << endl;

    return 0;
}

Take a look at this program in more detail, and see if you can figure out what gets printed and what doesn’t when it is run. The answer follows:

Start main
Start First
Start Second
Start Third
Start Last
Last throwing int exception
First caught int exception
End First
End main

Let’s examine what happens in this case. The printing of all the start statements is straightforward and doesn’t warrant further explanation. Last() prints “Last throwing int exception” and then throws an int exception. This is where things start to get interesting.
Because Last() doesn’t handle the exception itself, the stack begins to unwind. Last() terminates immediately and control returns to the caller, which is Third().
Third() doesn’t handle any exceptions either, so it terminates immediately and control returns to Second().
Second() has a try block, and the call to Third() is within it, so the program attempts to match the exception with an appropriate catch block. However, there are no handlers for exceptions of type int here, so Second() terminates immediately and control returns to First().
First() also has a try block, and the call to Second() is within it, so the program looks to see if there is a catch handler for int exceptions. There is! Consequently, First() handles the exception, and prints “First caught int exception”.
Because the exception has now been handled, control continues normally at the end of the catch block within First(). This means First() prints “End First” and then terminates normally.
Control returns to main(). Although main() has an exception handler for int, our exception has already been handled by First(), so the catch block within main() does not get executed. main() simply prints “End main” and then terminates normally.
There are quite a few interesting principles illustrated by this program:
First, the immediate caller of a function that throws an exception doesn’t have to handle the exception if it doesn’t want to. In this case, Third() didn’t handle the exception thrown by Last(). It delegated that responsibility to one of its callers up the stack.
Second, if a try block doesn’t have a catch handler for the type of exception being thrown, stack unwinding occurs just as if there were no try block at all. In this case, Second() didn’t handle the exception either because it didn’t have the right kind of catch block.
Third, once an exception is handled, control flow proceeds as normal starting from the end of the catch blocks. This was demonstrated by First() handling the error and then terminating normally. By the time the program got back to main(), the exception had been thrown and handled already — main() had no idea there even was an exception at all!
As you can see, stack unwinding provides us with some very useful behavior: if a function does not want to handle an exception, it doesn’t have to. The exception will propagate up the stack until it finds someone who will! This allows us to decide where in the call stack is the most appropriate place to handle any errors that may occur.
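One consequence of unwinding worth seeing in code is that local objects in each terminated function are destroyed on the way out. The sketch below assumes a hypothetical Tracer helper that appends to a global string when it is created and destroyed; Inner(), Outer(), and RunUnwindingDemo() are illustrative names only.

```cpp
#include <cassert>
#include <string>

std::string g_strTrace; // records the order of events

// Hypothetical helper that logs when it is created and destroyed.
struct Tracer
{
    std::string m_strName;
    explicit Tracer(const std::string &strName) : m_strName(strName)
    {
        g_strTrace += "+" + m_strName + " ";
    }
    ~Tracer() { g_strTrace += "-" + m_strName + " "; }
};

void Inner()
{
    Tracer cTracer("Inner");
    throw -1; // no handler here, so unwinding starts
}

void Outer()
{
    Tracer cTracer("Outer");
    Inner();
    g_strTrace += "unreachable "; // skipped: Inner() never returns normally
}

void RunUnwindingDemo()
{
    try
    {
        Outer();
    }
    catch (int)
    {
        g_strTrace += "caught";
    }
}
```

After RunUnwindingDemo() runs, g_strTrace reads "+Outer +Inner -Inner -Outer caught": both Tracer objects are destroyed, innermost first, before the handler body executes. This is the behavior that self-cleaning classes rely on to release resources when an exception passes through a function.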
In the next lesson, we’ll take a look at what happens when you don’t capture an exception, and a method to prevent that from happening.
