Tuesday, January 18, 2011

Handling errors (assert, cerr, exit, and exceptions)

When writing programs, it is almost inevitable that you will make mistakes. In this section, we will talk about the different kinds of errors that are made, and how they are commonly handled.
Errors fall into two categories: syntax and semantic errors.
Syntax errors
A syntax error occurs when you write a statement that is not valid according to the grammar of the C++ language. For example:
if 5 > 6 then write "not equal";
Although this statement is understandable by humans, it is not valid according to C++ syntax. The correct C++ statement would be:
if (5 > 6)
    std::cout << "not equal";
Syntax errors are almost always caught by the compiler and are usually easy to fix. Consequently, we typically don’t worry about them too much.
Semantic errors
A semantic error occurs when a statement is syntactically valid, but does not do what the programmer intended. For example:
for (int nCount=0; nCount<=3; nCount++)
    std::cout << nCount << " ";
The programmer may have intended this statement to print 0 1 2, but it actually prints 0 1 2 3.
Semantic errors are not caught by the compiler, and can have any number of effects: they may not show up at all, cause the program to produce the wrong output, cause erratic behavior, corrupt data, or cause the program to crash.
It is largely the semantic errors that we are concerned with.
Semantic errors can occur in a number of ways. One of the most common semantic errors is a logic error. A logic error occurs when the programmer incorrectly codes the logic of a statement. The above for statement example is a logic error. Here is another example:
if (x >= 5)
    std::cout << "x is greater than 5";
What happens when x is exactly 5? The conditional expression evaluates to true, and the program prints “x is greater than 5”. Logic errors can be easy or hard to locate, depending on the nature of the problem.
Another common semantic error is the violated assumption. A violated assumption occurs when the programmer assumes that something will be either true or false, and it isn’t. For example:
char strHello[] = "Hello, world!";
std::cout << "Enter an index: ";

int nIndex;
std::cin >> nIndex;

std::cout << "Letter #" << nIndex << " is " << strHello[nIndex] << std::endl;
See the potential problem here? The programmer has assumed that the user will enter a value between 0 and the length of “Hello, world!”. If the user enters a negative number, or a large number, the array index nIndex will be out of bounds. In this case, since we are just reading a value, the program will probably print a garbage letter. But in other cases, the program might corrupt other variables, the stack, or crash.
Defensive programming is a form of program design that involves trying to identify areas where assumptions may be violated, and writing code that detects and handles any violation of those assumptions so that the program reacts in a predictable way when those violations do occur.
Detecting assumption errors
As it turns out, we can catch almost all assumptions that need to be checked in one of three locations:
  • When a function has been called, the user may have passed the function parameters that are semantically meaningless.
  • When a function returns, the return value may indicate that an error has occurred.
  • When a program receives input (either from the user, or a file), the input may not be in the correct format.
Consequently, the following rules should be used when programming defensively:
  • At the top of each function, check to make sure any parameters have appropriate values.
  • After a function is called, check its return value (if any), and any other error reporting mechanisms, to see if an error occurred.
  • Validate any user input to make sure it meets the expected formatting or range criteria.
Let’s take a look at examples of each of these.
Problem: When a function is called, the user may have passed the function parameters that are semantically meaningless.
void PrintString(const char *strString)
{
    std::cout << strString;
}
Can you identify the assumption that may be violated? The answer is that the user might pass in a NULL pointer instead of a valid C-style string. If that happens, the behavior is undefined, and in practice the program will usually crash. Here’s the function again with code that checks to make sure the function parameter is non-NULL:
void PrintString(const char *strString)
{
    // Only print if strString is non-null
    if (strString)
        std::cout << strString;
}
Problem: When a function returns, the return value may indicate that an error has occurred.
#include <new> // for std::nothrow

// Declare an array of 10 integers
int *panData = new (std::nothrow) int[10];
panData[5] = 3;
Can you identify the assumption that may be violated? The answer is that operator new (which actually calls a function to do the allocation) could fail if the system runs out of memory. A plain new expression reports that failure by throwing a std::bad_alloc exception; the std::nothrow form used here returns a null pointer instead. If that happens, panData will be null, and when we use the subscript operator on panData, the program will most likely crash. Here’s a new version with error checking:
#include <cstdlib> // for std::exit()
#include <new>     // for std::nothrow

// Declare an array of 10 integers
int *panData = new (std::nothrow) int[10];
// If something went wrong
if (!panData)
    std::exit(2); // exit the program with error code 2
panData[5] = 3;
Problem: When a program receives input (either from the user, or a file), the input may not be in the correct format. Here’s the sample program you saw previously:
char strHello[] = "Hello, world!";
std::cout << "Enter an index: ";

int nIndex;
std::cin >> nIndex;

std::cout << "Letter #" << nIndex << " is " << strHello[nIndex] << std::endl;
And here’s the version that checks the user input to make sure it is valid:
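char strHello[] = "Hello, world!";
// strlen() is declared in <cstring>; convert its unsigned result once so the comparison below is int against int
const int nLength = static_cast<int>(strlen(strHello));

int nIndex;
do
{
    std::cout << "Enter an index: ";
    std::cin >> nIndex;
} while (nIndex < 0 || nIndex >= nLength);

std::cout << "Letter #" << nIndex << " is " << strHello[nIndex] << std::endl;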
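The loop above still assumes that the user types a number. If the user types letters instead, the extraction fails, std::cin enters an error state, and nIndex does not hold what the user meant. The following is a minimal sketch of one way to handle that, not a definitive implementation. ReadIndex and DemoReadIndex are hypothetical names of our own, and passing the streams in as parameters is an assumption made so the function can be exercised without a keyboard.

```cpp
#include <cassert>
#include <iostream>
#include <limits>
#include <sstream>

// Hypothetical helper (not part of the original example): keeps asking until
// it reads an integer in the range [0, nLength), or returns -1 if the input
// ends before a valid index was entered.
int ReadIndex(std::istream &in, std::ostream &out, int nLength)
{
    assert(nLength > 0); // the caller must supply a non-empty range

    while (true)
    {
        out << "Enter an index: ";

        int nIndex;
        if (in >> nIndex)
        {
            if (nIndex >= 0 && nIndex < nLength)
                return nIndex; // a number, and in range
            continue;          // a number, but out of range: ask again
        }

        if (in.eof())
            return -1; // nothing left to read: give up

        // The text was not a number (e.g. "abc"): clear the error state
        // and discard the rest of the offending line before asking again.
        in.clear();
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
}

// Simulated session: the "user" types abc, then 42, then 7.
int DemoReadIndex()
{
    std::istringstream typed("abc\n42\n7\n");
    std::ostringstream prompts;
    return ReadIndex(typed, prompts, 13); // 13 == strlen("Hello, world!")
}
```

In an interactive program the call would be ReadIndex(std::cin, std::cout, nLength), with the caller deciding what to do if -1 comes back.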
Handling assumption errors
Now that you know where assumption errors typically occur, let’s finish up by talking about different ways to handle them when they do occur. There is no best way to handle an error — it really depends on the nature of the problem.
Here are some typical responses:
1) Quietly skip the code that depends on the assumption being valid:
void PrintString(const char *strString)
{
    // Only print if strString is non-null
    if (strString)
        std::cout << strString;
}
In the above example, if strString is null, we don’t print anything. We have skipped the code that depends on strString being non-null.
2) If we are in a function, return an error code back to the caller and let the caller deal with the problem.
int g_anArray[10]; // a global array of 10 integers

int GetArrayValue(int nIndex)
{
    // use if statement to detect violated assumption
    if (nIndex < 0 || nIndex > 9)
       return -1; // return error code to caller

    return g_anArray[nIndex];
}
In this case, the function returns -1 if the user passes in an invalid index. Note that this only works if -1 can never be a legitimate value stored in the array; otherwise the caller cannot tell an error from real data.
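One weakness of returning a special value such as -1 is that -1 might also be a legitimate array element. A minimal sketch of one common alternative follows: report success or failure through the return value, and pass the element back through a reference parameter. TryGetArrayValue, PrintArrayValue, and DemoTryGetArrayValue are hypothetical names introduced for illustration only.

```cpp
#include <cassert>
#include <iostream>

int g_anArray[10]; // a global array of 10 integers (zero-initialized)

// Hypothetical variant of GetArrayValue: returns false for an invalid index,
// otherwise stores the element in nValue and returns true.
bool TryGetArrayValue(int nIndex, int &nValue)
{
    if (nIndex < 0 || nIndex > 9)
        return false; // report the error to the caller

    nValue = g_anArray[nIndex];
    return true;
}

// The caller checks the result before using the value.
void PrintArrayValue(int nIndex)
{
    int nValue = 0;
    if (TryGetArrayValue(nIndex, nValue))
        std::cout << "Element " << nIndex << " is " << nValue << '\n';
    else
        std::cerr << "Index " << nIndex << " is out of range\n";
}

// Self-check: -1 can now be stored and read back without ambiguity.
void DemoTryGetArrayValue()
{
    int nValue = 0;
    g_anArray[3] = -1;
    assert(TryGetArrayValue(3, nValue) && nValue == -1);
    assert(!TryGetArrayValue(10, nValue));
}
```

The cost of this design is a slightly clumsier call site, since the caller has to declare a variable before calling the function.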
3) If we want to terminate the program immediately, the exit function can be used to return an error code to the operating system:
#include <cstdlib> // for exit()

int g_anArray[10]; // a global array of 10 integers

int GetArrayValue(int nIndex)
{
    // use if statement to detect violated assumption
    if (nIndex < 0 || nIndex > 9)
       exit(2); // terminate program and return error number 2 to OS

    return g_anArray[nIndex];
}
If the user enters an invalid index, this program will terminate immediately (with no error message) and pass error code 2 to the operating system.
4) If the user has entered invalid input, ask the user to enter the input again.
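char strHello[] = "Hello, world!";
// strlen() is declared in <cstring>; convert its unsigned result once so the comparison below is int against int
const int nLength = static_cast<int>(strlen(strHello));

int nIndex;
do
{
    std::cout << "Enter an index: ";
    std::cin >> nIndex;
} while (nIndex < 0 || nIndex >= nLength);

std::cout << "Letter #" << nIndex << " is " << strHello[nIndex] << std::endl;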
5) cerr is another mechanism that is meant specifically for printing error messages. cerr is an output stream (just like cout) that is also defined in the iostream header. Typically, cerr writes error messages to the screen (just like cout), but it can be redirected to a file independently of cout.
void PrintString(const char *strString)
{
    // Only print if strString is non-null
    if (strString)
        std::cout << strString;
    else
        std::cerr << "PrintString received a null parameter\n";
}
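To see that cout and cerr really are separate streams, here is a small sketch that temporarily points each one at its own in-memory buffer using rdbuf(). DemoSeparateStreams is a hypothetical name, and the use of nullptr assumes a C++11 or later compiler.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

void PrintString(const char *strString)
{
    // Only print if strString is non-null
    if (strString)
        std::cout << strString;
    else
        std::cerr << "PrintString received a null parameter\n";
}

// Hypothetical demo: capture cout and cerr in two different string buffers
// and confirm that each message went to the stream we expected.
void DemoSeparateStreams()
{
    std::ostringstream normalOutput;
    std::ostringstream errorOutput;

    // rdbuf(new) installs a new buffer and returns the previous one
    std::streambuf *pOldCout = std::cout.rdbuf(normalOutput.rdbuf());
    std::streambuf *pOldCerr = std::cerr.rdbuf(errorOutput.rdbuf());

    PrintString("Hello");
    PrintString(nullptr);

    // Always restore the original buffers before the string streams go away
    std::cout.rdbuf(pOldCout);
    std::cerr.rdbuf(pOldCerr);

    assert(normalOutput.str() == "Hello");
    assert(errorOutput.str() == "PrintString received a null parameter\n");
}
```

In most command shells the same separation is available without any code changes: running the program as program > out.txt 2> err.txt (where program stands for your executable) sends cout to out.txt and cerr to err.txt.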
6) If working in some kind of graphical environment (e.g., MFC or SDL), it is common to pop up a message box with an error code and then terminate the program.
Assert
Using a conditional statement to detect a violated assumption, along with printing an error message and terminating the program is such a common response to problems that C++ provides a shortcut method for doing this. This shortcut is called an assert.
An assert statement is a preprocessor macro that evaluates a conditional expression. If the conditional expression is true, the assert statement does nothing. If the conditional expression evaluates to false, an error message is displayed and the program is terminated. This error message contains the conditional expression that failed, along with the name of the code file and the line number of the assert. This makes it very easy to tell not only what the problem was, but where in the code the problem occurred. This can help with debugging efforts immensely.
The assert functionality lives in the cassert header, and is often used both to check that the parameters passed to a function are valid, and to check that the return value of a function call is valid.
int g_anArray[10]; // a global array of 10 integers

#include <cassert> // for assert()
int GetArrayValue(int nIndex)
{
    // we're asserting that nIndex is between 0 and 9
    assert(nIndex >= 0 && nIndex <= 9); // this is line 7 in Test.cpp

    return g_anArray[nIndex];
}
If the user calls GetArrayValue(-3), the program prints the following message:
Assertion failed: nIndex >= 0 && nIndex <= 9, file C:\VCProjects\Test.cpp, line 7
We strongly encourage you to use assert statements liberally throughout your code.
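Two practical details are worth knowing when following this advice. First, a common idiom (not something the example above uses) is to append a string literal to the condition with &&, so that the failure message explains itself. Second, the standard assert macro expands to nothing when NDEBUG is defined before <cassert> is included, which is how release builds usually switch assertions off; an asserted expression should therefore never have side effects the program depends on. The sketch below illustrates the first point and also asserts on a computed result. CountPositive is a hypothetical helper.

```cpp
#include <cassert>

int g_anArray[10]; // a global array of 10 integers

int GetArrayValue(int nIndex)
{
    // A string literal is never null, so appending it with && does not
    // change the condition; it only adds the text to the failure message.
    assert(nIndex >= 0 && nIndex <= 9 && "GetArrayValue: nIndex out of range");

    return g_anArray[nIndex];
}

// Hypothetical helper: counts how many elements are greater than zero.
int CountPositive()
{
    int nCount = 0;
    for (int nIndex = 0; nIndex <= 9; ++nIndex)
        if (GetArrayValue(nIndex) > 0)
            ++nCount;

    // Asserting on a result: the count can never exceed the array size.
    assert(nCount >= 0 && nCount <= 10);
    return nCount;
}
```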
Exceptions
C++ provides one more method for detecting and handling errors, known as exception handling. The basic idea is that when an error occurs, an exception is “thrown”. If the current function does not “catch” the exception, the caller of the function has a chance to catch it. If the caller does not catch it, the caller’s caller has a chance, and so on. The exception progressively moves up the call stack until it is either caught and handled, or until main() fails to handle it. If nobody handles the exception, std::terminate() is called, which by default aborts the program.
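As a minimal sketch of that flow (an illustration, not the only way to structure it), the version of GetArrayValue below throws std::out_of_range instead of returning an error code. GetDoubledValue, TryPrintDoubledValue, and DemoExceptions are hypothetical functions added to show an exception passing through one caller and being caught by the next.

```cpp
#include <cassert>
#include <iostream>
#include <stdexcept>

int g_anArray[10]; // a global array of 10 integers

// Throws an exception instead of returning an error code.
int GetArrayValue(int nIndex)
{
    if (nIndex < 0 || nIndex > 9)
        throw std::out_of_range("GetArrayValue: index out of range");

    return g_anArray[nIndex];
}

// Does not catch anything: an exception thrown by GetArrayValue
// passes straight through this function to its caller.
int GetDoubledValue(int nIndex)
{
    return 2 * GetArrayValue(nIndex);
}

// Catches the exception two levels above the point where it was thrown.
// Returns true if the value was printed, false if an error was reported.
bool TryPrintDoubledValue(int nIndex)
{
    try
    {
        std::cout << GetDoubledValue(nIndex) << '\n';
        return true;
    }
    catch (const std::out_of_range &e)
    {
        std::cerr << "Error: " << e.what() << '\n';
        return false;
    }
}

// Self-check of both paths.
void DemoExceptions()
{
    g_anArray[2] = 21;
    assert(TryPrintDoubledValue(2));   // prints 42
    assert(!TryPrintDoubledValue(-3)); // reports the error instead
}
```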
