Tuesday, January 18, 2011

The hidden “this” pointer

One of the big questions that new programmers often ask is, “When a member function is called, how does C++ know which object it was called on?” The answer is that C++ uses a hidden pointer named “this”! Let’s take a look at “this” in more detail.
The following is a simple class that holds an integer and provides a constructor and access functions. Note that no destructor is needed because C++ can clean up integers for us.
class Simple
{
private:
    int m_nID;

public:
    Simple(int nID)
    {
        SetID(nID);
    }

    void SetID(int nID) { m_nID = nID; }
    int GetID() { return m_nID; }
};
Here’s a sample program that uses this class:
#include <iostream> // needed for std::cout and std::endl

int main()
{
    Simple cSimple(1);
    cSimple.SetID(2);
    std::cout << cSimple.GetID() << std::endl;
}
Let’s take a closer look at the following line: cSimple.SetID(2);. Although it looks like this function only has one parameter, it actually has two! When you call cSimple.SetID(2);, C++ internally converts this to SetID(&cSimple, 2);. Note that this is just a normal function call where C++ has added a parameter, and automatically passed in the address of the class object!
Since C++ converts the function call, it also needs to convert the function itself. It does so like this:
void SetID(int nID) { m_nID = nID; }
becomes:
void SetID(Simple* const this, int nID) { this->m_nID = nID; }
C++ has added a new parameter to the function. The added parameter is a pointer to the object the member function is working with, and it is always named “this”. The this pointer is a hidden pointer inside every non-static member function that points to the object the member function was called on.
Note that m_nID (which is a class member variable) has been converted to this->m_nID. Since “this” is currently pointing to cSimple, this actually resolves to cSimple.m_nID, which is exactly what we wanted!
Most of the time, you never need to explicitly reference the “this” pointer. However, there are a few occasions where it can be useful:
1) If you have a constructor (or member function) that has a parameter of the same name as a member variable, you can disambiguate them by using “this”:
class Something
{
private:
    int nData;

public:
    Something(int nData)
    {
        this->nData = nData;
    }
};
Note that our constructor is taking a parameter of the same name as a member variable. In this case, “nData” refers to the parameter, and “this->nData” refers to the member variable. Although this is acceptable coding practice, we find using the “m_” prefix on all member variable names provides a better solution by preventing duplicate names altogether!
2) Occasionally it can be useful to have a function return the object it was working with. If the function’s return type is a reference to the class, returning *this will return a reference to the object that was implicitly passed to the function by C++.
One use for this feature is that it allows a series of functions to be “chained” together, so that the output of one function becomes the input of another function! The following is somewhat more advanced and can be considered optional material at this point.
Consider the following class:
class Calc
{
private:
    int m_nValue;

public:
    Calc() { m_nValue = 0; }

    void Add(int nValue) { m_nValue += nValue; }
    void Sub(int nValue) { m_nValue -= nValue; }
    void Mult(int nValue) { m_nValue *= nValue; }

    int GetValue() { return m_nValue; }
};
If you wanted to add 5, subtract 3, and multiply by 4, you’d have to do this:
Calc cCalc;
cCalc.Add(5);
cCalc.Sub(3);
cCalc.Mult(4);
However, if we make each function return *this, we can chain the calls together. Here is the new version of Calc with “chainable” functions:
class Calc
{
private:
    int m_nValue;

public:
    Calc() { m_nValue = 0; }

    Calc& Add(int nValue) { m_nValue += nValue; return *this; }
    Calc& Sub(int nValue) { m_nValue -= nValue; return *this; }
    Calc& Mult(int nValue) { m_nValue *= nValue; return *this; }

    int GetValue() { return m_nValue; }
};
Note that Add(), Sub() and Mult() are now returning *this, which is a reference to the object the function was called on. Consequently, this allows us to do the following:
Calc cCalc;
cCalc.Add(5).Sub(3).Mult(4);
We have effectively condensed three lines into one expression! Let’s take a closer look at how this works.
First, cCalc.Add(5) is called, which adds 5 to our m_nValue. Add() then returns *this, which is a reference to cCalc. Our expression is now cCalc.Sub(3).Mult(4). cCalc.Sub(3) subtracts 3 from m_nValue and returns cCalc. Our expression is now cCalc.Mult(4). cCalc.Mult(4) multiplies m_nValue by 4 and returns cCalc, which is then ignored. However, since each function modified cCalc as it was executed, cCalc now contains the value ((0 + 5) - 3) * 4, which is 8.
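Although this is a pretty contrived example, chaining functions in such a manner is common with string classes. For example, an append function or an overloaded += operator that returns *this can be chained in exactly this way. A closely related idiom is overloading the + operator to do a string append. Note that operator+ normally returns a new string by value rather than *this, but because the result of each + becomes an operand of the next one, it becomes possible to write expressions like:
cMyString = "Hello " + strMyName + " welcome to " + strProgramName + ".";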
And it is pretty easy to see the benefit in being able to do that! We will cover overloading the + operator (and other operators) in a future lesson.
The important point to take away from this lesson is that the “this” pointer is a hidden parameter of any non-static member function. Most of the time, you will not need to access it directly. It’s worth noting that “this” behaves like a const pointer: you can change the value of the object it points to, but you cannot make it point to something else!
