Thursday, September 16, 2010

C++: What if I want to point to more than one item?

At least the formatting would be okay this time... Thanks to a consultant's
suggestion, I've changed the formatting a bit.

As we've seen, a pointer holds the address of the variable it points to.
This has one limitation, though: what if we want to work with more than one
item at a time? To help out with this task, we can use dynamic memory -
memory that is allocated at run time. Also, to traverse a set of items, we
can employ pointer arithmetic (I'll call it memory arithmetic here) - moving
from one item to the next according to the number of bytes that a given item
occupies in RAM.

Let's come back to our basic "car" class - where we have model, manufactured
year and company name as member variables. Suppose you are a Nissan dealer
who needs someone to print a list of the Nissans available, and suppose
there are five cars left after a busy day of sales. The programmer (me, in
this case) is thinking of employing a pointer to point to Nissan car
objects, but is wondering how to store five cars and have the pointer point
to them.

To help me out, what would be your suggestion? As it turns out, noticing
that all Nissans share a common attribute - that they are cars - allows me
to employ a dynamic array of cars to help me print out what cars we have
left at our dealership. So, I started to write...

Car * car_ptr = new Car[5];

Remember similar code from last time? We've just declared a dynamic array
of five cars. The keyword "new" means that we wish to use dynamic memory
(stored on the heap, as opposed to the stack, which holds local variables;
I'll come back to it in a second). Just like with arrays, we use brackets
([]) to tell the computer how many items we would like to use - here five,
so that we can use indexes 0 through 4. This is one form of declaring a
collection in dynamic memory. Note that the pointer points to the first of
these elements (index zero).
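
Since the Car class itself isn't repeated in this post, here is a minimal sketch of what the declaration above relies on. The class below is a hypothetical stand-in (the member names and the constructor order are my assumptions), and the function name is made up for illustration. One thing worth knowing: new Car[5] builds each element with the default constructor, so the class has to have one.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the Car class from the earlier posts;
// the member names and constructor order are assumptions.
class Car {
public:
    Car() : year(0) {}  // new Car[5] needs a default constructor
    Car(int y, const std::string& c, const std::string& m)
        : year(y), company(c), model(m) {}
    int year;
    std::string company;
    std::string model;
};

// Allocates five cars on the heap and reports whether the pointer
// refers to the first element (index 0).
bool pointer_is_first_element() {
    Car* car_ptr = new Car[5];
    assert(car_ptr[4].year == 0);  // every element was default-constructed
    bool same = (car_ptr == &car_ptr[0]);
    delete[] car_ptr;  // memory from new[] is released with delete[]
    return same;
}
```

Calling pointer_is_first_element() from main should return true, since car_ptr and &car_ptr[0] are the same address.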

But if your goal is to declare just one item and put it in dynamic memory,
we can employ the new keyword as follows. Suppose I have a character, 'C',
that I'd like to put in dynamic memory and have a character pointer point
to it. We can write:

char * char_ptr = new char('C'); // Single quotes for a single character.

Or, suppose I have a double, 3.0, and wish to store it in dynamic memory. We
can do this:

double * double_ptr = new double(3.0); // I think this is better.

Each of the two lines above declares a pointer to a single item in dynamic
memory (only one item, not an array of items). As you'll learn later, if you
forget to release what you've allocated, a troubling consequence may occur -
especially if you've allocated a dynamic array.
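
To see a single dynamic item in action, here is a small sketch. The function name is made up for illustration; the pointer names follow the lines above. It reads each value back with the dereference operator (*) and then hands the memory back.

```cpp
#include <cassert>

// Allocates one char and one double on the heap, reads them back
// through their pointers, and releases them again.
double single_object_demo() {
    char* char_ptr = new char('C');        // single quotes: one character
    double* double_ptr = new double(3.0);

    assert(*char_ptr == 'C');              // dereference to reach the value
    double value = *double_ptr;

    delete char_ptr;                       // plain delete for single objects
    delete double_ptr;
    return value;
}
```

Note that the value is copied into a local variable before the delete; reading *double_ptr after deleting it would not be safe.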

Now back to the dealership. What if I want to assign a particular car model
to a specific index? We can use memory arithmetic to help us with this task.
With it, moving from one item to the next is dictated by how many bytes a
given object uses in memory. Common sizes on most of today's machines are 4
bytes for an integer, 8 bytes for a double and 1 byte for a character (a
C-style string is a special case, as it is an array of characters). For
example, if I have four integers (starting at address 100) and use memory
arithmetic, index 1 of this array would be at address 104, index 2 at 108
and index 3 at 112.
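
The 100, 104, 108 pattern can be checked with a short sketch. The function name is hypothetical, and the cast to const char* is only there so that the distance comes out in bytes rather than in elements.

```cpp
#include <cassert>
#include <cstddef>

// Returns the distance in bytes between two neighbouring elements
// of an int array.
std::size_t int_stride_in_bytes() {
    int numbers[4] = {10, 20, 30, 40};
    int* p = numbers;                // points at index 0

    assert(*(p + 2) == 30);          // p + 2 moves two ints, not two bytes

    // View the same addresses as byte addresses to measure the gap.
    const char* first = reinterpret_cast<const char*>(p);
    const char* second = reinterpret_cast<const char*>(p + 1);
    return static_cast<std::size_t>(second - first);
}
```

The result equals sizeof(int), which is 4 on most current machines but is not guaranteed by the language.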

But counting bytes in memory is very tedious (note that RAM addresses are
usually written as hexadecimal numbers - base 16). To make coding faster, we
use the subscript operator ([]), which does this arithmetic for us. All we
need to access a given item in a set of items is its index. For instance, if
I want to access the second car in a dynamic array of cars, I'd write:

car_ptr[1] = whatever; // remember our discussion on zero-based indexing.
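
Behind the scenes, ptr[i] means exactly *(ptr + i). Here is a minimal sketch that checks this on a dynamic array of doubles; the function and variable names are made up for illustration.

```cpp
#include <cassert>

// Checks that subscripting and explicit pointer arithmetic
// name the same element.
bool subscript_matches_arithmetic() {
    double* values = new double[3];
    values[0] = 1.5;
    values[1] = 2.5;
    values[2] = 3.5;

    assert(values[1] == *(values + 1));  // same element, two spellings

    bool ok = true;
    for (int i = 0; i < 3; ++i) {
        // Same address and same value for every index.
        if (&values[i] != values + i || values[i] != *(values + i)) {
            ok = false;
        }
    }
    delete[] values;
    return ok;
}
```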

So, the problem of assigning a car model to an item in our car array has
been solved - and assigning the same car model to all of them can be done
with a for loop, as shown below (pseudo code first, then C++ code):

Pseudo code:
1. First, find out how many cars there are. Here, we don't need to do that,
since we do know how many there are (5).
2. Use a for loop to assign every car object the same manufacturer, model
and manufactured year. We use the subscript operator to reach each car
object in the array.

In C++, we'd get:

for (int i = 0; i < 5; i++)
{
    Car Nissan (2008, "Nissan", "Altima");
    car_ptr[i] = Nissan;
} // If I remember right.

At least, three quarters of my job is done. The only thing left is printing
it: same principle as assignment code above.

for (int i = 0; i < 5; i++)
{
    car_ptr[i].print(); // car_ptr[i] is a Car object, not a pointer, so we use the dot.
}
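
Putting the assignment and printing steps together, here is one self-contained sketch. The Car class is again a hypothetical stand-in (member names, constructor order, and a print that takes an output stream are all my assumptions), and list_inventory is a made-up helper. Since car_ptr[i] is already a Car object rather than a pointer, its members are reached with the dot.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the Car class; member names, constructor
// order, and the stream parameter of print() are assumptions.
class Car {
public:
    Car() : year(0) {}
    Car(int y, const std::string& c, const std::string& m)
        : year(y), company(c), model(m) {}
    void print(std::ostream& out) const {
        out << year << ' ' << company << ' ' << model << '\n';
    }
private:
    int year;
    std::string company;
    std::string model;
};

// Fills a dynamic array with identical cars and returns the printed list.
std::string list_inventory(int count) {
    assert(count > 0);
    Car* car_ptr = new Car[count];
    for (int i = 0; i < count; ++i) {
        car_ptr[i] = Car(2008, "Nissan", "Altima");
    }
    std::ostringstream out;
    for (int i = 0; i < count; ++i) {
        car_ptr[i].print(out);  // element is an object, so dot, not arrow
    }
    delete[] car_ptr;
    return out.str();
}
```

To print to the screen instead, something like std::cout << list_inventory(5); from main would do, given #include <iostream>.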

While we're on member access, I should go over the arrow operator. Whenever
we access a member (a variable or a member function) through a pointer to an
object, rather than through the object itself, we use the arrow (->) instead
of the dot (.) operator. For instance, if we have a table class and want to
print its length (assuming a length variable, an integer, is declared in
there), we'd write:

Table test (3, 5); // Length = 3, height = 5.
Table * test_ptr = &test; // The address of our test table.
// Print the length of the table that the pointer is pointing at.
cout << test_ptr->length;

Look at the last line. The compiler would give an error if you instead
wrote:

cout << test_ptr.length; // Uh oh...

Thus, whenever we go through a pointer to access an object's data (for our
classes), we use arrows.
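
The arrow is really shorthand: ptr->member means the same as (*ptr).member. A minimal sketch, assuming a Table with public length and height members (the exact layout of the real class is not shown in this post, and the function name is made up):

```cpp
#include <cassert>

// Hypothetical Table with public members, for illustration only.
struct Table {
    int length;
    int height;
    Table(int l, int h) : length(l), height(h) {}
};

// Reads the length three ways: dot on the object, arrow on the
// pointer, and dot on the dereferenced pointer.
int length_via_pointer() {
    Table test(3, 5);          // lives on the stack
    Table* test_ptr = &test;   // pointer to that same table

    assert(test.length == test_ptr->length);
    assert(test_ptr->length == (*test_ptr).length);
    return test_ptr->length;
}
```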

One last bit: What if you forget to release dynamic memory while the program
keeps running? Every allocation you never give back stays reserved, and a
long-running program may eventually find there is no more room to work with
data. This is a memory leak: the program forgets to deallocate (remove,
well, almost) memory when it is done with it. Most modern operating systems
do reclaim a program's memory once it terminates, but that is no excuse for
leaking while it runs. To help avoid this problem, we use the "delete"
operator to hand back the memory locations that we've used so they can be
reused.

Depending on the memory we've used, there are two forms, which differ in the
presence of the subscript operator:
* If you used only one dynamic variable, it is sufficient to write:
delete dynamic_ptr;
* If you've used a dynamic array, then it is CRUCIAL to include the
subscript operator. Leaving it out is undefined behavior: in practice the
rest of the array may be left "floating around", or the program may crash:
delete [] array_ptr;

So, to clean up our dealership and other dynamic variables we've used, I'd
write:

// Forgetting the subscript operator here is an error.
delete [] car_ptr;
delete double_ptr; // Only one double was used.
delete char_ptr; // Only one character.
// test_ptr is not deleted: it points at test, a local variable on the
// stack, not at memory obtained with new.
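
One way to convince yourself that the two forms of delete do their job is to count objects as they are created and destroyed. The Tracked type and the function name below are made up for this sketch; nothing like them appears in the dealership code.

```cpp
#include <cassert>

// Hypothetical helper type that counts how many instances are alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Allocates one object and one array, frees both with the matching
// form of delete, and returns how many objects are still alive.
int live_after_cleanup() {
    Tracked* one = new Tracked;
    Tracked* many = new Tracked[5];
    assert(Tracked::live == 6);   // 1 single object + 5 array elements

    delete one;       // single object: plain delete
    delete[] many;    // array: delete[] runs the destructor of every element
    return Tracked::live;
}
```

If everything was released properly, the function returns 0.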

I think that's all I can think of in terms of memory allocation (mostly
dealing with dynamic memory allocation) and memory arithmetic. Next time,
I'll discuss the "this" pointer (this->), as well as more on stack versus
heap.
// JL

2 comments:

  1. The formatting you did here is MUCH better than the previous posts and it is so much easier to look through all the information (although, making two separate entries to cover the examples would make it easier to read through for those who don't want to scroll too far).

    There is one problem I noticed when you used car_ptr. When you accessed car_ptr by brackets ("[]"), that is the same as dereferencing it at a certain address (car_ptr[3] == *(car_ptr + 3)). So when you use car_ptr[i]->print(), it is incorrect, because that is the same as dereferencing twice and trying to use its member function.

    Example: (*(*(car_ptr + i))).print()

    To make things easier to understand, you may want to have a paragraph with the list of variables/pointers you will be using at the beginning so readers can keep track of it when you mention deleting.

    All in all, I like this example.
