
Pointer arithmetic

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Neilc (talk | contribs) at 16:05, 4 July 2005 (checkpoint). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Pointer arithmetic is arithmetic performed on pointers, that is, on values that hold memory addresses. It is typical of the C programming language.

In pointer arithmetic, the unit is the size of the type that the pointer points to. For example, adding 1 to a pointer to 4-byte integer values will increase the address it holds by 4 bytes. This has the effect of making the pointer point at the next integer in a contiguous array of integers -- which is often the intended result.
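The scaling described above can be observed directly. The following is a minimal sketch, not taken from the original text; the function names int_stride_bytes and next_element are hypothetical and chosen only for illustration. It measures the distance in bytes between p and p + 1 and reads the element that p + 1 points to.

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes between p and p + 1 for a pointer to int. */
size_t int_stride_bytes(void)
{
    int values[2] = {10, 20};
    int *p = values;     /* points at values[0] */
    int *q = p + 1;      /* points at values[1] */

    /* Converting to char * lets the distance be measured in bytes. */
    return (size_t)((char *)q - (char *)p);
}

/* Hypothetical helper: value read through a pointer advanced by one element. */
int next_element(void)
{
    int values[2] = {10, 20};
    int *p = values;

    return *(p + 1);     /* the element after values[0] */
}
```

On any platform int_stride_bytes() should equal sizeof(int), whatever that size happens to be, and next_element() should return the second array element.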

Pointer arithmetic provides the programmer with a single way of dealing with different types: adding and subtracting the number of elements required instead of the actual offset in bytes. In particular, the C definition explicitly declares that the syntax a[n], which is the n-th element (counting from zero) of the array pointed to by a, is equivalent to *(a+n), which is the content of the element pointed to by a+n.
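The equivalence of a[n] and *(a+n), and the fact that offsets are counted in elements rather than bytes, can be checked with a short sketch. The helper names below are hypothetical and serve only as an illustration.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if a[n] and *(a + n) agree for every index. */
int subscript_matches_pointer_form(void)
{
    double a[4] = {1.5, 2.5, 3.5, 4.5};
    size_t n;

    for (n = 0; n < 4; n++) {
        if (a[n] != *(a + n))   /* same value */
            return 0;
        if (&a[n] != a + n)     /* same address */
            return 0;
    }
    return 1;
}

/* Hypothetical helper: subtracting two pointers into the same array
   gives a count of elements, not a count of bytes. */
ptrdiff_t elements_between(void)
{
    double a[4] = {1.5, 2.5, 3.5, 4.5};
    double *first = &a[0];
    double *last = &a[3];

    return last - first;
}
```

Here elements_between() yields 3 regardless of sizeof(double), because the subtraction is scaled by the element size in the same way as the addition.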

While powerful, pointer arithmetic can be a source of computer bugs. It tends to confuse programmers, forcing them into different contexts: an expression can be an ordinary arithmetic one or a pointer arithmetic one, and sometimes it is easy to mistake one for the other.
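One illustration of how the two kinds of expression can be mistaken for each other is the pair *p + 1 and *(p + 1), which differ only in parentheses. The sketch below uses hypothetical function names and is meant as an example, not as a catalogue of such mistakes.

```c
/* Ordinary arithmetic: read the int that p points to, then add 1 to that value. */
int add_to_value(const int *p)
{
    return *p + 1;
}

/* Pointer arithmetic: advance p by one element, then read the int found there. */
int read_next(const int *p)
{
    return *(p + 1);
}

/* Hypothetical demonstrations on the same array. */
int demo_value(void)
{
    int data[3] = {5, 40, 7};
    return add_to_value(data);   /* 5 + 1 */
}

int demo_next(void)
{
    int data[3] = {5, 40, 7};
    return read_next(data);      /* data[1] */
}
```

With the array {5, 40, 7}, the first form gives 6 and the second gives 40.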

Many modern high-level computer languages (for example Java) do not permit direct access to memory using addresses, so the concepts of pointers and pointer arithmetic are not relevant to them. This is deliberate, as many programming tasks do not require specific knowledge of where and how data is stored in computer memory.

Pointer manipulation in C/C++

Shown below are some pointer manipulation techniques.

Declaration and initialisation

The basic rules to successfully use pointers are the same as those for normal variables:

  1. A pointer variable is a "name" given to a memory location, just like any other variable.
  2. As with any other variable, the data that a pointer variable contains is not initialised until a value is assigned to it.
  3. The data contained in a pointer variable is interpreted by the run-time as an address.

Points 1, 2 and 3 together mean that it is dangerous to use pointer variables without initialising them. For example,


int *p; /* Variable of type pointer to int */
int x;  /* Variable of type int */

*p = 10; /* Undefined behavior: p contains an uninitialized value */
p = NULL; /* Valid: set p to the null pointer */
*p = 10; /* Undefined behavior (typically a crash): writing through the null pointer is not allowed */
p = malloc(sizeof(int)); /* For C. Valid: asks the system to allocate memory
                            and, if that succeeds, stores its address in p. */
/* OR */
p = &x; /* Both C/C++. Valid, here we are putting the address of x into p. */
*p = 10; /* Will work, since p now contains a valid address. */
printf("%d", *p); /* Will work, since p now contains a valid address. */
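A common defensive habit that follows from the rules above is to give every pointer a known value (NULL if nothing better is available) and to test it before writing through it. The following is a minimal sketch under that assumption; store_if_valid is a hypothetical helper name, and the check can only detect a null pointer, not an uninitialised or dangling one.

```c
#include <stddef.h>

/* Hypothetical helper: writes value through p only when p is not the null pointer.
   Returns 1 if the write happened and 0 if it was skipped. */
int store_if_valid(int *p, int value)
{
    if (p == NULL)
        return 0;    /* nothing to write to */

    *p = value;      /* p is assumed to point at a valid int */
    return 1;
}
```

Called with NULL the function leaves memory untouched and returns 0; called with the address of an int variable it stores the value and returns 1.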

Using pointers

Once a pointer has been initialised, it can be used to perform various operations on the memory block that it points to.

int *p; //Pointer to an int datatype.
int x; //Variable of datatype int.

p = &x; //Initialise the pointer with the address of x.

x = 10;
printf("x = %d", *p);

Prints:
x = 10

*p = 100;
printf("x = %d", x);

Prints:
x = 100

The above shows that x and *p refer to the same memory location, so modifying either causes the change to appear in the other. The difference between x and p is that while x literally is the name given to a memory location, p stores the address of that memory location as its data. The asterisk (*) that is put before the pointer variable p tells the run-time, "Go to the memory location identified by the address stored in variable p and do something to the data stored at that memory location...", and the process of putting an asterisk (*) before a pointer variable is known as indirection (or dereferencing). Of course, p itself is also the name of another memory location, just as x is. Hence,

printf("%p", (void *)&x); //Address of x.

Prints:
0xAABBEFF7 (This is the address that x is the name of)

printf("%p", (void *)p); //Data contained in p -> the address of x.

Prints:
0xAABBEFF7 (This is the address that p contains, and is the address that x represents)

printf("%p", (void *)&p); //Address of p itself.

Prints:
0xFFCC4433 (This is the address that p is the name of)

NOTE
1) The addresses "0xAA..." and "0xFF..." are numbers in hexadecimal form, which is the usual way to display memory addresses.
2) The addresses displayed are for example only. In practice the addresses may change every time the above code is executed - they are decided by the system at run time.

Memory allocated using the malloc(), calloc() or realloc() functions or the new operator must be reclaimed explicitly. When the function that holds the pointer returns, the pointer variable is destroyed (i.e., the memory which the variable itself occupies is reclaimed), but the allocated memory that the pointer was pointing to is not reclaimed. Over time, this un-reclaimed memory can cause the system to run out of memory; it constitutes a memory leak. Allocated memory must therefore be reclaimed explicitly before the program or function terminates.
Memory allocated using the malloc(), calloc() or realloc() functions must be reclaimed by using the free() function.
Memory allocated using the new operator must be reclaimed by using the delete operator.

free(p); //For memory allocated using malloc(), calloc(), realloc()
delete p; //For memory allocated using the new operator
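Allocation, pointer arithmetic and reclamation are often combined when working with dynamically sized arrays. The sketch below is one possible illustration in C; sum_first_n is a hypothetical function name, and returning -1 on allocation failure is an assumption made for this example only.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: stores 1..n in a heap-allocated array, adds the
   elements up, and releases the array before returning.
   Returns -1 if the allocation fails. */
long sum_first_n(size_t n)
{
    int *block;
    int *end;
    int *p;
    int value = 1;
    long total = 0;

    if (n == 0)
        return 0;                    /* nothing to allocate */

    block = malloc(n * sizeof *block);
    if (block == NULL)
        return -1;                   /* allocation failed */

    end = block + n;                 /* one past the last element */

    for (p = block; p != end; p++)   /* p++ advances by one int */
        *p = value++;

    for (p = block; p != end; p++)
        total += *p;

    free(block);                     /* reclaim the memory before returning */
    return total;
}
```

Every path that reaches malloc() successfully also reaches free(), so the function does not leak the block however many times it is called.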

A complete example

#include <stdio.h>
#include <stdlib.h>
void someFunction(void)
{
   int x; //Variable of type int.
   int *p1, *p2, *p3; //Pointers to type int.

   p1 = &x; //Initialise p1 with address of x.
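   p2 = (int *)malloc(sizeof(int)); //C style. Initialise using malloc(); the cast is needed only when compiling as C++.
   p3 = new int; //C++ only. Initialise using new; because of this line the example must be compiled as C++.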

   *p1 = 100;
   printf("x = %d, *p1 = %d \n", x, *p1);

   *p2 = 200;
   printf("*p2 = %d \n", *p2);

   *p3 = 300;
   printf("*p3 = %d \n", *p3);

   //p1 need not be reclaimed since it holds the address of x, which is itself going to be destroyed when the function returns.

   //p2 has been initialised using malloc(), so reclaim memory it points to using free().
   free(p2);

   //p3 has been initialised using new, so reclaim memory it points to using delete.
   delete p3;
}

int main(void)
{
   someFunction();
   return 0;
}

Output:

x = 100, *p1 = 100
*p2 = 200
*p3 = 300
