User Contributed Dictionary
Noun
pointers
- Plural form of pointer.
- tips
- addresses within a computer's memory (see: C programming)
Extensive Definition
In computer
science, a pointer is a programming
language data type whose
value refers directly to (or "points to") another value stored
elsewhere in the computer
memory using its address.
Obtaining the value to which a pointer refers is called
dereferencing the pointer. A pointer is a simple implementation of
the general
reference data type (although it is quite different from the
facility referred to as a reference
in C++). Pointers to data improve performance for repetitive
operations such as traversing
string and tree
structures, and pointers to
functions are used for binding
methods
in Object-oriented
programming and
run-time linking to Dynamic Link Libraries (DLLs).
While "pointer" has been used to refer to
references in general, it more properly applies to data structures
whose interface explicitly allows the pointer to be manipulated as
a memory address. Because pointers allow largely unprotected access
to memory addresses, there are risks associated with using them.
For general information about references, see
reference (computer science).
Pointers in data structures
When setting up data structures like lists, queues and trees, it is necessary to have pointers to help manage the way in which the structure is implemented and controlled. Typical examples of pointers would be start pointers, end pointers, or stack pointers.
Architectural roots
Pointers are a very thin abstraction on top of the addressing capabilities provided by most modern architectures. In the simplest scheme, an address, or a numeric index, is assigned to each unit of memory in the system, where the unit is typically either a byte or a word, effectively transforming all of memory into a very large array. Then, if we have an address, the system provides an operation to retrieve the value stored in the memory unit at that address.
In the usual case, a pointer is large enough to
hold more addresses than there are units of memory in the system.
This introduces the possibility that a program may attempt to
access an address which corresponds to no unit of memory, either
because not enough memory is installed or the architecture does not
support such addresses. The first case may, on certain platforms such as
the Intel x86
architecture, be called a segmentation
fault (segfault). The second case is possible in the current
implementation of AMD64, where
pointers are 64 bits long and addresses only extend to 48 bits.
There, pointers must conform to certain rules (canonical
addresses), so if a noncanonical pointer is dereferenced, the
processor raises a general
protection fault.
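The canonical-address rule can be sketched in C. This is only an illustration, assuming the 48-bit implementation described above; the helper name is_canonical is hypothetical, and on real hardware the processor performs this check itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when a 64-bit address is canonical
 * under a 48-bit implementation, that is, when bits 63..47 are all
 * copies of bit 47 (all zero or all one). */
bool is_canonical(uint64_t addr)
{
    uint64_t upper = addr >> 47;   /* the top 17 bits, 63..47 */
    return upper == 0 || upper == 0x1FFFF;
}
```

Under this assumption, is_canonical(0x00007FFFFFFFFFFF) and is_canonical(0xFFFF800000000000) hold, while 0x0000800000000000 is rejected because bit 47 is set but the bits above it are clear.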
On the other hand, some systems have more units
of memory than there are addresses. In this case, a more complex
scheme such as memory
segmentation or paging is employed to use
different parts of the memory at different times. Later 32-bit
incarnations of the x86 architecture supported up to 36 bits of
physical memory addresses, which were mapped to the 32-bit linear
address space through the PAE
paging mechanism. Thus, only 1/16 of the possible total memory could
be accessed at a time. Another example in the same computer family
was the 16-bit protected
mode of the 80286 processor,
which, though supporting only 16 MiB of physical memory, could
access up to 1 GiB of virtual memory, but the combination of 16-bit
address and segment registers made accessing more than 64 KiB in
one data structure cumbersome. Some restrictions of ANSI pointer
arithmetic may have been due to the segmented memory models of this
processor family.
In order to provide a consistent interface, some
architectures provide memory-mapped
I/O, which allows some addresses to refer to units of memory
while others refer to device
registers of other devices in the computer. There are analogous
concepts such as file offsets, array indices, and remote object
references that serve some of the same purposes as addresses for
other types of objects.
Uses
Pointers are directly supported without restrictions in languages such as C, C++, Pascal and most assembly languages. They are primarily used for constructing references, which in turn are fundamental to constructing nearly all data structures, as well as in passing data between different parts of a program.
In functional programming languages that rely
heavily on lists, pointers and references are managed abstractly by
the language using internal constructs like cons.
When dealing with arrays, the critical lookup
operation typically involves a stage called address calculation
which involves constructing a pointer to the desired data element
in the array. In other data structures, such as linked lists,
pointers are used as references to explicitly tie one piece of the
structure to another.
Pointers are used to pass parameters by
reference. This is useful if we want a function's modifications to
a parameter to be visible to the function's caller. This is also
useful for returning multiple values from a function.
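As a minimal sketch of returning several values through pointer parameters, the hypothetical function divide below writes a quotient and a remainder into locations supplied by the caller. The name, signature and success/failure convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: returns both quotient and remainder through
 * pointer parameters; the return value reports success (1) or failure (0). */
int divide(int numerator, int denominator, int *quotient, int *remainder)
{
    assert(quotient != NULL && remainder != NULL);
    if (denominator == 0)
        return 0;                /* outputs are left untouched */
    *quotient = numerator / denominator;
    *remainder = numerator % denominator;
    return 1;
}
```

A caller would declare int q, r; and call divide(17, 5, &q, &r), after which q is 3 and r is 2.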
C pointers
The basic syntax to define a pointer is
int *money;
This declares money as a pointer to an integer.
Since the contents of memory are not guaranteed to be of any
specific value in C, care must be taken to ensure that the address
that money points to is valid. This is why it is suggested to
initialize the pointer to NULL
int *money = NULL;
Dereferencing a NULL pointer is undefined behavior in C; on most
systems a runtime error occurs and execution stops, likely with a segmentation
fault.
Once a pointer has been declared then, perhaps,
the next logical step is to point it at something
int a = 5;
int *money = NULL;
money = &a;
This assigns the value of money to be the address
of a. For example, if a is stored at memory location of 0x8130 then
the value of money will be 0x8130 after the assignment. To
dereference the pointer, an asterisk is used again
*money = 8;
This says to take the contents of money (which is
0x8130), go to that address in memory and set its value to 8. If a
were then accessed then its value will be 8.
This example may be more clear if memory were
examined directly. Assume that a is located at address 0x8130 in
memory and money at 0x8134; also assume this is a 32-bit machine
such that an int is 32 bits wide. The following is what would be in
memory after the following code snippet were executed
int a = 5;
int *money = NULL;
Address  Contents
0x8130   0x00000005
0x8134   0x00000000
(The NULL pointer shown here is 0x00000000.) Assigning the address of a to money
money = &a;
yields the following memory values
Address  Contents
0x8130   0x00000005
0x8134   0x00008130
Then by dereferencing money by doing
*money = 8;
the computer will take the contents of money
(which is 0x8130), go to that address, and assign 8 to that
location, yielding the following memory.
Address  Contents
0x8130   0x00000008
0x8134   0x00008130
Clearly, accessing a will yield the value of 8
because the previous instruction modified the contents of a by way
of the pointer money.
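The steps walked through above can be collected into one small function. This is a minimal sketch; the function name pointer_walkthrough is hypothetical, and the actual addresses will differ from the 0x8130 and 0x8134 used in the text.

```c
#include <stddef.h>

/* Runs the steps from the text and returns the final value of a. */
int pointer_walkthrough(void)
{
    int a = 5;
    int *money = NULL;   /* points nowhere yet */

    money = &a;          /* money now holds the address of a */
    *money = 8;          /* writes 8 into a through the pointer */

    return a;            /* a was modified by way of money */
}
```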
C arrays
Arrays take C pointers a step further.
In C, array indexing is formally defined in terms
of pointer arithmetic; that is, the language specification requires
that array[i] be equivalent to *(array + i). Thus in C, arrays can
be thought of as pointers to consecutive areas of memory, and the
syntax for accessing arrays is identical for that which can be used
to dereference pointers. For example, an array array can be
declared and used in the following manner:
int array[5];      /* Declares 5 contiguous integers */
int *ptr = array;  /* Arrays can be used as pointers */
ptr[0] = 1;        /* Pointers can be indexed with array syntax */
*(array + 1) = 2;  /* Arrays can be dereferenced with pointer syntax */
This allocates a block of five integers named array; in most
expressions array is converted to a pointer to the first element of this block. Another common use of
pointers is to point to dynamically allocated memory from malloc which returns a
consecutive block of memory of no less than the requested size that
can be used as an array.
While most operators on arrays and pointers are
equivalent, it is important to note that the sizeof operator will
differ. In this example, sizeof(array) will evaluate to
5*sizeof(int) (the size of the array), while sizeof(ptr) will
evaluate to sizeof(int*), the size of the pointer itself.
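The difference between a true array and a pointer to a malloc block can be sketched as follows. The helper make_sequence and the macro ARRAY_LEN are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocates count ints with malloc and fills them
 * with 0, 1, 2, ... Returns NULL if allocation fails. The caller must
 * free() the result. */
int *make_sequence(size_t count)
{
    int *block = malloc(count * sizeof *block);
    if (block == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++)
        block[i] = (int)i;   /* array syntax works on the pointer */
    return block;
}

/* Number of elements in a true array. It does not work on a pointer,
 * because sizeof applied to a pointer gives the size of the pointer itself. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))
```

For int array[5], ARRAY_LEN(array) is 5, whereas for int *seq = make_sequence(5) the expression sizeof(seq) is only sizeof(int *), so the element count has to be tracked separately.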
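Default values of an array can be declared
like:
int array[5] = {2, 4, 3, 1, 5};
If you assume that array is located in memory
starting at address 0x1000 on a 32-bit little-endian
machine then memory will contain the following:
Address  +0  +1  +2  +3
0x1000   02  00  00  00
0x1004   04  00  00  00
0x1008   03  00  00  00
0x100C   01  00  00  00
0x1010   05  00  00  00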
Represented here are five integers: 2, 4, 3, 1,
and 5. These five integers occupy 32 bits (4 bytes) each with the
least-significant byte stored first (this is a little-endian
architecture) and are stored consecutively starting at address
0x1000.
The syntax for C with pointers is:
- array means 0x1000
- array+1 means 0x1004 (note that the "+1" really means to add one times the size of an int (4 bytes) not literally "plus one")
- *array means to dereference the contents of array which means to consider the contents as a memory address (0x1000) and to go look up the value at that memory location (0x1000)
- array[i] means the ith index of array which is translated into *(array + i)
The last example is how to access the contents of
array. Breaking it down:
- array + i is the memory location of the ith element of array
- *(array + i) takes that memory address and dereferences it to access the value.
E.g. array[3] is synonymous with *(array+3),
meaning *(0x1000 + 3*sizeof(int)), which says "dereference the
value stored at 0x100C", in this case 0x0001.
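The equivalence between indexing and pointer arithmetic can be checked with a short sketch. The function name nth_by_pointer is hypothetical; the array contents in the usage note are the five integers from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Returns element i of an int array using explicit pointer arithmetic. */
int nth_by_pointer(const int *array, size_t i)
{
    assert(array != NULL);
    return *(array + i);   /* same as array[i] */
}
```

With int array[5] = {2, 4, 3, 1, 5}, nth_by_pointer(array, 3) yields the same value as array[3], and array + 1 lies exactly sizeof(int) bytes past array.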
C linked list
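Below is an example of the definition of a linked list in C.
/* the empty linked list is represented by NULL
   or some other signal value */
#define EMPTY_LIST NULL
struct link {
    void *data;         /* the data of this link */
    struct link *next;  /* the next link; EMPTY_LIST if this is the last link */
};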
Note that this pointer-recursive definition is
essentially the same as the reference-recursive definition from the
Haskell programming language:
data Link a = Nil | Cons a (Link a)
Nil is the empty list, and Cons a (Link a) is a cons cell of type a with another
link also of type a.
The definition with references, however, is
type-checked and doesn't use potentially confusing signal values.
For this reason, data structures in C are usually dealt with via
wrapper
functions, which are carefully checked for correctness.
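A few wrapper functions of the kind mentioned above might look like the following sketch. The struct layout (a data pointer plus a next pointer) and the function names list_push, list_length and list_free are assumptions made for illustration, not a fixed API.

```c
#include <stddef.h>
#include <stdlib.h>

#define EMPTY_LIST NULL

/* Assumed layout: one data pointer plus a pointer to the next link. */
struct link {
    void *data;
    struct link *next;
};

/* Wrapper: prepends a new link holding data; returns the new head,
 * or the unchanged head if allocation fails. */
struct link *list_push(struct link *head, void *data)
{
    struct link *node = malloc(sizeof *node);
    if (node == NULL)
        return head;
    node->data = data;
    node->next = head;
    return node;
}

/* Wrapper: counts links, treating EMPTY_LIST as length 0. */
size_t list_length(const struct link *head)
{
    size_t n = 0;
    for (; head != EMPTY_LIST; head = head->next)
        n++;
    return n;
}

/* Wrapper: frees every link (but not the data the links point to). */
void list_free(struct link *head)
{
    while (head != EMPTY_LIST) {
        struct link *next = head->next;
        free(head);
        head = next;
    }
}
```

Keeping every comparison against EMPTY_LIST inside these wrappers means the rest of a program never has to test the signal value directly.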
Pass by reference
Pointers can be used to pass variables by reference, allowing their value to be changed. For example:
void not_alter(int n)
void alter(int *n)
void func(void)
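The bodies of these functions are not shown here. A minimal sketch of what they might contain follows; the constants 360, 120 and 24 are illustrative assumptions, and func_demo is a hypothetical variant of func that returns the result so the effect can be observed.

```c
/* Receives a copy of the argument; the caller's variable is unaffected. */
void not_alter(int n)
{
    n = 360;
    (void)n;   /* the local copy is discarded on return */
}

/* Receives the address of the caller's variable and writes through it. */
void alter(int *n)
{
    *n = 120;
}

/* Returns the caller-side value after both calls. */
int func_demo(void)
{
    int x = 24;
    alter(&x);      /* x is now 120 */
    not_alter(x);   /* x is still 120 */
    return x;
}
```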
Memory-mapped hardware
On some computing architectures, pointers can be used to directly manipulate memory or memory-mapped devices.
Assigning addresses to pointers is an invaluable
tool when programming microcontrollers. Below
is a simple example declaring a pointer of type int and
initialising it to a hexadecimal address, in this
example the constant 0x7FFF:
int *hardware_address = (int *)0x7FFF;
In the mid 80s, using the BIOS to access the
video capabilities of PCs was slow. Applications that were
display-intensive typically used to access CGA
video memory directly by casting the hexadecimal constant
0xB8000000 to a pointer to an array of 80 unsigned 16-bit int
values. Each value consisted of an ASCII code in the low
byte, and a colour in the high byte. Thus, to put the letter 'A' at
row 5, column 2 in bright white on blue, one would write code like
the following:
#define VID ((unsigned (*)[80])0xB8000000)
void foo()
{
    VID[4][1] = 0x1F00 | 'A';  /* row 5, column 2: bright white on blue */
}
Typed pointers and casting
In many languages, pointers have the additional restriction that the object they point to has a specific type. For example, a pointer may be declared to point to an integer; the language will then attempt to prevent the programmer from pointing it to objects which are not integers, such as floating-point numbers, eliminating some errors.
For example, in C
int *money; char *bags;
money would be an integer pointer and bags would
be a char pointer. The following would yield a compiler warning of
"assignment from incompatible pointer type" under GCC
bags = money;
because money and bags were declared with
different types. To suppress the compiler warning, it must be made
explicit that you do indeed wish to make the assignment by typecasting
it
bags = (char *)money;
which says to cast the integer pointer of money
to a char pointer and assign to bags.
In languages that allow pointer arithmetic,
arithmetic on pointers takes into account the size of the type. For
example, adding an integer number to a pointer produces another
pointer that points to an address that is higher by that number
times the size of the type. This allows us to easily compute the
address of elements of an array of a given type, as was shown in
the C arrays example above. When a pointer of one type is cast to
another type of a different size, the programmer should expect that
pointer arithmetic will be calculated differently. In C, for
example, if the money array starts at 0x2000 and sizeof(int) is 4
bytes whereas sizeof(char) is by definition 1 byte, then (money+1) will point to
0x2004 but (bags+1) will point to 0x2001. Other risks of casting
include loss of data when "wide" data is written to "narrow"
locations (e.g. bags[0]=65537;), unexpected results when bit-shifting
values, and comparison problems, especially with signed vs unsigned
values.
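The effect of the pointed-to type on pointer arithmetic can be measured directly. The sketch below uses the names money and bags from the text; the helper functions int_step and char_step are hypothetical. In C, sizeof(char) is 1 by definition, so a char pointer always advances one byte at a time.

```c
#include <stddef.h>

/* Byte distance covered by adding 1 to an int pointer. */
size_t int_step(void)
{
    int money[2] = {0, 0};
    return (size_t)((char *)(money + 1) - (char *)money);
}

/* Byte distance covered by adding 1 to the same address viewed as char *. */
size_t char_step(void)
{
    int money[2] = {0, 0};
    char *bags = (char *)money;   /* explicit cast, as in the text */
    return (size_t)((bags + 1) - bags);
}
```

On any conforming implementation int_step() equals sizeof(int) and char_step() equals 1.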
Although it's impossible in general to determine
at compile-time which casts are safe, some languages store run-time
type information which can be used to confirm that these
dangerous casts are valid at runtime. Other languages merely accept
a conservative approximation of safe casts, or none at all.
Making pointers safer
Because pointers allow a program to access objects that are not explicitly declared beforehand, they enable a variety of programming errors. However, the power they provide is so great that it can be difficult to do some programming tasks without them. To help deal with their problems, many languages have created objects that have some of the useful features of pointers, while avoiding some of their pitfalls.
One major problem with pointers is that as long
as they can be directly manipulated as a number, they can be made
to point to unused addresses or to data which is being used for
other purposes. Many languages, including most
functional programming languages and recent imperative
languages like
Java, replace pointers with a more opaque type of reference,
typically referred to as simply a reference, which can only be used
to refer to objects and not manipulated as numbers, preventing this
type of error. Array indexing is handled as a special case.
A pointer which does not have any address
assigned to it is called a wild
pointer. Any attempt to use such uninitialized pointers can
cause unexpected behaviour, either because the initial value is not
a valid address, or because using it may damage the runtime system
and other unrelated parts of the program.
In systems with explicit memory allocation, it's
possible to create a dangling
pointer by deallocating the memory region it points into. This
type of pointer is dangerous and subtle because a deallocated
memory region may contain the same data as it did before it was
deallocated but may be then reallocated and overwritten by
unrelated code, unknown to the earlier code. Languages with
garbage collection prevent this type of error.
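In C, one common defensive habit is to reset a pointer to NULL as soon as its memory is released. The helper free_and_clear below is a hypothetical sketch of that habit for int pointers; it is not a standard library function, and it protects only the one pointer variable passed to it.

```c
#include <stdlib.h>

/* Hypothetical helper: frees *slot and resets it to NULL so the caller's
 * pointer cannot be used as a dangling pointer afterwards. */
void free_and_clear(int **slot)
{
    if (slot == NULL)
        return;
    free(*slot);    /* free(NULL) is a no-op, so repeated calls are safe */
    *slot = NULL;
}
```

Any other copies of the same address elsewhere in the program still dangle after the call, which is why this habit reduces the problem rather than eliminating it.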
Some languages, like C++, support smart
pointers, which use a simple form of reference
counting to help track allocation of dynamic memory in addition
to acting as a reference. In the absence of reference cycles, where
an object refers to itself indirectly through a sequence of smart
pointers, these eliminate the possibility of dangling pointers and
memory leaks. Delphi
strings support reference counting natively.
The null pointer
A null pointer has a reserved value, often but not necessarily the value zero, indicating that it refers to no object. Null pointers are used routinely, particularly in C and C++ where the compile-time constant NULL is used, to represent conditions such as the lack of a successor to the last element of a linked list, while maintaining a consistent structure for the list nodes. This use of null pointers can be compared to the use of null values in relational databases and to the “Nothing” value in the “Maybe” monad.
Because it does not refer to a meaningful object,
an attempt to dereference a null pointer usually causes a run-time
error that, if unhandled, terminates the program immediately. In
C, dereferencing a null pointer is undefined behavior; on most hosted
platforms execution halts with a segmentation fault because the
address that NULL maps to is never allocated to a running
program. In Java, access to a null reference triggers a NullPointerException,
which can be caught by error handling code, but the preferred
practice is to ensure that such exceptions never occur. In safe
languages a possibly-null pointer can be replaced with a tagged union
which enforces explicit handling of the exceptional case; in fact,
a possibly-null pointer can be seen as a tagged union with a
computed tag.
In C and C++ programming, two null pointers are
guaranteed to compare equal; ANSI C guarantees
that any null pointer will compare equal to the integer
constant 0.
A null pointer should not be confused with an
uninitialized pointer: a null pointer is guaranteed to compare
unequal to any valid pointer, whereas depending on the language and
implementation an uninitialized pointer might have either an
indeterminate (random or meaningless) value or might be initialised
to an initial constant (possibly but not necessarily NULL).
In most C programming environments malloc returns a NULL pointer if
it is unable to allocate the memory region requested, which
notifies the caller that there is insufficient memory available.
However, some implementations of malloc may return a NULL pointer for a
successful malloc(0) and instead indicate failure by both
returning NULL and setting errno to an appropriate
value.
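One way to sidestep the malloc(0) ambiguity is never to request zero bytes. The wrapper alloc_nonzero below is a hypothetical sketch of that approach, not a standard function.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: never calls malloc(0), so a NULL result from
 * this function always means that the allocation failed. */
void *alloc_nonzero(size_t size)
{
    if (size == 0)
        size = 1;   /* request at least one byte */
    return malloc(size);
}
```

The caller still has to check the result against NULL and release it with free().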
Computer systems based on a tagged
architecture are able to distinguish in hardware between a NULL
dereference and a legitimate attempt to access a word or structure
at address zero.
In some programming language environments (at
least one proprietary Lisp implementation, for example) the value
used as the null pointer (called nil in Lisp) may actually be a
pointer to a block of internal data useful to the implementation
(but not explicitly reachable from user programs), thus allowing
the same register to be used as a useful constant and a quick way
of accessing implementation internals. This is known as the nil
vector.
Double indirection
In C, it is possible to have a pointer point at another pointer. Although a higher number of pointer dereferences will add a performance penalty, this can make manipulating certain data structures particularly neat and elegant. For instance, consider this code to insert an item into a simple linked list:
struct element ;
struct element *head = NULL;
void insert(struct element *item)
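The struct fields and the body of insert are not shown here. The sketch below fills them in under stated assumptions: the list is taken to hold integers kept in ascending order, and the field names next and value are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: a sorted list of ints. */
struct element {
    struct element *next;
    int value;
};

struct element *head = NULL;

/* Inserts item in ascending order. p always points at the pointer that
 * should be rewritten: first head itself, then some element's next field. */
void insert(struct element *item)
{
    struct element **p;

    assert(item != NULL);
    for (p = &head; *p != NULL; p = &(*p)->next) {
        if (item->value <= (*p)->value)
            break;
    }
    item->next = *p;
    *p = item;
}
```

Because p is a pointer to a pointer, inserting at the front of the list and inserting in the middle are handled by the same two assignments, with no special case for head.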
Wild pointers
Wild pointers are pointers that have not been initialized (that is, set to point to a valid address) and may make a program crash or behave oddly. In the Pascal or C programming languages, pointers that are not specifically initialized may point to unpredictable addresses in memory.
The following example code shows a wild
pointer:
int func(void)
{
    char *p2;   /* wild (uninitialized) pointer */
    *p2 = 'b';  /* undefined behavior */
}
Here, p2 may point to anywhere in
memory, so performing the assignment *p2 = 'b' will corrupt an
unknown area of memory that may contain sensitive data.
Note that in C and derived languages, static
variables without an initializer are initialized to zero at
program start. Thus, if p2 were declared static, the example above would dereference a NULL
pointer, which typically leads to a segmentation fault.
Support in various programming languages
A number of languages support some type of pointer, although some are more restricted than others. If a pointer is significantly abstracted, such that it can no longer be manipulated as an address, the resulting data structure is no longer a pointer; see the more general reference article for more discussion of these.
Ada
Ada is a strongly typed language where all pointers are typed and only safe type conversions are permitted. All pointers are by default initialized to null, and any attempt to access data through a null pointer causes an exception to be raised. Pointers in Ada are called access types. Ada 83 did not permit arithmetic on access types (although many compiler vendors provided for it as a non-standard feature), but Ada 95 supports “safe” arithmetic on access types via the package System.Storage_Elements.
BASIC
BASIC does not support pointers. Some dialects of BASIC, including FreeBASIC, have exhaustive pointer implementations, however.
In FreeBASIC, maths on ANY pointers (equivalent
to C's void*) are treated as though the ANY pointer was a byte
width. ANY pointers cannot be dereferenced, as in C. Also, casting
between ANY and any other type's pointers will not generate any
warnings.
dim as integer f = 257
dim as any ptr g = @f
dim as integer ptr i = g
assert(*i = 257)
assert( (g + 4) = (@f + 1) )
C and C++
In C and C++ pointers are variables that store addresses and can be null. Each pointer has a type it points to, but one can freely cast between pointer types. A special pointer type called the “void pointer” points to an object of unspecified type and cannot be dereferenced. The address can be directly manipulated by casting a pointer to and from an integral type of sufficient size (not defined in the language itself, but possibly in standard headers).
C++ fully supports C
pointers and C typecasting. It also supports a new group of
typecasting operators to help catch some unintended dangerous casts
at compile-time. The C++
standard library also provides auto_ptr, a sort
of smart
pointer which can be used in some situations as a safe
alternative to primitive C pointers. C++ also supports another form
of reference, quite different from a pointer, called simply a
reference
or reference type.
Pointer arithmetic, that is, the ability to
modify a pointer's target address with arithmetic operations (as
well as magnitude comparisons), is restricted by the language
standard to remain within the bounds of a single array object (or
just after it), though many non-segmented architectures will allow
for more lenient arithmetic. Adding or subtracting from a pointer
moves it by a multiple of the size of the datatype it points to. For
example, adding 1 to a pointer to 4-byte integer values will
increment the pointer by 4. This has the effect of incrementing the
pointer to point at the next element in a contiguous array of
integers -- which is often the intended result. Pointer arithmetic
cannot be performed on void pointers because the void type has
no size, and thus the pointed address cannot be added to. To
work directly with bytes, programmers usually cast pointers to BYTE*,
or to unsigned char* if BYTE isn't defined in the library
in use.
Pointer arithmetic provides the programmer with a
single way of dealing with different types: adding and subtracting
the number of elements required instead of the actual offset in
bytes (though the char pointer, char being defined as always
having a size of one byte, allows the element offset of pointer
arithmetic to be equal in practice to a byte offset). In particular,
the C definition explicitly declares that the syntax a[n], which is
the n-th element of the array a, is equivalent to *(a+n), which is
the content of the element pointed by a+n. This implies that n[a]
is equivalent to a[n].
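The contrast between element offsets and byte offsets, and the a[n] versus n[a] equivalence, can be sketched as follows. The function names element_distance and byte_distance are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Distance between two pointers into the same array, in elements. */
ptrdiff_t element_distance(const int *first, const int *last)
{
    assert(first != NULL && last != NULL);
    return last - first;
}

/* The same distance in bytes, obtained through char pointers. */
ptrdiff_t byte_distance(const int *first, const int *last)
{
    assert(first != NULL && last != NULL);
    return (const char *)last - (const char *)first;
}
```

For int a[4], element_distance(&a[0], &a[3]) is 3 while byte_distance(&a[0], &a[3]) is 3 * sizeof(int); and since a[2] means *(a + 2), the unusual spelling 2[a] names the same element.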
While powerful, pointer arithmetic can be a
source of computer
bugs. It tends to confuse novice programmers, forcing them
into different contexts: an expression can be an ordinary
arithmetic one or a pointer arithmetic one, and sometimes it is
easy to mistake one for the other. In response to this, many modern
high level computer languages (for example
Java) do not permit direct access to memory using addresses.
Also, the safe C dialect
Cyclone addresses many of the issues with pointers. See
C programming language for more criticism.
The void pointer, or void*, is supported in ANSI
C and C++ as a generic pointer type. A pointer to void can store an
address to any data type, and, in C, is automatically cast to any
other pointer type on assignment, but it must be explicitly cast if
dereferenced inline. K&R C used
char* for the “type-agnostic pointer” purpose.
int x = 4;
void* q = &x;
int* p = q;          /* void* automatically cast to int*: valid C, but not C++ */
int i = *p;
int j = *((int*)q);  /* when dereferencing inline, there is no automatic casting */
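A familiar use of void* as a generic pointer is the comparator passed to the standard library function qsort. The sketch below assumes an array of int; the names compare_ints and sort_ints are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparator with the signature qsort expects: both arguments arrive as
 * const void * and are converted back to const int * before use. */
int compare_ints(const void *lhs, const void *rhs)
{
    const int *a = lhs;   /* implicit conversion from void *: valid in C */
    const int *b = rhs;
    return (*a > *b) - (*a < *b);
}

/* Sorts count ints in ascending order using the standard qsort. */
void sort_ints(int *values, size_t count)
{
    assert(values != NULL || count == 0);
    qsort(values, count, sizeof *values, compare_ints);
}
```

The comparison is written as a difference of two tests rather than as *a - *b, which could overflow for values of large magnitude.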
C++ does not allow the automatic casting of void*
to other pointer types, not even in assignments. This was a design
decision to avoid careless and even unintended casts, though most
compilers only output warnings, not errors, when encountering other
ill casts.
int x = 4;
void* q = &x;
// int* p = q;                  // This fails in C++: there is no autocast from void*
int* a = (int*)q;               // C-style cast
int* b = static_cast<int*>(q);  // C++ cast
In C++, there is no void& (reference to void)
to complement void* (pointer to void), because references behave
like aliases to the variables they point to, and there can never be
a variable whose type is void.
C#
In the C# programming language, pointers are supported only under certain conditions: any block of code including pointers must be marked with the unsafe keyword. Such blocks usually require higher security permissions than pointerless code to be allowed to run. The syntax is essentially the same as in C++, and the address pointed can be either managed or unmanaged memory. However, pointers to managed memory (any pointer to a managed object) must be declared using the fixed keyword, which prevents the garbage collector from moving the pointed object as part of memory management while the pointer is in scope, thus keeping the pointer address valid.
The .NET
framework includes many classes and methods in the System and
System.Runtime.InteropServices namespaces (such as the Marshal
class) which convert .NET types (for example, System.String) to and
from many unmanaged types and pointers (for example, LPWSTR or void
*) to allow communication with unmanaged code.
D
The D programming language is a derivative of C and C++ which fully supports C pointers and C typecasting. However D also offers numerous constructs such as foreach loops, out function parameters, reference types, and advanced array handling which replace pointers for most routine programming tasks.
Fortran
Fortran-90 introduced a strongly-typed pointer capability. Fortran pointers contain more than just a simple memory address. They also encapsulate the lower and upper bounds of array dimensions, strides (for example, to support arbitrary array sections), and other metadata. An association operator, => is used to associate a POINTER to a variable which has a TARGET attribute. The Fortran-90 ALLOCATE statement may also be used to associate a pointer to a block of memory. For example, the following code might be used to define and create a linked list structure:
type real_list_t
  real :: sample_data(100)
  type (real_list_t), pointer :: next => null ()
end type
type (real_list_t), target :: my_real_list
type (real_list_t), pointer :: real_list_temp
real_list_temp => my_real_list
do
  read (1,iostat=ioerr) real_list_temp%sample_data
  if (ioerr /= 0) exit
  allocate (real_list_temp%next)
  real_list_temp => real_list_temp%next
end do
Fortran-2003 adds support for procedure pointers.
Also, as part of the C Interoperability feature, Fortran-2003
supports intrinsic functions for converting C-style pointers into
Fortran pointers and back.
Modula-2
Pointers are implemented very much as in Pascal, as are VAR parameters in procedure calls. Modula 2 is even more strongly typed than Pascal, with fewer ways to escape the type system. Some of the variants of Modula 2 (such as Modula-3) include garbage collection.
Oberon
Much as with Modula-2, pointers are available. There are still fewer ways to evade the type system and so Oberon and its variants are still safer with respect to pointers than Modula-2 or its variants. As with Modula-3, garbage collection is a part of the language specification.
Pascal
Pascal implements pointers in a straightforward, limited, and relatively safe way. It helps catch mistakes made by people who are new to programming, like dereferencing a pointer into the wrong datatype; however, a pointer can be cast from one pointer type to another. Pointer arithmetic is unrestricted; adding or subtracting from a pointer moves it by that number of bytes in either direction, but using the Inc or Dec standard procedures on it moves it by the size of the datatype it is declared to point to. Trying to dereference a null pointer, named nil in Pascal, or a pointer referencing unallocated memory, raises an exception in protected mode. Parameters may be passed using pointers (as var parameters) but are automatically handled by the static compilation system.