Pointers and Typecasting
A pointer is a special type of variable that can reference, or point to, another variable. When initialized, a pointer is said to reference another variable, or to contain a memory location. To understand pointers fully you need to remember two key concepts:

1) A variable is really a memory address.
2) The memory address is where the data is stored.

So when we assign a variable a value there are really two values involved. The first is the value we want to store, and the second is the starting address of the memory we want to store it in. That address is a value in and of itself. On 32-bit processors you can think of a memory address as a UINT type.

    A = 1

Seems simple enough. Internally the compiler tells the linker that we want to move the number 1 into the address that A represents. That address is picked by the linker when your program is compiled and can be anywhere in your program's address space. A pointer, then, is a variable that has an address, and in that address we store another address.

    A = 1 : 'Store the number 1 in the address that A represents

Retrieving the value contained in the address that is stored in the pointer is called dereferencing. In some texts you may see it referred to as indirection. The EBASIC compiler supports two general dereferencing operators: the # symbol and a 'C' style dereference operator, the '*'. The hash dereference operator '#' is unique to the EBASIC language and is suitable for most basic pointer needs.

    PRINT #<INT>p

The above statement may look a bit strange. Let's break it down by what it does.

    # - We want to perform a dereferencing operation
    p - The pointer that contains the address of a variable
    <INT> - The value stored in the address of the variable is of type INT

When we combine the two examples the dereferencing operation returns the number 1. Specifying the type between the < > symbols is called type casting. The compiler must know what type is stored in the address being dereferenced.
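For readers who know C, the dereference-with-type-cast idea can be sketched as follows. This is an analogy in C, not EBASIC syntax: the cast `(const int *)` plays roughly the role that `<INT>` plays in `PRINT #<INT>p`, and the function names `read_int` and `demo_dereference` are hypothetical, chosen only for this illustration.

```c
/* Read an int through an untyped pointer.
   The cast tells the compiler what type lives at the address,
   much as <INT> does in the EBASIC statement PRINT #<INT>p. */
int read_int(const void *p)
{
    return *(const int *)p;
}

/* Mirrors the walkthrough: a variable holds 1, a pointer holds the
   variable's address, and dereferencing the pointer yields 1. */
int demo_dereference(void)
{
    int a = 1;       /* store the number 1 in the address that a represents */
    void *p = &a;    /* store the address of a in p */
    return read_int(p);
}
```

The untyped `void *` is used here because, like an EBASIC POINTER, it carries an address but no information about what is stored there.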
If the pointer was assigned directly as above you can omit the type cast and simplify the statement to:

    PRINT #p

This only works if the pointer was assigned an actual variable and you haven't crossed a subroutine boundary with the pointer. The compiler will generate the error message "Type cast must be specified" if it cannot automatically determine the type of the value an address contains. For pointers to a UDT (user data type) variable a type cast must always be specified.

You can determine the type that a pointer is currently referencing by using the TYPEOF function. If a pointer has never been assigned an address then TYPEOF will return -1.

Dereferencing operations can appear on both the left-hand and right-hand side of an assignment. In this way you can indirectly change the contents of a variable by using the pointer. Continuing with the statements from above:

    #<INT>p = 2

This stores the value 2 into the address of the variable pointed to by p, and that variable is of type INT.

It is important to note that if a pointer is a NULL pointer then any dereferencing operation will result in an access violation crash. A NULL pointer is a pointer that hasn't been initialized or assigned an address yet. Dereferencing a NULL pointer is one of the most common reasons that programs crash. If you're unsure what a pointer will contain, such as a pointer passed to a subroutine, then always test your pointers for NULL before performing a dereference operation.

    IF p <> NULL

Type casting

Type casting was briefly mentioned in the last section. It is an important enough topic to cover in further detail, because the real power of pointers does not become apparent until you realize that a memory address can be dereferenced as any built-in or user-defined type. As mentioned above, a type cast is the variable type between the < > symbols. A common use for type casting is using a block of memory to store any kind of data, structured or not.
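As a rough analogy in C rather than EBASIC, the NULL test and the reuse of one memory block for different types might look like the sketch below. The function names `store_int` and `reuse_block`, the block size, and the stored values are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Store a value through an untyped pointer, but only after testing
   for NULL. Returns 1 on success, 0 if the pointer was NULL. */
int store_int(void *p, int value)
{
    if (p == NULL)
        return 0;           /* dereferencing NULL would crash */
    *(int *)p = value;      /* comparable in spirit to #<INT>p = value */
    return 1;
}

/* Use one allocated block for two different types in turn:
   first as a string, then as an int. Returns the int read back,
   or -1 if the allocation failed. */
int reuse_block(void)
{
    void *mem = malloc(100);
    int result;

    if (mem == NULL)
        return -1;
    strcpy((char *)mem, "text");   /* treat the block as a string */
    store_int(mem, 1234);          /* treat the same block as an int */
    result = *(int *)mem;
    free(mem);
    return result;
}
```

Writing the int overwrites the start of the string, which is the point: the block has no type of its own, and the cast decides how its bytes are interpreted.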
    DEF pMem as POINTER

When accessing a UDT with a pointer you use the same dot notation you would with a statically defined UDT variable.

    DEF pMem as POINTER

The above seems a bit cumbersome. Luckily you can use the SETTYPE command to preset the type of a pointer, so you only need to specify the type once:

    DEF pMem as POINTER

Pointer math and array indexes

Since a pointer contains an address, and that address is a 32-bit value, you can change the address the pointer contains by using any standard math operator.

    pMem = NEW(CHAR,100)

The C style pointer dereferencing supports direct pointer math:

    pMem = NEW(CHAR,100)

It is important to remember that pointer math in this manner always works in units of a single byte, regardless of the type cast. Multiply by the size of the variable for more complex operations:

    pMem = NEW(UINT,100)

A simpler method to access a pointer to an array of data is to use the standard array indexes. In this manner you do not have to account for the size of the variable, as the type cast will specify the size of the data being stored.

    pMem = NEW(UINT,100)

Multiple indirection

Multiple indirection, or multiple dereferencing, is performing a dereferencing operation on a pointer more than once. In other words, a pointer can contain the address of another pointer. Emergence BASIC supports multiple indirection through both the Emergence BASIC and C style operators. It is a bit easier with the C style. There really is no simple example of multiple indirection, but we will cover the basics. To get the address of a pointer use the & operator:

    DEF p1,p2,temp as POINTER

In this example the & operator gets the address of a variable. This is necessary because assigning one pointer to another would normally copy the address contained by the pointer, not the address of the pointer itself.
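The two-level arrangement can be sketched in C as an analogy, not as EBASIC code. The names `p1` and `p2` follow the text; `target`, `store_via_two_levels` and `demo_two_levels` are hypothetical names introduced only for this sketch.

```c
/* p2 holds the address of another pointer, and that pointer holds the
   address of an int. Reading the casts right to left: treat p2 as a
   pointer to a pointer, fetch the inner pointer, then treat that inner
   pointer as a pointer to int and store through it. */
void store_via_two_levels(void *p2, int value)
{
    *(int *)*(void **)p2 = value;
}

int demo_two_levels(void)
{
    int target = 0;
    void *p1 = &target;   /* p1 contains the address of target */
    void *p2 = &p1;       /* p2 contains the address of p1 itself */

    store_via_two_levels(p2, 5);
    return target;        /* target was changed through both levels */
}
```

Note that `p2 = &p1` takes the address of the pointer variable, whereas `p2 = p1` would only copy the address already stored in `p1`.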
The statement *<INT>*<POINTER>p2 = 5, when broken down, means "p2 is a pointer; it contains the address of a POINTER that contains the address of an INT; store the number 5 in that address".

There is no limit on the number of levels of indirection, although it would be unusual to see more than two. Text editors often use two levels of indirection to store lines of text. The first level is an array of pointers, and the second level consists of pointers to strings containing each line of text. In this manner the text on each line can be adjusted individually without affecting any other line.
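The text-editor arrangement described above can be sketched in C, again as an analogy rather than EBASIC code. The function name `set_line` and its parameters are assumptions for illustration; a real editor would also manage the size of the pointer array.

```c
#include <stdlib.h>
#include <string.h>

/* Replace one line of text without touching any other line.
   lines is the first level (an array of pointers); each element is the
   second level (a pointer to that line's own string).
   Returns 1 on success, 0 if memory could not be allocated. */
int set_line(char **lines, size_t index, const char *text)
{
    char *copy = malloc(strlen(text) + 1);

    if (copy == NULL)
        return 0;
    strcpy(copy, text);
    free(lines[index]);     /* free(NULL) is a no-op, so empty slots are fine */
    lines[index] = copy;    /* only this slot's pointer changes */
    return 1;
}
```

Because each line owns its own string, a line can grow or shrink by swapping a single pointer in the array, and the strings for the other lines stay where they are.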