When writing a program, you frequently can't know how much memory the program will need when it runs. For example, a line read from a file at runtime might have any finite length. C and C++ programs use malloc, free, and their variants to dynamically allocate memory while the program is running. The rules for dynamic memory use include these:
· The number of allocation calls (calls to malloc) must exactly match the number of deallocation calls (calls to free).
· Reads and writes to the allocated memory must occur within the memory, not outside its range.
· The allocated memory cannot be used before it is allocated or after it is deallocated.
Because dynamic memory allocation and deallocation occur at runtime, static program analysis rarely find violations. Instead, memory-checking tools run the program, collecting data to determine if any of these rules have been violated. The violations a tool may find include the following:
· Reading from memory before allocating it
· Writing to memory before allocating it
· Reading before the beginning of allocated memory
· Writing before the beginning of allocated memory
· Reading after the end of allocated memory
· Writing after the end of allocated memory
· Reading from memory after its deallocation
· Writing to memory after its deallocation
· Failing to deallocate allocated memory
· Deallocating the same memory twice
· Deallocating memory that is not allocated
It is also useful to warn about requesting an allocation with 0 bytes, which probably indicates programmer error.
Table A.1 indicates four different tools' diagnostic capabilities. Unfortunately, no single tool diagnoses all the memory use errors. Also, no tool claims to detect reading or writing before allocating memory, but doing so will probably cause a segmentation fault. Deallocating memory twice will probably also cause a segmentation fault. These tools diagnose only errors that actually occur while the program is running. If you run the program with inputs that cause no memory to be allocated, the tools will indicate no memory errors. To test a program thoroughly, you must run the program using different inputs to ensure that every possible path through the program occurs. Also, you may use only one tool at a time, so you'll have to repeat testing with several tools to get the best error checking.
In the sections that follow, we first describe how to use the more easily used malloc checking and mtrace, and then ccmalloc and Electric Fence.
We'll use the malloc-use program in Listing A.2 to illustrate memory allocation, deallocation, and use. To begin running it, specify the maximum number of allocated memory regions as its only command-line argument. For example, malloc-use 12 creates an array A with 12 character pointers that do not point to anything. The program accepts five different commands:
· To allocate b bytes pointed to by array entry A[ i ], enter a i b . The array index i can be any non-negative number smaller than the command-line argument. The number of bytes must be non-negative.
· To deallocate memory at array index i, enter d i .
· To read the pth character from the allocated memory at index i (as in A[i][p]), enter r i p . Here, p can have an integral value.
· To write a character to the pth position in the allocated memory at index i, enter w i p .
· When finished, enter q.
We'll present the program's code later, in Section A.2.7, and illustrate how to use it.
The memory allocation functions provided by the GNU C library can detect writing before the beginning of an allocation and deallocating the same allocation twice. Defining the environment variable MALLOC_CHECK_ to the value 2 causes a program to halt when such an error is detected. (Note the environment variable's ending underscore.) There is no need to recompile the program.
We illustrate diagnosing a write to memory to a position just before the beginning of an allocation.
% export MALLOC_CHECK_=2
% ./malloc-use 12
Please enter a command: a 0 10
Please enter a command: w 0 -1
Please enter a command: d 0
Aborted (core dumped)
export turns on malloc checking. Specifying the value 2 causes the program to halt as soon as an error is detected.
Using malloc checking is advantageous because the program need not be recompiled, but its capability to diagnose errors is limited. Basically, it checks that the allocator data structures have not been corrupted. Thus, it can detect double deallocation of the same allocation. Also, writing just before the beginning of a memory allocation can usually be detected because the allocator stores the size of each memory allocation just before the allocated region. Thus, writing just before the allocated memory will corrupt this number. Unfortunately, consistency checking can occur only when your program calls allocation routines, not when it accesses memory, so many illegal reads and writes can occur before an error is detected. In the previous example, the illegal write was detected only when the allocated memory was deallocated.
The mtrace tool helps diagnose the most common error when using dynamic memory: failure to match allocations and deallocations. There are four steps to using mtrace, which is available with the GNU C library:
1. Modify the source code to include <mcheck.h> and to invoke mtrace () as soon as the program starts, at the beginning of main. The call to mtrace turns on tracking of memory allocations and deallocations.
2. .Specify the name of a file to store information about all memory allocations and deallocations:
3.
% export MALLOC_TRACE=memory.log
4. .Run the program. All memory allocations and deallocations are stored in the logging file.
5. Using the mtrace command, analyze the memory allocations and deallocations to ensure that they match.
6.
% mtrace my_program $MALLOC_TRACE
The messages produced by mtrace are relatively easy to understand. For example, for our malloc-use example, the output would look like this:
- 0000000000 Free 3 was never alloc'd malloc-use.c:39
Memory not freed:
-----------------
Address Size Caller
0x08049d48 0xc at malloc-use.c:30
These messages indicate an attempt on line 39 of malloc-use.c to free memory that was never allocated, and an allocation of memory on line 30 that was never freed.
mtrace diagnoses errors by having the executable record all memory allocations and deallocations in the file specified by the MALLOC_TRACE environment variable. The executable must terminate normally for the data to be written. The mtrace command analyzes this file and lists unmatched allocations and deallocations.
The ccmalloc library diagnoses dynamic memory errors by replacing malloc and free with code tracing their use. If the program terminates gracefully, it produces a report of memory leaks and other errors. The ccmalloc library was written by Armin Bierce.
You'll probably have to download and install the ccmalloc library yourself. Download it from http://www.inf.ethz.ch/personal/biere/projects/ccmalloc/, unpack the code, and run configure. Run make and make install, copy the ccmalloc.cfg file to the directory where you'll run the program you want to check, and rename the copy to .ccmalloc. Now you are ready to use the tool.
The program's object files must be linked with ccmalloc's library and the dynamic linking library. Append -lccmalloc -ldl to your link command, for instance.
% gcc -g -Wall -pedantic malloc-use.o -o ccmalloc-use -lccmalloc –ldl
Execute the program to produce a report. For example, running our malloc-use program to allocate but not deallocate memory produces the following report:
% ./ccmalloc-use 12
file-name=a.out does not contain valid symbols
trying to find executable in current directory ...
using symbols from 'ccmalloc-use'
(to speed up this search specify 'file ccmalloc-use'
in the startup file '.ccmalloc')
Please enter a command: a 0 12
Please enter a command: q
The last few lines indicate the chain of function calls that allocated memory that was not deallocated.
To use ccmalloc to diagnose writes before the beginning or after the end of the allocated region, you'll have to modify the .ccmalloc file in the current directory. This file is read when the program starts execution.
Written by Bruce Perens, Electric Fence halts executing programs on the exact line where a write or a read outside an allocation occurs. This is the only tool that discovers illegal reads. It is included in most GNU/Linux distributions, but the source code can be found at http://www.perens.com/FreeSoftware/.
As with ccmalloc, your program's object files must be linked with Electric Fence's library by appending -lefence to the linking command, for instance:
% gcc -g -Wall -pedantic malloc-use.o -o emalloc-use –lefence
As the program runs, allocated memory uses are checked for correctness. A violation causes a segmentation fault:
% ./emalloc-use 12
Electric Fence 2.0.5 Copyright (C) 1987-1998 Bruce Perens.
Please enter a command: a 0 12
Please enter a command: r 0 12
Segmentation fault
Using a debugger, you can determine the context of the illegal action.
By default, Electric Fence diagnoses only accesses beyond the ends of allocations. To find accesses before the beginning of allocations instead of accesses beyond the end of allocations, use this code:
% export EF_PROTECT_BELOW=1
To find accesses to deallocated memory, set EF_PROTECT_FREE to 1. More capabilities are described in the libefence manual page.
Electric Fence diagnoses illegal memory accesses by storing each allocation on at least two memory pages. It places the allocation at the end of the first page; any access beyond the end of the allocation, on the second page, causes a segmentation fault. If you set EF_PROTECT_BELOW to 1, it places the allocation at the beginning of the second page instead. Because it allocates two memory pages per call to malloc, Electric Fence can use an enormous amount of memory. Use this library for debugging only.
We have discussed four separate, incompatible tools to diagnose erroneous use of dynamic memory. How does a GNU/Linux programmer ensure that dynamic memory is correctly used? No tool guarantees diagnosing all errors, but using any of them does increase the probability of finding errors. To ease finding dynamically allocated memory errors, separately develop and test the code that deals with dynamic memory. This reduces the amount of code that you must search for errors. If you are using C++, write a class that handles all dynamic memory use. If you are using C, minimize the number of functions using allocation and deallocation. When testing this code, be sure to use only one tool at a one time because they are incompatible. When testing a program, be sure to vary how the program executes, to test the most commonly executed portions of the code.
Which of the four tools should you use? Because failing to match allocations and deallocations is the most common dynamic memory error, use mtrace during initial development. The program is available on all GNU/Linux systems and has been well tested. After ensuring that the number of allocations and deallocations match, use Electric Fence to find illegal memory accesses. This will eliminate almost all memory errors. When using Electric Fence, you will need to be careful to not perform too many allocations and deallocations because each allocation requires at least two pages of memory. Using these two tools will reveal most memory errors.
Listing A.2 shows the source code for a program illustrating dynamic memory allocation, deallocation, and use. See Section A.2.1, "A Program to Test Memory Allocation and Deallocation," for a description of how to use it.
/* Use C's dynamic memory allocation functions. */
/* Invoke the program using one command-line argument specifying the
size of an array. This array consists of pointers to (possibly)
allocated arrays.
When the programming is running, select among the following
commands:
o allocate memory: a <index> <memory-size>
o deallocate memory: d <index>
o read from memory: r <index> <position-within-allocation>
o write to memory: w <index> <position-within-allocation>
o quit: q
The user is responsible for obeying (or disobeying) the rules on dynamic
memory use. */
#ifdef MTRACE
#include <mcheck.h>
#endif /* MTRACE */
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
/* Allocate memory with the specified size, returning nonzero upon
success. */
void allocate (char** array, size_t size)
{
*array = malloc (size);
}
/* Deallocate memory. */
void deallocate (char** array)
{
free ((void*) *array);
}
/* Read from a position in memory. */
void read_from_memory (char* array, int position)
{
char character = array[position];
}
/* Write to a position in memory. */
void write_to_memory (char* array, int position)
{
array[position] = 'a';
}
int main (int argc, char* argv[])
{
char** array;
unsigned array_size;
char command[32];
unsigned array_index;
char command_letter;
int size_or_position;
int error = 0;
#ifdef MTRACE
mtrace ();
#endif /* MTRACE */
if (argc != 2) {
fprintf (stderr, "%s: array-size\n", argv[0]);
return 1;
}
array_size = strtoul (argv[1], 0, 0);
array = (char **) calloc (array_size, sizeof (char *));
assert (array != 0);
/* Follow the user's commands. */
while (!error) {
printf ("Please enter a command: ");
command_letter = getchar ();
assert (command_letter != EOF);
switch (command_letter) {
case 'a':
fgets (command, sizeof (command), stdin);
if (sscanf (command, "%u %i", &array_index, &size_or_position) == 2
&& array_index < array_size)
allocate (&(array[array_index]), size_or_position);
else
error = 1;
break;
case 'd':
fgets (command, sizeof (command), stdin);
if (sscanf (command, "%u", &array_index) == 1
&& array_index < array_size)
deallocate (&(array[array_index]));
else
error = 1;
break;
case 'r':
fgets (command, sizeof (command), stdin);
if (sscanf (command, "%u %i", &array_index, &size_or_position) == 2
&& array_index < array_size)
read_from_memory (array[array_index], size_or_position);
else
error = 1;
break;
case 'w':
fgets (command, sizeof (command), stdin);
if (sscanf (command, "%u %i", &array_index, &size_or_position) == 2
&& array_index < array_size)
write_to_memory (array[array_index], size_or_position);
else
error = 1;
break;
case 'q':
free ((void *) array);
return 0;
default:
error = 1;
}
}
free ((void *) array);
return 1;
}