3.4 Stack use in C and C++
C and C++ both use the stack intensively.
For example, the stack holds:
• The return address of functions.
• Registers that must be preserved, as determined by the Arm® Architecture Procedure Call Standard
(AAPCS) or the Arm® Architecture Procedure Call Standard for the Arm® 64-bit Architecture
(AAPCS64). For example, register contents are saved on entry into subroutines.
• Local variables, including local arrays, structures, and unions.
• Class objects in C++.
Some stack usage is not obvious, such as:
• If local integer or floating-point variables are spilled (that is, not allocated to a register), they are
allocated stack memory.
• Structures are normally allocated to the stack. A space equivalent to sizeof(struct) padded to a
multiple of n bytes is reserved on the stack, where n is 16 for AArch64 state, or 8 for AArch32 state.
However, the compiler might try to allocate structures to registers instead.
• If the size of an array is known at compile time, the compiler allocates memory on the stack. Again, a
space equivalent to sizeof(array) padded to a multiple of n bytes is reserved on the stack, where n
is 16 for AArch64 state, or 8 for AArch32 state.
Note
Memory for variable length arrays is allocated at runtime, on the heap.
• Several optimizations can introduce new temporary variables to hold intermediate results. The
optimizations include: common subexpression elimination (CSE), live range splitting, and structure
splitting. The compiler tries to allocate these temporary variables to registers. If it cannot, it
spills them to the stack. For more information about what these optimizations do, see Overview of
optimizations.
• Generally, code that is compiled for processors that only support 16-bit encoded T32 instructions
makes more use of the stack than A64 code, A32 code, and code that is compiled for processors that
support 32-bit encoded T32 instructions. This is because 16-bit encoded T32 instructions have only
eight registers available for allocation, compared to fourteen for A32 code and 32-bit encoded T32
instructions.
• The AAPCS64 requires that some function arguments are passed through the stack instead of the
registers, depending on their type, size, and order.
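As an illustration of the padding rule for structures and fixed-size arrays in the list above, the
following C sketch rounds an object size up to a multiple of n bytes. The helper name
padded_stack_size is hypothetical and is not part of any Arm tool or library. The sketch only mirrors
the rule as stated. The stack space that the compiler actually reserves for a function can differ, for
example when objects are kept in registers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Rounds size up to the next multiple of align, where align is the
 * padding unit n from the text: 8 for AArch32 state, 16 for AArch64 state.
 * align must be a nonzero power of two. */
static size_t padded_stack_size(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}
```

For example, under this rule a 20-byte local array such as char buf[20] corresponds to
padded_stack_size(20, 8), which is 24 bytes in AArch32 state, and to padded_stack_size(20, 16), which
is 32 bytes in AArch64 state.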
Processors for embedded applications have limited memory and therefore the amount of space available
on the stack is also limited. You can use Arm Compiler to determine how much stack space is used by
the functions in your application code. The amount of stack that a function uses depends on factors such
as the number and type of arguments to the function, local variables in the function, and the
optimizations that the compiler performs.
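Because each active call keeps its own stack frame, the stack depth along a call chain is the sum of
the stack sizes of the functions on that chain. The following C sketch shows how a worst-case figure
could be derived from per-function sizes. The func_info structure, the max_stack_depth function, and
the byte counts in example_graph are all hypothetical and invented for illustration. They are not
output from Arm Compiler. The sketch assumes a call graph with no recursion and no calls through
function pointers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CALLEES 4

/* Hypothetical record: one function, its own stack use in bytes,
 * and the table indices of the functions it calls (-1 ends the list). */
struct func_info {
    const char *name;
    size_t stack_bytes;
    int callees[MAX_CALLEES];
};

/* Returns the deepest stack use reachable from table[index]:
 * the function's own stack use plus the largest value among its callees.
 * Assumes the call graph is acyclic. */
static size_t max_stack_depth(const struct func_info *table, int index)
{
    size_t deepest = 0;
    for (int i = 0; i < MAX_CALLEES && table[index].callees[i] >= 0; i++) {
        size_t depth = max_stack_depth(table, table[index].callees[i]);
        if (depth > deepest) {
            deepest = depth;
        }
    }
    return table[index].stack_bytes + deepest;
}

/* Made-up example: main calls parse and report, parse calls lookup. */
static const struct func_info example_graph[] = {
    { "main",   16, {  1,  2, -1, -1 } },
    { "parse",  48, {  3, -1, -1, -1 } },
    { "report", 24, { -1, -1, -1, -1 } },
    { "lookup", 32, { -1, -1, -1, -1 } },
};
```

With these made-up figures, max_stack_depth(example_graph, 0) evaluates to 96 bytes: 16 for main, plus
48 for parse, plus 32 for lookup.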
Methods of estimating stack usage
Stack use is difficult to estimate because it is code dependent, and can vary between runs depending on
the code path that the program takes on execution. However, it is possible to manually estimate the
extent of stack utilization using the following methods:
• Compile with -g and link with --callgraph to produce a static callgraph. This callgraph shows
information on all functions, including stack usage.
• Link with --info=stack or --info=summarystack to list the stack usage of all global symbols.
• Use a debugger to set a watchpoint on the last available location in the stack and see if the watchpoint
is ever hit. Compile with the -g option to generate the necessary DWARF information.
• Use a debugger, and:
1. Allocate space in memory for the stack that is much larger than you expect to require.
2. Fill the stack space with copies of a known value, for example, 0xDEADDEAD.
100748_0613_00_en Copyright © 2016–2019 Arm Limited or its affiliates. All rights reserved.