How C/C++ Compilers Generate x86 Assembly Code for Large Return Values

For C/C++ programmers, understanding how functions return values is crucial for writing efficient and robust code. Simple data types such as integers or pointers are returned in registers: on 32-bit x86, the two default registers for this purpose are EAX and EDX (EAX alone for values up to 32 bits, the EDX:EAX pair for 64-bit values). For large structures, the process is a bit more intricate. In this blog post, we'll explore how large return values, such as structures that cannot fit into registers, are handled in x86 assembly.
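As a rough illustration of the size limit, the sketch below compares a few types against the 8 bytes that the EDX:EAX pair can hold. The names `kRegisterPairBytes`, `fitsInRegisterPair`, and `Big` are made up for this post, and size is only a first approximation: the exact rules for which types come back in registers depend on the compiler and calling convention.

```cpp
#include <cstddef>
#include <cstdint>

// EAX holds 4 bytes; the EDX:EAX pair holds 8 (32-bit x86).
constexpr std::size_t kRegisterPairBytes = 8;

// Hypothetical 20-byte structure, used only for this comparison.
struct Big {
    std::int32_t values[5];
};

// Size check only: a necessary condition, not the full ABI rule.
constexpr bool fitsInRegisterPair(std::size_t size) {
    return size <= kRegisterPairBytes;
}

static_assert(fitsInRegisterPair(sizeof(std::int32_t)), "fits in EAX");
static_assert(fitsInRegisterPair(sizeof(std::int64_t)), "fits in EDX:EAX");
static_assert(!fitsInRegisterPair(sizeof(Big)), "must be returned through memory");
```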

When a function returns a structure or a value that is too large to be stored in registers, the compiler employs a hidden pointer technique: the caller passes a hidden pointer to the memory location where the result should be stored. The caller provides this memory, ensuring that there is enough space for the return value. If the caller stores the return value in a local variable, the value is built directly in the caller’s stack frame.

Let's evaluate this process with an example. Assume we define a structure AA. Then, we define a new function getAA() that creates a local variable of type AA, initializes it, and returns the local variable by value. Finally, in our main function, we call this newly defined function and use a local variable to hold its return value:
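The original listing is not reproduced here, so the following is a minimal sketch of what such a program could look like. The five `int` fields and their values are assumptions chosen only to make AA larger than the 8 bytes available in EDX:EAX, and `callGetAA()` is a hypothetical stand-in for the body of main:

```cpp
#include <cassert>

// Assumed layout: five ints (20 bytes), too large for EDX:EAX.
struct AA {
    int a;
    int b;
    int c;
    int d;
    int e;
};

// Creates a local AA, initializes it, and returns it by value.
AA getAA() {
    AA aa;
    aa.a = 1;
    aa.b = 2;
    aa.c = 3;
    aa.d = 4;
    aa.e = 5;
    return aa;
}

// Plays the role of main(): a local variable holds the return value.
int callGetAA() {
    AA result = getAA();
    assert(result.a == 1 && result.e == 5);
    return result.a + result.b + result.c + result.d + result.e;
}
```

To inspect the generated assembly yourself, compiling a file like this for a 32-bit target without optimization (for example with `g++ -m32 -O0 -S`) should give a comparable listing, though the exact output varies by compiler and version.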

Now, let's dive into the assembly code to see how the compiler handles this code snippet for us. We will first investigate the main function’s assembly code:

Hmm… Our simple function getAA() does not take any parameters, so what exactly is in eax that got pushed onto the stack as an argument of the function call?

This is the hidden pointer technique we talked about earlier: because the structure is too large to be returned in registers, the caller passes, as a hidden argument, the address of the memory where the result should be stored.

In this example, the function caller, which is the main function, allocated enough memory space in its own stack frame to hold the return value. It then passed the initial address of the allocated memory to the function call as a hidden parameter.

Now let's dive into the assembly code of getAA() to see how it works (unrelated code has been removed for clarity):

If you are familiar with x86 assembly, you will know that in a function with a standard ebp-based stack frame, ebp+8 is the address of the first argument of the call. In our case, this argument holds the hidden parameter: the starting address of the memory that the caller allocated. There you go! The function copies the return value (in this case, the local variable aa) directly to the memory location provided by the caller, so the caller can access the returned structure through that address once the call returns.
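In C++ terms, the generated code behaves roughly as if the function had been rewritten to take the destination explicitly. The sketch below is a hand-written approximation, not compiler output; `getAA_lowered`, `callLowered`, and the field layout of AA are assumptions made for illustration:

```cpp
#include <cassert>

// Assumed layout: five ints (20 bytes).
struct AA {
    int a;
    int b;
    int c;
    int d;
    int e;
};

// Approximates the compiled getAA(): `hidden` corresponds to the pointer
// the callee reads from ebp+8.
void getAA_lowered(AA* hidden) {
    AA aa;          // local variable in the callee's stack frame
    aa.a = 1;
    aa.b = 2;
    aa.c = 3;
    aa.d = 4;
    aa.e = 5;
    *hidden = aa;   // copy the local into the caller-provided memory
}

// Approximates the call site in main().
int callLowered() {
    AA result;                // space reserved in the caller's frame
    getAA_lowered(&result);   // its address plays the hidden argument
    assert(result.a == 1);    // the caller reads the value through that memory
    return result.a + result.e;
}
```

The only difference from the by-value version is that the pointer is visible in the source; in the compiled code, the compiler inserts it for you.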

Takeaway

Here's a step-by-step guide on how to handle large return values in x86 assembly:

Allocating Memory for the Structure: Before calling the function, you need to allocate memory for the structure. This can be done on the stack or in the data segment. Ensure that you allocate enough space to accommodate the entire structure.

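Passing the Hidden Pointer to the Function: Once the memory is allocated, you should pass a pointer to that memory as a hidden argument to the function. Where the pointer goes depends on the calling convention: in the 32-bit example above, it is pushed onto the stack as an implicit first argument, while x86-64 conventions pass it in the first integer argument register for free functions (RDI on System V, RCX on Windows).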

Writing the Return Value to the Memory Location: Inside the function, the return value should be written to the memory location provided by the caller. This ensures that the caller can access the returned structure using the memory pointer once the function call returns.

Accessing the Returned Structure: After the function call returns, the caller can access the returned structure using the memory pointer that was passed as a hidden argument. This allows the program to continue executing while safely handling the large return value.