Discussion: Counting Copies

Execute the code to understand the output and gain insights into copy elision and return value optimization.

Run the code

Now, it’s time to execute the code and observe the output.

#include <iostream>

struct Resource
{
    Resource() = default;
    // Prints a line every time a Resource is copy-constructed
    Resource(const Resource &other)
    {
        std::cout << "copy\n";
    }
};

Resource getResource()
{
    return Resource{};
}

int main()
{
    Resource resource1 = getResource();
    Resource resource2{resource1};
}

Understanding the output

It might seem like this program makes three copies: one to initialize the return object of getResource() with the temporary Resource{}, one to initialize resource1 with that return object, and one to initialize resource2. However, the program prints copy only once, for resource2. No copies are made when returning from getResource() and initializing resource1. How is this possible?

The pure rvalues (prvalues)

Let’s first look at the return statement, which seemingly initializes the return object from the temporary Resource{}. The Resource{} expression in the return statement is a type of rvalue called a prvalue, or pure rvalue. In the “Will It Move?” puzzle, we first met lvalues and rvalues and learned that rvalues are expressions we can move from, bind rvalue references to, and so on. In fact, there are two types of rvalues, xvalues and prvalues:

  • An xvalue (or eXpiring value) is what we get when we call std::move on an lvalue. For instance, in the “Will It Move?” puzzle, the expression objectWithConstMember is an lvalue. It evaluates to an actual object that exists somewhere, that we can take the address of, and so on. When we write std::move(objectWithConstMember), we now have an xvalue. This expression still evaluates to an actual object, but we’re also allowed to treat it as an rvalue and move from it.

  • A prvalue, on the other hand, is a pure rvalue. It’s an expression that initializes a new object, as opposed to just turning an existing object into an rvalue.

Return value optimization

Before C++17, prvalues were unnamed temporary objects, and the compiler was allowed, but not required, to elide (omit) copies of them. The form of copy elision where the unnamed temporary was the operand of a return statement, like in this puzzle, was called unnamed return value optimization (URVO); it’s called unnamed because it applied when the function returned an unnamed temporary. In C++17, avoiding copies in these situations became mandatory. But rather than making URVO mandatory, the language was changed so that there’s no longer a temporary object here whose copy would need to be optimized away in the first place! Prvalues are no longer unnamed temporaries; they merely represent the idea of an object, which will only materialize later.

Here’s an example of unnamed return value optimization:

Resource getResource() {
    return Resource{}; // Unnamed RVO (pre-C++17)
}

Another form of return value optimization, which is still not mandatory, is called named return value optimization (NRVO). This applies when we return a local variable rather than a prvalue; in this case, the implementation is allowed to omit the copy but isn’t required to. NRVO applies even if the copy constructor has side effects; these side effects then simply don’t happen. This is a very rare case where an optimization can change our program’s observable behavior!

Here’s an example of named return value optimization:

Resource getResource()
{
    Resource resource;
    return resource; // Named RVO
}

Let’s now look at the declaration Resource resource1 = getResource(), which seemingly copies the returned object into resource1. A call to a function that returns by value is a prvalue, so the same idea applies here. The call to getResource() does not result in a temporary object; it merely represents the idea of the return value, and no Resource object is actually created until resource1 is initialized with it.

Defining things like the idea of an object in the standard is all well and good, but how is this possible to do in practice? After all, the Resource object is supposed to end up on the stack in main, but the actual code that creates it is in getResource(), which does not have access to the stack frame of main.

The trick that most systems use to avoid the copy is that main sets aside space in its stack frame for resource1, just like for any other local variables. Then, main passes a pointer to this memory as a secret extra argument to getResource(), as if its signature was getResource(Resource* ptr). Now, getResource() can construct the object directly in this memory location.

Application binary interface

For this to work, the caller and the callee must agree on this convention, so this is part of the system’s Application Binary Interface (ABI). One popular ABI is the Itanium C++ ABI, which, for instance, is used on Linux. GCC and Clang both conform to this ABI, so we can freely call a function compiled with one from a function compiled by the other.

Note: The Itanium C++ ABI says: “If the return type is a class type that’s non-trivial for the purposes of calls, the caller passes an address as an implicit parameter. The callee then constructs the return value into this address.”

If we try this out on Compiler Explorer and choose the x86_64 architecture, we find the current versions of GCC and Clang both generate this code in main:

lea rax, [rbp-4]
mov rdi, rax
call getResource()

Here, [rbp-4] is the address of resource1 in the stack frame of main. This address gets loaded into rdi, the register used to pass the first integer/pointer argument to a function, and then getResource() is called.

For MSVC, it’s harder to track down information about its ABI. But using Compiler Explorer, we can see a similar pattern on x86_64:

lea rcx, QWORD PTR resource1$[rsp]
call Resource getResource(void) ; getResource

The convention is the same, except that the register used for the pointer is rcx.

Recommendation

All the mainstream compilers have done both named and unnamed return value optimization for many years now (as far back as Clang 3, GCC 4, and Visual Studio 2008), so the takeaway is the following:

Simply return by value, and don’t try to do something clever to optimize the returns.
