Discussion: A Little Sum Thing

Execute the code to understand the output and gain insights into integer type conversions and arithmetic behavior.

We'll cover the following

Run the code
Understanding the output
What can we do?
Recommendations

Press + to interact

This computes its result by initializing the accumulator acc with the initial value init and then modifies it with acc = std::move(acc) + *i (…) for every iterator i in the [first, last) range in order.

Template parameter deduction

Let’s start by deducing the template parameters. The InputIterator will be deduced to the iterator type for the vector, which isn’t interesting for this puzzle since it’s only used for iterating. T, on the other hand, will be deduced to the type of std::vector<int>::size(), which is key to the puzzle. The std::vector<int>::size() returns size_type, which is not specified in detail by the standard. It’s only required to be an unsigned integer type, and as long as that requirement is met, the implementation can choose whatever representation is natural to use on that particular system.

So both the type of init and the return type, as well as the accumulator acc that’s inside std::accumulate() will be of this unsigned integer type size_type. When we add negative numbers to an unsigned type, something is bound to go wrong. But what exactly? And, importantly, is it undefined behavior?

Major error scenarios

Two error scenarios are worrisome:

Integer type conversion: When converting a value of one integer type to another integer type, if the value is out of range for the new type. This is not as dangerous as it sounds, as the operation doesn’t actually overflow but rather wraps around. This is true both for unsigned integers and, as of C++20, also for signed integers. For example, an 8-bit signed integer such as int8_t can represent values from -128 to 127. If we try to assign the value 128 to it, which is one too large, it wraps around to -128. When a value wraps around, we’re bound to get a wrong result, but at least it’s not undefined behavior.
Arithmetic overflow: When doing arithmetic, if the result is out of range for the result type. For unsigned integers, we again wrap around. For signed integers, however, we get overflow, which is undefined behavior.

What happens in `std::accumulate()`

Since we’re adding the negative numbers -2 and -3 to our unsigned acc, we can at least expect some wraparound, but can we expect overflow and, thus, undefined behavior? That depends on whether we ever end up adding two signed integers whose sum is out of range for the result type. Let’s have a closer look at what happens in std::accumulate() to see if that’s possible.

The algorithm starts by initializing the accumulator acc with the size 2. The acc is of type T, which we previously deduced to the unspecified unsigned integer type size_type. We then add the signed int types -2 and -3 to acc one at a time. Finally, we return acc. These are all the steps spelled out:

Now, what happens when we add the two signed int types -2 and -3 to the unknown unsigned integer type size_type? In the Aristotle’s Sum of Parts puzzle, we learned about the usual arithmetic conversions and integer promotions, which also happen here.

Since we don’t know the widths of size_type and int, we need to consider three scenarios:

Scenario 1: If `size_type` fits in `int`

If int can represent all of the values of size_type, acc gets promoted to int, and the addition happens with the type int. Since size_type is unsigned, this can only happen if size_type is smaller than int. It’s very unlikely that an implementation would choose a size_type smaller than int, but it would be okay according to the standard. Here’s what would happen in this scenario.

First, we add -2 to acc. After promoting acc from size_type to int, the operands are of the same type (int), and no further conversion is needed. We add 2 and -2, and get 0, which fits in the result type int. This integer value 0 gets converted back to size_type and stored in acc.
Next, we add -3 to acc. After promoting acc from size_type to int, we add 0 and -3, and get -3, which again fits in the result type int. The integer value -3 gets converted back to size_type. Since size_type is unsigned, -3 wraps around to std::numeric_limits<size_type>::max() - 2, which gets stored in acc as the final result and returned.

If size_type was, for instance, uint16_t, the result would be 65533. This produces a wraparound and a wrong result but no undefined behavior.

Here’s a diagram of the types and conversions involved in the process. The acc is promoted to int, added to the first value, and the result converts back to size_type to get stored in acc. Then, we go around one more time for the second value.

Press + to interact

The size_type to int promotion is unproblematic since size_type fits in int. The signed addition has the potential for overflow with undefined behavior if the result doesn’t fit in int (but that doesn’t happen, given the input in this puzzle). The int to size_type conversion has the potential for wraparound if the result is negative (as happens in this puzzle) or if the value is too large (does not happen in this puzzle).

Scenario 2: If `size_type` doesn’t fit in `int` but fits in `unsigned int`

If int cannot represent all of the values of size_type but unsigned int can, the acc gets promoted to unsigned int. This can only happen if size_type is the same width as int and unsigned int. If size_type was smaller, it would fit in int; if it was wider, it would not fit in unsigned int.

After acc has been promoted from size_type to unsigned int, the operands are still not of the same type, so the usual arithmetic conversions continue. Since size_type and int have the same width, they also have the same rank, and the int is converted to unsigned int. The addition happens with the type unsigned int.

First, we add -2 to acc. -2 is converted to unsigned int, and since it is negative, it wraps around to std::numeric_limits<unsigned int>::max() - 1. When added to acc, which has the value 2, we wrap around again to 0.
Next, we add -3 to acc. -3 is converted to unsigned int, and again, since it is negative, it wraps around to std::numeric_limits<unsigned int>::max() - 2. When added to acc, which has the value 0, the result is std::numeric_limits<unsigned int>::max() - 2, which gets stored in acc as the final result. On a typical system with 32-bit int, the result would be 4294967293. This produced several wraparounds and a wrong result but no undefined behavior.

Here’s a diagram of the types and conversions involved in the process. The acc is promoted to unsigned int and the first value is converted to unsigned int, they are added together, and the result is converted back to size_type to get stored in acc. Then, we go around one more time for the second value.

Press + to interact

The size_type to unsigned int promotion is unproblematic since size_type and unsigned int are both unsigned and of the same width. The int to unsigned int conversion has the potential to wrap around if the value is negative (which happens, given the input in this puzzle). The unsigned addition has the potential to wrap around if the result is too large (which does not happen in this puzzle). The unsigned int to size_type conversion is unproblematic since unsigned int and size_type are both unsigned and of the same width.

Scenario 3: If `size_type` doesn’t fit in either `int` or `unsigned int`

This is the most common scenario if, for example, int is 32 bits and size_type is a 64-bit unsigned type. In this case, integral promotion on size_type is not performed; we only have the rest of the usual arithmetic conversions to worry about. Since size_type is wider than int, the size_type has a greater rank, which means that the int is converted to the unsigned type size_type, and the addition happens with the type size_type.

First, we add -2 to acc. -2 is converted to size_type and wraps around to std::numeric_limits<size_type>::max() - 1. When added to acc, which has the value 2, we wrap around again to 0.
Next, we add -3 to acc. -3 is converted to size_type and wraps around to std::numeric_limits<size_type>::max() - 2. When added to acc, which has the value 0, the result is std::numeric_limits<size_type>::max() - 2, which gets stored in acc as the final result. On a typical system with 32-bit int and 64-bit size_type, the result would be 18446744073709551613. Again, we saw several wraparounds and a wrong result but no undefined behavior.

Here’s a diagram of the types and conversions involved in the process. The first value is converted to size_type, added to acc, and the result gets stored in acc. Then, we go around one more time for the second value.

Press + to interact

The int to size_type conversion has the potential to wrap around if value is negative, which happens given the input in this puzzle. The unsigned addition has the potential to wrap around if the result is too large (which does not happen in this puzzle).

We omitted many details above, especially the exact rules resulting in these particular conversion sequences, and still, the explanation got quite long. If you followed it to the end, great; you got the point. If you dropped off at some point, though, no worries; you still got the point! The point is that reasoning about implicit conversions between integers is complicated and best avoided in production code.

What can we do?

Unfortunately, none of the major compilers warn about this, even with -Wall-Wextra for GCC and Clang, and /Wall for MSVC. Some compilers, such as GCC and Clang, have additional options like -Wsign-conversion, which are not included in -Wall or -Wextra and must be enabled individually. Unfortunately, even after enabling them, we get no warnings for this particular puzzle. We would if the accumulate() function template was defined in our own code rather than in the standard library, though.

Clang-Tidy is also unable to warn about this, and undefined behavior sanitizer would only trigger if undefined behavior actually happened during the execution of the program, which is not the case.

Recommendations

So, for this particular puzzle, there’s not much help to get from tools, and we just have to be aware of the potential problems when mixing integer types and signedness. However, the following is still good advice and can help in similar scenarios:

Avoid mixing signed and unsigned types: Don’t mix signed and unsigned types in arithmetic, assignments, or comparisons, as this can lead to unexpected behavior and bugs.
Enable additional compiler warnings: In addition to -Wall and -Wextra, manually enable options like -Wsign-conversion to catch potential issues related to signedness.
Utilize Clang-Tidy and sanitizers: Use clang-tidy and runtime sanitizers to detect and prevent issues in your code.
Understand the limitations of tools: Be aware that warnings and tools will not always find all potential problems, so a deep understanding of the underlying issues is essential.

Level up your interview prep. Join Educative to access 70+ hands-on prep courses.

Before We Begin

Puzzles

Wrapping Up

Appendix

Discussion: A Little Sum Thing

Run the code