Discussion: A Little Sum Thing
Execute the code to understand the output and gain insights into integer type conversions and arithmetic behavior.
We'll cover the following
Run the code
Now, it’s time to execute the code and observe the output.
#include <iostream>#include <numeric>#include <vector>int main(){std::vector<int> v{-2, -3};std::cout << std::accumulate(v.cbegin(), v.cend(), v.size());}
Understanding the output
The size of the vector is 2
. When we add -2
and -3
to it, why is the result not -3
?
The std::accumulate()
function template is defined like so:
template<class InputIterator, class T>constexpr T accumulate(InputIterator first, InputIterator last, T init);
This computes its result by initializing the accumulator acc
with the initial value init
and then modifies it with acc = std::move(acc) + *i
(…) for every iterator i
in the [first, last)
range in order.
Template parameter deduction
Let’s start by deducing the template parameters. The InputIterator
will be deduced to the iterator type for the vector, which isn’t interesting for this puzzle since it’s only used for iterating. T
, on the other hand, will be deduced to the type of std::vector<int>::size()
, which is key to the puzzle. The std::vector<int>::size()
returns size_type
, which is not specified in detail by the standard. It’s only required to be an unsigned integer type, and as long as that requirement is met, the implementation can choose whatever representation is natural to use on that particular system.
So both the type of init
and the return type, as well as the accumulator acc
that’s inside std::accumulate()
will be of this unsigned integer type size_type
. When we add negative numbers to an unsigned type, something is bound to go wrong. But what exactly? And, importantly, is it undefined behavior?
Major error scenarios
Two error scenarios are worrisome:
Integer type conversion: When converting a value of one integer type to another integer type, if the value is out of range for the new type. This is not as dangerous as it sounds, as the operation doesn’t actually overflow but rather wraps around. This is true both for unsigned integers and, as of C++20, also for signed integers. For example, an 8-bit signed integer such as
int8_t
can represent values from-128
to127
. If we try to assign the value128
to it, which is one too large, it wraps around to-128
. When a value wraps around, we’re bound to get a wrong result, but at least it’s not undefined behavior.Arithmetic overflow: When doing arithmetic, if the result is out of range for the result type. For unsigned integers, we again wrap around. For signed integers, however, we get overflow, which is undefined behavior.
What happens in std::accumulate()
Since we’re adding the negative numbers -2
and -3
to our unsigned acc
, we can at least expect some wraparound, but can we expect overflow and, thus, undefined behavior? That depends on whether we ever end up adding two signed integers whose sum is out of range for the result type. Let’s have a closer look at what happens in std::accumulate()
to see if that’s possible.
The algorithm starts by initializing the accumulator acc
with the size 2
. The acc
is of type T
, which we previously deduced to the unspecified unsigned integer type size_type
. We then add the signed int
types -2
and -3
to acc
one at a time. Finally, we return acc
. These are all the steps spelled out:
size_type acc = 2; // size_type is unsignedacc = acc + -2; // Adding an unsigned and a negative signed numberacc = acc + -3; // Adding an unsigned and a negative signed numberreturn acc;
Now, what happens when we add the two signed int
types -2
and -3
to the unknown unsigned integer type size_type
? In the Aristotle’s Sum of Parts puzzle, we learned about the usual arithmetic conversions and integer promotions, which also happen here.
Since we don’t know the widths of size_type
and int
, we need to consider three scenarios:
Scenario 1: If size_type
fits in int
If int
can represent all of the values of size_type
, acc
gets promoted to int
, and the addition happens with the type int
. Since size_type
is unsigned, this can only happen if size_type
is smaller than int
. It’s very unlikely that an implementation would choose a size_type
smaller than int
, but it would be okay according to the standard. Here’s what would happen in this scenario.
First, we add
-2
toacc
. After promotingacc
fromsize_type
toint
, the operands are of the same type (int
), and no further conversion is needed. We add2
and-2
, and get0
, which fits in the result typeint
. This integer value0
gets converted back tosize_type
and stored inacc
.Next, we add
-3
toacc
. After promotingacc
fromsize_type
toint
, we add0
and-3
, and get-3
, which again fits in the result typeint
. The integer value-3
gets converted back tosize_type
. Sincesize_type
is unsigned,-3
wraps around tostd::numeric_limits<size_type>::max() - 2
, which gets stored inacc
as the final result and returned.
If size_type
was, for instance, uint16_t
, the result would be 65533
. This produces a wraparound and a wrong result but no undefined behavior.
Here’s a diagram of the types and conversions involved in the process. The acc
is promoted to int
, added to the first value, and the result converts back to size_type
to get stored in acc
. Then, we go around one more time for the second value.
The size_type
to int
promotion is unproblematic since size_type
fits in int
. The signed addition has the potential for overflow with undefined behavior if the result doesn’t fit in int
(but that doesn’t happen, given the input in this puzzle). The int
to size_type
conversion has the potential for wraparound if the result is negative (as happens in this puzzle) or if the value is too large (does not happen in this puzzle).
Scenario 2: If size_type
doesn’t fit in int
but fits in unsigned int
If int
cannot represent all of the values of size_type
but unsigned int
can, the acc
gets promoted to unsigned int
. This can only happen if size_type
is the same width as int
and unsigned int
. If size_type
was smaller, it would fit in int
; if it was wider, it would not fit in unsigned int
.
After acc
has been promoted from size_type
to unsigned int
, the operands are still not of the same type, so the usual arithmetic conversions continue. Since size_type
and int
have the same width, they also have the same rank, and the int
is converted to unsigned int
. The addition happens with the type unsigned int
.
First, we add
-2
toacc
.-2
is converted tounsigned int
, and since it is negative, it wraps around tostd::numeric_limits<unsigned int>::max() - 1
. When added toacc
, which has the value2
, we wrap around again to0
.Next, we add
-3
toacc
.-3
is converted tounsigned int
, and again, since it is negative, it wraps around tostd::numeric_limits<unsigned int>::max() - 2
. When added toacc
, which has the value0
, the result isstd::numeric_limits<unsigned int>::max() - 2
, which gets stored inacc
as the final result. On a typical system with 32-bitint
, the result would be4294967293
. This produced several wraparounds and a wrong result but no undefined behavior.
Here’s a diagram of the types and conversions involved in the process. The acc
is promoted to unsigned int
and the first value is converted to unsigned int
, they are added together, and the result is converted back to size_type
to get stored in acc
. Then, we go around one more time for the second value.
The size_type
to unsigned int
promotion is unproblematic since size_type
and unsigned int
are both unsigned and of the same width. The int
to unsigned int
conversion has the potential to wrap around if the value is negative (which happens, given the input in this puzzle). The unsigned addition has the potential to wrap around if the result is too large (which does not happen in this puzzle). The unsigned int
to size_type
conversion is unproblematic since unsigned int and size_type
are both unsigned and of the same width.
Scenario 3: If size_type
doesn’t fit in either int
or unsigned int
This is the most common scenario if, for example, int
is 32 bits and size_type
is a 64-bit unsigned type. In this case, integral promotion on size_type
is not performed; we only have the rest of the usual arithmetic conversions to worry about. Since size_type
is wider than int
, the size_type
has a greater rank, which means that the int
is converted to the unsigned type size_type
, and the addition happens with the type size_type
.
First, we add
-2
toacc
.-2
is converted tosize_type
and wraps around tostd::numeric_limits<size_type>::max() - 1
. When added toacc
, which has the value2
, we wrap around again to0
.Next, we add
-3
toacc
.-3
is converted tosize_type
and wraps around tostd::numeric_limits<size_type>::max() - 2
. When added toacc
, which has the value0
, the result isstd::numeric_limits<size_type>::max() - 2
, which gets stored inacc
as the final result. On a typical system with 32-bit int and 64-bitsize_type
, the result would be18446744073709551613
. Again, we saw several wraparounds and a wrong result but no undefined behavior.
Here’s a diagram of the types and conversions involved in the process. The first value is converted to size_type
, added to acc
, and the result gets stored in acc
. Then, we go around one more time for the second value.
The int
to size_type
conversion has the potential to wrap around if value
is negative, which happens given the input in this puzzle. The unsigned addition has the potential to wrap around if the result is too large (which does not happen in this puzzle).
We omitted many details above, especially the exact rules resulting in these particular conversion sequences, and still, the explanation got quite long. If you followed it to the end, great; you got the point. If you dropped off at some point, though, no worries; you still got the point! The point is that reasoning about implicit conversions between integers is complicated and best avoided in production code.
What can we do?
Unfortunately, none of the major compilers warn about this, even with -Wall-Wextra
for GCC and Clang, and /Wall
for MSVC. Some compilers, such as GCC and Clang, have additional options like -Wsign-conversion
, which are not included in -Wall
or -Wextra
and must be enabled individually. Unfortunately, even after enabling them, we get no warnings for this particular puzzle. We would if the accumulate()
function template was defined in our own code rather than in the standard library, though.
Clang-Tidy is also unable to warn about this, and undefined behavior sanitizer would only trigger if undefined behavior actually happened during the execution of the program, which is not the case.
Recommendations
So, for this particular puzzle, there’s not much help to get from tools, and we just have to be aware of the potential problems when mixing integer types and signedness. However, the following is still good advice and can help in similar scenarios:
Avoid mixing signed and unsigned types: Don’t mix signed and unsigned types in arithmetic, assignments, or comparisons, as this can lead to unexpected behavior and bugs.
Enable additional compiler warnings: In addition to
-Wall
and-Wextra
, manually enable options like-Wsign-conversion
to catch potential issues related to signedness.Utilize Clang-Tidy and sanitizers: Use
clang-tidy
and runtime sanitizers to detect and prevent issues in your code.Understand the limitations of tools: Be aware that warnings and tools will not always find all potential problems, so a deep understanding of the underlying issues is essential.
Level up your interview prep. Join Educative to access 70+ hands-on prep courses.