std::vector has the interesting property that it can be used with incomplete types to a limited
degree. However, many legacy (and not-so-legacy) code bases use it in ways that the standard does
not allow, and that start breaking with C++20. In this article I’ll explain the limitations, a
mistake I’ve seen several times now, and why it only starts breaking now.
Update, 2023-10-15: @sehe@fosstodon.social made the very valid suggestion of using = default
to provide out-of-line special members. I have adapted the examples. Thanks!
Take this hypothetical header file, which contains a relationship between two classes which probably exists in many projects:
#include <vector>

// Forward declaration
class MyClass;

class MyContainer {
    std::vector<MyClass> member;
};

// Definition
class MyClass {};

int main() {}
The fact that this example puts a forward-declared (and therefore incomplete) class into a
std::vector is pretty obvious here, especially if you read the title of this article. So what do
you think - is this allowed so far?
It is. You can see here on Godbolt that GCC, Clang and MSVC all compile this without any
complaints.[1] The standard says in [vector.overview]/4:
An incomplete type T may be used when instantiating vector if the allocator meets the allocator completeness requirements. T shall be complete before any member of the resulting specialization of vector is referenced.
We may assume that std::allocator does meet the requirement, so the above snippet is in fact
allowed by the standard. But we already note the warning in the second sentence: We cannot
reference (whatever that means…) any member of std::vector<MyClass> while MyClass is still
incomplete!
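To make the allowed case a bit more tangible, here is a minimal sketch of what is probably the most common legitimate use of this rule: a recursive type. The names TreeNode, count_nodes and example_tree_size are made up for illustration and do not come from any real code base. The member is only declared while the type is incomplete; all uses of std::vector members happen after the closing brace.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical type, for illustration only.
struct TreeNode {
    int value = 0;
    // TreeNode is still incomplete at this point. We only declare the member,
    // we do not reference any member of std::vector<TreeNode> yet.
    std::vector<TreeNode> children;
};
// From here on TreeNode is complete, so using vector members is fine.

// Counts the node itself plus all of its descendants.
std::size_t count_nodes(const TreeNode& node) {
    std::size_t total = 1;
    for (const TreeNode& child : node.children) {
        total += count_nodes(child);
    }
    return total;
}

// Builds a root with two children and one grandchild.
std::size_t example_tree_size() {
    TreeNode root;
    root.children.push_back(TreeNode{});
    root.children.push_back(TreeNode{});
    root.children[0].children.push_back(TreeNode{});
    assert(root.children.size() == 2);
    return count_nodes(root);
}
```

Calling example_tree_size() counts the root, its two children and the single grandchild.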
Let’s modify the example a bit (change highlighted):
#include <vector>

class MyClass;

class MyContainer {
    MyContainer() {};
    std::vector<MyClass> member;
};

class MyClass {};

int main() {}
The only change is that instead of an implicitly-declared and implicitly-defined default constructor, I have spelled it out for the compiler. If you look at the cppreference.com documentation for implicitly-defined default constructors, it says:
[…] it is defined (that is, a function body is generated and compiled) by the compiler […], and it has the same effect as a user-defined constructor with empty body and empty initializer list.
Nice, that’s exactly what I just wrote explicitly - so we should be golden, right? Compiler Says No.
Suddenly, Clang in C++20 mode (and only in C++20 mode) refuses to compile this - regardless of
whether we use libstdc++ or libc++.
The error we see is a bit cryptic. It’s something like this:
[…]/stl_vector.h:367:35: error: arithmetic on a pointer to an incomplete type 'MyClass'
        _M_impl._M_end_of_storage - _M_impl._M_start);
        ~~~~~~~~~~~~~~~~~~~~~~~~~ ^
[…]/stl_vector.h:526:7: note: in instantiation of member function 'std::_Vector_base<MyClass, std::allocator<MyClass>>::~_Vector_base' requested here
      vector() = default;
      ^
<source>:6:2: note: in defaulted default constructor for 'std::vector<MyClass>' first required here
 MyContainer() {};
 ^
libc++
The error looks a bit different if you tell Clang to use libc++, i.e., the LLVM standard library
implementation, instead of GCC’s libstdc++. The error then reads:
[…]/vector:540:52: error: arithmetic on a pointer to an incomplete type 'MyClass'
        {return static_cast<size_type>(__end_cap() - this->__begin_);}
                                       ~~~~~~~~~~~ ^
[…]/vector:760:56: note: in instantiation of member function 'std::vector<MyClass>::capacity' requested here
      __annotate_contiguous_container(data(), data() + capacity(),
                                                       ^
[…]/vector:431:7: note: in instantiation of member function 'std::vector<MyClass>::__annotate_delete' requested here
      __annotate_delete();
      ^
<source>:6:2: note: in instantiation of member function 'std::vector<MyClass>::~vector' requested here
 MyContainer() {};
 ^
The most important hint buried in this output is:
[…] in defaulted default constructor for ‘std::vector<MyClass>’ […]
The default constructor MyContainer() uses the default constructor std::vector<MyClass>()! Of
course this is never really called - we never create an instance of MyContainer. However, the
standard says that we may not reference any member of std::vector<MyClass> - it does not merely
forbid calling one. One can now argue that typing out the MyContainer() constructor, which
implicitly contains a call to std::vector<MyClass>(), is such a “reference”, which is forbidden.
There are multiple ways of fixing this. The best solution is of course to make sure that the class
you put into std::vector is complete at the time. Most of the time, just changing the order of
class definitions already does the trick.
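As a minimal sketch of the reordering fix, assuming both classes can live in the same header: MyClass is defined before MyContainer, so even a constructor with an inline body no longer touches an incomplete type. The size() helper and the function example_reordered_size are hypothetical additions, only there so the container can be observed from outside.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Defined first, so it is complete wherever MyContainer needs it.
class MyClass {};

class MyContainer {
public:
    // User-provided constructor with an inline body. This is fine here,
    // because MyClass is already complete at this point.
    MyContainer() {}

    // Hypothetical helper, added only for illustration.
    std::size_t size() const { return member.size(); }

private:
    std::vector<MyClass> member;
};

// Creates a container and reports how many elements it holds.
std::size_t example_reordered_size() {
    MyContainer c;
    assert(c.size() == 0);
    return c.size();
}
```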
If that is not possible, you can also just move the definition of the default constructor to a
point where MyClass is complete, like this (obligatory Godbolt link):
#include <vector>

class MyClass;

class MyContainer {
    MyContainer();
    std::vector<MyClass> member;
};

class MyClass {};

// Here, MyClass is complete!
MyContainer::MyContainer() = default;

int main() {}
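The same pattern should carry over to the other special members, which matters because the libc++ error above is triggered through the destructor of std::vector<MyClass>. The following is only a sketch under that assumption: the helpers add() and size() and the function example_copy_size are hypothetical and exist just to exercise the class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class MyClass;  // forward declaration only

class MyContainer {
public:
    // Special members are only declared while MyClass is incomplete.
    MyContainer();
    ~MyContainer();
    MyContainer(const MyContainer&);
    MyContainer& operator=(const MyContainer&);

    // Hypothetical helpers, declared here and defined further down.
    void add();
    std::size_t size() const;

private:
    std::vector<MyClass> member;
};

class MyClass {};

// Here, MyClass is complete, so all definitions may use vector members.
MyContainer::MyContainer() = default;
MyContainer::~MyContainer() = default;
MyContainer::MyContainer(const MyContainer&) = default;
MyContainer& MyContainer::operator=(const MyContainer&) = default;

void MyContainer::add() { member.emplace_back(); }
std::size_t MyContainer::size() const { return member.size(); }

// Adds two elements, copies the container and returns the copy's size.
std::size_t example_copy_size() {
    MyContainer a;
    a.add();
    a.add();
    MyContainer b = a;
    assert(a.size() == b.size());
    return b.size();
}
```

In a real project, the declarations would typically sit in the header and the = default definitions in the source file that includes the full definition of MyClass.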
We now know what the problem is and how to fix it. However, none of the relevant facts (you cannot
reference members of std::vector<Incomplete>, the default constructor of a class “references” its
members’ constructors…) have changed from C++17 to C++20. So why is this suddenly breaking when switching to C++20?
I think the reason revolves around user-provided and “non-user-provided” functions. In the examples which do not compile under Clang/C++20, the default constructor is user-provided. In the examples that compile, it is non-user-provided.
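The difference between the two categories is easy to miss, so here is a small sketch that makes it observable through a standard type trait. The struct names are made up for illustration. A constructor that is defaulted on its first declaration is not user-provided, while one with an empty body, or one defaulted out of line, is user-provided and therefore no longer trivial.

```cpp
#include <type_traits>

// Constructor is implicitly declared: not user-provided.
struct ImplicitCtor {
    int x;
};

// Defaulted on its first declaration: user-declared, but not user-provided.
struct DefaultedInClass {
    DefaultedInClass() = default;
    int x;
};

// Declared in the class, defaulted later: user-provided.
struct DefaultedOutOfClass {
    DefaultedOutOfClass();
    int x;
};
DefaultedOutOfClass::DefaultedOutOfClass() = default;

// Empty body: user-provided.
struct EmptyBody {
    EmptyBody() {}
    int x;
};

// A user-provided default constructor is never trivial, so the trait
// separates the two groups.
static_assert(std::is_trivially_default_constructible<ImplicitCtor>::value,
              "implicit constructor is trivial");
static_assert(std::is_trivially_default_constructible<DefaultedInClass>::value,
              "in-class defaulted constructor is trivial");
static_assert(!std::is_trivially_default_constructible<DefaultedOutOfClass>::value,
              "out-of-class defaulted constructor is user-provided");
static_assert(!std::is_trivially_default_constructible<EmptyBody>::value,
              "constructor with empty body is user-provided");
```

Note that triviality is only a convenient proxy here. What matters for this article is when the constructor gets defined, not whether it is trivial.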
The standard says in [dcl.fct.def.default]/5:
A non-user-provided defaulted function […] is implicitly defined when it is odr-used ([basic.def.odr]) or needed for constant evaluation ([expr.const])
Since we never ODR-use the default constructor of MyContainer in our first example (and nothing
is constant-evaluated either), the constructor is not defined at all! We can illustrate this by
introducing some ODR-usage:
#include <vector>

// Forward declaration
class MyClass;

class MyContainer {
    std::vector<MyClass> member;
};

// ODR-use of MyContainer's default constructor while MyClass is incomplete
void foo() {
    MyContainer c;
}

// Definition
class MyClass {};
With this modification, we see the same error as when we define the default constructor
ourselves. If we move the ODR-use after the definition of MyClass, it compiles fine:
#include <vector>

// Forward declaration
class MyClass;

class MyContainer {
    std::vector<MyClass> member;
};

// Definition
class MyClass {};

// ODR-use only after MyClass is complete
void foo() {
    MyContainer c;
}
So the rule in [dcl.fct.def.default]/5 explains why we don’t run into problems as long as the default constructor is non-user-provided (and we don’t ODR-use it too early).
It does not explain why the error with a user-provided default constructor only appears when
switching to C++20. I think this has to do with the fact that in C++20, std::vector now has
constexpr constructors. I assume that as soon as MyContainer’s default constructor is defined,
Clang tries to figure out whether it can construct a MyContainer object at compile-time. For that
check, it needs to evaluate std::vector<MyClass>::vector() at compile-time, which fails.
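To illustrate only the constexpr part of that assumption (not Clang’s internal behaviour, which I have not verified), here is a sketch that uses the standard feature-test macro __cpp_lib_constexpr_vector. The macro names starting with SKETCH_ and the functions sum_up_to and example_runtime_sum are made up for this example. With a C++20 standard library, the vector can be used inside a constant expression. Before C++20, the same function only works at run time.

```cpp
#include <cassert>
#include <vector>

// <vector> defines __cpp_lib_constexpr_vector when std::vector can be used
// in constant expressions (C++20 and later).
#if defined(__cpp_lib_constexpr_vector) && __cpp_lib_constexpr_vector >= 201907L
#define SKETCH_HAS_CONSTEXPR_VECTOR 1
#define SKETCH_CONSTEXPR constexpr
#else
#define SKETCH_HAS_CONSTEXPR_VECTOR 0
#define SKETCH_CONSTEXPR inline
#endif

// Fills a vector with 1..n and sums it up again.
SKETCH_CONSTEXPR int sum_up_to(int n) {
    std::vector<int> values;
    for (int i = 1; i <= n; ++i) {
        values.push_back(i);
    }
    int total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}

#if SKETCH_HAS_CONSTEXPR_VECTOR
// Only possible because vector's members are constexpr in C++20.
static_assert(sum_up_to(4) == 10, "evaluated at compile time");
#endif

// Run-time use works in every language version.
int example_runtime_sum() {
    int result = sum_up_to(4);
    assert(result == 10);
    return result;
}
```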
One thing that makes reasoning about this a bit hard is that the C++ standard never states what it means by “references”. It would be great to clarify that. However, I feel that defining a constructor that (implicitly) calls a function (the constructor of std::vector) should probably count as a “reference”, and thus Clang is totally allowed to crash and burn at this point. It would also be great to actually verify why exactly this only starts with C++20 - so far, I only have the assumption above that this is related to compile-time evaluation of the now-constexpr constructors.
I assume that this is a problem that is present in many code bases. I encountered it because Apache Thrift (as of version 0.17) does generate code with this problem (there is a pull request available).
Thanks to StackOverflow user Brian Li for their answer on SO.
[1] Note that this of course does not automatically mean that the program is allowed by the C++ standard.