diff --git a/CppCoreGuidelines.md b/CppCoreGuidelines.md index d20cd77..95fb79b 100644 --- a/CppCoreGuidelines.md +++ b/CppCoreGuidelines.md @@ -1,9 +1,10 @@ --- layout: default --- + # C++ Core Guidelines -December 26, 2017 +January 22, 2018 Editors: @@ -483,14 +484,16 @@ What is expressed in code has defined semantics and can (in principle) be checke The first declaration of `month` is explicit about returning a `Month` and about not modifying the state of the `Date` object. The second version leaves the reader guessing and opens more possibilities for uncaught bugs. -##### Example +##### Example; bad + +This loop is a restricted form of `std::find`: void f(vector& v) { string val; cin >> val; // ... - int index = -1; // bad + int index = -1; // bad, plus should use gsl::index for (int i = 0; i < v.size(); ++i) { if (v[i] == val) { index = i; @@ -500,7 +503,8 @@ The second version leaves the reader guessing and opens more possibilities for u // ... } -That loop is a restricted form of `std::find`. +##### Example; good + A much clearer expression of intent would be: void f(vector& v) @@ -582,7 +586,7 @@ Unless the intent of some code is stated (e.g., in names or comments), it is imp ##### Example - int i = 0; + gsl::index i = 0; while (i < v.size()) { // ... do something with v[i] ... } @@ -1023,7 +1027,7 @@ Time and space that you spend well to achieve a goal (e.g., speed of development X x; x.ch = 'a'; x.s = string(n); // give x.s space for *p - for (int i = 0; i < x.s.size(); ++i) x.s[i] = buf[i]; // copy buf into x.s + for (gsl::index i = 0; i < x.s.size(); ++i) x.s[i] = buf[i]; // copy buf into x.s delete[] buf; return x; } @@ -1700,7 +1704,7 @@ Use the ISO Concepts TS style of requirements specification. For example: ##### Note Soon (maybe in 2018), most compilers will be able to check `requires` clauses once the `//` is removed. -For now, the concept TS is supported only in GCC 6.1 and later. +Concepts are supported in GCC 6.1 and later. **See also**: [Generic programming](#SS-GP) and [concepts](#SS-t-concepts). @@ -3263,7 +3267,7 @@ A `span` represents a range of elements, but how do we manipulate elements of th for (int x : s) cout << x << '\n'; // C-style traversal (potentially checked) - for (int i = 0; i < s.size(); ++i) cout << s[i] << '\n'; + for (gsl::index i = 0; i < s.size(); ++i) cout << s[i] << '\n'; // random access (potentially checked) s[7] = 9; @@ -3568,7 +3572,7 @@ The language guarantees that a `T&` refers to an object, so that testing for `nu array w; // ... public: - wheel& get_wheel(size_t i) { Expects(i < w.size()); return w[i]; } + wheel& get_wheel(int i) { Expects(i < w.size()); return w[i]; } // ... }; @@ -4183,7 +4187,7 @@ Flag classes declared with `struct` if there is a `private` or `protected` membe Encapsulation. Information hiding. -Minimize the chance of untended access. +Minimize the chance of unintended access. This simplifies maintenance. ##### Example @@ -5084,7 +5088,7 @@ If you really have to, look at [factory functions](#Rc-factory). One reason people have used `init()` functions rather than doing the initialization work in a constructor has been to avoid code replication. [Delegating constructors](#Rc-delegating) and [default member initialization](#Rc-in-class-initializer) do that better. -Another reason is been to delay initialization until an object is needed; the solution to that is often [not to declare a variable until it can be properly initialized](#Res-init) +Another reason has been to delay initialization until an object is needed; the solution to that is often [not to declare a variable until it can be properly initialized](#Res-init) ##### Enforcement @@ -6855,6 +6859,25 @@ Since each implementation derived from its interface as well as its implementati As mentioned, this is just one way to construct a dual hierarchy. +The implementation hierarchy can be used directly, rather than through the abstract interface. + + void work_with_shape(Shape&); + + int user() + { + Impl::Smiley my_smiley{ /* args */ }; // create concrete shape + // ... + my_smiley.some_member(); // use implementation class directly + // ... + work_with_shape(my_smiley); // use implementation through abstract interface + // ... + } + +This can be useful when the implementation class has members that are not offered in the abstract interface +or if direct use of a member offers optimization opportunities (e.g., if an implementation member function is `final`) + +##### Note + Another (related) technique for separating interface and implementation is [Pimpl](#Ri-pimpl). ##### Note @@ -9135,7 +9158,35 @@ be able to destroy a cyclic structure. ##### Example - ??? + #include + + class bar; + + class foo + { + public: + explicit foo(const std::shared_ptr& forward_reference) + : forward_reference_(forward_reference) + { } + private: + std::shared_ptr forward_reference_; + }; + + class bar + { + public: + explicit bar(const std::weak_ptr& back_reference) + : back_reference_(back_reference) + { } + void do_something() + { + if (auto shared_back_reference = back_reference_.lock()) { + // Use *shared_back_reference + } + } + private: + std::weak_ptr back_reference_; + }; ##### Note @@ -9482,7 +9533,7 @@ Arithmetic rules: * [ES.104: Don't underflow](#Res-underflow) * [ES.105: Don't divide by zero](#Res-zero) * [ES.106: Don't try to avoid negative values by using `unsigned`](#Res-nonnegative) -* [ES.107: Don't use `unsigned` for subscripts](#Res-subscripts) +* [ES.107: Don't use `unsigned` for subscripts, prefer `gsl::index`](#Res-subscripts) ### ES.1: Prefer the standard library to other libraries and to "handcrafted code" @@ -9678,7 +9729,7 @@ Conventional short, local names increase readability: template // good void print(ostream& os, const vector& v) { - for (int i = 0; i < v.size(); ++i) + for (gsl::index i = 0; i < v.size(); ++i) os << v[i] << '\n'; } @@ -9687,9 +9738,9 @@ An index is conventionally called `i` and there is no hint about the meaning of template // bad: verbose, hard to read void print(ostream& target_stream, const vector& current_vector) { - for (int current_element_index = 0; - current_element_index < current_vector.size(); - ++current_element_index + for (gsl::index current_element_index = 0; + current_element_index < current_vector.size(); + ++current_element_index ) target_stream << current_vector[current_element_index] << '\n'; } @@ -9864,7 +9915,6 @@ Flag variable and constant declarations with multiple declarators (e.g., `int* p Consider: auto p = v.begin(); // vector::iterator - auto s = v.size(); auto h = t.future(); auto q = make_unique(s); auto f = [](int x){ return x + 10; }; @@ -10052,11 +10102,6 @@ This cannot trivially be rewritten to initialize `i` and `j` with initializers. Note that for types with a default constructor, attempting to postpone initialization simply leads to a default initialization followed by an assignment. A popular reason for such examples is "efficiency", but a compiler that can detect whether we made a used-before-set error can also eliminate any redundant double initialization. -At the cost of repeating `cond` we could write: - - widget i = (cond) ? f1() : f3(); - widget j = (cond) ? f2() : f4(); - Assuming that there is a logical connection between `i` and `j`, that connection should probably be expressed in code: pair make_related_widgets(bool x) @@ -10064,25 +10109,13 @@ Assuming that there is a logical connection between `i` and `j`, that connection return (x) ? {f1(), f2()} : {f3(), f4() }; } - auto init = make_related_widgets(cond); - widget i = init.first; - widget j = init.second; - -Obviously, what we really would like is a construct that initialized n variables from a `tuple`. For example: - - auto [i, j] = make_related_widgets(cond); // C++17, not C++14 + auto [i, j] = make_related_widgets(cond); // C++17 -Today, we might approximate that using `tie()`: - - widget i; // bad: uninitialized variable - widget j; - tie(i, j) = make_related_widgets(cond); - -This may be seen as an example of the *immediately initialize from input* exception below. +##### Note -Creating optimal and equivalent code from all of these examples should be well within the capabilities of modern C++ compilers -(but don't make performance claims without measuring; a compiler may very well not generate optimal code for every example and -there may be language rules preventing some optimization that you would have liked in a particular case). +Complex initialization has been popular with clever programmers for decades. +It has also been a major source of errors and complexity. +Many such errors are introduced during maintenance years after the initial implementation. ##### Example @@ -10107,12 +10140,6 @@ The compiler will flag the uninitialized `cm3` because it is a `const`, but it w Usually, a rare spurious member initialization is worth the absence of errors from lack of initialization and often an optimizer can eliminate a redundant initialization (e.g., an initialization that occurs immediately before an assignment). -##### Note - -Complex initialization has been popular with clever programmers for decades. -It has also been a major source of errors and complexity. -Many such errors are introduced during maintenance years after the initial implementation. - ##### Exception If you are declaring an object that is just about to be initialized from input, initializing it would cause a double initialization. @@ -11954,7 +11981,7 @@ Readability. Error prevention. Efficiency. ##### Example - for (int i = 0; i < v.size(); ++i) // bad + for (gsl::index i = 0; i < v.size(); ++i) // bad cout << v[i] << '\n'; for (auto p = v.begin(); p != v.end(); ++p) // bad @@ -11963,13 +11990,13 @@ Readability. Error prevention. Efficiency. for (auto& x : v) // OK cout << x << '\n'; - for (int i = 1; i < v.size(); ++i) // touches two elements: can't be a range-for + for (gsl::index i = 1; i < v.size(); ++i) // touches two elements: can't be a range-for cout << v[i] + v[i - 1] << '\n'; - for (int i = 0; i < v.size(); ++i) // possible side effect: can't be a range-for + for (gsl::index i = 0; i < v.size(); ++i) // possible side effect: can't be a range-for cout << f(v, &v[i]) << '\n'; - for (int i = 0; i < v.size(); ++i) { // body messes with loop variable: can't be a range-for + for (gsl::index i = 0; i < v.size(); ++i) { // body messes with loop variable: can't be a range-for if (i % 2 == 0) continue; // skip even elements else @@ -12006,7 +12033,7 @@ Readability: the complete logic of the loop is visible "up front". The scope of ##### Example - for (int i = 0; i < vec.size(); i++) { + for (gsl::index i = 0; i < vec.size(); i++) { // do work } @@ -12437,6 +12464,23 @@ For example: This invokes `istream`'s `operator bool()`. +##### Note + +Explicit comparison of an integer to `0` is in general not redundant. +The reason is that (as opposed to pointers and Booleans) an integer often has more than two reasonable values. +Furthermore `0` (zero) is often used to indicate success. +Consequently, it is best to be specific about the comparison. + + void f(int i) + { + if (i) // suspect + // ... + if (i == success) // possibly better + // ... + } + +Always remember that an integer can have more that two values. + ##### Example, bad It has been noted that @@ -12449,7 +12493,7 @@ Being verbose and writing if(strcmp(p1, p2) != 0) { ... } // are the two C-style strings equal? (mistake!) -would not save you. +would not in itself save you. ##### Note @@ -12488,11 +12532,13 @@ It is harder to spot the problem in more realistic examples. ##### Note Unfortunately, C++ uses signed integers for array subscripts and the standard library uses unsigned integers for container subscripts. -This precludes consistency. +This precludes consistency. Use `gsl::index` for subscripts; [see ES.107](#Res-subscripts). ##### Enforcement -Compilers already know and sometimes warn. +* Compilers already know and sometimes warn. +* (To avoid noise) Do not flag on a mixed signed/unsigned comparison where one of the arguments is `sizeof` or a call to container `.size()` and the other is `ptrdiff_t`. + ### ES.101: Use unsigned types for bit manipulation @@ -12559,25 +12605,29 @@ is going to be surprising for many programmers. ##### Example The standard library uses unsigned types for subscripts. -The build-in array uses signed types for subscripts. +The built-in array uses signed types for subscripts. This makes surprises (and bugs) inevitable. int a[10]; for (int i = 0; i < 10; ++i) a[i] = i; vector v(10); - // compares signed to unsigned; some compilers warn - for (int i = 0; v.size() < 10; ++i) v[i] = i; + // compares signed to unsigned; some compilers warn, but we should not + for (gsl::index i = 0; v.size() < 10; ++i) v[i] = i; int a2[-2]; // error: negative size // OK, but the number of ints (4294967294) is so large that we should get an exception vector v2(-2); + Use `gsl::index` for subscripts; [see ES.107](#Res-subscripts). + ##### Enforcement * Flag mixed signed and unsigned arithmetic * Flag results of unsigned arithmetic assigned to or printed as signed. * Flag unsigned literals (e.g. `-2`) used as container subscripts. +* (To avoid noise) Do not flag on a mixed signed/unsigned comparison where one of the arguments is `sizeof` or a call to container `.size()` and the other is `ptrdiff_t`. + ### ES.103: Don't overflow @@ -12741,33 +12791,47 @@ For example Hard: there is a lot of code using `unsigned` and we don't offer a practical positive number type. -### ES.107: Don't use `unsigned` for subscripts +### ES.107: Don't use `unsigned` for subscripts, prefer `gsl::index` ##### Reason To avoid signed/unsigned confusion. To enable better optimization. To enable better error detection. +To avoid the pitfalls with `auto` and `int`. ##### Example, bad - vector vec {1, 2, 3, 4, 5}; + vector vec = /*...*/; - for (int i = 0; i < vec.size(); i += 2) // mix int and unsigned + for (int i = 0; i < vec.size(); i += 2) // may not be big enough cout << vec[i] << '\n'; for (unsigned i = 0; i < vec.size(); i += 2) // risk wraparound cout << vec[i] << '\n'; + for (auto i = 0; i < vec.size(); i += 2) // may not be big enough + cout << vec[i] << '\n'; for (vector::size_type i = 0; i < vec.size(); i += 2) // verbose cout << vec[i] << '\n'; - for (auto i = 0; i < vec.size(); i += 2) // mix int and unsigned + for (auto i = vec.size()-1; i >= 0; i -= 2) // bug + cout << vec[i] << '\n'; + for (int i = vec.size()-1; i >= 0; i -= 2) // may not be big enough + cout << vec[i] << '\n'; + +##### Example, good + + vector vec = /*...*/; + + for (gsl::index i = 0; i < vec.size(); i += 2) // ok + cout << vec[i] << '\n'; + for (gsl::index i = vec.size()-1; i >= 0; i -= 2) // ok cout << vec[i] << '\n'; ##### Note The built-in array uses signed subscripts. The standard-library containers use unsigned subscripts. -Thus, no perfect and fully compatible solution is possible. -Given the known problems with unsigned and signed/unsigned mixtures, better stick to (signed) integers. +Thus, no perfect and fully compatible solution is possible (unless and until the standard-library containers change to use signed subscripts someday in the future). +Given the known problems with unsigned and signed/unsigned mixtures, better stick to (signed) integers of a sufficient size, which is guaranteed by `gsl::index`. ##### Example @@ -12775,7 +12839,7 @@ Given the known problems with unsigned and signed/unsigned mixtures, better stic struct My_container { public: // ... - T& operator[](int i); // not unsigned + T& operator[](gsl::index i); // not unsigned // ... }; @@ -12793,7 +12857,11 @@ Alternatives for users ##### Enforcement -Very tricky as long as the standard-library containers get it wrong. +* Very tricky as long as the standard-library containers get it wrong. +* (To avoid noise) Do not flag on a mixed signed/unsigned comparison where one of the arguments is `sizeof` or a call to container `.size()` and the other is `ptrdiff_t`. + + + # Per: Performance @@ -13101,7 +13169,82 @@ Type violations, weak types (e.g. `void*`s), and low-level code (e.g., manipulat ### Per.11: Move computation from run time to compile time -??? +##### Reason + +To decrease code size and run time. +To avoid data races by using constants. +To catch errors at compile time (and thus eliminate the need for error-handling code). + +##### Example + + double square(double d) { return d*d; } + static double s2 = square(2); // old-style: dynamic initialization + + constexpr double ntimes(double d, int n) // assume 0 <= n + { + double m = 1; + while (n--) m *= d; + return m; + } + constexpr double s3 {ntimes(2, 3)}; // modern-style: compile-time initialization + +Code like the initialization of `s2` isn't uncommon, especially for initialization that's a bit more complicated than `square()`. +However, compared to the initialization of `s3` there are two problems: + +* we suffer the overhead of a function call at run time +* `s2` just might be accessed by another thread before the initialization happens. + +Note: you can't have a data race on a constant. + +##### Example + +Consider a popular technique for providing a handle for storing small objects in the handle itself and larger ones on the heap. + + constexpr int on_stack_max = 20; + + template + struct Scoped { // store a T in Scoped + // ... + T obj; + }; + + template + struct On_heap { // store a T on the free store + // ... + T* objp; + }; + + template + using Handle = typename std::conditional<(sizeof(T) <= on_stack_max), + Scoped, // first alternative + On_heap // second alternative + >::type; + + void f() + { + Handle v1; // the double goes on the stack + Handle> v2; // the array goes on the free store + // ... + } + +Assume that `Scoped` and `On_heap` provide compatible user interfaces. +Here we compute the optimal type to use at compile time. +There are similar techniques for selecting the optimal function to call. + +##### Note + +The ideal is {not} to try execute everything at compile time. +Obviously, most computations depend on inputs so they can't be moved to compile time, +but beyond that logical constraint is the fact that complex compile-time computation can seriously increase compile times +and complicate debugging. +It is even possible to slow down code by compile-time computation. +This is admittedly rare, but by factoring out a general computation into separate optimal sub-calculations it is possible to render the instruction cache less effective. + +##### Enforcement + +* Look for simple functions that might be constexpr (but are not). +* Look for functions called with all constant-expression arguments. +* Look for macros that could be constexpr. ### Per.12: Eliminate redundant aliases @@ -13391,6 +13534,7 @@ Making `surface_readings` be `const` (with respect to this function) allow reaso Immutable data can be safely and efficiently shared. No locking is needed: You can't have a data race on a constant. +See also [CP.mess: Message Passing](#SScp-mess) and [CP.31: prefer pass by value](#C#Rconc-data-by-value). ##### Enforcement @@ -14694,7 +14838,7 @@ C++ implementations tend to be optimized based on the assumption that exceptions int find_index(vector& vec, const string& x) { try { - for (int i = 0; i < vec.size(); ++i) + for (gsl::index i = 0; i < vec.size(); ++i) if (vec[i] == x) throw i; // found x } catch (int i) { return i; @@ -15777,9 +15921,9 @@ Templates can also be used for meta-programming; that is, programs that compose A central notion in generic programming is "concepts"; that is, requirements on template arguments presented as compile-time predicates. "Concepts" are defined in an ISO Technical specification: [concepts](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4553.pdf). A draft of a set of standard-library concepts can be found in another ISO TS: [ranges](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4569.pdf) -Currently (July 2016), concepts are supported only in GCC 6.1. +Concepts are supported in GCC 6.1 and later. Consequently, we comment out uses of concepts in examples; that is, we use them as formalized comments only. -If you use GCC 6.1, you can uncomment them. +If you use GCC 6.1 or later, you can uncomment them. Template use rule summary: @@ -15934,9 +16078,9 @@ is to efficiently generalize operations/algorithms over a set of types with simi The `requires` in the comments are uses of `concepts`. "Concepts" are defined in an ISO Technical specification: [concepts](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4553.pdf). -Currently (July 2016), concepts are supported only in GCC 6.1. +Concepts are supported in GCC 6.1 and later. Consequently, we comment out uses of concepts in examples; that is, we use them as formalized comments only. -If you use GCC 6.1, you can uncomment them. +If you use GCC 6.1 or later, you can uncomment them. ##### Enforcement @@ -16127,9 +16271,9 @@ or equivalently and more succinctly: "Concepts" are defined in an ISO Technical specification: [concepts](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4553.pdf). A draft of a set of standard-library concepts can be found in another ISO TS: [ranges](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4569.pdf) -Currently (July 2016), concepts are supported only in GCC 6.1. +Concepts are supported in GCC 6.1 and later. Consequently, we comment out uses of concepts in examples; that is, we use them as formalized comments only. -If you use GCC 6.1, you can uncomment them: +If you use GCC 6.1 or later, you can uncomment them: template requires Input_iterator @@ -16232,9 +16376,9 @@ The shorter versions better match the way we speak. Note that many templates don "Concepts" are defined in an ISO Technical specification: [concepts](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4553.pdf). A draft of a set of standard-library concepts can be found in another ISO TS: [ranges](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/n4569.pdf) -Currently (July 2016), concepts are supported only in GCC 6.1. +Concepts are supported in GCC 6.1 and later. Consequently, we comment out uses of concepts in examples; that is, we use them as formalized comments only. -If you use a compiler that supports concepts (e.g., GCC 6.1), you can remove the `//`. +If you use a compiler that supports concepts (e.g., GCC 6.1 or later), you can remove the `//`. ##### Enforcement @@ -16248,7 +16392,7 @@ Concepts are meant to represent fundamental concepts in an application domain (h Similarly throwing together a set of syntactic constraints to be used for a the arguments for a single class or algorithm is not what concepts were designed for and will not give the full benefits of the mechanism. -Obviously, defining concepts will be most useful for code that can use an implementation (e.g., GCC 6.1), +Obviously, defining concepts will be most useful for code that can use an implementation (e.g., GCC 6.1 or later), but defining concepts is in itself a useful design technique and help catch conceptual errors and clean up the concepts (sic!) of an implementation. ### T.20: Avoid "concepts" without meaningful semantics @@ -18561,6 +18705,13 @@ For a variable-length array, use `std::vector`, which additionally can change it Use `gsl::span` for non-owning references into a container. +##### Note + +Comparing the performance of a fixed-sized array allocated on the stack against a `vector` with its elements on the free store is bogus. +You could just as well compare a `std::array` on the stack against the result of a `malloc()` accessed through a pointer. +For most code, even the difference between stack allocation and free-store allocation doesn't matter, but the convenience and safety of `vector` does. +People working with code for which that difference matters are quite capable of choosing between `array` and `vector`. + ##### Enforcement * Flag declaration of a C array inside a function or class that also declares an STL container (to avoid excessive noisy warnings on legacy non-STL code). To fix: At least change the C array to a `std::array`. @@ -19875,12 +20026,13 @@ for example, `Expects(p != nullptr)` will become `[[expects: p != nullptr]]`. ## GSL.util: Utilities -* `finally` // `finally(f)` makes a `final_action{f}` with a destructor that invokes `f` -* `narrow_cast` // `narrow_cast(x)` is `static_cast(x)` -* `narrow` // `narrow(x)` is `static_cast(x)` if `static_cast(x) == x` or it throws `narrowing_error` -* `[[implicit]]` // "Marker" to put on single-argument constructors to explicitly make them non-explicit. -* `move_owner` // `p = move_owner(q)` means `p = q` but ??? +* `finally` // `finally(f)` makes a `final_action{f}` with a destructor that invokes `f` +* `narrow_cast` // `narrow_cast(x)` is `static_cast(x)` +* `narrow` // `narrow(x)` is `static_cast(x)` if `static_cast(x) == x` or it throws `narrowing_error` +* `[[implicit]]` // "Marker" to put on single-argument constructors to explicitly make them non-explicit. +* `move_owner` // `p = move_owner(q)` means `p = q` but ??? * `joining_thread` // a RAII style version of `std::thread` that joins. +* `index` // a type to use for all container and array indexing (currently an alias for `ptrdiff_t`) ## GSL.concept: Concepts @@ -20293,24 +20445,44 @@ When declaring a class use the following order Use the `public` before `protected` before `private` order. -Private types and functions can be placed with private data. +##### Example -Avoid multiple blocks of declarations of one access (e.g., `public`) dispersed among blocks of declarations with different access (e.g. `private`). + class X { + public: + // interface + protected: + // unchecked function for use by derived class implementations + private: + // implementation details + }; ##### Example +Sometimes, the default order of members conflicts with a desire to separate the public interface from implementation details. +In such cases, private types and functions can be placed with private data. + class X { public: // interface protected: // unchecked function for use by derived class implementations private: - // implementation details + // implementation details (types, functions, and data) }; -##### Note +##### Example, bad + +Avoid multiple blocks of declarations of one access (e.g., `public`) dispersed among blocks of declarations with different access (e.g. `private`). + + class X { // bad + public: + void f(); + public: + int g(); + // ... + }; -The use of macros to declare groups of members often violates any ordering rules. +The use of macros to declare groups of members often leads to violation of any ordering rules. However, macros obscures what is being expressed anyway. ##### Enforcement