Separated F.15 into F.15-21: in, inout, consume, forward, out, multi-out

pull/416/head
hsutter 10 years ago
parent 08b83eaea0
commit 6228e99a51

@ -1778,9 +1778,18 @@ Function definition rules:
* [F.7: For general use, take `T*` arguments rather than smart pointers](#Rf-smart) * [F.7: For general use, take `T*` arguments rather than smart pointers](#Rf-smart)
* [F.8: Prefer pure functions](#Rf-pure) * [F.8: Prefer pure functions](#Rf-pure)
Parameter passing rules: Parameter passing expression rules:
* [F.15: Prefer simple and conventional ways of passing information](#Rf-conventional) * [F.15: Prefer simple and conventional ways of passing information](#Rf-conventional)
* [F.16: For "in" parameters, pass cheaply copied types by value and others by reference to `const`](#Rf-in)
* [F.17: For "in-out" parameters, pass by reference to non-`const`](#Rf-inout)
* [F.18: For "consume" parameters, pass by `X&&` and `std::move` the parameter](#Rf-consume)
* [F.19: For "forward" parameters, pass by `TP&&` and only `std::forward` the parameter](#Rf-forward)
* [F.20: For "out" output values, prefer return values to output parameters](#Rf-out)
* [F.21: To return multiple "out" values, prefer returning a tuple or struct](#Rf-out-multi)
Parameter passing semantic rules:
* [F.22: Use `T*` or `owner<T*>` or a smart pointer to designate a single object](#Rf-ptr) * [F.22: Use `T*` or `owner<T*>` or a smart pointer to designate a single object](#Rf-ptr)
* [F.23: Use a `not_null<T>` to indicate "null" is not a valid value](#Rf-nullptr) * [F.23: Use a `not_null<T>` to indicate "null" is not a valid value](#Rf-nullptr)
* [F.24: Use a `span<T>` or a `span_p<T>` to designate a half-open sequence](#Rf-range) * [F.24: Use a `span<T>` or a `span_p<T>` to designate a half-open sequence](#Rf-range)
@ -1788,9 +1797,8 @@ Parameter passing rules:
* [F.26: Use a `unique_ptr<T>` to transfer ownership where a pointer is needed](#Rf-unique_ptr) * [F.26: Use a `unique_ptr<T>` to transfer ownership where a pointer is needed](#Rf-unique_ptr)
* [F.27: Use a `shared_ptr<T>` to share ownership](#Rf-shared_ptr) * [F.27: Use a `shared_ptr<T>` to share ownership](#Rf-shared_ptr)
Value return rules: Value return semantic rules:
* [F.41: Prefer to return tuples or structs instead of multiple out-parameters](#Rf-T-multi)
* [F.42: Return a `T*` to indicate a position (only)](#Rf-return-ptr) * [F.42: Return a `T*` to indicate a position (only)](#Rf-return-ptr)
* [F.43: Never (directly or indirectly) return a pointer to a local object](#Rf-dangle) * [F.43: Never (directly or indirectly) return a pointer to a local object](#Rf-dangle)
* [F.44: Return a `T&` when "returning no object" isn't an option](#Rf-return-ref) * [F.44: Return a `T&` when "returning no object" isn't an option](#Rf-return-ref)
@ -2182,6 +2190,7 @@ Not possible.
There are a variety of ways to pass parameters to a function and to return values. There are a variety of ways to pass parameters to a function and to return values.
### <a name="Rf-conventional"></a> Rule F.15: Prefer simple and conventional ways of passing information ### <a name="Rf-conventional"></a> Rule F.15: Prefer simple and conventional ways of passing information
##### Reason ##### Reason
@ -2191,41 +2200,82 @@ If you really feel the need for an optimization beyond the common techniques, me
![Normal parameter passing table](./param-passing-normal.png "Normal parameter passing") ![Normal parameter passing table](./param-passing-normal.png "Normal parameter passing")
**For an "output-only" value:** Prefer return values to output parameters. ![Advanced parameter passing table](./param-passing-advanced.png "Advanced parameter passing")
This includes large objects like standard containers that use implicit move operations for performance and to avoid explicit memory management. A return value is self-documenting, whereas a `&` could be either in-out or out-only and is liable to be misused.
If you have multiple values to return, [use a tuple](#Rf-T-multi) or similar multi-member type.
### <a name="Rf-in"></a> Rule F.16: For "in" parameters, pass cheaply copied types by value and others by reference to `const`
##### Reason
Both let the caller know that a function will not modify the argument, and both allow initialization by rvalues.
What is "cheap to copy" depends on the machine architecture, but two or three words (doubles, pointers, references) are usually best passed by value.
When copying is cheap, nothing beats the simplicity and safety of copying, and for small objects (up to two or three words) it is also faster than passing by reference because it does not require an extra reference to access from the function.
##### Example ##### Example
vector<const int*> find_all(const vector<int>&, int x); // OK: return pointers to elements with the value x void fct(const string& s); // OK: pass by const reference; always cheap
void find_all(const vector<int>&, vector<const int*>& out, int x); // Bad: place pointers to elements with value x in out void fct2(string s); // bad: potentially expensive
##### Note void fct(int x); // OK: Unbeatable
A struct of many (individually cheap-to-move) elements may be in aggregate expensive to move. void fct2(const int& x); // bad: overhead on access in fct2()
##### Exceptions For advanced uses (only), where you really need to optimize for rvalues passed to "input-only" parameters:
* For non-value types, such as types in an inheritance hierarchy, return the object by `unique_ptr` or `shared_ptr`. * If the function is going to unconditionally move from the argument, take it by `&&`.
* If a type is expensive to move (e.g., `array<BigPOD>`), consider allocating it on the free store and return a handle (e.g., `unique_ptr`), or passing it in a non-`const` reference to a target object to fill (to be used as an out-parameter). * If the function is going to keep a copy of the argument, in addition to passing by `const&` add an overload that passes the parameter by `&&` and in the body `std::move`s it to its destination.
* In the special case of allowing a caller to reuse an object that carries capacity (e.g., `std::string`, `std::vector`) across multiple calls to the function in an inner loop, treat it as an in/out parameter instead and pass by `&`. This is one use of the more generally named "caller-allocated out" pattern. * In special cases, such as multiple "input + copy" parameters, consider using perfect forwarding.
##### Example ##### Example
struct Package { // exceptional case: expensive-to-move object int multiply(int, int); // just input ints, pass by value
char header[16];
char load[2024 - 16];
};
Package fill(); // Bad: large return value string& concatenate(string&, const string& suffix); // suffix is input-only but not as cheap as an int, pass by const&
void fill(Package&); // OK
int val(); // OK void sink(unique_ptr<widget>); // input only, and consumes the widget
void val(int&); // Bad: Is val reading its argument?
Avoid "esoteric techniques" such as:
**For an "in-out" parameter:** Pass by non-`const` reference. This makes it clear to callers that the object is assumed to be modified. * Passing arguments as `T&&` "for efficiency". Most rumors about performance advantages from passing by `&&` are false or brittle (but see [F.25](#Rf-pass-ref-move).)
* Returning `const T&` from assignments and similar operations.
##### Example
Assuming that `Matrix` has move operations (possibly by keeping its elements in a `std::vector`).
Matrix operator+(const Matrix& a, const Matrix& b)
{
    Matrix res;
    // ... fill res with the sum ...
    return res;
}
Matrix x = m1 + m2; // move constructor
y = m3 + m3; // move assignment
##### Notes
The return value optimization doesn't handle the assignment case.
A reference may be assumed to refer to a valid object (language rule).
There is no (legitimate) "null reference."
If you need the notion of an optional value, use a pointer, `std::optional`, or a special value used to denote "no value."
##### Enforcement
* (Simple) ((Foundation)) Warn when a parameter being passed by value has a size greater than `4 * sizeof(int)`.
Suggest using a `const` reference instead.
* (Simple) ((Foundation)) Warn when a `const` parameter being passed by reference has a size less than `3 * sizeof(int)`. Suggest passing by value instead.
### <a name="Rf-inout"></a> Rule F.17: For "in-out" parameters, pass by reference to non-`const`
##### Reason
This makes it clear to callers that the object is assumed to be modified.
##### Example ##### Example
@ -2251,96 +2301,136 @@ Thus `T&` could be an in-out-parameter. That can in itself be a problem and a so
Here, the writer of `g()` is supplying a buffer for `f()` to fill, but `f()` simply replaces it (at a somewhat higher cost than a simple copy of the characters). Here, the writer of `g()` is supplying a buffer for `f()` to fill, but `f()` simply replaces it (at a somewhat higher cost than a simple copy of the characters).
If the writer of `g()` makes an assumption about the size of `buffer` a bad logic error can happen. If the writer of `g()` makes an assumption about the size of `buffer` a bad logic error can happen.
##### Enforcement
* (Moderate) ((Foundation)) Warn about functions with non-`const` reference arguments that do *not* write to them.
**For an "input-only" value:** If the object is cheap to copy, pass by value; nothing beats the simplicity and safety of copying, and for small objects (up to two or three words) it is also faster than passing by reference.
Otherwise, pass by `const&` which is always cheap for larger objects. Both let the caller know that a function will not modify the argument, and both allow initialization by rvalues.
What is "cheap to copy" depends on the machine architecture, but two or three words (doubles, pointers, references) are usually best passed by value.
In particular, an object passed by value does not require an extra reference to access from the function.
##### Example ### <a name="Rf-consume"></a> Rule F.18: For "consume" parameters, pass by `X&&` and `std::move` the parameter
void fct(const string& s); // OK: pass by const reference; always cheap ##### Reason
void fct2(string s); // bad: potentially expensive ##### Enforcement
* Flag all `X&&` parameters (where `X` is not a template type parameter name) where the function body uses them without `std::move`.
* Flag access to moved-from objects.
* Don't conditionally move from objects
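A minimal sketch of a "consume" parameter, under the assumption of a hypothetical `Message_log` class that is not part of the guideline text: the function takes its argument by `&&` and unconditionally `std::move`s it to its destination exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type, for illustration only.
class Message_log {
public:
    // "Consume" parameter: the caller gives up `line`, so take it by && ...
    void append(std::string&& line)
    {
        // ... and std::move it once, unconditionally, to its destination.
        lines_.push_back(std::move(line));
    }

    std::size_t size() const { return lines_.size(); }
    const std::string& back() const { return lines_.back(); }

private:
    std::vector<std::string> lines_;
};

// Consumes one named string and one temporary, then reports how many lines are stored.
inline std::size_t consume_demo()
{
    Message_log log;
    std::string s = "first entry";
    log.append(std::move(s));                  // the caller states explicitly that s is given away
    // s is now valid but unspecified; do not read it before assigning to it again.
    log.append(std::string("second entry"));   // a temporary binds to && directly
    assert(log.back() == "second entry");
    return log.size();
}
```

Because `append` accepts only rvalues, a caller holding a named object has to write `std::move` at the call site, which makes the transfer visible where it happens.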
void fct(int x); // OK: Unbeatable
void fct2(const int& x); // bad: overhead on access in fct2() ### <a name="Rf-forward"></a> Rule F.19: For "forward" parameters, pass by `TP&&` and only `std::forward` the parameter
![Advanced parameter passing table](./param-passing-advanced.png "Advanced parameter passing") ##### Reason
For advanced uses (only), where you really need to optimize for rvalues passed to "input-only" parameters: If the object is to be passed onward to other code and not directly used by this function, we want to make this function agnostic to the argument `const`-ness and rvalue-ness.
* If the function is going to unconditionally move from the argument, take it by `&&`. In that case, and only that case, make the parameter `TP&&` where `TP` is a template type parameter -- it both *ignores* and *preserves* `const`-ness and rvalue-ness. Therefore any code that uses a `TP&&` is implicitly declaring that it itself doesn't care about the variable's `const`-ness and rvalue-ness (because it is ignored), but that it intends to pass the value onward to other code that does care about `const`-ness and rvalue-ness (because it is preserved). When used as a parameter `TP&&` is safe because any temporary objects passed from the caller will live for the duration of the function call. A parameter of type `TP&&` should essentially always be passed onward via `std::forward` in the body of the function.
* If the function is going to keep a copy of the argument, in addition to passing by `const&` add an overload that passes the parameter by `&&` and in the body `std::move`s it to its destination.
* In special cases, such as multiple "input + copy" parameters, consider using perfect forwarding.
##### Example ##### Example
int multiply(int, int); // just input ints, pass by value template <class F, class... Args>
inline auto invoke(F&& f, Args&&... args) {
return forward<F>(f)(forward<Args>(args)...);
}
string& concatenate(string&, const string& suffix); // suffix is input-only but not as cheap as an int, pass by const&
void sink(unique_ptr<widget>); // input only, and consumes the widget ##### Enforcement
* Flag a function that takes a `TP&&` parameter (where `TP` is a template type parameter name) and uses it without `std::forward`.
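As a further sketch of a "forward" parameter, assume a hypothetical factory `make_owned` (not a name from the guidelines) that does nothing with its arguments except pass them onward with `std::forward`, so an lvalue argument is copied and an rvalue argument is moved by the constructor that finally receives it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical wrapper: forwards its arguments untouched to T's constructor.
template <class T, class... Args>
std::unique_ptr<T> make_owned(Args&&... args)
{
    // Each argument is used exactly once, and only through std::forward.
    return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}

// Returns true if both the copied and the moved string hold the expected text.
inline bool forward_demo()
{
    std::string name = "widget";
    auto copied = make_owned<std::string>(name);             // lvalue: the constructor copies
    auto moved = make_owned<std::string>(std::move(name));   // rvalue: the constructor moves
    return *copied == "widget" && *moved == "widget";
}
```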
Avoid "esoteric techniques" such as:
* Passing arguments as `T&&` "for efficiency". Most rumors about performance advantages from passing by `&&` are false or brittle (but see [F.25](#Rf-pass-ref-move).) ### <a name="Rf-out"></a> Rule F.20: For "out" output values, prefer return values to output parameters
* Returning `const T&` from assignments and similar operations.
##### Reason
A return value is self-documenting, whereas a `&` could be either in-out or out-only and is liable to be misused.
This includes large objects like standard containers that use implicit move operations for performance and to avoid explicit memory management.
If you have multiple values to return, [use a tuple](#Rf-out-multi) or similar multi-member type.
##### Example ##### Example
Assuming that `Matrix` has move operations (possibly by keeping its elements in a `std::vector`). vector<const int*> find_all(const vector<int>&, int x); // OK: return pointers to elements with the value x
Matrix operator+(const Matrix& a, const Matrix& b) void find_all(const vector<int>&, vector<const int*>& out, int x); // Bad: place pointers to elements with value x in out
{
Matrix res;
// ... fill res with the sum ...
return res;
}
Matrix x = m1 + m2; // move constructor ##### Note
y = m3 + m3; // move assignment A struct of many (individually cheap-to-move) elements may be in aggregate expensive to move.
##### Notes ##### Exceptions
The return value optimization doesn't handle the assignment case. * For non-value types, such as types in an inheritance hierarchy, return the object by `unique_ptr` or `shared_ptr`.
* If a type is expensive to move (e.g., `array<BigPOD>`), consider allocating it on the free store and return a handle (e.g., `unique_ptr`), or passing it in a non-`const` reference to a target object to fill (to be used as an out-parameter).
* In the special case of allowing a caller to reuse an object that carries capacity (e.g., `std::string`, `std::vector`) across multiple calls to the function in an inner loop, treat it as an in/out parameter instead and pass by `&`. This is one use of the more generally named "caller-allocated out" pattern.
##### Example
struct Package { // exceptional case: expensive-to-move object
char header[16];
char load[2024 - 16];
};
Package fill(); // Bad: large return value
void fill(Package&); // OK
int val(); // OK
void val(int&); // Bad: Is val reading its argument?
##### Enforcement
* Flag non-`const` reference parameters that are not read before being written to and are a type that could be cheaply returned; they should be "out" return values.
A reference may be assumed to refer to a valid object (language rule).
There is no (legitimate) "null reference."
If you need the notion of an optional value, use a pointer, `std::optional`, or a special value used to denote "no value."
### <a name="Rf-out-multi"></a> Rule F.21: To return multiple "out" values, prefer returning a tuple or struct
##### Reason
**For an "forwarded" value:** If the object is to be passed onward to other code and not directly used by this function, we want to make this function agnostic to the argument `const`-ness and rvalue-ness. In that case, and only that case, make the parameter `TP&&` where `TP` is a template type parameter -- it both *ignores* and *preserves* `const`-ness and rvalue-ness. Therefore any code that uses a `T&&` is implicitly declaring that it itself doesn't care about the variable's `const`'-ness and rvalue-ness (because it is ignored), but that intends to pass the value onward to other code that does care about `const`-ness and rvalue-ness (because it is preserved). When used as a parameter `TP&&` is safe because any temporary objects passed from the caller will live for the duration of the function call. A parameter of type `TP&&` should essentially always be passed onward via `std::forward` in the body of the function. A return value is self-documenting as an "output-only" value.
And yes, C++ does have multiple return values, by convention of using a `tuple`, with the extra convenience of `tie` at the call site.
##### Example ##### Example
template <class F, class... Args> int f(const string& input, /*output only*/ string& output_data) // BAD: output-only parameter documented in a comment
inline auto invoke(F&& f, Args&&... args) { {
return forward<F>(f)(forward<Args>(args)...); // ...
output_data = something();
return status;
} }
tuple<int, string> f(const string& input) // GOOD: self-documenting
{
    // ...
    return make_tuple(status, something());
}
##### Enforcement In fact, C++98's standard library already used this convenient feature, because a `pair` is like a two-element `tuple`.
For example, given a `set<string> myset`, consider:
* (Simple) ((Foundation)) Warn when a parameter being passed by value has a size greater than `4 * sizeof(int)`. // C++98
Suggest using a `const` reference instead. result = myset.insert("Hello");
* (Simple) ((Foundation)) Warn when a `const` parameter being passed by reference has a size less than `3 * sizeof(int)`. Suggest passing by value instead. if (result.second) do_something_with(result.first); // workaround
* (Moderate) ((Foundation)) Warn about functions with non-`const` reference arguments that do *not* write to them.
* Flag a function that takes a `TP&&` parameter (where `TP` is a template type parameter name) and uses it without `std::forward`. With C++11 we can write this, putting the results directly in existing local variables:
* Flag all `X&&` parameters (where `X` is not a template type parameter name) where the function body uses them without `std::move`.
* Flag access to moved-from objects. Sometype iter; // default initialize if we haven't already
* Don't conditionally move from objects Someothertype success; // used these variables for some other purpose
* Flag non-`const` reference parameters that are not read before being written to and are a type that could be cheaply returned; they should be "out" return values.
tie(iter, success) = myset.insert("Hello"); // normal return value
if (success) do_something_with(iter);
With C++17 we may be able to write something like this, also declaring the variables:
auto { iter, success } = myset.insert("Hello");
if (success) do_something_with(iter);
**Exception**: For types like `string` and `vector` that carry additional capacity, it can sometimes be useful to treat it as in/out instead by using the "caller-allocated out" pattern, which is to pass an output-only object by reference to non-`const` so that when the callee writes to it the object can reuse any capacity or other resources that it already contains. This technique can dramatically reduce the number of allocations in a loop that repeatedly calls other functions to get string values, by using a single string object for the entire loop.
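The "caller-allocated out" pattern described in this exception might look like the following sketch. The names `format_label` and `total_label_length` are hypothetical, and the comment about `clear()` keeping the allocation describes typical library behavior rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical callee: writes its result into `out` instead of returning a new string.
inline void format_label(int id, std::string& out)
{
    out.clear();                  // typically keeps the buffer already allocated
    out += "item-";
    out += std::to_string(id);
}

// Hypothetical caller: one string object serves every iteration of the loop.
inline std::size_t total_label_length(const std::vector<int>& ids)
{
    std::string label;
    label.reserve(32);            // allocate once, before the loop
    std::size_t total = 0;
    for (int id : ids) {
        format_label(id, label);  // reuses label's capacity on each call
        total += label.size();
    }
    return total;
}
```

Returning the string by value would be simpler and is still the default choice; the reference parameter is worth it only when measurements show that the repeated allocations matter.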
##### Note
**See also**: [implicit arguments](#Ri-explicit). In some cases it may be useful to return a specific, user-defined `Value_or_error` type along the lines of `variant<T, error_code>`, rather than using the generic `tuple`.
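One possible shape for such a type, assuming C++17's `std::variant` is available; the alias `Value_or_error` follows the note, while `parse_digit` is a hypothetical function used only to exercise it.

```cpp
#include <cassert>
#include <string>
#include <system_error>
#include <variant>

// Holds either a T or an error code, along the lines the note suggests.
template <class T>
using Value_or_error = std::variant<T, std::error_code>;

// Hypothetical function: parses a single decimal digit or reports why it could not.
inline Value_or_error<int> parse_digit(const std::string& s)
{
    if (s.size() == 1 && s[0] >= '0' && s[0] <= '9')
        return s[0] - '0';                                      // success: the int alternative
    return std::make_error_code(std::errc::invalid_argument);  // failure: the error alternative
}
```

A caller can test the result with `std::holds_alternative<std::error_code>(r)` before reading the value with `std::get<int>(r)`, which keeps the success and failure paths explicit in a way that an `int` status plus an out-parameter does not.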
##### Enforcement ##### Enforcement
This is a philosophical guideline that is infeasible to check directly and completely. * Output parameters should be replaced by return values.
However, many of the detailed rules (F.22-F.45) can be checked, such as passing a `const int&`, returning an `array<BigPOD>` by value, and returning a pointer to free store allocated by the function. An output parameter is one that the function writes to, invokes a non-`const` member function on, or passes on as a non-`const`.
### <a name="Rf-ptr"></a> F.22: Use `T*` or `owner<T*>` to designate a single object ### <a name="Rf-ptr"></a> F.22: Use `T*` or `owner<T*>` to designate a single object
@ -2531,59 +2621,6 @@ Note that pervasive use of `shared_ptr` has a cost (atomic operations on the `sh
(Not enforceable) This is a too complex pattern to reliably detect. (Not enforceable) This is a too complex pattern to reliably detect.
### <a name="Rf-T-multi"></a> F.41: Prefer to return tuples or structs instead of multiple out-parameters
##### Reason
A return value is self-documenting as an "output-only" value.
And yes, C++ does have multiple return values, by convention of using a `tuple`, with the extra convenience of `tie` at the call site.
##### Example
int f(const string& input, /*output only*/ string& output_data) // BAD: output-only parameter documented in a comment
{
// ...
output_data = something();
return status;
}
tuple<int, string> f(const string& input) // GOOD: self-documenting
{
    // ...
    return make_tuple(status, something());
}
In fact, C++98's standard library already used this convenient feature, because a `pair` is like a two-element `tuple`.
For example, given a `set<string> myset`, consider:
// C++98
result = myset.insert("Hello");
if (result.second) do_something_with(result.first); // workaround
With C++11 we can write this, putting the results directly in existing local variables:
Sometype iter; // default initialize if we haven't already
Someothertype success; // used these variables for some other purpose
tie(iter, success) = myset.insert("Hello"); // normal return value
if (success) do_something_with(iter);
With C++17 we may be able to write something like this, also declaring the variables:
auto { iter, success } = myset.insert("Hello");
if (success) do_something_with(iter);
**Exception**: For types like `string` and `vector` that carry additional capacity, it can sometimes be useful to treat it as in/out instead by using the "caller-allocated out" pattern, which is to pass an output-only object by reference to non-`const` so that when the callee writes to it the object can reuse any capacity or other resources that it already contains. This technique can dramatically reduce the number of allocations in a loop that repeatedly calls other functions to get string values, by using a single string object for the entire loop.
##### Note
In some cases it may be useful to return a specific, user-defined `Value_or_error` type along the lines of `variant<T, error_code>`, rather than using the generic `tuple`.
##### Enforcement
* Output parameters should be replaced by return values.
An output parameter is one that the function writes to, invokes a non-`const` member function on, or passes on as a non-`const`.
### <a name="Rf-return-ptr"></a> F.42: Return a `T*` to indicate a position (only) ### <a name="Rf-return-ptr"></a> F.42: Return a `T*` to indicate a position (only)
##### Reason ##### Reason
