Thanks for this post. It is always helpful to hear other perspectives on things we might take for granted everyone sees the same way.
Note though, you have your assert on the destination array prior to the assignment.
for (size_t i = 0; i < N; i++) {
    assert(x1[i] < RAND_MAX);
    x1[i] = x2[i];
}
Should that assert be after the assignment? Or perhaps on the source array?
I don’t think this affects the conclusion you are making. It just seems unusual to be checking the value against the threshold prior to overwriting it.
It was a typo, thank you.
It’s not free in debug mode.
In release mode, with NDEBUG defined, the assertion is just nothing, so that assertion is certainly free with NDEBUG.
As pointed out in the blog post, many systems release code with asserts active. If you write code for others, you cannot always know whether they will set NDEBUG when releasing their software. Compilers do not require NDEBUG to be set even when fully optimizing the binary.
NRK says:
One might object that you can choose to only enable assertions for the release version of your binary…
I think you meant to say “debug version” here, not release.
Compilers like GCC or clang (LLVM) do not deactivate asserts when compiling with optimizations.
This is indeed a bit unfortunate, but the way I deal with it is via making assertions opt-in, instead of opt-out. More concretely, you need to pass -DDEBUG to enable assertions, and they are disabled by default.
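A minimal sketch of what such an opt-in macro could look like (the name MY_ASSERT and the exact formatting are made up for illustration; the actual implementation may differ):

/* my_assert.h: assertions are opt-in, enabled only when -DDEBUG is passed. */
#include <stdio.h>
#include <stdlib.h>

#ifdef DEBUG
#define MY_ASSERT(expr) \
    do { \
        if (!(expr)) { \
            fprintf(stderr, "%s:%d: assertion failed: %s\n", __FILE__, __LINE__, #expr); \
            abort(); \
        } \
    } while (0)
#else
#define MY_ASSERT(expr) ((void)0) /* disabled by default */
#endif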
So having asserts in your library may disqualify it for some uses.
I believe this stems from a poor understanding of assertions. They exist to catch programming mistakes as early as possible instead of marching forward as if everything is OK. It’s hard to describe how effective this is during debugging (especially in early development where the code and requirements are changing rapidly).
But spreading asserts in performance critical code might be unwise.
Respectfully disagree. Because assertions are supposed to document “impossible conditions”, you can use that information to enable better optimization in release builds. For example, you can turn the assertions into something like if (!(EXPR)) __builtin_unreachable(); when DEBUG isn’t defined.
Here’s a very trivial example which shows that assertions (coupled with unreachable()) can improve performance by giving the compiler more information about “impossible situation” (while at the same time helping you catch bugs in debug builds if that “impossible” situation somehow is reached): https://godbolt.org/z/MPWhrhGxx
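A rough sketch of the macro described above (illustrative only, assuming GCC or Clang; the actual code may differ): it checks the condition in debug builds and becomes an optimizer hint otherwise.

#ifdef DEBUG
#include <assert.h>
#define ASSERT(expr) assert(expr) /* debug build: abort loudly on violation */
#else
#define ASSERT(expr) do { if (!(expr)) __builtin_unreachable(); } while (0) /* release: optimizer hint */
#endif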
Thanks. Everything you write is sensible.
I would nitpick that assume expressions and asserts are distinct. They certainly are in the C++ standard: https://godbolt.org/z/nchh5jv9v
In simdjson, we have SIMDJSON_ASSUME(COND) which is unrelated to asserts.
NRK says:
You’re correct, C23 also adds a distinct unreachable() macro in the <stddef.h> header.
And UBSan is capable of detecting if a program reaches an unreachable state, which is quite nice (https://godbolt.org/z/5b196jaTr). So this does weaken my argument about catching unreachable states in debug builds.
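For the record, a tiny example of the C23 macro in use (a sketch that assumes a C23 compiler; the function and values are hypothetical):

#include <stddef.h> /* C23: provides unreachable() */

/* Hypothetical helper: the caller guarantees x is 0, 1, or 2. */
int category(int x) {
    switch (x) {
    case 0: return 10;
    case 1: return 20;
    case 2: return 30;
    default: unreachable(); /* undefined behavior if reached; UBSan can flag it */
    }
}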
Benjamen R. Meyer says:
If your library causes my application to fail via assert you can guarantee that I’ll rip it out of the dependencies.
Instead of asserting, generate an error that can be handled; whether that is an Exception or an Error Code doesn’t matter (though libraries really shouldn’t let Exceptions escape them either, but that’s a different discussion).
All-in-all, the application needs to be stable regardless of input to function.
dontcrashmyapp
NRK says:
Instead of asserting, generate an error that can be handled
It all depends on where the input is coming from. If it’s from an uncontrolled source (such as a file) then it shouldn’t be asserted. But if the input is under the control of the programmer (e.g. an interface requires an argument to be a power-of-2 integer, let’s say for alignment purposes) then that’s a prime example of where assertions come in handy.
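(For illustration, that power-of-2 precondition can be asserted with the usual bit trick; the function below and its name are hypothetical, not from the post.)

#include <assert.h>
#include <stddef.h>

/* Hypothetical interface: align must be a nonzero power of 2. */
void *aligned_pool_alloc(size_t size, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0); /* programmer-controlled precondition */
    /* ... allocation logic elided in this sketch ... */
    (void)size;
    return NULL;
}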
If the caller has already proven himself to be buggy, what are the chances that the buggy caller is checking for error returns? Not high. And what are the chances that an assertion (in debug builds) will be ignored? Pretty much zero.
Assertions are not much different from ASan/UBSan/Valgrind in that they’re a debugging tool meant to check for programming mistakes so that they can be caught as early as possible.
the application needs to be stable regardless of input to function.
An application that produces incorrect results is anything but stable.
dontcrashmyapp
It won’t if the application is not buggy (or if you link against the release build with assertions compiled out).
M.W. says:
But if the input is under the control of the programmer (e.g. an interface requires an argument to be a power-of-2 integer, let’s say for alignment purposes) then that’s a prime example of where assertions come in handy.
My recent example of this was code ignoring the output of sscanf and then dereferencing a NULL pointer. That led to a data abort, which – as the stack pointer for this exception was not set – led to another data abort, this time with the source address gone (ARM). Tracking this down took a while, and the mind-bending debugging of sscanf will stay with me forever. All because of the “oh, this can never happen, so we don’t need to check” attitude.
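For what it’s worth, the fix for that failure mode is simply checking sscanf’s return value before trusting the parsed results; a minimal sketch (the function name and format string are made up):

#include <stdio.h>

/* Hypothetical parser: returns 0 on success, -1 on malformed input. */
int parse_pair(const char *line, int *x, int *y) {
    /* sscanf returns the number of items successfully converted. */
    if (sscanf(line, "%d %d", x, y) != 2) {
        return -1; /* report the problem instead of marching on with garbage */
    }
    return 0;
}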
Tobin Baker says:
I disagree: assertions should be used to enforce internal assumptions on state, not external API requirements. It should not be possible for a library client to trigger an assert due to a programming error in their code (though they might trigger an assert due to a programming error in the library itself). A library simply can’t make assumptions about the environment in which it’s used. In some environments it might be fine to crash with an informative message; in others the program must keep running (perhaps after reverting to a known-good state). Throwing exceptions or returning error codes leaves the decision to the client, where it belongs. (Note that the Erlang “let it crash” approach is actually about handling software errors, not expected faults, and is designed for “always-on” systems where a full crash is unacceptable.)
Antoine says:
Compilers like GCC or clang (LLVM) do not deactivate asserts when compiling with optimizations.
Hmm… ok, if you build the command line from scratch. Common practice is to include -DNDEBUG in release flags, and that’s what you get by default from CMake, for example.
One might object that you can choose to only enable assertions for the debug version of your binary… but this choice is subject to debate.
Why not have that debate? 🙂
My personal position is that if a check is important enough to keep in release mode, then your API should have proper error-return semantics, and the check should be turned into an error return instead of crashing out.
In other words: use asserts to check internal invariants in debug mode, when running your test suite (and perhaps a fuzzer of sorts); use error returns for conditions that can happen even if the code is right (for example bad user input, IO error, memory allocation failure…).
One strong argument in favor of that policy is that the code may be called from a higher level language. Crashing out for errors in Python, for example, makes users’ lives miserable.
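A small sketch of that split (the function and names below are hypothetical, not from the post): conditions that can occur even in correct code get an error return, while an internal invariant is only asserted.

#include <assert.h>
#include <stddef.h>

enum { OK = 0, ERR_BAD_INPUT = -1 };

/* Bad input can happen even when the code is right: report it. */
int sum_u8(const unsigned char *data, size_t len, unsigned long *out) {
    if (data == NULL || out == NULL) return ERR_BAD_INPUT;
    unsigned long total = 0;
    for (size_t i = 0; i < len; i++) total += data[i];
    /* Internal invariant (debug-only check): each byte contributes at most 255. */
    assert(total <= 255UL * (unsigned long)len);
    *out = total;
    return OK;
}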
Konstantin says:
The best and easiest mitigation to various design and performance problems with asserts is to write your own assert system. This gives you severity control (fine-tuning per-assert behavior), streaming, feedback, and many more things.
I always found the C assert a bit heavy-handed and lacking.
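A toy sketch of what per-assert severity could look like (names and behavior entirely illustrative; a real system would add streaming, rate limiting, and so on):

#include <stdio.h>
#include <stdlib.h>

enum severity { SEV_WARN, SEV_FATAL };

/* Checks the condition and either logs a warning or aborts, depending on severity. */
#define CHECK(sev, expr) \
    do { \
        if (!(expr)) { \
            fprintf(stderr, "[%s] %s:%d: check failed: %s\n", \
                    (sev) == SEV_FATAL ? "FATAL" : "WARN", __FILE__, __LINE__, #expr); \
            if ((sev) == SEV_FATAL) abort(); \
        } \
    } while (0)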
Donovan T Baarda says:
IMHO if your asserts are load-bearing, you are doing it wrong. If the program runs correctly with asserts turned on, it should run correctly with them turned off.
Using asserts liberally to check pre/post/invariant conditions is a simple way to do contract-programming, but they must be safe to turn off for non-debug builds.
Globules says:
You might be able to have two versions of a function, such as arrayCopyFast and arrayCopySafe. Users could call the fast version when they know their input is valid.
I’d call the first one arrayCopyUnsafe.
Yes. I agree with that approach and it is very sensible.
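To make the idea concrete, a minimal sketch of that two-function split, using the names discussed above (the signatures are made up for illustration):

#include <stddef.h>

/* Unchecked version: the caller promises dst and src are valid for n elements. */
void arrayCopyUnsafe(int *dst, const int *src, size_t n) {
    for (size_t i = 0; i < n; i++) dst[i] = src[i];
}

/* Checked version: validates its arguments and reports failure instead of crashing. */
int arrayCopySafe(int *dst, const int *src, size_t n) {
    if (dst == NULL || src == NULL) return -1;
    arrayCopyUnsafe(dst, src, n);
    return 0;
}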