Bounds checking overhead is negligible for all but the absolutely hottest code paths (fwiw we shipped active asserts, including bounds checking asserts in all the PC games I was involved with - carefully monitoring the overhead of course).
The main reason to not use the stdlib isn't so much about squeezing out the last bit of performance, but about control of what actually happens under the hood (and also compilation times). The overall runtime cost of all those active asserts (not just the range checks, everything) was somewhere in the 2..3% range, which is fine when budgeted for upfront.
I personally am more conservative on those things. I'll pick the fastest thing that is reliable.
We can all agree it's not medical systems, but audio DSP and game dev both end up rewriting a lot of STL stuff to suit their needs, and often using a restricted subset of modern C++ features for similar reasons.
That isn't some arbitrary choice, but pretty much where everyone continually ends up when solving real-time problems using C++. Whether those be games or not.