C++ proposal: There are 8 bits in a byte
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3477r0.html
1. bytes are 8 bits
2. shorts are 16 bits
3. ints are 32 bits
4. longs are 64 bits
5. arithmetic is 2's complement
6. IEEE floating point
and a big chunk of wasted time trying to abstract these away and getting it wrong anyway was saved. Millions of people cried out in relief!
Oh, and Unicode was the character set. Not EBCDIC, RADIX-50, etc.
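As a rough illustration, here's a minimal sketch of how codebases already pin these assumptions down at compile time (assuming a C++17 compiler; note the `long` line is exactly the one that bites, since it fails on LLP64 Windows):

```cpp
#include <climits>
#include <limits>

static_assert(CHAR_BIT == 8, "bytes are 8 bits");
static_assert(sizeof(short) == 2, "shorts are 16 bits");
static_assert(sizeof(int) == 4, "ints are 32 bits");
static_assert(sizeof(long) == 8, "longs are 64 bits");  // fails on LLP64 Windows, where long is 32 bits
static_assert(std::numeric_limits<float>::is_iec559, "IEEE 754 floats");
// Two's complement needs no check as of C++20, which guarantees it outright.
```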
https://thephd.dev/conformance-should-mean-something-fputc-a...
Me? I just dabble with documenting an unimplemented "50% more bits per byte than the competition!" 12-bit fantasy console of my own invention - replete with inventions such as "UTF-12" - for shits and giggles.
Honest question, I haven't followed closely. rand() is broken, I'm told unfixably so, and last I heard it still wasn't deprecated.
Is this proposal a test? "Can we even drop support for a solution to a problem literally nobody has?"
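For context, a sketch of what the standard points people to instead of rand() (C++11's <random>; the function name here is just illustrative):

```cpp
#include <random>

// Sketch: the C++11 replacement for rand() % n, which is biased and,
// in many historical implementations, has weak low-order bits.
int roll_die() {
    static std::mt19937 rng{std::random_device{}()};  // seeded once
    std::uniform_int_distribution<int> dist(1, 6);    // unbiased over [1, 6]
    return dist(rng);
}
```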
At the same time, `std::byte` has existed since C++17 anyway[1], though it's a distinct type with the same size and aliasing rules as `unsigned char`, not literally an alias for `char`.
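A quick sketch of what that relationship looks like in code (`std::byte` is a scoped enum over `unsigned char`, so the conversion has to be explicit):

```cpp
#include <cstddef>  // std::byte (C++17)

static_assert(sizeof(std::byte) == sizeof(unsigned char));

std::byte b{0x2A};
// unsigned char c = b;  // error: no implicit conversion, it's not an alias
unsigned char c = std::to_integer<unsigned char>(b);  // explicit conversion
```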
Yes, indexing strings of 6-bit FIELDATA characters was a huge headache. UNIVAC had the unfortunate problem of having to settle on a character code in the early 1960s, before ASCII was standardized. At the time, a military 6-bit character set looked like the next big thing. It was better than IBM's code, which mapped to punch card holes and the letters weren't all in one block.
[1] https://www.unisys.com/siteassets/collateral/info-sheets/inf...
So delegating such by-now very rare edge cases to non-standard C seems fine, i.e. IMHO it changes very little in practice.
C/C++ compilers are full of non-standard extensions anyway; it's not as if CHAR_BIT goes away, and an implementation could still, as a non-standard extension, define it as something other than 8.
Given that Wikipedia says UNIVAC was discontinued in 1986, I’m pretty sure the answer is no and no!
To my naive eye, it seems like moving to 10 bits per byte would be both logical and make learning the trade just a little bit easier?
Or are you suggesting increasing the size of a byte until it's the same size as a word, and merging the two concepts?
Specifically, has there even been a C++ compiler on a system where bytes weren't 8 bits? If so, when was it last updated?
Jean-Luc Picard
I would be amazed if there's any even remotely relevant code that deals meaningfully with CHAR_BIT != 8 these days.
(... and yes, it's about time.)
For some DSP-ish sorts of processors I think it doesn't make sense to have addressability at the char level, and the gates to support it would be better spent on better 16- and 32-bit multipliers. ::shrugs::
I feel kind of ambivalent about the standards proposal. We already have fixed-size types; if you want/need an exact width, that already exists. The non-fixed-size types set minimums and allow platforms to use larger sizes for performance reasons.
Having no fast 8-bit level access is a perfectly reasonable decision for a small DSP.
Might it be better instead to migrate many users of char to (u)int8_t?
The proposed alternative of requiring CHAR_BIT to be congruent to 0 mod 8 also sounds pretty reasonable: it captures the existing non-8-bit-char platforms, and also the justification for them (if you're not doing much string processing but instead doing all math processing, the additional hardware for efficient 8-bit access is a total waste).
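Under that alternative, portable code could still guard its assumptions cheaply; a minimal sketch:

```cpp
#include <climits>

// Sketch: accept any byte width that is a multiple of 8, per the
// CHAR_BIT % 8 == 0 alternative, but reject anything else up front.
#if CHAR_BIT % 8 != 0
#error "This code assumes CHAR_BIT is a multiple of 8"
#endif

// How many chars a 32-bit quantity occupies on this platform:
constexpr int chars_per_u32 = 32 / CHAR_BIT;  // 4 normally, 1 on a 32-bit-char DSP
```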
One fun fact I found the other day: ASCII is 7 bits, but when it was used with punch cards there was an 8th bit to make sure you didn't punch the wrong number of holes. https://rabbit.eng.miami.edu/info/ascii.html
https://man7.org/linux/man-pages/man3/fgetc.3.html
fgetc(3) and its companions always return character-by-character input as an int, because EOF is represented as -1 and an unsigned char can't represent EOF. If you store the return value in the wrong type, you'll never detect this condition.
However, if you don't receive an EOF, then it should be perfectly fine to cast the value to unsigned char without loss of precision.
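A minimal sketch of the canonical loop (the function name is just for illustration):

```cpp
#include <cstdio>

// Sketch: keep the result in an int until after the EOF check. Storing
// it in a char would make EOF (-1) collide with a legitimate 0xFF byte.
void process(std::FILE* stream) {
    int c;  // must hold every unsigned char value plus EOF
    while ((c = std::fgetc(stream)) != EOF) {
        unsigned char byte = static_cast<unsigned char>(c);  // safe after the check
        (void)byte;  // ... process byte ...
    }
}
```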