TITLE: practical concerns of signed versus unsigned characters

(Private correspondence, 23 Aug 99)

CLARKE: "Allan D. Clarke"
> TITLE: signed versus unsigned characters
>
> (Newsgroup: comp.lang.c++.moderated, 12 Aug 99)
>
> CLAMAGE: Steve Clamage
>
> If you want to store characters (as opposed to small positive
> integers), use "plain" char. That type will have the correct
> behavior for all character-oriented classes and functions,
> whereas type unsigned char (or signed char) will not always work.

TRIBBLE: david@tribble.com

As I pointed out to the ISO C9X committee, plain char and signed char
types cause problems with portable code. This is because of the nature
of sign-extension for character values with their high bits set.

    char ch = 0xA0;         // ISO-8859-1 nonbreaking space

    if (ch == 0xA0)         // Fails if plain char is signed
        ...
    if (ch == '\xA0')       // Fails if plain char is signed
        ...
    if (isupper(ch))        // Bug when plain char is signed
        ...

Personally, I've encountered far fewer of these kinds of problems by
using 'unsigned char', which does not suffer from sign-extension
surprises. The drawback is having to cast all my 'unsigned char'
buffer/string types to plain 'char*' when passing them to any of the
standard library functions (e.g., strcmp()).

In my not-so-humble opinion, 'char' should have been an unsigned
datatype from the very beginning. But I recognize the roots of C/C++,
i.e., the PDP-11, which apparently encouraged the use of signed
characters. I also recognize the fact that far too much code would
break if the signedness of 'char' were changed.

(See http://www.flash.net/~dtribble/text/cbug001.txt.)
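
The sign-extension effect described above can be sketched in C roughly as
follows. This is an illustration only, not code from the correspondence:
the helper names (promote_signed, promote_unsigned, matches_nbsp,
is_upper_byte) are hypothetical, and the sketch assumes 8-bit chars and
the usual two's-complement conversion of 0xA0 to a signed char, which the
language leaves implementation-defined.

```c
#include <ctype.h>

/* Hypothetical helper: the int value a signed char has after promotion.
 * A byte with its high bit set comes back negative (sign-extended). */
int promote_signed(signed char ch)
{
    return ch;
}

/* Hypothetical helper: the same byte held in an unsigned char
 * promotes to a non-negative int. */
int promote_unsigned(unsigned char ch)
{
    return ch;
}

/* Comparison that does not depend on the signedness of plain char:
 * convert to unsigned char before comparing against 0xA0. */
int matches_nbsp(char ch)
{
    return (unsigned char)ch == 0xA0;
}

/* The <ctype.h> functions expect a value representable as unsigned char
 * (or EOF), so convert before the call instead of passing a possibly
 * negative plain char. */
int is_upper_byte(char ch)
{
    return isupper((unsigned char)ch) != 0;
}
```

Under those assumptions, promote_signed((signed char)0xA0) yields a
negative value that can never equal the constant 0xA0, while
promote_unsigned(0xA0) yields 160. Converting to unsigned char at the
point of comparison, or before calling isupper(), sidesteps the problem
whichever way the compiler treats plain char.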