Do C and C++ guarantee the ASCII values of [a-f] and [A-F] characters?


I'm looking at the following code to test for a hexadecimal digit and convert it to an integer. The code is kind of clever in that it takes advantage of the fact that the difference between capital and lowercase letters is 32, and that's bit 5. So the code performs one extra OR, but saves one JMP and two CMPs.

    static const int bit_five = (1 << 5);
    static const char str[] = "0123456789ABCDEFabcdef";

    for (unsigned int i = 0; i < countof(str); i++) {
        int digit, ch = str[i];

        if (ch >= '0' && ch <= '9')
            digit = ch - '0';
        else if ((ch |= bit_five) >= 'a' && ch <= 'f')
            digit = ch - 'a' + 10;
        ...
    }

Do C and C++ guarantee the ASCII values, or any values, of the [a-f] and [A-F] characters? Here, "guarantee" means that the upper and lower character sets will always differ by a constant value that can be represented by a single bit (for the trick above). If not, what does the standard say about them?

(Sorry for the C and C++ tag. I'm interested in both languages' position on the subject.)

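There are no guarantees about the particular values, but you shouldn't care, because your software will probably never encounter a system that is not compatible with ASCII in this way. Assume that space is always 32 and that A is always 65; this works fine in the modern world.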
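If you do rely on those values, one option is to state the assumption so the compiler can check it. The following is only a sketch, assuming a C11 compiler (for `static_assert` from `<assert.h>`); the helper name `hex_value_ascii` is made up for illustration and is not part of the code in the question.

```c
#include <assert.h>

/* Fail at compile time, rather than misbehave at run time, if the
   ASCII assumptions do not hold for this compiler's character set. */
static_assert(' ' == 32, "space is not 32");
static_assert('A' == 65 && 'a' == 97, "letters are not at their ASCII positions");
static_assert(('a' ^ 'A') == (1 << 5), "upper and lower case do not differ by bit 5");

/* hex_value_ascii is a hypothetical helper name.
   Returns 0..15 for a hexadecimal digit, or -1 for anything else. */
int hex_value_ascii(int ch)
{
    if (ch >= '0' && ch <= '9')
        return ch - '0';
    ch |= 1 << 5;                 /* fold 'A'..'F' onto 'a'..'f' */
    if (ch >= 'a' && ch <= 'f')
        return ch - 'a' + 10;
    return -1;
}
```

On a platform where any of the assertions is false, the build stops with the corresponding message instead of silently producing wrong digits.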

The C standard only guarantees that the letters A-Z and a-z exist and that they fit within a single byte.

It does guarantee that 0-9 are sequential:

"In both the source and execution basic character sets, the value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous."
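For comparison, here is a sketch of a conversion that relies only on the guarantee quoted above: digits are handled by subtraction, and the letters are enumerated so that nothing is assumed about their values. The name `hex_value_portable` is hypothetical.

```c
/* hex_value_portable is a hypothetical helper name.
   Returns 0..15 for a hexadecimal digit, or -1 for anything else. */
int hex_value_portable(int ch)
{
    /* The standard guarantees that '0'..'9' are consecutive. */
    if (ch >= '0' && ch <= '9')
        return ch - '0';

    /* No such guarantee exists for letters, so list them explicitly. */
    switch (ch) {
    case 'a': case 'A': return 10;
    case 'b': case 'B': return 11;
    case 'c': case 'C': return 12;
    case 'd': case 'D': return 13;
    case 'e': case 'E': return 14;
    case 'f': case 'F': return 15;
    default:            return -1;
    }
}
```

This gives up the single-OR trick from the question in exchange for not depending on the character encoding at all.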

Justification

There are a lot of character encodings out in the world. If you care about portability, you can either make your program portable to different character sets, or you can choose one character set to use everywhere (e.g. Unicode). I'll go ahead and loosely categorize most existing character encodings for you:

  1. Single-byte character encodings compatible with ISO/IEC 646. Digits 0-9 and letters A-Z and a-z always occupy the same positions.

  2. Multibyte character encodings (Big5, Shift JIS, ISO 2022-based). In these encodings, your program is probably already broken and you'll need to spend time fixing it if you care. However, parsing numbers will still work as expected.

  3. Unicode encodings. Digits 0-9 and letters A-Z, a-z always occupy the same positions. You can work with either code points or code units freely and you will get the same result, as long as you are working with code points below 128 (which you are). (Are you working with UTF-7? No, you should only use that for email.)

  4. EBCDIC. Digits and letters are assigned values different from their values in ASCII; however, 0-9 and A-F, a-f are still contiguous. Even then, the chance that your code will run on an EBCDIC system is essentially zero.
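If the EBCDIC case in the list above matters to you, the case bit can be derived from the character constants instead of hard-coded. This is a sketch under stated assumptions: a C11 compiler, a character set in which a-f and A-F are contiguous, and upper and lower case differing in exactly one bit (0x20 in ASCII, 0x40 in EBCDIC, where 'a' is 0x81 and 'A' is 0xC1). The names `CASE_BIT`, `FOLDED_A` and `hex_value_casebit` are hypothetical.

```c
#include <assert.h>

enum {
    /* The bit separating 'a' from 'A' in the execution character set:
       0x20 for ASCII, 0x40 for EBCDIC. */
    CASE_BIT = 'a' ^ 'A',
    /* Whichever of 'a' and 'A' has CASE_BIT set:
       'a' in ASCII, 'A' in EBCDIC. */
    FOLDED_A = 'a' | 'A'
};

/* Assumptions, checked at compile time. */
static_assert(CASE_BIT != 0 && (CASE_BIT & (CASE_BIT - 1)) == 0,
              "upper and lower case do not differ by exactly one bit");
static_assert('f' - 'a' == 5 && 'F' - 'A' == 5,
              "a-f or A-F is not contiguous");
static_assert(('f' ^ 'F') == CASE_BIT,
              "case bit is not the same across a-f");

/* hex_value_casebit is a hypothetical helper name.
   Returns 0..15 for a hexadecimal digit, or -1 for anything else. */
int hex_value_casebit(int ch)
{
    if (ch >= '0' && ch <= '9')
        return ch - '0';
    ch |= CASE_BIT;               /* fold both cases onto one range */
    if (ch >= FOLDED_A && ch <= FOLDED_A + 5)
        return ch - FOLDED_A + 10;
    return -1;
}
```

The single OR from the question is kept, and the compile-time checks reject character sets where it would not be valid.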

So the question here is: do you think that a hypothetical fifth option will be invented in the future, somehow less compatible with or more difficult to use than Unicode?

Do you care about EBCDIC?

We could dream up bizarre systems all day... suppose CHAR_BIT is 11, or sizeof(long) = 100, or suppose we use one's complement arithmetic, or malloc() always returns NULL, or suppose the pixels on your monitor are arranged in a hexagonal grid. Suppose your floating-point numbers aren't IEEE 754, suppose all of your data pointers are different sizes. At the end of the day, this does not get us closer to our goals of writing working software on actual modern systems (with the occasional exception).

