An MIT/GNU Scheme character consists of a code part and a bucky bits part. The MIT/GNU Scheme set of characters can represent more characters than ASCII can; it includes characters with Super and Hyper bucky bits, as well as Control and Meta. Every ASCII character corresponds to some MIT/GNU Scheme character, but not vice versa.[1]
MIT/GNU Scheme uses a 21-bit character code with 4 bucky bits. The character code contains the Unicode scalar value for the character. This is a change from earlier versions of the system, which used the ISO-8859-1 scalar value, but it is upwards compatible with previous usage, since ISO-8859-1 is a proper subset of Unicode.
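The sizes implied by this layout can be checked with plain Scheme arithmetic. The following is only a sketch; the names code-width, bits-width, code-values, bits-values, and max-scalar-value are hypothetical and are not part of MIT/GNU Scheme.

```scheme
;; Hypothetical names, introduced only for this illustration.
(define code-width 21)   ; bits in the character code
(define bits-width 4)    ; number of bucky bits

;; Number of distinct values each field can hold.
(define code-values (expt 2 code-width))   ; 2097152, i.e. #x200000
(define bits-values (expt 2 bits-width))   ; 16

;; The largest Unicode scalar value fits in the 21-bit code field.
(define max-scalar-value #x10FFFF)
(< max-scalar-value code-values)           ; => #t
```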
procedure: make-char code bucky-bits
Builds a character from code and bucky-bits. Both code and bucky-bits must be exact non-negative integers in the appropriate range. Use char-code and char-bits to extract the code and bucky bits from the character. If 0 is specified for bucky-bits, make-char produces an ordinary character; otherwise, the appropriate bits are turned on as follows:

    1    Meta
    2    Control
    4    Super
    8    Hyper

For example,

    (make-char 97 0)  =>  #\a
    (make-char 97 1)  =>  #\M-a
    (make-char 97 2)  =>  #\C-a
    (make-char 97 3)  =>  #\C-M-a
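Because each bucky bit is a distinct power of two, a bucky-bits value is the sum of the bits to turn on, and it can be decoded again with integer arithmetic. The sketch below assumes nothing beyond standard Scheme; the constant names and the helper bucky-bits->names are hypothetical, not MIT/GNU Scheme procedures.

```scheme
;; Hypothetical constant names for the bit values listed above.
(define bucky-meta    1)
(define bucky-control 2)
(define bucky-super   4)
(define bucky-hyper   8)

;; Decode a bucky-bits integer into a list of modifier names.
(define (bucky-bits->names bits)
  (let loop ((masks (list (cons bucky-meta 'meta)
                          (cons bucky-control 'control)
                          (cons bucky-super 'super)
                          (cons bucky-hyper 'hyper)))
             (names '()))
    (cond ((null? masks) (reverse names))
          ;; A bit is set when the quotient by its mask is odd.
          ((odd? (quotient bits (car (car masks))))
           (loop (cdr masks) (cons (cdr (car masks)) names)))
          (else (loop (cdr masks) names)))))

(bucky-bits->names (+ bucky-meta bucky-control))  ; => (meta control)
(bucky-bits->names 0)                             ; => ()
(bucky-bits->names 12)                            ; => (super hyper)
```

With these constants, (make-char 97 (+ bucky-meta bucky-control)) is the same call as (make-char 97 3).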
procedure: char-bits char
Returns the exact integer representation of char's bucky bits. For example,

    (char-bits #\a)      =>  0
    (char-bits #\m-a)    =>  1
    (char-bits #\c-a)    =>  2
    (char-bits #\c-m-a)  =>  3
procedure: char-code char
Returns the character code of char, an exact integer. For example,

    (char-code #\a)    =>  97
    (char-code #\c-a)  =>  97

Note that in MIT/GNU Scheme, the value of char-code is the Unicode scalar value for char.
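Since char-code and char-bits extract exactly the two parts that make-char combines, a character can be taken apart and rebuilt. The following sketch is specific to MIT/GNU Scheme; rebuild-char and meta-a are hypothetical names used only for illustration.

```scheme
;; Rebuild a character from its code part and bucky bits part.
(define (rebuild-char char)
  (make-char (char-code char) (char-bits char)))

;; Code 97 with the Meta bit set.
(define meta-a (make-char 97 1))

(char-code meta-a)                      ; => 97
(char-bits meta-a)                      ; => 1
(char=? meta-a (rebuild-char meta-a))   ; expected to hold
```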
variable: char-code-limit
variable: char-bits-limit
These variables define the (exclusive) upper limits for the character code and bucky bits, respectively. The character code and bucky bits are always exact non-negative integers, and each is strictly less than the value of its limit variable.
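A range check against these limits might look like the sketch below. It assumes the two limit variables are bound under the names char-code-limit and char-bits-limit, as in the MIT/GNU Scheme reference manual; the predicates exact-nonneg-int? and code-and-bits-in-range? are hypothetical helpers. Passing the check is necessary but may not be sufficient for make-char to accept the arguments, because not every integer below the code limit is a valid Unicode scalar value.

```scheme
;; Hypothetical helper: exact non-negative integer test.
(define (exact-nonneg-int? x)
  (and (integer? x) (exact? x) (>= x 0)))

;; Hypothetical predicate: do CODE and BITS fall below the
;; exclusive upper limits given by the two limit variables?
(define (code-and-bits-in-range? code bits)
  (and (exact-nonneg-int? code)
       (exact-nonneg-int? bits)
       (< code char-code-limit)
       (< bits char-bits-limit)))

(code-and-bits-in-range? 97 0)                ; => #t
(code-and-bits-in-range? 97 char-bits-limit)  ; => #f (limit is exclusive)
(code-and-bits-in-range? -1 0)                ; => #f
```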
procedure: char->integer char
procedure: integer->char k
char->integer returns the character code representation for char. integer->char returns the character whose character code representation is k.

In MIT/GNU Scheme, if (char-ascii? char) is true, then

    (eqv? (char->ascii char) (char->integer char))

However, this behavior is not required by the Scheme standard, and code that depends on it is not portable to other implementations.
These procedures implement order isomorphisms between the set of characters under the char<=? ordering and some subset of the integers under the <= ordering. That is, if

    (char<=? a b)  =>  #t
    (<= x y)       =>  #t

and x and y are in the range of char->integer, then

    (<= (char->integer a) (char->integer b))       =>  #t
    (char<=? (integer->char x) (integer->char y))  =>  #t

In MIT/GNU Scheme, the specific relationship implemented by these procedures is as follows:

    (define (char->integer c)
      (+ (* (char-bits c) #x200000)
         (char-code c)))

    (define (integer->char n)
      (make-char (remainder n #x200000)
                 (quotient n #x200000)))

This implies that char->integer and char-code produce identical results for characters that have no bucky bits set, and that characters are ordered according to their Unicode scalar values.

Note: If the argument to char->integer or integer->char is a constant, the compiler will constant-fold the call, replacing it with the corresponding result. This is a very useful way to denote unusual character constants or ASCII codes.
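The packing arithmetic in the definitions above can be tried on plain integers, without constructing any characters. This is an integer-only model in standard Scheme; the names code-span, pack-char-integer, unpack-code, and unpack-bits are hypothetical.

```scheme
;; 2^21: one past the largest value of a 21-bit character code.
(define code-span #x200000)

;; Mirrors the arithmetic of char->integer shown above.
(define (pack-char-integer code bits)
  (+ (* bits code-span) code))

;; Mirror the two arguments integer->char passes to make-char.
(define (unpack-code n) (remainder n code-span))
(define (unpack-bits n) (quotient n code-span))

(pack-char-integer 97 0)   ; => 97       (no bucky bits)
(pack-char-integer 97 1)   ; => 2097249  (Meta bit set)
(pack-char-integer 97 2)   ; => 4194401  (Control bit set)
(unpack-code 2097249)      ; => 97
(unpack-bits 2097249)      ; => 1
```

In this model the bucky bits occupy the high-order positions, so every value with no bucky bits set is smaller than every value with at least one bucky bit set.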
variable: char-integer-limit
The range of char->integer is defined to be the exact non-negative integers that are strictly less than the value of this variable. Note, however, that there are some holes in this range, because the character code must be a valid Unicode scalar value.
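One source of such holes is the Unicode surrogate range: by Unicode's definition, scalar values are the integers from 0 through #x10FFFF excluding #xD800 through #xDFFF. The predicate below is a sketch in standard Scheme under that definition; unicode-scalar-value? is a hypothetical name, and whether MIT/GNU Scheme excludes any further values is not stated here.

```scheme
;; Hypothetical predicate: is N a Unicode scalar value?
(define (unicode-scalar-value? n)
  (and (integer? n)
       (exact? n)
       (<= 0 n #x10FFFF)
       ;; Surrogate code points are not scalar values.
       (not (<= #xD800 n #xDFFF))))

(unicode-scalar-value? 97)        ; => #t
(unicode-scalar-value? #xD800)    ; => #f (surrogate)
(unicode-scalar-value? #x10FFFF)  ; => #t
(unicode-scalar-value? #x110000)  ; => #f (beyond the Unicode range)
```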
[1] Note that the Control bucky bit is different from the ASCII control key. This means that #\SOH (ASCII ctrl-A) is different from #\C-A. In fact, the Control bucky bit is completely orthogonal to the ASCII control key, making possible such characters as #\C-SOH.
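The distinction can be seen by building the three characters from their parts with make-char. This is a sketch specific to MIT/GNU Scheme; it relies only on ASCII assigning code 1 to SOH and code 65 to A, and the variable names are hypothetical.

```scheme
;; ASCII ctrl-A (SOH): code 1, no bucky bits.
(define soh (make-char 1 0))

;; The letter A (code 65) with the Control bucky bit (2) set.
(define control-a (make-char 65 2))

;; SOH with the Control bucky bit set as well.
(define control-soh (make-char 1 2))

(char-code soh)              ; => 1
(char-bits soh)              ; => 0
(char-code control-a)        ; => 65
(char-bits control-a)        ; => 2
(char-code control-soh)      ; => 1
(char-bits control-soh)      ; => 2
(char=? soh control-a)       ; expected to be #f
```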