14.4 Input Procedures

This section describes the procedures that read input. Input procedures can read either from the current input port or from a given port. Remember that to read from a file, you must first open a port to the file.

Input ports can be divided into two types, called interactive and non-interactive. Interactive input ports are ports that read input from a source that is time-dependent; for example, a port that reads input from a terminal or from another program. Non-interactive input ports read input from a time-independent source, such as an ordinary file or a character string.

All optional arguments called input-port, if not supplied, default to the current input port.

— procedure: read-char [input-port]

Returns the next character available from input-port, updating input-port to point to the following character. If no more characters are available, an end-of-file object is returned.

In MIT/GNU Scheme, if input-port is an interactive input port and no characters are immediately available, read-char will hang waiting for input, even if the port is in non-blocking mode.
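
As a small illustration, the following sketch reads every remaining character from a port with read-char and collects the characters in a list. The helper name port->char-list is hypothetical, and open-input-string (see String Ports) is used only so that the example does not depend on a file.

```scheme
;; Hypothetical helper: collect all remaining characters of PORT in a list.
(define (port->char-list port)
  (let loop ((chars '()))
    (let ((char (read-char port)))
      (if (eof-object? char)
          (reverse chars)                 ; end of file: return what was read
          (loop (cons char chars))))))

(port->char-list (open-input-string "abc"))
;; => (#\a #\b #\c)
```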

— procedure: peek-char [input-port]

Returns the next character available from input-port, without updating input-port to point to the following character. If no more characters are available, an end-of-file object is returned.[1]

In MIT/GNU Scheme, if input-port is an interactive input port and no characters are immediately available, peek-char will hang waiting for input, even if the port is in non-blocking mode.
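
A typical use of peek-char is to inspect the next character before deciding whether to consume it. The sketch below, with the hypothetical helper skip-whitespace, discards leading whitespace and leaves the first non-whitespace character unread; it assumes a string port purely for demonstration.

```scheme
;; Hypothetical helper: discard leading whitespace on PORT, leaving the
;; first non-whitespace character (if any) unread.
(define (skip-whitespace port)
  (let loop ()
    (let ((char (peek-char port)))
      (if (and (not (eof-object? char))
               (char-whitespace? char))
          (begin
            (read-char port)              ; consume the whitespace character
            (loop))))))

(define port (open-input-string "   x"))
(skip-whitespace port)
(read-char port)
;; => #\x
```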

— procedure: char-ready? [input-port]

Returns #t if a character is ready on input-port and returns #f otherwise. If char-ready? returns #t then the next read-char operation on input-port is guaranteed not to hang. If input-port is a file port at end of file then char-ready? returns #t.[2]

— procedure: read [input-port [environment]]

Converts external representations of Scheme objects into the objects themselves. read returns the next object parsable from input-port, updating input-port to point to the first character past the end of the written representation of the object. If an end of file is encountered in the input before any characters are found that can begin an object, read returns an end-of-file object. The input-port remains open, and further attempts to read will also return an end-of-file object. If an end of file is encountered after the beginning of an object's written representation, but the written representation is incomplete and therefore not parsable, an error is signalled.

Environment is used to look up the values of control variables such as *parser-radix*. If not supplied, it defaults to the REP environment.
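
The behavior described above can be sketched with a loop that calls read until it returns an end-of-file object. The helper name read-all is hypothetical, and the input comes from a string port only to keep the example self-contained.

```scheme
;; Hypothetical helper: read every object from PORT and return them in order.
(define (read-all port)
  (let loop ((objects '()))
    (let ((object (read port)))
      (if (eof-object? object)
          (reverse objects)               ; no more objects to parse
          (loop (cons object objects))))))

(read-all (open-input-string "(a b) 42 \"hi\""))
;; => ((a b) 42 "hi")
```

Because the port stays open after end of file, a further call to read on the same port would return an end-of-file object again rather than signalling an error.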

— procedure: eof-object? object

Returns #t if object is an end-of-file object; otherwise returns #f.

— procedure: read-char-no-hang [input-port]

If input-port can deliver a character without blocking, this procedure acts exactly like read-char, immediately returning that character. Otherwise, #f is returned, unless input-port is a file port at end of file, in which case an end-of-file object is returned. In no case will this procedure block waiting for input.
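
One possible use of read-char-no-hang is to drain whatever input is currently available without ever waiting. The following is a minimal sketch; the helper name read-available is hypothetical, and the string port stands in for an interactive port.

```scheme
;; Hypothetical helper: return a string of the characters that PORT can
;; deliver right now, never blocking.
(define (read-available port)
  (let loop ((chars '()))
    (let ((char (read-char-no-hang port)))
      (if (or (not char)                  ; #f: nothing available right now
              (eof-object? char))         ; end of file
          (list->string (reverse chars))
          (loop (cons char chars))))))

(read-available (open-input-string "ready"))
;; => "ready"
```

The loop treats #f and the end-of-file object the same way, so a caller that needs to distinguish "no input yet" from "no more input" would have to return that information separately.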

— procedure: read-string char-set [input-port]

Reads characters from input-port until it finds a terminating character that is a member of char-set (see Character Sets) or encounters end of file. The port is updated to point to the terminating character, or to end of file if no terminating character was found. read-string returns the characters, up to but excluding the terminating character, as a newly allocated string.

This procedure ignores the blocking mode of the port, blocking unconditionally until it sees either a delimiter or end of file. If end of file is encountered before any characters are read, an end-of-file object is returned.

On many input ports, this operation is significantly faster than the following equivalent code using peek-char and read-char:

          (define (read-string char-set input-port)
            (let ((char (peek-char input-port)))
              (if (eof-object? char)
                  char
                  (list->string
                   (let loop ((char char))
                     (if (or (eof-object? char)
                             (char-set-member? char-set char))
                         '()
                         (begin
                           (read-char input-port)
                           (cons char
                                 (loop (peek-char input-port))))))))))
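
As a usage sketch, read-string can split input into delimited fields. The helper name read-fields is hypothetical, and the code assumes the read-string signature documented in this section, in which the first argument is a character set.

```scheme
;; Hypothetical helper: split the contents of PORT into fields separated by
;; the character DELIMITER.
(define (read-fields port delimiter)
  (let ((delimiters (char-set delimiter)))
    (let loop ((fields '()))
      (let ((field (read-string delimiters port)))
        (if (eof-object? field)
            (reverse fields)
            (begin
              (read-char port)            ; skip the delimiter, if present
              (loop (cons field fields))))))))

(read-fields (open-input-string "a,bc,def") #\,)
;; => ("a" "bc" "def")
```

Since read-string leaves the port pointing at the terminating character, the explicit read-char is what advances past each delimiter.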
     
— procedure: read-line [input-port]

read-line reads a single line of text from input-port, and returns that line as a newly allocated string. The #\newline terminating the line, if any, is discarded and does not appear in the returned string.

This procedure ignores the blocking mode of the port, blocking unconditionally until it has read an entire line. If end of file is encountered before any characters are read, an end-of-file object is returned.
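
For example, a loop over read-line can collect the lines of a port into a list. This is only a sketch; the helper name read-lines is hypothetical.

```scheme
;; Hypothetical helper: return the lines of PORT as a list of strings.
(define (read-lines port)
  (let loop ((lines '()))
    (let ((line (read-line port)))
      (if (eof-object? line)
          (reverse lines)
          (loop (cons line lines))))))

(read-lines (open-input-string "first\nsecond\nthird"))
;; => ("first" "second" "third")
```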

— procedure: read-string! string [input-port]
— procedure: read-substring! string start end [input-port]

read-string! and read-substring! fill the specified region of string with characters read from input-port until the region is full or else there are no more characters available from the port. For read-string!, the region is all of string, and for read-substring!, the region is that part of string specified by start and end.

The returned value is the number of characters filled into the region. However, there are several interesting cases to consider:

read-string! and read-substring! are important because they are both flexible and extremely fast, especially for large amounts of data.
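
The following sketch shows both procedures filling caller-supplied buffers from a string port. The variable names are illustrative only, and the code assumes the argument order given in the procedure headings above.

```scheme
(define port (open-input-string "hello world"))

;; Fill an entire 5-character buffer.
(define buffer (make-string 5 #\space))
(define filled (read-string! buffer port))
;; filled => 5, buffer => "hello"

;; Fill only positions 0 (inclusive) to 4 (exclusive) of a larger buffer.
(define tail (make-string 10 #\.))
(define filled-tail (read-substring! tail 0 4 port))
;; filled-tail => 4, (substring tail 0 4) => " wor"
```

Reusing one preallocated buffer across many calls avoids allocating a new string for each chunk, which is one reason these procedures suit large inputs.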

The following variables may be bound or assigned to change the behavior of the read procedure. They are looked up in the environment that is passed to read, and so may have different values in different environments. It is recommended that the global bindings of these variables be left unchanged; make local changes by shadowing the global bindings in nested environments.

— variable: *parser-radix*

This variable defines the radix used by the reader when it parses numbers. This is similar to passing a radix argument to string->number. The value of this variable must be one of 2, 8, 10, or 16; any other value is ignored, and the reader uses radix 10.

Note that much of the number syntax is invalid for radixes other than 10. The reader detects cases where such invalid syntax is used and signals an error. However, problems can still occur when *parser-radix* is set to 16, because syntax that normally denotes symbols can now denote numbers (e.g. abc). Because of this, it is usually undesirable to set this variable to anything other than the default.

The default value of this variable is 10.
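
One way to follow the advice about shadowing is to create a child environment that binds *parser-radix* and pass it to read. The sketch below is an assumption-laden illustration, not a recommended pattern: hex-environment and read-hex-from-string are hypothetical names, and the use of extend-top-level-environment (see Top-level Environments) to create the binding is one possible approach. With this binding in effect, a token such as ff should be parsed as a hexadecimal number rather than as a symbol.

```scheme
;; Child of the global environment in which *parser-radix* is shadowed.
(define hex-environment
  (extend-top-level-environment system-global-environment
                                '(*parser-radix*)
                                '(16)))

;; Hypothetical helper: read one object from STRING using radix 16.
(define (read-hex-from-string string)
  (read (open-input-string string) hex-environment))

(read-hex-from-string "ff")
```

The global binding of *parser-radix* is left untouched, so calls to read that do not pass hex-environment continue to use radix 10.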

— variable: *parser-canonicalize-symbols?*

This variable controls how the parser handles case-sensitivity of symbols. If it is bound to its default value of #t, symbols read by the parser are converted to lower case before being interned. Otherwise, symbols are interned without case conversion.

In general, it is a bad idea to set this variable to #f: doing so doesn't really make Scheme case-sensitive, and it can break features of the Scheme runtime that depend on case-insensitive symbols.


Footnotes

[1] The value returned by a call to peek-char is the same as the value that would have been returned by a call to read-char on the same port. The only difference is that the very next call to read-char or peek-char on that input-port will return the value returned by the preceding call to peek-char. In particular, a call to peek-char on an interactive port will hang waiting for input whenever a call to read-char would have hung.

[2] char-ready? exists to make it possible for a program to accept characters from interactive ports without getting stuck waiting for input. Any input editors associated with such ports must make sure that characters whose existence has been asserted by char-ready? cannot be rubbed out. If char-ready? were to return #f at end of file, a port at end of file would be indistinguishable from an interactive port that has no ready characters.