In the C family of programming languages, a trigraph is a sequence of three characters, the first two of which are both question marks, that represents a single character.
The reason for their existence is that the basic character set of C (a subset of the ASCII character set) includes nine characters which lie outside the ISO 646 invariant character set. This can pose a problem for writing source code when the keyboard being used does not support any of these nine characters. The ANSI C committee invented trigraphs as a way of entering source code using keyboards that support any version of the ISO 646 character set.
Trigraphs might also be used for some EBCDIC code pages that lack characters such as { and }.
Trigraphs are not commonly encountered outside compiler test suites. Some compilers support an option to turn recognition of trigraphs off, or disable trigraphs by default and require an option to turn them on. Some can issue warnings when they encounter trigraphs in source files. Borland supplied a separate program, the trigraph preprocessor, to be used only when trigraph processing is desired (the rationale was to maximise speed of compilation).
The C preprocessor replaces all occurrences of the following nine trigraph sequences by their single-character equivalents before any other processing.
Trigraph Equivalent
??= #
??/ \
??' ^
??( [
??) ]
??! |
??< {
??> }
??- ~
Note that ??? is not a trigraph sequence.
Note also that the problematic characters are nevertheless required to exist within the implementation, in both the source and execution character sets.
The reason for their existence is that the basic character set of C (a subset of the ASCII character set) includes nine characters which lie outside the ISO 646 invariant character set. This can pose a problem for writing source code when the keyboard being used does not support any of these nine characters. The ANSI C committee invented trigraphs as a way of entering source code using keyboards that support any version of the ISO 646 character set.
Trigraphs might also be used for some EBCDIC code pages that lack characters such as { and }.
Trigraphs are not commonly encountered outside compiler test suites. Some compilers support an option to turn recognition of trigraphs off, or disable trigraphs by default and require an option to turn them on. Some can issue warnings when they encounter trigraphs in source files. Borland supplied a separate program, the trigraph preprocessor, to be used only when trigraph processing is desired (the rationale was to maximise speed of compilation).
The C preprocessor replaces all occurrences of the following nine trigraph sequences by their single-character equivalents before any other processing.
Trigraph Equivalent
??= #
??/ \
??' ^
??( [
??) ]
??! |
??< {
??> }
??- ~
Note that ??? is not a trigraph sequence.
Note also that the problematic characters are nevertheless required to exist within the implementation, in both the source and execution character sets.
No comments:
Post a Comment