Talk:C preprocessor
This is the talk page for discussing improvements to the C preprocessor article. This is not a forum for general discussion of the article's subject.
do-while-0 and if-1-else
The standard trick of writing macro statements in a do loop is especially important when writing "wrapper" macros that include their arguments as statements:
#define WITH_FOO(stmts) \
do { \
int __ofoo=foo; \
foo=1; \
stmts; \
foo=__ofoo; \
} while (0)
The stmts; means that the provided code can end with or without a semicolon (although commas are still problematic!), and the overall structure is always one statement (for safety in putting calls to it in control structures) and expects a trailing semicolon (so that it looks like a function call; automatic indentation will like it better). The braces also mean that an if inside the argument cannot attach to an else following the macro call. But there's another option:
#define WITH_FOO(stmts) \
if(1) { \
int __ofoo=foo; \
foo=1; \
stmts; \
foo=__ofoo; \
} else
This is very similar, but has the improvement of being transparent to break and continue. Disadvantages are that the else is capable of binding to any following statement, and that some compilers may issue warnings about the empty else or about the use of if-if-else-else without braces
- There is NO compiler that will complain about an embedded if-else construct without braces, which has been very standard since the earliest incarnations of C and still is in all modern C and C++ compilers. Such warnings would be really harmful if they were enabled by default (but some projects set stricter lint-like verifications). Such a warning would be extremely pedantic, and used only for debugging the source, in a specific precompilation rule used to find the location of missing braces, where a lot of warnings will be expected by the programmer... verdy_p (talk) 20:09, 28 November 2009 (UTC)
in constructs like these:
if(need_foo) WITH_FOO(foofunc());
else barfunc();
I suppose that there is a healthy debate on the subject. When you're writing a multiline but non-wrapper macro, you know if there are loop control statements in the loop, and you should prefer do-while-0 in their absence. Is there a consensus otherwise? Should we include both in the article? Are there well-known sources for both styles? --Tardis (talk) 00:52, 13 March 2009 (UTC)
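A minimal sketch of the break difference described above, assuming a hypothetical global foo and using saved_foo in place of __ofoo (identifiers containing a double underscore are reserved). The names WITH_FOO_DO, WITH_FOO_IF and the two counting functions are made up for illustration only.

```c
static int foo = 0;   /* hypothetical global that the wrapper temporarily sets */

/* do-while(0) wrapper: a break inside stmts only leaves the wrapper's own loop. */
#define WITH_FOO_DO(stmts)   \
    do {                     \
        int saved_foo = foo; \
        foo = 1;             \
        stmts;               \
        foo = saved_foo;     \
    } while (0)

/* if(1)-else wrapper: a break inside stmts reaches the caller's loop. */
#define WITH_FOO_IF(stmts)   \
    if (1) {                 \
        int saved_foo = foo; \
        foo = 1;             \
        stmts;               \
        foo = saved_foo;     \
    } else

/* Count the iterations of a 5-step loop whose body tries to break at i == 2. */
static int iterations_with_do(void)
{
    int count = 0;
    foo = 0;
    for (int i = 0; i < 5; i++) {
        count++;
        WITH_FOO_DO(if (i == 2) break);   /* break only ends the do-while */
    }
    return count;
}

static int iterations_with_if(void)
{
    int count = 0;
    foo = 0;
    for (int i = 0; i < 5; i++) {
        count++;
        WITH_FOO_IF(if (i == 2) break);   /* break ends the for loop */
    }
    return count;
}
```

In both variants the break jumps past the line that restores foo, so neither wrapper cleans up after an early exit; the only difference is which loop the break leaves.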
- There is the possibility to insert a do-nothing statement in the else clause (such as "if(1){ ... } else ((void)0)" to discard the value explicitly, but not "if(1){ ... } else 0", which almost always generates a warning), possibly surrounded by a pragma to avoid the warning. Unfortunately the simple typename "void" is not valid as an expression alone (and it could still absorb a following statement in case of a missing semicolon, notably in C++ where it may be interpreted as an initial typecast to "void").
- I also don't like the dangling else clause, because it can silently absorb the following statement, if ever there's a semicolon missing after the macro invocation.
- But if you use GCC, just surround the block with parentheses (a GCC statement expression, "({ ... })"), without if/else or do/while. In MS compilers, there's a do-nothing intrinsic void function which can work as well, without surrounding the block by a control statement. In all other cases, the do/while(0) is the best approach: the only risk is a compiler warning, not a compile error or an undetected error (like a missing semicolon). Warnings about do-nothing statements (or always-false conditions when using do/while) can also be controlled in the source file using the macro, because these warnings are non-standard (lint-like, possibly wrong) and must not be forced without a compiler option to disable them.
- If you want to be transparent to break and continue, you can replace them by an explicit goto to a labelled statement after end of loop or switch (for break) or before end of loop (for continue) (and for code clarity, the label should include "break_" or "continue_" in its name, the goto being also always downward in a switch or do/while or for-loop, and always upward in a while-loop). verdy_p (talk) 13:18, 28 November 2009 (UTC)
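The goto suggestion could look roughly like the sketch below. It assumes the do-while(0) form of the wrapper; the global foo, the name saved_foo, the label break_search and the function find_first_negative are all hypothetical.

```c
static int foo = 0;   /* hypothetical global that the wrapper temporarily sets */

#define WITH_FOO(stmts)      \
    do {                     \
        int saved_foo = foo; \
        foo = 1;             \
        stmts;               \
        foo = saved_foo;     \
    } while (0)

/* Return the index of the first negative element, or -1 if there is none. */
static int find_first_negative(const int *values, int n)
{
    int found = -1;
    for (int i = 0; i < n; i++) {
        WITH_FOO(
            if (values[i] < 0) {
                found = i;
                /* A plain break would only leave the macro's do-while,
                   so jump downward to a label placed after the for loop. */
                goto break_search;
            }
        );
    }
break_search:
    return found;
}
```

The goto leaves the wrapper before foo is restored, exactly as a break would in the if-1-else form, so any cleanup still has to be handled separately.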
- Anyway, your proposed syntax will not work with most statements that are not single statements or that contain commas. I suggest this instead:
#define BEGIN_BLOCK if (1) {
#define END_BLOCK ; } else
#define BEGIN_FOO BEGIN_BLOCK int __ofoo = foo; foo = 1;
#define END_FOO ; foo = __ofoo; END_BLOCK
- or still with the alternative (which is still safer):
#define BEGIN_BLOCK do {
#define END_BLOCK ; } while(0)
- that can be used as follows, with or without a semicolon terminating the middle statements, but with a required semicolon after END_FOO (its absence will cause a syntax error within "else else" or "while(0) else"):
if (need_foo) BEGIN_FOO foofunc() END_FOO;
else barfunc();
- The implicit termination of if constructs with an optional additional else clause is a well-known syntax caveat of C/C++/Java/C#/J# (one that must also be solved in complex ways to avoid shift/reduce ambiguities in context-free language parser generators that take their decision using a single look-ahead symbol without backtracking); other languages (including the C preprocessor) avoid it in a safer way by requiring an explicit "endif" construct in all cases (including Pascal/Modula, where the semicolon is still required to terminate the "if" statement even after the "end" keyword terminating a multi-statement block used as one of its "statement" clauses).
- verdy_p (talk) 20:45, 28 November 2009 (UTC)
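A minimal sketch of the BEGIN/END style with the do-while(0) variant. Two details are assumptions rather than part of the proposal as posted: saved_foo replaces __ofoo, and END_FOO starts with a semicolon so that the last statement before it may omit its own. The helper functions and their bodies are hypothetical.

```c
static int foo = 0;               /* hypothetical global toggled by the block */
static int foo_during_call = -1;  /* value of foo observed inside foofunc */
static int last_result = 0;

#define BEGIN_BLOCK do {
#define END_BLOCK   ; } while (0)
#define BEGIN_FOO   BEGIN_BLOCK int saved_foo = foo; foo = 1;
#define END_FOO     ; foo = saved_foo; END_BLOCK

/* Hypothetical callee; the comma in its argument list is harmless because
   the call is ordinary code between two macros, not a macro argument. */
static void foofunc(int a, int b)
{
    foo_during_call = foo;
    last_result = a + b;
}

static void barfunc(void)
{
    last_result = -1;
}

static int run(int need_foo)
{
    if (need_foo) BEGIN_FOO foofunc(2, 3) END_FOO;
    else barfunc();
    return last_result;
}
```

Because the wrapped statements sit between the two macros instead of inside an argument list, commas and multiple statements need no special treatment.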
- There's the if (0); else {block} construct to get rid of the "else" problem. 70.239.12.234 (talk) 19:47, 18 July 2011 (UTC)
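A sketch of why the if (0); else form avoids the dangling else, assuming it is used as a prefix in front of a braced block with no trailing semicolon. The macro names GUARDED and NAIVE and both functions are hypothetical; compilers may warn about the ambiguous else or the empty if body here, which is part of what the thread is debating.

```c
/* The inner if already owns an else, so a following else cannot bind to it. */
#define GUARDED if (0) ; else

/* For contrast: an inner if without an else captures the next else. */
#define NAIVE if (1)

static int classify_guarded(int need_foo)
{
    int result = 0;
    if (need_foo)
        GUARDED { result = 1; }
    else
        result = 2;          /* binds to if (need_foo), as intended */
    return result;
}

static int classify_naive(int need_foo)
{
    int result = 0;
    if (need_foo)
        NAIVE { result = 1; }
    else
        result = 2;          /* binds to the macro's if (1), so it never runs */
    return result;
}
```

The prefix form stays transparent to break and continue, but it leaves no place for code that must run after the block, such as restoring a saved value.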
#warning
Checked and added that the C-compilers by Intel and IBM also support the #warning directive. Is this sufficient to remove the weasel word warning? —Preceding unsigned comment added by 141.84.9.25 (talk) 15:06, 11 November 2009 (UTC)
Refer the Preprocessor section —Preceding unsigned comment added by 203.91.193.5 (talk) 11:11, 12 January 2010 (UTC)
Indentation
I'm currently googling around trying to confirm / deny that the # symbol should be in the first column and that indentation may appear between it and the directive. This article surely should state the indentation rule, or state that it is a myth? Sweavo (talk) 13:53, 19 August 2010 (UTC)
- There's no such rule; you may indent preprocessor directives as you wish. From the C99 standard (6.10):
A preprocessing directive consists of a sequence of preprocessing tokens that begins with a # preprocessing token that (at the start of translation phase 4) is either the first character in the source file (optionally after white space containing no new-line characters) or that follows white space containing at least one new-line character, and is ended by the next new-line character.
- And the # and the word following it are separate tokens, so you can put spaces between them too. This is basically the same in C89 and C++. Rwessel (talk) 04:41, 5 November 2011 (UTC)
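A small sketch of what the quoted wording permits: white space before the #, and white space between the # and the directive name. The macro and function names are hypothetical.

```c
    #define INDENTED_HASH 1        /* white space before the # */
#   define SPACE_AFTER_HASH 2      /* white space between # and the directive name */
  #  if INDENTED_HASH
  #    define BOTH 3               /* both at once, nested inside a conditional */
  #  endif

static int sum_of_indented_macros(void)
{
    return INDENTED_HASH + SPACE_AFTER_HASH + BOTH;
}
```

Some pre-standard compilers are reported to have required the # in the first column, which is probably where the myth comes from; that is a historical aside, not something the standard requires.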
I'm going to start moving sections over to wikibooks
The article is too detailed. Wikipedia is not an instruction manual. - Richfife (talk) 19:26, 28 February 2011 (UTC)
I came here looking for that detailed info (since there is a stackoverflow page referring to it here), a link to the wikibooks might be useful (at the least on this talk page): http://en.wikibooks.org/wiki/C_Programming/Preprocessor — Preceding unsigned comment added by 94.208.248.165 (talk) 09:13, 10 June 2012 (UTC)
You people are killing wiki. I used to come here looking for info, now I rarely visit wiki as I never find what I need on it anymore, just a bunch of bureaucrats trying to exercise influence over articles to get mod status.
"OlderSmall" example should be much simpler
Encyclopedant (talk) 21:44, 24 September 2011 (UTC)
Syntax highlighting
I'd like to change the syntax highlighting to CPP from C (source lang="cpp"). The CPP highlighting is more attractive, and is what C (programming language) uses. Comments? Rwessel (talk) 01:04, 15 October 2011 (UTC)
Including files section
A perhaps pedantic point is that the description of including stdio.h as a text image is not strictly correct. The C standard headers do not actually have to be text files in any meaningful sense of the word, although most (all?) implementations do have text files for those. The implementation is allowed to handle the standard header in a special way, so while "#include <stdio.h>" must have the defined result, it does *not* have to happen by including any sort of text file. In short, the explanation is at least potentially incorrect when applied (as it is) to one of the standard headers.
On the flip side, it's a pretty pedantic point, and I'm not actually aware of any implementations that don't treat the system headers as text files.
So there are three options:
(1) leave it alone, and accept that the text is not quite correct
(2) add additional text clarifying the (potential) special nature of the system headers
(3) change the example to include a non-system header instead (which must be a text file)
Frankly, I mostly lean towards option (1).
Comments? Rwessel (talk) 09:51, 22 March 2012 (UTC)
- Pedantically, I would tend to your second option above, if only to keep editors from having this discussion again. A single sentence would be enough I think. I don't have the C language standard, but this question on Stack Overflow already quotes the relevant sections of the C++ standard.
- Like you, I have never seen an implementation of C or C++ where the standard headers were not represented as text files. I have asked for examples of such implementations at the reference desk. —Tobias Bergemann (talk) 10:42, 22 March 2012 (UTC)
Phases
The Phases section states:
- "The first four (of eight) phases of translation specified in the C Standard are:"
Which begs the question, What are the other four? Rojomoke (talk) 10:08, 22 October 2012 (UTC)
- There are links to several versions of the C standard at the bottom of C (programming language). To quote:
5. Each escape sequence in character constants and string literals is converted to a member of the execution character set.
6. Adjacent character string literal tokens are concatenated and adjacent wide string literal tokens are concatenated.
7. White-space characters separating tokens are no longer significant. Preprocessing tokens are converted into tokens. The resulting tokens are syntactically and semantically analyzed and translated.
8. All external object and function references are resolved. Library components are linked to satisfy external references to functions and objects not defined in the current translation. All such translator output is collected into a program image which contains information needed for execution in its execution environment.
- Basically the first four phases define what's normally thought of as preprocessing, although that's not made clear in the article (which I will fix in a minute). Phases five and six finish cleaning up the source after preprocessing, seven is what most people think of as the compilation process itself, and eight is linking. But the eight phases are *conceptual* phases which define the context in which the definition of the language within the standard is made. Almost no implementations follow it strictly in the way they operate, but they work as if they had, and the results are (hopefully!) indistinguishable from an implementation that had actually implemented the eight phases as separate steps. Rwessel (talk) 16:59, 22 October 2012 (UTC)
- Ah, thanks. I'd misread it to mean there were eight phases in the preprocessing activity. Rojomoke (talk) 09:19, 24 October 2012 (UTC)
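A short sketch of how the ordering of the phases shows up in ordinary code: macros are expanded in phase 4, escape sequences are converted in phase 5, and only then are adjacent string literals joined in phase 6. The macro names VERSION, BUILD, STR and both functions are hypothetical.

```c
#include <string.h>

#define VERSION "1.2"
#define BUILD   7
#define STR_(x) #x
#define STR(x)  STR_(x)   /* extra level so BUILD is expanded before # applies */

/* Phase 4 expands VERSION and STR(BUILD) into string literals;
   phase 6 then joins the four adjacent literals into one. */
static const char *banner(void)
{
    return "cpp " VERSION " build " STR(BUILD);
}

/* Phase 5 converts "\x1" to a single character before phase 6 joins it
   with "2", so the result has two characters and is not the same as "\x12". */
static size_t split_escape_length(void)
{
    return strlen("\x1" "2");
}
```

Here banner() yields "cpp 1.2 build 7" and split_escape_length() yields 2, whereas strlen("\x12") would be 1.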