<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Created by GNU Texinfo 6.5, http://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Macro Expansion (The GNU C Preprocessor Internals)</title>
<meta name="description" content="Macro Expansion (The GNU C Preprocessor Internals)">
<meta name="keywords" content="Macro Expansion (The GNU C Preprocessor Internals)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<link href="index.html#Top" rel="start" title="Top">
<link href="Concept-Index.html#Concept-Index" rel="index" title="Concept Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="index.html#Top" rel="up" title="Top">
<link href="Token-Spacing.html#Token-Spacing" rel="next" title="Token Spacing">
<link href="Hash-Nodes.html#Hash-Nodes" rel="prev" title="Hash Nodes">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smalllisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>
</head>
<body lang="en">
<a name="Macro-Expansion"></a>
<div class="header">
<p>
Next: <a href="Token-Spacing.html#Token-Spacing" accesskey="n" rel="next">Token Spacing</a>, Previous: <a href="Hash-Nodes.html#Hash-Nodes" accesskey="p" rel="prev">Hash Nodes</a>, Up: <a href="index.html#Top" accesskey="u" rel="up">Top</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Macro-Expansion-Algorithm"></a>
<h2 class="unnumbered">Macro Expansion Algorithm</h2>
<a name="index-macro-expansion"></a>
<p>Macro expansion is a tricky operation, fraught with nasty corner cases
and situations that render what you thought was a nifty way to
optimize the preprocessor&rsquo;s expansion algorithm wrong in quite subtle
ways.
</p>
<p>I strongly recommend you have a good grasp of how the C and C++
standards require macros to be expanded before diving into this
section, let alone the code! If you don&rsquo;t have a clear mental
picture of how things like nested macro expansion, stringizing and
token pasting are supposed to work, damage to your sanity can quickly
result.
</p>
<a name="Internal-representation-of-macros"></a>
<h3 class="section">Internal representation of macros</h3>
<a name="index-macro-representation-_0028internal_0029"></a>
<p>The preprocessor stores macro expansions in tokenized form. This
saves repeated lexing passes during expansion, at the cost of a small
increase in memory consumption on average. The tokens are stored
contiguously in memory, so a pointer to the first one and a token
count are all you need to get the replacement list of a macro.
</p>
<p>If the macro is a function-like macro the preprocessor also stores its
parameters, in the form of an ordered list of pointers to the hash
table entry of each parameter&rsquo;s identifier. Further, in the macro&rsquo;s
stored expansion each occurrence of a parameter is replaced with a
special token of type <code>CPP_MACRO_ARG</code>. Each such token holds the
index of the parameter it represents in the parameter list, which
allows rapid replacement of parameters with their arguments during
expansion. Despite this optimization it is still necessary to store
the original parameters to the macro, both for dumping with, e.g.,
<samp>-dD</samp>, and to warn about non-trivial macro redefinitions when
the parameter names have changed.
</p>
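<p>The representation described above can be sketched roughly as follows.
This is a simplified illustration, not cpplib&rsquo;s actual data structures:
every name in it (<code>sketch_token</code>, <code>sketch_macro</code>,
<code>sketch_arg</code>, <code>sketch_substitute</code>) is hypothetical, and
only <code>SKETCH_MACRO_ARG</code> is meant to mirror the role of
<code>CPP_MACRO_ARG</code>. Parameter names are kept as plain strings here
rather than as hash table entries.
</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; SKETCH_MACRO_ARG plays the role of CPP_MACRO_ARG. */
enum sketch_type { SKETCH_OTHER, SKETCH_MACRO_ARG };

struct sketch_token {
    enum sketch_type type;
    const char *spelling;   /* meaningful when type == SKETCH_OTHER */
    unsigned arg_index;     /* meaningful when type == SKETCH_MACRO_ARG */
};

/* A macro: a contiguous replacement list plus the ordered parameter names. */
struct sketch_macro {
    const struct sketch_token *exp;  /* pointer to the first replacement token */
    size_t count;                    /* number of replacement tokens */
    const char *const *params;       /* kept for dumping and redefinition checks */
    size_t paramc;
};

/* One macro argument: a contiguous run of tokens. */
struct sketch_arg {
    const struct sketch_token *first;
    size_t count;
};

/* Copy the replacement list into out, replacing each parameter token with the
   tokens of the argument it indexes.  Returns the number of tokens written,
   or (size_t)-1 if out is too small. */
size_t sketch_substitute(const struct sketch_macro *m,
                         const struct sketch_arg *args,
                         struct sketch_token *out, size_t out_cap)
{
    size_t n = 0;
    for (size_t i = 0; i < m->count; i++) {
        const struct sketch_token *t = &m->exp[i];
        if (t->type == SKETCH_MACRO_ARG) {
            assert(t->arg_index < m->paramc);
            const struct sketch_arg *a = &args[t->arg_index];
            if (a->count > out_cap - n)
                return (size_t)-1;
            if (a->count > 0)
                memcpy(out + n, a->first, a->count * sizeof *out);
            n += a->count;
        } else {
            if (n == out_cap)
                return (size_t)-1;
            out[n++] = *t;
        }
    }
    return n;
}
```

<p>Because each parameter occurrence already carries its index, substitution
needs no name lookups. The parameter names are consulted only for tasks such
as dumping the definition.
</p>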
<a name="Macro-expansion-overview"></a>
<h3 class="section">Macro expansion overview</h3>
<p>The preprocessor maintains a <em>context stack</em>, implemented as a
linked list of <code>cpp_context</code> structures, which together represent
the macro expansion state at any one time. The <code>struct
cpp_reader</code> member variable <code>context</code> points to the current top
of this stack. The top normally holds the unexpanded replacement list
of the innermost macro under expansion, except when cpplib is about to
pre-expand an argument, in which case it holds that argument&rsquo;s
unexpanded tokens.
</p>
<p>When there are no macros under expansion, cpplib is in <em>base
context</em>. All contexts other than the base context contain a
contiguous list of tokens delimited by a starting and ending token.
When not in base context, cpplib obtains the next token from the list
of the top context. If there are no tokens left in the list, it pops
that context off the stack, and subsequent ones if necessary, until an
unexhausted context is found or it returns to base context. In base
context, cpplib reads tokens directly from the lexer.
</p>
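<p>A minimal sketch of this token-fetching loop is given below. It is an
illustration under simplifying assumptions, not cpplib&rsquo;s code: the names
<code>ctx_context</code>, <code>ctx_reader</code>, <code>ctx_push</code> and
<code>ctx_next_token</code> are hypothetical, tokens are plain strings, the
lexer is replaced by a fixed array, and base context is represented by a
null stack pointer.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* One entry of the context stack: a contiguous run of tokens. */
struct ctx_context {
    struct ctx_context *prev;   /* next-lower context, or NULL */
    const char *const *cur;     /* next token to hand out */
    const char *const *end;     /* one past the last token */
};

struct ctx_reader {
    struct ctx_context *context;  /* top of the stack; NULL means base context */
    const char *const *lex_cur;   /* stand-in for the lexer's input */
    const char *const *lex_end;
};

/* Push a context holding n tokens starting at toks. */
void ctx_push(struct ctx_reader *r, struct ctx_context *c,
              const char *const *toks, size_t n)
{
    assert(toks != NULL || n == 0);
    c->cur = toks;
    c->end = toks + n;
    c->prev = r->context;
    r->context = c;
}

/* Return the next token, or NULL when all input is used up. */
const char *ctx_next_token(struct ctx_reader *r)
{
    /* Pop exhausted contexts until one has tokens left or base is reached. */
    while (r->context != NULL) {
        if (r->context->cur < r->context->end)
            return *r->context->cur++;
        r->context = r->context->prev;
    }
    /* Base context: read directly from the (fake) lexer. */
    if (r->lex_cur < r->lex_end)
        return *r->lex_cur++;
    return NULL;
}
```

<p>Note that a context is popped only when the <em>next</em> token is
requested, so it is still on the stack immediately after its last token has
been returned.
</p>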
<p>If it encounters an identifier that is both a macro and enabled for
expansion, cpplib prepares to push a new context for that macro on the
stack by calling the routine <code>enter_macro_context</code>. When this
routine returns, the new context will contain the unexpanded tokens of
the replacement list of that macro. In the case of function-like
macros, <code>enter_macro_context</code> also replaces any parameters in the
replacement list, stored as <code>CPP_MACRO_ARG</code> tokens, with the
appropriate macro argument. If the standard requires that the
parameter be replaced with its expanded argument, the argument will
have been fully macro expanded first.
</p>
<p><code>enter_macro_context</code> also handles special macros like
<code>__LINE__</code>. Although these macros expand to a single token which
cannot contain any further macros, for reasons of token spacing
(see <a href="Token-Spacing.html#Token-Spacing">Token Spacing</a>) and simplicity of implementation, cpplib
handles these special macros by pushing a context containing just that
one token.
</p>
<p>The final thing that <code>enter_macro_context</code> does before returning
is to mark the macro disabled for expansion (except for special macros
like <code>__TIME__</code>). The macro is re-enabled when its context is
later popped from the context stack, as described above. This strict
ordering ensures that a macro is disabled whilst its expansion is
being scanned, but that it is <em>not</em> disabled whilst any arguments
to it are being expanded.
</p>
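<p>The observable effect of this ordering can be checked with any conforming
C compiler. The short example below is only an illustration; the names
<code>counter</code>, <code>twice</code> and the functions around them are
made up for it. The first macro refers to its own name, which is left alone
because the macro is disabled while its replacement list is scanned. The
second is used inside its own argument, where it is still enabled.
</p>

```c
#include <assert.h>

static int counter = 4;

/* While "(counter + 1)" is being scanned the macro counter is disabled,
   so the inner identifier is left as the variable name. */
#define counter (counter + 1)
int read_counter(void) { return counter; }   /* becomes (counter + 1) */
#undef counter

/* Arguments are pre-expanded before the macro is disabled, so the inner
   use of twice expands normally. */
#define twice(a) ((a) * 2)
int nested_twice(void) { return twice(twice(3)); }

void check_expansion_rules(void)
{
    assert(read_counter() == 5);    /* 4 + 1, no infinite recursion */
    assert(nested_twice() == 12);   /* (3 * 2) * 2 */
}
```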
<a name="Scanning-the-replacement-list-for-macros-to-expand"></a>
<h3 class="section">Scanning the replacement list for macros to expand</h3>
<p>The C standard states that, after any parameters have been replaced
with their possibly-expanded arguments, the replacement list is
scanned for nested macros. Further, any identifiers in the
replacement list that are not expanded during this scan are never
again eligible for expansion in the future, if the reason they were
not expanded is that the macro in question was disabled.
</p>
<p>Clearly this latter condition can only apply to tokens resulting from
argument pre-expansion. Other tokens never have an opportunity to be
re-tested for expansion. It is possible for identifiers that are
function-like macros to not expand initially but to expand during a
later scan. This occurs when the identifier is the last token of an
argument (and therefore originally followed by a comma or a closing
parenthesis in its macro&rsquo;s argument list), and when it replaces its
parameter in the macro&rsquo;s replacement list, the subsequent token
happens to be an opening parenthesis (itself possibly the first token
of an argument).
</p>
<p>It is important to note that when cpplib reads the last token of a
given context, that context still remains on the stack. Only when
looking for the <em>next</em> token do we pop it off the stack and drop
to a lower context. This makes backing up by one token easy, but more
importantly ensures that the macro corresponding to the current
context is still disabled when we are considering the last token of
its replacement list for expansion (or indeed expanding it). As an
example, which illustrates many of the points above, consider
</p>
<div class="smallexample">
<pre class="smallexample">#define foo(x) bar x
foo(foo) (2)
</pre></div>
<p>which fully expands to &lsquo;<samp>bar foo (2)</samp>&rsquo;. During pre-expansion
of the argument, &lsquo;<samp>foo</samp>&rsquo; does not expand even though the macro is
enabled, since it has no following parenthesis [pre-expansion of an
argument only uses tokens from that argument; it cannot take tokens
from whatever follows the macro invocation]. This still leaves the
argument token &lsquo;<samp>foo</samp>&rsquo; eligible for future expansion. Then, when
re-scanning after argument replacement, the token &lsquo;<samp>foo</samp>&rsquo; is
rejected for expansion, and marked ineligible for future expansion,
since the macro is now disabled. It is disabled because the
replacement list &lsquo;<samp>bar foo</samp>&rsquo; of the macro is still on the context
stack.
</p>
<p>If instead the algorithm looked for an opening parenthesis first and
then tested whether the macro were disabled it would be subtly wrong.
In the example above, the replacement list of &lsquo;<samp>foo</samp>&rsquo; would be
popped in the process of finding the parenthesis, re-enabling
&lsquo;<samp>foo</samp>&rsquo; and expanding it a second time.
</p>
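<p>One way to observe the result of this example from inside a program is to
stringize the expansion. The sketch below does that with two helper macros,
<code>STR</code> and <code>XSTR</code>, which are assumptions added for the
illustration, as are the two functions. Since the exact whitespace in a
stringized expansion is not the point here, the comparison ignores spaces.
</p>

```c
#include <assert.h>
#include <ctype.h>

#define foo(x) bar x

/* Hypothetical helpers: XSTR expands its argument, then STR stringizes it. */
#define STR(x) #x
#define XSTR(x) STR(x)

/* The expansion of "foo(foo) (2)" as a string literal. */
const char *expanded_text(void) { return XSTR(foo(foo) (2)); }

/* Compare two strings, skipping all whitespace in both. */
int same_ignoring_spaces(const char *a, const char *b)
{
    for (;;) {
        while (isspace((unsigned char)*a)) a++;
        while (isspace((unsigned char)*b)) b++;
        if (*a != *b) return 0;
        if (*a == '\0') return 1;
        a++;
        b++;
    }
}

void check_foo_example(void)
{
    /* The second foo was marked ineligible, so "foo (2)" is not expanded. */
    assert(same_ignoring_spaces(expanded_text(), "bar foo (2)"));
    /* Had foo been re-enabled too early, the text would be "bar bar 2". */
    assert(!same_ignoring_spaces(expanded_text(), "bar bar 2"));
}
```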
<a name="Looking-for-a-function_002dlike-macro_0027s-opening-parenthesis"></a>
<h3 class="section">Looking for a function-like macro&rsquo;s opening parenthesis</h3>
<p>Function-like macros only expand when the next token is an opening
parenthesis. To check this, cpplib needs to temporarily disable macros
and read the next token. Unfortunately, because of spacing issues
(see <a href="Token-Spacing.html#Token-Spacing">Token Spacing</a>), there can be fake padding tokens in between,
and if the next real token is not a parenthesis cpplib needs to be
able to back up that one token as well as retain the information in
any intervening padding tokens.
</p>
<p>Backing up more than one token when macros are involved is not
permitted by cpplib, because in general it might involve issues like
restoring popped contexts onto the context stack, which are too hard.
Instead, searching for the parenthesis is handled by a special
function, <code>funlike_invocation_p</code>, which remembers padding
information as it reads tokens. If the next real token is not an
opening parenthesis, it backs up that one token, and then pushes an
extra context just containing the padding information if necessary.
</p>
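<p>The rule that a function-like macro name expands only when an opening
parenthesis follows is visible in ordinary C code. In the illustrative
example below (all names are made up), a function and a function-like macro
share the name <code>succ</code>; which one is used depends solely on the
token after the name.
</p>

```c
#include <assert.h>

static int succ(int v) { return v + 1; }

/* From here on, "succ" followed by "(" is a macro invocation. */
#define succ(v) ((v) + 100)

/* Next token is "(": the macro expands. */
int call_macro(void) { return succ(1); }

/* White space, including a newline, may separate the name from "(". */
int call_macro_split(void) { return succ
    (1); }

/* Next token is ";": no expansion, so this names the function. */
int call_function(void) { int (*fp)(int) = succ; return fp(1); }

/* Next token is ")": no expansion, so the function is called. */
int call_parenthesized(void) { return (succ)(1); }

void check_paren_rule(void)
{
    assert(call_macro() == 101);
    assert(call_macro_split() == 101);
    assert(call_function() == 2);
    assert(call_parenthesized() == 2);
}
```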
<a name="Marking-tokens-ineligible-for-future-expansion"></a>
<h3 class="section">Marking tokens ineligible for future expansion</h3>
<p>As discussed above, cpplib needs a way of marking tokens as
unexpandable. Since the tokens cpplib handles are read-only once they
have been lexed, it instead makes a copy of the token and adds the
flag <code>NO_EXPAND</code> to the copy.
</p>
<p>For efficiency and to simplify memory management by avoiding having to
remember to free these tokens, they are allocated as temporary tokens
from the lexer&rsquo;s current token run (see <a href="Lexer.html#Lexing-a-line">Lexing a line</a>) using the
function <code>_cpp_temp_token</code>. The tokens are then re-used once the
current line of tokens has been read in.
</p>
<p>This might sound unsafe. However, token runs are not re-used at the
end of a line if that line ends in the middle of a macro argument
list, and cpplib only wants to back up more than one lexer token in
situations where no macro expansion is involved, so the optimization
is safe.
</p>
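<p>The copy-and-flag idea can be sketched as below. This is a toy model, not
cpplib&rsquo;s implementation: <code>tmp_token</code>, <code>tmp_pool</code>,
<code>tmp_alloc</code>, <code>tmp_mark_no_expand</code> and
<code>tmp_end_of_line</code> are hypothetical names, the flag value is
arbitrary, and a small fixed array stands in for the lexer&rsquo;s token run.
</p>

```c
#include <assert.h>
#include <stddef.h>

#define TMP_NO_EXPAND 0x1u   /* hypothetical bit standing in for NO_EXPAND */

struct tmp_token {
    const char *spelling;
    unsigned flags;
};

/* Toy pool of temporary tokens, recycled once a line has been read. */
struct tmp_pool {
    struct tmp_token slots[16];
    size_t used;
};

/* Hand out the next temporary token; nothing is ever freed individually. */
struct tmp_token *tmp_alloc(struct tmp_pool *p)
{
    assert(p->used < sizeof p->slots / sizeof p->slots[0]);
    return &p->slots[p->used++];
}

/* The lexed token is read-only, so flag a temporary copy instead. */
const struct tmp_token *tmp_mark_no_expand(struct tmp_pool *p,
                                           const struct tmp_token *t)
{
    struct tmp_token *copy = tmp_alloc(p);
    *copy = *t;
    copy->flags |= TMP_NO_EXPAND;
    return copy;
}

/* Make every slot available again for the next line. */
void tmp_end_of_line(struct tmp_pool *p) { p->used = 0; }
```

<p>Resetting a single counter at the end of a line is what removes the need
to track and free each copy separately.
</p>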
<hr>
<div class="header">
<p>
Next: <a href="Token-Spacing.html#Token-Spacing" accesskey="n" rel="next">Token Spacing</a>, Previous: <a href="Hash-Nodes.html#Hash-Nodes" accesskey="p" rel="prev">Hash Nodes</a>, Up: <a href="index.html#Top" accesskey="u" rel="up">Top</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Concept-Index.html#Concept-Index" title="Index" rel="index">Index</a>]</p>
</div>
</body>
</html>