<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 1988-2020 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "Funding Free Software", the Front-Cover
Texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below). A copy of the license is included in the section entitled
"GNU Free Documentation License".
(a) The FSF's Front-Cover Text is:
A GNU Manual
(b) The FSF's Back-Cover Text is:
You have freedom to copy and modify this GNU Manual, like GNU
software. Copies published by the Free Software Foundation raise
funds for GNU development. -->
<!-- Created by GNU Texinfo 6.5, http://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>RTL passes (GNU Compiler Collection (GCC) Internals)</title>
<meta name="description" content="RTL passes (GNU Compiler Collection (GCC) Internals)">
<meta name="keywords" content="RTL passes (GNU Compiler Collection (GCC) Internals)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<link href="index.html#Top" rel="start" title="Top">
<link href="Option-Index.html#Option-Index" rel="index" title="Option Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Passes.html#Passes" rel="up" title="Passes">
<link href="Optimization-info.html#Optimization-info" rel="next" title="Optimization info">
<link href="Tree-SSA-passes.html#Tree-SSA-passes" rel="prev" title="Tree SSA passes">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smalllisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>
</head>
<body lang="en">
<a name="RTL-passes"></a>
<div class="header">
<p>
Next: <a href="Optimization-info.html#Optimization-info" accesskey="n" rel="next">Optimization info</a>, Previous: <a href="Tree-SSA-passes.html#Tree-SSA-passes" accesskey="p" rel="prev">Tree SSA passes</a>, Up: <a href="Passes.html#Passes" accesskey="u" rel="up">Passes</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="RTL-passes-1"></a>
<h3 class="section">9.6 RTL passes</h3>
<p>The following briefly describes the RTL generation and optimization
passes that are run after the Tree optimization passes.
</p>
<ul>
<li> RTL generation
<p>The source files for RTL generation include
<samp>stmt.c</samp>,
<samp>calls.c</samp>,
<samp>expr.c</samp>,
<samp>explow.c</samp>,
<samp>expmed.c</samp>,
<samp>function.c</samp>,
<samp>optabs.c</samp>
and <samp>emit-rtl.c</samp>.
Also, the file
<samp>insn-emit.c</samp>, generated from the machine description by the
program <code>genemit</code>, is used in this pass. The header file
<samp>expr.h</samp> is used for communication within this pass.
</p>
<a name="index-genflags"></a>
<a name="index-gencodes"></a>
<p>The header files <samp>insn-flags.h</samp> and <samp>insn-codes.h</samp>,
generated from the machine description by the programs <code>genflags</code>
and <code>gencodes</code>, tell this pass which standard names are available
for use and which patterns correspond to them.
</p>
</li><li> Generation of exception landing pads
<p>This pass generates the glue that handles communication between the
exception handling library routines and the exception handlers within
the function. Entry points in the function that are invoked by the
exception handling library are called <em>landing pads</em>. The code
for this pass is located in <samp>except.c</samp>.
</p>
</li><li> Control flow graph cleanup
<p>This pass removes unreachable code, simplifies jumps to next, jumps to
jump, jumps across jumps, etc. The pass is run multiple times.
For historical reasons, it is occasionally referred to as the &ldquo;jump
optimization pass&rdquo;. The bulk of the code for this pass is in
<samp>cfgcleanup.c</samp>, and there are support routines in <samp>cfgrtl.c</samp>
and <samp>jump.c</samp>.
</p>
</li><li> Forward propagation of single-def values
<p>This pass attempts to remove redundant computation by substituting
variables that come from a single definition, and
seeing if the result can be simplified. It performs copy propagation
and addressing mode selection. The pass is run twice, with values
being propagated into loops only on the second run. The code is
located in <samp>fwprop.c</samp>.
</p>
</li><li> Common subexpression elimination
<p>This pass removes redundant computation within basic blocks, and
optimizes addressing modes based on cost. The pass is run twice.
The code for this pass is located in <samp>cse.c</samp>.
</p>
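<p>As a rough source-level illustration only (the pass itself works on RTL
instructions, not on C), removing a redundant computation within one basic
block might look like the sketch below. The function names are hypothetical
and are not part of GCC.
</p>

```c
#include <assert.h>

/* Before: a * b is evaluated twice inside the same basic block. */
int sum_before(int a, int b, int c)
{
    int x = a * b + c;
    int y = a * b - c;
    return x + y;
}

/* After: the common subexpression is evaluated once and reused,
   which is the kind of rewrite a CSE pass aims for. */
int sum_after(int a, int b, int c)
{
    int t = a * b;      /* single evaluation of the shared value */
    int x = t + c;
    int y = t - c;
    return x + y;
}

/* Sanity check: both forms must agree on a sample input. */
void check_cse_equivalence(void)
{
    assert(sum_before(3, 4, 5) == sum_after(3, 4, 5));
}
```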
</li><li> Global common subexpression elimination
<p>This pass performs one of two
different types of GCSE depending on whether you are optimizing for
size or not (LCM based GCSE tends to increase code size for a gain in
speed, while Morel-Renvoise based GCSE does not).
When optimizing for size, GCSE is done using Morel-Renvoise Partial
Redundancy Elimination, with the exception that it does not try to move
invariants out of loops; that is left to the loop optimization pass.
If MR PRE GCSE is done, code hoisting (aka unification) is also done, as
well as load motion.
If you are optimizing for speed, LCM (lazy code motion) based GCSE is
done. LCM is based on the work of Knoop, Ruthing, and Steffen. LCM
based GCSE also does loop invariant code motion. We also perform load
and store motion when optimizing for speed.
Regardless of which type of GCSE is used, the GCSE pass also performs
global constant and copy propagation.
The source file for this pass is <samp>gcse.c</samp>, and the LCM routines
are in <samp>lcm.c</samp>.
</p>
</li><li> Loop optimization
<p>This pass performs several loop related optimizations.
The source files <samp>cfgloopanal.c</samp> and <samp>cfgloopmanip.c</samp> contain
generic loop analysis and manipulation code. Initialization and finalization
of loop structures is handled by <samp>loop-init.c</samp>.
A loop invariant motion pass is implemented in <samp>loop-invariant.c</samp>.
Basic block level optimizations (loop unrolling and peeling)
are implemented in <samp>loop-unroll.c</samp>.
Replacement of the exit condition of loops with special machine-dependent
instructions is handled by <samp>loop-doloop.c</samp>.
</p>
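<p>To make loop invariant motion concrete, here is a minimal source-level
sketch. It is an assumption-laden illustration in C rather than RTL, and the
function names are hypothetical; the actual transformation in
<samp>loop-invariant.c</samp> operates on RTL and is cost-driven.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Before: scale * offset does not depend on i, yet it is
   written as if it were recomputed on every iteration. */
void fill_before(int *dst, size_t n, int scale, int offset)
{
    assert(dst != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        dst[i] = (int)i + scale * offset;
}

/* After: the invariant product is hoisted out of the loop. */
void fill_after(int *dst, size_t n, int scale, int offset)
{
    assert(dst != NULL || n == 0);
    int k = scale * offset;     /* computed once, before the loop */
    for (size_t i = 0; i < n; i++)
        dst[i] = (int)i + k;
}
```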
</li><li> Jump bypassing
<p>This pass is an aggressive form of GCSE that transforms the control
flow graph of a function by propagating constants into conditional
branch instructions. The source file for this pass is <samp>gcse.c</samp>.
</p>
</li><li> If conversion
<p>This pass attempts to replace conditional branches and surrounding
assignments with arithmetic, boolean value producing comparison
instructions, and conditional move instructions. In the very last
invocation after reload/LRA, it will generate predicated instructions
when supported by the target. The code is located in <samp>ifcvt.c</samp>.
</p>
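<p>The effect of if conversion can be pictured at the source level as
follows. This is only a sketch with hypothetical function names; whether a
conditional move or a comparison instruction is actually emitted depends on
the target and on the cost model.
</p>

```c
#include <assert.h>

/* Before: a conditional branch around an assignment. */
int clamp_before(int x, int limit)
{
    int r = x;
    if (x > limit)
        r = limit;
    return r;
}

/* After: a branch-free select, the shape a conditional move can implement. */
int clamp_after(int x, int limit)
{
    return (x > limit) ? limit : x;
}

/* Before: a branch that only produces a boolean value. */
int is_negative_before(int x)
{
    if (x < 0)
        return 1;
    return 0;
}

/* After: the comparison result is used directly as the value. */
int is_negative_after(int x)
{
    return x < 0;
}

/* Sanity check: each pair must agree on sample inputs. */
void check_ifcvt_equivalence(void)
{
    assert(clamp_before(10, 5) == clamp_after(10, 5));
    assert(is_negative_before(-2) == is_negative_after(-2));
}
```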
</li><li> Web construction
<p>This pass splits independent uses of each pseudo-register. This can
improve the effect of other transformations, such as CSE or register
allocation. The code for this pass is located in <samp>web.c</samp>.
</p>
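<p>A source-level analogy, with C variables standing in for
pseudo-registers, may help. In this sketch (hypothetical names, not GCC
code) one variable carries two unrelated values; giving each independent
use its own name lets later passes treat them separately.
</p>

```c
#include <assert.h>

/* Before: t is reused for two unrelated values. */
int two_sums_before(int a, int b, int c, int d)
{
    int t;
    int r;
    t = a + b;          /* first live range of t */
    r = t * 2;
    t = c + d;          /* second, independent live range of t */
    return r + t * 3;
}

/* After: each independent live range has its own name. */
int two_sums_after(int a, int b, int c, int d)
{
    int t1 = a + b;
    int r = t1 * 2;
    int t2 = c + d;
    return r + t2 * 3;
}

/* Sanity check: splitting the variable must not change the result. */
void check_web_equivalence(void)
{
    assert(two_sums_before(1, 2, 3, 4) == two_sums_after(1, 2, 3, 4));
}
```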
</li><li> Instruction combination
<p>This pass attempts to combine groups of two or three instructions that
are related by data flow into single instructions. It combines the
RTL expressions for the instructions by substitution, simplifies the
result using algebra, and then attempts to match the result against
the machine description. The code is located in <samp>combine.c</samp>.
</p>
</li><li> Mode switching optimization
<p>This pass looks for instructions that require the processor to be in a
specific &ldquo;mode&rdquo; and minimizes the number of mode changes required to
satisfy all users. What these modes are and what they apply to are
completely target-specific. The code for this pass is located in
<samp>mode-switching.c</samp>.
</p>
</li><li> <a name="index-modulo-scheduling"></a>
<a name="index-sms_002c-swing_002c-software-pipelining"></a>
Modulo scheduling
<p>This pass looks at innermost loops and reorders their instructions
by overlapping different iterations. Modulo scheduling is performed
immediately before instruction scheduling. The code for this pass is
located in <samp>modulo-sched.c</samp>.
</p>
</li><li> Instruction scheduling
<p>This pass looks for instructions whose output will not be available by
the time that it is used in subsequent instructions. Memory loads and
floating point instructions often have this behavior on RISC machines.
It re-orders instructions within a basic block to try to separate the
definition and use of items that otherwise would cause pipeline
stalls. This pass is performed twice, before and after register
allocation. The code for this pass is located in <samp>haifa-sched.c</samp>,
<samp>sched-deps.c</samp>, <samp>sched-ebb.c</samp>, <samp>sched-rgn.c</samp> and
<samp>sched-vis.c</samp>.
</p>
</li><li> Register allocation
<p>These passes make sure that all occurrences of pseudo registers are
eliminated, either by allocating them to a hard register, replacing
them by an equivalent expression (e.g. a constant) or by placing
them on the stack. This is done in several subpasses:
</p>
<ul>
<li> The integrated register allocator (<acronym>IRA</acronym>). It is called
integrated because coalescing, register live range splitting, and hard
register preferencing are done on-the-fly during coloring. It also
has better integration with the reload/LRA pass. Pseudo-registers spilled
by the allocator or by reload/LRA still have a chance to get
hard registers if reload/LRA evicts some pseudo-registers from
hard registers. The allocator helps to choose better pseudos for
spilling based on their live ranges and to coalesce stack slots
allocated for the spilled pseudo-registers. IRA is a regional
register allocator which becomes a Chaitin-Briggs allocator
if there is only one region. By default, IRA chooses regions using
register pressure, but the user can force it to use one region or
regions corresponding to all loops.
<p>Source files of the allocator are <samp>ira.c</samp>, <samp>ira-build.c</samp>,
<samp>ira-costs.c</samp>, <samp>ira-conflicts.c</samp>, <samp>ira-color.c</samp>,
<samp>ira-emit.c</samp>, <samp>ira-lives.c</samp>, plus header files <samp>ira.h</samp>
and <samp>ira-int.h</samp> used for the communication between the allocator
and the rest of the compiler and between the IRA files.
</p>
</li><li> <a name="index-reloading"></a>
Reloading. This pass renumbers pseudo registers with the hardware
register numbers they were allocated. Pseudo registers that did not
get hard registers are replaced with stack slots. Then it finds
instructions that are invalid because a value has failed to end up in
a register, or has ended up in a register of the wrong kind. It fixes
up these instructions by reloading the problematic values
temporarily into registers. Additional instructions are generated to
do the copying.
<p>The reload pass also optionally eliminates the frame pointer and inserts
instructions to save and restore call-clobbered registers around calls.
</p>
<p>Source files are <samp>reload.c</samp> and <samp>reload1.c</samp>, plus the header
<samp>reload.h</samp> used for communication between them.
</p>
</li><li> <a name="index-Local-Register-Allocator-_0028LRA_0029"></a>
The local register allocator (<acronym>LRA</acronym>). This pass is a modern
replacement for the reload pass. Source files
are <samp>lra.c</samp>, <samp>lra-assign.c</samp>, <samp>lra-coalesce.c</samp>,
<samp>lra-constraints.c</samp>, <samp>lra-eliminations.c</samp>,
<samp>lra-lives.c</samp>, <samp>lra-remat.c</samp>, <samp>lra-spills.c</samp>, the
header <samp>lra-int.h</samp> used for communication between them, and the
header <samp>lra.h</samp> used for communication between LRA and the rest of
the compiler.
<p>Unlike the reload pass, intermediate LRA decisions are reflected in
RTL as much as possible. This reduces the number of target-dependent
macros and hooks, leaving instruction constraints as the primary
source of control.
</p>
<p>LRA is run on targets for which <code>TARGET_LRA_P</code> returns true.
</p></li></ul>
</li><li> Basic block reordering
<p>This pass implements profile guided code positioning. If profile
information is not available, various types of static analysis are
performed to make the predictions normally coming from the profile
feedback (i.e. execution frequency, branch probability, etc.). It is
implemented in the file <samp>bb-reorder.c</samp>, and the various
prediction routines are in <samp>predict.c</samp>.
</p>
</li><li> Variable tracking
<p>This pass computes where the variables are stored at each
position in the code and adds notes describing the variable locations
to the RTL code. Location lists are then generated from these
notes and emitted into the debug information, if the debugging
information format supports location lists. The code is located in
<samp>var-tracking.c</samp>.
</p>
</li><li> Delayed branch scheduling
<p>This optional pass attempts to find instructions that can go into the
delay slots of other instructions, usually jumps and calls. The code
for this pass is located in <samp>reorg.c</samp>.
</p>
</li><li> Branch shortening
<p>On many RISC machines, branch instructions have a limited range.
Thus, longer sequences of instructions must be used for long branches.
In this pass, the compiler figures out how far each instruction
will be from each other instruction, and therefore whether the usual
instructions, or the longer sequences, must be used for each branch.
The code for this pass is located in <samp>final.c</samp>.
</p>
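<p>The idea can be sketched as a small fixed-point iteration. The code
below is a simplified illustration and not the implementation in
<samp>final.c</samp>: the instruction representation, the 2-byte and 5-byte
branch sizes, and the one-byte signed displacement range are all
assumptions chosen for the example.
</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All sizes and ranges here are hypothetical, chosen for illustration. */
enum { SHORT_SIZE = 2, LONG_SIZE = 5, SHORT_RANGE = 127, MAX_INSNS = 64 };

struct insn {
    int size;       /* current size in bytes */
    int target;     /* index of the branch target, or -1 if not a branch */
};

/* Widen short branches until every displacement fits.  Branches must
   start with size SHORT_SIZE.  Returns the number of branches that
   ended up in the long form. */
int shorten_branches_sketch(struct insn *insns, size_t n)
{
    int addr[MAX_INSNS + 1];
    int long_count = 0;
    bool changed = true;

    assert(n <= MAX_INSNS);
    while (changed) {
        changed = false;

        /* Recompute the address of every instruction. */
        addr[0] = 0;
        for (size_t i = 0; i < n; i++)
            addr[i + 1] = addr[i] + insns[i].size;

        /* Widen any short branch whose target is out of range.
           Widening moves later addresses, so repeat until stable. */
        for (size_t i = 0; i < n; i++) {
            if (insns[i].target < 0 || insns[i].size == LONG_SIZE)
                continue;
            int disp = addr[insns[i].target] - addr[i + 1];
            if (disp > SHORT_RANGE || disp < -SHORT_RANGE - 1) {
                insns[i].size = LONG_SIZE;
                long_count++;
                changed = true;
            }
        }
    }
    return long_count;
}
```

<p>Because a branch only ever grows in this sketch, the loop terminates
after at most one pass per branch plus a final pass that changes nothing.
</p>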
</li><li> Register-to-stack conversion
<p>Conversion from usage of some hard registers to usage of a register
stack may be done at this point. Currently, this is supported only
for the floating-point registers of the Intel 80387 coprocessor. The
code for this pass is located in <samp>reg-stack.c</samp>.
</p>
</li><li> Final
<p>This pass outputs the assembler code for the function. The source files
are <samp>final.c</samp> plus <samp>insn-output.c</samp>; the latter is generated
automatically from the machine description by the tool <samp>genoutput</samp>.
The header file <samp>conditions.h</samp> is used for communication between
these files.
</p>
</li><li> Debugging information output
<p>This is run after final because it must output the stack slot offsets
for pseudo registers that did not get hard registers. Source files
are <samp>dbxout.c</samp> for DBX symbol table format, <samp>dwarfout.c</samp> for
DWARF symbol table format, files <samp>dwarf2out.c</samp> and <samp>dwarf2asm.c</samp>
for DWARF2 symbol table format, and <samp>vmsdbgout.c</samp> for VMS debug
symbol table format.
</p>
</li></ul>
</body>
</html>