<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 1988-2020 Free Software Foundation, Inc.
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "Funding Free Software", the Front-Cover
Texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below). A copy of the license is included in the section entitled
"GNU Free Documentation License".
(a) The FSF's Front-Cover Text is:
A GNU Manual
(b) The FSF's Back-Cover Text is:
You have freedom to copy and modify this GNU Manual, like GNU
software. Copies published by the Free Software Foundation raise
funds for GNU development. -->
<!-- Created by GNU Texinfo 6.5, http://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>LTO Overview (GNU Compiler Collection (GCC) Internals)</title>
<meta name="description" content="LTO Overview (GNU Compiler Collection (GCC) Internals)">
<meta name="keywords" content="LTO Overview (GNU Compiler Collection (GCC) Internals)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<link href="index.html#Top" rel="start" title="Top">
<link href="Option-Index.html#Option-Index" rel="index" title="Option Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="LTO.html#LTO" rel="up" title="LTO">
<link href="LTO-object-file-layout.html#LTO-object-file-layout" rel="next" title="LTO object file layout">
<link href="LTO.html#LTO" rel="prev" title="LTO">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smalllisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>
</head>
<body lang="en">
<a name="LTO-Overview"></a>
<div class="header">
<p>
Next: <a href="LTO-object-file-layout.html#LTO-object-file-layout" accesskey="n" rel="next">LTO object file layout</a>, Up: <a href="LTO.html#LTO" accesskey="u" rel="up">LTO</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Design-Overview"></a>
<h3 class="section">25.1 Design Overview</h3>
<p>Link time optimization is implemented as a GCC front end for a
bytecode representation of GIMPLE that is emitted in special sections
of <code>.o</code> files. Currently, LTO support is enabled on most
ELF-based systems, as well as on Darwin, Cygwin and MinGW systems.
</p>
<p>Since GIMPLE bytecode is saved alongside final object code, object
files generated with LTO support are larger than regular object files.
This &ldquo;fat&rdquo; object format makes it easy to integrate LTO into
existing build systems, as one can, for instance, produce archives of
the files. Additionally, one might be able to ship one set of fat
objects which could be used both for development and the production of
optimized builds. A perhaps surprising side effect of this feature
is that any mistake in the toolchain (e.g. an older <code>libtool</code>
calling <code>ld</code> directly) leads to LTO information not being used.
This is both an advantage, as the system is more robust, and a
disadvantage, as the user is not informed that the optimization has
been disabled.
</p>
<p>The current implementation only produces &ldquo;fat&rdquo; objects, effectively
doubling compilation time and increasing file sizes up to 5x the
original size. Producing fat objects hides the problem that some tools,
such as <code>ar</code> and <code>nm</code>, need to understand the symbol
tables of LTO sections. These tools were extended to use the plugin
infrastructure, and with these problems solved, GCC will also support
&ldquo;slim&rdquo; objects consisting of the intermediate code alone.
</p>
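<p>The difference between regular, fat and slim objects can be sketched
as a small classifier over section names. This is only an illustration:
the <code>.gnu.lto_</code> prefix and the use of <code>.text</code> as a stand-in
for &ldquo;final object code&rdquo; are assumptions of the sketch, and
<code>classify_object</code> is a hypothetical helper, not part of GCC.
</p>

```python
# Assumption of this sketch: LTO sections share this name prefix.
LTO_SECTION_PREFIX = ".gnu.lto_"


def classify_object(section_sizes):
    """Classify an object from a mapping of section name -> size in bytes.

    Returns "fat" (bytecode and object code), "slim" (bytecode only)
    or "regular" (object code only, no LTO information).
    """
    non_empty = [name for name, size in section_sizes.items() if size > 0]
    has_lto = any(name.startswith(LTO_SECTION_PREFIX) for name in non_empty)
    # Assumption: non-empty .text sections stand in for final object code.
    has_code = any(name == ".text" or name.startswith(".text.")
                   for name in non_empty)
    if has_lto and has_code:
        return "fat"
    if has_lto:
        return "slim"
    return "regular"
```

<p>A tool that only looks at the object code would treat a fat object as a
regular one, which is how LTO information can be dropped silently.
</p>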
<p>At the highest level, LTO splits the compiler in two. The first half
(the &ldquo;writer&rdquo;) produces a streaming representation of all the
internal data structures needed to optimize and generate code. This
includes declarations, types, the callgraph and the GIMPLE representation
of function bodies.
</p>
<p>When <samp>-flto</samp> is given during compilation of a source file, the
pass manager executes all the passes in <code>all_lto_gen_passes</code>.
Currently, this phase is composed of two IPA passes:
</p>
<ul>
<li> <code>pass_ipa_lto_gimple_out</code>
This pass executes the function <code>lto_output</code> in
<samp>lto-streamer-out.c</samp>, which traverses the call graph, encoding
every reachable declaration, type and function. This generates an
in-memory representation of all the file sections described below.
</li><li> <code>pass_ipa_lto_finish_out</code>
This pass executes the function <code>produce_asm_for_decls</code> in
<samp>lto-streamer-out.c</samp>, which takes the memory image built in the
previous pass and encodes it in the corresponding ELF file sections.
</li></ul>
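<p>The two-step shape of the writer can be sketched as follows: a first
function walks the call graph and builds an in-memory image, and a second
one lays that image out as sections. This is a minimal sketch and not the
code in <samp>lto-streamer-out.c</samp>; the function names, the section
names <code>decls</code> and <code>function_body.*</code>, and the JSON encoding
are all assumptions made for illustration.
</p>

```python
import json


def build_section_image(call_graph, bodies, roots):
    """Step 1: walk the call graph from the roots, encoding what is reachable.

    call_graph maps a function name to the names it calls; bodies maps a
    function name to a textual body. Returns section name -> bytes.
    """
    seen, order, stack = set(), [], list(roots)
    while stack:
        fn = stack.pop()
        if fn in seen:
            continue
        seen.add(fn)
        order.append(fn)
        stack.extend(call_graph.get(fn, []))
    # Hypothetical section names, chosen only for this sketch.
    image = {"decls": json.dumps(sorted(order)).encode()}
    for fn in order:
        image["function_body." + fn] = bodies.get(fn, "").encode()
    return image


def emit_sections(image):
    """Step 2: lay the image out as a (name, offset, size) table plus payload."""
    table, payload = [], bytearray()
    for name in sorted(image):
        data = image[name]
        table.append((name, len(payload), len(data)))
        payload += data
    return table, bytes(payload)
```

<p>Keeping the traversal separate from the layout mirrors the split between
the two passes: only the second step needs to know about the container
format.
</p>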
<p>The second half of LTO support is the &ldquo;reader&rdquo;. This is implemented
as the GCC front end <samp>lto1</samp> in <samp>lto/lto.c</samp>. When
<samp>collect2</samp> detects a link set of <code>.o</code>/<code>.a</code> files with
LTO information and <samp>-flto</samp> is enabled, it invokes
<samp>lto1</samp>, which reads the set of files and aggregates them into a
single translation unit for optimization. The main entry point for
the reader is <samp>lto/lto.c</samp>:<code>lto_main</code>.
</p>
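<p>The aggregation step performed by the reader can be pictured as merging
per-file call graphs into one. The sketch below is a simplification under
stated assumptions: <code>merge_units</code> is a hypothetical helper, and
treating every duplicate definition as an error ignores the symbol
resolution rules a real linker applies.
</p>

```python
def merge_units(units):
    """Merge per-file call graphs into a single translation unit.

    units maps a file name to {function name: [callee names]}.
    Returns the merged graph and the sorted list of callees that no
    input file defines.
    """
    merged = {}
    for unit_name, functions in units.items():
        for fn, callees in functions.items():
            if fn in merged:
                # Simplification: real symbol resolution is more permissive.
                raise ValueError(f"duplicate definition of {fn} in {unit_name}")
            merged[fn] = list(callees)
    unresolved = sorted({callee
                         for callees in merged.values()
                         for callee in callees
                         if callee not in merged})
    return merged, unresolved
```

<p>After the merge, calls that crossed file boundaries become ordinary edges
of one graph, which is what makes whole-program optimization possible.
</p>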
<a name="LTO-modes-of-operation"></a>
<h4 class="subsection">25.1.1 LTO modes of operation</h4>
<p>One of the main goals of the GCC link-time infrastructure was to allow
effective compilation of large programs. For this reason GCC implements two
link-time compilation modes.
</p>
<ol>
<li> <em>LTO mode</em>, in which the whole program is read into the
compiler at link-time and optimized much as if it were a single
source-level compilation unit.
</li><li> <em>WHOPR or partitioned mode</em>, designed to utilize multiple
CPUs and/or a distributed compilation environment to quickly link
large applications. WHOPR stands for WHOle Program optimizeR (not to
be confused with the semantics of <samp>-fwhole-program</samp>). It
partitions the aggregated callgraph from many different <code>.o</code>
files and distributes the compilation of the sub-graphs to different
CPUs.
<p>Note that distributed compilation is not implemented yet, but since
the parallelism is facilitated via generating a <code>Makefile</code>, it
would be easy to implement.
</p></li></ol>
<p>WHOPR splits LTO into three main stages:
</p><ol>
<li> Local generation (LGEN)
This stage executes in parallel. Every file in the program is compiled
into the intermediate language and packaged together with the local
call-graph and summary information. This stage is the same for both
the LTO and WHOPR compilation modes.
</li><li> Whole Program Analysis (WPA)
WPA is performed sequentially. The global call-graph is generated, and
a global analysis procedure makes transformation decisions. The global
call-graph is partitioned to facilitate parallel optimization during
the LTRANS stage. The results of the WPA stage are stored into new object
files which contain the partitions of the program expressed in the
intermediate language and the optimization decisions.
</li><li> Local transformations (LTRANS)
This stage executes in parallel. All the decisions made during the WPA
stage are implemented locally in each partitioned object file, and the
final object code is generated. Optimizations which cannot be decided
efficiently during the WPA stage may be performed on the local
call-graph partitions.
</li></ol>
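<p>The three stages above can be sketched as a small pipeline: a parallel
map, a sequential step, and a second parallel map. Everything here is an
assumption made for illustration: the function names, the toy
&ldquo;inline if small&rdquo; decision, the round-robin partitioning, and the use
of threads (the text describes parallelism driven by a generated
<code>Makefile</code>, not by threads).
</p>

```python
from concurrent.futures import ThreadPoolExecutor


def lgen(source):
    """LGEN: turn one file into a local call graph plus summary (body size)."""
    _name, functions = source
    return {fn: {"callees": list(callees), "size": size}
            for fn, (callees, size) in functions.items()}


def wpa(summaries, n_partitions):
    """WPA: build the global graph, make decisions, then partition."""
    global_graph = {}
    for summary in summaries:
        global_graph.update(summary)
    # Toy decision for this sketch: small functions are inline candidates.
    inline = {fn: info["size"] <= 2 for fn, info in global_graph.items()}
    names = sorted(global_graph)
    # Placeholder partitioning: round-robin over the sorted names.
    parts = [names[i::n_partitions] for i in range(n_partitions)]
    return [{"functions": part, "inline": {fn: inline[fn] for fn in part}}
            for part in parts if part]


def ltrans(partition):
    """LTRANS: apply the recorded decisions to one partition."""
    result = []
    for fn in partition["functions"]:
        kind = "inline" if partition["inline"][fn] else "out-of-line"
        result.append(fn + ":" + kind)
    return result


def whopr(sources, n_partitions=2):
    """Run LGEN in parallel, WPA sequentially, then LTRANS in parallel."""
    with ThreadPoolExecutor() as pool:
        summaries = list(pool.map(lgen, sources))
    partitions = wpa(summaries, n_partitions)
    with ThreadPoolExecutor() as pool:
        return list(pool.map(ltrans, partitions))
```

<p>Each partition carries its own copy of the decisions it needs, so the
LTRANS step never has to consult the global graph.
</p>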
<p>WHOPR can be seen as an extension of the usual LTO mode of
compilation. In LTO, WPA and LTRANS are executed within a single
execution of the compiler, after the whole program has been read into
memory.
</p>
<p>When compiling in WHOPR mode, the callgraph is partitioned during
the WPA stage. The whole program is split into a given number of
partitions of roughly the same size. The compiler tries to
minimize the number of references which cross partition boundaries.
The main advantage of WHOPR is that it allows the parallel execution of
LTRANS stages, which are the most time-consuming part of the
compilation process. Additionally, it avoids the need to load the
whole program into memory.
</p>
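<p>The two goals named above, partitions of roughly equal size and few
references across partition boundaries, can be illustrated with a greedy
heuristic. This is a sketch of the general idea only and is not the
algorithm GCC uses; <code>partition_callgraph</code> and
<code>crossing_edges</code> are hypothetical names.
</p>

```python
import math


def partition_callgraph(sizes, edges, k):
    """Greedily assign functions to k partitions.

    sizes maps a function name to its size; edges is a list of
    (caller, callee) pairs. Returns function name -> partition index.
    """
    capacity = math.ceil(sum(sizes.values()) / k)
    neighbors = {fn: set() for fn in sizes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    assignment, load = {}, [0] * k
    # Place large functions first; ties keep their original order.
    for fn in sorted(sizes, key=sizes.get, reverse=True):
        def score(p):
            fits = load[p] + sizes[fn] <= capacity
            affinity = sum(1 for n in neighbors[fn] if assignment.get(n) == p)
            # Prefer partitions with room, then placed neighbors, then low load.
            return (fits, affinity, -load[p])
        best = max(range(k), key=score)
        assignment[fn] = best
        load[best] += sizes[fn]
    return assignment


def crossing_edges(assignment, edges):
    """Count references whose endpoints land in different partitions."""
    return sum(1 for a, b in edges if assignment[a] != assignment[b])
```

<p>The capacity check keeps partitions balanced, while the affinity term
pulls callers and callees together, so fewer references cross a boundary.
</p>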
<hr>
<div class="header">
<p>
Next: <a href="LTO-object-file-layout.html#LTO-object-file-layout" accesskey="n" rel="next">LTO object file layout</a>, Up: <a href="LTO.html#LTO" accesskey="u" rel="up">LTO</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
</div>
</body>
</html>