|
- <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
- <html>
- <!-- Copyright (C) 1988-2020 Free Software Foundation, Inc.
-
- Permission is granted to copy, distribute and/or modify this document
- under the terms of the GNU Free Documentation License, Version 1.3 or
- any later version published by the Free Software Foundation; with the
- Invariant Sections being "Funding Free Software", the Front-Cover
- Texts being (a) (see below), and with the Back-Cover Texts being (b)
- (see below). A copy of the license is included in the section entitled
- "GNU Free Documentation License".
-
- (a) The FSF's Front-Cover Text is:
-
- A GNU Manual
-
- (b) The FSF's Back-Cover Text is:
-
- You have freedom to copy and modify this GNU Manual, like GNU
- software. Copies published by the Free Software Foundation raise
- funds for GNU development. -->
- <!-- Created by GNU Texinfo 6.5, http://www.gnu.org/software/texinfo/ -->
- <head>
- <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
- <title>LTO Overview (GNU Compiler Collection (GCC) Internals)</title>
-
- <meta name="description" content="LTO Overview (GNU Compiler Collection (GCC) Internals)">
- <meta name="keywords" content="LTO Overview (GNU Compiler Collection (GCC) Internals)">
- <meta name="resource-type" content="document">
- <meta name="distribution" content="global">
- <meta name="Generator" content="makeinfo">
- <link href="index.html#Top" rel="start" title="Top">
- <link href="Option-Index.html#Option-Index" rel="index" title="Option Index">
- <link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
- <link href="LTO.html#LTO" rel="up" title="LTO">
- <link href="LTO-object-file-layout.html#LTO-object-file-layout" rel="next" title="LTO object file layout">
- <link href="LTO.html#LTO" rel="prev" title="LTO">
- <style type="text/css">
- <!--
- a.summary-letter {text-decoration: none}
- blockquote.indentedblock {margin-right: 0em}
- blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
- blockquote.smallquotation {font-size: smaller}
- div.display {margin-left: 3.2em}
- div.example {margin-left: 3.2em}
- div.lisp {margin-left: 3.2em}
- div.smalldisplay {margin-left: 3.2em}
- div.smallexample {margin-left: 3.2em}
- div.smalllisp {margin-left: 3.2em}
- kbd {font-style: oblique}
- pre.display {font-family: inherit}
- pre.format {font-family: inherit}
- pre.menu-comment {font-family: serif}
- pre.menu-preformatted {font-family: serif}
- pre.smalldisplay {font-family: inherit; font-size: smaller}
- pre.smallexample {font-size: smaller}
- pre.smallformat {font-family: inherit; font-size: smaller}
- pre.smalllisp {font-size: smaller}
- span.nolinebreak {white-space: nowrap}
- span.roman {font-family: initial; font-weight: normal}
- span.sansserif {font-family: sans-serif; font-weight: normal}
- ul.no-bullet {list-style: none}
- -->
- </style>
-
-
- </head>
-
- <body lang="en">
- <a name="LTO-Overview"></a>
- <div class="header">
- <p>
- Next: <a href="LTO-object-file-layout.html#LTO-object-file-layout" accesskey="n" rel="next">LTO object file layout</a>, Up: <a href="LTO.html#LTO" accesskey="u" rel="up">LTO</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
- </div>
- <hr>
- <a name="Design-Overview"></a>
- <h3 class="section">25.1 Design Overview</h3>
-
- <p>Link time optimization is implemented as a GCC front end for a
- bytecode representation of GIMPLE that is emitted in special sections
- of <code>.o</code> files. Currently, LTO support is enabled in most
- ELF-based systems, as well as darwin, cygwin and mingw systems.
- </p>
- <p>Since GIMPLE bytecode is saved alongside final object code, object
- files generated with LTO support are larger than regular object files.
- This “fat” object format makes it easy to integrate LTO into
- existing build systems, as one can, for instance, produce archives of
- the files. Additionally, one might be able to ship one set of fat
- objects which could be used both for development and the production of
- optimized builds. A, perhaps surprising, side effect of this feature
- is that any mistake in the toolchain leads to LTO information not
- being used (e.g. an older <code>libtool</code> calling <code>ld</code> directly).
- This is both an advantage, as the system is more robust, and a
- disadvantage, as the user is not informed that the optimization has
- been disabled.
- </p>
- <p>The current implementation only produces “fat” objects, effectively
- doubling compilation time and increasing file sizes up to 5x the
- original size. This hides the problem that some tools, such as
- <code>ar</code> and <code>nm</code>, need to understand symbol tables of LTO
- sections. These tools were extended to use the plugin infrastructure,
- and with these problems solved, GCC will also support “slim” objects
- consisting of the intermediate code alone.
- </p>
- <p>At the highest level, LTO splits the compiler in two. The first half
- (the “writer”) produces a streaming representation of all the
- internal data structures needed to optimize and generate code. This
- includes declarations, types, the callgraph and the GIMPLE representation
- of function bodies.
- </p>
- <p>When <samp>-flto</samp> is given during compilation of a source file, the
- pass manager executes all the passes in <code>all_lto_gen_passes</code>.
- Currently, this phase is composed of two IPA passes:
- </p>
- <ul>
- <li> <code>pass_ipa_lto_gimple_out</code>
- This pass executes the function <code>lto_output</code> in
- <samp>lto-streamer-out.c</samp>, which traverses the call graph encoding
- every reachable declaration, type and function. This generates a
- memory representation of all the file sections described below.
-
- </li><li> <code>pass_ipa_lto_finish_out</code>
- This pass executes the function <code>produce_asm_for_decls</code> in
- <samp>lto-streamer-out.c</samp>, which takes the memory image built in the
- previous pass and encodes it in the corresponding ELF file sections.
- </li></ul>
-
- <p>The second half of LTO support is the “reader”. This is implemented
- as the GCC front end <samp>lto1</samp> in <samp>lto/lto.c</samp>. When
- <samp>collect2</samp> detects a link set of <code>.o</code>/<code>.a</code> files with
- LTO information and the <samp>-flto</samp> is enabled, it invokes
- <samp>lto1</samp> which reads the set of files and aggregates them into a
- single translation unit for optimization. The main entry point for
- the reader is <samp>lto/lto.c</samp>:<code>lto_main</code>.
- </p>
- <a name="LTO-modes-of-operation"></a>
- <h4 class="subsection">25.1.1 LTO modes of operation</h4>
-
- <p>One of the main goals of the GCC link-time infrastructure was to allow
- effective compilation of large programs. For this reason GCC implements two
- link-time compilation modes.
- </p>
- <ol>
- <li> <em>LTO mode</em>, in which the whole program is read into the
- compiler at link-time and optimized in a similar way as if it
- were a single source-level compilation unit.
-
- </li><li> <em>WHOPR or partitioned mode</em>, designed to utilize multiple
- CPUs and/or a distributed compilation environment to quickly link
- large applications. WHOPR stands for WHOle Program optimizeR (not to
- be confused with the semantics of <samp>-fwhole-program</samp>). It
- partitions the aggregated callgraph from many different <code>.o</code>
- files and distributes the compilation of the sub-graphs to different
- CPUs.
-
- <p>Note that distributed compilation is not implemented yet, but since
- the parallelism is facilitated via generating a <code>Makefile</code>, it
- would be easy to implement.
- </p></li></ol>
-
- <p>WHOPR splits LTO into three main stages:
- </p><ol>
- <li> Local generation (LGEN)
- This stage executes in parallel. Every file in the program is compiled
- into the intermediate language and packaged together with the local
- call-graph and summary information. This stage is the same for both
- the LTO and WHOPR compilation mode.
-
- </li><li> Whole Program Analysis (WPA)
- WPA is performed sequentially. The global call-graph is generated, and
- a global analysis procedure makes transformation decisions. The global
- call-graph is partitioned to facilitate parallel optimization during
- phase 3. The results of the WPA stage are stored into new object files
- which contain the partitions of program expressed in the intermediate
- language and the optimization decisions.
-
- </li><li> Local transformations (LTRANS)
- This stage executes in parallel. All the decisions made during phase 2
- are implemented locally in each partitioned object file, and the final
- object code is generated. Optimizations which cannot be decided
- efficiently during the phase 2 may be performed on the local
- call-graph partitions.
- </li></ol>
-
- <p>WHOPR can be seen as an extension of the usual LTO mode of
- compilation. In LTO, WPA and LTRANS are executed within a single
- execution of the compiler, after the whole program has been read into
- memory.
- </p>
- <p>When compiling in WHOPR mode, the callgraph is partitioned during
- the WPA stage. The whole program is split into a given number of
- partitions of roughly the same size. The compiler tries to
- minimize the number of references which cross partition boundaries.
- The main advantage of WHOPR is to allow the parallel execution of
- LTRANS stages, which are the most time-consuming part of the
- compilation process. Additionally, it avoids the need to load the
- whole program into memory.
- </p>
-
- <hr>
- <div class="header">
- <p>
- Next: <a href="LTO-object-file-layout.html#LTO-object-file-layout" accesskey="n" rel="next">LTO object file layout</a>, Up: <a href="LTO.html#LTO" accesskey="u" rel="up">LTO</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
- </div>
-
-
-
- </body>
- </html>
|