visus
/
toolchain-gcc-arm-embedded

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 1988-2020 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "Funding Free Software", the Front-Cover
Texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below).  A copy of the license is included in the section entitled
"GNU Free Documentation License".

(a) The FSF's Front-Cover Text is:

A GNU Manual

(b) The FSF's Back-Cover Text is:

You have freedom to copy and modify this GNU Manual, like GNU
     software.  Copies published by the Free Software Foundation raise
     funds for GNU development. -->
<!-- Created by GNU Texinfo 6.5, http://www.gnu.org/software/texinfo/ -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>Insn Splitting (GNU Compiler Collection (GCC) Internals)</title>

<meta name="description" content="Insn Splitting (GNU Compiler Collection (GCC) Internals)">
<meta name="keywords" content="Insn Splitting (GNU Compiler Collection (GCC) Internals)">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<link href="index.html#Top" rel="start" title="Top">
<link href="Option-Index.html#Option-Index" rel="index" title="Option Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="Machine-Desc.html#Machine-Desc" rel="up" title="Machine Desc">
<link href="Including-Patterns.html#Including-Patterns" rel="next" title="Including Patterns">
<link href="Expander-Definitions.html#Expander-Definitions" rel="prev" title="Expander Definitions">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.indentedblock {margin-right: 0em}
blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smalllisp {margin-left: 3.2em}
kbd {font-style: oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nolinebreak {white-space: nowrap}
span.roman {font-family: initial; font-weight: normal}
span.sansserif {font-family: sans-serif; font-weight: normal}
ul.no-bullet {list-style: none}
-->
</style>


</head>

<body lang="en">
<a name="Insn-Splitting"></a>
<div class="header">
<p>
Next: <a href="Including-Patterns.html#Including-Patterns" accesskey="n" rel="next">Including Patterns</a>, Previous: <a href="Expander-Definitions.html#Expander-Definitions" accesskey="p" rel="prev">Expander Definitions</a>, Up: <a href="Machine-Desc.html#Machine-Desc" accesskey="u" rel="up">Machine Desc</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="Defining-How-to-Split-Instructions"></a>
<h3 class="section">17.16 Defining How to Split Instructions</h3>
<a name="index-insn-splitting"></a>
<a name="index-instruction-splitting"></a>
<a name="index-splitting-instructions"></a>

<p>There are two cases where you should specify how to split a pattern
into multiple insns.  On machines that have instructions requiring
delay slots (see <a href="Delay-Slots.html#Delay-Slots">Delay Slots</a>) or that have instructions whose
output is not available for multiple cycles (see <a href="Processor-pipeline-description.html#Processor-pipeline-description">Processor pipeline description</a>), the compiler phases that optimize these cases need to
be able to move insns into one-instruction delay slots.  However, some
insns may generate more than one machine instruction.  These insns
cannot be placed into a delay slot.
</p>
<p>Often you can rewrite the single insn as a list of individual insns,
each corresponding to one machine instruction.  The disadvantage of
doing so is that it will cause the compilation to be slower and require
more space.  If the resulting insns are too complex, it may also
suppress some optimizations.  The compiler splits the insn if there is a
reason to believe that it might improve instruction or delay slot
scheduling.
</p>
<p>The insn combiner phase also splits putative insns.  If three insns are
merged into one insn with a complex expression that cannot be matched by
some <code>define_insn</code> pattern, the combiner phase attempts to split
the complex pattern into two insns that are recognized.  Usually it can
break the complex pattern into two patterns by splitting out some
subexpression.  However, in some other cases, such as performing an
addition of a large constant in two insns on a RISC machine, the way to
split the addition into two insns is machine-dependent.
</p>
<a name="index-define_005fsplit"></a>
<p>The <code>define_split</code> definition tells the compiler how to split a
complex insn into several simpler insns.  It looks like this:
</p>
<div class="smallexample">
<pre class="smallexample">(define_split
  [<var>insn-pattern</var>]
  &quot;<var>condition</var>&quot;
  [<var>new-insn-pattern-1</var>
   <var>new-insn-pattern-2</var>
   &hellip;]
  &quot;<var>preparation-statements</var>&quot;)
</pre></div>

<p><var>insn-pattern</var> is a pattern that needs to be split and
<var>condition</var> is the final condition to be tested, as in a
<code>define_insn</code>.  When an insn matching <var>insn-pattern</var> and
satisfying <var>condition</var> is found, it is replaced in the insn list
with the insns given by <var>new-insn-pattern-1</var>,
<var>new-insn-pattern-2</var>, etc.
</p>
<p>The <var>preparation-statements</var> are similar to those statements that
are specified for <code>define_expand</code> (see <a href="Expander-Definitions.html#Expander-Definitions">Expander Definitions</a>)
and are executed before the new RTL is generated to prepare for the
generated code or emit some insns whose pattern is not fixed.  Unlike
those in <code>define_expand</code>, however, these statements must not
generate any new pseudo-registers.  Once reload has completed, they also
must not allocate any space in the stack frame.
</p>
<p>There are two special macros defined for use in the preparation statements:
<code>DONE</code> and <code>FAIL</code>.  Use them with a following semicolon,
as a statement.
</p>
<dl compact="compact">
<dd>
<a name="index-DONE-1"></a>
</dd>
<dt><code>DONE</code></dt>
<dd><p>Use the <code>DONE</code> macro to end RTL generation for the splitter.  The
only RTL insns generated as replacement for the matched input insn will
be those already emitted by explicit calls to <code>emit_insn</code> within
the preparation statements; the replacement pattern is not used.
</p>
<a name="index-FAIL-1"></a>
</dd>
<dt><code>FAIL</code></dt>
<dd><p>Make the <code>define_split</code> fail on this occasion.  When a <code>define_split</code>
fails, it means that the splitter was not truly available for the inputs
it was given, and the input insn will not be split.
</p></dd>
</dl>

<p>If the preparation falls through (invokes neither <code>DONE</code> nor
<code>FAIL</code>), then the <code>define_split</code> uses the replacement
template.
</p>
<p>Patterns are matched against <var>insn-pattern</var> in two different
circumstances.  If an insn needs to be split for delay slot scheduling
or insn scheduling, the insn is already known to be valid, which means
that it must have been matched by some <code>define_insn</code> and, if
<code>reload_completed</code> is nonzero, is known to satisfy the constraints
of that <code>define_insn</code>.  In that case, the new insn patterns must
also be insns that are matched by some <code>define_insn</code> and, if
<code>reload_completed</code> is nonzero, must also satisfy the constraints
of those definitions.
</p>
<p>As an example of this usage of <code>define_split</code>, consider the following
example from <samp>a29k.md</samp>, which splits a <code>sign_extend</code> from
<code>HImode</code> to <code>SImode</code> into a pair of shift insns:
</p>
<div class="smallexample">
<pre class="smallexample">(define_split
  [(set (match_operand:SI 0 &quot;gen_reg_operand&quot; &quot;&quot;)
        (sign_extend:SI (match_operand:HI 1 &quot;gen_reg_operand&quot; &quot;&quot;)))]
  &quot;&quot;
  [(set (match_dup 0)
        (ashift:SI (match_dup 1)
                   (const_int 16)))
   (set (match_dup 0)
        (ashiftrt:SI (match_dup 0)
                     (const_int 16)))]
  &quot;
{ operands[1] = gen_lowpart (SImode, operands[1]); }&quot;)
</pre></div>

<p>When the combiner phase tries to split an insn pattern, it is always the
case that the pattern is <em>not</em> matched by any <code>define_insn</code>.
The combiner pass first tries to split a single <code>set</code> expression
and then the same <code>set</code> expression inside a <code>parallel</code>, but
followed by a <code>clobber</code> of a pseudo-reg to use as a scratch
register.  In these cases, the combiner expects exactly one or two new insn
patterns to be generated.  It will verify that these patterns match some
<code>define_insn</code> definitions, so you need not do this test in the
<code>define_split</code> (of course, there is no point in writing a
<code>define_split</code> that will never produce insns that match).
</p>
<p>Here is an example of this use of <code>define_split</code>, taken from
<samp>rs6000.md</samp>:
</p>
<div class="smallexample">
<pre class="smallexample">(define_split
  [(set (match_operand:SI 0 &quot;gen_reg_operand&quot; &quot;&quot;)
        (plus:SI (match_operand:SI 1 &quot;gen_reg_operand&quot; &quot;&quot;)
                 (match_operand:SI 2 &quot;non_add_cint_operand&quot; &quot;&quot;)))]
  &quot;&quot;
  [(set (match_dup 0) (plus:SI (match_dup 1) (match_dup 3)))
   (set (match_dup 0) (plus:SI (match_dup 0) (match_dup 4)))]
&quot;
{
  int low = INTVAL (operands[2]) &amp; 0xffff;
  int high = (unsigned) INTVAL (operands[2]) &gt;&gt; 16;

  if (low &amp; 0x8000)
    high++, low |= 0xffff0000;

  operands[3] = GEN_INT (high &lt;&lt; 16);
  operands[4] = GEN_INT (low);
}&quot;)
</pre></div>

<p>Here the predicate <code>non_add_cint_operand</code> matches any
<code>const_int</code> that is <em>not</em> a valid operand of a single add
insn.  The add with the smaller displacement is written so that it
can be substituted into the address of a subsequent operation.
</p>
<p>An example that uses a scratch register, from the same file, generates
an equality comparison of a register and a large constant:
</p>
<div class="smallexample">
<pre class="smallexample">(define_split
  [(set (match_operand:CC 0 &quot;cc_reg_operand&quot; &quot;&quot;)
        (compare:CC (match_operand:SI 1 &quot;gen_reg_operand&quot; &quot;&quot;)
                    (match_operand:SI 2 &quot;non_short_cint_operand&quot; &quot;&quot;)))
   (clobber (match_operand:SI 3 &quot;gen_reg_operand&quot; &quot;&quot;))]
  &quot;find_single_use (operands[0], insn, 0)
   &amp;&amp; (GET_CODE (*find_single_use (operands[0], insn, 0)) == EQ
       || GET_CODE (*find_single_use (operands[0], insn, 0)) == NE)&quot;
  [(set (match_dup 3) (xor:SI (match_dup 1) (match_dup 4)))
   (set (match_dup 0) (compare:CC (match_dup 3) (match_dup 5)))]
  &quot;
{
  /* <span class="roman">Get the constant we are comparing against, C, and see what it
     looks like sign-extended to 16 bits.  Then see what constant
     could be XOR&rsquo;ed with C to get the sign-extended value.</span>  */

  int c = INTVAL (operands[2]);
  int sextc = (c &lt;&lt; 16) &gt;&gt; 16;
  int xorv = c ^ sextc;

  operands[4] = GEN_INT (xorv);
  operands[5] = GEN_INT (sextc);
}&quot;)
</pre></div>

<p>To avoid confusion, don&rsquo;t write a single <code>define_split</code> that
accepts some insns that match some <code>define_insn</code> as well as some
insns that don&rsquo;t.  Instead, write two separate <code>define_split</code>
definitions, one for the insns that are valid and one for the insns that
are not valid.
</p>
<p>The splitter is allowed to split jump instructions into sequence of
jumps or create new jumps in while splitting non-jump instructions.  As
the control flow graph and branch prediction information needs to be updated,
several restriction apply.
</p>
<p>Splitting of jump instruction into sequence that over by another jump
instruction is always valid, as compiler expect identical behavior of new
jump.  When new sequence contains multiple jump instructions or new labels,
more assistance is needed.  Splitter is required to create only unconditional
jumps, or simple conditional jump instructions.  Additionally it must attach a
<code>REG_BR_PROB</code> note to each conditional jump.  A global variable
<code>split_branch_probability</code> holds the probability of the original branch in case
it was a simple conditional jump, -1 otherwise.  To simplify
recomputing of edge frequencies, the new sequence is required to have only
forward jumps to the newly created labels.
</p>
<a name="index-define_005finsn_005fand_005fsplit"></a>
<p>For the common case where the pattern of a define_split exactly matches the
pattern of a define_insn, use <code>define_insn_and_split</code>.  It looks like
this:
</p>
<div class="smallexample">
<pre class="smallexample">(define_insn_and_split
  [<var>insn-pattern</var>]
  &quot;<var>condition</var>&quot;
  &quot;<var>output-template</var>&quot;
  &quot;<var>split-condition</var>&quot;
  [<var>new-insn-pattern-1</var>
   <var>new-insn-pattern-2</var>
   &hellip;]
  &quot;<var>preparation-statements</var>&quot;
  [<var>insn-attributes</var>])

</pre></div>

<p><var>insn-pattern</var>, <var>condition</var>, <var>output-template</var>, and
<var>insn-attributes</var> are used as in <code>define_insn</code>.  The
<var>new-insn-pattern</var> vector and the <var>preparation-statements</var> are used as
in a <code>define_split</code>.  The <var>split-condition</var> is also used as in
<code>define_split</code>, with the additional behavior that if the condition starts
with &lsquo;<samp>&amp;&amp;</samp>&rsquo;, the condition used for the split will be the constructed as a
logical &ldquo;and&rdquo; of the split condition with the insn condition.  For example,
from i386.md:
</p>
<div class="smallexample">
<pre class="smallexample">(define_insn_and_split &quot;zero_extendhisi2_and&quot;
  [(set (match_operand:SI 0 &quot;register_operand&quot; &quot;=r&quot;)
     (zero_extend:SI (match_operand:HI 1 &quot;register_operand&quot; &quot;0&quot;)))
   (clobber (reg:CC 17))]
  &quot;TARGET_ZERO_EXTEND_WITH_AND &amp;&amp; !optimize_size&quot;
  &quot;#&quot;
  &quot;&amp;&amp; reload_completed&quot;
  [(parallel [(set (match_dup 0)
                   (and:SI (match_dup 0) (const_int 65535)))
              (clobber (reg:CC 17))])]
  &quot;&quot;
  [(set_attr &quot;type&quot; &quot;alu1&quot;)])

</pre></div>

<p>In this case, the actual split condition will be
&lsquo;<samp>TARGET_ZERO_EXTEND_WITH_AND &amp;&amp; !optimize_size &amp;&amp; reload_completed</samp>&rsquo;.
</p>
<p>The <code>define_insn_and_split</code> construction provides exactly the same
functionality as two separate <code>define_insn</code> and <code>define_split</code>
patterns.  It exists for compactness, and as a maintenance tool to prevent
having to ensure the two patterns&rsquo; templates match.
</p>
<a name="index-define_005finsn_005fand_005frewrite"></a>
<p>It is sometimes useful to have a <code>define_insn_and_split</code>
that replaces specific operands of an instruction but leaves the
rest of the instruction pattern unchanged.  You can do this directly
with a <code>define_insn_and_split</code>, but it requires a
<var>new-insn-pattern-1</var> that repeats most of the original <var>insn-pattern</var>.
There is also the complication that an implicit <code>parallel</code> in
<var>insn-pattern</var> must become an explicit <code>parallel</code> in
<var>new-insn-pattern-1</var>, which is easy to overlook.
A simpler alternative is to use <code>define_insn_and_rewrite</code>, which
is a form of <code>define_insn_and_split</code> that automatically generates
<var>new-insn-pattern-1</var> by replacing each <code>match_operand</code>
in <var>insn-pattern</var> with a corresponding <code>match_dup</code>, and each
<code>match_operator</code> in the pattern with a corresponding <code>match_op_dup</code>.
The arguments are otherwise identical to <code>define_insn_and_split</code>:
</p>
<div class="smallexample">
<pre class="smallexample">(define_insn_and_rewrite
  [<var>insn-pattern</var>]
  &quot;<var>condition</var>&quot;
  &quot;<var>output-template</var>&quot;
  &quot;<var>split-condition</var>&quot;
  &quot;<var>preparation-statements</var>&quot;
  [<var>insn-attributes</var>])
</pre></div>

<p>The <code>match_dup</code>s and <code>match_op_dup</code>s in the new
instruction pattern use any new operand values that the
<var>preparation-statements</var> store in the <code>operands</code> array,
as for a normal <code>define_insn_and_split</code>.  <var>preparation-statements</var>
can also emit additional instructions before the new instruction.
They can even emit an entirely different sequence of instructions and
use <code>DONE</code> to avoid emitting a new form of the original
instruction.
</p>
<p>The split in a <code>define_insn_and_rewrite</code> is only intended
to apply to existing instructions that match <var>insn-pattern</var>.
<var>split-condition</var> must therefore start with <code>&amp;&amp;</code>,
so that the split condition applies on top of <var>condition</var>.
</p>
<p>Here is an example from the AArch64 SVE port, in which operand 1 is
known to be equivalent to an all-true constant and isn&rsquo;t used by the
output template:
</p>
<div class="smallexample">
<pre class="smallexample">(define_insn_and_rewrite &quot;*while_ult&lt;GPI:mode&gt;&lt;PRED_ALL:mode&gt;_cc&quot;
  [(set (reg:CC CC_REGNUM)
        (compare:CC
          (unspec:SI [(match_operand:PRED_ALL 1)
                      (unspec:PRED_ALL
                        [(match_operand:GPI 2 &quot;aarch64_reg_or_zero&quot; &quot;rZ&quot;)
                         (match_operand:GPI 3 &quot;aarch64_reg_or_zero&quot; &quot;rZ&quot;)]
                        UNSPEC_WHILE_LO)]
                     UNSPEC_PTEST_PTRUE)
          (const_int 0)))
   (set (match_operand:PRED_ALL 0 &quot;register_operand&quot; &quot;=Upa&quot;)
        (unspec:PRED_ALL [(match_dup 2)
                          (match_dup 3)]
                         UNSPEC_WHILE_LO))]
  &quot;TARGET_SVE&quot;
  &quot;whilelo\t%0.&lt;PRED_ALL:Vetype&gt;, %&lt;w&gt;2, %&lt;w&gt;3&quot;
  ;; Force the compiler to drop the unused predicate operand, so that we
  ;; don't have an unnecessary PTRUE.
  &quot;&amp;&amp; !CONSTANT_P (operands[1])&quot;
  {
    operands[1] = CONSTM1_RTX (&lt;MODE&gt;mode);
  }
)
</pre></div>

<p>The splitter in this case simply replaces operand 1 with the constant
value that it is known to have.  The equivalent <code>define_insn_and_split</code>
would be:
</p>
<div class="smallexample">
<pre class="smallexample">(define_insn_and_split &quot;*while_ult&lt;GPI:mode&gt;&lt;PRED_ALL:mode&gt;_cc&quot;
  [(set (reg:CC CC_REGNUM)
        (compare:CC
          (unspec:SI [(match_operand:PRED_ALL 1)
                      (unspec:PRED_ALL
                        [(match_operand:GPI 2 &quot;aarch64_reg_or_zero&quot; &quot;rZ&quot;)
                         (match_operand:GPI 3 &quot;aarch64_reg_or_zero&quot; &quot;rZ&quot;)]
                        UNSPEC_WHILE_LO)]
                     UNSPEC_PTEST_PTRUE)
          (const_int 0)))
   (set (match_operand:PRED_ALL 0 &quot;register_operand&quot; &quot;=Upa&quot;)
        (unspec:PRED_ALL [(match_dup 2)
                          (match_dup 3)]
                         UNSPEC_WHILE_LO))]
  &quot;TARGET_SVE&quot;
  &quot;whilelo\t%0.&lt;PRED_ALL:Vetype&gt;, %&lt;w&gt;2, %&lt;w&gt;3&quot;
  ;; Force the compiler to drop the unused predicate operand, so that we
  ;; don't have an unnecessary PTRUE.
  &quot;&amp;&amp; !CONSTANT_P (operands[1])&quot;
  [(parallel
     [(set (reg:CC CC_REGNUM)
           (compare:CC
             (unspec:SI [(match_dup 1)
                         (unspec:PRED_ALL [(match_dup 2)
                                           (match_dup 3)]
                                          UNSPEC_WHILE_LO)]
                        UNSPEC_PTEST_PTRUE)
             (const_int 0)))
      (set (match_dup 0)
           (unspec:PRED_ALL [(match_dup 2)
                             (match_dup 3)]
                            UNSPEC_WHILE_LO))])]
  {
    operands[1] = CONSTM1_RTX (&lt;MODE&gt;mode);
  }
)
</pre></div>

<hr>
<div class="header">
<p>
Next: <a href="Including-Patterns.html#Including-Patterns" accesskey="n" rel="next">Including Patterns</a>, Previous: <a href="Expander-Definitions.html#Expander-Definitions" accesskey="p" rel="prev">Expander Definitions</a>, Up: <a href="Machine-Desc.html#Machine-Desc" accesskey="u" rel="up">Machine Desc</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
</div>


</body>
</html>