  1. <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
  2. <html>
  3. <!-- Copyright (C) 1988-2020 Free Software Foundation, Inc.
  4. Permission is granted to copy, distribute and/or modify this document
  5. under the terms of the GNU Free Documentation License, Version 1.3 or
  6. any later version published by the Free Software Foundation; with the
  7. Invariant Sections being "Funding Free Software", the Front-Cover
  8. Texts being (a) (see below), and with the Back-Cover Texts being (b)
  9. (see below). A copy of the license is included in the section entitled
  10. "GNU Free Documentation License".
  11. (a) The FSF's Front-Cover Text is:
  12. A GNU Manual
  13. (b) The FSF's Back-Cover Text is:
  14. You have freedom to copy and modify this GNU Manual, like GNU
  15. software. Copies published by the Free Software Foundation raise
  16. funds for GNU development. -->
  17. <!-- Created by GNU Texinfo 6.5, http://www.gnu.org/software/texinfo/ -->
  18. <head>
  19. <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  20. <title>x86 Options (Using the GNU Compiler Collection (GCC))</title>
  21. <meta name="description" content="x86 Options (Using the GNU Compiler Collection (GCC))">
  22. <meta name="keywords" content="x86 Options (Using the GNU Compiler Collection (GCC))">
  23. <meta name="resource-type" content="document">
  24. <meta name="distribution" content="global">
  25. <meta name="Generator" content="makeinfo">
  26. <link href="index.html#Top" rel="start" title="Top">
  27. <link href="Option-Index.html#Option-Index" rel="index" title="Option Index">
  28. <link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
  29. <link href="Submodel-Options.html#Submodel-Options" rel="up" title="Submodel Options">
  30. <link href="x86-Windows-Options.html#x86-Windows-Options" rel="next" title="x86 Windows Options">
  31. <link href="VxWorks-Options.html#VxWorks-Options" rel="prev" title="VxWorks Options">
  32. <style type="text/css">
  33. <!--
  34. a.summary-letter {text-decoration: none}
  35. blockquote.indentedblock {margin-right: 0em}
  36. blockquote.smallindentedblock {margin-right: 0em; font-size: smaller}
  37. blockquote.smallquotation {font-size: smaller}
  38. div.display {margin-left: 3.2em}
  39. div.example {margin-left: 3.2em}
  40. div.lisp {margin-left: 3.2em}
  41. div.smalldisplay {margin-left: 3.2em}
  42. div.smallexample {margin-left: 3.2em}
  43. div.smalllisp {margin-left: 3.2em}
  44. kbd {font-style: oblique}
  45. pre.display {font-family: inherit}
  46. pre.format {font-family: inherit}
  47. pre.menu-comment {font-family: serif}
  48. pre.menu-preformatted {font-family: serif}
  49. pre.smalldisplay {font-family: inherit; font-size: smaller}
  50. pre.smallexample {font-size: smaller}
  51. pre.smallformat {font-family: inherit; font-size: smaller}
  52. pre.smalllisp {font-size: smaller}
  53. span.nolinebreak {white-space: nowrap}
  54. span.roman {font-family: initial; font-weight: normal}
  55. span.sansserif {font-family: sans-serif; font-weight: normal}
  56. ul.no-bullet {list-style: none}
  57. -->
  58. </style>
  59. </head>
  60. <body lang="en">
  61. <a name="x86-Options"></a>
  62. <div class="header">
  63. <p>
  64. Next: <a href="x86-Windows-Options.html#x86-Windows-Options" accesskey="n" rel="next">x86 Windows Options</a>, Previous: <a href="VxWorks-Options.html#VxWorks-Options" accesskey="p" rel="prev">VxWorks Options</a>, Up: <a href="Submodel-Options.html#Submodel-Options" accesskey="u" rel="up">Submodel Options</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
  65. </div>
  66. <hr>
  67. <a name="x86-Options-1"></a>
  68. <h4 class="subsection">3.19.59 x86 Options</h4>
  69. <a name="index-x86-Options"></a>
  70. <p>These &lsquo;<samp>-m</samp>&rsquo; options are defined for the x86 family of computers.
  71. </p>
  72. <dl compact="compact">
  73. <dt><code>-march=<var>cpu-type</var></code></dt>
  74. <dd><a name="index-march-14"></a>
  75. <p>Generate instructions for the machine type <var>cpu-type</var>. In contrast to
  76. <samp>-mtune=<var>cpu-type</var></samp>, which merely tunes the generated code
  77. for the specified <var>cpu-type</var>, <samp>-march=<var>cpu-type</var></samp> allows GCC
  78. to generate code that may not run at all on processors other than the one
  79. indicated. Specifying <samp>-march=<var>cpu-type</var></samp> implies
  80. <samp>-mtune=<var>cpu-type</var></samp>.
  81. </p>
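<p>The difference between the instruction set a program was compiled for and the one the running processor offers can be observed from C. The following is a minimal sketch, not part of the original manual text: the helper names <code>built_for_avx2</code>, <code>cpu_can_run_avx2</code> and <code>report_avx2</code> are hypothetical, and AVX2 is used only as an example feature. It assumes GCC's predefined <code>__AVX2__</code> macro, which is defined when the selected <samp>-march</samp> (or <samp>-mavx2</samp>) enables AVX2, and the x86 built-ins <code>__builtin_cpu_init</code> and <code>__builtin_cpu_supports</code>, which query the processor the program is actually running on.
</p>

```c
#include <stdio.h>

/* Returns 1 if this translation unit was compiled with AVX2 enabled,
   for example through -march=haswell or -mavx2; 0 otherwise.
   (Hypothetical helper name, for illustration only.) */
int built_for_avx2(void)
{
#ifdef __AVX2__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the processor running the program reports AVX2 support.
   On non-x86 targets, or with compilers lacking the built-ins, it
   conservatively returns 0. (Hypothetical helper name.) */
int cpu_can_run_avx2(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? 1 : 0;
#else
    return 0;
#endif
}

/* Prints both facts; -mtune alone changes neither of them. */
void report_avx2(void)
{
    printf("compiled for AVX2: %d, CPU supports AVX2: %d\n",
           built_for_avx2(), cpu_can_run_avx2());
}
```

<p>Under these assumptions, a build with <samp>-march=haswell</samp> makes <code>built_for_avx2</code> return 1 on any machine, whereas a build with only <samp>-mtune=haswell</samp> leaves it at whatever the default machine type implies.
</p>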
  82. <p>The choices for <var>cpu-type</var> are:
  83. </p>
  84. <dl compact="compact">
  85. <dt>&lsquo;<samp>native</samp>&rsquo;</dt>
  86. <dd><p>This selects the CPU to generate code for at compilation time by determining
  87. the processor type of the compiling machine. Using <samp>-march=native</samp>
  88. enables all instruction subsets supported by the local machine (hence
  89. the result might not run on different machines). Using <samp>-mtune=native</samp>
  90. produces code optimized for the local machine under the constraints
  91. of the selected instruction set.
  92. </p>
  93. </dd>
  94. <dt>&lsquo;<samp>x86-64</samp>&rsquo;</dt>
  95. <dd><p>A generic CPU with 64-bit extensions.
  96. </p>
  97. </dd>
  98. <dt>&lsquo;<samp>i386</samp>&rsquo;</dt>
  99. <dd><p>Original Intel i386 CPU.
  100. </p>
  101. </dd>
  102. <dt>&lsquo;<samp>i486</samp>&rsquo;</dt>
  103. <dd><p>Intel i486 CPU. (No scheduling is implemented for this chip.)
  104. </p>
  105. </dd>
  106. <dt>&lsquo;<samp>i586</samp>&rsquo;</dt>
  107. <dt>&lsquo;<samp>pentium</samp>&rsquo;</dt>
  108. <dd><p>Intel Pentium CPU with no MMX support.
  109. </p>
  110. </dd>
  111. <dt>&lsquo;<samp>lakemont</samp>&rsquo;</dt>
  112. <dd><p>Intel Lakemont MCU, based on Intel Pentium CPU.
  113. </p>
  114. </dd>
  115. <dt>&lsquo;<samp>pentium-mmx</samp>&rsquo;</dt>
  116. <dd><p>Intel Pentium MMX CPU, based on Pentium core with MMX instruction set support.
  117. </p>
  118. </dd>
  119. <dt>&lsquo;<samp>pentiumpro</samp>&rsquo;</dt>
  120. <dd><p>Intel Pentium Pro CPU.
  121. </p>
  122. </dd>
  123. <dt>&lsquo;<samp>i686</samp>&rsquo;</dt>
  124. <dd><p>When used with <samp>-march</samp>, the Pentium Pro
  125. instruction set is used, so the code runs on all i686 family chips.
  126. When used with <samp>-mtune</samp>, it has the same meaning as &lsquo;<samp>generic</samp>&rsquo;.
  127. </p>
  128. </dd>
  129. <dt>&lsquo;<samp>pentium2</samp>&rsquo;</dt>
  130. <dd><p>Intel Pentium II CPU, based on Pentium Pro core with MMX instruction set
  131. support.
  132. </p>
  133. </dd>
  134. <dt>&lsquo;<samp>pentium3</samp>&rsquo;</dt>
  135. <dt>&lsquo;<samp>pentium3m</samp>&rsquo;</dt>
  136. <dd><p>Intel Pentium III CPU, based on Pentium Pro core with MMX and SSE instruction
  137. set support.
  138. </p>
  139. </dd>
  140. <dt>&lsquo;<samp>pentium-m</samp>&rsquo;</dt>
  141. <dd><p>Intel Pentium M; low-power version of Intel Pentium III CPU
  142. with MMX, SSE and SSE2 instruction set support. Used by Centrino notebooks.
  143. </p>
  144. </dd>
  145. <dt>&lsquo;<samp>pentium4</samp>&rsquo;</dt>
  146. <dt>&lsquo;<samp>pentium4m</samp>&rsquo;</dt>
  147. <dd><p>Intel Pentium 4 CPU with MMX, SSE and SSE2 instruction set support.
  148. </p>
  149. </dd>
  150. <dt>&lsquo;<samp>prescott</samp>&rsquo;</dt>
  151. <dd><p>Improved version of Intel Pentium 4 CPU with MMX, SSE, SSE2 and SSE3 instruction
  152. set support.
  153. </p>
  154. </dd>
  155. <dt>&lsquo;<samp>nocona</samp>&rsquo;</dt>
  156. <dd><p>Improved version of Intel Pentium 4 CPU with 64-bit extensions, MMX, SSE,
  157. SSE2 and SSE3 instruction set support.
  158. </p>
  159. </dd>
  160. <dt>&lsquo;<samp>core2</samp>&rsquo;</dt>
  161. <dd><p>Intel Core 2 CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3 and SSSE3
  162. instruction set support.
  163. </p>
  164. </dd>
  165. <dt>&lsquo;<samp>nehalem</samp>&rsquo;</dt>
  166. <dd><p>Intel Nehalem CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3,
  167. SSE4.1, SSE4.2 and POPCNT instruction set support.
  168. </p>
  169. </dd>
  170. <dt>&lsquo;<samp>westmere</samp>&rsquo;</dt>
  171. <dd><p>Intel Westmere CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3,
  172. SSE4.1, SSE4.2, POPCNT, AES and PCLMUL instruction set support.
  173. </p>
  174. </dd>
  175. <dt>&lsquo;<samp>sandybridge</samp>&rsquo;</dt>
  176. <dd><p>Intel Sandy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3,
  177. SSE4.1, SSE4.2, POPCNT, AVX, AES and PCLMUL instruction set support.
  178. </p>
  179. </dd>
  180. <dt>&lsquo;<samp>ivybridge</samp>&rsquo;</dt>
  181. <dd><p>Intel Ivy Bridge CPU with 64-bit extensions, MMX, SSE, SSE2, SSE3, SSSE3,
  182. SSE4.1, SSE4.2, POPCNT, AVX, AES, PCLMUL, FSGSBASE, RDRND and F16C
  183. instruction set support.
  184. </p>
  185. </dd>
  186. <dt>&lsquo;<samp>haswell</samp>&rsquo;</dt>
  187. <dd><p>Intel Haswell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
  188. SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA,
  189. BMI, BMI2 and F16C instruction set support.
  190. </p>
  191. </dd>
  192. <dt>&lsquo;<samp>broadwell</samp>&rsquo;</dt>
  193. <dd><p>Intel Broadwell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
  194. SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA,
  195. BMI, BMI2, F16C, RDSEED, ADCX and PREFETCHW instruction set support.
  196. </p>
  197. </dd>
  198. <dt>&lsquo;<samp>skylake</samp>&rsquo;</dt>
  199. <dd><p>Intel Skylake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
  200. SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA,
  201. BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC and
  202. XSAVES instruction set support.
  203. </p>
  204. </dd>
  205. <dt>&lsquo;<samp>bonnell</samp>&rsquo;</dt>
  206. <dd><p>Intel Bonnell CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3 and SSSE3
  207. instruction set support.
  208. </p>
  209. </dd>
  210. <dt>&lsquo;<samp>silvermont</samp>&rsquo;</dt>
  211. <dd><p>Intel Silvermont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
  212. SSE4.1, SSE4.2, POPCNT, AES, PCLMUL and RDRND instruction set support.
  213. </p>
  214. </dd>
  215. <dt>&lsquo;<samp>goldmont</samp>&rsquo;</dt>
  216. <dd><p>Intel Goldmont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
  217. SSE4.1, SSE4.2, POPCNT, AES, PCLMUL, RDRND, XSAVE, XSAVEOPT and FSGSBASE
  218. instruction set support.
  219. </p>
  220. </dd>
  221. <dt>&lsquo;<samp>goldmont-plus</samp>&rsquo;</dt>
  222. <dd><p>Intel Goldmont Plus CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3,
  223. SSSE3, SSE4.1, SSE4.2, POPCNT, AES, PCLMUL, RDRND, XSAVE, XSAVEOPT, FSGSBASE,
  224. PTWRITE, RDPID, SGX and UMIP instruction set support.
  225. </p>
  226. </dd>
  227. <dt>&lsquo;<samp>tremont</samp>&rsquo;</dt>
  228. <dd><p>Intel Tremont CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
  229. SSE4.1, SSE4.2, POPCNT, AES, PCLMUL, RDRND, XSAVE, XSAVEOPT, FSGSBASE, PTWRITE,
  230. RDPID, SGX, UMIP, GFNI-SSE, CLWB and ENCLV instruction set support.
  231. </p>
  232. </dd>
  233. <dt>&lsquo;<samp>knl</samp>&rsquo;</dt>
  234. <dd><p>Intel Knights Landing CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3,
  235. SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA,
  236. BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, AVX512F, AVX512PF, AVX512ER and
  237. AVX512CD instruction set support.
  238. </p>
  239. </dd>
  240. <dt>&lsquo;<samp>knm</samp>&rsquo;</dt>
  241. <dd><p>Intel Knights Mill CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3,
  242. SSSE3, SSE4.1, SSE4.2, POPCNT, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA,
  243. BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, AVX512F, AVX512PF, AVX512ER, AVX512CD,
  244. AVX5124VNNIW, AVX5124FMAPS and AVX512VPOPCNTDQ instruction set support.
  245. </p>
  246. </dd>
  247. <dt>&lsquo;<samp>skylake-avx512</samp>&rsquo;</dt>
  248. <dd><p>Intel Skylake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3,
  249. SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA,
  250. BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F,
  251. CLWB, AVX512VL, AVX512BW, AVX512DQ and AVX512CD instruction set support.
  252. </p>
  253. </dd>
  254. <dt>&lsquo;<samp>cannonlake</samp>&rsquo;</dt>
  255. <dd><p>Intel Cannonlake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2,
  256. SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE,
  257. RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC,
  258. XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ, AVX512CD, AVX512VBMI,
  259. AVX512IFMA, SHA and UMIP instruction set support.
  260. </p>
  261. </dd>
  262. <dt>&lsquo;<samp>icelake-client</samp>&rsquo;</dt>
  263. <dd><p>Intel Icelake Client CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2,
  264. SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE,
  265. RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC,
  266. XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ, AVX512CD, AVX512VBMI,
  267. AVX512IFMA, SHA, CLWB, UMIP, RDPID, GFNI, AVX512VBMI2, AVX512VPOPCNTDQ,
  268. AVX512BITALG, AVX512VNNI, VPCLMULQDQ and VAES instruction set support.
  269. </p>
  270. </dd>
  271. <dt>&lsquo;<samp>icelake-server</samp>&rsquo;</dt>
  272. <dd><p>Intel Icelake Server CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2,
  273. SSE3, SSSE3, SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE,
  274. RDRND, FMA, BMI, BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC,
  275. XSAVES, AVX512F, AVX512VL, AVX512BW, AVX512DQ, AVX512CD, AVX512VBMI,
  276. AVX512IFMA, SHA, CLWB, UMIP, RDPID, GFNI, AVX512VBMI2, AVX512VPOPCNTDQ,
  277. AVX512BITALG, AVX512VNNI, VPCLMULQDQ, VAES, PCONFIG and WBNOINVD instruction
  278. set support.
  279. </p>
  280. </dd>
  281. <dt>&lsquo;<samp>cascadelake</samp>&rsquo;</dt>
  282. <dd><p>Intel Cascadelake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
  283. SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI,
  284. BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F, CLWB,
  285. AVX512VL, AVX512BW, AVX512DQ, AVX512CD and AVX512VNNI instruction set support.
  286. </p>
  287. </dd>
  288. <dt>&lsquo;<samp>cooperlake</samp>&rsquo;</dt>
  289. <dd><p>Intel Cooperlake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
  290. SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI,
  291. BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F, CLWB,
  292. AVX512VL, AVX512BW, AVX512DQ, AVX512CD, AVX512VNNI and AVX512BF16 instruction
  293. set support.
  294. </p>
  295. </dd>
  296. <dt>&lsquo;<samp>tigerlake</samp>&rsquo;</dt>
  297. <dd><p>Intel Tigerlake CPU with 64-bit extensions, MOVBE, MMX, SSE, SSE2, SSE3, SSSE3,
  298. SSE4.1, SSE4.2, POPCNT, PKU, AVX, AVX2, AES, PCLMUL, FSGSBASE, RDRND, FMA, BMI,
  299. BMI2, F16C, RDSEED, ADCX, PREFETCHW, CLFLUSHOPT, XSAVEC, XSAVES, AVX512F,
  300. AVX512VL, AVX512BW, AVX512DQ, AVX512CD, AVX512VBMI, AVX512IFMA, SHA, CLWB, UMIP,
  301. RDPID, GFNI, AVX512VBMI2, AVX512VPOPCNTDQ, AVX512BITALG, AVX512VNNI, VPCLMULQDQ,
  302. VAES, PCONFIG, WBNOINVD, MOVDIRI, MOVDIR64B and AVX512VP2INTERSECT instruction
  303. set support.
  304. </p>
  305. </dd>
  306. <dt>&lsquo;<samp>k6</samp>&rsquo;</dt>
  307. <dd><p>AMD K6 CPU with MMX instruction set support.
  308. </p>
  309. </dd>
  310. <dt>&lsquo;<samp>k6-2</samp>&rsquo;</dt>
  311. <dt>&lsquo;<samp>k6-3</samp>&rsquo;</dt>
  312. <dd><p>Improved versions of AMD K6 CPU with MMX and 3DNow! instruction set support.
  313. </p>
  314. </dd>
  315. <dt>&lsquo;<samp>athlon</samp>&rsquo;</dt>
  316. <dt>&lsquo;<samp>athlon-tbird</samp>&rsquo;</dt>
  317. <dd><p>AMD Athlon CPU with MMX, 3DNow!, enhanced 3DNow! and SSE prefetch instruction
  318. support.
  319. </p>
  320. </dd>
  321. <dt>&lsquo;<samp>athlon-4</samp>&rsquo;</dt>
  322. <dt>&lsquo;<samp>athlon-xp</samp>&rsquo;</dt>
  323. <dt>&lsquo;<samp>athlon-mp</samp>&rsquo;</dt>
  324. <dd><p>Improved AMD Athlon CPU with MMX, 3DNow!, enhanced 3DNow! and full SSE
  325. instruction set support.
  326. </p>
  327. </dd>
  328. <dt>&lsquo;<samp>k8</samp>&rsquo;</dt>
  329. <dt>&lsquo;<samp>opteron</samp>&rsquo;</dt>
  330. <dt>&lsquo;<samp>athlon64</samp>&rsquo;</dt>
  331. <dt>&lsquo;<samp>athlon-fx</samp>&rsquo;</dt>
  332. <dd><p>Processors based on the AMD K8 core with x86-64 instruction set support,
  333. including the AMD Opteron, Athlon 64, and Athlon 64 FX processors.
  334. (This supersets MMX, SSE, SSE2, 3DNow!, enhanced 3DNow! and 64-bit
  335. instruction set extensions.)
  336. </p>
  337. </dd>
  338. <dt>&lsquo;<samp>k8-sse3</samp>&rsquo;</dt>
  339. <dt>&lsquo;<samp>opteron-sse3</samp>&rsquo;</dt>
  340. <dt>&lsquo;<samp>athlon64-sse3</samp>&rsquo;</dt>
  341. <dd><p>Improved versions of AMD K8 cores with SSE3 instruction set support.
  342. </p>
  343. </dd>
  344. <dt>&lsquo;<samp>amdfam10</samp>&rsquo;</dt>
  345. <dt>&lsquo;<samp>barcelona</samp>&rsquo;</dt>
  346. <dd><p>CPUs based on AMD Family 10h cores with x86-64 instruction set support. (This
  347. supersets MMX, SSE, SSE2, SSE3, SSE4A, 3DNow!, enhanced 3DNow!, ABM and 64-bit
  348. instruction set extensions.)
  349. </p>
  350. </dd>
  351. <dt>&lsquo;<samp>bdver1</samp>&rsquo;</dt>
  352. <dd><p>CPUs based on AMD Family 15h cores with x86-64 instruction set support. (This
  353. supersets FMA4, AVX, XOP, LWP, AES, PCLMUL, CX16, MMX, SSE, SSE2, SSE3, SSE4A,
  354. SSSE3, SSE4.1, SSE4.2, ABM and 64-bit instruction set extensions.)
  355. </p>
  356. </dd>
  357. <dt>&lsquo;<samp>bdver2</samp>&rsquo;</dt>
  358. <dd><p>AMD Family 15h core based CPUs with x86-64 instruction set support. (This
  359. supersets BMI, TBM, F16C, FMA, FMA4, AVX, XOP, LWP, AES, PCLMUL, CX16, MMX,
  360. SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM and 64-bit instruction set
  361. extensions.)
  362. </p>
  363. </dd>
  364. <dt>&lsquo;<samp>bdver3</samp>&rsquo;</dt>
  365. <dd><p>AMD Family 15h core based CPUs with x86-64 instruction set support. (This
  366. supersets BMI, TBM, F16C, FMA, FMA4, FSGSBASE, AVX, XOP, LWP, AES,
  367. PCLMUL, CX16, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1, SSE4.2, ABM and
  368. 64-bit instruction set extensions.)
  369. </p>
  370. </dd>
  371. <dt>&lsquo;<samp>bdver4</samp>&rsquo;</dt>
  372. <dd><p>AMD Family 15h core based CPUs with x86-64 instruction set support. (This
  373. supersets BMI, BMI2, TBM, F16C, FMA, FMA4, FSGSBASE, AVX, AVX2, XOP, LWP,
  374. AES, PCLMUL, CX16, MOVBE, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3, SSE4.1,
  375. SSE4.2, ABM and 64-bit instruction set extensions.)
  376. </p>
  377. </dd>
  378. <dt>&lsquo;<samp>znver1</samp>&rsquo;</dt>
  379. <dd><p>AMD Family 17h core based CPUs with x86-64 instruction set support. (This
  380. supersets BMI, BMI2, F16C, FMA, FSGSBASE, AVX, AVX2, ADCX, RDSEED, MWAITX,
  381. SHA, CLZERO, AES, PCLMUL, CX16, MOVBE, MMX, SSE, SSE2, SSE3, SSE4A, SSSE3,
  382. SSE4.1, SSE4.2, ABM, XSAVEC, XSAVES, CLFLUSHOPT, POPCNT, and 64-bit
  383. instruction set extensions.)
  384. </p>
  385. </dd>
  386. <dt>&lsquo;<samp>znver2</samp>&rsquo;</dt>
  387. <dd><p>AMD Family 17h core based CPUs with x86-64 instruction set support. (This
  388. supersets BMI, BMI2, CLWB, F16C, FMA, FSGSBASE, AVX, AVX2, ADCX, RDSEED,
  389. MWAITX, SHA, CLZERO, AES, PCLMUL, CX16, MOVBE, MMX, SSE, SSE2, SSE3, SSE4A,
  390. SSSE3, SSE4.1, SSE4.2, ABM, XSAVEC, XSAVES, CLFLUSHOPT, POPCNT, RDPID,
  391. WBNOINVD, and 64-bit instruction set extensions.)
  392. </p>
  393. </dd>
  394. <dt>&lsquo;<samp>btver1</samp>&rsquo;</dt>
  395. <dd><p>CPUs based on AMD Family 14h cores with x86-64 instruction set support. (This
  396. supersets MMX, SSE, SSE2, SSE3, SSSE3, SSE4A, CX16, ABM and 64-bit
  397. instruction set extensions.)
  398. </p>
  399. </dd>
  400. <dt>&lsquo;<samp>btver2</samp>&rsquo;</dt>
  401. <dd><p>CPUs based on AMD Family 16h cores with x86-64 instruction set support. This
  402. includes MOVBE, F16C, BMI, AVX, PCLMUL, AES, SSE4.2, SSE4.1, CX16, ABM,
  403. SSE4A, SSSE3, SSE3, SSE2, SSE, MMX and 64-bit instruction set extensions.
  404. </p>
  405. </dd>
  406. <dt>&lsquo;<samp>winchip-c6</samp>&rsquo;</dt>
  407. <dd><p>IDT WinChip C6 CPU, treated the same way as i486, with additional MMX instruction
  408. set support.
  409. </p>
  410. </dd>
  411. <dt>&lsquo;<samp>winchip2</samp>&rsquo;</dt>
  412. <dd><p>IDT WinChip 2 CPU, treated the same way as i486, with additional MMX and 3DNow!
  413. instruction set support.
  414. </p>
  415. </dd>
  416. <dt>&lsquo;<samp>c3</samp>&rsquo;</dt>
  417. <dd><p>VIA C3 CPU with MMX and 3DNow! instruction set support.
  418. (No scheduling is implemented for this chip.)
  419. </p>
  420. </dd>
  421. <dt>&lsquo;<samp>c3-2</samp>&rsquo;</dt>
  422. <dd><p>VIA C3-2 (Nehemiah/C5XL) CPU with MMX and SSE instruction set support.
  423. (No scheduling is implemented for this chip.)
  424. </p>
  425. </dd>
  426. <dt>&lsquo;<samp>c7</samp>&rsquo;</dt>
  427. <dd><p>VIA C7 (Esther) CPU with MMX, SSE, SSE2 and SSE3 instruction set support.
  428. (No scheduling is implemented for this chip.)
  429. </p>
  430. </dd>
  431. <dt>&lsquo;<samp>samuel-2</samp>&rsquo;</dt>
  432. <dd><p>VIA Eden Samuel 2 CPU with MMX and 3DNow! instruction set support.
  433. (No scheduling is implemented for this chip.)
  434. </p>
  435. </dd>
  436. <dt>&lsquo;<samp>nehemiah</samp>&rsquo;</dt>
  437. <dd><p>VIA Eden Nehemiah CPU with MMX and SSE instruction set support.
  438. (No scheduling is implemented for this chip.)
  439. </p>
  440. </dd>
  441. <dt>&lsquo;<samp>esther</samp>&rsquo;</dt>
  442. <dd><p>VIA Eden Esther CPU with MMX, SSE, SSE2 and SSE3 instruction set support.
  443. (No scheduling is implemented for this chip.)
  444. </p>
  445. </dd>
  446. <dt>&lsquo;<samp>eden-x2</samp>&rsquo;</dt>
  447. <dd><p>VIA Eden X2 CPU with x86-64, MMX, SSE, SSE2 and SSE3 instruction set support.
  448. (No scheduling is implemented for this chip.)
  449. </p>
  450. </dd>
  451. <dt>&lsquo;<samp>eden-x4</samp>&rsquo;</dt>
  452. <dd><p>VIA Eden X4 CPU with x86-64, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2,
  453. AVX and AVX2 instruction set support.
  454. (No scheduling is implemented for this chip.)
  455. </p>
  456. </dd>
  457. <dt>&lsquo;<samp>nano</samp>&rsquo;</dt>
  458. <dd><p>Generic VIA Nano CPU with x86-64, MMX, SSE, SSE2, SSE3 and SSSE3
  459. instruction set support.
  460. (No scheduling is implemented for this chip.)
  461. </p>
  462. </dd>
  463. <dt>&lsquo;<samp>nano-1000</samp>&rsquo;</dt>
  464. <dd><p>VIA Nano 1xxx CPU with x86-64, MMX, SSE, SSE2, SSE3 and SSSE3
  465. instruction set support.
  466. (No scheduling is implemented for this chip.)
  467. </p>
  468. </dd>
  469. <dt>&lsquo;<samp>nano-2000</samp>&rsquo;</dt>
  470. <dd><p>VIA Nano 2xxx CPU with x86-64, MMX, SSE, SSE2, SSE3 and SSSE3
  471. instruction set support.
  472. (No scheduling is implemented for this chip.)
  473. </p>
  474. </dd>
  475. <dt>&lsquo;<samp>nano-3000</samp>&rsquo;</dt>
  476. <dd><p>VIA Nano 3xxx CPU with x86-64, MMX, SSE, SSE2, SSE3, SSSE3 and SSE4.1
  477. instruction set support.
  478. (No scheduling is implemented for this chip.)
  479. </p>
  480. </dd>
  481. <dt>&lsquo;<samp>nano-x2</samp>&rsquo;</dt>
  482. <dd><p>VIA Nano Dual Core CPU with x86-64, MMX, SSE, SSE2, SSE3, SSSE3 and SSE4.1
  483. instruction set support.
  484. (No scheduling is implemented for this chip.)
  485. </p>
  486. </dd>
  487. <dt>&lsquo;<samp>nano-x4</samp>&rsquo;</dt>
  488. <dd><p>VIA Nano Quad Core CPU with x86-64, MMX, SSE, SSE2, SSE3, SSSE3 and SSE4.1
  489. instruction set support.
  490. (No scheduling is implemented for this chip.)
  491. </p>
  492. </dd>
  493. <dt>&lsquo;<samp>geode</samp>&rsquo;</dt>
  494. <dd><p>AMD Geode embedded processor with MMX and 3DNow! instruction set support.
  495. </p></dd>
  496. </dl>
  497. </dd>
  498. <dt><code>-mtune=<var>cpu-type</var></code></dt>
  499. <dd><a name="index-mtune-16"></a>
  500. <p>Tune to <var>cpu-type</var> everything applicable about the generated code, except
  501. for the ABI and the set of available instructions.
  502. While picking a specific <var>cpu-type</var> schedules things appropriately
  503. for that particular chip, the compiler does not generate any code that
  504. cannot run on the default machine type unless you use a
  505. <samp>-march=<var>cpu-type</var></samp> option.
  506. For example, if GCC is configured for i686-pc-linux-gnu
  507. then <samp>-mtune=pentium4</samp> generates code that is tuned for Pentium 4
  508. but still runs on i686 machines.
  509. </p>
  510. <p>The choices for <var>cpu-type</var> are the same as for <samp>-march</samp>.
  511. In addition, <samp>-mtune</samp> supports two extra choices for <var>cpu-type</var>:
  512. </p>
  513. <dl compact="compact">
  514. <dt>&lsquo;<samp>generic</samp>&rsquo;</dt>
  515. <dd><p>Produce code optimized for the most common IA32/AMD64/EM64T processors.
  516. If you know the CPU on which your code will run, then you should use
  517. the corresponding <samp>-mtune</samp> or <samp>-march</samp> option instead of
  518. <samp>-mtune=generic</samp>. But, if you do not know exactly what CPU users
  519. of your application will have, then you should use this option.
  520. </p>
  521. <p>As new processors are deployed in the marketplace, the behavior of this
  522. option will change. Therefore, if you upgrade to a newer version of
  523. GCC, code generation controlled by this option will change to reflect
  524. the processors
  525. that are most common at the time that version of GCC is released.
  526. </p>
  527. <p>There is no <samp>-march=generic</samp> option because <samp>-march</samp>
  528. indicates the instruction set the compiler can use, and there is no
  529. generic instruction set applicable to all processors. In contrast,
  530. <samp>-mtune</samp> indicates the processor (or, in this case, collection of
  531. processors) for which the code is optimized.
  532. </p>
  533. </dd>
  534. <dt>&lsquo;<samp>intel</samp>&rsquo;</dt>
  535. <dd><p>Produce code optimized for the most current Intel processors, which are
  536. Haswell and Silvermont for this version of GCC. If you know the CPU
  537. on which your code will run, then you should use the corresponding
  538. <samp>-mtune</samp> or <samp>-march</samp> option instead of <samp>-mtune=intel</samp>.
  539. But if you want your application to perform better on both Haswell and
  540. Silvermont, then you should use this option.
  541. </p>
  542. <p>As new Intel processors are deployed in the marketplace, the behavior of
  543. this option will change. Therefore, if you upgrade to a newer version of
  544. GCC, code generation controlled by this option will change to reflect
  545. the most current Intel processors at the time that version of GCC is
  546. released.
  547. </p>
  548. <p>There is no <samp>-march=intel</samp> option because <samp>-march</samp> indicates
  549. the instruction set the compiler can use, and there is no common
  550. instruction set applicable to all processors. In contrast,
  551. <samp>-mtune</samp> indicates the processor (or, in this case, collection of
  552. processors) for which the code is optimized.
  553. </p></dd>
  554. </dl>
  555. </dd>
  556. <dt><code>-mcpu=<var>cpu-type</var></code></dt>
  557. <dd><a name="index-mcpu-15"></a>
  558. <p>A deprecated synonym for <samp>-mtune</samp>.
  559. </p>
  560. </dd>
  561. <dt><code>-mfpmath=<var>unit</var></code></dt>
  562. <dd><a name="index-mfpmath-1"></a>
  563. <p>Generate floating-point arithmetic for the selected unit <var>unit</var>. The choices
  564. for <var>unit</var> are:
  565. </p>
  566. <dl compact="compact">
  567. <dt>&lsquo;<samp>387</samp>&rsquo;</dt>
  568. <dd><p>Use the standard 387 floating-point coprocessor present on the majority of chips and
  569. emulated otherwise. Code compiled with this option runs almost everywhere.
  570. The temporary results are computed in 80-bit precision instead of the precision
  571. specified by the type, resulting in slightly different results compared to most
  572. other chips. See <samp>-ffloat-store</samp> for a more detailed description.
  573. </p>
  574. <p>This is the default choice for non-Darwin x86-32 targets.
  575. </p>
  576. </dd>
  577. <dt>&lsquo;<samp>sse</samp>&rsquo;</dt>
  578. <dd><p>Use scalar floating-point instructions present in the SSE instruction set.
  579. This instruction set is supported by Pentium III and newer chips,
  580. and in the AMD line
  581. by Athlon-4, Athlon XP and Athlon MP chips. The earlier version of the SSE
  582. instruction set supports only single-precision arithmetic, thus the double and
  583. extended-precision arithmetic are still done using 387. A later version, present
  584. only in Pentium 4 and AMD x86-64 chips, supports double-precision
  585. arithmetic too.
  586. </p>
  587. <p>For the x86-32 compiler, you must use <samp>-march=<var>cpu-type</var></samp>, <samp>-msse</samp>
  588. or <samp>-msse2</samp> switches to enable SSE extensions and make this option
  589. effective. For the x86-64 compiler, these extensions are enabled by default.
  590. </p>
  591. <p>The resulting code should be considerably faster in the majority of cases and avoid
  592. the numerical instability problems of 387 code, but may break some existing
  593. code that expects temporaries to be 80 bits.
  594. </p>
  595. <p>This is the default choice for the x86-64 compiler, Darwin x86-32 targets,
  596. and the default choice for x86-32 targets with the SSE2 instruction set
  597. when <samp>-ffast-math</samp> is enabled.
  598. </p>
  599. </dd>
  600. <dt>&lsquo;<samp>sse,387</samp>&rsquo;</dt>
  601. <dt>&lsquo;<samp>sse+387</samp>&rsquo;</dt>
  602. <dt>&lsquo;<samp>both</samp>&rsquo;</dt>
  603. <dd><p>Attempt to utilize both instruction sets at once. This effectively doubles the
  604. number of available registers and, on chips with separate execution units for
  605. 387 and SSE, the execution resources too. Use this option with care: it is
  606. still experimental, because the GCC register allocator does not model separate
  607. functional units well, which results in unstable performance.
  608. </p></dd>
  609. </dl>
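<p>One way to see which evaluation precision a given <samp>-mfpmath</samp> setting produces is the C99 macro <code>FLT_EVAL_METHOD</code> from <code>&lt;float.h&gt;</code>. The following is a minimal sketch with hypothetical helper names (<code>describe_eval_method</code>, <code>report_eval_method</code>). It assumes, as an expectation and not as something this option's description guarantees, that GCC typically reports 2 when 387 arithmetic is in use and 0 when SSE2 arithmetic handles both <code>float</code> and <code>double</code>.
</p>

```c
#include <float.h>
#include <stdio.h>

/* Maps a C99 FLT_EVAL_METHOD value to a short description.
   (Hypothetical helper name, for illustration only.) */
const char *describe_eval_method(int method)
{
    switch (method) {
    case 0:  return "operations evaluated in the precision of their type";
    case 1:  return "float and double evaluated as double";
    case 2:  return "all operations evaluated as long double";
    default: return "indeterminate or implementation-defined";
    }
}

/* Reports the evaluation method in effect for this translation unit. */
void report_eval_method(void)
{
    printf("FLT_EVAL_METHOD = %d (%s)\n",
           (int)FLT_EVAL_METHOD,
           describe_eval_method((int)FLT_EVAL_METHOD));
}
```

<p>Calling <code>report_eval_method</code> from programs built with <samp>-mfpmath=387</samp> and with <samp>-mfpmath=sse -msse2</samp> is a quick check of whether temporaries may carry the 80-bit excess precision described above.
</p>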
  610. </dd>
  611. <dt><code>-masm=<var>dialect</var></code></dt>
  612. <dd><a name="index-masm_003ddialect"></a>
  613. <p>Output assembly instructions using the selected <var>dialect</var>. Also affects
  614. which dialect is used for basic <code>asm</code> (see <a href="Basic-Asm.html#Basic-Asm">Basic Asm</a>) and
  615. extended <code>asm</code> (see <a href="Extended-Asm.html#Extended-Asm">Extended Asm</a>). Supported choices (in dialect
  616. order) are &lsquo;<samp>att</samp>&rsquo; or &lsquo;<samp>intel</samp>&rsquo;. The default is &lsquo;<samp>att</samp>&rsquo;. Darwin does
  617. not support &lsquo;<samp>intel</samp>&rsquo;.
  618. </p>
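<p>Because the dialect also applies to <code>asm</code> statements in the source, inline assembly written for one dialect can fail to assemble under the other. The following is a minimal sketch, assuming GCC's <code>{att|intel}</code> dialect alternatives in extended <code>asm</code> templates on x86; the function name <code>add_either_dialect</code> is hypothetical, and on other targets the sketch falls back to plain C.
</p>

```c
/* Adds b to a. On x86 with a GNU-compatible compiler it uses one ADD
   instruction written for both dialects; elsewhere it falls back to C.
   (Hypothetical helper name, for illustration only.) */
int add_either_dialect(int a, int b)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* Text before '|' is emitted for -masm=att:   addl %1, %0
       text after  '|' is emitted for -masm=intel: add %0, %1 */
    __asm__("add{l %1, %0| %0, %1}"
            : "+r"(a)   /* %0: read-write accumulator */
            : "r"(b)    /* %1: value to add */
            : "cc");    /* ADD modifies the flags */
    return a;
#else
    return a + b;
#endif
}
```

<p>Written this way, the same source should assemble with either <samp>-masm=att</samp> or <samp>-masm=intel</samp>, since the compiler keeps only the alternative that matches the selected dialect.
</p>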
  619. </dd>
  620. <dt><code>-mieee-fp</code></dt>
  621. <dt><code>-mno-ieee-fp</code></dt>
  622. <dd><a name="index-mieee_002dfp"></a>
  623. <a name="index-mno_002dieee_002dfp"></a>
  624. <p>Control whether or not the compiler uses IEEE floating-point
  625. comparisons. These correctly handle the case where the result of a
  626. comparison is unordered.
  627. </p>
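<p>A comparison is unordered when at least one operand is a NaN; IEEE 754 then requires <code>&lt;</code>, <code>&lt;=</code>, <code>&gt;</code>, <code>&gt;=</code> and <code>==</code> to yield false. The following is a minimal sketch of code that depends on that behavior, using the standard <code>isunordered</code> macro and <code>NAN</code> from <code>&lt;math.h&gt;</code>; the function name <code>compare_ieee</code> and its return codes are hypothetical.
</p>

```c
#include <math.h>

/* Classifies the relation between a and b:
   -1 if a < b, 0 if equal, 1 if a > b, 2 if unordered (a NaN is involved).
   (Hypothetical helper name, for illustration only.) */
int compare_ieee(double a, double b)
{
    if (isunordered(a, b))
        return 2;
    if (a < b)
        return -1;
    if (a > b)
        return 1;
    return 0;
}
```

<p>For example, <code>compare_ieee(NAN, 1.0)</code> takes the unordered branch, while <code>compare_ieee(1.0, 2.0)</code> reports that the first operand is smaller.
</p>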
  628. </dd>
  629. <dt><code>-m80387</code></dt>
  630. <dt><code>-mhard-float</code></dt>
  631. <dd><a name="index-80387"></a>
  632. <a name="index-mhard_002dfloat-11"></a>
  633. <p>Generate output containing 80387 instructions for floating point.
  634. </p>
  635. </dd>
  636. <dt><code>-mno-80387</code></dt>
  637. <dt><code>-msoft-float</code></dt>
  638. <dd><a name="index-no_002d80387"></a>
  639. <a name="index-msoft_002dfloat-15"></a>
  640. <p>Generate output containing library calls for floating point.
  641. </p>
  642. <p><strong>Warning:</strong> the requisite libraries are not part of GCC.
  643. Normally the facilities of the machine&rsquo;s usual C compiler are used, but
  644. this cannot be done directly in cross-compilation. You must make your
  645. own arrangements to provide suitable library functions for
  646. cross-compilation.
  647. </p>
  648. <p>On machines where a function returns floating-point results in the 80387
  649. register stack, some floating-point opcodes may be emitted even if
  650. <samp>-msoft-float</samp> is used.
  651. </p>
  652. </dd>
  653. <dt><code>-mno-fp-ret-in-387</code></dt>
  654. <dd><a name="index-mno_002dfp_002dret_002din_002d387"></a>
  655. <a name="index-mfp_002dret_002din_002d387"></a>
  656. <p>Do not use the FPU registers for return values of functions.
  657. </p>
  658. <p>The usual calling convention has functions return values of types
  659. <code>float</code> and <code>double</code> in an FPU register, even if there
  660. is no FPU. The idea is that the operating system should emulate
  661. an FPU.
  662. </p>
  663. <p>The option <samp>-mno-fp-ret-in-387</samp> causes such values to be returned
  664. in ordinary CPU registers instead.
  665. </p>
  666. </dd>
  667. <dt><code>-mno-fancy-math-387</code></dt>
  668. <dd><a name="index-mno_002dfancy_002dmath_002d387"></a>
  669. <a name="index-mfancy_002dmath_002d387"></a>
  670. <p>Some 387 emulators do not support the <code>sin</code>, <code>cos</code> and
  671. <code>sqrt</code> instructions for the 387. Specify this option to avoid
  672. generating those instructions.
  673. This option is overridden when <samp>-march</samp>
  674. indicates that the target CPU always has an FPU and so the
  675. instruction does not need emulation. These
  676. instructions are not generated unless you also use the
  677. <samp>-funsafe-math-optimizations</samp> switch.
  678. </p>
  679. </dd>
  680. <dt><code>-malign-double</code></dt>
  681. <dt><code>-mno-align-double</code></dt>
  682. <dd><a name="index-malign_002ddouble"></a>
  683. <a name="index-mno_002dalign_002ddouble"></a>
  684. <p>Control whether GCC aligns <code>double</code>, <code>long double</code>, and
  685. <code>long long</code> variables on a two-word boundary or a one-word
  686. boundary. Aligning <code>double</code> variables on a two-word boundary
  687. produces code that runs somewhat faster on a Pentium at the
  688. expense of more memory.
  689. </p>
  690. <p>On x86-64, <samp>-malign-double</samp> is enabled by default.
  691. </p>
  692. <p><strong>Warning:</strong> if you use the <samp>-malign-double</samp> switch,
  693. structures containing the above types are aligned differently than
  694. the published application binary interface specifications for the x86-32
  695. and are not binary compatible with structures in code compiled
  696. without that switch.
  697. </p>
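<p>The layout difference described in the warning can be observed with a small probe such as the sketch below. The structure <code>struct sample</code> and both helper functions are hypothetical names chosen for this example. Under the x86-32 psABI default the <code>double</code> member is expected at offset 4 (structure size 12); with <samp>-malign-double</samp>, or on x86-64, it is expected at offset 8 (structure size 16).</p>

```c
#include <stddef.h>

/* A structure whose layout depends on the alignment chosen for double. */
struct sample {
    char   tag;
    double value;
};

/* Byte offset of the double member inside struct sample. */
size_t double_member_offset(void)
{
    return offsetof(struct sample, value);
}

/* Total size of struct sample, including any padding. */
size_t sample_size(void)
{
    return sizeof(struct sample);
}
```

<p>Two object files that disagree on these values cannot safely exchange <code>struct sample</code> objects, which is the binary incompatibility the warning refers to.</p>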
  698. </dd>
  699. <dt><code>-m96bit-long-double</code></dt>
  700. <dt><code>-m128bit-long-double</code></dt>
  701. <dd><a name="index-m96bit_002dlong_002ddouble"></a>
  702. <a name="index-m128bit_002dlong_002ddouble"></a>
  703. <p>These switches control the size of <code>long double</code> type. The x86-32
  704. application binary interface specifies the size to be 96 bits,
  705. so <samp>-m96bit-long-double</samp> is the default in 32-bit mode.
  706. </p>
  707. <p>Modern architectures (Pentium and newer) prefer <code>long double</code>
  708. to be aligned to an 8- or 16-byte boundary. In arrays or structures
  709. conforming to the ABI, this is not possible. So specifying
  710. <samp>-m128bit-long-double</samp> aligns <code>long double</code>
  711. to a 16-byte boundary by padding the <code>long double</code> with an additional
  712. 32-bit zero.
  713. </p>
  714. <p>In the x86-64 compiler, <samp>-m128bit-long-double</samp> is the default choice as
  715. its ABI specifies that <code>long double</code> is aligned on 16-byte boundary.
  716. </p>
  717. <p>Notice that neither of these options enables any extra precision over the x87
  718. standard of 80 bits for a <code>long double</code>.
  719. </p>
  720. <p><strong>Warning:</strong> if you override the default value for your target ABI, this
  721. changes the size of
  722. structures and arrays containing <code>long double</code> variables,
  723. as well as modifying the function calling convention for functions taking
  724. <code>long double</code>. Hence they are not binary-compatible
  725. with code compiled without that switch.
  726. </p>
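<p>The distinction between storage size and precision can be checked with a probe like the following sketch; the two function names are hypothetical. On an x86 target using the x87 format, the storage size is expected to be 96 bits with <samp>-m96bit-long-double</samp> and 128 bits with <samp>-m128bit-long-double</samp>, while the significand width reported by <code>LDBL_MANT_DIG</code> stays at 64 bits in both cases.</p>

```c
#include <float.h>
#include <limits.h>
#include <stddef.h>

/* Storage occupied by one long double, in bits (padding included). */
size_t long_double_storage_bits(void)
{
    return sizeof(long double) * CHAR_BIT;
}

/* Number of significand bits actually carried by long double. */
int long_double_significand_bits(void)
{
    return LDBL_MANT_DIG;
}
```

<p>Comparing the two results across builds shows that only the padding changes, not the arithmetic precision.</p>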
  727. </dd>
  728. <dt><code>-mlong-double-64</code></dt>
  729. <dt><code>-mlong-double-80</code></dt>
  730. <dt><code>-mlong-double-128</code></dt>
  731. <dd><a name="index-mlong_002ddouble_002d64-1"></a>
  732. <a name="index-mlong_002ddouble_002d80"></a>
  733. <a name="index-mlong_002ddouble_002d128-1"></a>
  734. <p>These switches control the size of <code>long double</code> type. A size
  735. of 64 bits makes the <code>long double</code> type equivalent to the <code>double</code>
  736. type. This is the default for 32-bit Bionic C library. A size
  737. of 128 bits makes the <code>long double</code> type equivalent to the
  738. <code>__float128</code> type. This is the default for 64-bit Bionic C library.
  739. </p>
  740. <p><strong>Warning:</strong> if you override the default value for your target ABI, this
  741. changes the size of
  742. structures and arrays containing <code>long double</code> variables,
  743. as well as modifying the function calling convention for functions taking
  744. <code>long double</code>. Hence they are not binary-compatible
  745. with code compiled without that switch.
  746. </p>
  747. </dd>
  748. <dt><code>-malign-data=<var>type</var></code></dt>
  749. <dd><a name="index-malign_002ddata-1"></a>
  750. <p>Control how GCC aligns variables. Supported values for <var>type</var> are
  751. &lsquo;<samp>compat</samp>&rsquo;, which uses an increased alignment value compatible with GCC 4.8
  752. and earlier, &lsquo;<samp>abi</samp>&rsquo;, which uses the alignment value specified by the
  753. psABI, and &lsquo;<samp>cacheline</samp>&rsquo;, which uses an increased alignment value to match
  754. the cache line size. &lsquo;<samp>compat</samp>&rsquo; is the default.
  755. </p>
  756. </dd>
  757. <dt><code>-mlarge-data-threshold=<var>threshold</var></code></dt>
  758. <dd><a name="index-mlarge_002ddata_002dthreshold"></a>
  759. <p>When <samp>-mcmodel=medium</samp> is specified, data objects larger than
  760. <var>threshold</var> are placed in the large data section. This value must be the
  761. same across all objects linked into the binary, and defaults to 65535.
  762. </p>
  763. </dd>
  764. <dt><code>-mrtd</code></dt>
  765. <dd><a name="index-mrtd-1"></a>
  766. <p>Use a different function-calling convention, in which functions that
  767. take a fixed number of arguments return with the <code>ret <var>num</var></code>
  768. instruction, which pops their arguments while returning. This saves one
  769. instruction in the caller since there is no need to pop the arguments
  770. there.
  771. </p>
  772. <p>You can specify that an individual function is called with this calling
  773. sequence with the function attribute <code>stdcall</code>. You can also
  774. override the <samp>-mrtd</samp> option by using the function attribute
  775. <code>cdecl</code>. See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>.
  776. </p>
  777. <p><strong>Warning:</strong> this calling convention is incompatible with the one
  778. normally used on Unix, so you cannot use it if you need to call
  779. libraries compiled with the Unix compiler.
  780. </p>
  781. <p>Also, you must provide function prototypes for all functions that
  782. take variable numbers of arguments (including <code>printf</code>);
  783. otherwise incorrect code is generated for calls to those
  784. functions.
  785. </p>
  786. <p>In addition, seriously incorrect code results if you call a
  787. function with too many arguments. (Normally, extra arguments are
  788. harmlessly ignored.)
  789. </p>
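<p>The per-function attributes mentioned above can be applied as in the following sketch. The macros <code>CALLEE_POPS</code> and <code>CALLER_POPS</code> and the functions <code>add3</code> and <code>sum_n</code> are hypothetical names for this example; the attributes are only applied on 32-bit x86 targets, since they have no effect elsewhere.</p>

```c
#include <stdarg.h>

#if defined(__i386__)
#define CALLEE_POPS __attribute__((stdcall))  /* callee pops, as with -mrtd */
#define CALLER_POPS __attribute__((cdecl))    /* caller pops, overrides -mrtd */
#else
#define CALLEE_POPS
#define CALLER_POPS
#endif

/* Prototypes are visible before any call, as the text above requires. */
CALLEE_POPS int add3(int a, int b, int c);
CALLER_POPS int sum_n(int count, ...);

/* Fixed argument count: eligible for the "ret num" return sequence. */
CALLEE_POPS int add3(int a, int b, int c)
{
    return a + b + c;
}

/* Variable argument count: the caller must clean up the stack. */
CALLER_POPS int sum_n(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

<p>Marking individual functions this way keeps the rest of the program on the normal calling convention, which avoids the library incompatibility described in the warning.</p>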
  790. </dd>
  791. <dt><code>-mregparm=<var>num</var></code></dt>
  792. <dd><a name="index-mregparm"></a>
  793. <p>Control how many registers are used to pass integer arguments. By
  794. default, no registers are used to pass arguments, and at most 3
  795. registers can be used. You can control this behavior for a specific
  796. function by using the function attribute <code>regparm</code>.
  797. See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>.
  798. </p>
  799. <p><strong>Warning:</strong> if you use this switch, and
  800. <var>num</var> is nonzero, then you must build all modules with the same
  801. value, including any libraries. This includes the system libraries and
  802. startup modules.
  803. </p>
  804. </dd>
  805. <dt><code>-msseregparm</code></dt>
  806. <dd><a name="index-msseregparm"></a>
  807. <p>Use SSE register passing conventions for float and double arguments
  808. and return values. You can control this behavior for a specific
  809. function by using the function attribute <code>sseregparm</code>.
  810. See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>.
  811. </p>
  812. <p><strong>Warning:</strong> if you use this switch then you must build all
  813. modules with the same value, including any libraries. This includes
  814. the system libraries and startup modules.
  815. </p>
  816. </dd>
  817. <dt><code>-mvect8-ret-in-mem</code></dt>
  818. <dd><a name="index-mvect8_002dret_002din_002dmem"></a>
  819. <p>Return 8-byte vectors in memory instead of MMX registers. This is the
  820. default on VxWorks to match the ABI of the Sun Studio compilers until
  821. version 12. <em>Only</em> use this option if you need to remain
  822. compatible with existing code produced by those previous compiler
  823. versions or older versions of GCC.
  824. </p>
  825. </dd>
  826. <dt><code>-mpc32</code></dt>
  827. <dt><code>-mpc64</code></dt>
  828. <dt><code>-mpc80</code></dt>
  829. <dd><a name="index-mpc32"></a>
  830. <a name="index-mpc64"></a>
  831. <a name="index-mpc80"></a>
  832. <p>Set 80387 floating-point precision to 32, 64 or 80 bits. When <samp>-mpc32</samp>
  833. is specified, the significands of results of floating-point operations are
  834. rounded to 24 bits (single precision); <samp>-mpc64</samp> rounds the
  835. significands of results of floating-point operations to 53 bits (double
  836. precision) and <samp>-mpc80</samp> rounds the significands of results of
  837. floating-point operations to 64 bits (extended double precision), which is
  838. the default. When this option is used, floating-point operations in higher
  839. precisions are not available to the programmer without setting the FPU
  840. control word explicitly.
  841. </p>
  842. <p>Setting the rounding of floating-point operations to less than the default
  843. 80 bits can speed some programs by 2% or more. Note that some mathematical
  844. libraries assume that extended-precision (80-bit) floating-point operations
  845. are enabled by default; routines in such libraries could suffer significant
  846. loss of accuracy, typically through so-called &ldquo;catastrophic cancellation&rdquo;,
  847. when this option is used to set the precision to less than extended precision.
  848. </p>
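<p>The significand widths listed above determine the spacing between 1.0 and the next representable result. The following sketch tabulates them; <code>x87_significand_bits</code> and <code>spacing_above_one</code> are hypothetical helper names, and the code only models the arithmetic rather than changing the FPU control word.</p>

```c
#include <float.h>

/* Significand width, in bits, that each -mpc setting rounds results to.
   Returns -1 for a value that is not one of 32, 64 or 80. */
int x87_significand_bits(int mpc)
{
    switch (mpc) {
    case 32: return 24;   /* single precision */
    case 64: return 53;   /* double precision */
    case 80: return 64;   /* extended double precision */
    default: return -1;
    }
}

/* Gap between 1.0 and the next larger value for a given -mpc setting,
   i.e. 2 raised to the power (1 - significand bits).
   Returns 0.0 for an unsupported setting. */
double spacing_above_one(int mpc)
{
    int bits = x87_significand_bits(mpc);
    double gap = 1.0;

    if (bits < 0)
        return 0.0;
    for (int i = 1; i < bits; i++)
        gap /= 2.0;
    return gap;
}
```

<p>For <samp>-mpc32</samp> and <samp>-mpc64</samp> the resulting gaps coincide with <code>FLT_EPSILON</code> and <code>DBL_EPSILON</code>, which is why library code written for extended precision can lose accuracy under those settings.</p>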
  849. </dd>
  850. <dt><code>-mstackrealign</code></dt>
  851. <dd><a name="index-mstackrealign"></a>
  852. <p>Realign the stack at entry. On the x86, the <samp>-mstackrealign</samp>
  853. option generates an alternate prologue and epilogue that realigns the
  854. run-time stack if necessary. This supports mixing legacy codes that keep
  855. 4-byte stack alignment with modern codes that keep 16-byte stack alignment for
  856. SSE compatibility. See also the attribute <code>force_align_arg_pointer</code>,
  857. applicable to individual functions.
  858. </p>
  859. </dd>
  860. <dt><code>-mpreferred-stack-boundary=<var>num</var></code></dt>
  861. <dd><a name="index-mpreferred_002dstack_002dboundary-1"></a>
  862. <p>Attempt to keep the stack boundary aligned to a 2 raised to <var>num</var>
  863. byte boundary. If <samp>-mpreferred-stack-boundary</samp> is not specified,
  864. the default is 4 (16 bytes or 128 bits).
  865. </p>
  866. <p><strong>Warning:</strong> When generating code for the x86-64 architecture with
  867. SSE extensions disabled, <samp>-mpreferred-stack-boundary=3</samp> can be
  868. used to keep the stack boundary aligned to an 8-byte boundary. Since the
  869. x86-64 ABI requires 16-byte stack alignment, this is ABI incompatible and
  870. intended for use in controlled environments where stack space is an
  871. important limitation. This option leads to wrong code when functions
  872. compiled with 16-byte stack alignment (such as functions from a standard
  873. library) are called with a misaligned stack. In this case, SSE
  874. instructions may lead to misaligned memory access traps. In addition,
  875. variable arguments are handled incorrectly for 16-byte aligned
  876. objects (including x87 <code>long double</code> and <code>__int128</code>), leading to wrong
  877. results. You must build all modules with
  878. <samp>-mpreferred-stack-boundary=3</samp>, including any libraries. This
  879. includes the system libraries and startup modules.
  880. </p>
  881. </dd>
  882. <dt><code>-mincoming-stack-boundary=<var>num</var></code></dt>
  883. <dd><a name="index-mincoming_002dstack_002dboundary"></a>
  884. <p>Assume the incoming stack is aligned to a 2 raised to <var>num</var> byte
  885. boundary. If <samp>-mincoming-stack-boundary</samp> is not specified,
  886. the one specified by <samp>-mpreferred-stack-boundary</samp> is used.
  887. </p>
  888. <p>On Pentium and Pentium Pro, <code>double</code> and <code>long double</code> values
  889. should be aligned to an 8-byte boundary (see <samp>-malign-double</samp>) or
  890. suffer significant run time performance penalties. On Pentium III, the
  891. Streaming SIMD Extension (SSE) data type <code>__m128</code> may not work
  892. properly if it is not 16-byte aligned.
  893. </p>
  894. <p>To ensure proper alignment of these values on the stack, the stack boundary
  895. must be as aligned as that required by any value stored on the stack.
  896. Further, every function must be generated such that it keeps the stack
  897. aligned. Thus calling a function compiled with a higher preferred
  898. stack boundary from a function compiled with a lower preferred stack
  899. boundary most likely misaligns the stack. It is recommended that
  900. libraries that use callbacks always use the default setting.
  901. </p>
  902. <p>This extra alignment does consume extra stack space, and generally
  903. increases code size. Code that is sensitive to stack space usage, such
  904. as embedded systems and operating system kernels, may want to reduce the
  905. preferred alignment to <samp>-mpreferred-stack-boundary=2</samp>.
  906. </p>
  907. </dd>
  908. <dt><code>-mmmx</code></dt>
  909. <dd><a name="index-mmmx"></a>
  910. </dd>
  911. <dt><code>-msse</code></dt>
  912. <dd><a name="index-msse"></a>
  913. </dd>
  914. <dt><code>-msse2</code></dt>
  915. <dd><a name="index-msse2"></a>
  916. </dd>
  917. <dt><code>-msse3</code></dt>
  918. <dd><a name="index-msse3"></a>
  919. </dd>
  920. <dt><code>-mssse3</code></dt>
  921. <dd><a name="index-mssse3"></a>
  922. </dd>
  923. <dt><code>-msse4</code></dt>
  924. <dd><a name="index-msse4"></a>
  925. </dd>
  926. <dt><code>-msse4a</code></dt>
  927. <dd><a name="index-msse4a"></a>
  928. </dd>
  929. <dt><code>-msse4.1</code></dt>
  930. <dd><a name="index-msse4_002e1"></a>
  931. </dd>
  932. <dt><code>-msse4.2</code></dt>
  933. <dd><a name="index-msse4_002e2"></a>
  934. </dd>
  935. <dt><code>-mavx</code></dt>
  936. <dd><a name="index-mavx"></a>
  937. </dd>
  938. <dt><code>-mavx2</code></dt>
  939. <dd><a name="index-mavx2"></a>
  940. </dd>
  941. <dt><code>-mavx512f</code></dt>
  942. <dd><a name="index-mavx512f"></a>
  943. </dd>
  944. <dt><code>-mavx512pf</code></dt>
  945. <dd><a name="index-mavx512pf"></a>
  946. </dd>
  947. <dt><code>-mavx512er</code></dt>
  948. <dd><a name="index-mavx512er"></a>
  949. </dd>
  950. <dt><code>-mavx512cd</code></dt>
  951. <dd><a name="index-mavx512cd"></a>
  952. </dd>
  953. <dt><code>-mavx512vl</code></dt>
  954. <dd><a name="index-mavx512vl"></a>
  955. </dd>
  956. <dt><code>-mavx512bw</code></dt>
  957. <dd><a name="index-mavx512bw"></a>
  958. </dd>
  959. <dt><code>-mavx512dq</code></dt>
  960. <dd><a name="index-mavx512dq"></a>
  961. </dd>
  962. <dt><code>-mavx512ifma</code></dt>
  963. <dd><a name="index-mavx512ifma"></a>
  964. </dd>
  965. <dt><code>-mavx512vbmi</code></dt>
  966. <dd><a name="index-mavx512vbmi"></a>
  967. </dd>
  968. <dt><code>-msha</code></dt>
  969. <dd><a name="index-msha"></a>
  970. </dd>
  971. <dt><code>-maes</code></dt>
  972. <dd><a name="index-maes"></a>
  973. </dd>
  974. <dt><code>-mpclmul</code></dt>
  975. <dd><a name="index-mpclmul"></a>
  976. </dd>
  977. <dt><code>-mclflushopt</code></dt>
  978. <dd><a name="index-mclflushopt"></a>
  979. </dd>
  980. <dt><code>-mclwb</code></dt>
  981. <dd><a name="index-mclwb"></a>
  982. </dd>
  983. <dt><code>-mfsgsbase</code></dt>
  984. <dd><a name="index-mfsgsbase"></a>
  985. </dd>
  986. <dt><code>-mptwrite</code></dt>
  987. <dd><a name="index-mptwrite"></a>
  988. </dd>
  989. <dt><code>-mrdrnd</code></dt>
  990. <dd><a name="index-mrdrnd"></a>
  991. </dd>
  992. <dt><code>-mf16c</code></dt>
  993. <dd><a name="index-mf16c"></a>
  994. </dd>
  995. <dt><code>-mfma</code></dt>
  996. <dd><a name="index-mfma"></a>
  997. </dd>
  998. <dt><code>-mpconfig</code></dt>
  999. <dd><a name="index-mpconfig"></a>
  1000. </dd>
  1001. <dt><code>-mwbnoinvd</code></dt>
  1002. <dd><a name="index-mwbnoinvd"></a>
  1003. </dd>
  1004. <dt><code>-mfma4</code></dt>
  1005. <dd><a name="index-mfma4"></a>
  1006. </dd>
  1007. <dt><code>-mprfchw</code></dt>
  1008. <dd><a name="index-mprfchw"></a>
  1009. </dd>
  1010. <dt><code>-mrdpid</code></dt>
  1011. <dd><a name="index-mrdpid"></a>
  1012. </dd>
  1013. <dt><code>-mprefetchwt1</code></dt>
  1014. <dd><a name="index-mprefetchwt1"></a>
  1015. </dd>
  1016. <dt><code>-mrdseed</code></dt>
  1017. <dd><a name="index-mrdseed"></a>
  1018. </dd>
  1019. <dt><code>-msgx</code></dt>
  1020. <dd><a name="index-msgx"></a>
  1021. </dd>
  1022. <dt><code>-mxop</code></dt>
  1023. <dd><a name="index-mxop"></a>
  1024. </dd>
  1025. <dt><code>-mlwp</code></dt>
  1026. <dd><a name="index-mlwp"></a>
  1027. </dd>
  1028. <dt><code>-m3dnow</code></dt>
  1029. <dd><a name="index-m3dnow"></a>
  1030. </dd>
  1031. <dt><code>-m3dnowa</code></dt>
  1032. <dd><a name="index-m3dnowa"></a>
  1033. </dd>
  1034. <dt><code>-mpopcnt</code></dt>
  1035. <dd><a name="index-mpopcnt"></a>
  1036. </dd>
  1037. <dt><code>-mabm</code></dt>
  1038. <dd><a name="index-mabm"></a>
  1039. </dd>
  1040. <dt><code>-madx</code></dt>
  1041. <dd><a name="index-madx"></a>
  1042. </dd>
  1043. <dt><code>-mbmi</code></dt>
  1044. <dd><a name="index-mbmi"></a>
  1045. </dd>
  1046. <dt><code>-mbmi2</code></dt>
  1047. <dd><a name="index-mbmi2"></a>
  1048. </dd>
  1049. <dt><code>-mlzcnt</code></dt>
  1050. <dd><a name="index-mlzcnt"></a>
  1051. </dd>
  1052. <dt><code>-mfxsr</code></dt>
  1053. <dd><a name="index-mfxsr"></a>
  1054. </dd>
  1055. <dt><code>-mxsave</code></dt>
  1056. <dd><a name="index-mxsave"></a>
  1057. </dd>
  1058. <dt><code>-mxsaveopt</code></dt>
  1059. <dd><a name="index-mxsaveopt"></a>
  1060. </dd>
  1061. <dt><code>-mxsavec</code></dt>
  1062. <dd><a name="index-mxsavec"></a>
  1063. </dd>
  1064. <dt><code>-mxsaves</code></dt>
  1065. <dd><a name="index-mxsaves"></a>
  1066. </dd>
  1067. <dt><code>-mrtm</code></dt>
  1068. <dd><a name="index-mrtm"></a>
  1069. </dd>
  1070. <dt><code>-mhle</code></dt>
  1071. <dd><a name="index-mhle"></a>
  1072. </dd>
  1073. <dt><code>-mtbm</code></dt>
  1074. <dd><a name="index-mtbm"></a>
  1075. </dd>
  1076. <dt><code>-mmwaitx</code></dt>
  1077. <dd><a name="index-mmwaitx"></a>
  1078. </dd>
  1079. <dt><code>-mclzero</code></dt>
  1080. <dd><a name="index-mclzero"></a>
  1081. </dd>
  1082. <dt><code>-mpku</code></dt>
  1083. <dd><a name="index-mpku"></a>
  1084. </dd>
  1085. <dt><code>-mavx512vbmi2</code></dt>
  1086. <dd><a name="index-mavx512vbmi2"></a>
  1087. </dd>
  1088. <dt><code>-mavx512bf16</code></dt>
  1089. <dd><a name="index-mavx512bf16"></a>
  1090. </dd>
  1091. <dt><code>-mgfni</code></dt>
  1092. <dd><a name="index-mgfni"></a>
  1093. </dd>
  1094. <dt><code>-mvaes</code></dt>
  1095. <dd><a name="index-mvaes"></a>
  1096. </dd>
  1097. <dt><code>-mwaitpkg</code></dt>
  1098. <dd><a name="index-mwaitpkg"></a>
  1099. </dd>
  1100. <dt><code>-mvpclmulqdq</code></dt>
  1101. <dd><a name="index-mvpclmulqdq"></a>
  1102. </dd>
  1103. <dt><code>-mavx512bitalg</code></dt>
  1104. <dd><a name="index-mavx512bitalg"></a>
  1105. </dd>
  1106. <dt><code>-mmovdiri</code></dt>
  1107. <dd><a name="index-mmovdiri"></a>
  1108. </dd>
  1109. <dt><code>-mmovdir64b</code></dt>
  1110. <dd><a name="index-mmovdir64b"></a>
  1111. </dd>
  1112. <dt><code>-menqcmd</code></dt>
  1113. <dd><a name="index-menqcmd"></a>
  1114. </dd>
  1115. <dt><code>-mavx512vpopcntdq</code></dt>
  1116. <dd><a name="index-mavx512vpopcntdq"></a>
  1117. </dd>
  1118. <dt><code>-mavx512vp2intersect</code></dt>
  1119. <dd><a name="index-mavx512vp2intersect"></a>
  1120. </dd>
  1121. <dt><code>-mavx5124fmaps</code></dt>
  1122. <dd><a name="index-mavx5124fmaps"></a>
  1123. </dd>
  1124. <dt><code>-mavx512vnni</code></dt>
  1125. <dd><a name="index-mavx512vnni"></a>
  1126. </dd>
  1127. <dt><code>-mavx5124vnniw</code></dt>
  1128. <dd><a name="index-mavx5124vnniw"></a>
  1129. </dd>
  1130. <dt><code>-mcldemote</code></dt>
  1131. <dd><a name="index-mcldemote"></a>
  1132. <p>These switches enable the use of instructions in the MMX, SSE,
  1133. SSE2, SSE3, SSSE3, SSE4, SSE4A, SSE4.1, SSE4.2, AVX, AVX2, AVX512F, AVX512PF,
  1134. AVX512ER, AVX512CD, AVX512VL, AVX512BW, AVX512DQ, AVX512IFMA, AVX512VBMI, SHA,
  1135. AES, PCLMUL, CLFLUSHOPT, CLWB, FSGSBASE, PTWRITE, RDRND, F16C, FMA, PCONFIG,
  1136. WBNOINVD, FMA4, PREFETCHW, RDPID, PREFETCHWT1, RDSEED, SGX, XOP, LWP,
  1137. 3DNow!, enhanced 3DNow!, POPCNT, ABM, ADX, BMI, BMI2, LZCNT, FXSR, XSAVE,
  1138. XSAVEOPT, XSAVEC, XSAVES, RTM, HLE, TBM, MWAITX, CLZERO, PKU, AVX512VBMI2,
  1139. GFNI, VAES, WAITPKG, VPCLMULQDQ, AVX512BITALG, MOVDIRI, MOVDIR64B, AVX512BF16,
  1140. ENQCMD, AVX512VPOPCNTDQ, AVX5124FMAPS, AVX512VNNI, AVX5124VNNIW, or CLDEMOTE
  1141. extended instruction sets. Each has a corresponding <samp>-mno-</samp> option to
  1142. disable use of these instructions.
  1143. </p>
  1144. <p>These extensions are also available as built-in functions: see
  1145. <a href="x86-Built_002din-Functions.html#x86-Built_002din-Functions">x86 Built-in Functions</a>, for details of the functions enabled and
  1146. disabled by these switches.
  1147. </p>
  1148. <p>To generate SSE/SSE2 instructions automatically from floating-point
  1149. code (as opposed to 387 instructions), see <samp>-mfpmath=sse</samp>.
  1150. </p>
  1151. <p>GCC suppresses SSEx instructions when <samp>-mavx</samp> is used. Instead, it
  1152. generates new AVX instructions or AVX equivalents for all SSEx instructions
  1153. when needed.
  1154. </p>
  1155. <p>These options enable GCC to use these extended instructions in
  1156. generated code, even without <samp>-mfpmath=sse</samp>. Applications that
  1157. perform run-time CPU detection must compile separate files for each
  1158. supported architecture, using the appropriate flags. In particular,
  1159. the file containing the CPU detection code should be compiled without
  1160. these options.
  1161. </p>
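<p>A run-time dispatcher along these lines might look like the following sketch. It assumes the GCC built-ins <code>__builtin_cpu_init</code> and <code>__builtin_cpu_supports</code> on x86 targets; all other names are hypothetical. In a real program the AVX2 variant would live in its own file compiled with <samp>-mavx2</samp>, whereas here it is a plain C placeholder so that the sketch stays in one file built without any of these options.</p>

```c
#include <stddef.h>

/* Baseline implementation: safe on every CPU the program supports. */
static long sum_generic(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Placeholder for a variant that would be built separately with -mavx2. */
static long sum_avx2_placeholder(const int *v, size_t n)
{
    return sum_generic(v, n);
}

typedef long (*sum_fn)(const int *, size_t);

/* CPU detection code: this function must be compiled without -mavx2
   so that it can run on processors lacking the extension. */
sum_fn select_sum(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return sum_avx2_placeholder;
#endif
    return sum_generic;
}
```

<p>Calling <code>select_sum()</code> once at start-up and storing the returned pointer keeps the detection cost out of the hot path.</p>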
  1162. </dd>
  1163. <dt><code>-mdump-tune-features</code></dt>
  1164. <dd><a name="index-mdump_002dtune_002dfeatures"></a>
  1165. <p>This option instructs GCC to dump the names of the x86 performance
  1166. tuning features and default settings. The names can be used in
  1167. <samp>-mtune-ctrl=<var>feature-list</var></samp>.
  1168. </p>
  1169. </dd>
  1170. <dt><code>-mtune-ctrl=<var>feature-list</var></code></dt>
  1171. <dd><a name="index-mtune_002dctrl_003dfeature_002dlist"></a>
  1172. <p>This option is used to do fine-grained control of x86 code generation features.
  1173. <var>feature-list</var> is a comma-separated list of <var>feature</var> names. See also
  1174. <samp>-mdump-tune-features</samp>. When specified, the <var>feature</var> is turned
  1175. on if it is not preceded with &lsquo;<samp>^</samp>&rsquo;, otherwise, it is turned off.
  1176. <samp>-mtune-ctrl=<var>feature-list</var></samp> is intended to be used by GCC
  1177. developers. Using it may lead to code paths not covered by testing and can
  1178. potentially result in compiler ICEs or runtime errors.
  1179. </p>
  1180. </dd>
  1181. <dt><code>-mno-default</code></dt>
  1182. <dd><a name="index-mno_002ddefault"></a>
  1183. <p>This option instructs GCC to turn off all tunable features. See also
  1184. <samp>-mtune-ctrl=<var>feature-list</var></samp> and <samp>-mdump-tune-features</samp>.
  1185. </p>
  1186. </dd>
  1187. <dt><code>-mcld</code></dt>
  1188. <dd><a name="index-mcld"></a>
  1189. <p>This option instructs GCC to emit a <code>cld</code> instruction in the prologue
  1190. of functions that use string instructions. String instructions depend on
  1191. the DF flag to select between autoincrement or autodecrement mode. While the
  1192. ABI specifies the DF flag to be cleared on function entry, some operating
  1193. systems violate this specification by not clearing the DF flag in their
  1194. exception dispatchers. The exception handler can be invoked with the DF flag
  1195. set, which leads to wrong direction mode when string instructions are used.
  1196. This option can be enabled by default on 32-bit x86 targets by configuring
  1197. GCC with the <samp>--enable-cld</samp> configure option. Generation of <code>cld</code>
  1198. instructions can be suppressed with the <samp>-mno-cld</samp> compiler option
  1199. in this case.
  1200. </p>
  1201. </dd>
  1202. <dt><code>-mvzeroupper</code></dt>
  1203. <dd><a name="index-mvzeroupper"></a>
  1204. <p>This option instructs GCC to emit a <code>vzeroupper</code> instruction
  1205. before a transfer of control flow out of the function to minimize
  1206. the AVX to SSE transition penalty as well as remove unnecessary <code>zeroupper</code>
  1207. intrinsics.
  1208. </p>
  1209. </dd>
  1210. <dt><code>-mprefer-avx128</code></dt>
  1211. <dd><a name="index-mprefer_002davx128"></a>
  1212. <p>This option instructs GCC to use 128-bit AVX instructions instead of
  1213. 256-bit AVX instructions in the auto-vectorizer.
  1214. </p>
  1215. </dd>
  1216. <dt><code>-mprefer-vector-width=<var>opt</var></code></dt>
  1217. <dd><a name="index-mprefer_002dvector_002dwidth"></a>
  1218. <p>This option instructs GCC to use <var>opt</var>-bit vector width in instructions
  1219. instead of the default on the selected platform.
  1220. </p>
  1221. <dl compact="compact">
  1222. <dt>&lsquo;<samp>none</samp>&rsquo;</dt>
  1223. <dd><p>No extra limitations are applied to GCC other than those defined by the selected platform.
  1224. </p>
  1225. </dd>
  1226. <dt>&lsquo;<samp>128</samp>&rsquo;</dt>
  1227. <dd><p>Prefer 128-bit vector width for instructions.
  1228. </p>
  1229. </dd>
  1230. <dt>&lsquo;<samp>256</samp>&rsquo;</dt>
  1231. <dd><p>Prefer 256-bit vector width for instructions.
  1232. </p>
  1233. </dd>
  1234. <dt>&lsquo;<samp>512</samp>&rsquo;</dt>
  1235. <dd><p>Prefer 512-bit vector width for instructions.
  1236. </p></dd>
  1237. </dl>
  1238. </dd>
  1239. <dt><code>-mcx16</code></dt>
  1240. <dd><a name="index-mcx16"></a>
  1241. <p>This option enables GCC to generate <code>CMPXCHG16B</code> instructions in 64-bit
  1242. code to implement compare-and-exchange operations on 16-byte aligned 128-bit
  1243. objects. This is useful for atomic updates of data structures exceeding one
  1244. machine word in size. The compiler uses this instruction to implement
  1245. <a href="_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins">__sync Builtins</a>. However, for <a href="_005f_005fatomic-Builtins.html#g_t_005f_005fatomic-Builtins">__atomic Builtins</a> operating on
  1246. 128-bit integers, a library call is always used.
  1247. </p>
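<p>A compare-and-exchange on a double-width object might be written as in the sketch below. It relies on the predefined macro <code>__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16</code>, which is assumed to be set when <samp>-mcx16</samp> is in effect; the type <code>wide_word</code> and the function <code>replace_if_equal</code> are hypothetical names, and the 64-bit fallback exists only so that the sketch also builds without the option.</p>

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16)
typedef unsigned __int128 wide_word;  /* 128-bit object, 16-byte aligned */
#else
typedef uint64_t wide_word;           /* fallback when -mcx16 is not used */
#endif

/* Atomically stores desired into *slot if *slot still equals expected.
   Returns true when the exchange took place. */
bool replace_if_equal(wide_word *slot, wide_word expected, wide_word desired)
{
    return __sync_bool_compare_and_swap(slot, expected, desired);
}
```

<p>A typical use packs a pointer and a version counter into one <code>wide_word</code> so that both are updated in a single atomic step.</p>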
  1248. </dd>
  1249. <dt><code>-msahf</code></dt>
  1250. <dd><a name="index-msahf"></a>
  1251. <p>This option enables generation of <code>SAHF</code> instructions in 64-bit code.
  1252. Early Intel Pentium 4 CPUs with Intel 64 support,
  1253. prior to the introduction of Pentium 4 G1 step in December 2005,
  1254. lacked the <code>LAHF</code> and <code>SAHF</code> instructions
  1255. which are supported by AMD64.
  1256. These are load and store instructions, respectively, for certain status flags.
  1257. In 64-bit mode, the <code>SAHF</code> instruction is used to optimize <code>fmod</code>,
  1258. <code>drem</code>, and <code>remainder</code> built-in functions;
  1259. see <a href="Other-Builtins.html#Other-Builtins">Other Builtins</a> for details.
  1260. </p>
  1261. </dd>
  1262. <dt><code>-mmovbe</code></dt>
  1263. <dd><a name="index-mmovbe"></a>
  1264. <p>This option enables use of the <code>movbe</code> instruction to implement
  1265. <code>__builtin_bswap32</code> and <code>__builtin_bswap64</code>.
  1266. </p>
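<p>The byte-swap built-ins typically appear in code that reads or writes big-endian data, as in this minimal sketch. The function <code>load_be32</code> is a hypothetical helper; whether the compiler actually selects <code>movbe</code> for it depends on the target and options in use.</p>

```c
#include <stdint.h>
#include <string.h>

/* Reverses the byte order of a 32-bit value. */
uint32_t swap32(uint32_t x)
{
    return __builtin_bswap32(x);
}

/* Reverses the byte order of a 64-bit value. */
uint64_t swap64(uint64_t x)
{
    return __builtin_bswap64(x);
}

/* Reads a big-endian 32-bit integer from memory: a load followed by a
   byte swap on little-endian hosts, which is the pattern a combined
   load-and-swap instruction can cover. */
uint32_t load_be32(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap32(v);
#endif
    return v;
}
```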
  1267. </dd>
  1268. <dt><code>-mshstk</code></dt>
  1269. <dd><a name="index-mshstk"></a>
  1270. <p>The <samp>-mshstk</samp> option enables shadow stack built-in functions
  1271. from x86 Control-flow Enforcement Technology (CET).
  1272. </p>
  1273. </dd>
  1274. <dt><code>-mcrc32</code></dt>
  1275. <dd><a name="index-mcrc32"></a>
  1276. <p>This option enables built-in functions <code>__builtin_ia32_crc32qi</code>,
  1277. <code>__builtin_ia32_crc32hi</code>, <code>__builtin_ia32_crc32si</code> and
  1278. <code>__builtin_ia32_crc32di</code> to generate the <code>crc32</code> machine instruction.
  1279. </p>
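<p>For reference, the checksum accumulated by the <code>crc32</code> instruction is CRC-32C (Castagnoli polynomial). The portable sketch below is intended to mirror a byte-wise update step in software so that results can be compared on machines without the instruction; <code>crc32c_update_byte</code> and <code>crc32c</code> are hypothetical names, and the initial value and final inversion in <code>crc32c</code> follow the common CRC-32C convention rather than anything the built-ins do themselves.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Folds one byte into a running CRC-32C value, bit by bit, using the
   reflected Castagnoli polynomial 0x82F63B78. */
uint32_t crc32c_update_byte(uint32_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    return crc;
}

/* CRC-32C of a buffer with the conventional all-ones initial value
   and final inversion. */
uint32_t crc32c(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    while (len--)
        crc = crc32c_update_byte(crc, *p++);
    return crc ^ 0xFFFFFFFFu;
}
```

<p>A hardware path would replace the inner loop with one built-in call per byte, halfword, word or doubleword, keeping the same surrounding convention.</p>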
  1280. </dd>
  1281. <dt><code>-mrecip</code></dt>
  1282. <dd><a name="index-mrecip-1"></a>
  1283. <p>This option enables use of <code>RCPSS</code> and <code>RSQRTSS</code> instructions
  1284. (and their vectorized variants <code>RCPPS</code> and <code>RSQRTPS</code>)
  1285. with an additional Newton-Raphson step
  1286. to increase precision instead of <code>DIVSS</code> and <code>SQRTSS</code>
  1287. (and their vectorized
  1288. variants) for single-precision floating-point arguments. These instructions
  1289. are generated only when <samp>-funsafe-math-optimizations</samp> is enabled
  1290. together with <samp>-ffinite-math-only</samp> and <samp>-fno-trapping-math</samp>.
  1291. Note that while the throughput of the sequence is higher than the throughput
  1292. of the non-reciprocal instruction, the precision of the sequence can be
  1293. decreased by up to 2 ulp (i.e. the inverse of 1.0 equals 0.99999994).
  1294. </p>
  1295. <p>Note that GCC implements <code>1.0f/sqrtf(<var>x</var>)</code> in terms of <code>RSQRTSS</code>
  1296. (or <code>RSQRTPS</code>) already with <samp>-ffast-math</samp> (or the above option
  1297. combination), and doesn&rsquo;t need <samp>-mrecip</samp>.
  1298. </p>
  1299. <p>Also note that GCC emits the above sequence with an additional Newton-Raphson step
  1300. for vectorized single-float division and vectorized <code>sqrtf(<var>x</var>)</code>
  1301. already with <samp>-ffast-math</samp> (or the above option combination), and
  1302. doesn&rsquo;t need <samp>-mrecip</samp>.
  1303. </p>
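<p>The Newton-Raphson refinement mentioned above can be modeled in plain C as
follows. This is only a sketch of the arithmetic: the function names are
hypothetical, the initial estimate is passed in by the caller instead of coming
from <code>RCPSS</code> or <code>RSQRTSS</code>, and the instruction sequence
GCC actually emits may be arranged differently.
</p>

```c
/* One Newton-Raphson refinement of an estimate x0 of 1/d:
   x1 = x0 * (2 - d * x0). */
float refine_recip(float d, float x0)
{
    return x0 * (2.0f - d * x0);
}

/* One Newton-Raphson refinement of an estimate y0 of 1/sqrt(a):
   y1 = y0 * (1.5 - 0.5 * a * y0 * y0). */
float refine_rsqrt(float a, float y0)
{
    return y0 * (1.5f - 0.5f * a * y0 * y0);
}
```

<p>Each step roughly squares the relative error of the estimate, which is why a
single step after a low-precision hardware estimate gets close to, but not
always exactly, the correctly rounded result.
</p>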
  1304. </dd>
  1305. <dt><code>-mrecip=<var>opt</var></code></dt>
  1306. <dd><a name="index-mrecip_003dopt-1"></a>
  1307. <p>This option controls which reciprocal estimate instructions
  1308. may be used. <var>opt</var> is a comma-separated list of options, which may
  1309. be preceded by a &lsquo;<samp>!</samp>&rsquo; to invert the option:
  1310. </p>
  1311. <dl compact="compact">
  1312. <dt>&lsquo;<samp>all</samp>&rsquo;</dt>
  1313. <dd><p>Enable all estimate instructions.
  1314. </p>
  1315. </dd>
  1316. <dt>&lsquo;<samp>default</samp>&rsquo;</dt>
  1317. <dd><p>Enable the default instructions, equivalent to <samp>-mrecip</samp>.
  1318. </p>
  1319. </dd>
  1320. <dt>&lsquo;<samp>none</samp>&rsquo;</dt>
  1321. <dd><p>Disable all estimate instructions, equivalent to <samp>-mno-recip</samp>.
  1322. </p>
  1323. </dd>
  1324. <dt>&lsquo;<samp>div</samp>&rsquo;</dt>
  1325. <dd><p>Enable the approximation for scalar division.
  1326. </p>
  1327. </dd>
  1328. <dt>&lsquo;<samp>vec-div</samp>&rsquo;</dt>
  1329. <dd><p>Enable the approximation for vectorized division.
  1330. </p>
  1331. </dd>
  1332. <dt>&lsquo;<samp>sqrt</samp>&rsquo;</dt>
  1333. <dd><p>Enable the approximation for scalar square root.
  1334. </p>
  1335. </dd>
  1336. <dt>&lsquo;<samp>vec-sqrt</samp>&rsquo;</dt>
  1337. <dd><p>Enable the approximation for vectorized square root.
  1338. </p></dd>
  1339. </dl>
  1340. <p>So, for example, <samp>-mrecip=all,!sqrt</samp> enables
  1341. all of the reciprocal approximations, except for square root.
  1342. </p>
  1343. </dd>
  1344. <dt><code>-mveclibabi=<var>type</var></code></dt>
  1345. <dd><a name="index-mveclibabi-1"></a>
  1346. <p>Specifies the ABI type to use for vectorizing intrinsics using an
  1347. external library. Supported values for <var>type</var> are &lsquo;<samp>svml</samp>&rsquo;
  1348. for the Intel short
  1349. vector math library and &lsquo;<samp>acml</samp>&rsquo; for the AMD math core library.
  1350. To use this option, both <samp>-ftree-vectorize</samp> and
  1351. <samp>-funsafe-math-optimizations</samp> have to be enabled, and an SVML or ACML
  1352. ABI-compatible library must be specified at link time.
  1353. </p>
  1354. <p>GCC currently emits calls to <code>vmldExp2</code>,
  1355. <code>vmldLn2</code>, <code>vmldLog102</code>, <code>vmldPow2</code>,
  1356. <code>vmldTanh2</code>, <code>vmldTan2</code>, <code>vmldAtan2</code>, <code>vmldAtanh2</code>,
  1357. <code>vmldCbrt2</code>, <code>vmldSinh2</code>, <code>vmldSin2</code>, <code>vmldAsinh2</code>,
  1358. <code>vmldAsin2</code>, <code>vmldCosh2</code>, <code>vmldCos2</code>, <code>vmldAcosh2</code>,
  1359. <code>vmldAcos2</code>, <code>vmlsExp4</code>, <code>vmlsLn4</code>,
  1360. <code>vmlsLog104</code>, <code>vmlsPow4</code>, <code>vmlsTanh4</code>, <code>vmlsTan4</code>,
  1361. <code>vmlsAtan4</code>, <code>vmlsAtanh4</code>, <code>vmlsCbrt4</code>, <code>vmlsSinh4</code>,
  1362. <code>vmlsSin4</code>, <code>vmlsAsinh4</code>, <code>vmlsAsin4</code>, <code>vmlsCosh4</code>,
  1363. <code>vmlsCos4</code>, <code>vmlsAcosh4</code> and <code>vmlsAcos4</code> for the corresponding
  1364. function type when <samp>-mveclibabi=svml</samp> is used, and <code>__vrd2_sin</code>,
  1365. <code>__vrd2_cos</code>, <code>__vrd2_exp</code>, <code>__vrd2_log</code>, <code>__vrd2_log2</code>,
  1366. <code>__vrd2_log10</code>, <code>__vrs4_sinf</code>, <code>__vrs4_cosf</code>,
  1367. <code>__vrs4_expf</code>, <code>__vrs4_logf</code>, <code>__vrs4_log2f</code>,
  1368. <code>__vrs4_log10f</code> and <code>__vrs4_powf</code> for the corresponding function type
  1369. when <samp>-mveclibabi=acml</samp> is used.
  1370. </p>
  1371. </dd>
  1372. <dt><code>-mabi=<var>name</var></code></dt>
  1373. <dd><a name="index-mabi-5"></a>
  1374. <p>Generate code for the specified calling convention. Permissible values
  1375. are &lsquo;<samp>sysv</samp>&rsquo; for the ABI used on GNU/Linux and other systems, and
  1376. &lsquo;<samp>ms</samp>&rsquo; for the Microsoft ABI. The default is to use the Microsoft
  1377. ABI when targeting Microsoft Windows and the SysV ABI on all other systems.
  1378. You can control this behavior for specific functions by
  1379. using the function attributes <code>ms_abi</code> and <code>sysv_abi</code>.
  1380. See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>.
  1381. </p>
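<p>As an illustration of the per-function attributes, the sketch below declares
one function with each calling convention and calls both from ordinary code.
The function names are made up for this example; the compiler inserts whatever
conversion is needed at each call site.
</p>

```c
/* Uses the Microsoft calling convention regardless of -mabi. */
__attribute__((ms_abi)) long add_ms(long a, long b)
{
    return a + b;
}

/* Uses the System V calling convention regardless of -mabi. */
__attribute__((sysv_abi)) long add_sysv(long a, long b)
{
    return a + b;
}

/* A caller compiled with the default ABI can use both. */
long add_both(long a, long b)
{
    return add_ms(a, b) + add_sysv(a, b);
}
```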
  1382. </dd>
  1383. <dt><code>-mforce-indirect-call</code></dt>
  1384. <dd><a name="index-mforce_002dindirect_002dcall"></a>
  1385. <p>Force all calls to functions to be indirect. This is useful
  1386. when using Intel Processor Trace where it generates more precise timing
  1387. information for function calls.
  1388. </p>
  1389. </dd>
  1390. <dt><code>-mmanual-endbr</code></dt>
  1391. <dd><a name="index-mmanual_002dendbr"></a>
  1392. <p>Insert an ENDBR instruction at function entry only via the <code>cf_check</code>
  1393. function attribute. This is useful when used with the option
  1394. <samp>-fcf-protection=branch</samp> to control ENDBR insertion at the
  1395. function entry.
  1396. </p>
  1397. </dd>
  1398. <dt><code>-mcall-ms2sysv-xlogues</code></dt>
  1399. <dd><a name="index-mcall_002dms2sysv_002dxlogues"></a>
  1400. <a name="index-mno_002dcall_002dms2sysv_002dxlogues"></a>
  1401. <p>Due to differences in 64-bit ABIs, any Microsoft ABI function that calls a
  1402. System V ABI function must consider RSI, RDI and XMM6-15 as clobbered. By
  1403. default, the code for saving and restoring these registers is emitted inline,
  1404. resulting in fairly lengthy prologues and epilogues. Using
  1405. <samp>-mcall-ms2sysv-xlogues</samp> emits prologues and epilogues that
  1406. use stubs in the static portion of libgcc to perform these saves and restores,
  1407. thus reducing function size at the cost of a few extra instructions.
  1408. </p>
  1409. </dd>
  1410. <dt><code>-mtls-dialect=<var>type</var></code></dt>
  1411. <dd><a name="index-mtls_002ddialect-1"></a>
  1412. <p>Generate code to access thread-local storage using the &lsquo;<samp>gnu</samp>&rsquo; or
  1413. &lsquo;<samp>gnu2</samp>&rsquo; conventions. &lsquo;<samp>gnu</samp>&rsquo; is the conservative default;
  1414. &lsquo;<samp>gnu2</samp>&rsquo; is more efficient, but it may add compile- and run-time
  1415. requirements that cannot be satisfied on all systems.
  1416. </p>
  1417. </dd>
  1418. <dt><code>-mpush-args</code></dt>
  1419. <dt><code>-mno-push-args</code></dt>
  1420. <dd><a name="index-mpush_002dargs"></a>
  1421. <a name="index-mno_002dpush_002dargs"></a>
  1422. <p>Use PUSH operations to store outgoing parameters. This method is shorter
  1423. and usually as fast as the method using SUB/MOV operations and is enabled
  1424. by default. In some cases disabling it may improve performance because of
  1425. improved scheduling and reduced dependencies.
  1426. </p>
  1427. </dd>
  1428. <dt><code>-maccumulate-outgoing-args</code></dt>
  1429. <dd><a name="index-maccumulate_002doutgoing_002dargs-1"></a>
  1430. <p>If enabled, the maximum amount of space required for outgoing arguments is
  1431. computed in the function prologue. This is faster on most modern CPUs
  1432. because of reduced dependencies, improved scheduling and reduced stack usage
  1433. when the preferred stack boundary is not equal to 2. The drawback is a notable
  1434. increase in code size. This switch implies <samp>-mno-push-args</samp>.
  1435. </p>
  1436. </dd>
  1437. <dt><code>-mthreads</code></dt>
  1438. <dd><a name="index-mthreads"></a>
  1439. <p>Support thread-safe exception handling on MinGW. Programs that rely
  1440. on thread-safe exception handling must compile and link all code with the
  1441. <samp>-mthreads</samp> option. When compiling, <samp>-mthreads</samp> defines
  1442. <samp>-D_MT</samp>; when linking, it links in a special thread helper library
  1443. <samp>-lmingwthrd</samp> which cleans up per-thread exception-handling data.
  1444. </p>
  1445. </dd>
  1446. <dt><code>-mms-bitfields</code></dt>
  1447. <dt><code>-mno-ms-bitfields</code></dt>
  1448. <dd><a name="index-mms_002dbitfields"></a>
  1449. <a name="index-mno_002dms_002dbitfields"></a>
  1450. <p>Enable/disable bit-field layout compatible with the native Microsoft
  1451. Windows compiler.
  1452. </p>
  1453. <p>If <code>packed</code> is used on a structure, or if bit-fields are used,
  1454. it may be that the Microsoft ABI lays out the structure differently
  1455. than the way GCC normally does. Particularly when moving packed
  1456. data between functions compiled with GCC and the native Microsoft compiler
  1457. (either via function call or as data in a file), it may be necessary to access
  1458. either format.
  1459. </p>
  1460. <p>This option is enabled by default for Microsoft Windows
  1461. targets. This behavior can also be controlled locally by use of variable
  1462. or type attributes. For more information, see <a href="x86-Variable-Attributes.html#x86-Variable-Attributes">x86 Variable Attributes</a>
  1463. and <a href="x86-Type-Attributes.html#x86-Type-Attributes">x86 Type Attributes</a>.
  1464. </p>
  1465. <p>The Microsoft structure layout algorithm is fairly simple with the exception
  1466. of the bit-field packing.
  1467. The padding and alignment of members of structures and whether a bit-field
  1468. can straddle a storage-unit boundary are determined by these rules:
  1469. </p>
  1470. <ol>
  1471. <li> Structure members are stored sequentially in the order in which they are
  1472. declared: the first member has the lowest memory address and the last member
  1473. the highest.
  1474. </li><li> Every data object has an alignment requirement. The alignment requirement
  1475. for all data except structures, unions, and arrays is either the size of the
  1476. object or the current packing size (specified with either the
  1477. <code>aligned</code> attribute or the <code>pack</code> pragma),
  1478. whichever is less. For structures, unions, and arrays,
  1479. the alignment requirement is the largest alignment requirement of its members.
  1480. Every object is allocated an offset so that:
  1481. <div class="smallexample">
  1482. <pre class="smallexample">offset % alignment_requirement == 0
  1483. </pre></div>
  1484. </li><li> Adjacent bit-fields are packed into the same 1-, 2-, or 4-byte allocation
  1485. unit if the integral types are the same size and if the next bit-field fits
  1486. into the current allocation unit without crossing the boundary imposed by the
  1487. common alignment requirements of the bit-fields.
  1488. </li></ol>
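<p>The offset rule in the second item can be sketched with two small helpers.
Both names are hypothetical, and the code models only scalar members; it does
not attempt the bit-field packing described in the third item.
</p>

```c
#include <stddef.h>

/* Smallest value >= offset that satisfies
   offset % alignment_requirement == 0. */
size_t align_up(size_t offset, size_t alignment_requirement)
{
    size_t rem = offset % alignment_requirement;
    return rem ? offset + (alignment_requirement - rem) : offset;
}

/* Alignment requirement of a scalar member: the smaller of the object
   size and the current packing size. */
size_t member_alignment(size_t object_size, size_t packing_size)
{
    return object_size < packing_size ? object_size : packing_size;
}
```

<p>For instance, a 2-byte member that follows a 1-byte member at offset 0 is
placed at <code>align_up(1, 2)</code>, that is, at offset 2.
</p>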
  1489. <p>MSVC interprets zero-length bit-fields in the following ways:
  1490. </p>
  1491. <ol>
  1492. <li> If a zero-length bit-field is inserted between two bit-fields that
  1493. are normally coalesced, the bit-fields are not coalesced.
  1494. <p>For example:
  1495. </p>
  1496. <div class="smallexample">
  1497. <pre class="smallexample">struct
  1498. {
  1499. unsigned long bf_1 : 12;
  1500. unsigned long : 0;
  1501. unsigned long bf_2 : 12;
  1502. } t1;
  1503. </pre></div>
  1504. <p>The size of <code>t1</code> is 8 bytes with the zero-length bit-field. If the
  1505. zero-length bit-field were removed, <code>t1</code>&rsquo;s size would be 4 bytes.
  1506. </p>
  1507. </li><li> If a zero-length bit-field is inserted after a bit-field, <code>foo</code>, and the
  1508. alignment of the zero-length bit-field is greater than the member that follows it,
  1509. <code>bar</code>, <code>bar</code> is aligned as the type of the zero-length bit-field.
  1510. <p>For example:
  1511. </p>
  1512. <div class="smallexample">
  1513. <pre class="smallexample">struct
  1514. {
  1515. char foo : 4;
  1516. short : 0;
  1517. char bar;
  1518. } t2;
  1519. struct
  1520. {
  1521. char foo : 4;
  1522. short : 0;
  1523. double bar;
  1524. } t3;
  1525. </pre></div>
  1526. <p>For <code>t2</code>, <code>bar</code> is placed at offset 2, rather than offset 1.
  1527. Accordingly, the size of <code>t2</code> is 4. For <code>t3</code>, the zero-length
  1528. bit-field does not affect the alignment of <code>bar</code> or, as a result, the size
  1529. of the structure.
  1530. </p>
  1531. <p>Taking this into account, it is important to note the following:
  1532. </p>
  1533. <ol>
  1534. <li> If a zero-length bit-field follows a normal bit-field, the type of the
  1535. zero-length bit-field may affect the alignment of the structure as a whole. For
  1536. example, <code>t2</code> has a size of 4 bytes, since the zero-length bit-field follows a
  1537. normal bit-field, and is of type short.
  1538. </li><li> Even if a zero-length bit-field is not followed by a normal bit-field, it may
  1539. still affect the alignment of the structure:
  1540. <div class="smallexample">
  1541. <pre class="smallexample">struct
  1542. {
  1543. char foo : 6;
  1544. long : 0;
  1545. } t4;
  1546. </pre></div>
  1547. <p>Here, <code>t4</code> takes up 4 bytes.
  1548. </p></li></ol>
  1549. </li><li> Zero-length bit-fields following non-bit-field members are ignored:
  1550. <div class="smallexample">
  1551. <pre class="smallexample">struct
  1552. {
  1553. char foo;
  1554. long : 0;
  1555. char bar;
  1556. } t5;
  1557. </pre></div>
  1558. <p>Here, <code>t5</code> takes up 2 bytes.
  1559. </p></li></ol>
  1560. </dd>
  1561. <dt><code>-mno-align-stringops</code></dt>
  1562. <dd><a name="index-mno_002dalign_002dstringops"></a>
  1563. <a name="index-malign_002dstringops"></a>
  1564. <p>Do not align the destination of inlined string operations. This switch reduces
  1565. code size and improves performance in case the destination is already aligned,
  1566. but GCC doesn&rsquo;t know about it.
  1567. </p>
  1568. </dd>
  1569. <dt><code>-minline-all-stringops</code></dt>
  1570. <dd><a name="index-minline_002dall_002dstringops"></a>
  1571. <p>By default GCC inlines string operations only when the destination is
  1572. known to be aligned to at least a 4-byte boundary.
  1573. This option enables more inlining and increases code
  1574. size, but may improve performance of code that depends on fast
  1575. <code>memcpy</code> and <code>memset</code> for short lengths.
  1576. The option enables inline expansion of <code>strlen</code> for all
  1577. pointer alignments.
  1578. </p>
  1579. </dd>
  1580. <dt><code>-minline-stringops-dynamically</code></dt>
  1581. <dd><a name="index-minline_002dstringops_002ddynamically"></a>
  1582. <p>For string operations of unknown size, use run-time checks with
  1583. inline code for small blocks and a library call for large blocks.
  1584. </p>
  1585. </dd>
  1586. <dt><code>-mstringop-strategy=<var>alg</var></code></dt>
  1587. <dd><a name="index-mstringop_002dstrategy_003dalg"></a>
  1588. <p>Override the internal decision heuristic for the particular algorithm to use
  1589. for inlining string operations. The allowed values for <var>alg</var> are:
  1590. </p>
  1591. <dl compact="compact">
  1592. <dt>&lsquo;<samp>rep_byte</samp>&rsquo;</dt>
  1593. <dt>&lsquo;<samp>rep_4byte</samp>&rsquo;</dt>
  1594. <dt>&lsquo;<samp>rep_8byte</samp>&rsquo;</dt>
  1595. <dd><p>Expand using i386 <code>rep</code> prefix of the specified size.
  1596. </p>
  1597. </dd>
  1598. <dt>&lsquo;<samp>byte_loop</samp>&rsquo;</dt>
  1599. <dt>&lsquo;<samp>loop</samp>&rsquo;</dt>
  1600. <dt>&lsquo;<samp>unrolled_loop</samp>&rsquo;</dt>
  1601. <dd><p>Expand into an inline loop.
  1602. </p>
  1603. </dd>
  1604. <dt>&lsquo;<samp>libcall</samp>&rsquo;</dt>
  1605. <dd><p>Always use a library call.
  1606. </p></dd>
  1607. </dl>
  1608. </dd>
  1609. <dt><code>-mmemcpy-strategy=<var>strategy</var></code></dt>
  1610. <dd><a name="index-mmemcpy_002dstrategy_003dstrategy"></a>
  1611. <p>Override the internal decision heuristic to decide if <code>__builtin_memcpy</code>
  1612. should be inlined and what inline algorithm to use when the expected size
  1613. of the copy operation is known. <var>strategy</var>
  1614. is a comma-separated list of <var>alg</var>:<var>max_size</var>:<var>dest_align</var> triplets.
  1615. <var>alg</var> is one of the values accepted by <samp>-mstringop-strategy</samp>; <var>max_size</var> specifies
  1616. the maximum byte size for which inline algorithm <var>alg</var> is allowed. For the last
  1617. triplet, the <var>max_size</var> must be <code>-1</code>. The <var>max_size</var> of the triplets
  1618. in the list must be specified in increasing order. The minimal byte size for
  1619. <var>alg</var> is <code>0</code> for the first triplet and <code><var>max_size</var> + 1</code> of the
  1620. preceding range.
  1621. </p>
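<p>The size ranges described above can be modeled as a lookup over the list of
triplets. This is a sketch of the selection rule only, not of GCC's
implementation: the type <code>strategy_entry</code> and the function
<code>select_alg</code> are invented for the example, and the
<var>dest_align</var> field is left out.
</p>

```c
#include <stddef.h>

/* One alg:max_size pair; dest_align is omitted from this model. */
struct strategy_entry {
    const char *alg;    /* a name accepted by -mstringop-strategy */
    long max_size;      /* largest size handled, or -1 for the last entry */
};

/* Return the algorithm whose size range contains size. Entries must be in
   increasing max_size order and the last one must have max_size == -1. */
const char *select_alg(const struct strategy_entry *list, size_t n, long size)
{
    for (size_t i = 0; i < n; i++)
        if (list[i].max_size == -1 || size <= list[i].max_size)
            return list[i].alg;
    return NULL;        /* malformed list: no terminating -1 entry */
}
```

<p>With the entries <code>unrolled_loop:64</code>, <code>rep_8byte:4096</code>
and <code>libcall:-1</code>, sizes 0 to 64 select the first algorithm, sizes 65
to 4096 the second, and anything larger the library call.
</p>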
  1622. </dd>
  1623. <dt><code>-mmemset-strategy=<var>strategy</var></code></dt>
  1624. <dd><a name="index-mmemset_002dstrategy_003dstrategy"></a>
  1625. <p>This option is similar to <samp>-mmemcpy-strategy=</samp> except that it controls
  1626. <code>__builtin_memset</code> expansion.
  1627. </p>
  1628. </dd>
  1629. <dt><code>-momit-leaf-frame-pointer</code></dt>
  1630. <dd><a name="index-momit_002dleaf_002dframe_002dpointer-2"></a>
  1631. <p>Don&rsquo;t keep the frame pointer in a register for leaf functions. This
  1632. avoids the instructions to save, set up, and restore frame pointers and
  1633. makes an extra register available in leaf functions. The option
  1634. <samp>-fomit-leaf-frame-pointer</samp> removes the frame pointer for leaf functions,
  1635. which might make debugging harder.
  1636. </p>
  1637. </dd>
  1638. <dt><code>-mtls-direct-seg-refs</code></dt>
  1639. <dt><code>-mno-tls-direct-seg-refs</code></dt>
  1640. <dd><a name="index-mtls_002ddirect_002dseg_002drefs"></a>
  1641. <p>Controls whether TLS variables may be accessed with offsets from the
  1642. TLS segment register (<code>%gs</code> for 32-bit, <code>%fs</code> for 64-bit),
  1643. or whether the thread base pointer must be added. Whether or not this
  1644. is valid depends on the operating system, and whether it maps the
  1645. segment to cover the entire TLS area.
  1646. </p>
  1647. <p>For systems that use the GNU C Library, the default is on.
  1648. </p>
  1649. </dd>
  1650. <dt><code>-msse2avx</code></dt>
  1651. <dt><code>-mno-sse2avx</code></dt>
  1652. <dd><a name="index-msse2avx"></a>
  1653. <p>Specify that the assembler should encode SSE instructions with VEX
  1654. prefix. The option <samp>-mavx</samp> turns this on by default.
  1655. </p>
  1656. </dd>
  1657. <dt><code>-mfentry</code></dt>
  1658. <dt><code>-mno-fentry</code></dt>
  1659. <dd><a name="index-mfentry"></a>
  1660. <p>If profiling is active (<samp>-pg</samp>), put the profiling
  1661. counter call before the prologue.
  1662. Note: On x86 architectures the attribute <code>ms_hook_prologue</code>
  1663. cannot currently be used with <samp>-mfentry</samp> and <samp>-pg</samp>.
  1664. </p>
  1665. </dd>
  1666. <dt><code>-mrecord-mcount</code></dt>
  1667. <dt><code>-mno-record-mcount</code></dt>
  1668. <dd><a name="index-mrecord_002dmcount"></a>
  1669. <p>If profiling is active (<samp>-pg</samp>), generate a __mcount_loc section
  1670. that contains pointers to each profiling call. This is useful for
  1671. automatically patching calls in and out.
  1672. </p>
  1673. </dd>
  1674. <dt><code>-mnop-mcount</code></dt>
  1675. <dt><code>-mno-nop-mcount</code></dt>
  1676. <dd><a name="index-mnop_002dmcount"></a>
  1677. <p>If profiling is active (<samp>-pg</samp>), generate the calls to
  1678. the profiling functions as NOPs. This is useful when they
  1679. should be patched in later dynamically. This is likely only
  1680. useful together with <samp>-mrecord-mcount</samp>.
  1681. </p>
  1682. </dd>
  1683. <dt><code>-minstrument-return=<var>type</var></code></dt>
  1684. <dd><a name="index-minstrument_002dreturn"></a>
  1685. <p>Instrument function exit in <samp>-pg -mfentry</samp> instrumented functions with
  1686. a call to the specified function. This only instruments true returns ending
  1687. with <code>ret</code>, but not sibling calls ending with a jump. Valid types
  1688. are <var>none</var> to not instrument, <var>call</var> to generate a call to <code>__return__</code>,
  1689. or <var>nop5</var> to generate a 5-byte NOP.
  1690. </p>
  1691. </dd>
  1692. <dt><code>-mrecord-return</code></dt>
  1693. <dt><code>-mno-record-return</code></dt>
  1694. <dd><a name="index-mrecord_002dreturn"></a>
  1695. <p>Generate a __return_loc section pointing to all return instrumentation code.
  1696. </p>
  1697. </dd>
  1698. <dt><code>-mfentry-name=<var>name</var></code></dt>
  1699. <dd><a name="index-mfentry_002dname"></a>
  1700. <p>Set name of __fentry__ symbol called at function entry for -pg -mfentry functions.
  1701. </p>
  1702. </dd>
  1703. <dt><code>-mfentry-section=<var>name</var></code></dt>
  1704. <dd><a name="index-mfentry_002dsection"></a>
  1705. <p>Set name of section to record -mrecord-mcount calls (default __mcount_loc).
  1706. </p>
  1707. </dd>
  1708. <dt><code>-mskip-rax-setup</code></dt>
  1709. <dt><code>-mno-skip-rax-setup</code></dt>
  1710. <dd><a name="index-mskip_002drax_002dsetup"></a>
  1711. <p>When generating code for the x86-64 architecture with SSE extensions
  1712. disabled, <samp>-mskip-rax-setup</samp> can be used to skip setting up the RAX
  1713. register when there are no variable arguments passed in vector registers.
  1714. </p>
  1715. <p><strong>Warning:</strong> Since the RAX register is used to avoid unnecessarily
  1716. saving vector registers on the stack when passing variable arguments, the
  1717. impact of this option is that callees may waste some stack space,
  1718. misbehave or jump to a random location. GCC 4.4 or newer does not have
  1719. those issues, regardless of the RAX register value.
  1720. </p>
  1721. </dd>
  1722. <dt><code>-m8bit-idiv</code></dt>
  1723. <dt><code>-mno-8bit-idiv</code></dt>
  1724. <dd><a name="index-m8bit_002didiv"></a>
  1725. <p>On some processors, like Intel Atom, 8-bit unsigned integer divide is
  1726. much faster than 32-bit/64-bit integer divide. This option generates a
  1727. run-time check. If both dividend and divisor are within range of 0
  1728. to 255, 8-bit unsigned integer divide is used instead of
  1729. 32-bit/64-bit integer divide.
  1730. </p>
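<p>The run-time check described above can be modeled in C as follows. The
function name is hypothetical and the code is a sketch of the decision only;
the compiler emits the check and both divide paths itself when
<samp>-m8bit-idiv</samp> is in effect.
</p>

```c
/* Take the 8-bit path only when both operands are in the range 0 to 255. */
unsigned int udiv_with_8bit_path(unsigned int dividend, unsigned int divisor)
{
    if (((dividend | divisor) & ~0xFFu) == 0) {
        /* Both values fit in 8 bits: an 8-bit unsigned divide suffices. */
        unsigned char q = (unsigned char)dividend / (unsigned char)divisor;
        return q;
    }
    return dividend / divisor;      /* full-width divide */
}
```

<p>Combining the operands with OR lets a single test cover both of them, at the
cost of one extra branch on every division.
</p>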
  1731. </dd>
  1732. <dt><code>-mavx256-split-unaligned-load</code></dt>
  1733. <dt><code>-mavx256-split-unaligned-store</code></dt>
  1734. <dd><a name="index-mavx256_002dsplit_002dunaligned_002dload"></a>
  1735. <a name="index-mavx256_002dsplit_002dunaligned_002dstore"></a>
  1736. <p>Split 32-byte AVX unaligned load and store.
  1737. </p>
  1738. </dd>
  1739. <dt><code>-mstack-protector-guard=<var>guard</var></code></dt>
  1740. <dt><code>-mstack-protector-guard-reg=<var>reg</var></code></dt>
  1741. <dt><code>-mstack-protector-guard-offset=<var>offset</var></code></dt>
  1742. <dd><a name="index-mstack_002dprotector_002dguard-3"></a>
  1743. <a name="index-mstack_002dprotector_002dguard_002dreg-3"></a>
  1744. <a name="index-mstack_002dprotector_002dguard_002doffset-3"></a>
  1745. <p>Generate stack protection code using a canary at <var>guard</var>. Supported
  1746. locations are &lsquo;<samp>global</samp>&rsquo; for a global canary or &lsquo;<samp>tls</samp>&rsquo; for a per-thread
  1747. canary in the TLS block (the default). This option has an effect only when
  1748. <samp>-fstack-protector</samp> or <samp>-fstack-protector-all</samp> is specified.
  1749. </p>
  1750. <p>With the latter choice the options
  1751. <samp>-mstack-protector-guard-reg=<var>reg</var></samp> and
  1752. <samp>-mstack-protector-guard-offset=<var>offset</var></samp> furthermore specify
  1753. which segment register (<code>%fs</code> or <code>%gs</code>) to use as base register
  1754. for reading the canary, and from what offset from that base register.
  1755. The default for those is as specified in the relevant ABI.
  1756. </p>
  1757. </dd>
  1758. <dt><code>-mgeneral-regs-only</code></dt>
  1759. <dd><a name="index-mgeneral_002dregs_002donly-2"></a>
  1760. <p>Generate code that uses only the general-purpose registers. This
  1761. prevents the compiler from using floating-point, vector, mask and bound
  1762. registers.
  1763. </p>
  1764. </dd>
  1765. <dt><code>-mindirect-branch=<var>choice</var></code></dt>
  1766. <dd><a name="index-mindirect_002dbranch"></a>
  1767. <p>Convert indirect call and jump with <var>choice</var>. The default is
  1768. &lsquo;<samp>keep</samp>&rsquo;, which keeps indirect call and jump unmodified.
  1769. &lsquo;<samp>thunk</samp>&rsquo; converts indirect call and jump to call and return thunk.
  1770. &lsquo;<samp>thunk-inline</samp>&rsquo; converts indirect call and jump to inlined call
  1771. and return thunk. &lsquo;<samp>thunk-extern</samp>&rsquo; converts indirect call and jump
  1772. to external call and return thunk provided in a separate object file.
  1773. You can control this behavior for a specific function by using the
  1774. function attribute <code>indirect_branch</code>. See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>.
  1775. </p>
  1776. <p>Note that <samp>-mcmodel=large</samp> is incompatible with
  1777. <samp>-mindirect-branch=thunk</samp> and
  1778. <samp>-mindirect-branch=thunk-extern</samp> since the thunk function may
  1779. not be reachable in the large code model.
  1780. </p>
  1781. <p>Note that <samp>-mindirect-branch=thunk-extern</samp> is compatible with
  1782. <samp>-fcf-protection=branch</samp> since the external thunk can be made
  1783. to enable control-flow check.
  1784. </p>
  1785. </dd>
  1786. <dt><code>-mfunction-return=<var>choice</var></code></dt>
  1787. <dd><a name="index-mfunction_002dreturn"></a>
  1788. <p>Convert function return with <var>choice</var>. The default is &lsquo;<samp>keep</samp>&rsquo;,
  1789. which keeps function return unmodified. &lsquo;<samp>thunk</samp>&rsquo; converts function
  1790. return to call and return thunk. &lsquo;<samp>thunk-inline</samp>&rsquo; converts function
  1791. return to inlined call and return thunk. &lsquo;<samp>thunk-extern</samp>&rsquo; converts
  1792. function return to external call and return thunk provided in a separate
  1793. object file. You can control this behavior for a specific function by
  1794. using the function attribute <code>function_return</code>.
  1795. See <a href="Function-Attributes.html#Function-Attributes">Function Attributes</a>.
  1796. </p>
  1797. <p>Note that <samp>-mfunction-return=thunk-extern</samp> is compatible with
  1798. <samp>-fcf-protection=branch</samp> since the external thunk can be made
  1799. to enable control-flow check.
  1800. </p>
  1801. <p>Note that <samp>-mcmodel=large</samp> is incompatible with
  1802. <samp>-mfunction-return=thunk</samp> and
  1803. <samp>-mfunction-return=thunk-extern</samp> since the thunk function may
  1804. not be reachable in the large code model.
  1805. </p>
  1806. </dd>
  1807. <dt><code>-mindirect-branch-register</code></dt>
  1808. <dd><a name="index-mindirect_002dbranch_002dregister"></a>
  1809. <p>Force indirect call and jump via register.
  1810. </p>
  1811. </dd>
  1812. </dl>
  1813. <p>These &lsquo;<samp>-m</samp>&rsquo; switches are supported in addition to the above
  1814. on x86-64 processors in 64-bit environments.
  1815. </p>
  1816. <dl compact="compact">
  1817. <dt><code>-m32</code></dt>
  1818. <dt><code>-m64</code></dt>
  1819. <dt><code>-mx32</code></dt>
  1820. <dt><code>-m16</code></dt>
  1821. <dt><code>-miamcu</code></dt>
  1822. <dd><a name="index-m32-5"></a>
  1823. <a name="index-m64-5"></a>
  1824. <a name="index-mx32"></a>
  1825. <a name="index-m16"></a>
  1826. <a name="index-miamcu"></a>
  1827. <p>Generate code for a 16-bit, 32-bit or 64-bit environment.
  1828. The <samp>-m32</samp> option sets <code>int</code>, <code>long</code>, and pointer types
  1829. to 32 bits, and
  1830. generates code that runs on any i386 system.
  1831. </p>
  1832. <p>The <samp>-m64</samp> option sets <code>int</code> to 32 bits and <code>long</code> and pointer
  1833. types to 64 bits, and generates code for the x86-64 architecture.
  1834. For Darwin only, the <samp>-m64</samp> option also turns off the <samp>-fno-pic</samp>
  1835. and <samp>-mdynamic-no-pic</samp> options.
  1836. </p>
  1837. <p>The <samp>-mx32</samp> option sets <code>int</code>, <code>long</code>, and pointer types
  1838. to 32 bits, and
  1839. generates code for the x86-64 architecture.
  1840. </p>
  1841. <p>The <samp>-m16</samp> option is the same as <samp>-m32</samp>, except that
  1842. it outputs the <code>.code16gcc</code> assembly directive at the beginning of
  1843. the assembly output so that the binary can run in 16-bit mode.
  1844. </p>
  1845. <p>The <samp>-miamcu</samp> option generates code which conforms to Intel MCU
  1846. psABI. It requires the <samp>-m32</samp> option to be turned on.
  1847. </p>
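<p>A program can check which of these environments it was compiled for by
testing predefined macros. The sketch below assumes a non-Windows target and
uses the macros <code>__x86_64__</code>, <code>__i386__</code> and
<code>__ILP32__</code>; the two function names are invented for the example.
</p>

```c
/* Name of the x86 data model selected at compile time. */
const char *x86_data_model(void)
{
#if defined(__x86_64__) && defined(__ILP32__)
    return "x32";       /* -mx32 */
#elif defined(__x86_64__)
    return "lp64";      /* -m64 */
#elif defined(__i386__)
    return "ilp32";     /* -m32 or -m16 */
#else
    return "not x86";
#endif
}

/* Return 1 if the type sizes match the description for that model. */
int sizes_match_model(void)
{
#if defined(__x86_64__) && !defined(__ILP32__)
    return sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8;
#elif defined(__x86_64__) || defined(__i386__)
    return sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4;
#else
    return 1;           /* nothing to check on other targets */
#endif
}
```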
  1848. </dd>
  1849. <dt><code>-mno-red-zone</code></dt>
  1850. <dd><a name="index-mno_002dred_002dzone"></a>
  1851. <a name="index-mred_002dzone"></a>
  1852. <p>Do not use a so-called &ldquo;red zone&rdquo; for x86-64 code. The red zone is mandated
  1853. by the x86-64 ABI; it is a 128-byte area beyond the location of the
  1854. stack pointer that is not modified by signal or interrupt handlers
  1855. and therefore can be used for temporary data without adjusting the stack
  1856. pointer. The flag <samp>-mno-red-zone</samp> disables this red zone.
  1857. </p>
  1858. </dd>
  1859. <dt><code>-mcmodel=small</code></dt>
  1860. <dd><a name="index-mcmodel_003dsmall-3"></a>
  1861. <p>Generate code for the small code model: the program and its symbols must
  1862. be linked in the lower 2 GB of the address space. Pointers are 64 bits.
  1863. Programs can be statically or dynamically linked. This is the default
  1864. code model.
  1865. </p>
  1866. </dd>
  1867. <dt><code>-mcmodel=kernel</code></dt>
  1868. <dd><a name="index-mcmodel_003dkernel"></a>
  1869. <p>Generate code for the kernel code model. The kernel runs in the
  1870. negative 2 GB of the address space.
  1871. This model has to be used for Linux kernel code.
  1872. </p>
  1873. </dd>
  1874. <dt><code>-mcmodel=medium</code></dt>
  1875. <dd><a name="index-mcmodel_003dmedium-1"></a>
  1876. <p>Generate code for the medium model: the program is linked in the lower 2
  1877. GB of the address space. Small symbols are also placed there. Symbols
  1878. with sizes larger than <samp>-mlarge-data-threshold</samp> are put into
  1879. large data or BSS sections and can be located above 2GB. Programs can
  1880. be statically or dynamically linked.
  1881. </p>
  1882. </dd>
  1883. <dt><code>-mcmodel=large</code></dt>
  1884. <dd><a name="index-mcmodel_003dlarge-3"></a>
  1885. <p>Generate code for the large model. This model makes no assumptions
  1886. about addresses and sizes of sections.
  1887. </p>
  1888. </dd>
  1889. <dt><code>-maddress-mode=long</code></dt>
  1890. <dd><a name="index-maddress_002dmode_003dlong"></a>
  1891. <p>Generate code for long address mode. This is only supported for 64-bit
  1892. and x32 environments. It is the default address mode for 64-bit
  1893. environments.
  1894. </p>
  1895. </dd>
  1896. <dt><code>-maddress-mode=short</code></dt>
  1897. <dd><a name="index-maddress_002dmode_003dshort"></a>
  1898. <p>Generate code for short address mode. This is only supported for 32-bit
  1899. and x32 environments. It is the default address mode for 32-bit and
  1900. x32 environments.
  1901. </p></dd>
  1902. </dl>
  1903. <hr>
  1904. <div class="header">
  1905. <p>
  1906. Next: <a href="x86-Windows-Options.html#x86-Windows-Options" accesskey="n" rel="next">x86 Windows Options</a>, Previous: <a href="VxWorks-Options.html#VxWorks-Options" accesskey="p" rel="prev">VxWorks Options</a>, Up: <a href="Submodel-Options.html#Submodel-Options" accesskey="u" rel="up">Submodel Options</a> &nbsp; [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Option-Index.html#Option-Index" title="Index" rel="index">Index</a>]</p>
  1907. </div>
  1908. </body>
  1909. </html>