NASM, though it attempts to avoid the bureaucracy of assemblers like MASM and TASM, is nevertheless forced to support a few directives. These are described in this chapter.
NASM's directives come in two types: user-level directives and primitive directives. Typically, each directive has a user-level form and a primitive form. In almost all cases, we recommend that users use the user-level forms of the directives, which are implemented as macros which call the primitive forms.
Primitive directives are enclosed in square brackets; user-level directives are not.
In addition to the universal directives described in this chapter, each object file format can optionally supply extra directives in order to control particular features of that file format. These format-specific directives are documented along with the formats that implement them, in chapter 8.
BITS: Target Processor Mode
The BITS directive specifies whether NASM should generate code designed to run on a processor operating in 16-bit mode, 32-bit mode or 64-bit mode. The syntax is BITS XX, where XX is 16, 32 or 64.
In most cases, you should not need to use BITS explicitly.
The aout, coff, elf*, macho, win32 and win64 object formats, which are designed for use in 32-bit or 64-bit operating systems, all cause NASM to select 32-bit or 64-bit mode, respectively, by default.
The obj object format allows you to specify each segment you define as either USE16 or USE32, and NASM will set its operating mode accordingly, so the use of the BITS directive is once again unnecessary.
The most likely reason for using the BITS directive is to write 32-bit or 64-bit code in a flat binary file; this is because the bin output format defaults to 16-bit mode in anticipation of it being used most frequently to write DOS .COM programs, DOS .SYS device drivers and boot loader software.
The BITS directive can also be used to generate code for a different mode than the standard one for the output format.
You do not need to specify BITS 32 merely in order to use 32-bit instructions in a 16-bit DOS program; if you do, the assembler will generate incorrect code because it will be writing code targeted at a 32-bit platform, to be run on a 16-bit one.
When NASM is in BITS 16 mode, instructions which use 32-bit data are prefixed with an 0x66 byte, and those referring to 32-bit addresses have an 0x67 prefix. In BITS 32 mode, the reverse is true: 32-bit instructions require no prefixes, whereas instructions using 16-bit data need an 0x66 and those working on 16-bit addresses need an 0x67.
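As a rough illustration of these prefix rules, the sketch below assembles the same kinds of instruction in both modes. The byte sequences in the comments are what the rules predict for these particular instructions; the labels are illustrative names only, and the file is assumed to be built as a flat binary with nasm -f bin.

```nasm
; Sketch: operand-size (0x66) and address-size (0x67) prefixes.
; The labels in16 and in32 are hypothetical.

        bits 16
in16:
        mov     ax, bx          ; 89 D8      native 16-bit operand, no prefix
        mov     eax, ebx        ; 66 89 D8   32-bit operand needs 0x66
        mov     ax, [ebx]       ; 67 8B 03   32-bit address needs 0x67

        bits 32
in32:
        mov     eax, ebx        ; 89 D8      native 32-bit operand, no prefix
        mov     ax, bx          ; 66 89 D8   16-bit operand needs 0x66
        mov     eax, [bx]       ; 67 8B 07   16-bit address needs 0x67
```

Note that the same bytes (89 D8) mean different things depending on the mode the processor is actually in, which is why a mismatched BITS setting produces incorrect code.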
When NASM is in BITS 64 mode, most instructions operate the same as they do for BITS 32 mode. However, there are 8 more general and SSE registers, and 16-bit addressing is no longer supported. The default address size is 64 bits; 32-bit addressing can be selected with the 0x67 prefix. The default operand size is still 32 bits, however, and the 0x66 prefix selects 16-bit operand size. The REX prefix is used both to select 64-bit operand size, and to access the new registers. NASM automatically inserts REX prefixes when necessary.
When the REX prefix is used, the processor does not know how to address the AH, BH, CH or DH (high 8-bit legacy) registers. Instead, it is possible to access the low 8 bits of the SP, BP, SI and DI registers as SPL, BPL, SIL and DIL, respectively; but only when the REX prefix is used.
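The 64-bit rules above can be sketched as follows. This is only an illustration: the label is hypothetical, and the comments describe which prefix each line is expected to need rather than showing a full encoding.

```nasm
; Sketch: 64-bit mode defaults and REX-dependent registers.
; Assumed build: nasm -f elf64 (or -f bin); demo64 is a hypothetical label.

        bits 64
demo64:
        mov     eax, ebx        ; default 32-bit operand size, no prefix
        mov     ax, bx          ; 0x66 selects 16-bit operand size
        mov     rax, rbx        ; REX selects 64-bit operand size
        mov     r8d, eax        ; REX is needed to reach the new registers
        mov     eax, [ebx]      ; 0x67 selects 32-bit addressing
        mov     al, sil         ; SIL is only reachable with a REX prefix
        mov     ah, bl          ; AH is fine as long as no REX is required
        ; mov   ah, sil         ; rejected: AH cannot be combined with REX
```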
The BITS directive has an exactly equivalent primitive form, [BITS 16], [BITS 32] and [BITS 64]. The user-level form is a macro which has no function other than to call the primitive form.
Note that the space is necessary, e.g. BITS32 will not work!
USE16 & USE32: Aliases for BITS
The USE16 and USE32 directives can be used in place of BITS 16 and BITS 32, for compatibility with other assemblers.
DEFAULT: Change the assembler defaults
The DEFAULT directive changes the assembler defaults. Normally, NASM defaults to a mode where the programmer is expected to explicitly specify most features directly. However, this is occasionally obnoxious, as the explicit form is pretty much the only one you wish to use.
Currently, DEFAULT can set REL & ABS and BND & NOBND.
REL & ABS: RIP-relative addressing
This sets whether registerless memory references in 64-bit mode are RIP-relative or not. By default, they are absolute unless overridden with the REL specifier (see section 3.3). However, if DEFAULT REL is specified, REL is the default, unless overridden with the ABS specifier, except when used with an FS or GS segment override.
The special handling of FS and GS overrides is due to the fact that these registers are generally used as thread pointers or for other special functions in 64-bit mode, and generating RIP-relative addresses would be extremely confusing.
DEFAULT REL is disabled with DEFAULT ABS.
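A minimal sketch of how these defaults and overrides interact is shown here. The symbol counter is a hypothetical name, the FS offset is an arbitrary illustrative value, and the file is assumed to be assembled with nasm -f elf64.

```nasm
; Sketch: DEFAULT REL / DEFAULT ABS with per-operand overrides.

        bits 64
        default rel

        section .data
counter:        dq      0               ; hypothetical variable

        section .text
        mov     rax, [counter]          ; RIP-relative because of DEFAULT REL
        mov     rax, [abs counter]      ; ABS overrides the default
        mov     rax, [fs:0x28]          ; FS override: not made RIP-relative

        default abs
        mov     rax, [counter]          ; absolute again
        mov     rax, [rel counter]      ; REL requests RIP-relative explicitly
```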
BND & NOBND: BND prefix
If DEFAULT BND is set, all instructions following this directive that can take a bnd prefix are prefixed with bnd. To override it, the NOBND prefix can be used.
        DEFAULT BND
        call foo            ; BND will be prefixed
        nobnd call foo      ; BND will NOT be prefixed
DEFAULT NOBND disables DEFAULT BND; the BND prefix is then added only when explicitly specified in the code.
DEFAULT BND is expected to be the normal configuration for writing MPX-enabled code.
SECTION or SEGMENT: Changing and Defining Sections
The SECTION directive (SEGMENT is an exactly equivalent synonym) changes which section of the output file the code you write will be assembled into. In some object file formats, the number and names of sections are fixed; in others, the user may make up as many as they wish. Hence SECTION may sometimes give an error message, or may define a new section, if you try to switch to a section that does not (yet) exist.
The Unix object formats, and the bin object format (but see section 8.1.3), all support the standardized section names .text, .data and .bss for the code, data and uninitialized-data sections. The obj format, by contrast, does not recognize these section names as being special, and indeed will strip off the leading period of any section name that has one.
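For illustration, a small module using the three standardized sections might look like the sketch below. All symbol names are hypothetical, and the exit sequence assumes a Linux x86-64 target assembled with nasm -f elf64.

```nasm
; Sketch: the standardized .data, .bss and .text sections.

        section .data
greeting:       db      "hello", 10     ; initialized data
greeting_len    equ     $ - greeting

        section .bss
scratch:        resb    64              ; uninitialized storage

        section .text
        global  _start
_start:
        mov     eax, 60                 ; Linux x86-64 exit system call
        xor     edi, edi                ; exit status 0
        syscall

        section .data                   ; switching back appends to .data
more_data:      dd      1, 2, 3
```

Switching to a section that already exists simply resumes it, so data and code can be interleaved in the source and still end up grouped in the output.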
__?SECT?__ Macro
The SECTION directive is unusual in that its user-level form functions differently from its primitive form. The primitive form, [SECTION xyz], simply switches the current target section to the one given. The user-level form, SECTION xyz, however, first defines the single-line macro __?SECT?__ to be the primitive [SECTION] directive which it is about to issue, and then issues it. So the user-level directive
        SECTION .text
expands to the two lines
        %define __?SECT?__ [SECTION .text]
        [SECTION .text]
Users may find it useful to make use of this in their own macros. For example, the writefile macro defined in section 4.5.3 can be usefully rewritten in the following more sophisticated form:
%macro  writefile 2+
        [section .data]
  %%str:        db      %2
  %%endstr:
        __?SECT?__
        mov     dx,%%str
        mov     cx,%%endstr-%%str
        mov     bx,%1
        mov     ah,0x40
        int     0x21
%endmacro
This form of the macro, once passed a string to output, first switches temporarily to the data section of the file, using the primitive form of the SECTION directive so as not to modify __?SECT?__. It then declares its string in the data section, and then invokes __?SECT?__ to switch back to whichever section the user was previously working in. It thus avoids the need, in the previous version of the macro, to include a JMP instruction to jump over the data, and also does not fail if, in a complicated OBJ format module, the user could potentially be assembling the code in any of several separate code sections.
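The same pattern can be reused for other helpers. The sketch below defines a hypothetical macro, defstr, which is not part of NASM; it parks a labelled string in the data section and then returns to the caller's section. It assumes the default 16-bit bin output format.

```nasm
; Sketch: a hypothetical helper built on __?SECT?__.

%macro  defstr 2+               ; usage: defstr label, bytes...
        [section .data]         ; primitive form: __?SECT?__ is not modified
%1:     db      %2
        __?SECT?__              ; back to the section in use at the call site
%endmacro

        section .text           ; user-level form: defines __?SECT?__
start:
        defstr  msg, "hello", 13, 10
        mov     dx, msg         ; assembled into .text, not .data
```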
ABSOLUTE: Defining Absolute Labels
The ABSOLUTE directive can be thought of as an alternative form of SECTION: it causes the subsequent code to be directed at no physical section, but at the hypothetical section starting at the given absolute address. The only instructions you can use in this mode are the RESB family.
ABSOLUTE is used as follows:
absolute 0x1A
    kbuf_chr    resw    1
    kbuf_free   resw    1
    kbuf        resw    16
This example describes a section of the PC BIOS data area, at segment address 0x40: the above code defines kbuf_chr to be 0x1A, kbuf_free to be 0x1C, and kbuf to be 0x1E.
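As a sketch of how such labels might then be used, the following reads the word at kbuf_chr through a segment register loaded with 0x40. It assumes 16-bit real-mode code built with nasm -f bin; the choice of ES and BX is arbitrary.

```nasm
; Sketch: using absolute labels as offsets into the BIOS data segment.

        absolute 0x1A
kbuf_chr        resw    1
kbuf_free       resw    1
kbuf            resw    16

        section .text           ; leave ABSOLUTE mode before emitting code
        mov     ax, 0x40
        mov     es, ax          ; ES = BIOS data segment
        mov     bx, [es:kbuf_chr]   ; offset 0x1A within that segment
```

The labels reserve no space in the output file; they only give names to offsets.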
The user-level form of ABSOLUTE, like that of SECTION, redefines the __?SECT?__ macro when it is invoked.
STRUC and ENDSTRUC are defined as macros which use ABSOLUTE (and also __?SECT?__).
ABSOLUTE doesn't have to take an absolute constant as an argument: it can take an expression (actually, a critical expression: see section 3.8) and it can be a value in a segment. For example, a TSR can re-use its setup code as run-time BSS like this:
        org     100h               ; it's a .COM program
        jmp     setup              ; setup code comes last
        ; the resident part of the TSR goes here
setup:
        ; now write the code that installs the TSR here
absolute setup
runtimevar1     resw    1
runtimevar2     resd    20
tsr_end:
This defines some variables `on top of' the setup code, so that after the setup has finished running, the space it took up can be re-used as data storage for the running TSR. The symbol `tsr_end' can be used to calculate the total size of the part of the TSR that needs to be made resident.
EXTERN: Importing Symbols from Other Modules
EXTERN is similar to the MASM directive EXTRN and the C keyword extern: it is used to declare a symbol which is not defined anywhere in the module being assembled, but is assumed to be defined in some other module and needs to be referred to by this one. Not every object-file format can support external variables: the bin format cannot.
The EXTERN directive takes as many arguments as you like. Each argument is the name of a symbol:
extern  _printf
extern  _sscanf,_fscanf
Some object-file formats provide extra features to the EXTERN directive. In all cases, the extra features are used by suffixing a colon to the symbol name followed by object-format specific text. For example, the obj format allows you to declare that the default segment base of an external should be the group dgroup by means of the directive
extern  _variable:wrt dgroup
The primitive form of EXTERN differs from the user-level form only in that it can take only one argument at a time: the support for multiple arguments is implemented at the preprocessor level.
You can declare the same variable as EXTERN more than once: NASM will quietly ignore the second and later redeclarations.
If a variable is declared both GLOBAL and EXTERN, or if it is declared as EXTERN and then defined, it will be treated as GLOBAL. If a variable is declared both as COMMON and EXTERN, it will be treated as COMMON.
REQUIRED: Unconditionally Importing Symbols from Other Modules
The REQUIRED keyword is similar to the EXTERN keyword. The difference is that, as of version 2.15, EXTERN no longer emits a symbol that is declared but never referenced: emitting such symbols prevents the use of common header files, as it might cause the linker to pull in a bunch of unnecessary modules.
If the old behavior is required, use the REQUIRED keyword instead.
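The difference can be sketched as follows, assuming NASM 2.15 or later and an object format that supports external symbols (for example nasm -f elf64). Both symbol names are hypothetical.

```nasm
; Sketch: EXTERN versus REQUIRED.

        extern   optional_helper    ; never referenced below, so no
                                    ; undefined symbol is emitted for it
        required startup_hook       ; emitted as an undefined symbol even
                                    ; though nothing below refers to it

        section .text
        ret
```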
GLOBAL: Exporting Symbols to Other Modules
GLOBAL is the other end of EXTERN: if one module declares a symbol as EXTERN and refers to it, then in order to prevent linker errors, some other module must actually define the symbol and declare it as GLOBAL. Some assemblers use the name PUBLIC for this purpose.
GLOBAL uses the same syntax as EXTERN, except that it must refer to symbols which are defined in the same module as the GLOBAL directive. For example:
global _main
_main:
        ; some code
GLOBAL, like EXTERN, allows object formats to define private extensions by means of a colon. The ELF object format, for example, lets you specify whether global data items are functions or data:
global  hashlookup:function, hashtable:data
Like EXTERN, the primitive form of GLOBAL differs from the user-level form only in that it can take only one argument at a time.
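To show EXTERN and GLOBAL working together, here is a minimal sketch of a module that exports an entry point and imports a C library routine. It assumes a 32-bit Windows target with cdecl calling and leading-underscore symbol names, built with nasm -f win32 and linked against a C runtime; the label fmt and the printed value are illustrative.

```nasm
; Sketch: one module that both exports and imports symbols.

        bits 32
        global  _main                   ; defined here, visible to the linker
        extern  _printf                 ; defined in the C library

        section .data
fmt:    db      "value: %d", 10, 0

        section .text
_main:
        push    dword 42                ; second argument
        push    dword fmt               ; first argument: format string
        call    _printf
        add     esp, 8                  ; caller cleans up under cdecl
        xor     eax, eax                ; return 0
        ret
```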
COMMON: Defining Common Data Areas
The COMMON directive is used to declare common variables. A common variable is much like a global variable declared in the uninitialized data section, so that
common  intvar  4
is similar in function to
global  intvar
section .bss
intvar  resd    1
The difference is that if more than one module defines the same common variable, then at link time those variables will be merged, and references to intvar in all modules will point at the same piece of memory.
Like GLOBAL and EXTERN, COMMON supports object-format specific extensions. For example, the obj format allows common variables to be NEAR or FAR, and the ELF format allows you to specify the alignment requirements of a common variable:
common  commvar  4:near  ; works in OBJ
common  intarray 100:4   ; works in ELF: 4 byte aligned
Once again, like EXTERN and GLOBAL, the primitive form of COMMON differs from the user-level form only in that it can take only one argument at a time.
STATIC: Local Symbols within Modules
In contrast to EXTERN and GLOBAL, STATIC declares a local symbol, but one that should be named according to the global mangling rules (the directive is named by analogy with the C keyword static as applied to functions or global variables).
static foo
foo:
         ; code
Unlike GLOBAL, STATIC does not allow object formats to accept private extensions mentioned in section 7.7.
(G|L)PREFIX, (G|L)POSTFIX: Mangling Symbols
The PREFIX, GPREFIX, LPREFIX, POSTFIX, GPOSTFIX, and LPOSTFIX directives can prepend or append a string to certain types of symbols, normally to fit specific ABI conventions.
PREFIX|GPREFIX: Prepend the argument to all EXTERN, COMMON, STATIC, and GLOBAL symbols.
LPREFIX: Prepend the argument to all other symbols such as local labels and backend defined symbols.
POSTFIX|GPOSTFIX: Append the argument to all EXTERN, COMMON, STATIC, and GLOBAL symbols.
LPOSTFIX: Append the argument to all other symbols such as local labels and backend defined symbols.
These are macros implemented as pragmas; using the %pragma syntax, they can be restricted to specific backends (see section 4.12):
%pragma macho lprefix L_
Command line options are also available. See also section 2.1.28.
One example which supports many ABIs:
; The most common conventions
%pragma output gprefix _
%pragma output lprefix L_
; ELF uses a different convention
%pragma elf    gprefix                       ; empty
%pragma elf    lprefix .L
Some toolchains are aware of a particular prefix for their own optimization options, such as dead code elimination. For instance, the Mach-O binary format has a linker convention that uses a simplistic naming scheme to chunk up sections into smaller subsections, each of which may be eliminated. When the subsections_via_symbols directive (section 8.8.4) is declared, each symbol is the start of a separate block. A subsection is then defined to include everything up to the next symbol that does not start with an 'L'. LPREFIX is useful here to mark all local symbols with the 'L' prefix so that they do not start subsections of their own, which makes local symbols compatible with the particular toolchain. Note that local symbols declared with STATIC (section 7.9) are excluded from the symbol mangling and also not marked as global.
CPU: Defining CPU Dependencies
The CPU directive restricts assembly to those instructions
which are available on the specified CPU. At the moment, it is primarily used to prevent NASM from generating instruction encodings that are unavailable on the target, such as 5-byte jumps on the 8086.
(If someone would volunteer to work through the database and add proper annotations to each instruction, this could be greatly improved. Please contact the developers to volunteer, see appendix E.)
Current CPU keywords are:
CPU 8086 – Assemble only 8086 instruction set
CPU 186 – Assemble instructions up to the 80186 instruction set
CPU 286 – Assemble instructions up to the 286 instruction set
CPU 386 – Assemble instructions up to the 386 instruction set
CPU 486 – 486 instruction set
CPU 586 – Pentium instruction set
CPU PENTIUM – Same as 586
CPU 686 – P6 instruction set
CPU PPRO – Same as 686
CPU P2 – Same as 686
CPU P3 – Pentium III (Katmai) instruction sets
CPU KATMAI – Same as P3
CPU P4 – Pentium 4 (Willamette) instruction set
CPU WILLAMETTE – Same as P4
CPU PRESCOTT – Prescott instruction set
CPU X64 – x86-64 (x64/AMD64/Intel 64) instruction set
CPU IA64 – IA64 CPU (in x86 mode) instruction set
CPU DEFAULT – All available instructions
CPU ALL – All available instructions and flags
All options are case insensitive.
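A brief sketch of the effect of a CPU level follows. It relies on CMOVcc having been introduced with the P6 family, so the conditional move is expected to be rejected under CPU 386; the line that would fail is left commented out so that the fragment still assembles.

```nasm
; Sketch: restricting and then widening the permitted instruction set.

        bits 32
        cpu     386
        mov     eax, ebx        ; available on the 386
        ; cmovz eax, ebx        ; would be rejected here: CMOVcc is P6 (686)

        cpu     686
        cmovz   eax, ebx        ; accepted once the CPU level is raised

        cpu     default         ; back to all available instructions
```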
In addition, optional flags can be specified to modify the instruction selections. These can be combined with a CPU declaration or specified alone. They can be prefixed by + (add flag, default), - (remove flag) or * (set flag to default); these prefixes are "sticky", so:
        cpu -foo,bar
means remove both the foo and bar options.
If prefixed with no, it inverts the meaning of the flag, but this is not sticky, so:
        cpu nofoo,bar
means remove the foo flag but add the bar flag.
Currently available flags are:
EVEX – Enable generation of EVEX (AVX-512) encoded instructions without an explicit {evex} prefix. Default on.
VEX – Enable generation of VEX (AVX) or XOP encoded instructions without an explicit {vex} prefix. Default on.
LATEVEX – Enable generation of VEX (AVX) encoding of instructions where the VEX instruction forms were introduced after the corresponding EVEX (AVX-512) instruction forms without requiring an explicit {vex} prefix. This is implicit if the EVEX flag is disabled and the VEX flag is enabled. Default off.
FLOAT: Handling of floating-point constants
By default, floating-point constants are rounded to nearest, and IEEE denormals are supported. The following options can be set to alter this behaviour:
FLOAT DAZ – Flush denormals to zero
FLOAT NODAZ – Do not flush denormals to zero (default)
FLOAT NEAR – Round to nearest (default)
FLOAT UP – Round up (toward +Infinity)
FLOAT DOWN – Round down (toward –Infinity)
FLOAT ZERO – Round toward zero
FLOAT DEFAULT – Restore default settings
The standard macros __?FLOAT_DAZ?__, __?FLOAT_ROUND?__, and __?FLOAT?__ contain the current state, as long as the programmer has avoided the use of the bracketed primitive form ([FLOAT]).
__?FLOAT?__ contains the full set of floating-point settings; this value can be saved away and invoked later to restore the setting.
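As an illustration of the rounding options, the sketch below assembles the same constant under three settings. The value 0.1 is not exactly representable in binary, so the single-precision results are expected to differ in the last bit as noted in the comments; the labels are hypothetical.

```nasm
; Sketch: the effect of FLOAT rounding modes on a single-precision constant.

        float   down
lo_tenth:       dd      0.1     ; rounded toward -Infinity: 0x3DCCCCCC
        float   up
hi_tenth:       dd      0.1     ; rounded toward +Infinity: 0x3DCCCCCD
        float   default         ; back to round-to-nearest, denormals kept
near_tenth:     dd      0.1     ; nearest value: 0x3DCCCCCD
```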
[WARNING]: Enable or disable warnings
The [WARNING] directive can be used to enable or disable classes of warnings in the same way as the -w option, see appendix A for more details about warning classes.
[warning +warning-class] enables warnings for warning-class.
[warning -warning-class] disables warnings for warning-class.
[warning *warning-class] restores warning-class to the original value, either the default value or as specified on the command line.
[warning push] saves the current warning state on a stack.
[warning pop] restores the current warning state from the stack.
The [WARNING] directive also accepts the all, error and error=warning-class specifiers, see section 2.1.26.
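A typical use of push and pop is to silence one class of warning around a small region and then restore whatever state was in force before. The sketch below assumes the user warning class (the one that controls %warning directives); check appendix A for the class names in your NASM version.

```nasm
; Sketch: temporarily disabling a warning class.

        [warning push]          ; save the current warning state
        [warning -user]         ; silence %warning directives
%warning "this message is expected to be suppressed"
        [warning pop]           ; restore the saved state
%warning "this message is expected to be shown"
```

Because the state is restored from a stack, the region does not need to know whether the class was enabled by default or on the command line.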
No "user form" (without the brackets) currently exists.