To give you more control over how an individual register or instruction is used, pixel shaders provide an array of masks, selectors, and modifiers that let you control exactly how an instruction works and which register channels are read or written.
source negation
Negation can be used to negate an entire source register before it is used. Source negation is indicated by placing a minus sign, “-”, in front of the source register to be negated. The source register values are unchanged.
Rules for using source negation:
mov t0, -v0 // t0 = -1.0 * v0
mul t0, -v0, -c3
add r0, v0, -v1
mul r1, 1-v0, -v1 // with invert modifier
texkill -t0 // Error! Texture instruction!
source invert
Source invert subtracts each element of a register from one and uses the result as the source value. Source invert is indicated by placing a “1-” (the number one followed by a minus sign) in front of the source register to be inverted. The source register values are unchanged.
Rules for using source invert:
mov r0, 1-v0 // r0 = 1.0 - v0 (inverts colors)
mul r1, 1-v0, -v1
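The arithmetic behind source negation and source invert can be sketched in a few lines of Python. This is only an illustration, not shader code: the function names `negate` and `invert` are hypothetical, and a register is modeled as a plain 4-tuple of floats with no hardware clamping.

```python
def negate(reg):
    """Source negation (-reg): each channel is multiplied by -1."""
    return tuple(-c for c in reg)


def invert(reg):
    """Source invert (1-reg): each channel is subtracted from one."""
    return tuple(1.0 - c for c in reg)


v0 = (0.25, 0.5, 0.75, 1.0)

print(negate(v0))  # (-0.25, -0.5, -0.75, -1.0)
print(invert(v0))  # (0.75, 0.5, 0.25, 0.0)
print(v0)          # (0.25, 0.5, 0.75, 1.0), the source is unchanged
```

Both functions return a new tuple, mirroring the fact that the modifiers only affect the value the instruction sees and never the source register itself.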
source bias
The bias modifier is used for shifting the range of the input register from the [0,1] range to the [-0.5,+0.5] range. The bias modifier is indicated by adding a “_bias” suffix to a register. Essentially the modifier subtracts 0.5 from the register’s values before they are used. Be careful when using this modifier with the color registers, as the range of the color registers is [0,1], and you’ll get an implicit clamping. The source register values are unchanged.
If you use it with the _x2 instruction modifier (for example, mov_x2), you can convert a register range from [0,1] to [-1,1], the same as the source signed scaling modifier.
Note: If you used the D3DTOP_ADDSIGNED texture operation in one of your DirectX texture stages, the bias modifier performs the same operation.
Rules for using source bias:
// Shift range from [0,1] to [-0.5, 0.5]
mov r0, r0_bias // r0 = r0 - 0.5
// Shift range from [0,1] to [-0.5, 0.5]
// Then shift sign
mov r0, -r0_bias // r0 = 0.5 - r0
// shift range from [0,1] to [-1,1]
mov_x2 r0, r0_bias
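As a rough sketch of what the bias modifier computes, the following Python models a register as a 4-tuple of floats. The helper names `bias` and `negate` are hypothetical, and the sketch deliberately ignores any clamping the hardware may apply.

```python
def bias(reg):
    """Source bias (reg_bias): subtract 0.5 from each channel."""
    return tuple(c - 0.5 for c in reg)


def negate(reg):
    """Source negation (-reg), applied after the bias."""
    return tuple(-c for c in reg)


r0 = (0.0, 0.25, 0.75, 1.0)

print(bias(r0))          # (-0.5, -0.25, 0.25, 0.5)
print(negate(bias(r0)))  # (0.5, 0.25, -0.25, -0.5), i.e. 0.5 - r0
```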
source signed scaling
The signed scaling modifier (also called “bias times two”) is used for shifting the range of the input register from the [0,1] range to the [-1,+1] range, typically when you want to use the full signed range registers are capable of. The signed scaling modifier is indicated by adding a “_bx2” suffix to a register. Essentially the modifier subtracts 0.5 from the register’s values and then multiplies that result by 2 before they are used. The source register values are unchanged.
For PS 1.0 & 1.1, arguments for the texm3x2* and texm3x3* instructions can use the _bx2 modifier.
For PS 1.2 & 1.3, arguments for any tex* instruction can use the _bx2 modifier.
Note: If you used the D3DTOP_ADDSIGNED2X texture operation in one of your DirectX texture stages, the signed scaling modifier performs the same operation.
Rules for using signed source scaling:
mov t0, t0_bx2 // t0 = 2.0* (t0 - 0.5)
mov r0, r0_bx2 // darken dull colors
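To make the relationship between _bx2 and _bias concrete, here is a small Python sketch of the formula 2.0 * (r - 0.5). The names `bx2`, `bias`, and `times_two` are hypothetical, registers are modeled as 4-tuples of floats, and hardware clamping is not modeled.

```python
def bx2(reg):
    """Signed scaling (reg_bx2): 2.0 * (channel - 0.5)."""
    return tuple(2.0 * (c - 0.5) for c in reg)


def bias(reg):
    """Source bias (reg_bias): channel - 0.5."""
    return tuple(c - 0.5 for c in reg)


def times_two(reg):
    """Stand-in for the doubling done by mov_x2."""
    return tuple(2.0 * c for c in reg)


t0 = (0.0, 0.25, 0.75, 1.0)

print(bx2(t0))              # (-1.0, -0.5, 0.5, 1.0)
print(times_two(bias(t0)))  # (-1.0, -0.5, 0.5, 1.0), same result
```

The second line corresponds to writing mov_x2 with a _bias source, which is why the two forms are interchangeable for this range conversion.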
source scale 2X

The scale by two modifier doubles the values of the input register before they are used, so an input in the [0,1] range becomes [0,2]. The scale by two modifier is indicated by adding a “_x2” suffix to a register. The source register values are unchanged.
Rules for using scale by two:
mov r0, r0_x2 // 2x r0
source replication/selection
Just as vertex shaders let you select the particular elements of a source register to use, so do pixel shaders, with some differences. You can only select a single element, and that element will be replicated to all channels. You specify a channel to replicate by adding a “.n” suffix to the register, where n is r, g, b, or a (or x, y, z, or w).
Source Register Selectors (by register channel)

| PS Version | red | green | blue | alpha |
| --- | --- | --- | --- | --- |
| 1.0 | | | | x |
| 1.1 | | | x | x |
| 1.2 | | | x | x |
| 1.3 | | | x | x |
| 1.4 Phase 1 | x | x | x | x |
| 1.4 Phase 2 | x | x | x | x |
| 2.0 | x | x | x | x |
mov r0, v0.a // ps.1.0
mov r0.a, v0.b // ps.1.1 ps.1.2 ps.1.3
// these commands are an error if not ps.1.4
mov r0, v0.b
mov r0, v0.g
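Replication is easy to picture with a short Python sketch. The `replicate` helper and the `CHANNELS` table are hypothetical names used only for illustration, and a register is modeled as a 4-tuple of floats.

```python
# Channel letters map to positions; rgba and xyzw are interchangeable.
CHANNELS = {"r": 0, "g": 1, "b": 2, "a": 3,
            "x": 0, "y": 1, "z": 2, "w": 3}


def replicate(reg, channel):
    """Source selector (reg.n): copy one channel into all four."""
    value = reg[CHANNELS[channel]]
    return (value, value, value, value)


v0 = (0.1, 0.2, 0.3, 0.4)

print(replicate(v0, "a"))  # (0.4, 0.4, 0.4, 0.4)
print(replicate(v0, "b"))  # (0.3, 0.3, 0.3, 0.3)
print(replicate(v0, "z"))  # (0.3, 0.3, 0.3, 0.3), same as .b
```

The sketch does not enforce the per-version restrictions from the table; it only shows what the instruction sees once a selector is accepted.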
texture register modifiers
(PS 1.4 only)

PS 1.4 has its own set of modifiers for texture instructions. Since only the texcrd and texld instructions are used to load or sample textures with PS 1.4, these modifiers are unique to those instructions. Note that you can interchange rgba syntax with xyzw syntax, thus _dz is the same as _db.
These selectors allow you to swizzle the source register to a limited extent. They are written as a suffix on the register and can be used anytime texcrd or texld can be used. Since the instructions will only read three components, these selectors allow you to fill the register’s last two channels with either the .z value or the .w value, instead of leaving it uninitialized. You can mix the .xyw selector with the _dw modifier. You can use the _dz modifier only on a temporary register, but not more than twice per shader. This allows you to map a 4D texture into 3D texture space so it can be manipulated in the shader.
PS 1.4 Source Register Selectors

| Description | Syntax |
| --- | --- |
| Source register looks like .xyzz | .xyz |
| Source register looks like .xyww | .xyw |
texld r0, t0.xyz // r0.xyzw = t0.xyzz
texld r0, t0.rgb // alternate syntax
texld r0, t0_dz.xyz // with a register modifier
Once you use a particular selector on a texture register, you cannot use a different one on the same source register in the same shader. For example, the following is a legal set of instructions. Register t2 is used with the .xyz selector twice.
texld r0, t2.xyz
texld r1, t2.xyz
However, the following, which uses register t2 with the .xyz selector and then the .xyw selector, is in error.
texld r0, t2.xyz
texld r1, t2.xyw // Error register t2
// used again but with different selector.
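The two selectors, and the one-selector-per-register rule, can be sketched in Python as follows. All names here (`select_xyz`, `select_xyw`, `check_selectors`) are hypothetical, registers are modeled as 4-tuples, and the checker assumes selectors are always spelled in xyzw form.

```python
def select_xyz(reg):
    """.xyz selector: the source looks like .xyzz."""
    x, y, z, _w = reg
    return (x, y, z, z)


def select_xyw(reg):
    """.xyw selector: the source looks like .xyww."""
    x, y, _z, w = reg
    return (x, y, w, w)


def check_selectors(uses):
    """Return an error string if a register is reused with a different
    selector, else None. `uses` is a list of (register, selector) pairs
    in program order."""
    seen = {}
    for register, selector in uses:
        first = seen.setdefault(register, selector)
        if first != selector:
            return f"{register} used with {first} and then {selector}"
    return None


t0 = (1.0, 2.0, 3.0, 4.0)
print(select_xyz(t0))  # (1.0, 2.0, 3.0, 3.0)
print(select_xyw(t0))  # (1.0, 2.0, 4.0, 4.0)

print(check_selectors([("t2", "xyz"), ("t2", "xyz")]))  # None
print(check_selectors([("t2", "xyz"), ("t2", "xyw")]))  # t2 used with xyz and then xyw
```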
These modifiers allow you to do a perspective divide (either by the .z or the .w element) in the pixel shader. They are written as a suffix on the register and can be used anytime texcrd or texld can be used. Only the .xy channels of the destination will be modified. If the divisor is zero, then the destination is set to one. The _dw modifier is for Phase 1; the _dz modifier is for Phase 2.
PS 1.4 Source Register Modifiers

| Description | Syntax |
| --- | --- |
| Divide x,y by z | _dz |
| Divide x,y by w | _dw |
texld r0, t0_dz
// these are the same as above
texld r0, t0_dz.xyz
texld r0, t0_db.xyz
texld r0, t0_db.rgb
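The perspective divide itself amounts to the following, shown here as a Python sketch. The `divide_xy` helper is a hypothetical name, registers are modeled as 4-tuples of floats, and the zero-divisor branch reflects one reading of "the destination is set to one", namely that x and y both become 1.0.

```python
def divide_xy(reg, divisor_channel):
    """_dz / _dw modifier: divide x and y by z or w.

    Only x and y change. A zero divisor yields 1.0 for x and y,
    which is an assumption about how the rule above applies.
    """
    x, y, z, w = reg
    divisor = z if divisor_channel == "z" else w
    if divisor == 0.0:
        return (1.0, 1.0, z, w)
    return (x / divisor, y / divisor, z, w)


t0 = (2.0, 4.0, 2.0, 8.0)

print(divide_xy(t0, "z"))  # (1.0, 2.0, 2.0, 8.0)
print(divide_xy(t0, "w"))  # (0.25, 0.5, 2.0, 8.0)
print(divide_xy((2.0, 4.0, 0.0, 8.0), "z"))  # (1.0, 1.0, 0.0, 8.0)
```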
You can mix the .xyw selector with the _dw modifier. The _dw modifier can be used as many times as necessary in Phase 1. After Phase 1 the .w channel is invalid, thus you can’t use the modifier. You can use the _dz modifier only on a temporary register (thus, only in Phase 2), and not more than twice per shader. The following shows which phase each instruction is valid or invalid in. I’ve ignored other usage restrictions on texture registers, etc.
// Phase 1
texld r0, t0_dz // Invalid - _dz is Phase 2 only
texld r0, t0_dw // Valid
phase
// Phase 2
texld r0, t1_dz.xyz // Invalid - texture register
texld r0, t1_db.xyz // Invalid - _db == _dz
texld r0, r0_dz.xyz // Valid – temp register
texld r0, t0_dw.xyz // Invalid – w is undefined
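The phase rules illustrated above can be summarized as a small validity check. This Python sketch is not a real assembler rule set: `check_divide_modifier` is a hypothetical helper, it only encodes the rules stated in this section, and it ignores the limit of two _dz uses per shader.

```python
def check_divide_modifier(register, modifier, phase):
    """Return None if the source is acceptable, else a reason string.

    register: name such as "t0" or "r0"
    modifier: "_dz", "_db", "_dw", or None
    phase:    1 or 2
    """
    if modifier in ("_dz", "_db"):  # _db is the rgba spelling of _dz
        if phase != 2:
            return "_dz is Phase 2 only"
        if not register.startswith("r"):
            return "_dz needs a temporary register"
    elif modifier == "_dw":
        if phase != 1:
            return "w is undefined after Phase 1"
    return None


print(check_divide_modifier("t0", "_dz", 1))  # _dz is Phase 2 only
print(check_divide_modifier("t0", "_dw", 1))  # None
print(check_divide_modifier("t1", "_db", 2))  # _dz needs a temporary register
print(check_divide_modifier("r0", "_dz", 2))  # None
print(check_divide_modifier("t0", "_dw", 2))  # w is undefined after Phase 1
```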
These write masks control which channel(s) are written to. They can be used anytime texcrd or texld can be used. No mask is the same as specifying all. Only the combinations shown in the table can be used.
PS 1.4 Destination Write Masks

| Description | Syntax |
| --- | --- |
| Writes to the xyzw channels | xyzw |
| Writes to the xyz channels | xyz |
| Writes to the xy channels | xy |
texcrd r0.xy, t0_dz
texcrd r0.rg, t0_dz // same as previous
texcrd r0.xyzw, t0_dz
texcrd r0, t0_db // same as previous
destination write mask
Note the word destination above. Masks can only be used to select which elements of a register are written to. Unlike vertex shaders, however, all you can do is select all channels (.rgba), the color channels only (.rgb), or the alpha channel (.a), though later pixel shader versions allow more control. This mimics the traditional lighting pipeline in which you can have color and alpha channels processed separately. Omitting a mask is the same as specifying the full mask. The alpha mask is also referred to as the scalar mask, since it uses a scalar value. The color write mask is sometimes referred to as the vector mask. An alternate syntax is to use .xyzw instead of .rgba.
Destination write masks are supported only for arithmetic instructions, with the exception of the texcrd and texld instructions. The dp3 instruction can only use .rgb or .rgba masks for PS 1.0 – 1.3.
Destination masks are particularly important when you start getting set up for instruction pairing.
Note that with PS 1.4 shaders you have the ability to operate on individual channels, giving you a lot more flexibility.
Destination Write Mask Descriptions

| Mask | Operation |
| --- | --- |
| .rgb | The operation works on the color channels (rgb) and is scheduled for execution in the vector pipeline. |
| .a | The operation works on the alpha channel and is scheduled for execution in the scalar pipeline. |
| .r, .g, .b | Lets you select the destination channel to write to. |
| .rgba | The operation works on the color and alpha channels and is scheduled for parallel execution in the vector and scalar pipelines. This is the default if a mask is not specified. |
| .(r)(g)(b)(a) | Arbitrary mask. Channels must be listed in .rgba order, but any combination can be used. |

Destination Write Mask Selectors

| PS Version | r | g | b | a | rgb | rgba | (r)(g)(b)(a) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1.0 | | | | x | x | x | |
| 1.1 | | | | x | x | x | |
| 1.2 | | | | x | x | x | |
| 1.3 | | | | x | x | x | |
| 1.4 Phase 1 | x | x | x | x | x | x | x |
| 1.4 Phase 2 | x | x | x | x | x | x | x |
| 2.0 | x | x | | x | x | x | x |
Here are some examples of using the write mask.
// color channel is modulated
mul r0.rgb, t0, v0
// alpha is added using a different source register
add r0.a, t1, v1
//
mul r0.rgb, t0, v0
+add r0.a, t0, v0 // note instruction pairing
// variations that have the same effect
// no mask is equivalent to the full mask
mul r0, t0, v0
mul r0.rgba, t0, v0 // full specification
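What a write mask does to the destination can be sketched in Python. The helper names `write_masked` and `mul` are hypothetical, registers are modeled as 4-tuples of floats, and the sketch does not enforce which masks a given shader version accepts.

```python
INDEX = {"r": 0, "g": 1, "b": 2, "a": 3,
         "x": 0, "y": 1, "z": 2, "w": 3}


def mul(a, b):
    """Component-wise multiply, standing in for the mul instruction."""
    return tuple(p * q for p, q in zip(a, b))


def write_masked(dest, result, mask="rgba"):
    """Copy only the masked channels of `result` into `dest`.

    Omitting the mask writes all four channels.
    """
    out = list(dest)
    for channel in mask:
        out[INDEX[channel]] = result[INDEX[channel]]
    return tuple(out)


r0 = (0.0, 0.0, 0.0, 1.0)
t0 = (0.5, 0.5, 0.5, 0.5)
v0 = (0.5, 1.0, 0.0, 0.5)
product = mul(t0, v0)  # (0.25, 0.5, 0.0, 0.25)

print(write_masked(r0, product, "rgb"))  # (0.25, 0.5, 0.0, 1.0)
print(write_masked(r0, product, "a"))    # (0.0, 0.0, 0.0, 0.25)
print(write_masked(r0, product))         # (0.25, 0.5, 0.0, 0.25)
```

With the .rgb mask the old alpha of r0 survives, and with the .a mask the old color survives, which is what makes it possible to compute color and alpha with different instructions.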
Note that specifying exactly the same operation on the color and alpha channel (including registers) will automatically cause pairing to occur. The following code fragments cause the same code to be assembled in the pixel shader.
// no masks, a single operation
mul r0, t0, v0
This is the same as writing:
// full mask with a single operation
mul r0.rgba, t0, v0
This is the same as writing:
// color and alpha mask with the same operation
mul r0.rgb, t0, v0 // on color
mul r0.a, t0, v0 // on alpha, same arguments
except it takes up an extra slot and will run slower. However, you can rewrite it as:
// color and alpha mask with the same operation
// with pairing
mul r0.rgb, t0, v0 // on color
+mul r0.a, t0, v0 // on alpha, same arguments
Now that the instructions are paired, you’ve freed one slot and reduced the run time. The point is that you can now change the alpha manipulation and perform something different in the scalar (alpha) pipe.
instruction modifiers
Note that these are placed on the actual instructions, not the arguments. The pixel shader assembler supports shift/scale modifier flags and a saturation modifier flag that affect the generated output result. The modifiers can be thought of as shift left (power-of-two multiply), shift right (power-of-two divide), and saturate (clamp the output range to [0,1]).
Rules for using instruction modifiers:
Instruction Modifiers Description

| Modifier | Operation |
| --- | --- |
| _x2 | 2X modifier. Multiply the results by 2 before storing in the register. |
| _x4 | 4X modifier. Multiply the results by 4 before storing in the register. |
| _x8 | 8X modifier. Multiply the results by 8 before storing in the register. |
| _d2 | Half modifier. Divide the results by 2 before storing in the register. |
| _d4 | Quarter modifier. Divide the results by 4 before storing in the register. |
| _d8 | Eighth modifier. Divide the results by 8 before storing in the register. |
| _sat | Saturation modifier. Clamps the results to the range [0,1] before storing. |

Instruction Modifiers Usage

| PS Version | _x2 | _x4 | _x8 | _d2 | _d4 | _d8 | _sat |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1.0 | x | x | | x | | | x |
| 1.1 | x | x | | x | | | x |
| 1.2 | x | x | | x | | | x |
| 1.3 | x | x | | x | | | x |
| 1.4 Phase 1 | x | x | x | x | x | x | x |
| 1.4 Phase 2 | x | x | x | x | x | x | x |
| 2.0 | x | x | (?) | x | (?) | (?) | x |
Here are some examples of using instruction modifiers.
add_x2 r0, v1, v1
add_d2 r0, v1, v0
add_sat r0, v1, v0
add_x2_sat r0, v1, v1
add_d2_sat r0, v1, v1
add_sat_d2 r0, v1, v1 // Error! _sat must be last
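The effect of these modifiers on an instruction's result can be sketched in Python. The names `add`, `apply_modifiers`, and `SCALE` are hypothetical, registers are modeled as 4-tuples of floats, and the sketch assumes the scale is applied first and the saturation last, matching the rule that _sat must come last.

```python
SCALE = {"x2": 2.0, "x4": 4.0, "x8": 8.0,
         "d2": 0.5, "d4": 0.25, "d8": 0.125}


def add(a, b):
    """Component-wise add, standing in for the add instruction."""
    return tuple(p + q for p, q in zip(a, b))


def apply_modifiers(result, shift=None, sat=False):
    """Apply an optional shift/scale, then an optional saturate."""
    if shift is not None:
        result = tuple(c * SCALE[shift] for c in result)
    if sat:
        result = tuple(min(max(c, 0.0), 1.0) for c in result)
    return result


v1 = (0.25, 0.5, 0.75, 1.0)
total = add(v1, v1)  # (0.5, 1.0, 1.5, 2.0)

print(apply_modifiers(total, shift="x2"))            # add_x2:     (1.0, 2.0, 3.0, 4.0)
print(apply_modifiers(total, sat=True))              # add_sat:    (0.5, 1.0, 1.0, 1.0)
print(apply_modifiers(total, shift="x2", sat=True))  # add_x2_sat: (1.0, 1.0, 1.0, 1.0)
print(apply_modifiers(total, shift="d2", sat=True))  # add_d2_sat: (0.25, 0.5, 0.75, 1.0)
```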
partial precision
declaration modifier (PS 2.0)

DirectX 9 introduced the partial pre