vs 2.0 |
This macro computes the absolute value of the input register.
One slot
abs Dest0, Source0
This macro is equivalent to:
max Dest0, Source0, -Source0
which you can use if you are targeting a shader version earlier than vertex shader 2.0. In either case, you end up with the absolute value of Source0 in Dest0.
Setup:
One source register, Source0.
Results:
Dest0 is filled with the absolute value of Source0.
abs r0 , r0
abs r0.z, r0.z
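The equivalence between abs and max can be checked on the CPU. The following is a minimal C sketch, not shader code; the Vec4 type and the functions max_f and abs_via_max are hypothetical names introduced only for illustration.

```c
/* Hypothetical four-component register type, for illustration only. */
typedef struct { float x, y, z, w; } Vec4;

/* Larger of two scalars. */
float max_f(float a, float b) { return a > b ? a : b; }

/* Mirrors "max Dest0, Source0, -Source0": each component of the result
   is the larger of the source component and its negation. */
Vec4 abs_via_max(Vec4 s)
{
    Vec4 d;
    d.x = max_f(s.x, -s.x);
    d.y = max_f(s.y, -s.y);
    d.z = max_f(s.z, -s.z);
    d.w = max_f(s.w, -s.w);
    return d;
}
```

Because one of x and -x is always non-negative, taking the maximum yields the absolute value without any branching in the shader itself.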
|
vs 1.0, 1.1, 2.0 |
Adds two sources into the destination register.
One slot
add Dest0, Source0, Source1
Adds the Source0 and Source1 registers and places the result in the Dest0 register.
Setup:
Two source registers, Source0 and Source1.
Results:
Each element of Dest0 is filled with the element-by-element addition of the elements of Source0 and Source1.
add r0 , r0 , c2
add r0.z, r0.z, -r0.z
SetSourceRegisters();
// Simulate the add instruction
TempReg.x = Source0.x + Source1.x;
TempReg.y = Source0.y + Source1.y;
TempReg.z = Source0.z + Source1.z;
TempReg.w = Source0.w + Source1.w;
WriteDestinationRegisters();
|
vs 2.0 |
Makes an unconditional function call to the instruction label.
One slot
call_InstructionLabelID
Pushes the address of the following instruction onto the internal shader stack, and then sets the current instruction address to the address of the instruction that follows the label instruction with the name InstructionLabelID. The instruction label ID must be an integer in the range [1,16]. TODO – particular format of the statement??
Typically you’d create a shader subroutine that terminates with the ret instruction.
Setup:
Requires a valid, existing instruction label.
Results:
Shader execution is transferred to the instruction following the instruction label.
call_1
call_16
call_Fred // Error! Invalid label
call_0 // Error! Invalid label (out of range)
// Simulate the call instruction
// make a cast to a bare function pointer
typedef void (*fp)(void);
// take address of the label
fp pFP = (fp)InstructionLabelID;
pFP(); // call the function
// returns here only when ret is executed
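The push-and-jump behavior of call, and the matching pop performed by ret, can be sketched with a toy interpreter. This is an illustrative C model only: the opcodes, the Instr type, the stack depth of 16, and the functions run_program and demo_call are all assumptions made for this sketch, not the real shader bytecode or runtime.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy instruction stream. */
enum { OP_NOP, OP_LABEL, OP_CALL, OP_RET, OP_END };

typedef struct { int op; int label; } Instr;

/* Executes the toy program and returns the number of OP_NOP instructions
   that ran, or -1 on a missing label, stack overflow, or stack underflow. */
int run_program(const Instr *prog, size_t count)
{
    size_t stack[16];              /* return addresses (depth is an assumption) */
    size_t sp = 0, pc = 0, i;
    int nops = 0;

    while (pc < count) {
        const Instr in = prog[pc];
        switch (in.op) {
        case OP_NOP:
            ++nops;
            ++pc;
            break;
        case OP_LABEL:             /* labels are skipped when reached in sequence */
            ++pc;
            break;
        case OP_CALL:
            for (i = 0; i < count; ++i)
                if (prog[i].op == OP_LABEL && prog[i].label == in.label)
                    break;
            if (i == count || sp == 16)
                return -1;
            stack[sp++] = pc + 1;  /* push the address of the following instruction */
            pc = i + 1;            /* continue with the instruction after the label */
            break;
        case OP_RET:
            if (sp == 0)
                return -1;
            pc = stack[--sp];      /* resume after the matching call */
            break;
        default:                   /* OP_END */
            return nops;
        }
    }
    return nops;
}

/* Main code: nop, call 1, nop, end. Subroutine at label 1: nop, nop, ret. */
int demo_call(void)
{
    const Instr prog[] = {
        { OP_NOP, 0 }, { OP_CALL, 1 }, { OP_NOP, 0 }, { OP_END, 0 },
        { OP_LABEL, 1 }, { OP_NOP, 0 }, { OP_NOP, 0 }, { OP_RET, 0 }
    };
    return run_program(prog, sizeof prog / sizeof prog[0]);
}
```

In demo_call the two nops in the main code and the two in the subroutine all execute, which shows that control returns to the instruction after the call once ret is reached.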
|
vs 2.0 |
Call if Not-Zero. Makes a function call to the instruction label.
One slot
callnz InstructionLabelID BoolSource0
If the boolean register BoolSource0 is not zero, then the address of the following instruction is pushed onto the internal shader stack, and the current instruction address is set to the address of the instruction that follows the label instruction with the name InstructionLabelID. The instruction label ID must be an integer in the range [1,16].
Typically you’d create a shader subroutine that terminates with the ret instruction.
Setup:
BoolSource0 is a Boolean register. Requires a valid, existing instruction label.
Results:
If the source register is not zero, shader execution is transferred to the instruction following the instruction label.
callnz 1 b0 // transfer execution to label 1 if b0 != 0
callnz 2 r0 // Error! Not a Boolean register
// Simulate the callnz instruction
// make a cast to a bare function pointer
typedef void (*fp)(void);
if ( 0 != BoolSource0 )
{
    fp pFP = (fp)InstructionLabelID;
    pFP(); // call the function
}
|
vs 2.0 |
Computes the three-component cross product.
Two slots
crs Dest0, Source0, Source1
Computes the three-component cross product using the right-hand rule. There are fairly severe restrictions on the use of swizzles. The w element of every register is ignored.
This macro is equivalent to:
mul Dest0.xyz, Source0.yzxw, Source1.zxyw
mad Dest0.xyz, -Source1.yzxw, Source0.zxyw, Dest0
Setup:
Two source registers, Source0 and Source1. These registers must not be the same as the destination register. The source registers must not have any swizzles. TODO – is this checked or just gonna produce wrong values?
The destination register must have a destination mask, and that mask must not contain a reference to the w element of the destination register.
Results:
The cross product of the two input registers is stored into the specified elements of the destination register.
crs r0.xyz, r1, r2 // fill r0.xyz with the cross product of r1 and r2
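The two-instruction expansion of crs can be traced on the CPU to confirm that it produces the usual cross product. The C sketch below applies the swizzles from the mul and mad lines by hand; the Vec4 type and the function crs_expanded are hypothetical names used only for this illustration.

```c
/* Hypothetical four-component register type, for illustration only. */
typedef struct { float x, y, z, w; } Vec4;

/* Follows the macro expansion step by step. The w component is never
   written by the macro; it is zeroed here only so the result is defined. */
Vec4 crs_expanded(Vec4 s0, Vec4 s1)
{
    Vec4 d = { 0.0f, 0.0f, 0.0f, 0.0f };

    /* mul Dest0.xyz, Source0.yzxw, Source1.zxyw */
    d.x = s0.y * s1.z;
    d.y = s0.z * s1.x;
    d.z = s0.x * s1.y;

    /* mad Dest0.xyz, -Source1.yzxw, Source0.zxyw, Dest0 */
    d.x = -s1.y * s0.z + d.x;
    d.y = -s1.z * s0.x + d.y;
    d.z = -s1.x * s0.y + d.z;

    return d;
}
```

The second step reads Dest0 as an input, which is consistent with the rule that neither source register may be the same register as the destination: the first instruction would already have overwritten it.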
|
vs 2.0 |
Declare. Map a vertex element to an input register.
Takes no slots
dcl Dest0
To make it easier to optimize and verify shaders, VS 2.0 requires a declaration statement for every input register. All texture or vertex input registers must therefore be declared before they are used in the shader. Dest0 will be a specific input register. The partial precision modifier (_pp) can be applied to the declaration statement to indicate that lower precision is acceptable when using this register. You must supply a component mask on Dest0 to indicate which elements are in use and valid. dcl statements must appear before the first executable instruction.
dcl t1.rg // using a 2D texture
dcl t2 // using a 4D texture (default mask)
dcl_pp t3 // indicate partial precision is OK
|
vs 1.0, 1.1, 2.0 |
Sets the value of vertex shader float constants, but leaves it up to the programmer to insert these into the shader code.
No slot
def Dest0, value0, value1, value2, value3
Stores four floating-point values in the elements of the Dest0 register. If def instructions are used in a shader, they must follow the vs instruction and precede any other instructions.
Setup:
Four floating-point values separated by commas.
Results:
Has no effect upon the shader code that follows; you must manually insert the returned code fragment into your shader.
Note: If you use def in a shader, then when the shader is compiled you will have to use the 4th parameter returned from D3DXAssembleShader. This parameter will contain an ID3DXBuffer interface, which will contain a compiled shader code fragment. You will have to manually insert this fragment into your shader declaration.
def c0, 0.0, 0.5, 0.25, -1.0
def c1, 1.0, 2.0, 5.0, 10.0
|
vs 2.0 |
Sets the value of vertex shader integer constants.
No slot
defi IntDest0, value0, value1, value2, value3
Stores four integer values in the elements of IntDest0 register for use in this shader.
Setup:
Four integer values separated by commas.
Results:
Locally sets these values into the register. A local call takes precedence over an external SetVertexShaderConstantI() call to set a shader constant. The previous values of the register are restored upon exit from the shader.
defi i0, 0, 2, 4, 8
defi i1, -2, -1, 1, 2
|
vs 2.0 |
Sets the value of vertex shader boolean constants.
No slot
defb BoolDest0, value0, value1, value2, value3
Stores four boolean values in the elements of BoolDest0 register for use in this shader. Zero indicates false. Nonzero indicates true.
Setup:
Four booleans separated by commas.
Results:
Locally sets these values into the register. A local call takes precedence over an external SetVertexShaderConstantB() call to set a shader constant. The previous values of the register are restored upon exit from the shader.
defb b0, 0, 1, 0, 2 // false, true, false, true
|
vs 1.0, 1.1, 2.0 |
Computes the three-component dot product (a.k.a. dot-product three) and replicates the result in all specified channels of the destination register.
One slot
dp3 Dest0, Source0, Source1
Computes the dot product of the Source0 and Source1 registers and places the result in the Dest0 register. Only the x, y, and z values are used to compute the dot product; the w component is ignored.
Setup:
Two source registers, Source0 and Source1.
Results:
Unless otherwise masked, each element of Dest0 is filled with the dot product of the first three elements of registers Source0 and Source1.
dp3 r0 , v3, c2 // fill r0 with dp3
dp3 r1.x, v3, c2 // just fill r1.x
SetSourceRegisters();
// Simulate the dp3 instruction
TempReg.x = TempReg.y = TempReg.z = TempReg.w =
    Source0.x * Source1.x +
    Source0.y * Source1.y +
    Source0.z * Source1.z;
// note w component ignored
WriteDestinationRegisters();
|
vs 1.0, 1.1, 2.0 |
Computes the four-component dot product (a.k.a. dot-product four) and stores the result in all specified channels of the destination register.
One slot
dp4 Dest0, Source0, Source1
Computes the dot product of the Source0 and Source1 registers and places the result in the Dest0 register. If no mask is specified on the destination, then the entire register is filled with the dot product.
Setup:
Two source registers, Source0 and Source1.
Results:
Unless otherwise masked, each element of Dest0 is filled with the dot product of the four elements of registers Source0 and Source1.
dp4 r0, v3, c2
dp4 r1.x, v3, c2 // just fill r1.x
SetSourceRegisters();
// Simulate the dp4 instruction
TempReg.x = TempReg.y = TempReg.z = TempReg.w =
    Source0.x * Source1.x +
    Source0.y * Source1.y +
    Source0.z * Source1.z +
    Source0.w * Source1.w;
WriteDestinationRegisters();
|
vs 1.0, 1.1 |
Computes a distance vector in the format typically used for attenuated lighting calculations.
One slot
dst Dest0, Source0, Source1
Creates a distance vector from a set of distance-squared and reciprocal-distance values, and puts them in a format that can be used for attenuated lighting calculations.
Setup:
Two source registers are required. Source0 should be set up as [n/a, d², d², n/a]. Source1 should be set up as [n/a, 1/d, n/a, 1/d]. Elements noted as “n/a” are not used and their values are ignored.
Results:
Dest0 will be filled with elements that correspond to [1, d, d², 1/d]. Dest0.y is computed from the product of Source0.y and Source1.y.
dst r2, r0, r1
SetSourceRegisters();
// Simulate the dst instruction
TempReg.x = 1;
TempReg.y = Source0.y * Source1.y;
TempReg.z = Source0.z;
TempReg.w = Source1.w;
WriteDestinationRegisters();
|
vs 2.0 |
Provides an alternate path of execution for an if-else-endif block.
One slot
else
Must be inside an if-endif block. If the Boolean argument of the if statement is false, then execution skips to the else instruction and continues to the terminating endif statement. If the Boolean was true, then execution skips over the code enclosed by the else-endif block. There can be only one else statement in an if-endif block.
Setup:
The else statement must be between an if and endif statement.
Results:
If the argument provided to the if statement was false, then the code inside the else-endif block will be executed.
else
|
vs 2.0 |
The termination point for a loop-endloop block.
One slot
endloop
When used with the loop instruction, creates a block of instructions that can be executed a variable number of times.
Setup:
You must have a loop instruction in your shader prior to this instruction.
Results:
When the loop reaches the endloop instruction, the loop counter (specified in the loop instruction) is incremented by the increment value (also specified in the loop instruction).
endloop
// Simulate the endloop instruction.
// Assume that LoopCounter, LoopStep, and LoopIterator
// were defined in the loop instruction and that
// StartLoopOffset is the instruction following
// the loop instruction.
LoopCounter += LoopStep;
--LoopIterator;
if ( LoopIterator > 0 )
    goto StartLoopOffset;
// fall through
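The counter bookkeeping can be exercised in plain C. The sketch below is an illustration, not the shader runtime: the function simulate_loop and its parameters are hypothetical, and skipping the body entirely when the iteration count is zero or negative is an assumption, since the pseudocode only describes what happens at endloop.

```c
/* Runs a loop-endloop block on the CPU. Returns how many times the body
   ran and writes the final counter value through finalCounter. */
int simulate_loop(int iterations, int start, int step, int *finalCounter)
{
    int LoopIterator = iterations;
    int LoopCounter = start;
    int bodyRuns = 0;

    if (LoopIterator > 0) {            /* assumption: zero count skips the body */
        do {
            ++bodyRuns;                /* the loop body would read LoopCounter here */
            LoopCounter += step;       /* endloop: LoopCounter += LoopStep */
            --LoopIterator;            /* endloop: one fewer iteration remains */
        } while (LoopIterator > 0);    /* endloop: jump back while work remains */
    }

    *finalCounter = LoopCounter;
    return bodyRuns;
}
```

For four iterations starting at 10 with a step of 2, the body sees the counter values 10, 12, 14, and 16, and the counter holds 18 once the loop falls through.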
|
vs 2.0 |
The termination point for an if-endif or ifc-endif block.
Zero slots
endif
When used with the if or ifc instruction, creates a block of instructions that is executed only when the condition is met.
Setup:
You must have an if or ifc instruction in your shader prior to this instruction.
Results:
Execution is controlled by the if or ifc instruction that precedes this instruction. When the argument of that statement is false and there is no else statement, execution jumps to the statement following the endif.
if b1
// if b1 != 0, this section gets executed
else // optional else statement
// if b1 == 0, this section gets executed
endif
|
vs 2.0 |
The termination point for a rep-endrep block.
Zero slots
endrep
When used with the rep instruction, creates a block of instructions that can be executed a specified number of times.
Setup:
You must have a rep instruction in your shader prior to this instruction.
Results:
Execution is controlled by the rep instruction that precedes this instruction. When the iteration count of that statement is zero then execution will jump to the statement following the endrep.
defi i0, 20, 0, 0, 0
rep i0 // i0.x is used = 20
// this section gets executed 20 times
endrep
|
vs 1.0, 1.1, 2.0 |
This macro computes a power of two to at least 20 bits of precision. By default, only the source register’s w element is used. The result is replicated in the entire destination register. Note that the expp instruction, by contrast, sets the destination’s w element to 1.
Takes at least 12 instruction slots.
exp Dest0, Source0
Calculates 2^Source0.w and writes the result in Dest0. Unless otherwise specified, Source0.w is the input value, and all elements of Dest0 are written with the exponentiated value. This is somewhat different from the expp instruction, which always sets Dest0.w to 1. [TODO: what happens for 0 and negative arguments??]
exp r0, c1 // fill all of r0 with exp2(c1.w)
exp r0.x, c1.y // store exp2(c1.y) in r0.x
|
vs 1.0, 1.1, 2.0 |
Computes a power of two, with the result split into a partial-precision value and higher-precision integer and fractional parts. This allows you to use the lower-precision single element, or to use a more complicated integer/fractional calculation when you need higher precision. The destination’s w element is set to 1. Only the source register’s w element is used. If Source0.w < 0 then the results are undefined.
Note: Don’t confuse this with the exp macro!
One slot
expp Dest0, Source0
Computes low- and higher-precision values for 2^Source0.w, where Dest0.z contains the low-precision single-element approximation, and Dest0.x and Dest0.y contain the integer and fractional parts. Dest0.w is set to 1.
You have a choice of which part of the results to use. The low-precision part contains the exponential of the input value to 10 bits of precision. The two-part higher-precision result contains the exponential of the integer part of the input value and the fractional part of the input value. You will have to provide a function that computes 2^n for 0 <= n <= 1 to your desired precision, and then multiply that value by the integer part’s exponential value.
Setup:
Store the value you want the exponential of in Source0.w. The value should be positive. The other register elements are ignored.
Results:
Dest0.z will contain a low precision exponential value.
Dest0.x will contain the exponential of the integer part of the input.
Dest0.y will contain the fractional part of the input, not the exponential of the fractional part. You have to do the conversion yourself.
Dest0.w is set to 1.0.
expp r0, r1
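The higher-precision path relies on the identity 2^(i + f) = 2^i * 2^f, where i is the integer part and f the fractional part of the input. The C sketch below shows one way to recombine the two parts. The names Vec4, pow2_int, pow2_frac, expp_parts, and exp2_from_parts are hypothetical, and the truncated Taylor series used for 2^f is only one possible choice (its error stays around a few parts in 100,000 on [0, 1]); any approximation of the desired precision could be substituted.

```c
/* Hypothetical four-component register type, for illustration only. */
typedef struct { float x, y, z, w; } Vec4;

/* 2^n for a non-negative integer n, by repeated multiplication. */
float pow2_int(int n)
{
    float r = 1.0f;
    while (n-- > 0)
        r *= 2.0f;
    return r;
}

/* 2^f for 0 <= f <= 1: Taylor series of exp(f * ln 2), truncated after
   the sixth-order term. The coefficients are (ln 2)^k / k!. */
float pow2_frac(float f)
{
    return 1.0f + f * (0.69314718f + f * (0.24022651f + f * (0.05550411f
         + f * (0.00961813f + f * (0.00133336f + f * 0.00015404f)))));
}

/* Models the x, y, and w outputs of expp for a non-negative input.
   The low-precision z output is not modeled in this sketch. */
Vec4 expp_parts(float w)
{
    Vec4 d;
    int wInt = (int)w;          /* integer part of the input */
    d.x = pow2_int(wInt);       /* exponential of the integer part */
    d.y = w - (float)wInt;      /* fractional part, not its exponential */
    d.z = 0.0f;
    d.w = 1.0f;
    return d;
}

/* Recombines the two parts: 2^(i + f) = 2^i * 2^f. */
float exp2_from_parts(Vec4 r)
{
    return r.x * pow2_frac(r.y);
}
```

For an input of 3.5, the parts are 8 and 0.5, and the recombined value is close to 8 times the square root of 2.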
// DirectX 8 version
SetSourceRegisters();
// Simulate the expp instruction
float wWhole = Source0.w; // take all
float wInt = (float)(int)Source0.w; // take integer part
// compute the higher-precision parts
TempReg.x = pow(2, wInt);
TempReg.y = Source0.w - wInt; // fractional part of w
// calculate 2^(Source0.w), then chop to 10 bits of precision
// by masking the bit pattern of the float result
float lowPrec = (float)pow(2, wWhole);
unsigned long bits = *(unsigned long*)&lowPrec;
bits &= 0xffffff00;
TempReg.z = *(float*)&bits;
// set w to 1
TempReg.w = 1;
WriteDestinationRegisters();
// DirectX 9 version
SetSourceRegisters();
// Simulate the logp instruction
float v = fabs(Source0.w); // only positive values
float logValue;
if ( 0 == v )
{
    logValue = MINUS_INFINITY;
}
else
{
    logValue = (float)(log(v)/log(2));
    logValue = (int)::floor( logValue );
    // store low-precision part to 10-bits
    unsigned long temp = *(unsigned long*)&logValue;
    temp &= 0xFFFFFF00;
    logValue = *(float*)&temp;
}
TempReg.x = TempReg.y = TempReg.z = TempReg.w = logValue;
WriteDestinationRegisters();
|
vs 1.0, 1.1, 2.0 |
This macro removes the integer part of the input register’s x and y elements and places the fractional remainder into the destination register’s x and y elements. The results are always positive. A write mask on the destination is required.
Takes 3 instruction slots
frc Dest0, Source0
Takes the fractional parts of Source0’s x and y elements and places them in Dest0’s x and y elements. Dest0’s z and w elements are unaltered. The sign of the input arguments is ignored. You must specify a write mask on Dest0. This can be either xy or just y (but not just x, for some reason).
Note: Early versions of the SDK documentation incorrectly stated that the entire source register was used, and made no mention of the fact that write masks were required.
frc r0.xy, r1 // use r1.xy and store fractions in r0.xy
// use r1.x and store fraction in r0.y, r0.x (and z & w)
// remain unchanged
frc r0.y , r1.x
// this has no effect on the results, since the
// sign is ignored
frc r0.y , -r1.x
frc r0, r1 // Error! No write mask.
|
vs 2.0 |
The start of an if-else-endif block. Conditionally execute a block of code.
One slot.
if BoolReg0
The argument must be a boolean constant register. There must be a terminating endif that follows the if instruction. The else instruction is optional and must be between the if and endif statements. If the boolean argument is true, then execution will continue immediately after the if statement, until either the else or