Difference between revisions of "FloatingPoint Instruction Extension"
(Imported from http://web.archive.org/web/20100713091534/http://www.alphalinux.org/wiki/index.php?title=FloatingPoint_Instruction_Extension&action=edit) 
(No difference)

Latest revision as of 18:22, 29 August 2019
Initially implemented in the EV6 processor, the FloatingPoint Instruction Extension adds nine instructions to calculate the square root and to transfer data directly between floatingpoint and integer registers.
On previous generations, such as the EV5, in order to copy data from a floatingpoint register to an integer register or vice versa, a temporary memory location must be used. Since memory is orders of magnitude slower than registers, this made transferring data between the register classes very slow and inefficient. Square root calculations must be done (slowly) in software without this extension, making programming more difficult.
Contents
Added Instructions
Mnemonic  Description  FloatingPoint Format 

itofs  Copy Low 32 bits from Integer Register to FloatingPoint Register using S_floating Format  IEEE 
itoft  Copy 64 bits from Integer Register to FloatingPoint Register  IEEE 
itoff  Copy Low 32 bits from Integer Register to FloatingPoint Register using F_floating Format  VAX 
ftois  Copy FloatingPoint Register using S_floating Format to Integer Register  IEEE 
ftoit  Copy 64 bits from FloatingPoint Register to Integer Register  IEEE 
sqrts  Calculate square root of S_floating Formatted Value  IEEE 
sqrtt  Calculate square root of T_floating Formatted Value  IEEE 
sqrtf  Calculate square root of F_floating Formatted Value  VAX 
sqrtg  Calculate square root of G_floating Formatted Value  VAX 
Determining Presence
To determine the presence of FIX, use the amask instruction.
Latency and Slotting
Instruction  Precision  Latency  Slotting 

itof  4  L0, L1  
ftoi  3  FST0, FST1, L0, L1  
fsqrt  Single  18  FA 
Double  33  FA^{[1]}. 
The latency of fsqrt instructions also depends on the slotting of the instruction consuming the result. If the consuming instruction is also slotted FA, then the latency is reduced by 3. That is, the latency of fsqrtt in the following example is 30 since faddt is slotted FA.
fsqrtt $f0,$f1 ; $f1 = sqrt($f0) faddt $f1,$f2,$f3 ; $f3 = $f1 + $f2
Note that fsqrt instructions are not pipelined.
Usage
Data Movement
In order to copy data from an integer register to a floatingpoint register on Alphas prior to the EV6, a sequence similar to the following must be used.
stq Rs,temp ldt Fd,temp
where Rs is the integer register source, temp is the temporary memory location, and Fd is the floatingpoint register source.
With FIX, this code is replaced by
itoft Rs,Fd
Square Root
To calculate the square root on Alphas prior to the EV6, a routine must be written in software.
Using FIX, the square root can be calculated with
sqrtt Fs,Fd
where Fs is the floatingpoint register source and Fd is the floatingpoint register destination.