SH-4 CPU Core Architecture
PRELIMINARY DATA

Issued by the MCDT Documentation Group on behalf of STMicroelectronics

Information furnished is believed to be accurate and reliable. However, STMicroelectronics assumes no responsibility for the consequences of use of such information nor for any infringement of patents or other rights of third parties which may result from its use. No license is granted by implication or otherwise under any patent or patent rights of STMicroelectronics. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all information previously supplied. STMicroelectronics products are not authorized for use as critical components in life support devices or systems without the express written approval of STMicroelectronics.

Notice:
When using this document, keep the following in mind:
1. This document may, wholly or partially, be subject to change without notice.
2. All rights are reserved: No one is permitted to reproduce or duplicate, in any form, the whole or part of this document without Hitachi's permission.
3. Hitachi will not be held responsible for any damage to the user that may result from accidents or any other reasons during operation of the user's unit according to this document.
4. Circuitry and other examples described herein are meant merely to indicate the characteristics and performance of Hitachi's semiconductor products. Hitachi assumes no responsibility for any intellectual property claims or other problems that may result from applications based on the examples described herein.
5. No license is granted by implication or otherwise under any patents or other rights of any third party or Hitachi, Ltd.
6. MEDICAL APPLICATIONS: Hitachi's products are not authorized for use in MEDICAL APPLICATIONS without the written consent of the appropriate officer of Hitachi's sales company. Such use includes, but is not limited to, use in life support systems. Buyers of Hitachi's products are requested to notify the relevant Hitachi sales offices when planning to use the products in MEDICAL APPLICATIONS.

The ST logo is a registered trademark of STMicroelectronics.

SuperH is a registered trademark for products originally developed by Hitachi, Ltd. and is owned by Hitachi Ltd.


STMicroelectronics Group of Companies
Australia - Brazil - Canada - China - Finland - France - Germany - Hong Kong - India - Israel - Italy - Japan - Malaysia - Malta - Morocco - Singapore - Spain - Sweden - Switzerland - United Kingdom - U.S.A.

http://www.st.com

STMicroelectronics and Hitachi, Ltd.
SH-4 CPU Core Architecture
ADCS 7182230F
Contents

Preface 

1 Overview
   1.1 SH-4 CPU core features
   1.2 Block diagram

2 Programming model
   2.1 General registers
   2.2 System registers
   2.3 Control registers
   2.4 Floating-point registers
   2.5 Memory-mapped registers
   2.6 Data format in registers
   2.7 Data formats in memory
   2.8 Processor states
      2.8.1 Reset state
      2.8.2 Exception-handling state
      2.8.3 Program execution state
      2.8.4 Power-down state
   2.9 Processor modes

3 Memory management unit (MMU)
   3.1 Overview
   3.2 Role of the MMU
   3.3 Register descriptions
      3.3.1 Page table entry high register (PTEH)
      3.3.2 Page table entry low register (PTEL)
      3.3.3 Translation table base register (TTB)
      3.3.4 TLB exception address register (TEA)
      3.3.5 MMU control register (MMUCR)
   3.4 Address space
      3.4.1 Physical address space
      3.4.2 External memory space
      3.4.3 Virtual address space
      3.4.4 On-chip RAM space
      3.4.5 Address translation
      3.4.6 Single virtual memory mode and multiple virtual memory mode
      3.4.7 Address space identifier (ASID)
   3.5 TLB functions
      3.5.1 Unified TLB (UTLB) configuration
      3.5.2 Instruction TLB (ITLB) configuration
      3.5.3 Address translation method
   3.6 MMU functions
      3.6.1 MMU hardware management
      3.6.2 MMU software management
      3.6.3 MMU instruction (LDTLB)
      3.6.4 Hardware ITLB miss handling
      3.6.5 Avoiding synonym problems
   3.7 Handling MMU exceptions
      3.7.1 ITLBMULTIHIT
      3.7.2 ITLBMISS
      3.7.3 EXECPROT
      3.7.4 OTLBMULTIHIT
      3.7.5 TLBMISS
      3.7.6 READPROT
      3.7.7 FIRSTWRITE
   3.8 Memory-mapped TLB configuration
      3.8.1 ITLB address array
      3.8.2 ITLB data array 1
      3.8.3 UTLB address array
      3.8.4 UTLB data array 1

4 Caches
   4.1 Overview
      4.1.1 Features
   4.2 Register descriptions
      4.2.1 Cache control register (CCR)
      4.2.2 Queue address control register 0 (QACR0)
      4.2.3 Queue address control register 1 (QACR1)
   4.3 Operand cache (OC)
      4.3.1 Configuration
      4.3.2 Read operation
      4.3.3 Write operation
      4.3.4 Write-back buffer
      4.3.5 Write-through buffer
      4.3.6 RAM mode
      4.3.7 OC index mode
      4.3.8 Coherency between cache and external memory
      4.3.9 Prefetch operation
   4.4 Instruction cache (IC)
      4.4.1 Configuration
      4.4.2 Read operation
      4.4.3 IC index mode
   4.5 Memory-mapped cache configuration
      4.5.1 IC address array
      4.5.4 IC data array
      4.5.5 OC address array
      4.5.6 OC data array
   4.6 Store queues
      4.6.1 SQ configuration
      4.6.2 SQ writes
      4.6.3 SQ reads (implementation dependent)
      4.6.4 Transfer to external memory

5 Exceptions
   5.1 Overview
   5.2 Register descriptions
      5.2.1 Exception event register (EXPEVT)
      5.2.2 Interrupt event register (INTEVT)
      5.2.3 TRAPA exception register (TRA)
   5.3 Exception handling functions
      5.3.1 Exception handling flow
      5.3.2 Exception handling vector addresses
   5.4 Exception types and priorities
   5.5 Exception flow
      5.5.1 Exception flow
      5.5.2 Exception source acceptance
      5.5.3 Exception requests and BL bit
      5.5.4 Return from exception handling
   5.6 Description of exceptions
      5.6.1 Resets
      5.6.2 General exceptions
      5.6.3 Interrupts
      5.6.4 Priority order with multiple exceptions
   5.7 Usage notes

6 Floating-point unit
   6.1 Overview
   6.2 Floating-point format
      6.2.1 Non-numbers (NaN)
      6.2.2 Denormalized numbers
   6.3 Rounding
   6.4 Floating-point exceptions
   6.5 Graphics support functions
      6.5.1 Geometric operation instructions
      6.5.2 Pair single-precision data transfer

7 Instruction set
   7.1 Execution environment
   7.2 Addressing modes
   7.3 Instruction set summary

8 Instruction specification
   8.1 Overview
   8.2 Variables and types
      8.2.1 Integer
      8.2.2 Boolean
      8.2.3 Bit-fields
      8.2.4 Arrays
      8.2.5 Floating point values
   8.3 Expressions
      8.3.1 Integer arithmetic operators
      8.3.2 Integer shift operators
      8.3.3 Integer bitwise operators
      8.3.4 Relational operators
      8.3.5 Boolean operators
      8.3.6 Single-value functions
   8.4 Statements
      8.4.1 Undefined behavior
      8.4.2 Assignment
      8.4.3 Conditional
      8.4.4 Repetition
      8.4.5 Exceptions
      8.4.6 Procedures
   8.5 Architectural state
   8.6 Memory model
      8.6.1 Support functions
      8.6.2 Reading memory
      8.6.3 Prefetching memory
      8.6.4 Writing memory
   8.7 Cache model
   8.8 Floating-point model
      8.8.1 Functions to access SR and FPSCR
      8.8.2 Functions to model floating-point behavior
      8.8.3 Floating-point special cases and exceptions
   8.9 Abstract sequential model
      8.9.1 Initial conditions
      8.9.2 Instruction execution loop
      8.9.3 State changes
   8.10 Example instructions

9 Instruction descriptions
   9.1 Alphabetical list of instructions

Preface

This document is part of the SuperH Documentation Suite detailed below. Comments on this or other manuals in the SuperH Documentation Suite should be made by contacting your local STMicroelectronics Limited Sales Office or distributor.

Document identification and control

Each book carries a unique identifier in the form:

ADCS nnnnnnnx

Where, nnnnnnn is the document number and x is the revision.

Whenever making comments on a document the complete identification ADCS nnnnnnnx should be quoted.
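
The identifier format above can be checked mechanically. The following is a minimal sketch, not part of the documentation suite: the helper name `parse_adcs` and the regular expression are hypothetical, and the pattern assumes exactly seven digits followed by a single upper-case revision letter. The revision is treated as optional because some references in this preface quote only the document number.

```python
import re

# Hypothetical pattern inferred from the "ADCS nnnnnnnx" form described above:
# seven digits, then an optional single revision letter (assumed upper case).
ADCS_PATTERN = re.compile(r"^ADCS (\d{7})([A-Z])?$")


def parse_adcs(identifier):
    """Split an ADCS identifier into (document number, revision or None)."""
    match = ADCS_PATTERN.match(identifier.strip())
    if match is None:
        raise ValueError(f"not an ADCS identifier: {identifier!r}")
    return match.group(1), match.group(2)


print(parse_adcs("ADCS 7182230F"))
print(parse_adcs("ADCS 7379953"))
```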

ST40 Micro Toolset Getting Started

ADCS 7379953. This manual provides an introduction to the ST40 Micro Toolset and instructions for getting a simple OS21 application running on an STMicroelectronics MediaRef platform. It also describes how to boot OS21 applications from ROM and how to port applications which use STMicroelectronics' STLite/OS20 operating systems to OS21.

OS21 User's Manual

ADCS 7358306. This manual describes the generic use of OS21 across supported platforms. It describes all the core features of OS21 and their use and details the OS21 function definitions. It also explains how OS21 differs from STLite/OS20, the API targeted at ST20.

OS21 for ST40 User Manual

ADCS 7358673. This manual describes the use of OS21 on ST40 platforms. It describes how specific ST40 facilities are exploited by the OS21 API. It also describes the OS21 board support packages for ST40 platforms.

32-Bit RISC Series, SH-4 CPU Core Architecture

ADCS 7182230. This manual describes the architecture and instruction set of the SH4-1xx (previously known as ST40-C200) core as used by STMicroelectronics.

32-Bit RISC Series, SH-4, ST40 System Architecture

This manual describes the ST40 family system architecture. It is split into four volumes:

ST40 System Architecture - Volume 1 System - ADCS 7153464.
ST40 System Architecture - Volume 3 Video Devices - ADCS 7225754.

Conventions used in this guide

General notation

The notation in this document uses the following conventions:

• Sample code, keyboard input and file names,
• Variables and code variables,
• Equations and math,
• Screens, windows and dialog boxes,
• Instructions.

Hardware notation

The following conventions are used for hardware notation:

• REGISTER NAMES and FIELD NAMES,
• PIN NAMES and SIGNAL NAMES.

Software notation

Syntax definitions are presented in a modified Backus-Naur Form (BNF). Briefly:

1. Terminal strings of the language, that is those not built up by rules of the language, are printed in teletype font. For example, `void`.

2. Nonterminal strings of the language, that is those built up by rules of the language, are printed in italic teletype font. For example, *`name`*.

3. If a nonterminal string of the language starts with a nonitalicized part, it is equivalent to the same nonterminal string without that nonitalicized part. For example, `vspace-`*`name`*.

4. Each phrase definition is built up using a double colon and an equals sign (`::=`) to separate the two sides.

5. Alternatives are separated by vertical bars (`|`).

6. Optional sequences are enclosed in square brackets (`[` and `]`).

7. Items which may be repeated appear in braces (`{` and `}`).

1 Overview

1.1 SH-4 CPU core features

This manual describes the architecture of the SH-4 CPU core. The core is a highly encapsulated design component that can be integrated into any product, so this manual contains no references to clock speeds, system facilities, pin-outs or similar data. For that information, refer to the datasheet or system architecture manual of the appropriate product.

The SH-4 is a 32-bit RISC (reduced instruction set computer) microprocessor that is upward-compatible at object-code level with the Hitachi SuperH SH-1, SH-2, SH-3, and SH-3E microcomputers. It includes an instruction cache, an operand cache that can be switched between copy-back and write-through modes, a 4-entry fully associative instruction TLB (translation lookaside buffer), and an MMU (memory management unit) with a 64-entry fully associative shared TLB.

The SH-4's 16-bit fixed-length instruction set enables program code size to be reduced by almost 50% compared with 32-bit instructions.

The SH-4 200 series includes an enhanced mode that enables 2-way set-associative instruction and operand caches. The SH-4 100 series, and the SH-4 200 series when running in its default compatibility mode, use direct-mapped caches. In particular, the SH4-202 has a 32-Kbyte 2-way operand cache and a 16-Kbyte 2-way instruction cache. On power-up these behave as a 16-Kbyte direct-mapped operand cache and an 8-Kbyte direct-mapped instruction cache.

1. Naming conventions:
   SH-4: for non-variant specific information
   SH-4 100/200 series: for series specific features
   SH4-103/202: for variant specific features

The features of the SH-4 CPU core are summarized as follows:

**CPU**
- Original Hitachi SH architecture
- 32-bit internal data bus
- General register file:
  - Sixteen 32-bit general registers (and eight 32-bit shadow registers)
  - Seven 32-bit control registers
  - Four 32-bit system registers
- RISC-type instruction set (upward-compatible with SH Series)
  - Fixed 16-bit instruction length for improved code efficiency
  - Load-store architecture
  - Delayed branch instructions
  - Conditional execution
- Superscalar architecture: Parallel execution of two instructions
- Instruction execution time: Maximum 2 instructions/cycle
- Virtual address space: 4 Gbytes (448-Mbyte external memory space)
- Space identifier ASIDs: 8 bits, 256 virtual address spaces
- On-chip multiplier
- Five-stage pipeline

**FPU**
- On-chip floating-point coprocessor
- Supports single-precision (32 bits) and double-precision (64 bits)
- Supports IEEE754-compliant data types and exceptions
- Two rounding modes: Round to Nearest and Round to Zero
- Handling of denormalized numbers: Truncation to zero or interrupt generation for compliance with IEEE 754
- Floating-point registers:
  - 2 banks of sixteen 32-bit single precision registers or,
  - 2 banks of eight 64-bit double precision registers or,
  - 2 banks of four 128-bit vector registers (each vector is 4 single precision elements)
- 32-bit CPU-FPU floating-point communication register (FPUL)
- Supports FMAC (multiply-and-accumulate) instruction
- Supports FDIV (divide) and FSQRT (square root) instructions
- Supports FLDI0/FLDI1 (load constant 0/1) instructions
- Instruction execution times
  - Latency (FMAC/FADD/FSUB/FMUL): 3 cycles (single-precision), 8 cycles (double-precision)
  - Pitch (FMAC/FADD/FSUB/FMUL): 1 cycle (single-precision), 6 cycles (double-precision)
  - Note: FMAC is supported for single-precision only.
- 3-D graphics instructions (single-precision only):
  - 4-dimensional vector conversion and matrix operations (FTRV): 4 cycles (pitch), 7 cycles (latency)
  - 4-dimensional vector (FIPR) inner product: 1 cycle (pitch), 4 cycles (latency)
- Five-stage pipeline
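
The latency and pitch figures above can be combined into a rough cycle estimate for a run of instructions. The sketch below is an idealized model and an assumption, not a statement of how the core schedules instructions: it treats pitch as the interval between successive issues of the same instruction type and assumes the instructions are independent, with no other stalls. The function name `cycles_for_sequence` is hypothetical.

```python
def cycles_for_sequence(count, latency, pitch):
    """Cycles until the last of `count` independent instructions completes.

    Idealized model: the first result is ready after `latency` cycles and
    each further instruction issues `pitch` cycles after the previous one.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return latency + (count - 1) * pitch


# (latency, pitch) pairs taken from the feature list above.
TIMINGS = {
    "FADD single": (3, 1),
    "FADD double": (8, 6),
    "FIPR": (4, 1),
    "FTRV": (7, 4),
}

for name, (latency, pitch) in TIMINGS.items():
    print(name, cycles_for_sequence(8, latency, pitch))
```

Under these assumptions, eight back-to-back single-precision FADD instructions take 3 + 7 × 1 = 10 cycles, while eight FTRV instructions take 7 + 7 × 4 = 35 cycles.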

**Power-down**
- Power-down modes
  - Sleep mode
  - Standby mode
  - Module standby function

**MMU**
- 4-Gbyte address space, 256 address space identifiers (8-bit ASIDs)
- Single virtual memory mode and multiple virtual memory mode
- Supports multiple page sizes: 1 kbyte, 4 kbytes, 64 kbytes, 1 Mbyte
- 4-entry fully-associative TLB for instructions
- 64-entry fully-associative TLB for instructions and operands
- Supports software-controlled replacement and random-counter replacement algorithm
- TLB contents can be accessed directly by address mapping

**Cache memory**

**SH4-103**
- Instruction cache (IC)
  - 8 kbytes, direct mapping
  - 256 entries, 32-byte block length
  - Normal mode (8-kbyte cache)
  - Index mode
- Operand cache (OC)
  - 16 kbytes, direct mapping
  - 512 entries, 32-byte block length
  - Normal mode (16-kbyte cache)
  - Index mode
  - RAM mode (8-kbyte cache + 8-kbyte RAM)
  - Choice of write method (copy-back or write-through)
- Single-stage copy-back buffer, single-stage write-through buffer
- Cache memory contents can be accessed directly by address mapping (usable as on-chip memory)
- Store queue (32 bytes x 2 entries)

**SH4-202**
- Instruction cache (IC):
  - 16 Kbyte, 2-way set associative
  - 512 entries, 32-byte block length
  - Compatibility mode (8 kbyte direct mapped)
  - Index mode
- Operand cache (OC)
  - 32 Kbyte, 2-way set associative
  - 1024 entries, 32-byte block length
  - Compatibility mode (16 Kbyte direct mapped)
  - Index mode
  - RAM mode (16 Kbyte cache + 16 Kbyte RAM)
- Single-stage copy-back buffer, single-stage write-through buffer
- Cache memory contents can be accessed directly by address mapping (usable as on-chip memory)
- Store queue (32 bytes x 2 entries)

a. Index mode (IC and OC) is only supported when in SH4-1xx compatibility mode.
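
The entry counts in the lists above follow from the cache size and the 32-byte block length. The sketch below reproduces that arithmetic. The function name `cache_geometry` is hypothetical, and the "sets" figure (entries divided by the number of ways) is a derived quantity that the lists above do not state.

```python
KB = 1024


def cache_geometry(size_bytes, ways, block_bytes=32):
    """Return (entries, sets) for a cache of the given size and associativity."""
    entries, remainder = divmod(size_bytes, block_bytes)
    if remainder or entries % ways:
        raise ValueError("size must be a whole number of blocks per way")
    return entries, entries // ways


print("SH4-103 IC:", cache_geometry(8 * KB, ways=1))
print("SH4-103 OC:", cache_geometry(16 * KB, ways=1))
print("SH4-202 IC:", cache_geometry(16 * KB, ways=2))
print("SH4-202 OC:", cache_geometry(32 * KB, ways=2))
```

The first element of each result matches the entry counts listed above: 256 and 512 for the SH4-103, 512 and 1024 for the SH4-202.
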
1.2 Block diagram

Figure 1 shows an internal block diagram of the SH-4 32-Bit CPU Core.

- **CCN**: Cache and TLB controller
- **FPU**: Floating point unit
- **ITLB**: Instruction Translation lookaside buffer
- **UTLB**: Unified Translation lookaside buffer

Figure 1 SH-4 32-Bit CPU core

2 Programming model

The SH-4 CPU core has two processor modes, user mode and privileged mode. The SH-4 normally operates in user mode, and switches to privileged mode when an exception occurs, or an interrupt is accepted.

There are four kinds of registers:

• general registers
  There are 16 general registers, R0 to R15. General registers R0 to R7 are banked registers which are switched by a processor mode change.

• system registers
  Access to these registers does not depend on the processor mode.

• control registers

• floating-point registers
  There are thirty-two floating-point registers, FR0–FR15 and XF0–XF15. FR0–FR15 and XF0–XF15 can be assigned to either of two banks (FPR0_BANK0–FPR15_BANK0 or FPR0_BANK1–FPR15_BANK1).

The registers that can be accessed differ in the two processor modes.
Register values after a reset are shown in Table 1.

<table>
<thead>
<tr>
<th>Type</th>
<th>Registers</th>
<th>Initial value<sup>a</sup></th>
</tr>
</thead>
<tbody>
<tr>
<td>General registers</td>
<td>R0_BANK0–R7_BANK0, R0_BANK1–R7_BANK1, R8–R15</td>
<td>Undefined</td>
</tr>
<tr>
<td>Control registers</td>
<td>SR</td>
<td>MD bit = 1, RB bit = 1, BL bit = 1, FD bit = 0, I3–I0 = 1111 (0xF), reserved bits = 0, others undefined</td>
</tr>
<tr>
<td></td>
<td>GBR, SSR, SPC, SGR, DBR</td>
<td>Undefined</td>
</tr>
<tr>
<td></td>
<td>VBR</td>
<td>0x00000000</td>
</tr>
<tr>
<td>System registers</td>
<td>MACH, MACL, PR, FPUL</td>
<td>Undefined</td>
</tr>
<tr>
<td></td>
<td>PC</td>
<td>0xA0000000</td>
</tr>
<tr>
<td></td>
<td>FPSCR</td>
<td>0x00040001</td>
</tr>
<tr>
<td>Floating-point registers</td>
<td>FR0–FR15, XF0–XF15</td>
<td>Undefined</td>
</tr>
</tbody>
</table>

Table 1: Initial register values

a. Initialized by a power-on reset and manual reset
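
The SR row of Table 1 can be turned into a bit pattern. The sketch below is illustrative only. The bit positions (MD = 30, RB = 29, BL = 28, FD = 15, I3–I0 = 7 to 4) are assumptions here, because Table 1 does not give them; check them against the SR description in Section 2.3. The names `SR_FIELDS` and `pack_sr` are hypothetical.

```python
# Assumed SR layout as (shift, width); not stated in Table 1.
SR_FIELDS = {
    "MD": (30, 1),
    "RB": (29, 1),
    "BL": (28, 1),
    "FD": (15, 1),
    "IMASK": (4, 4),   # I3-I0
}


def pack_sr(**fields):
    """Combine named SR field values into a 32-bit word; other bits stay 0."""
    value = 0
    for name, field_value in fields.items():
        shift, width = SR_FIELDS[name]
        if field_value >> width:
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        value |= field_value << shift
    return value


# Defined bits after reset, per the SR row of Table 1.
reset_sr = pack_sr(MD=1, RB=1, BL=1, FD=0, IMASK=0xF)
print(hex(reset_sr))
```

With the assumed layout this gives 0x700000f0 for the defined bits. The bits that Table 1 lists as undefined are left at zero in this sketch.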

2.1 General registers

Figure 2 shows the relationship between the processor modes and the general registers. The SH-4 CPU core has twenty-four 32-bit general registers (R0_BANK0–R7_BANK0, R0_BANK1–R7_BANK1, and R8–R15). However, only 16 of these can be accessed as general registers, R0–R15, in either processor mode. The assignment of R0–R7, in both modes, is shown below.

- R0_BANK0–R7_BANK0
  - In user mode (SR.MD = 0), R0–R7 are always assigned to R0_BANK0–R7_BANK0.
  - In privileged mode (SR.MD = 1), R0–R7 are assigned to R0_BANK0–R7_BANK0 only when SR.RB = 0.
- R0_BANK1–R7_BANK1
  - In user mode, R0_BANK1–R7_BANK1 cannot be accessed.
  - In privileged mode, R0–R7 are assigned to R0_BANK1–R7_BANK1 only when SR.RB = 1.

<table>
<thead>
<tr>
<th>Physical registers</th>
<th>Accessed as, when SR.MD = 0 or (SR.MD = 1, SR.RB = 0)</th>
<th>Accessed as, when SR.MD = 1, SR.RB = 1</th>
</tr>
</thead>
<tbody>
<tr>
<td>R0_BANK0–R7_BANK0</td>
<td>R0–R7</td>
<td>R0_BANK0–R7_BANK0</td>
</tr>
<tr>
<td>R0_BANK1–R7_BANK1</td>
<td>R0_BANK1–R7_BANK1 (privileged mode only)</td>
<td>R0–R7</td>
</tr>
<tr>
<td>R8–R15</td>
<td>R8–R15</td>
<td>R8–R15</td>
</tr>
</tbody>
</table>

Figure 2: General registers
Programming Note:

As the user's R0–R7 are assigned to R0_BANK0–R7_BANK0, and after an exception or interrupt R0–R7 are assigned to R0_BANK1–R7_BANK1, it is not necessary for the interrupt handler to save and restore the user's R0–R7 (R0_BANK0–R7_BANK0).

After a reset, the values of R0_BANK0–R7_BANK0, R0_BANK1–R7_BANK1, and R8–R15 are undefined.
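
The bank assignment rules above can be summarized as a small lookup. This is a sketch of the rules as stated in this section, not a model of the hardware; the function name `physical_register` is hypothetical.

```python
def physical_register(n, md, rb):
    """Name the physical register that Rn refers to for given SR.MD and SR.RB."""
    if not 0 <= n <= 15:
        raise ValueError("general registers are R0 to R15")
    if n >= 8:
        return f"R{n}"  # R8-R15 are not banked
    # Bank 1 is selected only in privileged mode with SR.RB = 1;
    # in user mode R0-R7 always map to bank 0.
    bank = 1 if (md == 1 and rb == 1) else 0
    return f"R{n}_BANK{bank}"


for md, rb in [(0, 0), (1, 0), (1, 1)]:
    print(f"MD={md} RB={rb}: R0 ->", physical_register(0, md, rb))
```

Because R0–R7 name a different physical bank in the (SR.MD = 1, SR.RB = 1) case, a handler running in that state does not overwrite the user's R0_BANK0–R7_BANK0.
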
### 2.2 System registers

<table>
<thead>
<tr>
<th>Name</th>
<th>Size</th>
<th>Initial value</th>
<th>Synopsis</th>
</tr>
</thead>
<tbody>
<tr>
<td>MACH</td>
<td>32</td>
<td>Undefined</td>
<td>Multiply-and-accumulate register high</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>MACH is used for the added value in a MAC instruction, and to store a MAC instruction or MUL instruction operation result.</td>
</tr>
<tr>
<td>MACL</td>
<td>32</td>
<td>Undefined</td>
<td>Multiply-and-accumulate register low</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>MACL is used for the added value in a MAC instruction, and to store a MAC instruction or MUL instruction operation result.</td>
</tr>
<tr>
<td>PR</td>
<td>32</td>
<td>Undefined</td>
<td>Procedure register</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>The return address is stored in PR when a subroutine is called using a BSR, BSRF or JSR instruction. PR is referenced by the subroutine return instruction (RTS).</td>
</tr>
<tr>
<td>PC</td>
<td>32</td>
<td>0xA000 0000</td>
<td>Program counter</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>PC indicates the executing instruction address.</td>
</tr>
<tr>
<td>FPSCR</td>
<td>32</td>
<td>0x0004 0001</td>
<td>Floating-point status/control register</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Refer to Table 3: FPSCR register description</td>
</tr>
<tr>
<td>FPUL</td>
<td>32</td>
<td>Undefined</td>
<td>Floating-point communication register</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Data transfer between FPU registers and CPU registers is carried out via the FPUL register. The FPUL register is a system register, and is accessed from the CPU side by means of LDS and STS instructions. For example, to convert the integer stored in general register R1 to a single-precision floating-point number, the processing flow is as follows: R1 → (LDS instruction) → FPUL → (single-precision FLOAT instruction) → FR1</td>
</tr>
</tbody>
</table>

**Table 2: System registers**
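
The integer-to-float flow described for FPUL (R1 → LDS → FPUL → FLOAT → FR1) can be mimicked in software. The sketch below is an illustration, not the instruction definition: the function names are hypothetical, and Python's `struct` module always rounds to nearest, whereas the hardware follows the rounding mode selected in FPSCR. The two agree only for integers that are exactly representable in single precision (magnitude up to 2^24).

```python
import struct


def lds_to_fpul(r1):
    """Model LDS Rm,FPUL: copy the 32-bit register value into FPUL unchanged."""
    return r1 & 0xFFFFFFFF


def float_fpul(fpul):
    """Model single-precision FLOAT: signed 32-bit integer to IEEE 754 bits."""
    signed = fpul - (1 << 32) if fpul & 0x80000000 else fpul
    # Pack as a big-endian single, then read the same four bytes back as an
    # unsigned integer to obtain the bit pattern that FR1 would hold.
    return struct.unpack(">I", struct.pack(">f", float(signed)))[0]


for r1 in (3, 0xFFFFFFFF, 1 << 24):
    fr1 = float_fpul(lds_to_fpul(r1))
    print(f"R1=0x{r1:08X} -> FR1=0x{fr1:08X}")
```
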
### FPSCR

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>RM</td>
<td>[0,1]</td>
<td>2</td>
<td>Rounding mode.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>RM = 00: Round to Nearest.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>RM = 01: Round to Zero.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>RM = 10: Reserved.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>RM = 11: Reserved.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>For details see Section 6.3: Rounding</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset</td>
<td>1</td>
</tr>
<tr>
<td>Flag inexact</td>
<td>2</td>
<td>1</td>
<td>FPU inexact exception flag.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Set to 1 if Inexact exception occurs.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset</td>
<td>0</td>
</tr>
<tr>
<td>Flag underflow</td>
<td>3</td>
<td>1</td>
<td>FPU underflow exception flag.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Set to 1 if Underflow exception occurs</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset</td>
<td>0</td>
</tr>
<tr>
<td>Flag overflow</td>
<td>4</td>
<td>1</td>
<td>FPU overflow exception flag.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Set to 1 if overflow exception occurs</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset</td>
<td>0</td>
</tr>
<tr>
<td>Flag division by zero</td>
<td>5</td>
<td>1</td>
<td>FPU division by zero exception flag.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Set to 1 if division by zero exception occurs</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset</td>
<td>0</td>
</tr>
<tr>
<td>Flag invalid operation</td>
<td>6</td>
<td>1</td>
<td>FPU invalid operation exception flag.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Set to 1 if Invalid operation exception occurs</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset</td>
<td>0</td>
</tr>
</tbody>
</table>

Table 3: FPSCR register description

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>Enable inexact</td>
<td>7</td>
<td>1</td>
<td>FPU inexact exception enable field.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Set to 1 to cause a trap when an inexact exception occurs.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Enable underflow</td>
<td>8</td>
<td>1</td>
<td>FPU underflow exception enable field.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Set to 1 to cause a trap when an underflow exception occurs.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Enable overflow</td>
<td>9</td>
<td>1</td>
<td>FPU overflow exception enable field.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Set to 1 to cause a trap when an overflow exception occurs.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Enable division by zero</td>
<td>10</td>
<td>1</td>
<td>FPU division by zero exception enable field.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Set to 1 to cause a trap when a division by zero exception occurs.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Enable invalid</td>
<td>11</td>
<td>1</td>
<td>FPU invalid exception enable field.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Set to 1 to cause a trap when an Invalid exception occurs.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td>0</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Cause inexact</td>
<td>12</td>
<td>1</td>
<td>FPU inexact exception cause field.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Set to 0 before an FPU instruction is executed. Set to 1 if an Inexact exception occurs.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td>0</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>Cause underflow</td>
<td>13</td>
<td>1</td>
<td>FPU underflow exception cause field.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Set to 0 before an FPU instruction is executed. Set to 1 if an underflow exception occurs.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>Cause overflow</td>
<td>14</td>
<td>1</td>
<td>FPU overflow exception cause field.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Set to 0 before an FPU instruction is executed. Set to 1 if an overflow exception occurs.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>Cause division by zero</td>
<td>15</td>
<td>1</td>
<td>FPU division by zero exception cause field.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Set to 0 before an FPU instruction is executed. Set to 1 if a division by zero exception occurs.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>Cause invalid</td>
<td>16</td>
<td>1</td>
<td>FPU invalid exception cause field.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Set to 0 before an FPU instruction is executed. Set to 1 if an invalid exception occurs.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>Cause FPU error</td>
<td>17</td>
<td>1</td>
<td>FPU error exception cause field.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Set to 0 before an FPU instruction is executed. Set to 1 if an FPU error exception occurs.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>0</td>
<td></td>
</tr>
<tr>
<td>DN</td>
<td>18</td>
<td>1</td>
<td>Denormalization mode.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>DN = 0: A denormalized number is treated as such. DN = 1: A denormalized number is treated as zero.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>0</td>
<td></td>
</tr>
</tbody>
</table>

*Table 3: FPSCR register description*

STMicroelectronics and Hitachi, Ltd.

ADCS 7182230F


<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>PR</td>
<td>19</td>
<td>1</td>
<td>Precision mode.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>PR = 0: Floating point instructions are executed as single precision operations.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>PR = 1: Floating point instructions are executed as double-precision operations (the result of instructions for which double-precision is not supported is undefined).</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Mode setting [SZ = 1, PR = 1] is reserved. FPU operation results are undefined in this mode.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td>1</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>SZ</td>
<td>20</td>
<td>1</td>
<td>Transfer size mode.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>SZ = 0: The data size of the FMOV instruction is 32 bits.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>SZ = 1: The data size of the FMOV instruction is a 32-bit register pair (64 bits).</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Programming note:</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>When SZ = 1 and big endian mode is selected, FMOV can be used for double-precision floating-point data load or store operations. In little endian mode, two 32-bit data size moves must be executed, with SZ = 0, to load or store a double-precision floating-point number.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td>0</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>FR</td>
<td>21</td>
<td>1</td>
<td>Floating-point register bank.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>FR = 0: FPR0_BANK0-FPR15_BANK0 are assigned to FR0-FR15; FPR0_BANK1-FPR15_BANK1 are assigned to XF0-XF15.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>FR = 1: FPR0_BANK0-FPR15_BANK0 are assigned to XF0-XF15; FPR0_BANK1-FPR15_BANK1 are assigned to FR0-FR15.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td>0</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>RES</td>
<td>[22,31]</td>
<td>10</td>
<td>Bits reserved</td>
<td>RW</td>
</tr>
<tr>
<td>Power-on reset</td>
<td>Undefined</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Table 3: FPSCR register description
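
The bit positions in Table 3 can be written down as C masks. The following is a minimal sketch, not vendor code: the macro and function names are assumptions made for illustration, and only the fields listed in this part of the table are covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from Table 3; the names are illustrative only. */
#define FPSCR_CAUSE_UNDERFLOW (UINT32_C(1) << 13)
#define FPSCR_CAUSE_OVERFLOW  (UINT32_C(1) << 14)
#define FPSCR_CAUSE_DIV_ZERO  (UINT32_C(1) << 15)
#define FPSCR_CAUSE_INVALID   (UINT32_C(1) << 16)
#define FPSCR_CAUSE_FPU_ERROR (UINT32_C(1) << 17)
#define FPSCR_DN              (UINT32_C(1) << 18)
#define FPSCR_PR              (UINT32_C(1) << 19)
#define FPSCR_SZ              (UINT32_C(1) << 20)
#define FPSCR_FR              (UINT32_C(1) << 21)

static_assert(FPSCR_FR == UINT32_C(0x00200000), "FR is bit 21");

/* True for the reserved setting [SZ = 1, PR = 1], whose results are undefined. */
static bool fpscr_mode_is_reserved(uint32_t fpscr)
{
    return (fpscr & FPSCR_SZ) != 0 && (fpscr & FPSCR_PR) != 0;
}

/* FMOV transfer size in bits: 32 when SZ = 0, 64 when SZ = 1. */
static unsigned fpscr_fmov_bits(uint32_t fpscr)
{
    return (fpscr & FPSCR_SZ) ? 64u : 32u;
}
```

A caller could use `fpscr_mode_is_reserved()` as a guard before writing a new mode value to FPSCR.
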
### 2.3 Control registers

<table>
<thead>
<tr>
<th>Name</th>
<th>Size</th>
<th>Initial value</th>
<th>Privilege protection</th>
<th>Synopsis</th>
</tr>
</thead>
<tbody>
<tr>
<td>SR</td>
<td>32</td>
<td>See Table 5 for individual bits.</td>
<td>Yes</td>
<td>Status register</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td>Refer to Table 5: SR register description</td>
</tr>
<tr>
<td>SSR</td>
<td>32</td>
<td>Undefined</td>
<td>Yes</td>
<td>Saved status register</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td>The current contents of SR are saved to SSR in the event of an exception or interrupt.</td>
</tr>
<tr>
<td>SPC</td>
<td>32</td>
<td>Undefined</td>
<td>Yes</td>
<td>Saved program counter</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td>The address of an instruction at which an interrupt or exception occurs is saved to SPC.</td>
</tr>
<tr>
<td>GBR</td>
<td>32</td>
<td>Undefined</td>
<td>No</td>
<td>Global base register</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td>GBR is referenced as the base address in a GBR-referencing MOV instruction.</td>
</tr>
<tr>
<td>VBR</td>
<td>32</td>
<td>0x0000 0000</td>
<td>Yes</td>
<td>Vector base register</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td>VBR is referenced as the branch destination base address in the event of an exception or interrupt.</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td>For details, see Chapter 5: Exceptions.</td>
</tr>
<tr>
<td>SGR</td>
<td>32</td>
<td>Undefined</td>
<td>Yes</td>
<td>Saved general register</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td>The contents of R15 are saved to SGR in the event of an exception or interrupt.</td>
</tr>
<tr>
<td>DBR</td>
<td>32</td>
<td>Undefined</td>
<td>Yes</td>
<td>Debug base register</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td>When the user break debug function is enabled (BRCR.UBDE = 1), DBR is referenced as the user break handler branch destination address instead of VBR.</td>
</tr>
</tbody>
</table>

**Table 4: Control registers**
### SR Register Description

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>T</td>
<td>0</td>
<td>1</td>
<td>True/False condition or carry/borrow bit.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Refer to individual instruction descriptions, which affect the T bit.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
<tr>
<td>S</td>
<td>1</td>
<td>1</td>
<td>Specifies a saturation operation for a MAC instruction.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Refer to individual instruction descriptions, which affect the S bit.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
<tr>
<td>IMASK</td>
<td>[4,7]</td>
<td>4</td>
<td>Interrupt mask level.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>External interrupts of a lower level than IMASK are masked.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>Q</td>
<td>8</td>
<td>1</td>
<td>State for divide step.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Used by the DIV0S, DIV0U and DIV1 instructions.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
<tr>
<td>M</td>
<td>9</td>
<td>1</td>
<td>State for divide step.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Used by the DIV0S, DIV0U and DIV1 instructions.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
<tr>
<td>FD</td>
<td>15</td>
<td>1</td>
<td>FPU disable bit (cleared to 0 by a reset).</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>FD = 1: An FPU instruction causes a general FPU disable exception, and if the FPU instruction is in a delay slot, a slot FPU disable exception is generated.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>0</td>
<td></td>
</tr>
</tbody>
</table>

Table 5: SR register description
<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>BL</td>
<td>28</td>
<td>1</td>
<td>Exception/interrupt block bit (set to 1 by a reset, exception, or interrupt).</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>BL = 1: Interrupt requests are masked. If a general exception other than a user break occurs while BL = 1, the processor switches to the reset state.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>RB</td>
<td>29</td>
<td>1</td>
<td>General register bank specifier in privileged mode (set to 1 by a reset, exception or interrupt).</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>RB = 0: R0_BANK0-R7_BANK0 are accessed as general registers R0-R7. (R0_BANK1-R7_BANK1 can be accessed using LDC/STC R0_BANK-R7_BANK instructions.)</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>RB = 1: R0_BANK1-R7_BANK1 are accessed as general registers R0-R7. (R0_BANK0-R7_BANK0 can be accessed using LDC/STC R0_BANK-R7_BANK instructions.)</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>MD</td>
<td>30</td>
<td>1</td>
<td>Processor mode.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>MD = 0: User mode (some instructions cannot be executed, and some resources cannot be accessed).</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>MD = 1: Privileged mode.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>RES</td>
<td>[2,3], [10,14], [16,27], 31</td>
<td>20</td>
<td>Bits reserved</td>
<td>RW</td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
</tbody>
</table>

Table 5: SR register description
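
The SR layout in Table 5 maps directly onto shift-and-mask helpers. The sketch below is illustrative only: the macro and function names are assumptions, and it models the register as a plain 32-bit value rather than reading the real SR (which requires the STC instruction).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field positions from Table 5; the names are illustrative only. */
#define SR_T           (UINT32_C(1) << 0)
#define SR_S           (UINT32_C(1) << 1)
#define SR_IMASK_SHIFT 4
#define SR_IMASK_MASK  (UINT32_C(0xF) << SR_IMASK_SHIFT)
#define SR_Q           (UINT32_C(1) << 8)
#define SR_M           (UINT32_C(1) << 9)
#define SR_FD          (UINT32_C(1) << 15)
#define SR_BL          (UINT32_C(1) << 28)
#define SR_RB          (UINT32_C(1) << 29)
#define SR_MD          (UINT32_C(1) << 30)

/* Extract the 4-bit interrupt mask level. */
static unsigned sr_imask(uint32_t sr)
{
    return (unsigned)((sr & SR_IMASK_MASK) >> SR_IMASK_SHIFT);
}

/* Return a copy of sr with the interrupt mask level replaced. */
static uint32_t sr_with_imask(uint32_t sr, unsigned level)
{
    assert(level <= 15u);
    return (sr & ~SR_IMASK_MASK) | ((uint32_t)level << SR_IMASK_SHIFT);
}

/* MD = 1 selects privileged mode, MD = 0 user mode. */
static bool sr_is_privileged(uint32_t sr)
{
    return (sr & SR_MD) != 0;
}

/* FD = 1 means FPU instructions raise an FPU disable exception. */
static bool sr_fpu_disabled(uint32_t sr)
{
    return (sr & SR_FD) != 0;
}
```
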

### 2.4 Floating-point registers

Figure 3 shows the floating-point registers. There are thirty-two 32-bit floating-point registers, divided into two banks (FPR0_BANK0–FPR15_BANK0 and FPR0_BANK1–FPR15_BANK1). These 32 registers are referenced as FR0-FR15, DR0/2/4/6/8/10/12/14, FV0/4/8/12, XF0-XF15, XD0/2/4/6/8/10/12/14, or XMTRX. The correspondence between FPRn_BANKi and the reference name is determined by the FR bit in FPSCR.

- Floating-point registers, FPRn_BANKi (32 registers)
- Single-precision floating-point registers, FRi (16 registers)
  
  FPSCR.FR = 0 : FR0-FR15 are assigned to FPR0_BANK0-FPR15_BANK0.
  FPSCR.FR = 1 : FR0-FR15 are assigned to FPR0_BANK1-FPR15_BANK1.
- Double-precision floating-point registers or single-precision floating-point register pairs, DRi (8 registers): A DR register comprises two FR registers.
  
  DR0 = {FR0, FR1}, DR2 = {FR2, FR3}, DR4 = {FR4, FR5}, DR6 = {FR6, FR7},
  DR8 = {FR8, FR9}, DR10 = {FR10, FR11}, DR12 = {FR12, FR13},
  DR14 = {FR14, FR15}
- Single-precision floating-point vector registers, FVi (4 registers): An FV register comprises four FR registers
  
  FV0 = {FR0, FR1, FR2, FR3}, FV4 = {FR4, FR5, FR6, FR7},
  FV8 = {FR8, FR9, FR10, FR11}, FV12 = {FR12, FR13, FR14, FR15}
- Single-precision floating-point extended registers, XFi (16 registers)
  
  FPSCR.FR = 0 : XF0-XF15 are assigned to FPR0_BANK1-FPR15_BANK1.
  FPSCR.FR = 1 : XF0-XF15 are assigned to FPR0_BANK0-FPR15_BANK0.
- Single-precision floating-point extended register pairs, XDi (8 registers): An XD register comprises two XF registers
  
  XD0 = {XF0, XF1}, XD2 = {XF2, XF3}, XD4 = {XF4, XF5}, XD6 = {XF6, XF7},
  XD8 = {XF8, XF9}, XD10 = {XF10, XF11}, XD12 = {XF12, XF13},
  XD14 = {XF14, XF15}
- Single-precision floating-point extended register matrix, XMTRX: XMTRX comprises all 16 XF registers

XMTRX =

<table>
<tbody>
<tr>
<td>XF0</td>
<td>XF4</td>
<td>XF8</td>
<td>XF12</td>
</tr>
<tr>
<td>XF1</td>
<td>XF5</td>
<td>XF9</td>
<td>XF13</td>
</tr>
<tr>
<td>XF2</td>
<td>XF6</td>
<td>XF10</td>
<td>XF14</td>
</tr>
<tr>
<td>XF3</td>
<td>XF7</td>
<td>XF11</td>
<td>XF15</td>
</tr>
</tbody>
</table>

[Figure 3 shows how FPR0_BANK0-FPR15_BANK0 and FPR0_BANK1-FPR15_BANK1 are assigned to the FR, DR, FV, XF, XD and XMTRX names for FPSCR.FR = 0 and for FPSCR.FR = 1.]

*Figure 3: Floating-point registers*
Programming Note:

After a reset, the values of FPR0_BANK0–FPR15_BANK0 and FPR0_BANK1–FPR15_BANK1 are undefined.
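
The effect of FPSCR.FR on register naming can be modeled in a few lines of C. This is a sketch under stated assumptions: the `fp_regs` structure and the accessor names are hypothetical, and the registers are held as raw 32-bit values.

```c
#include <assert.h>
#include <stdint.h>

#define FPSCR_FR (UINT32_C(1) << 21)

/* fpr[bank][n] models FPRn_BANKbank. */
struct fp_regs {
    uint32_t fpr[2][16];
};

/* Bank that FR0-FR15 currently name: bank 0 when FR = 0, bank 1 when FR = 1. */
static unsigned fr_bank(uint32_t fpscr)
{
    return (fpscr & FPSCR_FR) ? 1u : 0u;
}

/* Pointer to the storage named FRn under the given FPSCR value. */
static uint32_t *fr_reg(struct fp_regs *regs, uint32_t fpscr, unsigned n)
{
    assert(n < 16u);
    return &regs->fpr[fr_bank(fpscr)][n];
}

/* Pointer to the storage named XFn: always the bank FRn does not use. */
static uint32_t *xf_reg(struct fp_regs *regs, uint32_t fpscr, unsigned n)
{
    assert(n < 16u);
    return &regs->fpr[fr_bank(fpscr) ^ 1u][n];
}
```

Toggling FR therefore swaps the FR and XF views without moving any data, which is the point of the banked arrangement.
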

### 2.5 Memory-mapped registers

Appendix A summarizes how the control registers are mapped into the address space. The control registers are double-mapped to the following two memory areas, so every register has two addresses.

0x1F00 0000-0x1FFF FFFF
0xFF00 0000-0xFFFF FFFF

These two areas are used as follows.

• 0x1F00 0000–0x1FFF FFFF

This area must be accessed in address translation mode using the TLB. Since the external memory area is defined as a 29-bit address space in the SH-4 CPU core architecture, the TLB's physical page numbers do not cover a 32-bit address space. In address translation, the page numbers of this area can be set in the corresponding field of the TLB by accessing a memory-mapped register. The page numbers of this area should be used as the actual page numbers set in the TLB. When address translation is not performed, the result of an access to this area is undefined.

• 0xFF00 0000–0xFFFF FFFF

Access to area 0xFF00 0000-0xFFFF FFFF in user mode will cause an address error. Memory-mapped registers can be referenced in user mode by means of access that involves address translation.

Note: Do not access undefined locations in either area. The operation of an access to an undefined location is undefined. Memory-mapped registers must be accessed using a load/store instruction of an equal size to that of the register. The operation of an access using an invalid data size is undefined.
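
Because each register appears at the same offset in both areas, converting between the two addresses is a matter of replacing the top byte. The helpers below are a sketch with hypothetical names; they only compute addresses and do not perform the access, which must still obey the TLB and privilege rules described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define P4_CTRL_BASE    UINT32_C(0xFF000000) /* 0xFF00 0000-0xFFFF FFFF */
#define AREA7_CTRL_BASE UINT32_C(0x1F000000) /* 0x1F00 0000-0x1FFF FFFF */
#define CTRL_AREA_MASK  UINT32_C(0x00FFFFFF) /* offset within either area */

static bool in_p4_control_area(uint32_t addr)
{
    return addr >= P4_CTRL_BASE;
}

static bool in_area7_control_area(uint32_t addr)
{
    return (addr & ~CTRL_AREA_MASK) == AREA7_CTRL_BASE;
}

/* Address of the same register in the 0x1F00 0000 area. */
static uint32_t p4_to_area7(uint32_t p4_addr)
{
    assert(in_p4_control_area(p4_addr));
    return AREA7_CTRL_BASE | (p4_addr & CTRL_AREA_MASK);
}

/* Address of the same register in the 0xFF00 0000 area. */
static uint32_t area7_to_p4(uint32_t area7_addr)
{
    assert(in_area7_control_area(area7_addr));
    return P4_CTRL_BASE | (area7_addr & CTRL_AREA_MASK);
}
```
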

### 2.6 Data format in registers

Register operands are always longwords (32 bits). When a memory operand is only a byte (8 bits) or a word (16 bits), it is sign-extended into a longword when loaded into a register.
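
The sign extension applied to byte and word loads can be reproduced in portable C. This is a minimal sketch with hypothetical function names; it models the resulting register value, not the load instruction itself.

```c
#include <assert.h>
#include <stdint.h>

static_assert(sizeof(int) >= 4, "the arithmetic below assumes int is at least 32 bits");

/* Value a register holds after loading the byte b from memory. */
static int32_t sign_extend_byte(uint8_t b)
{
    /* Flip the sign bit, then subtract its weight: maps 0x80..0xFF to -128..-1. */
    return (int32_t)(((int32_t)b ^ 0x80) - 0x80);
}

/* Value a register holds after loading the word w from memory. */
static int32_t sign_extend_word(uint16_t w)
{
    return (int32_t)(((int32_t)w ^ 0x8000) - 0x8000);
}
```
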

### 2.7 Data formats in memory

Memory can be accessed in 8-bit byte, 16-bit word, or 32-bit longword form. A memory operand less than 32 bits in length is sign-extended before being loaded into a register.

A word operand must be accessed starting from a word boundary (even address of a 2-byte unit: address 2n), and a longword operand starting from a longword boundary (even address of a 4-byte unit: address 4n). An address error will result if this rule is not observed. A byte operand can be accessed from any address.
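
The alignment rule reduces to a mask test on the low address bits. The following sketch uses a hypothetical helper name and takes the operand size in bytes (1, 2 or 4).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if an access of size_bytes at addr satisfies the boundary rule. */
static bool access_is_aligned(uint32_t addr, unsigned size_bytes)
{
    assert(size_bytes == 1u || size_bytes == 2u || size_bytes == 4u);
    /* For a power-of-two size, the low log2(size) address bits must be zero. */
    return (addr & (size_bytes - 1u)) == 0u;
}
```

An access for which this returns false is one that would raise an address error on the CPU.
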

Big endian or little endian byte order can be selected for the data format. This endian selection cannot be changed dynamically and is selected by the system during power-on reset. Refer to the system architecture manual of the relevant product for details of how to perform endian selection. Bit positions are numbered left to right from most-significant to least-significant. Thus, in a 32-bit longword, the left-most bit, bit 31, is the most significant bit and the right-most bit, bit 0, is the least significant bit.

The data format in memory is shown in Figure 4.

*Figure 4: Data formats in memory*

Note: The SH-4 CPU core does not support endian conversion for the 64-bit data format. Therefore, if double-precision floating-point format (64-bit) access is performed in little endian mode, the upper and lower 32 bits will be reversed.
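
The reversal described in this note can be illustrated on a host machine. The sketch below is an assumption-laden model, not a description of how any particular toolchain compensates: the function names are hypothetical, and it assumes the host uses the IEEE 754 double format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits wide");

/* Exchange the upper and lower 32 bits of a 64-bit value. */
static uint64_t swap_longword_halves(uint64_t v)
{
    return (v << 32) | (v >> 32);
}

/* Recover a double from a 64-bit pattern whose two longwords arrived reversed. */
static double double_from_reversed(uint64_t raw)
{
    uint64_t bits = swap_longword_halves(raw);
    double d;
    memcpy(&d, &bits, sizeof d); /* reinterpret the bit pattern without aliasing issues */
    return d;
}
```
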

### 2.8 Processor states

The SH-4 CPU core has four processor states. Transitions between the states are shown in Figure 5.

#### 2.8.1 Reset state

In this state the CPU is reset. The CPU can be placed in one of two reset states: power-on reset or manual reset. The system architecture determines which is selected; refer to the relevant system architecture manual for details. For more information on resets, see Chapter 5: Exceptions.

The purpose of having two reset modes is to allow some flexibility over which system components are reset. Typically:

- power-on reset will cause all system components to be reset,
- manual reset may, for example, avoid resetting DRAM controllers so that memory contents are preserved.

#### 2.8.2 Exception-handling state

This is a transient state during which the CPU’s processor state flow is altered by a reset, general exception, or interrupt exception source.

In the case of a reset, the CPU branches to address 0xA000 0000 and starts executing the user-coded exception handling program.

In the case of a general exception or interrupt, the program counter (PC) contents are saved in the saved program counter (SPC), the status register (SR) contents are saved in the saved status register (SSR), and the R15 contents are saved in saved general register (SGR). The CPU branches to the start address of the user-coded exception service routine, found from the sum of the contents of the vector base address and the vector offset.

See Chapter 5: Exceptions, for more information on resets, general exceptions, and interrupts.
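
The register transfers performed on entry to the exception-handling state can be summarized as a small software model. This is a sketch only: the `cpu_state` structure and function names are hypothetical, the vector offset is passed in as a parameter because its values are defined elsewhere, and the setting of MD, RB and BL follows the SR description in Table 5.

```c
#include <assert.h>
#include <stdint.h>

#define SR_BL        (UINT32_C(1) << 28)
#define SR_RB        (UINT32_C(1) << 29)
#define SR_MD        (UINT32_C(1) << 30)
#define RESET_VECTOR UINT32_C(0xA0000000)

static_assert((SR_MD | SR_RB | SR_BL) == UINT32_C(0x70000000),
              "MD, RB and BL are bits 30, 29 and 28");

/* Hypothetical model of the registers involved in exception entry. */
struct cpu_state {
    uint32_t pc, sr, r15, vbr;
    uint32_t spc, ssr, sgr;
};

/* Reset: enter privileged mode with exceptions blocked and branch to the reset address. */
static void enter_reset(struct cpu_state *cpu)
{
    cpu->sr |= SR_MD | SR_RB | SR_BL;
    cpu->pc = RESET_VECTOR;
}

/* General exception or interrupt: save state, then branch to VBR + offset. */
static void enter_exception(struct cpu_state *cpu, uint32_t vector_offset)
{
    cpu->spc = cpu->pc;   /* PC  -> SPC */
    cpu->ssr = cpu->sr;   /* SR  -> SSR */
    cpu->sgr = cpu->r15;  /* R15 -> SGR */
    cpu->sr |= SR_MD | SR_RB | SR_BL;
    cpu->pc = cpu->vbr + vector_offset;
}
```
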

#### 2.8.3 Program execution state

In this state the CPU executes program instructions in sequence.

#### 2.8.4 Power-down state

The power-down state is entered by executing a SLEEP instruction. In this state the
CPU stops executing instructions and signals to the system that the CPU has been
put to sleep. The system response to receiving this signal is described in the System
Architecture Manual of the appropriate product.

The CPU is restarted by raising an interrupt.

Note: For conditions determining state transitions, see the System Architecture Manual.

### 2.9 Processor modes

There are two processor modes: user mode and privileged mode. The processor mode is determined by the processor mode bit (MD) in the status register (SR). User mode is selected when the MD bit is cleared to 0, and privileged mode when the MD bit is set to 1. When the reset state or exception-handling state is entered, the MD bit is set to 1. When exception handling ends, the MD bit returns to the value held before the exception occurred.

## Memory management unit (MMU)

### 3.1 Overview

The SH-4 CPU core manages a 29-bit external memory space by providing 8-bit address space identifiers, and a 32-bit logical (virtual) address space. Address translation from virtual address to physical address is performed using the memory management unit (MMU), built into the SH-4 CPU core. The MMU performs high-speed address translation by caching user-created address translation table information, in an address translation buffer (translation lookaside buffer: TLB). The SH-4 has four instruction TLB (ITLB) entries and 64 unified TLB (UTLB) entries. UTLB copies are stored in the ITLB by hardware. It is possible to set the virtual address space access right, and implement storage protection independently, for privileged mode and user mode.

### 3.2 Role of the MMU

The main purpose of an MMU is to ensure that efficient use is made of physical memory, which in most systems is a limiting resource. The MMU is normally managed by the OS, which allocates physical pages of memory to virtual pages of memory, as required by a task. Pages which are switched out by the OS are placed in a secondary storage device, such as a hard disk.

A page is a contiguous range of addresses that can all be translated by a single translation table entry. The SH-4 supports four page sizes: 1-kbyte, 4-kbyte, 64-kbyte and 1-Mbyte.

Memory protection functions are provided to prevent physical memory from inadvertently being accessed and reset by a process.

Although the functions of the MMU could be implemented by software alone, having address translation performed by software each time a process accesses physical memory would be very inefficient. For this reason, a buffer for address translation (the TLB) is provided in hardware, and frequently used address translation information is placed here. The TLB can be described as a cache for address translation information. However, unlike a cache, if address translation fails (that is, if an exception occurs), switching of the address translation information is normally performed by software. Thus memory management can be performed in a flexible manner by software.
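
The idea of the TLB as a cache of translations, with software handling misses, can be shown with a toy lookup routine. This is a simplified sketch and not the SH-4 hardware algorithm: the structure and function names are hypothetical, ASID matching and protection checks are omitted, and a miss is reported as a return value rather than an exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UTLB_ENTRIES 64 /* UTLB entry count given in the overview */

/* One cached translation (hypothetical layout). */
struct tlb_entry {
    uint32_t vpage_base; /* virtual address of the first byte of the page */
    uint32_t ppage_base; /* physical address of the first byte of the page */
    uint32_t page_bytes; /* 1024, 4096, 65536 or 1048576 */
    bool valid;
};

/* Returns true and writes *paddr on a hit; returns false on a miss,
 * in which case software would load a new entry and retry. */
static bool tlb_translate(const struct tlb_entry *tlb, size_t count,
                          uint32_t vaddr, uint32_t *paddr)
{
    assert(count <= UTLB_ENTRIES);
    for (size_t i = 0; i < count; i++) {
        uint32_t offset_mask = tlb[i].page_bytes - 1u;
        if (tlb[i].valid && (vaddr & ~offset_mask) == tlb[i].vpage_base) {
            *paddr = tlb[i].ppage_base | (vaddr & offset_mask);
            return true;
        }
    }
    return false;
}
```
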

### 3.3 Register descriptions

There are six MMU-related registers.

<table>
<thead>
<tr>
<th>Name</th>
<th>Abbreviation</th>
<th>R/W</th>
<th>Initial value\(^a\)</th>
<th>P4 address\(^b\)</th>
<th>Area 7 address\(^b\)</th>
<th>Access size</th>
</tr>
</thead>
<tbody>
<tr>
<td>Page table entry high register</td>
<td>PTEH</td>
<td>R/W</td>
<td>Undefined</td>
<td>0xFF00 0000</td>
<td>0x1F00 0000</td>
<td>32</td>
</tr>
<tr>
<td>Page table entry low register</td>
<td>PTEL</td>
<td>R/W</td>
<td>Undefined</td>
<td>0xFF00 0004</td>
<td>0x1F00 0004</td>
<td>32</td>
</tr>
<tr>
<td>Translation table base register</td>
<td>TTB</td>
<td>R/W</td>
<td>Undefined</td>
<td>0xFF00 0008</td>
<td>0x1F00 0008</td>
<td>32</td>
</tr>
<tr>
<td>Translation table address register</td>
<td>TEA</td>
<td>R/W</td>
<td>Undefined</td>
<td>0xFF00 000C</td>
<td>0x1F00 000C</td>
<td>32</td>
</tr>
<tr>
<td>MMU control register</td>
<td>MMUCR</td>
<td>R/W</td>
<td>0x0000 0000</td>
<td>0xFF00 0010</td>
<td>0x1F00 0010</td>
<td>32</td>
</tr>
</tbody>
</table>

**Table 6: MMU registers**

\(^a\) The initial value is the value after a power-on reset or manual reset.

\(^b\) This is the address when using the virtual/physical address space P4 region. When making an access from physical address space Area 7 using the TLB, the upper 3 bits of the address are ignored.

**Note:** Behavior is undefined if an area designated as a reserved area in this manual is accessed.

#### 3.3.1 Page table entry high register (PTEH)

Longword access to PTEH can be performed from 0xFF00 0000 in the P4 region and 0x1F00 0000 in Area 7. When an MMU exception or address error exception occurs, the VPN of the virtual address at which the exception occurred is set in the VPN field by hardware. The VPN varies according to the page size, but the VPN set by hardware when an exception occurs always consists of the upper 22 bits of the virtual address that caused the exception. The VPN can also be set by software. The number of the currently executing process is set in the ASID field by software; ASID is not updated by hardware. VPN and ASID are recorded in the UTLB by means of the LDTLB instruction.

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>ASID</td>
<td>[0,7]</td>
<td>8</td>
<td>Address space identifier.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Indicates the process that can access a virtual page. In single virtual memory mode and user mode, or in multiple virtual memory mode, if the SH bit is 0, this identifier is compared with the ASID in PTEH when address comparison is performed. See section 3.3.7 Address space identifier.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
<tr>
<td>VPN</td>
<td>[10,31]</td>
<td>22</td>
<td>Virtual page number.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>For 1-kbyte: upper 22 bits of virtual address.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>For 4-kbyte: upper 20 bits of virtual address.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>For 64-kbyte: upper 16 bits of virtual address.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>For 1-Mbyte: upper 12 bits of virtual address.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
</tbody>
</table>

Table 7: PTEH register description
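
The PTEH layout and the page-size-dependent VPN widths in Table 7 can be expressed as follows. This is a sketch for illustration: the enum, macros and function names are assumptions, and the code only builds and decodes values rather than accessing the register.

```c
#include <assert.h>
#include <stdint.h>

#define PTEH_ASID_MASK UINT32_C(0x000000FF) /* bits 0 to 7 */
#define PTEH_VPN_MASK  UINT32_C(0xFFFFFC00) /* bits 10 to 31 */

enum page_size { PAGE_1K, PAGE_4K, PAGE_64K, PAGE_1M };

/* PTEH value for a virtual address and ASID (22-bit VPN, as set by hardware). */
static uint32_t pteh_make(uint32_t vaddr, uint8_t asid)
{
    return (vaddr & PTEH_VPN_MASK) | asid;
}

static uint8_t pteh_asid(uint32_t pteh)
{
    return (uint8_t)(pteh & PTEH_ASID_MASK);
}

/* Number of upper virtual-address bits that form the VPN for each page size. */
static unsigned vpn_bits(enum page_size size)
{
    switch (size) {
    case PAGE_1K:  return 22u;
    case PAGE_4K:  return 20u;
    case PAGE_64K: return 16u;
    case PAGE_1M:  return 12u;
    }
    assert(!"unknown page size");
    return 0u;
}

/* VPN of vaddr for the given page size, right-aligned. */
static uint32_t vpn_for_page_size(uint32_t vaddr, enum page_size size)
{
    return vaddr >> (32u - vpn_bits(size));
}
```
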

#### 3.3.2 Page table entry low register (PTEL)

Longword access to PTEL can be performed from 0xFF00 0004 in the P4 region, and 0x1F00 0004 in Area 7. PTEL is used to hold the physical page number and page management information to be recorded in the UTLB, by means of the LDTLB instruction. The contents of this register are not changed unless a software directive is issued.

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>WT</td>
<td>0</td>
<td>1</td>
<td>Write-through bit.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td>Specifies the cache write mode.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>0: Copy-back mode.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>1: Write-through mode.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td>Undefined</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>SH</td>
<td>1</td>
<td>1</td>
<td>Share status bit.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td>0: pages are not shared by processes.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>1: pages are shared by processes.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td>Undefined</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>D</td>
<td>2</td>
<td>1</td>
<td>Dirty bit</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td>Indicates whether a write has been performed to a page.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>0: Write has not been performed.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>1: Write has been performed.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td>Undefined</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>C</td>
<td>3</td>
<td>1</td>
<td>Cacheability bit.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td>Indicates whether a page is cacheable.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>0: Not cacheable.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>1: Cacheable.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>When control register is mapped, this bit must be cleared to 0.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td>Undefined</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Table 8: PTEL register description
### PTEL

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>SZ0</td>
<td>4</td>
<td>1</td>
<td>Page size bit.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Specifies the page size, together with SZ1.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>SZ1 = 0, SZ0 = 0: 1-kbyte page.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>SZ1 = 0, SZ0 = 1: 4-kbyte page.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>SZ1 = 1, SZ0 = 0: 64-kbyte page.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>SZ1 = 1, SZ0 = 1: 1-Mbyte page.</td>
<td></td>
</tr>
<tr>
<td>PR</td>
<td>[5,6]</td>
<td>2</td>
<td>Protection key data.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>2-bit data expressing the page access right as a code.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>00: Can be read only in privileged mode.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>01: Can be read and written in privileged mode.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>10: Can be read only, in privileged or user mode.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>11: Can be read and written in privileged or user mode.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
<tr>
<td>SZ1</td>
<td>7</td>
<td>1</td>
<td>Page size bit</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Refer to SZ0 for operation details.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>0</td>
<td></td>
</tr>
</tbody>
</table>

**Table 8: PTEL register description**

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>V</td>
<td>8</td>
<td>1</td>
<td>Validity bit.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Indicates whether the entry is valid.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0: Invalid</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>1: Valid</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Cleared to 0 by a power-on reset.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Not affected by a manual reset.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset: Undefined</td>
<td></td>
</tr>
<tr>
<td>PPN</td>
<td>[10,28]</td>
<td>19</td>
<td>Physical page number</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Upper 19 bits of the physical address.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>With a 1-kbyte page, PPN bits [28:10] are valid.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>With a 4-kbyte page, PPN bits [28:12] are valid.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>With a 64-kbyte page, PPN bits [28:16] are valid.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>With a 1-Mbyte page, PPN bits [28:20] are valid.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>The synonym problem must be taken into account when setting the PPN (Section 3.6.5: Avoiding synonym problems on page 64).</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset: Undefined</td>
<td></td>
</tr>
<tr>
<td>RES</td>
<td>9, [29,31]</td>
<td>4</td>
<td>Bits reserved</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset: Undefined</td>
<td></td>
</tr>
</tbody>
</table>

*Table 8: PTEL register description*
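
The PTEL fields in Table 8 can be packed and decoded with a few masks. The sketch below is illustrative: all names are assumptions, the page size is passed as the 2-bit code formed by SZ1 and SZ0, and no check is made that the low PPN bits are zero for the larger page sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions from Table 8; the names are illustrative only. */
#define PTEL_WT        (UINT32_C(1) << 0)
#define PTEL_SH        (UINT32_C(1) << 1)
#define PTEL_D         (UINT32_C(1) << 2)
#define PTEL_C         (UINT32_C(1) << 3)
#define PTEL_SZ0       (UINT32_C(1) << 4)
#define PTEL_PR_SHIFT  5
#define PTEL_SZ1       (UINT32_C(1) << 7)
#define PTEL_V         (UINT32_C(1) << 8)
#define PTEL_PPN_MASK  UINT32_C(0x1FFFFC00) /* bits 10 to 28 */
#define PTEL_FLAG_BITS (PTEL_WT | PTEL_SH | PTEL_D | PTEL_C | PTEL_V)

/* Build a PTEL value. size_code is (SZ1 << 1) | SZ0; pr is the 2-bit protection key. */
static uint32_t ptel_make(uint32_t paddr, unsigned size_code, unsigned pr, uint32_t flags)
{
    assert(size_code <= 3u && pr <= 3u);
    assert((flags & ~PTEL_FLAG_BITS) == 0u);
    uint32_t value = (paddr & PTEL_PPN_MASK) | flags | ((uint32_t)pr << PTEL_PR_SHIFT);
    if (size_code & 1u)
        value |= PTEL_SZ0;
    if (size_code & 2u)
        value |= PTEL_SZ1;
    return value;
}

/* Page size in bytes selected by the SZ1 and SZ0 bits. */
static uint32_t ptel_page_bytes(uint32_t ptel)
{
    static const uint32_t bytes[4] = { 1024u, 4096u, 65536u, 1048576u };
    unsigned code = ((ptel & PTEL_SZ1) ? 2u : 0u) | ((ptel & PTEL_SZ0) ? 1u : 0u);
    return bytes[code];
}
```
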

#### 3.3.3 Translation table base register (TTB)

Longword access to the TTB can be performed from 0xFF00 0008 in the P4 region and 0x1F00 0008 in Area 7. The contents of the TTB are not changed unless a software directive is issued. This register can be freely used by software.

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>TTB</td>
<td>[0,31]</td>
<td>32</td>
<td>Translation table base register.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>TTB is used, for example, to hold the base address of the currently used page table.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
</tbody>
</table>

Table 9: TTB register description

#### 3.3.4 TLB exception address register (TEA)

Longword access to TEA can be performed from 0xFF00 000C in the P4 region and 0x1F00 000C in Area 7. The contents of this register can be changed by software.

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>TEA</td>
<td>[0,31]</td>
<td>32</td>
<td>TLB exception address register.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>After an MMU exception or address error exception occurs, the virtual address at which the exception occurred is set in TEA by hardware.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
</tbody>
</table>

Table 10: TEA register description

3.3.5 MMU control register (MMUCR)

Longword access to MMUCR can be performed from 0xFF00 0010 in the P4 region, and 0x1F00 0010 in Area 7. Because the individual bits control the MMU settings shown below, MMUCR should be rewritten by a program in the P1 or P2 region. After MMUCR is updated, an instruction that performs data access to the P0, P3, U0, or store queue region should be located at least four instructions after the MMUCR update instruction. Also, a branch instruction to the P0, P3, or U0 region should be located at least eight instructions after the MMUCR update instruction. MMUCR contents can be changed by software. The LRUI bits and URC bits may also be updated by hardware.

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>AT</td>
<td>0</td>
<td>1</td>
<td>Address translation bit.</td>
<td>RW</td>
</tr>
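<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Specifies MMU enabling or disabling.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0: MMU disabled.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>1: MMU enabled.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>MMU exceptions are not generated when the AT bit is 0. Therefore, in the case of software that does not use the MMU, the AT bit should be cleared to 0.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset</td>
<td>0</td>
</tr>
<tr>
<td>TI</td>
<td>2</td>
<td>1</td>
<td>TLB invalidate.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Writing 1 to this bit invalidates (clears to 0) all valid UTLB/ITLB bits. This bit always returns 0 when read.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset</td>
<td>0</td>
</tr>
<tr>
<td>SV</td>
<td>8</td>
<td>1</td>
<td>Single virtual mode bit.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Bit that switches between single virtual memory mode and multiple virtual memory mode.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0: Multiple virtual memory mode.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>1: Single virtual memory mode.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>When this bit is changed, ensure that 1 is also written to the TI bit.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset</td>
<td>0</td>
</tr>
</tbody>
</table>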

Table 11: MMUCR register description

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>SQMD</td>
<td>9</td>
<td>1</td>
<td>Store queue mode bit.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Specifies the right of access to the store queues.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0: User/privileged access possible.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>1: Privileged access possible (address error exception in case of user access).</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset</td>
<td>0</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>URC</th>
<th>[10,15]</th>
<th>6</th>
<th>UTLB replace counter.</th>
<th>RW</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Random counter for indicating the UTLB entry for which replacement is to be performed with an LDTLB instruction. URC is incremented each time the UTLB is accessed. When URB &gt; 0, URC is reset to 0 when the condition URC = URB occurs. Also note that, if a value is written to URC by software which results in the condition URC &gt; URB, incrementing is first performed in excess of URB until URC = 0x3F. URC is not incremented by an LDTLB instruction.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset</td>
<td>0</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>URB</th>
<th>[18,23]</th>
<th>6</th>
<th>UTLB replace boundary.</th>
<th>RW</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td></td>
<td>Operation</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Bits that indicate the UTLB entry boundary at which replacement is to be performed. Valid only when URB &gt; 0.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Power-on reset</td>
<td>0</td>
</tr>
</tbody>
</table>
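
The URC and URB rules above can be read as a small counter model. The following C sketch is one interpretation of that text, not a description of the hardware implementation: the function name `urc_after_access` is hypothetical, and the exact cycle in which hardware performs the reset to 0 is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define URC_MAX 0x3Fu /* URC and URB are 6-bit fields */

/*
 * Value of URC after one UTLB access (illustrative model only).
 * - URC is incremented on each UTLB access.
 * - When URB > 0, URC is reset to 0 when URC == URB.
 * - If software wrote URC > URB, URC keeps counting up to 0x3F
 *   and then wraps to 0 as a 6-bit counter.
 */
static inline uint32_t urc_after_access(uint32_t urc, uint32_t urb)
{
    assert(urc <= URC_MAX && urb <= URC_MAX);
    urc = (urc + 1u) & URC_MAX;
    if (urb > 0u && urc == urb) {
        urc = 0u; /* boundary reached */
    }
    return urc;
}
```

Under this reading, a non-zero URB confines the counter to entries 0 to URB - 1, leaving entries from URB upward out of the random replacement sequence.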

The LRU (least recently used) method is used to decide the ITLB entry to be replaced in the event of an ITLB miss. The entry to be purged from the ITLB can be confirmed using the LRUI bits. LRUI is updated by means of the algorithm shown below. A dash in this table means that updating is not performed.

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>LRUI</td>
<td>[26, 31]</td>
<td>6</td>
<td>Least recently used ITLB.</td>
<td>RW</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Operation</th>
<th>[5]</th>
<th>[4]</th>
<th>[3]</th>
<th>[2]</th>
<th>[1]</th>
<th>[0]</th>
</tr>
</thead>
<tbody>
<tr>
<td>When ITLB entry 0 is used</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>When ITLB entry 1 is used</td>
<td>1</td>
<td>-</td>
<td>-</td>
<td>0</td>
<td>0</td>
<td>-</td>
</tr>
<tr>
<td>When ITLB entry 2 is used</td>
<td>-</td>
<td>1</td>
<td>-</td>
<td>1</td>
<td>-</td>
<td>0</td>
</tr>
<tr>
<td>When ITLB entry 3 is used</td>
<td>-</td>
<td>-</td>
<td>1</td>
<td>-</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>Other than the above</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
</tbody>
</table>

When the LRUI bit settings are as shown below, the corresponding ITLB entry is updated by an ITLB miss. An asterisk in this table means "don't care".

<table>
<thead>
<tr>
<th>Entry updated on ITLB miss</th>
<th>[5]</th>
<th>[4]</th>
<th>[3]</th>
<th>[2]</th>
<th>[1]</th>
<th>[0]</th>
</tr>
</thead>
<tbody>
<tr>
<td>ITLB entry 0 is updated</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>*</td>
<td>*</td>
<td>*</td>
</tr>
<tr>
<td>ITLB entry 1 is updated</td>
<td>0</td>
<td>*</td>
<td>*</td>
<td>1</td>
<td>1</td>
<td>*</td>
</tr>
<tr>
<td>ITLB entry 2 is updated</td>
<td>*</td>
<td>0</td>
<td>*</td>
<td>0</td>
<td>*</td>
<td>1</td>
</tr>
<tr>
<td>ITLB entry 3 is updated</td>
<td>*</td>
<td>*</td>
<td>0</td>
<td>*</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>Other than the above</td>
<td colspan="6">Setting prohibited</td>
</tr>
</tbody>
</table>
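
As an illustration of how the two LRUI tables fit together, the following C sketch models the update on use and the choice of the entry to replace. It is a software model written for this explanation, not the hardware logic; the function names are hypothetical. The replacement patterns are taken as the complements of the update patterns (entry 0 is replaced when bits [5:3] are 111, entry 1 when [5] = 0 and [2:1] = 11, entry 2 when [4] = 0, [2] = 0 and [0] = 1, entry 3 when [3] = 0 and [1:0] = 00).

```c
#include <assert.h>

/* LRUI value after ITLB entry `entry` (0..3) is used; untouched bits keep their value. */
static inline unsigned lrui_after_use(unsigned lrui, unsigned entry)
{
    assert(entry < 4u);
    switch (entry) {
    case 0:  lrui &= ~0x38u;                 break; /* [5:3] <- 000 */
    case 1:  lrui = (lrui | 0x20u) & ~0x06u; break; /* [5] <- 1, [2:1] <- 00 */
    case 2:  lrui = (lrui | 0x14u) & ~0x01u; break; /* [4], [2] <- 1, [0] <- 0 */
    default: lrui |= 0x0Bu;                  break; /* [3], [1], [0] <- 1 */
    }
    return lrui & 0x3Fu;
}

/* ITLB entry that an ITLB miss would update, or -1 for a prohibited setting. */
static inline int lrui_victim(unsigned lrui)
{
    if ((lrui & 0x38u) == 0x38u) return 0;
    if ((lrui & 0x26u) == 0x06u) return 1;
    if ((lrui & 0x15u) == 0x01u) return 2;
    if ((lrui & 0x0Bu) == 0x00u) return 3;
    return -1;
}
```

Starting from the reset value 0 and always using the entry that was just replaced, this model cycles through entries 3, 2, 1, 0 and returns to LRUI = 0.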

Ensure that software does not set values marked "Setting prohibited" in the above table. After a power-on or manual reset the LRUI bits are initialized to 0, and therefore a prohibited setting is never made by a hardware update.

LRUI power-on reset value: 0
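
For reference, the MMUCR field positions listed in Table 11 can be collected into C definitions. This is a minimal sketch: all macro and function names are hypothetical, and the helpers only compute register values; they do not access the register.

```c
#include <assert.h>
#include <stdint.h>

#define MMUCR_P4_ADDR    0xFF000010u /* MMUCR address in the P4 region */

#define MMUCR_AT         (1u << 0)   /* address translation enable */
#define MMUCR_TI         (1u << 2)   /* TLB invalidate, reads as 0 */
#define MMUCR_SV         (1u << 8)   /* single virtual memory mode */
#define MMUCR_SQMD       (1u << 9)   /* store queue access right */
#define MMUCR_URC_SHIFT  10          /* URC:  bits 10 to 15 */
#define MMUCR_URB_SHIFT  18          /* URB:  bits 18 to 23 */
#define MMUCR_LRUI_SHIFT 26          /* LRUI: bits 26 to 31 */
#define MMUCR_FIELD6     0x3Fu       /* mask for the 6-bit fields */

static inline uint32_t mmucr_urc(uint32_t v)  { return (v >> MMUCR_URC_SHIFT)  & MMUCR_FIELD6; }
static inline uint32_t mmucr_urb(uint32_t v)  { return (v >> MMUCR_URB_SHIFT)  & MMUCR_FIELD6; }
static inline uint32_t mmucr_lrui(uint32_t v) { return (v >> MMUCR_LRUI_SHIFT) & MMUCR_FIELD6; }

/*
 * MMUCR value that selects single (sv != 0) or multiple (sv == 0) virtual
 * memory mode. TI is also set, because 1 should be written to TI whenever
 * SV is changed.
 */
static inline uint32_t mmucr_with_sv(uint32_t v, int sv)
{
    v = sv ? (v | MMUCR_SV) : (v & ~MMUCR_SV);
    return v | MMUCR_TI;
}
```

On a real device the resulting value would be stored to MMUCR by code running in the P1 or P2 region, observing the instruction spacing rules given at the start of this section.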

### 3.4 Address space

#### 3.4.1 Physical address space

The SH-4 CPU core supports a 32-bit (4-Gbyte) physical address space. When the MMUCR.AT bit is cleared to 0 and the MMU is disabled, the address space accessed by the program is this physical address space. The physical address space is divided into a number of regions, as shown in Figure 7. The region is selected using the top 3 bits of the physical address.

<table>
<thead>
<tr>
<th>Bits [31:29]</th>
<th>Region accessed (privileged mode)</th>
<th>Region accessed (user mode)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0xx</td>
<td>P0</td>
<td>U0</td>
</tr>
<tr>
<td>100</td>
<td>P1</td>
<td>Address error</td>
</tr>
<tr>
<td>101</td>
<td>P2</td>
<td>Address error</td>
</tr>
<tr>
<td>110</td>
<td>P3</td>
<td>Address error</td>
</tr>
<tr>
<td>111</td>
<td>P4</td>
<td>Address error (see note)</td>
</tr>
</tbody>
</table>

*Table 12: Region selection*

Note: Except for addresses 0xE000 0000 to 0xE3FF FFFF, which the user can use to access the store queues.

The region selected determines how the remaining 29 bits are interpreted. For example, P0, P1, and P3 all access the 29-bit external memory space via the cache. P4 is used exclusively to access the core's internal devices. See the system architecture manual for more details of the internal devices available on a particular product.
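
The region decode can be sketched in C as follows. The enum and function names are hypothetical; the bit patterns follow from the region start addresses shown in Figure 7.

```c
#include <assert.h>
#include <stdint.h>

enum sh4_region { SH4_P0_U0, SH4_P1, SH4_P2, SH4_P3, SH4_P4 };

/* Select the region from the top 3 bits of a 32-bit address. */
static inline enum sh4_region sh4_region_of(uint32_t addr)
{
    uint32_t top = addr >> 29;
    if ((top & 4u) == 0u) {
        return SH4_P0_U0;            /* bit 31 = 0: P0 (privileged) or U0 (user) */
    }
    switch (top) {
    case 4u: return SH4_P1;          /* 100 */
    case 5u: return SH4_P2;          /* 101 */
    case 6u: return SH4_P3;          /* 110 */
    default: return SH4_P4;          /* 111 */
    }
}

/* The remaining 29 bits, obtained by zeroing the upper 3 bits. */
static inline uint32_t sh4_low29(uint32_t addr)
{
    return addr & 0x1FFFFFFFu;
}
```

For example, `sh4_low29(0xFF000010)` gives 0x1F00 0010, which matches the Area 7 address quoted for MMUCR.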

### 3.4.2 External memory space

The SH-4 CPU core supports a 29-bit external memory space. The external memory space is divided into eight Areas as shown in Figure 7. Areas 0 to 6 relate to memory; Area 7 is a reserved area and is only accessed via the P4 region.

![Figure 6: External memory Space](image)
**P0, P1, P3, U0 Regions:** The P0, P1, P3, and U0 regions can be accessed using the cache. Whether or not the cache is used is determined by the cache control register (CCR). When the cache is used, with the exception of the P1 region, switching between the copy-back method and the write-through method for write accesses is specified by the CCR.WT bit. For the P1 region, switching is specified by the CCR.CB bit. Zeroing the upper 3 bits of an address in these regions gives the corresponding external memory space address. However, since Area 7 in the external memory space is a reserved Area, a reserved area also appears in these regions.

**P2 Region:** The P2 region cannot be accessed using the cache. In the P2 region, zeroing the upper 3 bits of an address gives the corresponding external memory space address. However, since Area 7 in the external memory space is a reserved Area, a reserved area also appears in this region.

**P4 Region:** The P4 region is mapped onto SH-4 CPU core on-chip I/O channels. This region cannot be accessed using the cache. The P4 region is shown in detail in Table 13.

---

**Figure 7: Physical address space (MMUCR.AT = 0)**

<table>
<thead>
<tr>
<th>Address range</th>
<th>Privileged mode</th>
<th>User mode</th>
</tr>
</thead>
<tbody>
<tr>
<td>0x0000 0000 to 0x7FFF FFFF</td>
<td>P0 region (cacheable)</td>
<td>U0 region (cacheable)</td>
</tr>
<tr>
<td>0x8000 0000 to 0x9FFF FFFF</td>
<td>P1 region (cacheable)</td>
<td>Address error</td>
</tr>
<tr>
<td>0xA000 0000 to 0xBFFF FFFF</td>
<td>P2 region (non-cacheable)</td>
<td>Address error</td>
</tr>
<tr>
<td>0xC000 0000 to 0xDFFF FFFF</td>
<td>P3 region (cacheable)</td>
<td>Address error</td>
</tr>
<tr>
<td>0xE000 0000 to 0xE3FF FFFF</td>
<td>P4 region (non-cacheable)</td>
<td>Store queue region</td>
</tr>
<tr>
<td>0xE400 0000 to 0xFFFF FFFF</td>
<td>P4 region (non-cacheable)</td>
<td>Address error</td>
</tr>
</tbody>
</table>

External memory space: Area 0 to Area 7 *

* Area 7 is reserved
### Table 13: P4 area

<table>
<thead>
<tr>
<th>Start address</th>
<th>End address</th>
<th>Function</th>
</tr>
</thead>
<tbody>
<tr>
<td>0xE000 0000</td>
<td>0xE3FF FFFF</td>
<td>Comprises addresses for accessing the store queues (SQs). When the MMU is disabled (MMUCR.AT=0), the SQ access right is specified by the MMUCR.SQMD bit. For details, see Section 4.6: Store queues on page 101.</td>
</tr>
<tr>
<td>0xF000 0000</td>
<td>0xF0FF FFFF</td>
<td>Used for direct access to the instruction cache address array. For details, see Section 4.5.1: IC address array on page 95.</td>
</tr>
<tr>
<td>0xF100 0000</td>
<td>0xF1FF FFFF</td>
<td>Used for direct access to the instruction cache data array. For details, see Section 4.5.4: IC data array on page 97.</td>
</tr>
<tr>
<td>0xF200 0000</td>
<td>0xF2FF FFFF</td>
<td>Used for direct access to the instruction TLB address array. For details, see Section 3.8.1: ITLB address array on page 70.</td>
</tr>
<tr>
<td>0xF300 0000</td>
<td>0xF3FF FFFF</td>
<td>Used for direct access to instruction TLB data arrays 1 and 2. For details, see Section 3.8.2: ITLB data array 1 on page 71.</td>
</tr>
<tr>
<td>0xF400 0000</td>
<td>0xF4FF FFFF</td>
<td>Used for direct access to the operand cache address array. For details, see Section 4.5.5: OC address array on page 98.</td>
</tr>
<tr>
<td>0xF500 0000</td>
<td>0xF5FF FFFF</td>
<td>Used for direct access to the operand cache data array. For details, see Section 4.5.6: OC data array on page 99.</td>
</tr>
<tr>
<td>0xF600 0000</td>
<td>0xF6FF FFFF</td>
<td>Used for direct access to the unified TLB address array. For details, see Section 3.8.3: UTLB address array on page 72.</td>
</tr>
<tr>
<td>0xF700 0000</td>
<td>0xF7FF FFFF</td>
<td>Used for direct access to unified TLB data arrays 1 and 2. For details, see Section 3.8.4: UTLB data array 1 on page 74.</td>
</tr>
<tr>
<td>0xFC00 0000</td>
<td>0xFFFF FFFF</td>
<td>Control register area.</td>
</tr>
</tbody>
</table>
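
A table-driven lookup is one way to express Table 13 in software, for example in a debugger or simulator. The sketch below is illustrative only; the enum, structure, and function names are hypothetical, and addresses that Table 13 does not list are reported as `P4_UNLISTED`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum p4_function {
    P4_STORE_QUEUE, P4_IC_ADDR, P4_IC_DATA, P4_ITLB_ADDR, P4_ITLB_DATA,
    P4_OC_ADDR, P4_OC_DATA, P4_UTLB_ADDR, P4_UTLB_DATA, P4_CONTROL_REGS,
    P4_UNLISTED
};

struct p4_range {
    uint32_t start;
    uint32_t end;            /* inclusive */
    enum p4_function fn;
};

/* One row per line of Table 13. */
static const struct p4_range p4_map[] = {
    { 0xE0000000u, 0xE3FFFFFFu, P4_STORE_QUEUE  },
    { 0xF0000000u, 0xF0FFFFFFu, P4_IC_ADDR      },
    { 0xF1000000u, 0xF1FFFFFFu, P4_IC_DATA      },
    { 0xF2000000u, 0xF2FFFFFFu, P4_ITLB_ADDR    },
    { 0xF3000000u, 0xF3FFFFFFu, P4_ITLB_DATA    },
    { 0xF4000000u, 0xF4FFFFFFu, P4_OC_ADDR      },
    { 0xF5000000u, 0xF5FFFFFFu, P4_OC_DATA      },
    { 0xF6000000u, 0xF6FFFFFFu, P4_UTLB_ADDR    },
    { 0xF7000000u, 0xF7FFFFFFu, P4_UTLB_DATA    },
    { 0xFC000000u, 0xFFFFFFFFu, P4_CONTROL_REGS },
};

/* Return the function of a P4 address according to Table 13. */
static inline enum p4_function p4_decode(uint32_t addr)
{
    for (size_t i = 0; i < sizeof p4_map / sizeof p4_map[0]; i++) {
        if (addr >= p4_map[i].start && addr <= p4_map[i].end) {
            return p4_map[i].fn;
        }
    }
    return P4_UNLISTED;
}
```
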
3.4.3 Virtual address space

Setting the MMUCR.AT bit to 1 enables the P0, P3, and U0 regions of the address space in the SH-4 CPU core to be mapped onto any external memory space in 1-kbyte, 4-kbyte, 64-kbyte, or 1-Mbyte page units. Mapping from virtual address space to 29-bit external memory space is carried out using the TLB. When accessed using virtual addressing, Area 7 is equivalent to the P4 region in physical address space. Virtual address space is illustrated in Figure 8.

![Figure 8: Virtual memory space (MMUCR.AT = 1)]
**P0, P3, U0 Regions:** The P0 region (excluding addresses 0x7C00 0000 to 0x7FFF FFFF), P3 region, and U0 region allow access using the cache, and address translation using the TLB. These regions can be mapped onto any external memory space in 1-kbyte, 4-kbyte, 64-kbyte, or 1-Mbyte page units. When CCR is in the cache-enabled state, and the TLB enable bit (C bit) is 1, accesses can be performed using the cache. In write accesses to the cache, switching between the copy-back method and the write-through method is indicated by the TLB write-through bit (WT bit), and is specified in page units.

Only when the P0, P3, and U0 regions are mapped onto external memory space by means of the TLB are addresses 0x1C00 0000 to 0x1FFF FFFF of Area 7 in external memory space allocated to the control register area. This enables control registers to be accessed from the U0 region in user mode. In this case, the C bit for the corresponding page must be cleared to 0.

**P1, P2, P4 Regions:** Address translation using the TLB cannot be performed for the P1, P2, or P4 region (except for the store queue region). Accesses to these regions are the same as for physical address space. The store queue region can be mapped onto any external memory space by the MMU. However, operation in the case of an exception differs from that for normal P0, U0, and P3 spaces. For details, see Section 4.6: Store queues on page 101.

3.4.4 On-chip RAM space

In the SH-4 CPU core, half of the 16-kbyte operand cache can be used as on-chip RAM. This can be done by changing the CCR settings.

When the operand cache is used as on-chip RAM (CCR.ORA = 1), the P0/U0 region addresses 0x7C00 0000 to 0x7FFF FFFF are an on-chip RAM area. Data accesses (byte/word/longword/quadword) can be used in this area. This area can only be used in RAM mode.

Note: It is not possible to execute instructions out of this on-chip RAM.
3.4.5 Address translation

In the SH-4 CPU core, the ITLB is used for instruction accesses and the UTLB for data accesses. In the event of an access to a region other than the P4 region, the accessed virtual address is translated to a physical address. If the virtual address belongs to the P1 or P2 region, the physical address is uniquely determined without accessing the TLB. If the virtual address belongs to the P0, U0, or P3 region, the TLB is searched using the virtual address, and if the virtual address is recorded in the TLB, a TLB hit is made and the corresponding physical address is read from the TLB. If the accessed virtual address is not recorded in the TLB, a TLB miss exception is generated and processing switches to the TLB miss exception handling routine. In the TLB miss exception handling routine, the address translation table in external memory is searched, and the corresponding physical address and page management information are recorded in the TLB. After the return from the exception handling routine, the instruction which caused the TLB miss exception is re-executed.
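
The address arithmetic behind a TLB hit can be sketched as follows. This C fragment is a simplified model, not the hardware data path: it assumes the matching entry has already been found, takes the PPN in its register position (bits [28:10], of which only the upper bits are valid for larger pages), and uses hypothetical names throughout.

```c
#include <assert.h>
#include <stdint.h>

/* Supported page sizes in bytes. */
enum page_size {
    PAGE_1K  = 1 << 10,
    PAGE_4K  = 1 << 12,
    PAGE_64K = 1 << 16,
    PAGE_1M  = 1 << 20
};

/*
 * Physical (29-bit external memory) address for a TLB hit.
 * ppn_bits holds the PPN in place; bits below the page size are ignored
 * and replaced by the page offset taken from the virtual address.
 */
static inline uint32_t tlb_physical_address(uint32_t vaddr, uint32_t ppn_bits,
                                            enum page_size size)
{
    uint32_t offset_mask = (uint32_t)size - 1u;
    uint32_t page_base = ppn_bits & 0x1FFFFFFFu & ~offset_mask;
    return page_base | (vaddr & offset_mask);
}

/* P1 and P2: the address is fixed by zeroing the upper 3 bits, with no TLB access. */
static inline uint32_t fixed_physical_address(uint32_t vaddr)
{
    return vaddr & 0x1FFFFFFFu;
}
```

With a 4-kbyte page, for instance, virtual address 0x0040 1234 and PPN bits 0x0C34 5000 combine to 0x0C34 5234.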

3.4.6 Single virtual memory mode and multiple virtual memory mode

There are two virtual memory systems, either of which can be selected with the MMUCR.SV bit:

- single virtual memory
  A number of processes run simultaneously, using non-overlapping virtual address spaces, so that the physical address corresponding to a particular virtual address is uniquely determined.

- multiple virtual memory
  A number of processes run with overlapping virtual address spaces; consequently, virtual addresses may need to be translated into different physical addresses depending on the process ID.

The only difference between the single virtual memory and multiple virtual memory systems in terms of operation is in the TLB address comparison method (see Section 3.5.3: Address translation method on page 59).
3.4.7 Address space identifier (ASID)

In multiple virtual memory mode, the 8-bit address space identifier (ASID) is used to distinguish between processes running simultaneously, while sharing the virtual address space. Software can set the ASID of the currently executing process in PTEH in the MMU. The TLB does not have to be purged when processes are switched by means of ASID.

In single virtual memory mode, ASID is used to provide memory protection for processes running simultaneously while using the virtual memory space on an exclusive basis.

3.5 TLB functions

3.5.1 Unified TLB (UTLB) configuration

The unified TLB (UTLB) is so called because of its use for the following two purposes:

1. To translate a virtual address to a physical address in a data access
2. As a table of address translation information, to be recorded in the instruction TLB in the event of an ITLB miss

Information in the address translation table located in external memory is cached into the UTLB. The address translation table contains virtual page numbers and address space identifiers, and corresponding physical page numbers and page management information. Figure 9 shows the overall configuration of the UTLB. The UTLB consists of 64 fully-associative type entries.

![Figure 9: UTLB configuration](image-url)
3.5.2 Instruction TLB (ITLB) configuration

The ITLB is used to translate a virtual address to a physical address in an instruction access. Information in the address translation table located in the UTLB, is cached into the ITLB. Figure 10 shows the overall configuration of the ITLB. The ITLB consists of 4 fully-associative type entries. The address translation information is almost the same as that in the UTLB, but with the following differences:

1. D and WT bits are not supported.
2. There is only one PR bit, corresponding to the upper of the PR bits in the UTLB.

![Figure 10: ITLB configuration](image)

3.5.3 Address translation method

Figure 11 and Figure 12 show flowcharts of memory accesses using the UTLB and ITLB.

Figure 11: Flowchart of memory access using UTLB

Figure 12: Flowchart of memory access using ITLB

3.6 MMU functions

3.6.1 MMU hardware management

The SH-4 CPU core supports the following MMU functions.

1. The MMU decodes the virtual address to be accessed by software, and performs address translation by controlling the UTLB/ITLB, in accordance with the MMUCR settings.

2. The MMU determines the cache access status, on the basis of the page management information read during address translation (C, WT bits).

3. If address translation cannot be performed normally in a data access or instruction access, the MMU notifies software by means of an MMU exception.

4. If address translation information is not recorded in the ITLB in an instruction access, the MMU searches the UTLB, and if the necessary address translation information is recorded in the UTLB, the MMU copies this information into the ITLB in accordance with MMUCR.LRUI.

3.6.2 MMU software management

Software processing for the MMU consists of the following:

1. Setting of MMU-related registers.
   Some registers are also partially updated by hardware automatically.

2. Recording, deletion, and reading of TLB entries.
   There are two methods of recording UTLB entries: by using the LDTLB instruction, or by writing directly to the memory-mapped UTLB.
   ITLB entries can only be recorded by writing directly to the memory-mapped ITLB. For deleting or reading UTLB/ITLB entries, it is possible to access the memory-mapped UTLB/ITLB.

3. MMU exception handling.
   When an MMU exception occurs, processing is performed based on information set by hardware.
3.6.3 MMU instruction (LDTLB)

A TLB load instruction (LDTLB) is provided for recording UTLB entries. When an LDTLB instruction is issued, the SH-4 CPU core copies the contents of PTEH and PTEL to the UTLB entry indicated by MMUCR.URC. ITLB entries are not updated by the LDTLB instruction, and therefore address translation information purged from the UTLB entry may still remain in the ITLB entry. As the LDTLB instruction changes address translation information, ensure that it is issued by a program in the P1 or P2 region. The operation of the LDTLB instruction is shown in Figure 13.

**Figure 13: Operation of LDTLB instruction**
3.6.4 Hardware ITLB miss handling

In an instruction access, the SH-4 CPU core searches the ITLB. If it cannot find the necessary address translation information (i.e. in the event of an ITLB miss), the UTLB is searched by hardware, and if the necessary address translation information is present, it is recorded in the ITLB. This procedure is known as hardware ITLB miss handling. If the necessary address translation information is not found in the UTLB search, an instruction TLB miss exception is generated and processing passes to software.

3.6.5 Avoiding synonym problems

When 1 or 4-kbyte pages are recorded in TLB entries, a synonym problem may arise. The problem is that, when a number of virtual addresses are mapped onto a single physical address, the same physical address data may be recorded in a number of cache entries, and it becomes impossible to guarantee data integrity. This problem does not occur with the instruction TLB or instruction cache. In the SH-4 CPU core, line selection is performed using bits [13:5] of the virtual address, as this avoids the cache having to go via the TLB and thus achieves faster operand cache operation. However, bits [13:10] of the virtual address in the case of a 1-kbyte page, and bits [13:12] of the virtual address in the case of a 4-kbyte page, are subject to address translation. As a result, bits [13:10] of the physical address after translation may differ from bits [13:10] of the virtual address.

Great care must therefore be taken whenever translations are set up which could cause synonyms, in particular, if two operand translations are to the same physical page but their virtual addresses differ in their synonym bits:

• Do not allow both the translations to be active at the same time.
• Always separate activations of the two translations by an appropriate cache purge.
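
A page-table manager could test for this condition before activating a second translation to the same physical page. The following C sketch shows such a check under the bit ranges given above; the function names are hypothetical, and treating 64-kbyte and 1-Mbyte pages as free of synonym bits is an inference from the fact that bits [13:0] are not translated for those sizes.

```c
#include <assert.h>
#include <stdint.h>

/* Virtual address bits that are both used for line selection and translated. */
static inline uint32_t synonym_mask(uint32_t page_size_bytes)
{
    if (page_size_bytes == 1024u) {
        return 0x3C00u;          /* bits [13:10] for 1-kbyte pages */
    }
    if (page_size_bytes == 4096u) {
        return 0x3000u;          /* bits [13:12] for 4-kbyte pages */
    }
    return 0u;                   /* larger pages: no translated index bits */
}

/*
 * Nonzero if two virtual addresses that map to the same physical page
 * differ in their synonym bits, so both translations must not be active
 * at the same time without an intervening cache purge.
 */
static inline int synonym_conflict(uint32_t va1, uint32_t va2,
                                   uint32_t page_size_bytes)
{
    return ((va1 ^ va2) & synonym_mask(page_size_bytes)) != 0u;
}
```
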
3.7 Handling MMU exceptions

There are seven MMU exceptions.

3.7.1 ITLBMULTIHIT

An instruction TLB multiple hit exception occurs when more than one ITLB entry matches the virtual address to which an instruction access has been made. If multiple hits occur when the UTLB is searched by hardware in hardware ITLB miss handling, a data TLB multiple hit exception will result.

When an instruction TLB multiple hit exception occurs, a reset is executed, and cache coherency is not guaranteed.

Hardware processing

See Chapter 5: Exceptions on page 105, ITLBMULTIHIT - Instruction TLB Multiple-Hit Exception on page 118.

Software processing (reset routine)

The ITLB entries which caused the multiple hit exception are checked in the reset handling routine. This exception is intended for use in program debugging, and should not normally be generated.

3.7.2 ITLBMISS

An instruction TLB miss exception occurs when address translation information for the virtual address to which an instruction access is made is not found in the UTLB entries by the hardware ITLB miss handling procedure. The instruction TLB miss exception processing, carried out by software, is shown below. This is the same as the processing for a data TLB miss exception.

Hardware processing

See Chapter 5: Exceptions on page 105, ITLBMISS - Instruction TLB Miss Exception on page 122.

Software processing (instruction TLB miss exception handling routine)

Software is responsible for searching the external memory page table and assigning the necessary page table entry. Software should carry out the following processing in order to find and assign the necessary page table entry.

1. Write to PTEL the values of the PPN, PR, SZ, C, D, SH, V, and WT bits in the page table entry recorded in the external memory address translation table.

2. When the entry to be replaced in entry replacement is specified by software, write that value to URC in the MMUCR register. If URC is greater than URB at this time, the value should be changed to an appropriate value after issuing an LDTLB instruction.

3. Execute the LDTLB instruction to write the contents of PTEH and PTEL to the UTLB.

4. Finally, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction.
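
Steps 1 to 3 can be illustrated with a host-side software model. This is a sketch for explanation only: the structure and function names are hypothetical, the registers are plain variables rather than memory-mapped hardware, and LDTLB is modelled as the copy of PTEH and PTEL into the UTLB entry selected by MMUCR.URC. It assumes PTEH already holds the faulting virtual page information when the handler runs. On a real device the same sequence would be register writes and an LDTLB instruction executed from the P1 or P2 region, followed by RTE.

```c
#include <assert.h>
#include <stdint.h>

#define UTLB_ENTRIES    64u
#define MMUCR_URC_SHIFT 10
#define MMUCR_URC_MASK  (0x3Fu << MMUCR_URC_SHIFT)

struct utlb_entry {
    uint32_t pteh;
    uint32_t ptel;
};

/* Software model of the MMU state used by the handler (not hardware access). */
struct mmu_model {
    uint32_t pteh;
    uint32_t ptel;
    uint32_t mmucr;
    struct utlb_entry utlb[UTLB_ENTRIES];
};

/* Model of LDTLB: copy PTEH and PTEL to the UTLB entry indicated by MMUCR.URC. */
static inline void model_ldtlb(struct mmu_model *m)
{
    uint32_t urc = (m->mmucr & MMUCR_URC_MASK) >> MMUCR_URC_SHIFT;
    m->utlb[urc].pteh = m->pteh;
    m->utlb[urc].ptel = m->ptel;
}

/*
 * pte:   PTEL image found in the external memory page table.
 * entry: UTLB entry to replace, or a negative value to keep the current URC.
 */
static inline void model_tlb_refill(struct mmu_model *m, uint32_t pte, int entry)
{
    m->ptel = pte;                                        /* step 1 */
    if (entry >= 0) {                                     /* step 2 (optional) */
        m->mmucr = (m->mmucr & ~MMUCR_URC_MASK)
                 | (((uint32_t)entry & 0x3Fu) << MMUCR_URC_SHIFT);
    }
    model_ldtlb(m);                                       /* step 3 */
    /* step 4 (RTE) is a CPU instruction and is not modelled here */
}
```
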

3.7.3 EXECPROT

An instruction TLB protection violation exception occurs when, even though an ITLB entry contains address translation information matching the virtual address to which an instruction access is made, the actual access type is not permitted by the access right specified by the PR bit. The instruction TLB protection violation exception processing, carried out by software, is shown below.

Hardware processing

See Chapter 5: Exceptions on page 105, EXECPROT - Instruction TLB Protection Violation Exception on page 126.

Software processing (instruction TLB protection violation exception handling routine)

Resolve the instruction TLB protection violation, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction.
3.7.4 OTLBMULTIHIT

An operand TLB multiple hit exception occurs when more than one UTLB entry matches the virtual address to which a data access has been made. A data TLB multiple hit exception is also generated if multiple hits occur when the UTLB is searched in hardware ITLB miss handling.

When an operand TLB multiple hit exception occurs, a reset is executed, and cache coherency is not guaranteed. The contents of PPN in the UTLB prior to the exception may also be corrupted.

**Hardware processing**

See Chapter 5: Exceptions on page 105, OTLBMULTIHIT - Operand TLB Multiple-Hit Exception on page 119.

**Software processing (reset routine)**

The UTLB entries which caused the multiple hit exception are checked in the reset handling routine. This exception is intended for use in program debugging, and should not normally be generated.

3.7.5 TLBMISS

A data TLB miss exception occurs when address translation information for the virtual address to which a data access is made is not found in the UTLB entries. The data TLB miss exception processing, carried out by software, is shown below.

**Hardware processing**

See Chapter 5: Exceptions on page 105, RTLBMISS - Read Data TLB Miss Exception on page 120.

**Software processing (data TLB miss exception handling routine)**

Software is responsible for searching the external memory page table and assigning the necessary page table entry. Software should carry out the following processing in order to find and assign the necessary page table entry.

1. Write to PTEL the values of the PPN, PR, SZ, C, D, SH, V, and WT bits in the page table entry recorded in the external memory address translation table.
2. When the entry to be replaced in entry replacement is specified by software, write that value to URC in the MMUCR register. If URC is greater than URB at this time, the value should be changed to an appropriate value after issuing an LDTLB instruction.

3. Execute the LDTLB instruction to write the contents of PTEH and PTEL to the UTLB.

4. Finally, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction.

3.7.6 READPROT

A data TLB protection violation exception occurs when, even though a UTLB entry contains address translation information matching the virtual address to which a data access is made, the actual access type is not permitted by the access right specified by the PR bit. The data TLB protection violation exception processing, carried out by software, is shown below.

Hardware processing

See Chapter 5: Exceptions on page 105, READPROT - Data TLB Protection Violation Exception on page 124.

Software processing (data TLB protection violation exception handling routine)

Resolve the data TLB protection violation, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction.

3.7.7 FIRSTWRITE

An initial page write exception occurs when the D bit is 0 even though a UTLB entry contains address translation information matching the virtual address to which a data access (write) is made, and the access is permitted. The initial page write exception processing, carried out by software, is shown below.

Hardware processing

See Chapter 5: Exceptions on page 105, FIRSTWRITE - Initial Page Write Exception on page 123.

Software processing (initial page write exception handling routine)

The following processing should be carried out as the responsibility of software:

1. Retrieve the necessary page table entry from external memory.
2. Write 1 to the D bit in the external memory page table entry.
3. Write to PTEL the values of the PPN, PR, SZ, C, D, WT, SH, and V bits in the page table entry recorded in external memory.
4. When the entry to be replaced in entry replacement is specified by software, write that value to URC in the MMUCR register. If URC is greater than URB at this time, the value should be changed to an appropriate value after issuing an LDTLB instruction.
5. Execute the LDTLB instruction to write the contents of PTEH and PTEL to the UTLB.
6. Finally, execute the exception handling return instruction (RTE), terminate the exception handling routine, and return control to the normal flow. The RTE instruction should be issued at least one instruction after the LDTLB instruction.

3.8 Memory-mapped TLB configuration

To enable the ITLB and UTLB to be managed by software, their contents can be read and written by a P2 region program, with a MOV instruction in privileged mode. Operation is not guaranteed if access is made from a program in another region. A branch to a region other than the P2 region should be made at least 8 instructions after this MOV instruction. The ITLB and UTLB are allocated to the P4 region in physical address space. In the ITLB, VPN, V, and ASID can be accessed as an address array, and PPN, V, SZ, PR, C, and SH as data array 1. In the UTLB, VPN, D, V, and ASID can be accessed as an address array, and PPN, V, SZ, PR, C, D, WT, and SH as data array 1. V and D can be accessed from both the address array side and the data array side. Only longword access is possible. Instruction fetches cannot be performed in these regions. For reserved bits, a write value of 0 should be specified; their read value is undefined.
3.8.1 ITLB address array

The ITLB address array is allocated to addresses 0xF200 0000 to 0xF2FF FFFF in the P4 region. An address array access requires a 32-bit address field specification (when reading or writing), and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and VPN, V, and ASID to be written to the address array are specified in the data field.

In the address field, bits [31:24] have the value 0xF2 indicating the ITLB address array, and the entry is selected by bits [9:8]. As longword access is used, 0 should be specified for address field bits [1:0].

In the data field, VPN is indicated by bits [31:10], V by bit [8], and ASID by bits [7:0].

The following two kinds of operation can be used on the ITLB address array:

1. ITLB address array read
   VPN, V, and ASID are read into the data field from the ITLB entry corresponding to the entry set in the address field.

2. ITLB address array write
   VPN, V, and ASID specified in the data field are written to the ITLB entry corresponding to the entry set in the address field.
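
The address and data fields described above can be composed as in the following minimal sketch. The function names are hypothetical, and the code only computes values; on the target, the actual access would be a longword load or store performed in privileged mode by a P2-region program.

```c
#include <stdint.h>

#define ITLB_ADDR_ARRAY_BASE 0xF2000000u

/* Address field: bits [31:24] = 0xF2, entry in bits [9:8], bits [1:0] = 0. */
uint32_t itlb_addr_array_address(unsigned entry)
{
    return ITLB_ADDR_ARRAY_BASE | ((uint32_t)(entry & 0x3u) << 8);
}

/* Data field: VPN in bits [31:10], V in bit [8], ASID in bits [7:0]. */
uint32_t itlb_addr_array_data(uint32_t vpn, unsigned v, unsigned asid)
{
    return ((vpn & 0x3FFFFFu) << 10)
         | ((uint32_t)(v & 1u) << 8)
         | (uint32_t)(asid & 0xFFu);
}
```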

---

**Figure 14: Memory-mapped ITLB address array**

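<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Contents</th>
</tr>
</thead>
<tbody>
<tr><td>Address field</td><td>[31:24]</td><td>0xF2</td></tr>
<tr><td>Address field</td><td>[23:10]</td><td>Reserved</td></tr>
<tr><td>Address field</td><td>[9:8]</td><td>E</td></tr>
<tr><td>Address field</td><td>[7:2]</td><td>Reserved</td></tr>
<tr><td>Address field</td><td>[1:0]</td><td>0</td></tr>
<tr><td>Data field</td><td>[31:10]</td><td>VPN</td></tr>
<tr><td>Data field</td><td>[9]</td><td>Reserved</td></tr>
<tr><td>Data field</td><td>[8]</td><td>V</td></tr>
<tr><td>Data field</td><td>[7:0]</td><td>ASID</td></tr>
</tbody>
</table>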

VPN: Virtual page number
V: Validity bit
ASID: Address space identifier
E: Entry
...: Reserved bits (0 write value, undefined read value)
3.8.2 ITLB data array 1

ITLB data array 1 is allocated to addresses 0xF300 0000 to 0xF37F FFFF in the P4 region. A data array access requires a 32-bit address field specification (when reading or writing), and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and PPN, V, SZ, PR, C, and SH to be written to the data array are specified in the data field.

In the address field, bits [31:23] have the value 0xF30 indicating ITLB data array 1, and the entry is selected by bits [9:8].

In the data field, PPN is indicated by bits [28:10], V by bit [8], SZ by bits [7] and [4], PR by bit [6], C by bit [3], and SH by bit [1].

The following two kinds of operation can be used on ITLB data array 1:

1. ITLB data array 1 read
   - PPN, V, SZ, PR, C, and SH are read into the data field from the ITLB entry corresponding to the entry set in the address field.

2. ITLB data array 1 write
   - PPN, V, SZ, PR, C, and SH specified in the data field are written to the ITLB entry corresponding to the entry set in the address field.

---

Figure 15: Memory-mapped ITLB data array 1
3.8.3 UTLB address array

The UTLB address array is allocated to addresses 0xF600 0000 to 0xF6FF FFFF in the P4 region. An address array access requires a 32-bit address field specification (when reading or writing), and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and VPN, D, V, and ASID to be written to the address array are specified in the data field.

In the address field, bits [31:24] have the value 0xF6 indicating the UTLB address array, and the entry is selected by bits [13:8]. Address field bit [7], the association bit (A bit), specifies whether or not address comparison is performed when writing to the UTLB address array.

In the data field, VPN is indicated by bits [31:10], D by bit [9], V by bit [8], and ASID by bits [7:0].

The following three kinds of operation can be used on the UTLB address array:

1. **UTLB address array read**

   VPN, D, V, and ASID are read into the data field from the UTLB entry corresponding to the entry set in the address field. In a read, associative operation is not performed, regardless of whether the association bit specified in the address field is 1 or 0.

2. **UTLB address array write (non-associative)**

   VPN, D, V, and ASID specified in the data field are written to the UTLB entry corresponding to the entry set in the address field. The A bit in the address field should be cleared to 0.

3. **UTLB address array write (associative)**

   When a write is performed with the A bit in the address field set to 1, all the UTLB entries are compared using the VPN specified in the data field and PTEH.ASID. The usual address comparison rules are followed, but if a UTLB miss occurs, the result is no operation, and no exception is generated. If the comparison identifies a UTLB entry corresponding to the VPN specified in the data field, the D and V values specified in the data field are written to that entry. If there is more than one matching entry, a data TLB multiple hit exception results. This associative operation is carried out simultaneously on the ITLB, and if a matching entry is found in the ITLB, V is written to that entry. Even if the UTLB comparison results in no operation, a write to the ITLB side only is performed as long as there is an ITLB match. If there is a match in both the UTLB and ITLB, the UTLB information is also written to the ITLB.
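
One common use of the associative write is to invalidate whatever UTLB (and ITLB) entry maps a given virtual page. The sketch below only builds the address/data pair for such a write; the struct and function names are hypothetical. The ASID bits of the data field are left at 0 here because, as described above, the comparison uses PTEH.ASID.

```c
#include <stdint.h>

#define UTLB_ADDR_ARRAY_BASE 0xF6000000u
#define UTLB_ADDR_A_BIT      (1u << 7)    /* association bit */

struct utlb_access {
    uint32_t address;   /* P4 address to access */
    uint32_t data;      /* longword to write */
};

/* Non-associative access to one entry: entry in bits [13:8], A = 0. */
uint32_t utlb_addr_array_address(unsigned entry)
{
    return UTLB_ADDR_ARRAY_BASE | ((uint32_t)(entry & 0x3Fu) << 8);
}

/* Associative write that clears D and V in any entry matching vpn. */
struct utlb_access utlb_assoc_invalidate(uint32_t vpn)
{
    struct utlb_access a;
    a.address = UTLB_ADDR_ARRAY_BASE | UTLB_ADDR_A_BIT;
    a.data    = (vpn & 0x3FFFFFu) << 10;  /* VPN in [31:10], D = 0, V = 0 */
    return a;
}
```

If no entry matches, the write has no effect, so the caller does not need to search the UTLB first.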

---

**Figure 16: Memory-mapped UTLB address array**

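<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Contents</th>
</tr>
</thead>
<tbody>
<tr><td>Address field</td><td>[31:24]</td><td>0xF6</td></tr>
<tr><td>Address field</td><td>[23:14]</td><td>Reserved</td></tr>
<tr><td>Address field</td><td>[13:8]</td><td>E</td></tr>
<tr><td>Address field</td><td>[7]</td><td>A</td></tr>
<tr><td>Address field</td><td>[6:2]</td><td>Reserved</td></tr>
<tr><td>Address field</td><td>[1:0]</td><td>0</td></tr>
<tr><td>Data field</td><td>[31:10]</td><td>VPN</td></tr>
<tr><td>Data field</td><td>[9]</td><td>D</td></tr>
<tr><td>Data field</td><td>[8]</td><td>V</td></tr>
<tr><td>Data field</td><td>[7:0]</td><td>ASID</td></tr>
</tbody>
</table>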

VPN: Virtual page number
ASID: Address space identifier
V: Validity bit
D: Dirty bit
A: Association bit
E: Entry
...: Reserved bits (0 write value, undefined read value)

3.8.4 UTLB data array 1

UTLB data array 1 is allocated to addresses 0xF700 0000 to 0xF77F FFFF in the P4 region. A data array access requires a 32-bit address field specification (when reading or writing), and a 32-bit data field specification (when writing). Information for selecting the entry to be accessed is specified in the address field, and PPN, V, SZ, PR, C, D, SH, and WT to be written to the data array are specified in the data field.

In the address field, bits [31:23] have the value 0xF70 indicating UTLB data array 1, and the entry is selected by bits [13:8].

In the data field, PPN is indicated by bits [28:10], V by bit [8], SZ by bits [7] and [4], PR by bits [6:5], C by bit [3], D by bit [2], SH by bit [1], and WT by bit [0].

The following two kinds of operation can be used on UTLB data array 1:

1. UTLB data array 1 read

   PPN, V, SZ, PR, C, D, SH, and WT are read into the data field, from the UTLB entry corresponding to the entry set in the address field.

2. UTLB data array 1 write

   PPN, V, SZ, PR, C, D, SH, and WT specified in the data field, are written to the UTLB entry corresponding to the entry set in the address field.
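
The data field layout above can be packed from individual fields as in this minimal sketch. The struct and function names are hypothetical, and the assignment of the upper SZ bit to bit [7] and the lower SZ bit to bit [4] is an assumption, since the text only states that SZ occupies bits [7] and [4].

```c
#include <stdint.h>

struct utlb_data1 {
    uint32_t ppn;                      /* 19 bits, placed in [28:10] */
    unsigned v, sz, pr, c, d, sh, wt;  /* sz and pr are 2-bit values */
};

/* Address field: UTLB data array 1 base, entry in bits [13:8]. */
uint32_t utlb_data1_address(unsigned entry)
{
    return 0xF7000000u | ((uint32_t)(entry & 0x3Fu) << 8);
}

/* Pack the fields into the 32-bit data field. */
uint32_t utlb_data1_pack(const struct utlb_data1 *f)
{
    return ((f->ppn & 0x7FFFFu) << 10)
         | ((uint32_t)(f->v & 1u) << 8)
         | ((uint32_t)((f->sz >> 1) & 1u) << 7)   /* assumed: SZ upper bit */
         | ((uint32_t)(f->pr & 3u) << 5)
         | ((uint32_t)(f->sz & 1u) << 4)          /* assumed: SZ lower bit */
         | ((uint32_t)(f->c & 1u) << 3)
         | ((uint32_t)(f->d & 1u) << 2)
         | ((uint32_t)(f->sh & 1u) << 1)
         |  (uint32_t)(f->wt & 1u);
}
```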

Figure 17: Memory-mapped UTLB data array 1

4.1 Overview

4.1.1 Features

Note: This chapter details both the SH4-103 and SH4-202 variants. Please refer to your datasheet for specific core details.

The SH-4 CPU core has an on-chip 8-kbyte instruction cache (IC) for instructions and 16-kbyte operand cache (OC) for data. Half of the memory of the operand cache (8 kbytes) can also be used as on-chip RAM. The features of these caches are summarized in Table 14.

The SH4-202 has an on-chip 16-kbyte instruction cache (IC) for instructions and 32-kbyte operand cache (OC) for data. Half of the operand cache (16 kbytes) can also be used as on-chip RAM. The features of these caches are summarized in Table 14 and Table 15.

The SH-4 CPU supports two 32-byte store queues (SQ) to perform high-speed writes to external memory. The features of the SQ are summarized in Table 16.

<table>
<thead>
<tr>
<th>Item</th>
<th>Instruction cache</th>
<th>Operand cache</th>
</tr>
</thead>
<tbody>
<tr><td>Capacity</td><td>8-kbyte cache</td><td>16-kbyte cache or 8-kbyte cache + 8-kbyte RAM</td></tr>
<tr><td>Type</td><td>Direct mapping</td><td>Direct mapping</td></tr>
<tr><td>Line size</td><td>32 bytes</td><td>32 bytes</td></tr>
<tr><td>Entries</td><td>256</td><td>512</td></tr>
<tr><td>Write method</td><td></td><td>Copy-back/write-through selectable</td></tr>
</tbody>
</table>

Table 14: Cache features (SH4-103, SH4-202 in compatibility mode)

<table>
<thead>
<tr>
<th>Item</th>
<th>Instruction cache</th>
<th>Operand cache</th>
</tr>
</thead>
<tbody>
<tr>
<td>Capacity</td>
<td>16-kbyte cache</td>
<td>32-kbyte cache or 16-kbyte cache + 16-kbyte RAM</td>
</tr>
<tr>
<td>Type</td>
<td>2-way set associative</td>
<td>2-way set associative</td>
</tr>
<tr>
<td>Line size</td>
<td>32 bytes</td>
<td>32 bytes</td>
</tr>
<tr>
<td>Entries</td>
<td>256 entries/way</td>
<td>512 entries/way</td>
</tr>
<tr>
<td>Write method</td>
<td></td>
<td>Copy-back/write-through selectable</td>
</tr>
<tr>
<td>Replace algorithm</td>
<td>LRU</td>
<td>LRU</td>
</tr>
</tbody>
</table>

Table 15: Cache features (SH4-202 in the enhanced mode)

<table>
<thead>
<tr>
<th>Item</th>
<th>Store queues</th>
</tr>
</thead>
<tbody>
<tr>
<td>Capacity</td>
<td>2 × 32 bytes</td>
</tr>
<tr>
<td>Addresses</td>
<td>0xE000 0000 to 0xE3FF FFFF</td>
</tr>
<tr>
<td>Write</td>
<td>Store instruction</td>
</tr>
<tr>
<td>Write-back</td>
<td>Prefetch instruction</td>
</tr>
<tr>
<td>Access right</td>
<td>MMU off: according to MMUCR.SQMD</td>
</tr>
<tr>
<td></td>
<td>MMU on: according to individual page PR</td>
</tr>
</tbody>
</table>

Table 16: Store queue features
4.2 Register descriptions

There are three cache and store queue related control registers.

<table>
<thead>
<tr>
<th>Name</th>
<th>Abbreviation</th>
<th>R/W</th>
<th>Initial value<sup>a</sup></th>
<th>P4 address<sup>b</sup></th>
<th>Area 7 address<sup>b</sup></th>
<th>Access size</th>
</tr>
</thead>
<tbody>
<tr>
<td>Cache control register</td>
<td>CCR</td>
<td>R/W</td>
<td>0x0000 0000</td>
<td>0xFF00 001C</td>
<td>0x1F00 001C</td>
<td>32</td>
</tr>
<tr>
<td>Queue address control register 0</td>
<td>QACR0</td>
<td>R/W</td>
<td>Undefined</td>
<td>0xFF00 0038</td>
<td>0x1F00 0038</td>
<td>32</td>
</tr>
<tr>
<td>Queue address control register 1</td>
<td>QACR1</td>
<td>R/W</td>
<td>Undefined</td>
<td>0xFF00 003C</td>
<td>0x1F00 003C</td>
<td>32</td>
</tr>
</tbody>
</table>

Table 17: Cache control registers

- <sup>a</sup> The initial value is the value after a power-on or manual reset.
- <sup>b</sup> This is the address when using the virtual/physical address space P4 area. The area 7 address is the address used when making an access from physical address space area 7 using the TLB.

4.2.1 Cache control register (CCR)

CCR can be accessed by longword-size access from 0xFF00001C in the P4 region and 0x1F00001C in Area 7. The CCR bits are used to modify the cache settings described below. CCR modifications must only be made by a program in the non-cached P2 region. After CCR is updated, an instruction that performs data access to the P0, P1, P3, or U0 regions should be located at least four instructions after the CCR update instruction. Likewise, a branch instruction to the P0, P1, P3, or U0 regions should be located at least eight instructions after the CCR update instruction.
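
For orientation, the CCR bit positions listed in Table 18 can be written as C masks, as in the sketch below. The macro and function names are hypothetical. Combining the invalidation bits (OCI, ICI) with the enable bits in a single write is an assumption made for brevity; the value is only computed here, and the actual store to 0xFF00001C must come from P2-region code that observes the instruction spacing described above.

```c
#include <stdint.h>

#define CCR_OCE    (1u << 0)    /* OC enable */
#define CCR_WT     (1u << 1)    /* P0/U0/P3 write-through */
#define CCR_CB     (1u << 2)    /* P1 copy-back */
#define CCR_OCI    (1u << 3)    /* OC invalidate */
#define CCR_ORA    (1u << 5)    /* OC RAM mode */
#define CCR_OIX    (1u << 7)    /* OC index mode */
#define CCR_ICE    (1u << 8)    /* IC enable */
#define CCR_ICI    (1u << 11)   /* IC invalidate */
#define CCR_IIX    (1u << 15)   /* IC index mode */
#define CCR_EMODE  (1u << 31)   /* enhanced mode, SH4-202 only */

/* Value that invalidates and enables both caches, with P1 in copy-back
 * mode and P0/U0/P3 in copy-back mode (WT = 0). */
uint32_t ccr_enable_caches(void)
{
    return CCR_ICI | CCR_ICE | CCR_OCI | CCR_OCE | CCR_CB;
}
```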
### CCR

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>OCE</td>
<td>0</td>
<td>1</td>
<td>OC enable.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td><strong>Operation</strong></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Indicates whether or not the OC is to be used. When address translation is performed, the OC cannot be used unless the C bit in the page management information is also 1.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0: OC not used.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>1: OC used.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td><strong>Power-on reset</strong></td>
<td>0</td>
</tr>
<tr>
<td>WT</td>
<td>1</td>
<td>1</td>
<td>Write-through enable.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td><strong>Operation</strong></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Indicates the P0, U0 and P3 region cache write mode. When address translation is performed, the value of the WT bit in the page management information has priority.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0: Copy-back mode.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>1: Write-through mode.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td><strong>Power-on reset</strong></td>
<td>0</td>
</tr>
<tr>
<td>CB</td>
<td>2</td>
<td>1</td>
<td>Copy-back bit.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td><strong>Operation</strong></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>Indicates the P1 region cache write mode.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0: Write-through mode.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>1: Copy-back mode.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td><strong>Power-on reset</strong></td>
<td>0</td>
</tr>
<tr>
<td>OCI</td>
<td>3</td>
<td>1</td>
<td>OC invalidation bit.</td>
<td>RW</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td><strong>Operation</strong></td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>When 1 is written to this bit, the V and U bits of all OC entries are cleared to 0. This bit always returns 0 when read.</td>
<td></td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td><strong>Power-on reset</strong></td>
<td>0</td>
</tr>
</tbody>
</table>

*Table 18: CCR register description*

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>ORA</td>
<td>5</td>
<td>1</td>
<td>OC RAM enable bit.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td>0: Normal mode (all of OC is used as cache).<br>1: RAM mode (half of OC is used as cache, the other half is used as RAM; see Section 4.3.6).</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td>0</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>OIX</td>
<td>7</td>
<td>1</td>
<td>OC index enable bit.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td>0: Address bits [13:5] used for OC entry selection.<br>1: Address bits [25] and [12:5] used for OC entry selection.<br>Note: In SH4-202, when CCR.ORA is set to 1, CCR.OIX must be set to 0. See Section 4.3.7.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td>0</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>ICE</td>
<td>8</td>
<td>1</td>
<td>IC enable bit.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td>Indicates whether or not the IC is to be used. When address translation is to be performed, the IC cannot be used unless the C bit in the page management information is also 1.<br>0: IC not used.<br>1: IC used.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td>0</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>ICI</td>
<td>11</td>
<td>1</td>
<td>IC invalidation bit.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td>When 1 is written to this bit, the V bits of all IC entries are cleared to 0. This bit always returns 0 when read.</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td>0</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>IIX</td>
<td>15</td>
<td>1</td>
<td>IC index enable bit.</td>
<td>RW</td>
</tr>
<tr>
<td>Power-on reset</td>
<td>0</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr><td>EMODE</td><td>31</td><td>1</td><td>Enhanced mode <strong>(SH4-202 only).</strong></td><td>RW</td></tr>
<tr><td>Operation</td><td></td><td></td><td>Indicates whether or not the OC is to be used in enhanced mode.<br>0: Compatible mode*.<br>1: Enhanced mode.<br>* The SH4-202 is not compatible with the SH4-103 in the following respects:<br>1. OC index mode and RAM mode.<br>2. Address map in RAM mode.</td><td></td></tr>
<tr><td>Power-on reset</td><td></td><td></td><td>0</td><td></td></tr>
<tr><td>Reserved bits</td><td>4, 6, [10:9], [14:12], [30:16]</td><td>22</td><td>For maximum forward compatibility, preserve values on write; otherwise write 0. Read value is undefined.</td><td></td></tr>
<tr><td>Power-on reset</td><td></td><td></td><td>Undefined</td><td></td></tr>
</tbody>
</table>

**Table 18: CCR register description**

4.2.2 Queue address control register 0 (QACR0)

QACR0 can be accessed by longword-size access from 0xFF000038 in the P4 region, and 0x1F000038 in Area 7.

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>Area</td>
<td>[4:2]</td>
<td>3</td>
<td>Queue address control register 0.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>QACR0 specifies the area onto which store queue 0 (SQ0) is mapped when the MMU is off.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
</tbody>
</table>

**Table 19: QACR0**
4.2.3 Queue address control register 1 (QACR1)

QACR1 can be accessed by longword-size access from 0xFF00003C in the P4 region, and 0x1F00003C in Area 7.

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>Area</td>
<td>[4:2]</td>
<td>3</td>
<td>Queue address control register 1.</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>QACR1 specifies the area onto which store queue 1 (SQ1) is mapped when the MMU is off.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
<tr>
<td>Reserved bits</td>
<td>[1:0], [31:5]</td>
<td>29</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Table 20: QACR1
4.3 Operand cache (OC)

4.3.1 Configuration

Figure 18 shows the configuration of the operand cache for the SH4-103 while Figure 19 shows the same for the SH4-202.

Figure 18: Configuration of operand cache on SH4-103 and SH4-202 in compatibility mode

The operand cache for the SH4-103 consists of 512 cache lines, each composed of a 19-bit tag, V bit, U bit, and 32-byte data.

The SH4-202 operand cache is a 2-way set-associative cache consisting of 512 cache lines per way, each composed of a 19-bit tag, V bit, and 32-byte data.

- **Tag**

  Stores the upper 19 bits of the 29-bit external address of the data line to be cached. The tag is not initialized by a power-on or manual reset.

- **V bit (validity bit)**

  A value of 1 indicates that valid data is stored in the cache line. The V bit is initialized to 0 by a power-on reset, but retains its value in a manual reset.

- **U bit (dirty bit)**

  The U bit is set to 1 if data is written to the cache line while the cache is being used in copy-back mode; that is, the U bit indicates a mismatch between the data in the cache line and the data in external memory. The U bit is never set to 1 while the cache is being used in write-through mode, unless it is modified by accessing the memory-mapped cache (see Section 4.5: Memory-mapped cache configuration on page 95). The U bit is initialized to 0 by a power-on reset, but retains its value in a manual reset.

- **Data field**

  The data field holds 32 bytes (256 bits) of data per cache line. The data array is not initialized by a power-on or manual reset.

- **LRU (SH-4 200 series only)**

  When a 200 series SH-4 is operating in enhanced mode, an additional state bit per cache set keeps track of which of the two ways was least recently used (LRU). These LRU bits cannot be read or written by software.

4.3.2 Read operation

When the OC is enabled (CCR.OCE = 1) and data is read by means of an effective address from a cacheable area, the cache operates as follows:

1. The tag, V bit, and U bit are read from the cache line, indexed by effective address bits [13:5].
2. The tag is compared with bits [28:10] of the address resulting from effective address translation by the MMU. Operation is as described in Table 21.

### Table 21: OC read operation

<table>
<thead>
<tr>
<th>Tag match</th>
<th>V bit</th>
<th>U bit</th>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr><td>Yes</td><td>1</td><td>-</td><td>Cache hit</td><td>The data indexed by bits [4:0] of the effective address is read from the cache line indexed by bits [13:5], in accordance with the access size (quadword/longword/word/byte).</td></tr>
<tr><td>Yes</td><td>0</td><td>-</td><td rowspan="3">Cache miss (no write-back)</td><td rowspan="3">Data from the external memory space corresponding to the effective address is written into the cache line. Data reading is performed using the critical word first method, and when the data arrives in the cache, the read data is returned to the CPU. The CPU continues to execute the next process while the cache line of data is being read. When reading of one line of data is completed, the tag corresponding to the effective address is recorded in the cache, the V bit is set to 1, and the U bit is set to 0.</td></tr>
<tr><td>No</td><td>0</td><td>-</td></tr>
<tr><td>No</td><td>1</td><td>0</td></tr>
<tr><td>No</td><td>1</td><td>1</td><td>Cache miss (with write-back)</td><td>The tag and data field of the cache line indexed by effective address bits [13:5] are saved in the write-back buffer. Then, data from the external memory space corresponding to the effective address is written into the cache line. Data reading is performed using the critical word first method, and when the data arrives in the cache, the read data is returned to the CPU. The CPU continues to execute the next process while the cache line of data is being read. When reading of one line of data is completed, the tag corresponding to the effective address is recorded in the cache, the V bit is set to 1, and the U bit is set to 0. The data in the write-back buffer is then written back to the external memory.</td></tr>
</tbody>
</table>
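
The lookup in steps 1 and 2 can be modelled on a host as in the sketch below. The type and function names are hypothetical, and the model assumes, as in the copy-back write case, that a write-back is required only when the line being replaced is both valid and dirty (V = 1 and U = 1).

```c
#include <stdint.h>

enum oc_read_result {
    OC_HIT,
    OC_MISS_NO_WRITE_BACK,
    OC_MISS_WRITE_BACK
};

struct oc_line {
    uint32_t tag;    /* 19-bit tag */
    unsigned v, u;   /* validity and dirty bits */
};

/* Entry index from effective address bits [13:5] (OIX = 0). */
unsigned oc_index(uint32_t ea)  { return (ea >> 5) & 0x1FFu; }

/* Tag from physical address bits [28:10]. */
uint32_t oc_tag(uint32_t pa)    { return (pa >> 10) & 0x7FFFFu; }

/* Byte offset within the 32-byte line, bits [4:0]. */
unsigned oc_offset(uint32_t ea) { return ea & 0x1Fu; }

/* Classify a read of physical address pa against the indexed line. */
enum oc_read_result oc_classify_read(const struct oc_line *line, uint32_t pa)
{
    if (line->v && line->tag == oc_tag(pa))
        return OC_HIT;
    if (line->v && line->u)
        return OC_MISS_WRITE_BACK;   /* dirty victim is saved first */
    return OC_MISS_NO_WRITE_BACK;
}
```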
4.3.3 Write operation

When the OC is enabled (CCR.OCE = 1) and data is written by means of an effective address to a cacheable area, the cache operates as follows:

1. The tag, V bit, and U bit are read from the cache line indexed by effective address bits [13:5].

2. The tag is compared with bits [28:10] of the address resulting from effective address translation by the MMU. In copy-back mode, operation is as described in Table 22; in write-through mode, it is as described in Table 23.

<table>
<thead>
<tr>
<th>Tag match</th>
<th>V bit</th>
<th>U bit</th>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr><td>Yes</td><td>1</td><td>-</td><td>Cache hit (copy-back)</td><td>A data write for the data indexed by bits [4:0] is performed, in accordance with the access size (quadword/longword/word/byte).</td></tr>
<tr><td>Yes</td><td>0</td><td>-</td><td rowspan="3">Cache miss (no copy-back/write-back)</td><td rowspan="3">A data write for the data indexed by bits [4:0] is performed, in accordance with the access size (quadword/longword/word/byte). Then, data from the external memory corresponding to the effective address is read into the cache line. Data reading is performed using the critical word first method, and one cache line of data is read, excluding the written data. The CPU continues to execute the next process while the cache line of data is being read. When reading of one line of data is completed, the tag corresponding to the effective address is recorded in the cache, and the V bit and U bit are both set to 1.</td></tr>
<tr><td>No</td><td>0</td><td>-</td></tr>
<tr><td>No</td><td>1</td><td>0</td></tr>
<tr><td>No</td><td>1</td><td>1</td><td>Cache miss (with copy-back/write-back)</td><td>The tag and data field of the cache line indexed by effective address bits [13:5] are first saved in the write-back buffer. Then, a data write for the data indexed by bits [4:0] is performed, in accordance with the access size (quadword/longword/word/byte). Data from the external memory space corresponding to the effective address is read into the cache line. Data reading is performed using the critical word first method, and one cache line of data is read, excluding the written data. The CPU continues to execute the next process while the cache line of data is being read. When reading of one line of data is completed, the tag corresponding to the effective address is recorded in the cache, and the V bit and U bit are both set to 1. The data in the write-back buffer is then written back to external memory.</td></tr>
</tbody>
</table>

Table 22: OC write operation, with copy-back

<table>
<thead>
<tr>
<th>Tag match</th>
<th>V bit</th>
<th>U bit</th>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Yes</td>
<td>1</td>
<td>-</td>
<td>Cache-hit (write-through)</td>
<td>A data write for the data indexed by bits [4:0] is performed, in accordance with the access size (quadword/longword/word/byte). A write is also performed to the corresponding external memory.</td>
</tr>
<tr><td>Yes</td><td>0</td><td>-</td><td rowspan="4">Cache miss (write-through)</td><td rowspan="4">A write is performed to the external memory corresponding to the effective address. A write to cache is not performed.</td></tr>
<tr><td>No</td><td>0</td><td>-</td></tr>
<tr><td>No</td><td>1</td><td>0</td></tr>
<tr><td>No</td><td>1</td><td>1</td></tr>
</table>

Table 23: OC write operation, with write-through
4.3.4 Write-back buffer

The write-back buffer enables priority to be given to data reads, and improves performance. When a cache miss makes the purge of a dirty cache entry into external memory necessary, the cache entry is held in the write-back buffer. The write-back buffer contains one cache line of data and the physical address of the purge destination.

<table>
<thead>
<tr>
<th>Physical address bits [28:5]</th>
<th>LW0</th>
<th>LW1</th>
<th>LW2</th>
<th>LW3</th>
<th>LW4</th>
<th>LW5</th>
<th>LW6</th>
<th>LW7</th>
</tr>
</thead>
</table>

Figure 20: Configuration of write-back buffer

4.3.5 Write-through buffer

When writing data in write-through mode or writing to a non-cacheable area, data is held in a 64-bit buffer. This allows the CPU to proceed to the next operation as soon as the write to the write-through buffer is completed, without waiting for completion of the write to external memory.

<table>
<thead>
<tr>
<th>Physical address bits [28:0]</th>
<th>LW0</th>
<th>LW1</th>
</tr>
</thead>
</table>

Figure 21: Configuration of write-through buffer

4.3.6 RAM mode

**SH-4 100 series**

Setting CCR.ORA to 1 enables 8 kbytes of the operand cache to be used as RAM. The operand cache entries used as RAM are entries 128 to 255 and 384 to 511. Other entries can still be used as cache. RAM can be accessed using addresses 0x7C00 0000 to 0x7FFF FFFF. Byte-, word-, longword-, and quadword-size data reads and writes can be performed in the operand cache RAM area. Instruction fetches cannot be performed in this area.

**Note:** On the SH4-202, RAM mode cannot be used in conjunction with OC index mode even when in compatibility mode.

An example of RAM use is shown below. Here, the 4 kbytes comprising OC entries 128 to 255 are designated as RAM area 1, and the 4 kbytes comprising OC entries 384 to 511 as RAM area 2.

- When OC index mode is off (CCR.OIX = 0):

<table>
<thead>
<tr>
<th>Address start</th>
<th>Address end</th>
<th>Size</th>
<th>RAM area</th>
</tr>
</thead>
<tbody>
<tr>
<td>0x7C00 0000</td>
<td>0x7C00 0FFF</td>
<td>4-kbytes</td>
<td>1</td>
</tr>
<tr>
<td>0x7C00 1000</td>
<td>0x7C00 1FFF</td>
<td></td>
<td>1</td>
</tr>
<tr>
<td>0x7C00 2000</td>
<td>0x7C00 2FFF</td>
<td></td>
<td>2</td>
</tr>
<tr>
<td>0x7C00 3000</td>
<td>0x7C00 3FFF</td>
<td></td>
<td>2</td>
</tr>
<tr>
<td>0x7C00 4000</td>
<td>0x7C00 4FFF</td>
<td></td>
<td>1<sup>a</sup></td>
</tr>
</tbody>
</table>

Table 24: RAM use when OC index mode is off

<sup>a</sup> RAM areas 1 and 2 then repeat every 8 kbytes up to 0x7FFF FFFF.

Thus, to secure a continuous 8-kbyte RAM area, the area from 0x7C00 1000 to 0x7C00 2FFF can be used, for example.

- When OC index mode is on (CCR.OIX = 1):

<table>
<thead>
<tr>
<th>Address start</th>
<th>Address end</th>
<th>Size</th>
<th>RAM area</th>
</tr>
</thead>
<tbody>
<tr>
<td>0x7C00 0000</td>
<td>0x7C00 0FFF</td>
<td>4-kbytes</td>
<td>1</td>
</tr>
<tr>
<td>0x7C00 1000</td>
<td>0x7C00 1FFF</td>
<td></td>
<td>1</td>
</tr>
<tr>
<td>0x7C00 2000</td>
<td>0x7C00 2FFF</td>
<td></td>
<td>1</td>
</tr>
<tr>
<td>...</td>
<td>...</td>
<td></td>
<td>1</td>
</tr>
<tr>
<td>0x7DFF F000</td>
<td>0x7DFF FFFF</td>
<td></td>
<td>1</td>
</tr>
<tr>
<td>0x7E00 0000</td>
<td>0x7E00 0FFF</td>
<td></td>
<td>2</td>
</tr>
<tr>
<td>0x7E00 1000</td>
<td>0x7E00 1FFF</td>
<td></td>
<td>2</td>
</tr>
<tr>
<td>...</td>
<td>...</td>
<td></td>
<td>2</td>
</tr>
<tr>
<td>0x7FFF F000</td>
<td>0x7FFF FFFF</td>
<td></td>
<td>2</td>
</tr>
</tbody>
</table>

Table 25: RAM use when OC index mode is on

As the distinction between RAM areas 1 and 2 is indicated by address bit [25], the area from 0x7DFF F000 to 0x7E00 0FFF should be used to secure a continuous 8-kbyte RAM area.
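
The two address maps for the SH-4 100 series can be summarized in code. In this sketch the function names are hypothetical; the area selection follows Tables 24 and 25 (address bit [13] when OIX = 0, bit [25] when OIX = 1), while the mapping of address bits [11:5] to an entry within the 4-kbyte area is an assumption.

```c
#include <stdint.h>

/* RAM area (1 or 2) for an address in 0x7C000000 to 0x7FFFFFFF. */
unsigned oc_ram_area(uint32_t addr, int oix)
{
    unsigned sel = oix ? (addr >> 25) & 1u : (addr >> 13) & 1u;
    return sel + 1u;
}

/* OC entry backing the address: area 1 is entries 128 to 255,
 * area 2 is entries 384 to 511 (entry offset assumed from bits [11:5]). */
unsigned oc_ram_entry(uint32_t addr, int oix)
{
    unsigned base = (oc_ram_area(addr, oix) == 1u) ? 128u : 384u;
    return base + ((addr >> 5) & 0x7Fu);
}
```

With OIX = 0, for example, 0x7C00 1000 falls in area 1 and 0x7C00 2000 in area 2, which is why the range 0x7C00 1000 to 0x7C00 2FFF gives a contiguous 8 kbytes.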

**RAM mode of SH4-202**

Setting CCR.ORA to 1 enables half of the operand cache to be used as RAM. The operand cache entries used as RAM are entries 256 to 511 in the compatible mode. The operand cache entries used as RAM are entries 256 to 511 of each way in the enhanced mode. Other entries can still be used as cache. RAM can be accessed using addresses 0x7C00 0000 to 0x7FFF FFFF. Byte-, word-, longword-, and quadword-size data reads and writes can be performed in the operand cache RAM area. Instruction fetches cannot be performed in this area.

Even when in compatibility mode, the OC index mode cannot be used in conjunction with RAM mode on a 200 series part.

**RAM mode address map of SH4-202**

An example of RAM use is shown below. Here, the 8 kbytes comprising OC entries 256 to 511 of way 0 are designated as RAM area 1, and the 8 kbytes comprising OC entries 256 to 511 of way 1 as RAM area 2.

In the compatible mode (CCR.EMODE = 0)

<table>
<thead>
<tr>
<th>Address start</th>
<th>Address end</th>
<th>Size</th>
<th>RAM area</th>
</tr>
</thead>
<tbody>
<tr>
<td>0x7C00 0000</td>
<td>0x7C00 1FFF</td>
<td>8-Kbytes</td>
<td>256-511</td>
</tr>
<tr>
<td>0x7C00 2000</td>
<td>0x7C00 3FFF</td>
<td>8-Kbytes</td>
<td>256-511</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0x7FFF E000</td>
<td>0x7FFF FFFF</td>
<td>8-Kbytes</td>
<td>256-511</td>
</tr>
</tbody>
</table>

Table 26: Compatible mode

In the enhanced mode (CCR.EMODE = 1)

<table>
<thead>
<tr>
<th>Address start</th>
<th>Address end</th>
<th>Size</th>
<th>RAM area</th>
</tr>
</thead>
<tbody>
<tr>
<td>0x7C00 0000</td>
<td>0x7C00 1FFF</td>
<td>8-Kbytes</td>
<td>1</td>
</tr>
<tr>
<td>0x7C00 2000</td>
<td>0x7C00 3FFF</td>
<td>8-Kbytes</td>
<td>2</td>
</tr>
<tr>
<td>0x7C00 4000</td>
<td>0x7C00 5FFF</td>
<td>8-Kbytes</td>
<td>1</td>
</tr>
<tr>
<td>0x7C00 6000</td>
<td>0x7C00 7FFF</td>
<td>8-Kbytes</td>
<td>2</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>0x7FFF C000</td>
<td>0x7FFF FFFF</td>
<td>16-Kbytes</td>
<td></td>
</tr>
</tbody>
</table>

Table 27: Enhanced mode
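
The enhanced-mode rows listed above alternate between RAM area 1 and RAM area 2 every 8 kbytes. The sketch below assumes that this alternation is governed by address bit [13] throughout the listed 8-kbyte rows; this is inferred from the table, not stated in the text, and the function names are hypothetical.

```c
#include <stdint.h>

/* RAM area (1 or 2) in enhanced RAM mode, ASSUMING address bit [13]
 * selects the area as the 8-kbyte rows of the table suggest. */
static int oc_ram_area_enhanced(uint32_t addr)
{
    return (int)((addr >> 13) & 1u) + 1;
}

/* Byte offset within the selected 8-kbyte RAM area. */
static uint32_t oc_ram_offset(uint32_t addr)
{
    return addr & 0x1FFFu;
}
```
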
4.3.7 OC index mode

OC index mode is only available on the SH-4 100 series or when a 200 series part is used in compatibility mode and RAM mode is not being used.

In normal mode, with CCR.OIX cleared to 0, OC indexing is performed using bits [13:5] of the effective address. Using index mode, with CCR.OIX set to 1, allows the OC to be handled as two 8-kbyte areas, by means of effective address bit [25]. This partitioning makes it possible for the software to make more efficient use of the cache.
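
The two indexing modes can be modelled as follows. This is a sketch only: the text states that bit [25] partitions the cache into two 8-kbyte areas, and the code assumes this means bit [25] takes the place of bit [13] as the top index bit. The function name is hypothetical.

```c
#include <stdint.h>

/* Entry index (0 to 511) into a 16-kbyte operand cache with 32-byte lines.
 * oix == 0: index is effective address bits [13:5].
 * oix != 0: ASSUMED that bit [25] replaces bit [13] as the top index bit. */
static unsigned oc_entry_index(uint32_t ea, int oix)
{
    unsigned low = (ea >> 5) & 0xFFu;                       /* bits [12:5] */
    unsigned top = oix ? (ea >> 25) & 1u : (ea >> 13) & 1u; /* area select */
    return (top << 8) | low;
}
```

Under this assumption, data placed in regions that differ in bit [25] never compete for the same cache entries, which is how software can reserve half of the cache for a given data set.
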

4.3.8 Coherency between cache and external memory

Coherency between cache and external memory should be assured by software. In the SH-4 CPU core, the following four new instructions are supported for cache operations. Details of these instructions are given in the Instruction Descriptions chapter.

- Invalidate instruction: OCBI @Rn Cache invalidation (no write-back)
- Purge instruction: OCBP @Rn Cache invalidation (with write-back)
- Write-back instruction: OCBWB @Rn Cache write-back
- Allocate instruction: MOVCA.L R0,@Rn Cache allocation
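
As an example of software-managed coherency, the sketch below writes back every cache line covering a buffer, as might be done before an external bus master reads it. It assumes a GCC-style toolchain that defines `__SH4__` and accepts inline assembly; on any other host the instruction is replaced by a no-op so that the line arithmetic can still be exercised. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 32u  /* operand cache line size in bytes */

/* Write back one operand cache line (OCBWB @Rn); no-op on non-SH-4 hosts. */
static inline void ocbwb_line(uintptr_t line)
{
#if defined(__SH4__)
    __asm__ volatile ("ocbwb @%0" : : "r" (line) : "memory");
#else
    (void)line;
#endif
}

/* Write back all lines overlapping [start, start + len).
 * Returns the number of lines processed. */
static size_t oc_writeback_range(uintptr_t start, size_t len)
{
    size_t n = 0;
    if (len == 0)
        return 0;
    uintptr_t line = start & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t last = (start + len - 1) & ~(uintptr_t)(CACHE_LINE - 1);
    for (;;) {
        ocbwb_line(line);
        n++;
        if (line == last)
            break;
        line += CACHE_LINE;
    }
    return n;
}
```

The same loop structure would apply to OCBP or OCBI; only the instruction in the helper changes.
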

4.3.9 Prefetch operation

The SH-4 CPU core supports a prefetch instruction, to reduce the cache fill penalty incurred as the result of a cache miss. If it is known that a cache miss will result from a read or write operation, it can be prevented by using the prefetch instruction to fill the cache with data before the operation, and so improve software performance. If a prefetch instruction is executed for data already held in the cache, or if the prefetch address results in a UTLB miss or a protection violation, the result is no operation, and an exception is not generated. Details of the prefetch instruction are given in the Instruction Descriptions chapter.

Prefetch instruction: PREF @Rn
4.4 Instruction cache (IC)

4.4.1 Configuration

Figure 22 shows the configuration of the instruction cache for SH4-103, while Figure 23 shows the IC for the SH4-202.

Figure 22: Configuration of instruction cache on the SH4-103 (and SH4-202 in compatibility mode)
The instruction cache for the SH4-103 consists of 256 cache lines, each composed of a 19-bit tag, V bit, and 32-byte data (16 instructions).

The instruction cache for the SH4-202 consists of 256 cache lines/way, each composed of a 19-bit tag, V bit, and 32-byte data (16 instructions).

- **Tag**
  
  Stores the upper 19 bits of the 29-bit external memory address of the data line to be cached. The tag is not initialized by a power-on or manual reset.

- **V bit (validity bit)**
  
  Setting this bit to 1 indicates that valid data is stored in the cache line. The V bit is initialized to 0 by a power-on reset, but retains its value in a manual reset.
- **Data array**

  The data field holds 32 bytes (256 bits) of data per cache line. The data array is not initialized by a power-on or manual reset.

- **LRU (SH-4 200 series only)**

  When a 200 series SH-4 is operating in enhanced mode, an additional state bit is used to keep track of which of the two ways in each cache set was least recently used (LRU). These additional LRU bits cannot be read or written by software.

### 4.4.2 Read operation

When the IC is enabled (CCR.ICE = 1), and instruction fetches are performed by means of an effective address from a cacheable area, the instruction cache operates as follows:

1. The tag and V bit are read from the cache line indexed by effective address bits [12:5].
2. The tag is compared with bits [28:10] of the address resulting from effective address translation by the MMU:

<table>
<thead>
<tr>
<th>Tag</th>
<th>V bit</th>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Matches</td>
<td>1</td>
<td>Cache hit</td>
<td>Data indexed by effective address bits [4:2] is read as an instruction.</td>
</tr>
<tr>
<td>Matches</td>
<td>0</td>
<td>Cache miss</td>
<td>Data is read into the cache line from the external memory space corresponding to the effective address. The data is read using the critical-word-first method, and when the data arrives in the cache, it is returned to the CPU as an instruction. When reading of one line of data is completed, the tag corresponding to the effective address is recorded in the cache, and 1 is written to the V bit.</td>
</tr>
<tr>
<td>Does not match</td>
<td>0</td>
<td>Cache miss</td>
<td>Same operation as for a matching tag with the V bit at 0.</td>
</tr>
<tr>
<td>Does not match</td>
<td>1</td>
<td>Cache miss</td>
<td>Same operation as for a matching tag with the V bit at 0.</td>
</tr>
</tbody>
</table>

*Table 28: IC read operation*
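
The lookup described above can be modelled in a few lines for the direct-mapped (SH4-103) case. This is an illustrative software model, not a description of the hardware implementation; the type and function names are hypothetical, and the line fill itself is reduced to recording the tag and setting V.

```c
#include <stdint.h>

#define IC_LINES 256u

struct ic_line {
    uint32_t tag;   /* physical address bits [28:10] */
    unsigned v;     /* validity bit */
};

/* Index from effective address bits [12:5]. */
static unsigned ic_index(uint32_t ea) { return (ea >> 5) & 0xFFu; }

/* 19-bit tag from translated (physical) address bits [28:10]. */
static uint32_t ic_tag(uint32_t pa) { return (pa >> 10) & 0x7FFFFu; }

/* Returns 1 on a cache hit, 0 on a miss. On a miss the line is refilled:
 * the new tag is recorded and V is set to 1. */
static int ic_fetch(struct ic_line ic[IC_LINES], uint32_t ea, uint32_t pa)
{
    struct ic_line *l = &ic[ic_index(ea)];
    if (l->v && l->tag == ic_tag(pa))
        return 1;
    l->tag = ic_tag(pa);
    l->v = 1;
    return 0;
}
```
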

### 4.4.3 IC index mode

IC index mode is only available on the SH-4 100 series or when a 200 series part is used in compatibility mode.
In normal mode, with CCR.IIX cleared to 0, IC indexing is performed using bits [12:5] of the effective address. Using index mode, with CCR.IIX set to 1, allows the IC to be handled as two 4-kbyte areas by means of effective address bit [25]. This provides efficient use of the cache.

4.5 Memory-mapped cache configuration

To enable the IC and OC to be managed by software, IC content can be read and written by a P2 region program, using a MOV instruction in privileged mode. Behavior is undefined if access is made from a program in another region. After such an access, a branch to the P0, U0, P1, or P3 regions should be made at least 8 instructions after the MOV instruction.

The OC content can be read and written by a P1 or P2 region program, using a MOV instruction in privileged mode. Behavior is undefined if access is made from a program in another region. After such an access, a branch to the P0, U0, or P3 regions should be made at least 8 instructions after the MOV instruction.

The IC and OC are allocated to the P4 region in physical memory space. Only longword data accesses can be used on the IC address array and data array, and on the OC address array and data array. Instruction fetches cannot be performed in these regions. For reserved bits, a write value of 0 should be specified; their read value is undefined.

4.5.1 IC address array

The IC address array is allocated to addresses 0xF000 0000 to 0xF0FF FFFF in the P4 region. An address array access requires a 32-bit address field specification (when reading or writing), and a 32-bit data field specification. The entry to be accessed is specified in the address field, and the write tag and V bit are specified in the data field.

In the address field, bits [31:24] have the value 0xF0 indicating the IC address array, and the entry is specified by bits [12:5]. CCR.IIX has no effect on this entry specification. Address field bit [3], the association bit (A bit), specifies whether or not association is performed when writing to the IC address array. As only longword access is used, 0 should be specified for address field bits [1:0].

In the data field, the tag is indicated by bits [31:10], and the V bit by bit [0]. As the IC address array tag is 19 bits in length, data field bits [31:29] are not used in the case of a write in which association is not performed. Data field bits [31:29] are used for the virtual address specification only in the case of a write in which association is performed.

The following three kinds of operation can be used on the IC address array:

1. IC address array read
   The tag and V bit are read into the data field from the IC entry corresponding to the entry set in the address field. In a read, associative operation is not performed, regardless of whether the association bit specified in the address field is 1 or 0.

2. IC address array write (non-associative)
   The tag and V bit specified in the data field are written to the IC entry corresponding to the entry set in the address field. The A bit in the address field should be cleared to 0.

3. IC address array write (associative)
   When a write is performed with the A bit in the address field set to 1, the tag stored in the entry specified in the address field is compared with the tag specified in the data field. If the MMU is enabled at this time, comparison is performed after the virtual address specified by data field bits [31:10] has been translated to a physical address using the ITLB. If the addresses match and the V bit is 1, the V bit specified in the data field is written into the IC entry. In other cases, no operation is performed. This operation is used to invalidate a specific IC entry. If an ITLB miss occurs during address translation, or the comparison shows a mismatch, an exception is not generated, no operation is performed, and the write is not executed. If an instruction TLB multiple hit exception occurs during address translation, processing switches to the instruction TLB multiple hit exception handling routine.

Figure 24: Memory-mapped IC address array
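
The address and data field layouts described in this section can be composed as in the sketch below. The macro and function names are hypothetical; the actual longword store to the resulting address must still be issued from a P2 region program in privileged mode, which is not shown.

```c
#include <stdint.h>

#define IC_ADDR_ARRAY_BASE 0xF0000000u  /* address field bits [31:24] = 0xF0 */
#define ADDR_ARRAY_A_BIT   (1u << 3)    /* association bit */

/* Address field for the IC entry that would hold address ea.
 * assoc != 0 requests an associative write. */
static uint32_t ic_addr_array_address(uint32_t ea, int assoc)
{
    uint32_t entry = (ea >> 5) & 0xFFu;             /* bits [12:5] */
    return IC_ADDR_ARRAY_BASE | (entry << 5) |
           (assoc ? ADDR_ARRAY_A_BIT : 0u);
}

/* Data field: tag in bits [31:10], V bit in bit [0]. */
static uint32_t ic_addr_array_data(uint32_t addr, unsigned v)
{
    return (addr & 0xFFFFFC00u) | (v & 1u);
}
```

To invalidate the line holding a given address, the associative form would be used with V = 0: a longword store of `ic_addr_array_data(ea, 0)` to `ic_addr_array_address(ea, 1)`.
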
4.5.4 IC data array

The IC data array is allocated to addresses 0xF100 0000 to 0xF1FF FFFF in the P4 region. A data array access requires a 32-bit address field specification (when reading or writing), and a 32-bit data field specification. The entry to be accessed is specified in the address field, and the longword data to be written is specified in the data field.

In the address field, bits [31:24] have the value 0xF1 indicating the IC data array, and the entry is specified by bits [12:5]. CCR.IIX has no effect on this entry specification. Address field bits [4:2] are used for the longword data specification in the entry. As only longword access is used, 0 should be specified for address field bits [1:0].

The data field is used for the longword data specification.

The following two kinds of operation can be used on the IC data array:

1. IC data array read
   The longword selected by the longword specification bits in the address field is read into the data field from the IC entry specified in the address field.

2. IC data array write
   The longword data specified in the data field is written to the longword selected by the longword specification bits in the address field, in the IC entry specified in the address field.

Figure 25: Memory-mapped IC data array
4.5.5 OC address array

The OC address array is allocated to addresses 0xF400 0000 to 0xF4FF FFFF in the P4 region. An address array access requires a 32-bit address field specification (when reading or writing), and a 32-bit data field specification. The entry to be accessed is specified in the address field, and the write tag, U bit, and V bit are specified in the data field.

In the address field, bits [31:24] have the value 0xF4 indicating the OC address array, and the entry is specified by bits [13:5]. CCR.OIX and CCR.ORA have no effect on this entry specification. Address field bit [3], the association bit (A bit), specifies whether or not association is performed when writing to the OC address array. As only longword access is used, 0 should be specified for address field bits [1:0].

In the data field, the tag is indicated by bits [31:10], the U bit by bit [1], and the V bit by bit [0]. As the OC address array tag is 19 bits in length, data field bits [31:29] are not used in the case of a write in which association is not performed. Data field bits [31:29] are used for the virtual address specification only in the case of a write in which association is performed.

The following three kinds of operation can be used on the OC address array:

1. OC address array read
   
   The tag, U bit, and V bit are read into the data field from the OC entry corresponding to the entry set in the address field. In a read, associative operation is not performed, regardless of whether the association bit specified in the address field is 1 or 0.

2. OC address array write (non-associative)
   
   The tag, U bit, and V bit specified in the data field are written to the OC entry corresponding to the entry set in the address field. The A bit in the address field should be cleared to 0.

   When a write is performed to a cache line for which the U bit and V bit are both 1, after write-back of that cache line, the tag, U bit, and V bit specified in the data field are written.
3. OC address array write (associative)

   When a write is performed with the A bit in the address field set to 1, the tag stored in the entry specified in the address field is compared with the tag specified in the data field. If the MMU is enabled at this time, comparison is performed after the virtual address specified by data field bits [31:10] has been translated to a physical address using the UTLB. If the addresses match and the V bit is 1, the U bit and V bit specified in the data field are written into the OC entry. In other cases, no operation is performed. This operation is used to invalidate a specific OC entry. If the OC entry U bit is 1, and 0 is written to the V bit or to the U bit, write-back is performed. If a UTLB miss occurs during address translation, or the comparison shows a mismatch, an exception is not generated, no operation is performed, and the write is not executed. If a data TLB multiple hit exception occurs during address translation, processing switches to the data TLB multiple hit exception handling routine.

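<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Contents</th>
</tr>
</thead>
<tbody>
<tr>
<td>Address field</td>
<td>[31:24]</td>
<td>0xF4 (OC address array)</td>
</tr>
<tr>
<td></td>
<td>[13:5]</td>
<td>Entry</td>
</tr>
<tr>
<td></td>
<td>[3]</td>
<td>A (association bit)</td>
</tr>
<tr>
<td></td>
<td>[1:0]</td>
<td>0</td>
</tr>
<tr>
<td>Data field</td>
<td>[31:10]</td>
<td>Tag</td>
</tr>
<tr>
<td></td>
<td>[1]</td>
<td>U</td>
</tr>
<tr>
<td></td>
<td>[0]</td>
<td>V</td>
</tr>
</tbody>
</table>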

**Figure 26: Memory-mapped OC address array**
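
A minimal sketch of how software might form an OC address array address and unpack the longword read back from it, following the field layout described in this section. The structure and function names are hypothetical.

```c
#include <stdint.h>

#define OC_ADDR_ARRAY_BASE 0xF4000000u  /* address field bits [31:24] = 0xF4 */

struct oc_tag_word {
    uint32_t tag;   /* 19-bit tag, data field bits [28:10] */
    unsigned u;     /* U bit, data field bit [1] */
    unsigned v;     /* V bit, data field bit [0] */
};

/* Address field for a given entry (0 to 511); assoc != 0 sets the A bit. */
static uint32_t oc_addr_array_address(unsigned entry, int assoc)
{
    return OC_ADDR_ARRAY_BASE | ((uint32_t)(entry & 0x1FFu) << 5) |
           (assoc ? (1u << 3) : 0u);
}

/* Unpack the data field returned by an OC address array read. */
static struct oc_tag_word oc_decode(uint32_t data)
{
    struct oc_tag_word w;
    w.tag = (data >> 10) & 0x7FFFFu;
    w.u = (data >> 1) & 1u;
    w.v = data & 1u;
    return w;
}
```
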

4.5.6 OC data array

The OC data array is allocated to addresses 0xF500 0000 to 0xF5FF FFFF in the P4 region. A data array access requires a 32-bit address field specification (when reading or writing), and a 32-bit data field specification. The entry to be accessed is specified in the address field, and the longword data to be written is specified in the data field.

In the address field, bits [31:24] have the value 0xF5 indicating the OC data array, and the entry is specified by bits [13:5]. CCR.OIX and CCR.ORA have no effect on this entry specification. Address field bits [4:2] are used for the longword data specification in the entry. As only longword access is used, 0 should be specified for address field bits [1:0].
The data field is used for the longword data specification.

The following two kinds of operation can be used on the OC data array:

1. **OC data array read**

   The longword selected by the longword specification bits in the address field is read into the data field from the OC entry specified in the address field.

2. **OC data array write**

   The longword data specified in the data field is written to the longword selected by the longword specification bits in the address field, in the OC entry specified in the address field. This write does not set the U bit to 1 on the address array side.

**Figure 27: Memory-mapped OC data array**

### Memory-mapped OC configuration in the enhanced mode (SH4-202)

- **Normal mode (0xF500 0000 to 0xF500 1FFF)**
  - (8 Kbyte): Corresponds to Way0 (entry 0 - 255)
  - (16 Kbyte): Corresponds to Way1 (entry 0 - 511)
  - The cache area then repeats every 16 kbytes up to 0xF5FF FFFF.

- **RAM mode (CCR.ORA =1)**
  - (8 Kbyte): Corresponds to Way0 (entry 0 - 255)
  - (16 Kbyte): Corresponds to Way1 (entry 0 - 511)
  - The cache area then repeats every 16 kbytes up to 0xF5FF FFFF.
4.6 Store queues

Two 32-byte store queues (SQs) are supported to perform high-speed writes to external memory. When not using the SQs, the low power dissipation power-down modes, in which SQ functions are stopped, can be used. The queue address control registers (QACR0 and QACR1) cannot be accessed while SQ functions are stopped. Refer to the product level documentation of clock and power management for the details on stopping SQ functions.

<table>
<thead>
<tr>
<th>Item</th>
<th>Store queues</th>
</tr>
</thead>
<tbody>
<tr>
<td>Capacity</td>
<td>2 × 32 bytes</td>
</tr>
<tr>
<td>Addresses</td>
<td>0xE000 0000 to 0xE3FF FFFF</td>
</tr>
<tr>
<td>Write</td>
<td>Store instruction (1-cycle write)</td>
</tr>
<tr>
<td>Write-back</td>
<td>Prefetch instruction</td>
</tr>
<tr>
<td>Access right</td>
<td>MMU off: according to MMUCR.SQMD</td>
</tr>
<tr>
<td></td>
<td>MMU on: according to individual page PR</td>
</tr>
</tbody>
</table>

Table 29: Store queue features

4.6.1 SQ configuration

There are two 32-byte store queues, SQ0 and SQ1, as shown in Figure 28. These two store queues can be set independently.

Figure 28: Store queue configuration
4.6.2 SQ writes

A write to the SQs can be performed using a store instruction on P4 area 0xE000 0000 to 0xE3FF FFFC. A longword or quadword access size can be used. The meaning of the address bits is as follows:

<table>
<thead>
<tr>
<th>Bit Range</th>
<th>Value</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>[31:26]</td>
<td>111000</td>
<td>Store queue specification</td>
</tr>
<tr>
<td>[25:6]</td>
<td>Don’t care</td>
<td>Used for external memory transfer/access right</td>
</tr>
<tr>
<td>[5]</td>
<td>0/1</td>
<td>0: SQ0 specification 1: SQ1 specification</td>
</tr>
<tr>
<td>[4:2]</td>
<td>LW specification</td>
<td>Specifies longword position in SQ0/SQ1</td>
</tr>
<tr>
<td>[1:0]</td>
<td>00</td>
<td>Fixed at 0</td>
</tr>
</tbody>
</table>

4.6.3 SQ reads (implementation dependent)

A read from the SQs can be performed using a load instruction on P4 area 0xFF00 1000 to 0xFF00 103C. A longword access size must be used. The meaning of the address bits is as follows:

<table>
<thead>
<tr>
<th>Bit Range</th>
<th>Value</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>[31:6]</td>
<td>0xFF00 1000</td>
<td>Store queue specification</td>
</tr>
<tr>
<td>[5]</td>
<td>0/1</td>
<td>0: SQ0 specification 1: SQ1 specification</td>
</tr>
<tr>
<td>[4:2]</td>
<td>LW specification</td>
<td>Specifies longword position in SQ0/SQ1</td>
</tr>
<tr>
<td>[1:0]</td>
<td>00</td>
<td>Fixed at 0</td>
</tr>
</tbody>
</table>

4.6.4 Transfer to external memory

Transfer from the SQs to external memory can be performed with the prefetch instruction (PREF). Issuing a PREF instruction for an address in the range 0xE000 0000 to 0xE3FF FFFC in the P4 region starts a burst transfer from the SQs to external memory. The burst transfer has a fixed length of 32 bytes, and the start address must be at a 32-byte boundary. While the contents of one SQ are being transferred to external memory, the other SQ can be written to without incurring a penalty cycle. A write to the SQ being transferred to external memory is suspended until the transfer to external memory is completed.

The SQ transfer destination external address bit [28:0] specification is as shown below, according to whether the MMU is on or off.
• When MMU is on

The SQ area (0xE000 0000 to 0xE3FF FFFF) is set in VPN of the UTLB, and the transfer destination external address is set in PPN. The ASID, V, SZ, SH, PR, and D bits have the same meaning as for normal address translation, but the C and WT bits have no meaning with regard to this page.

When a prefetch instruction is issued for the SQ area, address translation is performed and external memory address bits [28:10] are generated in accordance with the SZ bit specification. For external address bits [9:5], the address prior to address translation is generated in the same way as when the MMU is off. External address bits [4:0] are fixed at 0. Transfer from the SQs to external memory is performed to this address.

If MMUCR.SQMD enables SQ access in privileged mode only, an address error will be flagged in user mode, even if address translation is successful.

• When MMU is off

The SQ area (0xE000 0000 to 0xE3FF FFFF) is specified as the address at which a prefetch is performed. The meaning of address bits [31:0] is as follows:

<table>
<thead>
<tr>
<th>Bit 31:26</th>
<th>Bit 25:6</th>
<th>Bit 5</th>
<th>Bit 4:2</th>
<th>Bit 1:0</th>
</tr>
</thead>
<tbody>
<tr>
<td>Store queue specification</td>
<td>External address bits [25:6]</td>
<td>0/1</td>
<td>Don’t care</td>
<td>00</td>
</tr>
</tbody>
</table>

External address bits [28:26], which cannot be generated from the above address, are generated from the QACR0/1 registers.

QACR0 [4:2]: External address bits [28:26] corresponding to SQ0
QACR1 [4:2]: External address bits [28:26] corresponding to SQ1

External address bits [4:0] are always fixed at 0 since burst transfer starts at a 32-byte boundary.
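
For the MMU-off case, the relationship between a destination address, the QACR value, and the SQ-area address used for the store and PREF instructions can be sketched as follows. The helper names are hypothetical; writing QACR0/QACR1 and issuing the stores and the PREF are left out.

```c
#include <stdint.h>

#define SQ_BASE 0xE0000000u  /* address bits [31:26] = 111000 */

/* QACR value whose bits [4:2] carry external address bits [28:26]. */
static uint32_t sq_qacr_value(uint32_t dest)
{
    return ((dest >> 26) & 7u) << 2;
}

/* SQ-area address for a 32-byte-aligned destination:
 * external address bits [25:5] are carried in the SQ address itself. */
static uint32_t sq_prefetch_address(uint32_t dest)
{
    return SQ_BASE | (dest & 0x03FFFFE0u);
}

/* Which store queue (0 or 1) an SQ-area address selects: address bit [5]. */
static unsigned sq_select(uint32_t sq_addr)
{
    return (sq_addr >> 5) & 1u;
}

/* External address generated for a PREF to sq_addr with the given QACR. */
static uint32_t sq_external_address(uint32_t sq_addr, uint32_t qacr)
{
    return (((qacr >> 2) & 7u) << 26) | (sq_addr & 0x03FFFFE0u);
}
```

Because bit [5] both selects the queue and forms part of the external address, consecutive 32-byte blocks alternate naturally between SQ0 and SQ1.
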
Determination of SQ access exception

Determination of an exception in a write to an SQ or transfer to external memory (PREF instruction) is performed as follows, according to whether the MMU is on or off. In the SH7751, if an exception occurs in an SQ write, the SQ contents may be corrupted. In the SH7751R, if an exception occurs in an SQ write, the SQ write access is cancelled and the data before the SQ write access is kept. If an exception occurs in transfer from an SQ to external memory, the transfer to external memory will be aborted.

• When MMU is on

  Operation is in accordance with the address translation information recorded in the UTLB and with MMUCR.SQMD. Write-type exception judgment is performed for writes to the SQs, and read-type exception judgment for transfers from the SQs to external memory (PREF instruction); a TLB miss exception, protection violation exception, or initial page write exception is generated accordingly. However, if MMUCR.SQMD enables SQ access in privileged mode only, an address error will be flagged in user mode even if address translation is successful.

• When MMU is off

  Operation is in accordance with MMUCR.SQMD.

  0: Privileged/user access possible

  1: Privileged access possible

  If the SQ area is accessed in user mode when MMUCR.SQMD is set to 1, an address error will be flagged.
5.1 Overview

The process of responding to an extraordinary event such as a reset, a general exception (trap) or an interrupt, is called exception handling.

Exception handling is performed by special user-supplied routines that are executed by the CPU when one of these extraordinary events is encountered.

5.2 Register descriptions

There are three registers related to exception handling. These are allocated to memory, and can be accessed by specifying the P4 address or Area 7 address.

<table>
<thead>
<tr>
<th>Name</th>
<th>Abbreviation</th>
<th>R/W</th>
<th>Initial value</th>
<th>P4 address</th>
<th>Area 7 address</th>
<th>Access size</th>
</tr>
</thead>
<tbody>
<tr>
<td>TRAPA exception register</td>
<td>TRA</td>
<td>R/W</td>
<td>Undefined</td>
<td>0xFF00 0020</td>
<td>0x1F00 0020</td>
<td>32</td>
</tr>
<tr>
<td>Exception event register</td>
<td>EXPEVT</td>
<td>R/W</td>
<td>0x0000 0000/0x0000 0020</td>
<td>0xFF00 0024</td>
<td>0x1F00 0024</td>
<td>32</td>
</tr>
<tr>
<td>Interrupt event register</td>
<td>INTEVT</td>
<td>R/W</td>
<td>Undefined</td>
<td>0xFF00 0028</td>
<td>0x1F00 0028</td>
<td>32</td>
</tr>
</tbody>
</table>

Table 30: Exception-related registers

a. 0x0000 0000 is set in a power-on reset, and 0x0000 0020 in a manual reset.

b. This is the address when using the virtual/physical address space P4 area. When making an access from physical address space area 7 using the TLB, the upper 3 bits of the address are ignored.
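
The relationship between the P4 and Area 7 addresses in Table 30 can be expressed directly: clearing the upper 3 address bits of the P4 address gives the Area 7 address. The macro and function names below are hypothetical, chosen for illustration.

```c
#include <stdint.h>

/* P4 addresses of the exception-related registers (from Table 30). */
#define TRA_P4    0xFF000020u
#define EXPEVT_P4 0xFF000024u
#define INTEVT_P4 0xFF000028u

/* Area 7 address corresponding to a P4 register address:
 * the upper 3 bits of the address are dropped. */
static uint32_t p4_to_area7(uint32_t p4_addr)
{
    return p4_addr & 0x1FFFFFFFu;
}
```
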
5.2.1 Exception event register (EXPEVT)

The exception event register (EXPEVT) resides at P4 address 0xFF00 0024, and contains a 12-bit exception code. The exception code set in EXPEVT is that for a reset or general exception event. The exception code is set automatically by hardware when an exception occurs. EXPEVT can also be modified by software.

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>Exception code</td>
<td>[0,11]</td>
<td>12</td>
<td>Exception code</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Exception code set automatically by hardware when exception occurs.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
<tr>
<td>RES</td>
<td>[12,31]</td>
<td>20</td>
<td>Bits reserved</td>
<td>RW</td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
</tbody>
</table>

Table 31: EXPEVT Register Description

5.2.2 Interrupt event register (INTEVT)

The interrupt event register (INTEVT) resides at P4 address 0xFF00 0028, and contains a 12-bit exception code. The exception code set in INTEVT is that for an interrupt request. The exception code is set automatically by hardware when an exception occurs. INTEVT can also be modified by software.

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>Exception code</td>
<td>[0,11]</td>
<td>12</td>
<td>Exception code</td>
<td>RW</td>
</tr>
<tr>
<td>Operation</td>
<td></td>
<td></td>
<td>Exception code set automatically by hardware when exception occurs.</td>
<td></td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
</tbody>
</table>

Table 32: INTEVT Register Description
5.2.3 TRAPA exception register (TRA)

The TRAPA exception register (TRA) resides at P4 address 0xFF00 0020. TRA is set automatically by hardware when a TRAPA instruction is executed. TRA can also be modified by software.

<table>
<thead>
<tr>
<th>Field</th>
<th>Bits</th>
<th>Size</th>
<th>Synopsis</th>
<th>Type</th>
</tr>
</thead>
<tbody>
<tr>
<td>Imm</td>
<td>[2,9]</td>
<td>8</td>
<td>8-bit immediate data for the TRAPA instruction.</td>
<td>RW</td>
</tr>
<tr>
<td>RES</td>
<td>[0,1], [10,31]</td>
<td>24</td>
<td>Bits reserved</td>
<td>RW</td>
</tr>
<tr>
<td>Power-on reset</td>
<td></td>
<td></td>
<td>Undefined</td>
<td></td>
</tr>
</tbody>
</table>

Table 33: TRA
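
Since the immediate occupies bits [2,9] of TRA, a trap handler would recover the TRAPA operand by shifting right by 2 and masking to 8 bits. A minimal sketch, with hypothetical function names:

```c
#include <stdint.h>

/* Extract the 8-bit TRAPA immediate from a TRA register value. */
static unsigned tra_imm(uint32_t tra)
{
    return (unsigned)((tra >> 2) & 0xFFu);
}

/* TRA value corresponding to a TRAPA immediate (reserved bits written as 0). */
static uint32_t tra_from_imm(unsigned imm)
{
    return ((uint32_t)imm & 0xFFu) << 2;
}
```
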
5.3 Exception handling functions

5.3.1 Exception handling flow

In exception handling, the contents of the program counter (PC), status register (SR) and R15 are saved in the saved program counter (SPC), saved status register (SSR) and saved general register (SGR). The CPU starts execution of the appropriate exception handling routine according to the vector address. An exception handling routine is a program the user writes to handle a specific exception. The exception handling routine is terminated and control returned to the original program, by executing a return-from-exception instruction (RTE). This instruction restores the PC and SR contents, and returns control to the normal processing routine at the point at which the exception occurred. The SGR contents are not written back to R15 by an RTE instruction.

The basic processing flow is as follows. See section 2, Data Formats and Registers, for the meaning of the individual SR bits.

1. The PC, SR and R15 contents are saved in SPC, SSR and SGR.
2. The block bit (BL) in SR is set to 1.
3. The mode bit (MD) in SR is set to 1.
4. The register bank bit (RB) in SR is set to 1.
5. In a reset, the FPU disable bit (FD) in SR is cleared to 0.
6. The exception code is written to bits 11 to 0 of the exception event register (EXPEVT), or to bits 13 to 0 of the interrupt event register (INTEVT).
7. The CPU branches to the determined exception handling vector address, and the exception handling routine begins.

5.3.2 Exception handling vector addresses

The reset vector address is fixed at 0xA000 0000. Exception and interrupt vector addresses are determined by adding the offset for the specific event, to the vector base address, which is set by software in the vector base register (VBR). In the case of the TLB miss exception, for example, the offset is 0x0000 0400, so if 0x9C08 0000 is set in VBR, the exception handling vector address will be 0x9C08 0400. If a further exception occurs at the exception handling vector address, a duplicate exception will result, and recovery will be difficult; therefore, fixed physical addresses (P1, P2) should be specified for vector addresses.
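
The processing flow of section 5.3.1 and the vector address calculation above can be modelled in software as follows, for a general exception. This is an illustrative model only. The SR bit positions (MD = bit 30, RB = bit 29, BL = bit 28) are taken from the SH-4 status register layout, which is defined outside this section, and the structure and function names are hypothetical.

```c
#include <stdint.h>

#define SR_MD (1u << 30)  /* mode bit (assumed position, see lead-in) */
#define SR_RB (1u << 29)  /* register bank bit */
#define SR_BL (1u << 28)  /* block bit */

struct cpu_state {
    uint32_t pc, sr, r15;      /* live registers */
    uint32_t spc, ssr, sgr;    /* saved copies */
    uint32_t expevt, vbr;
};

/* Model of general exception entry: save context, set BL/MD/RB,
 * record the exception code, and branch to VBR + offset. */
static void enter_general_exception(struct cpu_state *c,
                                    uint32_t code, uint32_t offset)
{
    c->spc = c->pc;                    /* step 1 */
    c->ssr = c->sr;
    c->sgr = c->r15;
    c->sr |= SR_BL | SR_MD | SR_RB;    /* steps 2 to 4 */
    c->expevt = code & 0xFFFu;         /* step 6: bits 11 to 0 */
    c->pc = c->vbr + offset;           /* step 7 */
}
```

With VBR set to 0x9C08 0000 and a TLB miss offset of 0x400, the model branches to 0x9C08 0400, matching the example in the text.
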
5.4 Exception types and priorities

Table 34 shows the types of exceptions, with their relative priorities, vector addresses, and exception/interrupt codes.

<table>
<thead>
<tr>
<th>Exception category</th>
<th>Execution mode</th>
<th>Exception</th>
<th>Priority level</th>
<th>Priority order</th>
<th>Vector address</th>
<th>Offset</th>
<th>Exception code</th>
</tr>
</thead>
<tbody>
<tr>
<td>Reset</td>
<td>Abort type</td>
<td>POWERON</td>
<td>1</td>
<td>1</td>
<td>0xA000 0000</td>
<td>-</td>
<td>0x000</td>
</tr>
<tr>
<td></td>
<td></td>
<td>MANRESET</td>
<td>1</td>
<td>2</td>
<td>0xA000 0000</td>
<td>-</td>
<td>0x020</td>
</tr>
<tr>
<td></td>
<td></td>
<td>HUDIRESET</td>
<td>1</td>
<td>1</td>
<td>0xA000 0000</td>
<td>-</td>
<td>0x000</td>
</tr>
<tr>
<td></td>
<td></td>
<td>ITLBMULTIHIT</td>
<td>1</td>
<td>3</td>
<td>0xA000 0000</td>
<td>-</td>
<td>0x140</td>
</tr>
<tr>
<td></td>
<td></td>
<td>OTLBMULTIHIT</td>
<td>1</td>
<td>4</td>
<td>0xA000 0000</td>
<td>-</td>
<td>0x140</td>
</tr>
<tr>
<td>General exception</td>
<td>Re-execution type</td>
<td>UBRKBEFORE*1</td>
<td>2</td>
<td>0</td>
<td>(VBR/DBR)</td>
<td>0x100/-</td>
<td>0x1E0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>IADDERR</td>
<td>2</td>
<td>1</td>
<td>(VBR)</td>
<td>0x100</td>
<td>0x0E0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>ITLBMISS</td>
<td>2</td>
<td>2</td>
<td>(VBR)</td>
<td>0x400</td>
<td>0x040</td>
</tr>
<tr>
<td></td>
<td></td>
<td>EXECPROT</td>
<td>2</td>
<td>3</td>
<td>(VBR)</td>
<td>0x100</td>
<td>0x0A0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>RESINST</td>
<td>2</td>
<td>4</td>
<td>(VBR)</td>
<td>0x100</td>
<td>0x180</td>
</tr>
<tr>
<td></td>
<td></td>
<td>ILLSLOT</td>
<td>2</td>
<td>4</td>
<td>(VBR)</td>
<td>0x100</td>
<td>0x1A0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>FPUDIS</td>
<td>2</td>
<td>4</td>
<td>(VBR)</td>
<td>0x100</td>
<td>0x800</td>
</tr>
<tr>
<td></td>
<td></td>
<td>SLOTFPUDIS</td>
<td>2</td>
<td>4</td>
<td>(VBR)</td>
<td>0x100</td>
<td>0x820</td>
</tr>
<tr>
<td></td>
<td></td>
<td>RADDERR</td>
<td>2</td>
<td>5</td>
<td>(VBR)</td>
<td>0x100</td>
<td>0x0E0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>WADDERR</td>
<td>2</td>
<td>5</td>
<td>(VBR)</td>
<td>0x100</td>
<td>0x100</td>
</tr>
<tr>
<td></td>
<td></td>
<td>RTLBMISS</td>
<td>2</td>
<td>6</td>
<td>(VBR)</td>
<td>0x400</td>
<td>0x040</td>
</tr>
<tr>
<td></td>
<td></td>
<td>WTLBMISS</td>
<td>2</td>
<td>6</td>
<td>(VBR)</td>
<td>0x400</td>
<td>0x060</td>
</tr>
<tr>
<td></td>
<td></td>
<td>READPROT</td>
<td>2</td>
<td>7</td>
<td>(VBR)</td>
<td>0x100</td>
<td>0x0A0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>WRITEPROT</td>
<td>2</td>
<td>7</td>
<td>(VBR)</td>
<td>0x100</td>
<td>0x0C0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>FPUEXC</td>
<td>2</td>
<td>8</td>
<td>(VBR)</td>
<td>0x100</td>
<td>0x120</td>
</tr>
<tr>
<td></td>
<td></td>
<td>FIRSTWRITE</td>
<td>2</td>
<td>9</td>
<td>(VBR)</td>
<td>0x100</td>
<td>0x080</td>
</tr>
</tbody>
</table>

Table 34: Exceptions
5.5 Exception flow

5.5.1 Exception flow

*Figure 29* shows an outline flowchart of the basic operations in instruction execution and exception handling. For the sake of clarity, the following description assumes that instructions are executed sequentially, one by one. Register settings in the event of an exception are shown only for SSR, SPC, EXPEVT/INTEVT, SR, and PC, but other registers may be set automatically by hardware, depending on the exception. For details, see section 5.6, Description of Exceptions. Also, see section 5.6.4 for exception handling during execution of a delayed branch instruction and a delay slot instruction, and in the case of instructions in which two data accesses are performed.

Figure 29: Instruction execution and exception handling
5.5.2 Exception source acceptance

A priority ranking is provided for all exceptions, for use in determining which of two or more simultaneously generated exceptions should be accepted. Five of the general exceptions:

- general illegal instruction exception
- slot illegal instruction exception
- general FPU disable exception
- slot FPU disable exception
- unconditional trap exception

are detected in the process of instruction decoding, and do not occur simultaneously in the instruction pipeline. Therefore, these exceptions all have the same priority. General exceptions are detected in the order of instruction execution. However, exception handling is performed in the order of instruction flow (program order). Thus, an exception for an earlier instruction is accepted before that for a later instruction. An example of the order of acceptance for general exceptions is shown in Figure 30.
Figure 30: Example of general exception acceptance order
5.5.3 Exception requests and BL bit

When the BL bit in SR is 0, exceptions and interrupts are accepted.

When the BL bit in SR is 1 and an exception other than a user break is generated, the CPU's internal registers are set to their post-reset state, the registers of the other modules retain their contents prior to the exception, and the CPU branches to the same address as in a reset (0xA000 0000). For the operation in the event of a user break, see section 20: User Break Controller. If an ordinary interrupt occurs, the interrupt request is held pending, and is accepted after the BL bit has been cleared to 0 by software. If a nonmaskable interrupt (NMI) occurs, it can be held pending or accepted, according to the setting made by software.

Thus, normally, SPC and SSR are saved and then the BL bit in SR is cleared to 0, to enable multiple exception state acceptance.
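
The BL-bit rules above can be summarized as a small decision function. This is only an illustrative sketch: the type and function names are hypothetical, the `nmi_accept_when_blocked` flag stands in for the software setting mentioned in the text, and with BL = 0 an interrupt is still subject to the interrupt mask bits, which this sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event classes and outcomes; names are illustrative only. */
typedef enum { EV_GENERAL_EXCEPTION, EV_USER_BREAK, EV_INTERRUPT, EV_NMI } event_t;
typedef enum { ACT_ACCEPT, ACT_HOLD_PENDING, ACT_BRANCH_TO_RESET, ACT_SEE_UBC } action_t;

/* nmi_accept_when_blocked models the software setting for NMI when BL = 1. */
static action_t on_event(event_t ev, bool sr_bl, bool nmi_accept_when_blocked)
{
    if (!sr_bl)
        return ACT_ACCEPT;                  /* BL = 0: accepted (mask bits not modelled) */

    switch (ev) {
    case EV_GENERAL_EXCEPTION:
        return ACT_BRANCH_TO_RESET;         /* branch to 0xA0000000, as in a reset */
    case EV_USER_BREAK:
        return ACT_SEE_UBC;                 /* behaviour defined by the user break controller */
    case EV_INTERRUPT:
        return ACT_HOLD_PENDING;            /* held until software clears BL */
    case EV_NMI:
        return nmi_accept_when_blocked ? ACT_ACCEPT : ACT_HOLD_PENDING;
    }
    assert(0 && "unknown event class");
    return ACT_HOLD_PENDING;
}
```

For example, `on_event(EV_GENERAL_EXCEPTION, true, false)` yields `ACT_BRANCH_TO_RESET`, which is why handlers normally save SPC and SSR before doing anything that might fault.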

5.5.4 Return from exception handling

The RTE instruction is used to return from exception handling. When the RTE instruction is executed, the SPC contents are restored to PC, and the SSR contents to SR. The CPU returns from the exception handling routine by branching to the SPC address. If SPC and SSR were saved to external memory, set the BL bit in SR to 1 before restoring the SPC and SSR contents and issuing the RTE instruction.
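
The return sequence can be modelled in a few lines of C. This is a sketch under stated assumptions: `cpu_t` is a hypothetical register subset, `SR_BL` assumes that BL is bit 28 of SR (the bit position is not given in this section), and the delay slot of RTE is not modelled.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t pc, sr, spc, ssr; } cpu_t;   /* illustrative register subset */

#define SR_BL (1u << 28)    /* assumed bit position of SR.BL */

/* RTE: SPC is restored to PC and SSR to SR. */
static void rte(cpu_t *c)
{
    assert(c);
    c->pc = c->spc;
    c->sr = c->ssr;
}

/* Return path for a handler that saved SPC and SSR to memory. */
static void return_from_saved(cpu_t *c, uint32_t saved_spc, uint32_t saved_ssr)
{
    assert(c);
    c->sr |= SR_BL;         /* set BL = 1 before SPC and SSR are rewritten */
    c->spc = saved_spc;
    c->ssr = saved_ssr;
    rte(c);                 /* SR, including BL, now comes from the saved SSR */
}
```

Setting BL first matters because an exception accepted between the two restores would overwrite SPC and SSR with new values.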
5.6 Description of exceptions

The various exception handling operations are described here, covering exception sources, transition addresses, and the processor operation when a transition is made.

5.6.1 Resets

1 POWERON - Power-On Reset

- Sources:
  For details of how the core is driven to the power on reset state, refer to the System Architecture Manual of the appropriate product.

- Transition address: 0xA000 0000

- Transition operations:
  Exception code 0x000 is set in EXPEVT, initialization of VBR and SR is performed, and a branch is made to PC = 0xA000 0000. In the initialization processing, the VBR register is set to 0x0000 0000, and in SR, the MD, RB, and BL bits are set to 1, the FD bit is cleared to 0, and the interrupt mask bits (I3-I0) are set to 0xF.

CPU initialization is performed. For details of the impact on the rest of the system refer to the System Architecture Manual.

Refer to Appendix A for power-on reset values for the various CPU core modules set by the Initialize_Module function.

POWERON()
{
    Initialize_Module(PowerOn);
    EXPEVT = 0x00000000;
    VBR = 0x00000000;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    SR.(I0-I3) = 0xF;
    SR.FD = 0;
    PC = 0xA0000000;
}
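
As a cross-check of the register values listed above, the following sketch computes the SR value that results from the reset sequence. The bit positions (MD = 30, RB = 29, BL = 28, FD = 15, I3-I0 = 7:4) are assumptions taken from the usual SH-4 SR layout and are not stated in this section; `reset_state_t` and `reset_entry` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SR bit positions (not stated in this section). */
#define SR_MD          (1u << 30)
#define SR_RB          (1u << 29)
#define SR_BL          (1u << 28)
#define SR_FD          (1u << 15)
#define SR_IMASK_MASK  (0xFu << 4)

typedef struct { uint32_t expevt, vbr, sr, pc; } reset_state_t;

/* expevt_code: 0x000 for power-on and H-UDI reset, 0x020 for manual reset. */
static reset_state_t reset_entry(uint32_t sr_before, uint32_t expevt_code)
{
    reset_state_t s;
    assert((expevt_code & ~0xFFFu) == 0);   /* exception codes fit in 12 bits */
    s.expevt = expevt_code;
    s.vbr    = 0x00000000u;
    s.sr     = sr_before;
    s.sr    |= SR_MD | SR_RB | SR_BL;       /* MD, RB, BL set to 1 */
    s.sr    |= SR_IMASK_MASK;               /* I3-I0 set to 0xF */
    s.sr    &= ~SR_FD;                      /* FD cleared to 0 */
    s.pc     = 0xA0000000u;
    return s;
}
```

Starting from an SR with only FD set, the modelled bits give 0x700000F0; bits not named in the text are simply carried over in this sketch.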

MANRESET - Manual Reset
- Sources:
  A general exception other than a user break occurs while the BL bit in SR is set to 1. The system in which the core is integrated can also drive the processor into this reset state. For details, refer to the System Architecture Manual of the appropriate product.

- Transition address: 0xA000 0000

- Transition operations:

Exception code 0x020 is set in EXPEVT, initialization of VBR and SR is performed, and a branch is made to PC = 0xA000 0000. In the initialization processing, the VBR register is set to 0x0000 0000, and in SR, the MD, RB, and BL bits are set to 1, the FD bit is cleared to 0, and the interrupt mask bits (I3-I0) are set to 0xF. CPU and system initialization are performed. For details refer to the System Architecture Manual.

Refer to Appendix A for the manual reset values for the various CPU core modules set by the Initialize_Module function.

```
MANRESET()
{
    Initialize_Module(Manual);
    EXPEVT = 0x00000020;
    VBR = 0x00000000;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    SR.(I0-I3) = 0xF;
    SR.FD = 0;
    PC = 0xA0000000;
}
```
2 HUDIRESET - H-UDI Reset

- Source:
  Refer to the System Architecture Manual for a description of how the core is placed in the H-UDI reset state.

- Transition address: 0xA000 0000

- Transition operations:
  Exception code 0x000 is set in EXPEVT, initialization of VBR and SR is performed, and a branch is made to PC = 0xA000 0000. In the initialization processing, the VBR register is set to 0x0000 0000, and in SR, the MD, RB, and BL bits are set to 1, the FD bit is cleared to 0, and the interrupt mask bits (I3-I0) are set to 0xF. CPU and system initialization are performed. For details, refer to the System Architecture Manual.

  Refer to Appendix A for the manual reset values for the various CPU core modules set by the Initialize_Module function.

  HUDIRESET()
  {
    Initialize_Module(PowerOn);
    EXPEVT = 0x00000000;
    VBR = 0x00000000;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    SR.(I0-I3) = 0xF;
    SR.FD = 0;
    PC = 0xA0000000;
  }
3 ITLBMULTIHIT - Instruction TLB Multiple-Hit Exception

- Source: Multiple ITLB address matches
- Transition address: 0xA000 0000
- Transition operations:
  The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.

Exception code 0x140 is set in EXPEVT, initialization of VBR and SR is performed, and a branch is made to PC = 0xA000 0000.

In the initialization processing, the VBR register is set to 0x0000 0000, and in SR, the MD, RB, and BL bits are set to 1, the FD bit is cleared to 0, and the interrupt mask bits (I0-I3) are set to 0xF.

CPU and system initialization are performed in the same way as in a manual reset.

Refer to Appendix A for the manual reset values for the various CPU core modules set by the Initialize_Module function.

```
ITLBMULTIHIT()
{
    Initialize_Module(Manual);
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    EXPEVT = 0x00000140;
    VBR = 0x00000000;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    SR.(I0-I3) = 0xF;
    SR.FD = 0;
    PC = 0xA0000000;
}
```
4 OTLBMULTIHIT - Operand TLB Multiple-Hit Exception

- Source: Multiple UTLB address matches
- Transition address: 0xA000 0000

- Transition operations:
The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.

Exception code 0x140 is set in EXPEVT, initialization of VBR and SR is performed, and a branch is made to PC = 0xA000 0000.

In the initialization processing, the VBR register is set to 0x0000 0000, and in SR, the MD, RB, and BL bits are set to 1, the FD bit is cleared to 0, and the interrupt mask bits (I3-I0) are set to 0xF.

CPU and system initialization are performed in the same way as in a manual reset.

Refer to Appendix A for the manual reset values for the various CPU core modules set by the Initialize_Module function.

```
OTLBMULTIHIT()
{
    Initialize_Module(Manual);
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    EXPEVT = 0x00000140;
    VBR = 0x00000000;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    SR.(I0-I3) = 0xF;
    SR.FD = 0;
    PC = 0xA0000000;
}
```
5.6.2 General exceptions

1. RTLBMISS - Read Data TLB Miss Exception
   - Source: Address mismatch in UTLB address comparison
   - Transition address: VBR + 0x0000 0400
   - Transition operations:
     The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.
     The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. The R15 contents are saved in SGR.
     Exception code 0x040 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0400.
     To speed up TLB miss processing, the offset is separate from that of other exceptions.

RTLBMISS()
{
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x00000040;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000400;
}
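
The relationship between the faulting address and the PTEH update described above can be sketched as follows. The helper names are hypothetical; the only fact used is that the 22-bit virtual page number occupies PTEH [31:10], so the lower bits of PTEH (which include the ASID) are left unchanged.

```c
#include <assert.h>
#include <stdint.h>

#define PTEH_VPN_MASK 0xFFFFFC00u   /* PTEH[31:10]: 22-bit virtual page number */

/* New PTEH value after a TLB-related exception at fault_vaddr. */
static uint32_t pteh_after_tlb_exception(uint32_t pteh_before, uint32_t fault_vaddr)
{
    /* Bits below the VPN field, including ASID, keep their previous value. */
    return (pteh_before & ~PTEH_VPN_MASK) | (fault_vaddr & PTEH_VPN_MASK);
}

/* The 22-bit page number itself, i.e. the address divided by 1 Kbyte. */
static uint32_t vpn_of(uint32_t vaddr)
{
    return vaddr >> 10;
}
```

A miss handler would typically read TEA for the full address and PTEH for the page number and ASID when looking up the page table.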
2 WTLBMISS - Write Data TLB Miss Exception

- Source: Address mismatch in UTLB address comparison
- Transition address: VBR + 0x0000 0400
- Transition operations:
  The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.

The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. The R15 contents at this time are saved in SGR.

Exception code 0x060 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0400.

To speed up TLB miss processing, the offset is separate from that of other exceptions.

WTLBMISS()
{
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x00000060;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000400;
}
3 ITLBMISS - Instruction TLB Miss Exception

- Source: Address mismatch in ITLB address comparison
- Transition address: VBR + 0x0000 0400
- Transition operations:
  The virtual address (32 bits) at which this exception occurred is set in TEA,
  and the corresponding virtual page number (22 bits) is set in PTEH [31:10].
  ASID in PTEH indicates the ASID when this exception occurred.

  The PC and SR contents for the instruction at which this exception occurred
  are saved in SPC and SSR. The R15 contents at this time are saved in SGR.

  Exception code 0x040 is set in EXPEVT. The BL, MD, and RB bits are set to 1
  in SR, and a branch is made to PC = VBR + 0x0400.

To speed up TLB miss processing, the offset is separate from that of other
exceptions.

ITLBMISS()
{
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x00000040;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000400;
}
4 FIRSTWRITE - Initial Page Write Exception

- Source: TLB is hit in a store access, but dirty bit D = 0
- Transition address: VBR + 0x0000 0100
- Transition operations:
  - The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.
  - The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. The R15 contents at this time are saved in SGR.
  - Exception code 0x080 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100.

```c
FIRSTWRITE()
{
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x00000080;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000100;
}
```
5 READPROT - Data TLB Protection Violation Exception

- Source: The access does not agree with the UTLB protection information (PR bits) shown below.

<table>
<thead>
<tr>
<th>PR</th>
<th>Privileged mode</th>
<th>User mode</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>Only read access possible</td>
<td>Access not possible</td>
</tr>
<tr>
<td>01</td>
<td>Read/write access possible</td>
<td>Access not possible</td>
</tr>
<tr>
<td>10</td>
<td>Only read access possible</td>
<td>Only read access possible</td>
</tr>
<tr>
<td>11</td>
<td>Read/write access possible</td>
<td>Read/write access possible</td>
</tr>
</tbody>
</table>

- Transition address: VBR + 0x0000 0100

- Transition operations:
  The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.

  The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. The R15 contents at this time are saved in SGR.

  Exception code 0x0A0 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100.

READPROT()
{
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x000000A0;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000100;
}
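
The PR table above can be read as two independent bits: the upper bit grants user-mode access and the lower bit grants write access. The following sketch encodes that reading; `utlb_access_allowed` is a hypothetical helper, and a `false` result corresponds to a protection violation (READPROT for a read, WRITEPROT for a write).

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true when the access agrees with the UTLB PR bits. */
static bool utlb_access_allowed(unsigned pr, bool privileged, bool is_write)
{
    assert(pr <= 3u);
    bool user_ok  = (pr & 2u) != 0;   /* PR = 10 or 11: user mode may access */
    bool write_ok = (pr & 1u) != 0;   /* PR = 01 or 11: writes are permitted */

    if (!privileged && !user_ok)
        return false;
    if (is_write && !write_ok)
        return false;
    return true;
}
```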
6 WRITEPROT - Write Data TLB Protection Violation Exception

- Source: The access does not agree with the UTLB protection information (PR bits) shown below.

<table>
<thead>
<tr>
<th>PR</th>
<th>Privileged mode</th>
<th>User mode</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>Only read access possible</td>
<td>Access not possible</td>
</tr>
<tr>
<td>01</td>
<td>Read/write access possible</td>
<td>Access not possible</td>
</tr>
<tr>
<td>10</td>
<td>Only read access possible</td>
<td>Only read access possible</td>
</tr>
<tr>
<td>11</td>
<td>Read/write access possible</td>
<td>Read/write access possible</td>
</tr>
</tbody>
</table>

- Transition address: VBR + 0x0000 0100

- Transition operations:
  The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.

  The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. The R15 contents at this time are saved in SGR.

  Exception code 0x0C0 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100.

WRITEPROT()
{
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x000000C0;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000100;
}
7 EXECPROT - Instruction TLB Protection Violation Exception

- Source: The access does not agree with the ITLB protection information (PR bits) shown below.

<table>
<thead>
<tr>
<th>PR</th>
<th>Privileged mode</th>
<th>User mode</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>Access possible</td>
<td>Access not possible</td>
</tr>
<tr>
<td>1</td>
<td>Access possible</td>
<td>Access possible</td>
</tr>
</tbody>
</table>

- Transition address: VBR + 0x0000 0100

- Transition operations: The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.

The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. The R15 contents at this time are saved in SGR.

Exception code 0x0A0 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100.

```c
EXECPROT()
{
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x000000A0;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000100;
}
```
8 RADDERR - Read Data Address Error

- Sources:
  - Word data access from other than a word boundary (2n + 1)
  - Longword data access from other than a longword data boundary (4n + 1, 4n + 2, or 4n + 3)
  - Quadword data access from other than a quadword data boundary (8n + 1, 8n + 2, 8n + 3, 8n + 4, 8n + 5, 8n + 6, or 8n + 7)
  - Access to area 0x8000 0000 - 0xFFFF FFFF in user mode

- Transition address: VBR + 0x0000 0100

- Transition operations:
The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.
The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. The R15 contents at this time are saved in SGR.
Exception code 0x0E0 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100. For details, see Chapter 3: Memory management unit (MMU) on page 41.

RADDERR()
{
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x000000E0;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000100;
}

9 WADDERR - Write Data Address Error

- Sources:
  - Word data access from other than a word boundary (2n + 1)
  - Longword data access from other than a longword data boundary (4n + 1, 4n + 2, or 4n + 3)
  - Quadword data access from other than a quadword data boundary (8n + 1, 8n + 2, 8n + 3, 8n + 4, 8n + 5, 8n + 6, or 8n + 7)
  - Access to area 0x8000 0000 - 0xFFFF FFFF in user mode (except for the store queue area 0xE000 0000 - 0xE3FF FFFF)

- Transition address: VBR + 0x0000 0100

- Transition operations:
  
  The virtual address (32 bits) at which this exception occurred is set in TEA,
  and the corresponding virtual page number (22 bits) is set in PTEH [31:10].
  ASID in PTEH indicates the ASID when this exception occurred.

  The PC and SR contents for the instruction at which this exception occurred
  are saved in SPC and SSR. The R15 contents at this time are saved in SGR.

  Exception code 0x100 is set in EXPEVT. The BL, MD, and RB bits are set to 1
  in SR, and a branch is made to PC = VBR + 0x0100. For details, see
  Chapter 3: Memory management unit (MMU) on page 41.

  WADDERR()
  {
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x00000100;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000100;
  }
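
The source conditions for RADDERR and WADDERR can be expressed as a single predicate. This is a sketch only: `data_address_error` is a hypothetical helper, and it applies the store queue exemption to writes alone because only the WADDERR source list mentions it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* size is the access width in bytes: 1, 2 (word), 4 (longword) or 8 (quadword). */
static bool data_address_error(uint32_t addr, unsigned size, bool user_mode, bool is_write)
{
    assert(size == 1 || size == 2 || size == 4 || size == 8);

    if (addr & (size - 1u))
        return true;                        /* not on a natural boundary */

    if (user_mode && addr >= 0x80000000u) {
        bool in_store_queue = addr >= 0xE0000000u && addr <= 0xE3FFFFFFu;
        if (is_write && in_store_queue)
            return false;                   /* store queue area is exempt for writes */
        return true;                        /* 0x80000000 - 0xFFFFFFFF in user mode */
    }
    return false;
}
```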
10 IADDERR - Instruction Address Error

- Sources:
  - Instruction fetch from other than a word boundary (2n + 1)
  - Instruction fetch from area 0x8000 0000 - 0xFFFF FFFF in user mode
- Transition address: VBR + 0x0000 0100
- Transition operations:
The virtual address (32 bits) at which this exception occurred is set in TEA, and the corresponding virtual page number (22 bits) is set in PTEH [31:10]. ASID in PTEH indicates the ASID when this exception occurred.
The PC and SR contents for the instruction at which this exception occurred are saved in the SPC and SSR. The R15 contents at this time are saved in SGR.
Exception code 0x0E0 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100. For details, see Chapter 3: Memory management unit (MMU) on page 41.

IADDERR()
{
    TEA = EXCEPTION_ADDRESS;
    PTEH.VPN = PAGE_NUMBER;
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x000000E0;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000100;
}
11 TRAP - Unconditional trap
- Source: Execution of TRAPA instruction
- Transition address: VBR + 0x0000 0100
- Transition operations:
  As this is a processing-completion-type exception, the PC contents for the instruction following the TRAPA instruction are saved in SPC. The values of SR and R15 when the TRAPA instruction is executed are saved in SSR and SGR. The 8-bit immediate value in the TRAPA instruction is multiplied by 4, and the result is set in TRA [9:0]. Exception code 0x160 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100.

TRAP()
{
    SPC = PC + 2;
    SSR = SR;
    SGR = R15;
    TRA = imm << 2;
    EXPEVT = 0x00000160;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000100;
}
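
Because TRA holds the immediate multiplied by 4, a handler has to shift the value back to recover the trap number. A minimal sketch, with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

/* Value written to TRA for TRAPA #imm: the 8-bit immediate times 4. */
static uint32_t tra_from_imm(uint8_t imm)
{
    return (uint32_t)imm << 2;
}

/* Recover the TRAPA immediate from TRA inside the handler. */
static uint8_t trap_number(uint32_t tra)
{
    assert((tra & ~0x3FCu) == 0);   /* only bits 9:2 can be set by imm << 2 */
    return (uint8_t)(tra >> 2);
}
```

For example, TRAPA #0x21 gives TRA = 0x84. The scaling makes TRA directly usable as a byte offset into a table of 4-byte entries.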
12 RESINST - General Illegal Instruction Exception

- Sources:
  - Decoding of an undefined instruction other than in a branch delay slot.
    The opcode 0xFFFD is guaranteed to be undefined in any SH-4 architecture revision. Other unused opcodes may be treated as reserved in any particular SH-4 implementation.
  - Decoding in user mode of a privileged instruction not in a delay slot.
    Privileged instructions: LDC, STC, RTE, LDTLB, SLEEP, but excluding LDC/STC instructions that access GBR
- Transition address: VBR + 0x0000 0100
- Transition operations:
  The PC contents for the instruction at which this exception occurred are saved in SPC. The SR and R15 contents when this exception occurred are saved in SSR and SGR.
  Exception code 0x180 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100.

Note: The only undefined opcode which the architecture guarantees to cause a General Illegal Instruction Exception is 0xFFFD.

RESINST()
{
   SPC = PC;
   SSR = SR;
   SGR = R15;
   EXPEVT = 0x00000180;
   SR.MD = 1;
   SR.RB = 1;
   SR.BL = 1;
   PC = VBR + 0x00000100;
}
13 ILLSLOT - Slot Illegal Instruction Exception

- Sources:
  - Decoding of an undefined instruction in a delay slot
    The branches with delay slots are JMP, JSR, BRA, BRAF, BSR, BSRF, RTS, RTE, BT/S and BF/S. The opcode 0xFFFD is guaranteed to be undefined in any SH-4 architecture revision. Other unused opcodes may be treated as reserved in any particular SH-4 implementation.
  - Decoding of an instruction that modifies PC in a delay slot
    Instructions that modify PC: JMP, JSR, BRA, BRAF, BSR, BSRF, RTS, RTE, BT, BF, BT/S, BF/S, TRAPA, LDC Rm,SR, and LDC.L @Rm+,SR
  - Decoding in user mode of a privileged instruction in a delay slot
    Privileged instructions: LDC, STC, RTE, LDTLB, SLEEP, but excluding LDC/STC instructions that access GBR
  - Decoding of a PC-relative MOV instruction or MOVA instruction in a delay slot

- Transition address: VBR + 0x0000 0100

- Transition operations:
  - The PC contents for the preceding delayed branch instruction are saved in SPC. The SR contents when this exception occurred are saved in SSR. The R15 contents at this time are saved in SGR.
  - Exception code 0x1A0 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100.

Note: The only undefined opcode which the architecture guarantees to cause a Slot Illegal Instruction Exception is 0xFFFD.

ILLSLOT()
{
   SPC = PC - 2;
   SSR = SR;
   SGR = R15;
   EXPEVT = 0x000001A0;
   SR.MD = 1;
   SR.RB = 1;
   SR.BL = 1;
   PC = VBR + 0x00000100;
}
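
The pseudocode in this section uses three different SPC values: PC for re-execution type exceptions such as RESINST, PC + 2 for the completion type TRAP, and PC - 2 for the slot exceptions, where SPC points at the preceding delayed branch. The sketch below collects those three cases; the enum and function names are hypothetical, and the 2-byte step reflects the 16-bit instruction length.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    SPC_REEXECUTE,    /* e.g. RESINST: SPC = PC */
    SPC_COMPLETION,   /* e.g. TRAP:    SPC = PC + 2 */
    SPC_DELAY_SLOT    /* e.g. ILLSLOT: SPC = PC - 2 (the delayed branch) */
} spc_kind_t;

/* pc is the address of the instruction at which the exception was detected. */
static uint32_t saved_pc(uint32_t pc, spc_kind_t kind)
{
    switch (kind) {
    case SPC_REEXECUTE:  return pc;
    case SPC_COMPLETION: return pc + 2u;
    case SPC_DELAY_SLOT: return pc - 2u;
    }
    assert(0 && "unknown SPC kind");
    return pc;
}
```

Saving the branch address for slot exceptions means that RTE re-executes the branch and its delay slot together, consistent with their treatment as an indivisible pair.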
14 FPUDIS - General FPU Disable Exception

- Source: Decoding of an FPU instruction (see Note) not in a delay slot with SR.FD = 1
- Transition address: VBR + 0x0000 0100
- Transition operations:
The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. The R15 contents at this time are saved in SGR.

Exception code 0x800 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100.

Note: FPU instructions are instructions in which the first 4 bits of the instruction code are F (but excluding undefined instruction 0xFFFD), and the LDS, STS, LDS.L, and STS.L instructions corresponding to FPUL and FPSCR.

FPUDIS()
{
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x00000800;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000100;
}
15 SLOTFPUDIS - Slot FPU Disable Exception

- Source: Decoding of an FPU instruction in a delay slot with SR.FD = 1
- Transition address: VBR + 0x0000 0100
- Transition operations:
  The PC contents for the preceding delayed branch instruction are saved in SPC. The SR and R15 contents when this exception occurred are saved in SSR and SGR.

Exception code 0x820 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100.

SLOTFPUDIS()
{
    SPC = PC - 2;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x00000820;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000100;
}
16 UBRKBEFORE - User Breakpoint Pre-execution Trap

- Source: Fulfilling of a break condition set in the user break controller
- Transition address: VBR + 0x0000 0100, or DBR
- Transition operations:

  The PC contents for the instruction at which the breakpoint is set are set in SPC. The SR and R15 contents when the break occurred are saved in SSR and SGR. Exception code 0x1E0 is set in EXPEVT.

  The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100. It is also possible to branch to PC = DBR. For details of PC, etc., when a data break is set, see User Break Controller (UBC) Chapter in the ST40 System Architecture Manual.

UBRKBEFORE()
{
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x000001E0;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = (BRCR.UBDE==1 ? DBR : VBR + 0x00000100);
}
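
The choice of transition address for a user break can be isolated as a one-line function. A minimal sketch, assuming (as the pseudocode does) that the BRCR.UBDE bit selects DBR; the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Transition address for a user break: DBR when BRCR.UBDE = 1, else VBR + 0x100. */
static uint32_t user_break_vector(bool brcr_ubde, uint32_t dbr, uint32_t vbr)
{
    return brcr_ubde ? dbr : vbr + 0x00000100u;
}
```

Routing breaks through DBR lets a debugger keep its handler separate from the operating system's general exception handler at VBR + 0x100.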


17 UBRKAFTER - User Breakpoint Post-Execution Trap

- Source: Fulfilling of a break condition set in the user break controller
- Transition address: VBR + 0x0000 0100, or DBR
- Transition operations:

  The PC of the instruction following that at which the breakpoint is set is placed in SPC. The SR and R15 contents when the break occurred are saved in SSR and SGR. Exception code 0x1E0 is set in EXPEVT.

  The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100. It is also possible to branch to PC = DBR. For details of PC, etc., when a data break is set, see User Break Controller (UBC) Chapter in the ST40 System Architecture Manual.

```c
UBRKAFTER()
{
    SPC = PC + 2;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x000001E0;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = (BRCR.UBDE==1 ? DBR : VBR + 0x00000100);
}
```
18 FPUEXC - FPU Exception

- Source: Exception due to execution of a floating-point operation
- Transition address: VBR + 0x0000 0100
- Transition operations:
  
  The PC and SR contents for the instruction at which this exception occurred are saved in SPC and SSR. Exception code 0x120 is set in EXPEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0100. The contents of R15 are saved to SGR.

FPUEXC()
{
    SPC = PC;
    SSR = SR;
    SGR = R15;
    EXPEVT = 0x00000120;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000100;
}
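
Since every general exception other than the TLB misses enters at VBR + 0x100, the handler at that address has to inspect EXPEVT to find the cause. The following sketch groups the exception codes listed in this section; the `handler_id_t` categories are hypothetical and only illustrate one possible grouping. Several mnemonics share a code, so the code alone does not distinguish, for example, RADDERR from IADDERR.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    H_UNEXPECTED, H_PAGE_FAULT, H_ADDRESS_ERROR, H_FPU, H_SYSCALL,
    H_ILLEGAL_INSTRUCTION, H_USER_BREAK, H_FPU_DISABLED
} handler_id_t;   /* hypothetical handler categories */

/* Classify the EXPEVT value seen by the handler at VBR + 0x100. */
static handler_id_t classify_general_exception(uint32_t expevt)
{
    switch (expevt) {
    case 0x080:                 /* FIRSTWRITE */
    case 0x0A0:                 /* READPROT, EXECPROT */
    case 0x0C0:                 /* WRITEPROT */
        return H_PAGE_FAULT;
    case 0x0E0:                 /* RADDERR, IADDERR */
    case 0x100:                 /* WADDERR */
        return H_ADDRESS_ERROR;
    case 0x120:                 /* FPUEXC */
        return H_FPU;
    case 0x160:                 /* TRAP */
        return H_SYSCALL;
    case 0x180:                 /* RESINST */
    case 0x1A0:                 /* ILLSLOT */
        return H_ILLEGAL_INSTRUCTION;
    case 0x1E0:                 /* UBRKBEFORE, UBRKAFTER (when not routed to DBR) */
        return H_USER_BREAK;
    case 0x800:                 /* FPUDIS */
    case 0x820:                 /* SLOTFPUDIS */
        return H_FPU_DISABLED;
    default:
        return H_UNEXPECTED;    /* includes 0x040/0x060, which enter at VBR + 0x400 */
    }
}
```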
5.6.3 Interrupts

1 NMI - Non-Maskable Interrupt

- Source: Refer to relevant System Architecture Manual for details of non-maskable interrupt generation (NMI).

- Transition address: VBR + 0x0000 0600

- Transition operations:

The PC and SR contents for the instruction at which this exception is accepted are saved in SPC and SSR. The R15 contents at this time are saved in SGR.

Exception code 0x1C0 is set in INTEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to PC = VBR + 0x0600.

When the BL bit in SR is 0, this interrupt is not masked by the interrupt mask bits in SR, and is accepted at the highest priority level. When the BL bit in SR is 1, a software setting can specify whether this interrupt is to be masked or accepted. For details refer to the description of interrupt programming in the appropriate System Architecture Manual.

NMI()
{
    SPC = PC;
    SSR = SR;
    SGR = R15;
    INTEVT = 0x000001C0;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000600;
}
2 IRLINT - IRL Interrupts

- Source: The interrupt mask bit setting in SR is smaller than the IRL (3-0) level, and the BL bit in SR is 0 (accepted at instruction boundary).

- Transition address: VBR + 0x0000 0600

- Transition operations:

  The PC contents immediately after the instruction at which the interrupt is accepted are set in SPC. The SR and R15 contents at the time of acceptance are set in SSR and SGR.

  The code corresponding to the IRL (3-0) level is set in INTEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to VBR + 0x0600. The acceptance level is not set in the interrupt mask bits in SR. When the BL bit in SR is 1, the interrupt is masked. For further details of the interrupt handling behavior, refer to the product level documentation of the interrupt controller.

IRLINT()
{
    SPC = PC;
    SSR = SR;
    SGR = R15;
    INTEVT = 0x00000200 ~ 0x000003C0;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000600;
}
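
The acceptance condition and the INTEVT range can be sketched as below. The acceptance test follows the source condition directly. The code mapping is an assumption: it supposes that the 15 codes in the range 0x200 to 0x3C0 are spaced 0x20 apart, with 0x200 for the highest level (15) and 0x3C0 for the lowest (1). The product-level interrupt controller documentation is authoritative, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* An IRL interrupt is accepted when BL = 0 and the SR mask is below its level. */
static bool irl_accepted(unsigned imask, unsigned level, bool sr_bl)
{
    return !sr_bl && imask < level;
}

/* Assumed mapping: level 15 -> 0x200, level 1 -> 0x3C0, in steps of 0x20. */
static uint32_t irl_intevt(unsigned level)
{
    assert(level >= 1 && level <= 15);
    return 0x200u + 0x20u * (15u - level);
}
```

Because the acceptance level is not copied into the mask bits, a handler that wants to allow nesting of higher-priority interrupts only has to set I3-I0 itself before clearing BL.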

3 PERIPHINT - Peripheral Module Interrupts

- Source: The interrupt mask bit setting in SR is smaller than the peripheral module (Hitachi-UDI for example) interrupt level, and the BL bit in SR is 0 (accepted at instruction boundary).
- Transition address: VBR + 0x0000 0600
- Transition operations:
  The PC contents immediately after the instruction at which the interrupt is accepted are set in SPC. The SR and R15 contents at the time of acceptance are set in SSR and SGR.

  The code corresponding to the interrupt source is set in INTEVT. The BL, MD, and RB bits are set to 1 in SR, and a branch is made to VBR + 0x0600. The module interrupt levels should be set as values between 0x0 and 0xF in the interrupt priority registers (IPRA-IPRC) in the interrupt controller. For further details of the interrupt handling behavior, refer to the product level documentation of the interrupt controller.

```c
Module_interruption()
{
    SPC = PC;
    SSR = SR;
    SGR = R15;
    INTEVT = 0x00000400 ~ 0x00000B80;
    SR.MD = 1;
    SR.RB = 1;
    SR.BL = 1;
    PC = VBR + 0x00000600;
}
```
5.6.4 Priority order with multiple exceptions

With some instructions, such as instructions that make two accesses to memory, and the indivisible pair comprising a delayed branch instruction and delay slot instruction, multiple exceptions occur. Care is required in these cases, as the exception priority order differs from the normal order.

1 Instructions that make two accesses to memory.
   With MAC instructions, memory-to-memory arithmetic/logic instructions, and TAS instructions, two data transfers are performed by a single instruction, and an exception will be detected for each of these data transfers. In these cases, therefore, the following order is used to determine priority.

   1.1 Data address error in first data transfer.
   1.2 TLB miss in first data transfer.
   1.3 TLB protection violation in first data transfer.
   1.4 Data address error in second data transfer.
   1.5 TLB miss in second data transfer.
   1.6 TLB protection violation in second data transfer.
   1.7 Initial page write exception in second data transfer.

2 Indivisible delayed branch instruction and delay slot instruction.
   As a delayed branch instruction and its associated delay slot instruction are indivisible, they are treated as a single instruction. Consequently, the priority order for exceptions that occur in these instructions differs from the usual priority order. The priority order shown below is for the case where the delay slot instruction has only one data transfer.

   2.1 The delayed branch instruction is checked for priority levels 1 and 2.
   2.2 The delay slot instruction is checked for priority levels 1 and 2.
   2.3 A check is performed for priority level 3 in the delayed branch instruction and priority level 3 in the delay slot instruction. (There is no priority ranking between these two.)
   2.4 A check is performed for priority level 4 in the delayed branch instruction and priority level 4 in the delay slot instruction. (There is no priority ranking between these two.)

   If the delay slot instruction has a second data transfer, two checks are performed in step 2.2, as in item 1 above.

   If the accepted exception (the highest-priority exception) is a delay slot instruction re-execution type exception, the branch instruction PR register write operation (the PC → PR operation performed in BSR, BSRF, and JSR) is inhibited.
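
The ordering for instructions with two data transfers (item 1 above) can be sketched as a simple priority scan. This is a minimal illustration, not the hardware's detection logic; the flag names and the `accepted_rank` function are hypothetical and do not appear in this manual.

```c
/* Hypothetical flags for the exceptions detected on one data transfer. */
enum {
    EXC_ADDR_ERROR  = 1u << 0,  /* data address error           */
    EXC_TLB_MISS    = 1u << 1,  /* TLB miss                     */
    EXC_TLB_PROTECT = 1u << 2,  /* TLB protection violation     */
    EXC_INITIAL_WR  = 1u << 3   /* initial page write exception */
};

/* Returns the rank (1 to 7, matching items 1.1 to 1.7) of the exception
 * that is accepted, or 0 if neither transfer raised an exception. */
int accepted_rank(unsigned first, unsigned second)
{
    if (first  & EXC_ADDR_ERROR)  return 1;
    if (first  & EXC_TLB_MISS)    return 2;
    if (first  & EXC_TLB_PROTECT) return 3;
    if (second & EXC_ADDR_ERROR)  return 4;
    if (second & EXC_TLB_MISS)    return 5;
    if (second & EXC_TLB_PROTECT) return 6;
    if (second & EXC_INITIAL_WR)  return 7;
    return 0;
}
```

For example, a TLB miss on the first transfer is accepted ahead of a data address error on the second, so `accepted_rank(EXC_TLB_MISS, EXC_ADDR_ERROR)` returns 2.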

5.7 Usage notes

1 Return from exception handling
   1.1 Check the BL bit in SR with software.
       If SPC and SSR have been saved to external memory, set the BL bit in SR to 1 before restoring them.
   1.2 Issue an RTE instruction.
       When RTE is executed, the SPC contents are set in PC, the SSR contents are set in SR, and branch is made to the SPC address to return from the exception handling routine.

2 If an exception or interrupt occurs when SR.BL = 1
   2.1 Exception
       When an exception other than a user break occurs, the CPU's internal registers are set to their post-reset state, the registers of the other modules retain their contents prior to the exception, and the CPU branches to the same address as in a reset (0xA000 0000). The value in EXPEVT at this time is 0x0000 0020. The values of the SPC and SSR registers are undefined.
   2.2 Interrupt
       If an ordinary interrupt occurs, the interrupt request is held pending and is accepted after the BL bit in SR has been cleared to 0 by software. If a nonmaskable interrupt (NMI) occurs, it can be held pending or accepted according to the setting made by software. In the sleep or standby state, an interrupt is accepted even if the BL bit in SR is set to 1.

3 SPC when an exception occurs
   3.1 Re-execution type exception
       The PC value for the instruction in which the exception occurred is set in SPC, and the instruction is re-executed after returning from exception handling. If an exception occurs in a delay slot instruction, the PC value for the delay slot instruction is saved in SPC, regardless of whether or not the preceding delay slot instruction condition is satisfied.
   3.2 Completion type exception or interrupt
       The PC value for the instruction following that in which the exception occurred is set in SPC. If an exception occurs in a branch instruction with a delay slot, the PC value for the branch destination is saved in SPC.

4 An exception must not be generated in an RTE instruction delay slot, as the operation will be undefined in this case.

Floating-point unit

6.1 Overview

The floating-point unit (FPU) has the following features:

- Conforms to IEEE 754 standard
- 32 single-precision floating-point registers (can also be referenced as 16 double-precision registers)
- Two rounding modes: Round to Nearest and Round to Zero
- Two denormalization modes: Flush to Zero and Treat Denormalized Number
- Six exception sources: FPU Error, Invalid Operation, Divide By Zero, Overflow, Underflow, and Inexact
- Comprehensive instructions: Single-precision, double-precision, graphics support, system control

When the FD bit in SR is set to 1, the FPU cannot be used, and an attempt to execute an FPU instruction will cause an FPU disable exception.
### 6.2 Floating-point format

An IEEE 754 floating-point number contains three fields: a sign (s), an exponent (e) and a fraction (f) in the format given in Figure 31.

![Figure 31: IEEE754 floating-point representations](image)

The sign, s, is the sign of the represented number. If s is 0, the number is positive. If s is 1, the number is negative.

The exponent, e, is held as a biased value. The relationship between the biased exponent, e, and the unbiased exponent, E, is given by e = E + bias, where bias is a fixed positive number. The unbiased exponent, E, takes any value in the range \([E_{\text{min}}-1, E_{\text{max}}+1]\). The minimum and maximum values in that range, \(E_{\text{min}}-1\) and \(E_{\text{max}}+1\), designate special values such as positive zero, negative zero, positive infinity, negative infinity, denormalized numbers and “Not a Number” (NaN).

The fraction, f, specifies the binary digits that lie to the right of the binary point. A normalized floating-point number has a leading bit of 1 which lies to the left of the binary point. A denormalized floating-point number has a leading bit of 0 which lies to the left of the binary point. The leading bit is implicitly represented; it is determined by whether the number is normalized or denormalized, and is not explicitly encoded. The implicit leading bit and the explicit fraction bits together form the significand of the floating-point number.

The value, \(v\), of a floating-point number is determined as follows:

- NaN: if \(E = E_{\text{max}} + 1\) and \(f \neq 0\), then \(v\) is Not a Number irrespective of the sign s
- Positive or negative infinity: if \(E = E_{\text{max}} + 1\) and \(f = 0\), then \(v = (-1)^s \infty\)
- Normalized number: if \(E_{\text{min}} \leq E \leq E_{\text{max}}\), then \(v = (-1)^s 2^E (1.f)\)
- Denormalized number: if \(E = E_{\text{min}} - 1\) and \(f \neq 0\), then \(v = (-1)^s 2^{E_{\text{min}}}(0.f)\)
- Positive or negative zero: if \(E = E_{\text{min}} - 1\) and \(f = 0\), then \(v = (-1)^s 0\)
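
The rules above can be checked with a short decoder for the single-precision format (8-bit exponent, 23-bit fraction, bias 127). This is a sketch for illustration only; the names `sp_kind`, `sp_classify`, `sp_value`, and `pow2` are hypothetical helpers, not part of the architecture.

```c
#include <stdint.h>
#include <math.h>   /* NAN and INFINITY macros only */

typedef enum { SP_NAN, SP_INF, SP_NORM, SP_DENORM, SP_ZERO } sp_kind;

/* 2^n by repeated multiplication, to avoid depending on libm. */
static double pow2(int n)
{
    double r = 1.0;
    while (n > 0) { r *= 2.0; n--; }
    while (n < 0) { r *= 0.5; n++; }
    return r;
}

/* Classifies a 32-bit pattern using the biased exponent e and fraction f. */
sp_kind sp_classify(uint32_t bits)
{
    uint32_t e = (bits >> 23) & 0xFFu;
    uint32_t f = bits & 0x007FFFFFu;
    if (e == 0xFFu) return f ? SP_NAN : SP_INF;      /* E = Emax + 1 */
    if (e == 0u)    return f ? SP_DENORM : SP_ZERO;  /* E = Emin - 1 */
    return SP_NORM;
}

/* Computes v from s, e, and f according to the rules listed above. */
double sp_value(uint32_t bits)
{
    int e = (int)((bits >> 23) & 0xFFu);
    uint32_t f = bits & 0x007FFFFFu;
    double sign = (bits >> 31) ? -1.0 : 1.0;
    double frac = f / 8388608.0;                     /* f scaled by 2^-23 */

    switch (sp_classify(bits)) {
    case SP_NAN:    return NAN;
    case SP_INF:    return sign * INFINITY;
    case SP_NORM:   return sign * pow2(e - 127) * (1.0 + frac);
    case SP_DENORM: return sign * pow2(-126) * frac;
    case SP_ZERO:
    default:        return sign * 0.0;
    }
}
```

As a usage example, `sp_value(0x3F800000)` yields 1.0 and `sp_value(0xC0000000)` yields -2.0.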
The architecture supports two IEEE 754 basic floating-point number formats: single-precision and double-precision.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>Single-precision</th>
<th>Double-precision</th>
</tr>
</thead>
<tbody>
<tr>
<td>Total bit width</td>
<td>32 bits</td>
<td>64 bits</td>
</tr>
<tr>
<td>Sign bit</td>
<td>1 bit</td>
<td>1 bit</td>
</tr>
<tr>
<td>Exponent field</td>
<td>8 bits</td>
<td>11 bits</td>
</tr>
<tr>
<td>Fraction field</td>
<td>23 bits</td>
<td>52 bits</td>
</tr>
<tr>
<td>Precision</td>
<td>24 bits</td>
<td>53 bits</td>
</tr>
<tr>
<td>Bias</td>
<td>+127</td>
<td>+1023</td>
</tr>
<tr>
<td>$E_{\text{max}}$</td>
<td>+127</td>
<td>+1023</td>
</tr>
<tr>
<td>$E_{\text{min}}$</td>
<td>-126</td>
<td>-1022</td>
</tr>
</tbody>
</table>

Table 35: Floating-point number formats and parameters

Table 36 shows the ranges of the various numbers in hexadecimal notation.

<table>
<thead>
<tr>
<th>Type</th>
<th>Single-precision</th>
<th>Double-precision</th>
</tr>
</thead>
<tbody>
<tr>
<td>sNaN (Signaling not-a-number)</td>
<td>0x7FFFFFFF to 0x7FC00000, 0xFFC00000 to 0xFFFFFFFF</td>
<td>0x7FFFFFFF 0xFFFFFFFF to 0x7FF80000 0x00000000, 0xFFF80000 0x00000000 to 0xFFFFFFFF 0xFFFFFFFF</td>
</tr>
<tr>
<td>qNaN (Quiet not-a-number)</td>
<td>0x7FBFFFFF to 0x7F800001, 0xFF800001 to 0xFFBFFFFF</td>
<td>0x7FF7FFFF 0xFFFFFFFF to 0x7FF00000 0x00000001, 0xFFF00000 0x00000001 to 0xFFF7FFFF 0xFFFFFFFF</td>
</tr>
<tr>
<td>+INF (Positive infinity)</td>
<td>0x7F800000</td>
<td>0x7FF00000 0x00000000</td>
</tr>
<tr>
<td>+NORM (Positive normalized number)</td>
<td>0x7F7FFFFF to 0x00800000</td>
<td>0x7FEFFFFF 0xFFFFFFFF to 0x00100000 0x00000000</td>
</tr>
</table>

Table 36: Floating-point ranges

### 6.2.1 Non-numbers (NaN)

Figure 32 shows the bit pattern of a non-number (NaN).

A floating-point number is a NaN if the exponent field contains the maximum representable value and the fraction is non-zero, regardless of the value of the sign. In the figure above, x can have a value of 0 or 1. If the most significant bit of the fraction (N, in the figure above) is 1, the value is a signaling NaN (sNaN), otherwise the value is a quiet NaN (qNaN).

When an sNaN is input to an operation that generates a floating-point value (any operation other than copy, FABS, and FNEG), the behavior depends on the EN.V bit:

• When the EN.V bit in the FPSCR register is 0, the operation result (output) is a qNaN.

• When the EN.V bit in the FPSCR register is 1, an invalid operation exception will be generated. In this case, the contents of the operation destination register are unchanged.

If a qNaN is input in an operation that generates a floating-point value, and an sNaN has not been input in that operation, the output will always be a qNaN irrespective of the setting of the EN.V bit in the FPSCR register. An exception will not be generated in this case.

See the individual instruction descriptions for details of floating-point operations when a non-number (NaN) is input.
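
The NaN convention described in this section (most significant fraction bit set means signaling) can be expressed as bit tests on a single-precision pattern. This is an illustrative sketch; the function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* NaN: exponent field all ones and a non-zero fraction, any sign. */
bool sp_is_nan(uint32_t bits)
{
    return (bits & 0x7F800000u) == 0x7F800000u && (bits & 0x007FFFFFu) != 0u;
}

/* sNaN under the convention of this section: fraction MSB (bit 22) is 1. */
bool sp_is_snan(uint32_t bits)
{
    return sp_is_nan(bits) && (bits & 0x00400000u) != 0u;
}

/* qNaN under the convention of this section: fraction MSB (bit 22) is 0. */
bool sp_is_qnan(uint32_t bits)
{
    return sp_is_nan(bits) && (bits & 0x00400000u) == 0u;
}
```

With these tests, 0x7FC00000 is classified as an sNaN, 0x7F800001 as a qNaN, and 0x7F800000 (positive infinity) as neither.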

### 6.2.2 Denormalized numbers

For a denormalized number floating-point value, the exponent field is expressed as 0, and the fraction field as a non-zero value.

When the DN bit in the FPU's status register FPSCR is 1, a denormalized number (source operand or operation result) is always flushed to 0 in a floating-point operation that generates a value (an operation other than copy, FNEG, or FABS).

When the DN bit in FPSCR is 0, a denormalized number (source operand or operation result) is processed as it is. See the individual instruction descriptions for details of floating-point operations when a denormalized number is input.
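
The effect of the DN bit on a denormalized single-precision value can be sketched as follows. The function name `sp_apply_dn` is hypothetical, and the sketch assumes that the flushed zero keeps the sign of the denormalized number; see the individual instruction descriptions for the exact behavior.

```c
#include <stdint.h>

/* Returns the bit pattern an operation would see for `bits`.
 * dn models FPSCR.DN: when nonzero, a denormalized number (exponent
 * field 0, fraction non-zero) is replaced by zero.
 * Assumption: the sign bit is preserved by the flush. */
uint32_t sp_apply_dn(uint32_t bits, int dn)
{
    uint32_t e = bits & 0x7F800000u;
    uint32_t f = bits & 0x007FFFFFu;

    if (dn && e == 0u && f != 0u)
        return bits & 0x80000000u;   /* keep only the sign bit */
    return bits;                     /* DN = 0, or not a denormalized number */
}
```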

### 6.3 Rounding

In a floating-point instruction, rounding is performed when generating the final operation result from the intermediate result. Therefore, the result of a combination instruction such as FMAC, FTRV, or FIPR will differ from the result obtained with a sequence of basic instructions such as FADD, FSUB, and FMUL. For example, rounding is performed once in FMAC, but twice when the same multiply-and-accumulate is built from FMUL followed by FADD.

There are two rounding methods, the method to be used being determined by the RM field in FPSCR.

• RM = 00: Round to Nearest
• RM = 01: Round to Zero
Round to Nearest:
The value is rounded to the nearest expressible value. If there are two nearest expressible values, the one with an LSB of 0 is selected.

If the unrounded value is $2^{E_{\text{max}}} \times (2 - 2^{-P})$ or more, the result will be infinity with the same sign as the unrounded value. The values of $E_{\text{max}}$ and $P$, respectively, are 127 and 24 for single-precision, and 1023 and 53 for double-precision.

Round to Zero:
The digits below the round bit of the unrounded value are discarded.

If the unrounded value is larger than the maximum expressible absolute value, the value will be the maximum expressible absolute value.
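
The two rounding methods can be illustrated on an unsigned integer significand from which some low-order bits must be discarded. This is a sketch of the rounding rule only, under the assumption that the intermediate result is available as an integer; `round_significand` and the `RM_*` constants are hypothetical names.

```c
#include <stdint.h>

enum { RM_NEAREST = 0, RM_ZERO = 1 };   /* mirrors RM = 00 and RM = 01 */

/* Discards the low `drop` bits (1 to 63) of `sig` and rounds the rest.
 * Round to Nearest picks the closer value; on a tie it picks the value
 * whose LSB is 0. Round to Zero simply truncates. */
uint64_t round_significand(uint64_t sig, unsigned drop, int rm)
{
    uint64_t kept = sig >> drop;
    uint64_t rest = sig & ((UINT64_C(1) << drop) - 1);
    uint64_t half = UINT64_C(1) << (drop - 1);

    if (rm == RM_NEAREST) {
        if (rest > half || (rest == half && (kept & 1u)))
            kept += 1;   /* a carry here may require renormalization */
    }
    return kept;
}
```

For instance, discarding two bits from binary 1010 is a tie and gives binary 10 under Round to Nearest (the LSB is already 0), whereas binary 1110 is a tie that rounds up to binary 100.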

### 6.4 Floating-point exceptions

FPU-related exceptions are as follows:

- General FPU disable/slot FPU disable exception

  The exception occurs if an FPU instruction is executed when SR.FD = 1.

- FPU exceptions
  
The exception sources are as follows:
  - FPU error (E): When FPSCR.DN = 0 and a denormalized number is input
  - Invalid operation (V): In case of an invalid operation, such as NaN input
  - Division by zero (Z): Division with a zero divisor
  - Overflow (O): When the operation result overflows
  - Underflow (U): When the operation result underflows
  - Inexact exception (I): When overflow, underflow, or rounding occurs

  The FPSCR cause field contains bits corresponding to all of the above sources E, V, Z, O, U, and I, and the FPSCR flag and enable fields contain bits corresponding to sources V, Z, O, U, and I, but not E. Thus, FPU errors cannot be disabled.

  When an exception source occurs, the corresponding bit in the cause field is set to 1, and the corresponding bit in the flag field is also set to 1. When an exception source does not occur, the corresponding bit in the cause field is cleared to 0, but the corresponding bit in the flag field remains unchanged, so the flag field accumulates sources across instructions.

- Enable/disable exception handling
The SH-4 CPU core supports enable exception handling and disable exception handling.

Enable exception handling is initiated in the following cases:
- FPU error (E): FPSCR.DN = 0 and a denormalized number is input
- Invalid operation (V): FPSCR.EN.V = 1 and (instruction = FTRV or invalid operation)
- Division by zero (Z): FPSCR.EN.Z = 1 and division with a zero divisor
- Overflow (O): FPSCR.EN.O = 1 and instruction with any possibility of the operation result overflowing
- Underflow (U): FPSCR.EN.U = 1 and instruction with any possibility of the operation result underflowing
- Inexact exception (I): FPSCR.EN.I = 1 and instruction with any possibility of an inexact operation result

These possibilities are shown in the individual instruction descriptions. All exception events that originate in the FPU are assigned as the same exception event. The meaning of an exception is determined by software by reading system register FPSCR and interpreting the information it contains. If no bits are set in the cause field of FPSCR when one or more of bits O, U, I, and V (in case of FTRV only) are set in the enable field, this indicates that an actual exception source is not generated. Also, the destination register is not changed by any enable exception handling operation.

In all other cases, the FPU performs disable exception handling. The bit corresponding to source V, Z, O, U, or I is set to 1, and a default result is generated for each exception as follows:
- Invalid operation (V): A qNaN is generated as the result.
- Division by zero (Z): Infinity with the same sign as the unrounded value is generated.
- Overflow (O):
  When rounding mode = RZ, the maximum normalized number, with the same sign as the unrounded value, is generated.
  When rounding mode = RN, infinity with the same sign as the unrounded value is generated.
- Underflow (U):
  When FPSCR.DN = 0, a denormalized number with the same sign as the unrounded value, or zero with the same sign as the unrounded value, is generated.
  When FPSCR.DN = 1, zero with the same sign as the unrounded value is generated.
- Inexact exception (I): An inexact result is generated.
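
Two of the rules in this section lend themselves to a short sketch: the cause and flag field update, and the default overflow result under disable exception handling. The structure below is a software model with hypothetical names (`fpscr_fields`, `fpscr_record`, `sp_overflow_default`); it keeps each field in its own byte and does not reproduce the actual bit positions within FPSCR.

```c
#include <stdint.h>

/* Hypothetical source masks; bit order chosen for this sketch only. */
enum { SRC_I = 0x01, SRC_U = 0x02, SRC_O = 0x04,
       SRC_Z = 0x08, SRC_V = 0x10, SRC_E = 0x20 };

enum { RM_NEAREST = 0, RM_ZERO = 1 };

typedef struct {
    uint8_t cause;   /* E, V, Z, O, U, I */
    uint8_t enable;  /* V, Z, O, U, I    */
    uint8_t flag;    /* V, Z, O, U, I    */
} fpscr_fields;

/* Records the sources raised by one instruction: cause reflects only
 * this instruction, while flag accumulates (there is no E bit in flag). */
void fpscr_record(fpscr_fields *r, uint8_t sources)
{
    r->cause = sources;
    r->flag  = (uint8_t)(r->flag | (sources & 0x1Fu));
}

/* Default single-precision result for overflow when the exception is
 * disabled: maximum normalized number for Round to Zero, infinity for
 * Round to Nearest, with the sign of the unrounded value. */
uint32_t sp_overflow_default(int negative, int rm)
{
    uint32_t sign = negative ? 0x80000000u : 0u;
    return sign | (rm == RM_ZERO ? 0x7F7FFFFFu : 0x7F800000u);
}
```

After an instruction that raises O and I followed by one that raises nothing, the cause field reads 0 while the flag field still shows O and I.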

### 6.5 Graphics support functions

The SH-4 CPU core supports two kinds of graphics functions: new instructions for geometric operations, and pair single-precision transfer instructions that enable high-speed data transfer.

#### 6.5.1 Geometric operation instructions

Geometric operation instructions perform approximate-value computations. To enable high-speed computation with a minimum of hardware, the SH-4 CPU core ignores comparatively small values in the partial computation results of four multiplications. Consequently, the error shown below is produced in the result of the computation:

\[
\text{Maximum error} = \max \left( \text{individual multiplication result} \times 2^{-\min (\text{number of multiplier significant digits} - 1,\ \text{number of multiplicand significant digits} - 1)} \right) + \max \left( \text{result value} \times 2^{-23},\ 2^{-149} \right)
\]

The number of significant digits is 24 for a normalized number and 23 for a denormalized number (number of leading zeros in the fractional part).

In future versions of the SuperH series, the above error is guaranteed, but the same result as SH-4 is not guaranteed.

**FIPR FVm, FVn (m, n: 0, 4, 8, 12):** This instruction is basically used for the following purposes:

- Inner product (m ≠ n):

  This operation is generally used for surface/rear surface determination for polygon surfaces.

- Sum of square of elements (m = n):

  This operation is generally used to find the length of a vector.
Since approximate-value computations are performed to enable high-speed computation, the inexact exception (I) bit in the cause field and flag field is always set to 1 when an FIPR instruction is executed. Therefore, if the corresponding bit is set in the enable field, enable exception handling will be executed.
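
For comparison with the hardware result, the quantity that FIPR approximates can be written as an ordinary four-element inner product. This reference version uses standard C `float` arithmetic, so its rounding will generally not match the approximate hardware computation; `fipr_reference` is a hypothetical name.

```c
/* Reference inner product of two 4-element vectors (the elements of
 * FVm and FVn). Passing the same vector twice gives the sum of squares. */
float fipr_reference(const float fvm[4], const float fvn[4])
{
    float sum = 0.0f;
    for (int i = 0; i < 4; i++)
        sum += fvm[i] * fvn[i];
    return sum;
}
```

For the vectors (1, 2, 3, 4) and (5, 6, 7, 8) the function returns 70, and for (1, 2, 3, 4) with itself it returns 30, the squared length of the vector.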

**FTRV XMTRX, FVn (n: 0, 4, 8, 12):** This instruction is basically used for the following purposes:

- Matrix (4 × 4) × vector (4):
  This operation is generally used for viewpoint changes, angle changes, or movements called vector transformations (4-dimensional). Since affine transformation processing for angle + parallel movement basically requires a 4 × 4 matrix, the SH-4 CPU core supports 4-dimensional operations.

- Matrix (4 × 4) × matrix (4 × 4):
  This operation requires the execution of four FTRV instructions.

Since approximate-value computations are performed to enable high-speed computation, the inexact exception (I) bit in the cause field and flag field is always set to 1 when an FTRV instruction is executed. Therefore, if the corresponding bit is set in the enable field, enable exception handling will be executed. For the same reason, it is not possible to check all data types in the registers beforehand when executing an FTRV instruction; therefore, if the V bit is set in the enable field, enable exception handling will be executed.
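
The transformation that FTRV approximates is a 4 × 4 matrix applied to a 4-element vector, with the result replacing the vector. The sketch below is a plain C reference with a hypothetical name (`ftrv_reference`); the matrix is indexed as `m[row][column]` for readability, which is an assumption of this example and says nothing about how the elements are laid out in XMTRX.

```c
/* Reference matrix-vector product: v = m × v, using a temporary so that
 * every output element is computed from the original vector. */
void ftrv_reference(const float m[4][4], float v[4])
{
    float r[4];

    for (int i = 0; i < 4; i++) {
        r[i] = 0.0f;
        for (int j = 0; j < 4; j++)
            r[i] += m[i][j] * v[j];
    }
    for (int i = 0; i < 4; i++)
        v[i] = r[i];
}
```

A matrix × matrix product can be built from this by calling the function once for each of the four column vectors of the right-hand matrix, which corresponds to the four FTRV executions mentioned above.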

**FRCHG:** This instruction switches the floating-point register banks. For example, when the FTRV instruction is executed, matrix elements must be set in an array in the background bank. However, to create the actual elements of a translation matrix, it is easier to use registers in the foreground bank. When the LDS instruction is used on FPSCR, this instruction expends 4 to 5 cycles in order to maintain the FPU state. With the FRCHG instruction, an FPSCR.FR bit modification can be performed in one cycle.

#### 6.5.2 Pair single-precision data transfer

In addition to the powerful new geometric operation instructions, the SH-4 CPU core also supports high-speed data transfer instructions.

When FPSCR.SZ = 1, the SH-4 CPU core can perform data transfer by means of pair single-precision data transfer instructions.

- FMOV DRm/XDm, DRn/XDn (m, n: 0, 2, 4, 6, 8, 10, 12, 14)
- FMOV DRm/XDm, @Rn (m: 0, 2, 4, 6, 8, 10, 12, 14; n: 0 to 15)

These instructions enable two single-precision (2 × 32-bit) data items to be transferred at once; that is, the transfer performance of these instructions is doubled.

- FSCHG: This instruction changes the value of the SZ bit in FPSCR, enabling fast switching between use and non-use of pair single-precision data transfer.

Instruction set

7.1 Execution environment

PC

At the start of instruction execution, PC indicates the address of the instruction itself.

Data sizes and data types: The SH-4 instruction set is implemented with 16-bit fixed-length instructions. The SH-4 CPU core can use byte (8-bit), word (16-bit), longword (32-bit), and quadword (64-bit) data sizes for memory access. Single-precision floating-point data (32 bits) can be moved to and from memory using longword or quadword size. Double-precision floating-point data (64 bits) can be moved to and from memory using longword size. When a double-precision floating-point operation is specified (FPSCR.PR = 1), the result of an operation using quadword access will be undefined. When the SH-4 CPU core moves byte-size or word-size data from memory to a register, the data is sign-extended.
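
The sign extension applied when byte-size or word-size data is loaded into a 32-bit register can be sketched as follows. The function names are hypothetical, and the arithmetic form is used so that the result does not depend on implementation-defined narrowing conversions.

```c
#include <stdint.h>

/* Sign-extends an 8-bit memory value to 32 bits. */
int32_t load_byte_sign_extended(uint8_t b)
{
    return (int32_t)(b ^ 0x80) - 0x80;
}

/* Sign-extends a 16-bit memory value to 32 bits. */
int32_t load_word_sign_extended(uint16_t w)
{
    return (int32_t)(w ^ 0x8000) - 0x8000;
}
```

For example, a byte of 0x80 becomes -128 (0xFFFFFF80 in the register), while 0x7F stays 127.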

Load-store architecture

The SH-4 CPU core features a load-store architecture in which operations are basically executed using registers. Except for bit-manipulation operations such as logical AND that are executed directly in memory, operands in an operation that requires memory access are loaded into registers and the operation is executed between the registers.
Delayed branches

Except for the two branch instructions BF and BT, the SH-4's branch instructions and RTE are delayed branches. In a delayed branch, the instruction following the branch is executed before the branch destination instruction. This execution slot following a delayed branch is called a delay slot. For example, the BRA execution sequence is as follows:

<table>
<thead>
<tr>
<th>Static sequence</th>
<th>Dynamic sequence</th>
</tr>
</thead>
<tbody>
<tr>
<td>BRA TARGET</td>
<td>BRA TARGET</td>
</tr>
<tr>
<td>ADD R1, R0</td>
<td>ADD R1, R0</td>
</tr>
<tr>
<td>next_2</td>
<td>target_instr</td>
</tr>
</tbody>
</table>

Delay slot

An illegal instruction exception may occur when a specific instruction is executed in a delay slot. See section 5, Exceptions. The instruction following BF/S or BT/S for which the branch is not taken is also a delay slot instruction.

T bit

The T bit in the status register (SR) is used to show the result of a compare operation, and is referenced by a conditional branch instruction. An example of the use of a conditional branch instruction is shown below.

- ADD #1, R0 ; T bit is not changed by the ADD operation
- CMP/EQ R1, R0 ; If R0 = R1, T bit is set to 1
- BT TARGET ; Branches to TARGET if T bit = 1 (R0 = R1)

In an RTE delay slot, status register (SR) bits are referenced as follows. In instruction access, the MD bit is used before modification, and in data access, the MD bit is used after modification. For delay slot instruction execution, the other bits (S, T, M, Q, FD, BL, and RB) are used after modification. The STC and STC.L SR instructions access all SR bits after modification.

Constant values

An 8-bit constant value can be specified by the instruction code and an immediate value. 16-bit and 32-bit constant values can be defined as literal constant values in memory, and can be referenced by a PC-relative load instruction.

\[
\begin{align*}
\text{MOV.W} & \quad @(\text{disp, PC}), \text{Rn} \\
\text{MOV.L} & \quad @(\text{disp, PC}), \text{Rn}
\end{align*}
\]

There are no PC-relative load instructions for floating-point operations. However, it is possible to set 0.0 or 1.0 by using the FLDI0 or FLDI1 instruction on a single-precision floating-point register.
7.2 Addressing modes

Addressing modes and effective address calculation methods are shown in Table 37. When a location in virtual memory space is accessed (MMUCR.AT = 1), the effective address is translated into a physical memory address. If multiple virtual memory space systems are selected (MMUCR.SV = 0), the least significant 8 bits of PTEH are also referenced as the access ASID. See Chapter 3: Memory management unit (MMU) on page 41.

<table>
<thead>
<tr>
<th>Addressing mode</th>
<th>Instruction format</th>
<th>Effective address calculation method</th>
<th>Calculation formula</th>
</tr>
</thead>
<tbody>
<tr>
<td>Register direct</td>
<td>Rn</td>
<td>Effective address is register Rn. (Operand is register Rn contents.)</td>
<td>—</td>
</tr>
<tr>
<td>Register indirect</td>
<td>@Rn</td>
<td>Effective address is register Rn contents.</td>
<td>Rn → EA  (EA: effective address)</td>
</tr>
<tr>
<td>Register indirect with post-increment</td>
<td>@Rn+</td>
<td>Effective address is register Rn contents. A constant is added to Rn after instruction execution: 1 for a byte operand, 2 for a word operand, 4 for a longword operand, 8 for a quadword operand.</td>
<td>Rn → EA  After instruction execution:  Byte: Rn + 1 → Rn  Word: Rn + 2 → Rn  Longword: Rn + 4 → Rn  Quadword: Rn + 8 → Rn</td>
</tr>
</tbody>
</table>

Table 37: Addressing modes and effective addresses

<table>
<thead>
<tr>
<th>Addressing mode</th>
<th>Instruction format</th>
<th>Effective address calculation method</th>
<th>Calculation formula</th>
</tr>
</thead>
<tbody>
<tr>
<td>Register indirect with pre-decrement</td>
<td>@–Rn</td>
<td>Effective address is register Rn contents, decremented by a constant beforehand: 1 for a byte operand, 2 for a word operand, 4 for a longword operand, 8 for a quadword operand.</td>
<td>Byte: Rn – 1 → Rn  Word: Rn – 2 → Rn  Longword: Rn – 4 → Rn  Quadword: Rn – 8 → Rn  Rn → EA  (Instruction executed with Rn after calculation)</td>
</tr>
<tr>
<td>Register indirect with displacement</td>
<td>@(disp:4, Rn)</td>
<td>Effective address is register Rn contents with 4-bit displacement disp added. After disp is zero-extended, it is multiplied by 1 (byte), 2 (word), or 4 (longword), according to the operand size.</td>
<td>Byte: Rn + disp → EA  Word: Rn + disp × 2 → EA  Longword: Rn + disp × 4 → EA</td>
</tr>
<tr>
<td>Indexed register indirect</td>
<td>@(R0, Rn)</td>
<td>Effective address is sum of register Rn and R0 contents.</td>
<td>Rn + R0 → EA</td>
</tr>
<tr>
<td>GBR indirect with displacement</td>
<td>@(disp:8, GBR)</td>
<td>Effective address is register GBR contents with 8-bit displacement disp added. After disp is zero-extended, it is multiplied by 1 (byte), 2 (word), or 4 (longword), according to the operand size.</td>
<td>Byte: GBR + disp → EA  Word: GBR + disp × 2 → EA  Longword: GBR + disp × 4 → EA</td>
</tr>
<tr>
<td>Indexed GBR indirect</td>
<td>@(R0, GBR)</td>
<td>Effective address is sum of register GBR and R0 contents.</td>
<td>GBR + R0 → EA</td>
</tr>
</tbody>
</table>


<table>
<thead>
<tr>
<th>Addressing mode</th>
<th>Instruction format</th>
<th>Effective address calculation method</th>
<th>Calculation formula</th>
</tr>
</thead>
<tbody>
<tr>
<td>PC-relative with displacement</td>
<td>@(disp:8, PC)</td>
<td>Effective address is PC+4 with 8-bit displacement disp added. After disp is zero-extended, it is multiplied by 2 (word) or 4 (longword), according to the operand size. With a longword operand, the lower 2 bits of PC are masked.</td>
<td>Word: PC + 4 + disp × 2 → EA  Longword: PC &amp; 0xFFFFFFFC + 4 + disp × 4 → EA</td>
</tr>
<tr>
<td>PC-relative</td>
<td>disp:8</td>
<td>Effective address is PC+4 with 8-bit displacement disp added after being sign-extended and multiplied by 2.</td>
<td>PC + 4 + disp × 2 → Branch-Target</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Addressing mode</th>
<th>Instruction format</th>
<th>Effective address calculation method</th>
<th>Calculation formula</th>
</tr>
</thead>
<tbody>
<tr>
<td>PC-relative</td>
<td>disp:12</td>
<td>Effective address is PC+4 with 12-bit displacement disp added after being sign-extended and multiplied by 2.</td>
<td>PC + 4 + disp × 2 → Branch-Target</td>
</tr>
<tr>
<td></td>
<td>Rn</td>
<td>Effective address is sum of PC+4 and Rn.</td>
<td>PC + 4 + Rn → Branch-Target</td>
</tr>
<tr>
<td>Immediate</td>
<td>#imm:8</td>
<td>8-bit immediate data imm of TST, AND, OR, or XOR instruction is zero-extended.</td>
<td>—</td>
</tr>
<tr>
<td></td>
<td>#imm:8</td>
<td>8-bit immediate data imm of MOV, ADD, or CMP/EQ instruction is sign-extended.</td>
<td>—</td>
</tr>
</tbody>
</table>


**Note:** For the addressing modes below that use a displacement (disp), the assembler descriptions in this manual show the value before scaling (×1, ×2, or ×4) is performed according to the operand size. This is done to clarify the operation of the chip. Refer to the relevant assembler notation rules for the actual assembler descriptions.

- @ (disp:4, Rn)  
  Register indirect with displacement
- @ (disp:8, GBR)  
  GBR indirect with displacement
- @ (disp:8, PC)  
  PC-relative with displacement
- disp:8, disp:12  
  PC-relative
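
The displacement-based calculations in Table 37 can be summarized in a few helper functions. This is an illustrative sketch with hypothetical function names; `pc` is the address of the instruction itself, and each `disp` argument is the raw (unscaled) field from the instruction code.

```c
#include <stdint.h>

/* @(disp:4, Rn): zero-extended 4-bit disp scaled by the operand size
 * in bytes (1, 2, or 4). */
uint32_t ea_reg_disp(uint32_t rn, uint32_t disp4, uint32_t size)
{
    return rn + (disp4 & 0xFu) * size;
}

/* @(disp:8, PC), word operand: PC + 4 + disp × 2. */
uint32_t ea_pc_disp_word(uint32_t pc, uint32_t disp8)
{
    return pc + 4u + (disp8 & 0xFFu) * 2u;
}

/* @(disp:8, PC), longword operand: lower 2 bits of PC are masked. */
uint32_t ea_pc_disp_long(uint32_t pc, uint32_t disp8)
{
    return (pc & 0xFFFFFFFCu) + 4u + (disp8 & 0xFFu) * 4u;
}

/* disp:8 branch: sign-extended 8-bit disp, multiplied by 2. */
uint32_t branch_target_disp8(uint32_t pc, uint32_t disp8)
{
    int32_t d = (int32_t)((disp8 & 0xFFu) ^ 0x80u) - 0x80;
    return pc + 4u + (uint32_t)(d * 2);
}

/* disp:12 branch: sign-extended 12-bit disp, multiplied by 2. */
uint32_t branch_target_disp12(uint32_t pc, uint32_t disp12)
{
    int32_t d = (int32_t)((disp12 & 0xFFFu) ^ 0x800u) - 0x800;
    return pc + 4u + (uint32_t)(d * 2);
}
```

For example, a longword PC-relative load at address 0x8C001002 with disp = 1 reads from 0x8C001008, and an 8-bit branch at 0x1000 with disp = 0xFE (that is, -2) targets 0x1000.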
7.3 Instruction set summary

Table 38 shows the notation used in the following SH instruction list.

<table>
<thead>
<tr>
<th>Item</th>
<th>Format</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Instruction mnemonic</td>
<td>OP.Sz SRC, DEST</td>
<td>OP: Operation code</td>
</tr>
<tr>
<td></td>
<td></td>
<td>Sz: Size</td>
</tr>
<tr>
<td></td>
<td></td>
<td>SRC: Source</td>
</tr>
<tr>
<td></td>
<td></td>
<td>DEST: Source and/or destination operand</td>
</tr>
<tr>
<td>Summary of operation</td>
<td></td>
<td>→, ←: Transfer direction</td>
</tr>
<tr>
<td></td>
<td></td>
<td>(xx): Memory operand</td>
</tr>
<tr>
<td></td>
<td></td>
<td>M/Q/T: SR flag bits</td>
</tr>
<tr>
<td></td>
<td></td>
<td>&amp;: Logical AND of individual bits</td>
</tr>
<tr>
<td></td>
<td></td>
<td>|: Logical OR of individual bits</td>
</tr>
<tr>
<td></td>
<td></td>
<td>^: Logical exclusive-OR of individual bits</td>
</tr>
<tr>
<td></td>
<td></td>
<td>~: Logical NOT of individual bits</td>
</tr>
<tr>
<td></td>
<td></td>
<td>&lt;&lt;n, &gt;&gt;n: n-bit shift</td>
</tr>
<tr>
<td>Instruction code</td>
<td>MSB ↔ LSB</td>
<td>mmmm: Register number (Rm, FRm)</td>
</tr>
<tr>
<td></td>
<td></td>
<td>nnnn: Register number (Rn, FRn)</td>
</tr>
<tr>
<td></td>
<td></td>
<td>0000: R0, FR0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>0001: R1, FR1</td>
</tr>
<tr>
<td></td>
<td></td>
<td>:</td>
</tr>
<tr>
<td></td>
<td></td>
<td>1111: R15, FR15</td>
</tr>
<tr>
<td></td>
<td></td>
<td>mmm: Register number (DRm, XDm, Rm_BANK)</td>
</tr>
<tr>
<td></td>
<td></td>
<td>nnn: Register number (DRn, XDn, Rn_BANK)</td>
</tr>
<tr>
<td></td>
<td></td>
<td>000: DR0, XD0, R0_BANK</td>
</tr>
<tr>
<td></td>
<td></td>
<td>001: DR2, XD2, R1_BANK</td>
</tr>
<tr>
<td></td>
<td></td>
<td>:</td>
</tr>
<tr>
<td></td>
<td></td>
<td>111: DR14, XD14, R7_BANK</td>
</tr>
<tr>
<td></td>
<td></td>
<td>mm: Register number (FVm)</td>
</tr>
<tr>
<td></td>
<td></td>
<td>nn: Register number (FVn)</td>
</tr>
<tr>
<td></td>
<td></td>
<td>00: FV0</td>
</tr>
<tr>
<td></td>
<td></td>
<td>01: FV4</td>
</tr>
<tr>
<td></td>
<td></td>
<td>10: FV8</td>
</tr>
<tr>
<td></td>
<td></td>
<td>11: FV12</td>
</tr>
<tr>
<td></td>
<td></td>
<td>iiii: Immediate data</td>
</tr>
<tr>
<td></td>
<td></td>
<td>dddd: Displacement</td>
</tr>
<tr>
<td>Privileged mode</td>
<td></td>
<td>“Privileged” means the instruction can only be executed in privileged mode.</td>
</tr>
</tbody>
</table>

Table 38: Notation used in instruction list
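
The register fields in Table 38 can be extracted from a 16-bit instruction code with shifts and masks. The sketch below assumes the common layout in which nnnn occupies bits 11 to 8 and mmmm occupies bits 7 to 4 (as in codes of the form 0000nnnnmmmm0100); the function names are hypothetical.

```c
#include <stdint.h>

/* nnnn field: bits 11 to 8 of the instruction code. */
unsigned field_nnnn(uint16_t code)
{
    return (code >> 8) & 0xFu;
}

/* mmmm field: bits 7 to 4 of the instruction code. */
unsigned field_mmmm(uint16_t code)
{
    return (code >> 4) & 0xFu;
}

/* 3-bit mmm/nnn field to DR/XD register number: 000 -> 0, 001 -> 2, ... */
unsigned dr_number(unsigned field3)
{
    return (field3 & 0x7u) * 2u;
}

/* 2-bit mm/nn field to FV register number: 00 -> 0, 01 -> 4, ... */
unsigned fv_number(unsigned field2)
{
    return (field2 & 0x3u) * 4u;
}
```

For example, the code 0x6123 (binary 0110 0001 0010 0011) has nnnn = 1 and mmmm = 2.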

Note: Scaling ($\times 1, \times 2, \times 4$, or $\times 8$) is executed according to the size of the instruction operand(s).

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T bit</th>
</tr>
</thead>
<tbody>
<tr>
<td>MOV #imm,Rn</td>
<td>imm $\rightarrow$ sign extension $\rightarrow$ Rn</td>
<td>1110nnnniiiiiiii</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.W @(disp,PC),Rn</td>
<td>(disp $\times 2 + PC + 4$) $\rightarrow$ sign extension $\rightarrow$ Rn</td>
<td>1001nnnndddddddd</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.L @(disp,PC),Rn</td>
<td>(disp $\times 4 + PC &amp; 0xFFFFFFFC + 4$) $\rightarrow$ Rn</td>
<td>1101nnnndddddddd</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV Rm,Rn</td>
<td>Rm $\rightarrow$ Rn</td>
<td>0110nnnnmmmm0011</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.B Rm,@Rn</td>
<td>Rm $\rightarrow$ (Rn)</td>
<td>0010nnnnmmmm0000</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.W Rm,@Rn</td>
<td>Rm $\rightarrow$ (Rn)</td>
<td>0010nnnnmmmm0001</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.L Rm,@Rn</td>
<td>Rm $\rightarrow$ (Rn)</td>
<td>0010nnnnmmmm0010</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.B @Rm,Rn</td>
<td>(Rm) $\rightarrow$ sign extension $\rightarrow$ Rn</td>
<td>0110nnnnmmmm0000</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.W @Rm,Rn</td>
<td>(Rm) $\rightarrow$ sign extension $\rightarrow$ Rn</td>
<td>0110nnnnmmmm0001</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.L @Rm,Rn</td>
<td>(Rm) $\rightarrow$ Rn</td>
<td>0110nnnnmmmm0010</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.B Rm,@-Rn</td>
<td>Rn-1 $\rightarrow$ Rn, Rm $\rightarrow$ (Rn)</td>
<td>0010nnnnmmmm0100</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.W Rm,@-Rn</td>
<td>Rn-2 $\rightarrow$ Rn, Rm $\rightarrow$ (Rn)</td>
<td>0010nnnnmmmm0101</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.L Rm,@-Rn</td>
<td>Rn-4 $\rightarrow$ Rn, Rm $\rightarrow$ (Rn)</td>
<td>0010nnnnmmmm0110</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.B @Rm+,Rn</td>
<td>(Rm) $\rightarrow$ sign extension $\rightarrow$ Rn, Rm + 1 $\rightarrow$ Rm</td>
<td>0110nnnnn0100</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.W @Rm+,Rn</td>
<td>(Rm) $\rightarrow$ sign extension $\rightarrow$ Rn, Rm + 2 $\rightarrow$ Rm</td>
<td>0110nnnnn0101</td>
<td>—</td>
<td>—</td>
</tr>
<tr>
<td>MOV.L @Rm+,Rn</td>
<td>(Rm) $\rightarrow$ Rn, Rm + 4 $\rightarrow$ Rm</td>
<td>0110nnnnn0110</td>
<td>—</td>
<td>—</td>
</tr>
</tbody>
</table>

### Table 39: Fixed-point transfer instructions
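The instruction code column uses 0 and 1 for fixed bits and letters (n, m, d, i) for operand fields. As an illustration only, the following Python sketch matches a 16-bit instruction word against such a pattern and extracts the fields. The function name `match_opcode` is hypothetical, and the pattern strings used here assume the full 16-bit SH-4 encodings (for example `0110nnnnmmmm0011` for MOV Rm,Rn).

```python
def match_opcode(pattern: str, word: int):
    """Match a 16-bit instruction word against a pattern string.

    '0' and '1' are fixed bits; any other character names an operand
    field whose bits are collected most-significant first.
    Returns a dict of field values, or None when a fixed bit differs.
    """
    if len(pattern) != 16:
        raise ValueError("pattern must describe exactly 16 bits")
    fields = {}
    for pos, ch in enumerate(pattern):
        bit = (word >> (15 - pos)) & 1  # pattern is written MSB first
        if ch in "01":
            if bit != int(ch):
                return None  # fixed bit mismatch: not this instruction
        else:
            fields[ch] = (fields.get(ch, 0) << 1) | bit
    return fields
```

For example, under these assumptions the word 0x6123 matches the MOV Rm,Rn pattern with n = 1 and m = 2, that is, MOV R2,R1.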

<table>
<thead>
<tr>
<th>Item</th>
<th>Format</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>T bit</td>
<td>Value of T bit after instruction execution</td>
<td>—: No change</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T bit</th>
</tr>
</thead>
<tbody>
<tr><td>MOV.B R0,@(disp,Rn)</td><td>R0 → (disp + Rn)</td><td>10000000nnnndddd</td><td>—</td><td>—</td></tr>
<tr><td>MOV.W R0,@(disp,Rn)</td><td>R0 → (disp × 2 + Rn)</td><td>10000001nnnndddd</td><td>—</td><td>—</td></tr>
<tr><td>MOV.L Rm,@(disp,Rn)</td><td>Rm → (disp × 4 + Rn)</td><td>0001nnnnmmmmdddd</td><td>—</td><td>—</td></tr>
<tr><td>MOV.B @(disp,Rm),R0</td><td>(disp + Rm) → sign extension → R0</td><td>10000100mmmmdddd</td><td>—</td><td>—</td></tr>
<tr><td>MOV.W @(disp,Rm),R0</td><td>(disp × 2 + Rm) → sign extension → R0</td><td>10000101mmmmdddd</td><td>—</td><td>—</td></tr>
<tr><td>MOV.L @(disp,Rm),Rn</td><td>(disp × 4 + Rm) → Rn</td><td>0101nnnnmmmmdddd</td><td>—</td><td>—</td></tr>
<tr><td>MOV.B Rm,@(R0,Rn)</td><td>Rm → (R0 + Rn)</td><td>0000nnnnmmmm0100</td><td>—</td><td>—</td></tr>
<tr><td>MOV.W Rm,@(R0,Rn)</td><td>Rm → (R0 + Rn)</td><td>0000nnnnmmmm0101</td><td>—</td><td>—</td></tr>
<tr><td>MOV.L Rm,@(R0,Rn)</td><td>Rm → (R0 + Rn)</td><td>0000nnnnmmmm0110</td><td>—</td><td>—</td></tr>
<tr><td>MOV.B @(R0,Rm),Rn</td><td>(R0 + Rm) → sign extension → Rn</td><td>0000nnnnmmmm1100</td><td>—</td><td>—</td></tr>
<tr><td>MOV.W @(R0,Rm),Rn</td><td>(R0 + Rm) → sign extension → Rn</td><td>0000nnnnmmmm1101</td><td>—</td><td>—</td></tr>
<tr><td>MOV.L @(R0,Rm),Rn</td><td>(R0 + Rm) → Rn</td><td>0000nnnnmmmm1110</td><td>—</td><td>—</td></tr>
<tr><td>MOV.B R0,@(disp,GBR)</td><td>R0 → (disp + GBR)</td><td>11000000dddddddd</td><td>—</td><td>—</td></tr>
<tr><td>MOV.W R0,@(disp,GBR)</td><td>R0 → (disp × 2 + GBR)</td><td>11000001dddddddd</td><td>—</td><td>—</td></tr>
<tr><td>MOV.L R0,@(disp,GBR)</td><td>R0 → (disp × 4 + GBR)</td><td>11000010dddddddd</td><td>—</td><td>—</td></tr>
<tr><td>MOV.B @(disp,GBR),R0</td><td>(disp + GBR) → sign extension → R0</td><td>11000100dddddddd</td><td>—</td><td>—</td></tr>
<tr><td>MOV.W @(disp,GBR),R0</td><td>(disp × 2 + GBR) → sign extension → R0</td><td>11000101dddddddd</td><td>—</td><td>—</td></tr>
<tr><td>MOV.L @(disp,GBR),R0</td><td>(disp × 4 + GBR) → R0</td><td>11000110dddddddd</td><td>—</td><td>—</td></tr>
<tr><td>MOVA @(disp,PC),R0</td><td>disp × 4 + PC &amp; 0xFFFFFFFC + 4 → R0</td><td>11000111dddddddd</td><td>—</td><td>—</td></tr>
<tr><td>MOVT Rn</td><td>T → Rn</td><td>0000nnnn00101001</td><td>—</td><td>—</td></tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T bit</th>
</tr>
</thead>
<tbody>
<tr><td>SWAP.B Rm,Rn</td><td>Rm → swap lower 2 bytes → Rn</td><td>0110nnnnmmmm1000</td><td>—</td><td>—</td></tr>
<tr><td>SWAP.W Rm,Rn</td><td>Rm → swap upper/lower words → Rn</td><td>0110nnnnmmmm1001</td><td>—</td><td>—</td></tr>
<tr><td>XTRCT Rm,Rn</td><td>Rm:Rn middle 32 bits → Rn</td><td>0010nnnnmmmm1101</td><td>—</td><td>—</td></tr>
</tbody>
</table>

Table 39: Fixed-point transfer instructions
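The PC-relative forms scale the displacement by the operand size and, for longword access, align the PC before adding. The sketch below works through that arithmetic. It assumes that disp is an 8-bit zero-extended field and that the alignment mask clears the two low bits of PC (0xFFFFFFFC); the function names are hypothetical.

```python
def pc_relative_word_address(pc: int, disp: int) -> int:
    """Effective address of MOV.W @(disp,PC),Rn: disp x 2 + PC + 4."""
    if not 0 <= disp <= 0xFF:
        raise ValueError("disp is an 8-bit zero-extended field")
    return (disp * 2 + pc + 4) & 0xFFFFFFFF


def pc_relative_long_address(pc: int, disp: int) -> int:
    """Effective address of MOV.L @(disp,PC),Rn and MOVA:
    disp x 4 + (PC & 0xFFFFFFFC) + 4, so the result is longword aligned."""
    if not 0 <= disp <= 0xFF:
        raise ValueError("disp is an 8-bit zero-extended field")
    return (disp * 4 + (pc & 0xFFFFFFFC) + 4) & 0xFFFFFFFF
```

With PC = 0x1002 and disp = 1, both functions yield 0x1008: the word form adds 2 + 4 to the unaligned PC, while the longword form first aligns PC down to 0x1000 and then adds 4 + 4.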

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T Bit</th>
</tr>
</thead>
<tbody>
<tr><td>ADD Rm,Rn</td><td>Rn + Rm → Rn</td><td>0011nnnnmmmm1100</td><td>—</td><td>—</td></tr>
<tr><td>ADD #imm,Rn</td><td>Rn + imm → Rn</td><td>0111nnnniiiiiiii</td><td>—</td><td>—</td></tr>
<tr><td>ADDC Rm,Rn</td><td>Rn + Rm + T → Rn, carry → T</td><td>0011nnnnmmmm1110</td><td>—</td><td>Carry</td></tr>
<tr><td>ADDV Rm,Rn</td><td>Rn + Rm → Rn, overflow → T</td><td>0011nnnnmmmm1111</td><td>—</td><td>Overflow</td></tr>
<tr><td>CMP/EQ #imm,R0</td><td>When R0 = imm, 1 → T; otherwise, 0 → T</td><td>10001000iiiiiiii</td><td>—</td><td>Comparison result</td></tr>
<tr><td>CMP/EQ Rm,Rn</td><td>When Rn = Rm, 1 → T; otherwise, 0 → T</td><td>0011nnnnmmmm0000</td><td>—</td><td>Comparison result</td></tr>
<tr><td>CMP/HS Rm,Rn</td><td>When Rn ≥ Rm (unsigned), 1 → T; otherwise, 0 → T</td><td>0011nnnnmmmm0010</td><td>—</td><td>Comparison result</td></tr>
<tr><td>CMP/GE Rm,Rn</td><td>When Rn ≥ Rm (signed), 1 → T; otherwise, 0 → T</td><td>0011nnnnmmmm0011</td><td>—</td><td>Comparison result</td></tr>
<tr><td>CMP/HI Rm,Rn</td><td>When Rn &gt; Rm (unsigned), 1 → T; otherwise, 0 → T</td><td>0011nnnnmmmm0110</td><td>—</td><td>Comparison result</td></tr>
<tr><td>CMP/GT Rm,Rn</td><td>When Rn &gt; Rm (signed), 1 → T; otherwise, 0 → T</td><td>0011nnnnmmmm0111</td><td>—</td><td>Comparison result</td></tr>
</tbody>
</table>

Table 40: Arithmetic operation instructions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T Bit</th>
</tr>
</thead>
<tbody>
<tr><td>CMP/PZ Rn</td><td>When Rn ≥ 0, 1 → T; otherwise, 0 → T</td><td>0100nnnn00010001</td><td>—</td><td>Comparison result</td></tr>
<tr><td>CMP/PL Rn</td><td>When Rn &gt; 0, 1 → T; otherwise, 0 → T</td><td>0100nnnn00010101</td><td>—</td><td>Comparison result</td></tr>
<tr><td>CMP/STR Rm,Rn</td><td>When any bytes are equal, 1 → T; otherwise, 0 → T</td><td>0010nnnnmmmm1100</td><td>—</td><td>Comparison result</td></tr>
<tr><td>DIV1 Rm,Rn</td><td>1-step division (Rn ÷ Rm)</td><td>0011nnnnmmmm0100</td><td>—</td><td>Calculation result</td></tr>
<tr><td>DIV0S Rm,Rn</td><td>MSB of Rn → Q, MSB of Rm → M, M^Q → T</td><td>0010nnnnmmmm0111</td><td>—</td><td>Calculation result</td></tr>
<tr><td>DIV0U</td><td>0 → M/Q/T</td><td>0000000000011001</td><td>—</td><td>0</td></tr>
<tr><td>DMULS.L Rm,Rn</td><td>Signed, Rn × Rm → MAC, 32 × 32 → 64 bits</td><td>0011nnnnmmmm1101</td><td>—</td><td>—</td></tr>
<tr><td>DMULU.L Rm,Rn</td><td>Unsigned, Rn × Rm → MAC, 32 × 32 → 64 bits</td><td>0011nnnnmmmm0101</td><td>—</td><td>—</td></tr>
<tr><td>DT Rn</td><td>Rn – 1 → Rn; when Rn = 0, 1 → T; when Rn ≠ 0, 0 → T</td><td>0100nnnn00010000</td><td>—</td><td>Comparison result</td></tr>
<tr><td>EXTS.B Rm,Rn</td><td>Rm sign-extended from byte → Rn</td><td>0110nnnnmmmm1110</td><td>—</td><td>—</td></tr>
<tr><td>EXTS.W Rm,Rn</td><td>Rm sign-extended from word → Rn</td><td>0110nnnnmmmm1111</td><td>—</td><td>—</td></tr>
<tr><td>EXTU.B Rm,Rn</td><td>Rm zero-extended from byte → Rn</td><td>0110nnnnmmmm1100</td><td>—</td><td>—</td></tr>
<tr><td>EXTU.W Rm,Rn</td><td>Rm zero-extended from word → Rn</td><td>0110nnnnmmmm1101</td><td>—</td><td>—</td></tr>
<tr><td>MAC.L @Rm+,@Rn+</td><td>Signed, (Rn) × (Rm) + MAC → MAC, Rn + 4 → Rn, Rm + 4 → Rm; 32 × 32 + 64 → 64 bits</td><td>0000nnnnmmmm1111</td><td>—</td><td>—</td></tr>
</tbody>
</table>
### Table 40: Arithmetic operation instructions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T Bit</th>
</tr>
</thead>
<tbody>
<tr><td>MAC.W @Rm+,@Rn+</td><td>Signed, (Rn) × (Rm) + MAC → MAC, Rn + 2 → Rn, Rm + 2 → Rm; 16 × 16 + 64 → 64 bits</td><td>0100nnnnmmmm1111</td><td>—</td><td>—</td></tr>
<tr><td>MUL.L Rm,Rn</td><td>Rn × Rm → MACL; 32 × 32 → 32 bits</td><td>0000nnnnmmmm0111</td><td>—</td><td>—</td></tr>
<tr><td>MULS.W Rm,Rn</td><td>Signed, Rn × Rm → MACL; 16 × 16 → 32 bits</td><td>0010nnnnmmmm1111</td><td>—</td><td>—</td></tr>
<tr><td>MULU.W Rm,Rn</td><td>Unsigned, Rn × Rm → MACL; 16 × 16 → 32 bits</td><td>0010nnnnmmmm1110</td><td>—</td><td>—</td></tr>
<tr><td>NEG Rm,Rn</td><td>0 – Rm → Rn</td><td>0110nnnnmmmm1011</td><td>—</td><td>—</td></tr>
<tr><td>NEGC Rm,Rn</td><td>0 – Rm – T → Rn, borrow → T</td><td>0110nnnnmmmm1010</td><td>—</td><td>Borrow</td></tr>
<tr><td>SUB Rm,Rn</td><td>Rn – Rm → Rn</td><td>0011nnnnmmmm1000</td><td>—</td><td>—</td></tr>
<tr><td>SUBC Rm,Rn</td><td>Rn – Rm – T → Rn, borrow → T</td><td>0011nnnnmmmm1010</td><td>—</td><td>Borrow</td></tr>
<tr><td>SUBV Rm,Rn</td><td>Rn – Rm → Rn, underflow → T</td><td>0011nnnnmmmm1011</td><td>—</td><td>Underflow</td></tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T Bit</th>
</tr>
</thead>
<tbody>
<tr><td>AND Rm,Rn</td><td>Rn &amp; Rm → Rn</td><td>0010nnnnmmmm1001</td><td>—</td><td>—</td></tr>
<tr><td>AND #imm,R0</td><td>R0 &amp; imm → R0</td><td>11001001iiiiiiii</td><td>—</td><td>—</td></tr>
<tr><td>AND.B #imm,@(R0,GBR)</td><td>(R0 + GBR) &amp; imm → (R0 + GBR)</td><td>11001101iiiiiiii</td><td>—</td><td>—</td></tr>
<tr><td>NOT Rm,Rn</td><td>~Rm → Rn</td><td>0110nnnnmmmm0111</td><td>—</td><td>—</td></tr>
<tr><td>OR Rm,Rn</td><td>Rn | Rm → Rn</td><td>0010nnnnmmmm1011</td><td>—</td><td>—</td></tr>
<tr><td>OR #imm,R0</td><td>R0 | imm → R0</td><td>11001011iiiiiiii</td><td>—</td><td>—</td></tr>
<tr><td>OR.B #imm,@(R0,GBR)</td><td>(R0 + GBR) | imm → (R0 + GBR)</td><td>11001111iiiiiiii</td><td>—</td><td>—</td></tr>
<tr><td>TAS.B @Rn</td><td>When (Rn) = 0, 1 → T; otherwise, 0 → T; in both cases, 1 → MSB of (Rn)</td><td>0100nnnn00011011</td><td>—</td><td>Test result</td></tr>
<tr><td>TST Rm,Rn</td><td>Rn &amp; Rm; when result = 0, 1 → T; otherwise, 0 → T</td><td>0010nnnnmmmm1000</td><td>—</td><td>Test result</td></tr>
<tr><td>TST #imm,R0</td><td>R0 &amp; imm; when result = 0, 1 → T; otherwise, 0 → T</td><td>11001000iiiiiiii</td><td>—</td><td>Test result</td></tr>
<tr><td>TST.B #imm,@(R0,GBR)</td><td>(R0 + GBR) &amp; imm; when result = 0, 1 → T; otherwise, 0 → T</td><td>11001100iiiiiiii</td><td>—</td><td>Test result</td></tr>
<tr><td>XOR Rm,Rn</td><td>Rn ^ Rm → Rn</td><td>0010nnnnmmmm1010</td><td>—</td><td>—</td></tr>
<tr><td>XOR #imm,R0</td><td>R0 ^ imm → R0</td><td>11001010iiiiiiii</td><td>—</td><td>—</td></tr>
<tr><td>XOR.B #imm,@(R0,GBR)</td><td>(R0 + GBR) ^ imm → (R0 + GBR)</td><td>11001110iiiiiiii</td><td>—</td><td>—</td></tr>
</tbody>
</table>

**Table 41: Logic operation instructions**

### Table 42: Shift instructions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T bit</th>
</tr>
</thead>
<tbody>
<tr><td>ROTL Rn</td><td>T ← Rn ← MSB</td><td>0100nnnn00000100</td><td>—</td><td>MSB</td></tr>
<tr><td>ROTR Rn</td><td>LSB → Rn → T</td><td>0100nnnn00000101</td><td>—</td><td>LSB</td></tr>
<tr><td>ROTCL Rn</td><td>T ← Rn ← T</td><td>0100nnnn00100100</td><td>—</td><td>MSB</td></tr>
<tr><td>ROTCR Rn</td><td>T → Rn → T</td><td>0100nnnn00100101</td><td>—</td><td>LSB</td></tr>
<tr><td>SHAD Rm,Rn</td><td>When Rm ≥ 0, Rn &lt;&lt; Rm → Rn; when Rm &lt; 0, Rn &gt;&gt; Rm → [MSB → Rn]</td><td>0100nnnnmmmm1100</td><td>—</td><td>—</td></tr>
<tr><td>SHAL Rn</td><td>T ← Rn ← 0</td><td>0100nnnn00100000</td><td>—</td><td>MSB</td></tr>
<tr><td>SHAR Rn</td><td>MSB → Rn → T</td><td>0100nnnn00100001</td><td>—</td><td>LSB</td></tr>
<tr><td>SHLD Rm,Rn</td><td>When Rm ≥ 0, Rn &lt;&lt; Rm → Rn; when Rm &lt; 0, Rn &gt;&gt; Rm → [0 → Rn]</td><td>0100nnnnmmmm1101</td><td>—</td><td>—</td></tr>
<tr><td>SHLL Rn</td><td>T ← Rn ← 0</td><td>0100nnnn00000000</td><td>—</td><td>MSB</td></tr>
<tr><td>SHLR Rn</td><td>0 → Rn → T</td><td>0100nnnn00000001</td><td>—</td><td>LSB</td></tr>
<tr><td>SHLL2 Rn</td><td>Rn &lt;&lt; 2 → Rn</td><td>0100nnnn00001000</td><td>—</td><td>—</td></tr>
<tr><td>SHLR2 Rn</td><td>Rn &gt;&gt; 2 → Rn</td><td>0100nnnn00001001</td><td>—</td><td>—</td></tr>
<tr><td>SHLL8 Rn</td><td>Rn &lt;&lt; 8 → Rn</td><td>0100nnnn00011000</td><td>—</td><td>—</td></tr>
<tr><td>SHLR8 Rn</td><td>Rn &gt;&gt; 8 → Rn</td><td>0100nnnn00011001</td><td>—</td><td>—</td></tr>
<tr><td>SHLL16 Rn</td><td>Rn &lt;&lt; 16 → Rn</td><td>0100nnnn00101000</td><td>—</td><td>—</td></tr>
<tr><td>SHLR16 Rn</td><td>Rn &gt;&gt; 16 → Rn</td><td>0100nnnn00101001</td><td>—</td><td>—</td></tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T bit</th>
</tr>
</thead>
<tbody>
<tr><td>BF label</td><td>When T = 0, disp × 2 + PC + 4 → PC; when T = 1, nop</td><td>10001011dddddddd</td><td>—</td><td>—</td></tr>
<tr><td>BF/S label</td><td>Delayed branch; when T = 0, disp × 2 + PC + 4 → PC; when T = 1, nop</td><td>10001111dddddddd</td><td>—</td><td>—</td></tr>
<tr><td>BT label</td><td>When T = 1, disp × 2 + PC + 4 → PC; when T = 0, nop</td><td>10001001dddddddd</td><td>—</td><td>—</td></tr>
<tr><td>BT/S label</td><td>Delayed branch; when T = 1, disp × 2 + PC + 4 → PC; when T = 0, nop</td><td>10001101dddddddd</td><td>—</td><td>—</td></tr>
<tr><td>BRA label</td><td>Delayed branch, disp × 2 + PC + 4 → PC</td><td>1010dddddddddddd</td><td>—</td><td>—</td></tr>
<tr><td>BRAF Rn</td><td>Rn + PC + 4 → PC</td><td>0000nnnn00100011</td><td>—</td><td>—</td></tr>
<tr><td>BSR label</td><td>Delayed branch, PC + 4 → PR, disp × 2 + PC + 4 → PC</td><td>1011dddddddddddd</td><td>—</td><td>—</td></tr>
<tr><td>BSRF Rn</td><td>Delayed branch, PC + 4 → PR, Rn + PC + 4 → PC</td><td>0000nnnn00000011</td><td>—</td><td>—</td></tr>
<tr><td>JMP @Rn</td><td>Delayed branch, Rn → PC</td><td>0100nnnn00101011</td><td>—</td><td>—</td></tr>
<tr><td>JSR @Rn</td><td>Delayed branch, PC + 4 → PR, Rn → PC</td><td>0100nnnn00001011</td><td>—</td><td>—</td></tr>
<tr><td>RTS</td><td>Delayed branch, PR → PC</td><td>0000000000001011</td><td>—</td><td>—</td></tr>
</tbody>
</table>

Table 43: Branch instructions
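The branch target computation disp × 2 + PC + 4 can be sketched as follows. This illustration assumes that disp is sign-extended before scaling (8 bits for BF, BF/S, BT and BT/S; 12 bits for BRA and BSR); the helper names are hypothetical.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def branch_target(pc: int, disp: int, disp_bits: int) -> int:
    """Target of a PC-relative branch: disp x 2 + PC + 4 (32-bit wrap)."""
    return (sign_extend(disp, disp_bits) * 2 + pc + 4) & 0xFFFFFFFF
```

Under these assumptions, a BF at PC = 0x1000 with disp = 0xFE (that is, -2) branches back to 0x1000, and a BRA at the same PC with disp = 0x002 branches to 0x1008.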

### Table 44: System control instructions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T bit</th>
</tr>
</thead>
<tbody>
<tr><td>CLRMAC</td><td>0 → MACH, MACL</td><td>0000000000101000</td><td>—</td><td>—</td></tr>
<tr><td>CLRS</td><td>0 → S</td><td>0000000001001000</td><td>—</td><td>—</td></tr>
<tr><td>CLRT</td><td>0 → T</td><td>0000000000001000</td><td>—</td><td>0</td></tr>
<tr><td>LDC Rm,SR</td><td>Rm → SR</td><td>0100mmmm00001110</td><td>Privileged</td><td>LSB</td></tr>
<tr><td>LDC Rm,GBR</td><td>Rm → GBR</td><td>0100mmmm00011110</td><td>—</td><td>—</td></tr>
<tr><td>LDC Rm,VBR</td><td>Rm → VBR</td><td>0100mmmm00101110</td><td>Privileged</td><td>—</td></tr>
<tr><td>LDC Rm,SSR</td><td>Rm → SSR</td><td>0100mmmm00111110</td><td>Privileged</td><td>—</td></tr>
<tr><td>LDC Rm,SPC</td><td>Rm → SPC</td><td>0100mmmm01001110</td><td>Privileged</td><td>—</td></tr>
<tr><td>LDC Rm,DBR</td><td>Rm → DBR</td><td>0100mmmm11111010</td><td>Privileged</td><td>—</td></tr>
<tr><td>LDC Rm,Rn_BANK</td><td>Rm → Rn_BANK (n = 0 to 7)</td><td>0100mmmm1nnn1110</td><td>Privileged</td><td>—</td></tr>
<tr><td>LDC.L @Rm+,SR</td><td>(Rm) → SR, Rm + 4 → Rm</td><td>0100mmmm00000111</td><td>Privileged</td><td>LSB</td></tr>
<tr><td>LDC.L @Rm+,GBR</td><td>(Rm) → GBR, Rm + 4 → Rm</td><td>0100mmmm00010111</td><td>—</td><td>—</td></tr>
<tr><td>LDC.L @Rm+,VBR</td><td>(Rm) → VBR, Rm + 4 → Rm</td><td>0100mmmm00100111</td><td>Privileged</td><td>—</td></tr>
<tr><td>LDC.L @Rm+,SSR</td><td>(Rm) → SSR, Rm + 4 → Rm</td><td>0100mmmm00110111</td><td>Privileged</td><td>—</td></tr>
<tr><td>LDC.L @Rm+,SPC</td><td>(Rm) → SPC, Rm + 4 → Rm</td><td>0100mmmm01000111</td><td>Privileged</td><td>—</td></tr>
<tr><td>LDC.L @Rm+,DBR</td><td>(Rm) → DBR, Rm + 4 → Rm</td><td>0100mmmm11110110</td><td>Privileged</td><td>—</td></tr>
<tr><td>LDC.L @Rm+,Rn_BANK</td><td>(Rm) → Rn_BANK, Rm + 4 → Rm</td><td>0100mmmm1nnn0111</td><td>Privileged</td><td>—</td></tr>
<tr><td>LDS Rm,MACH</td><td>Rm → MACH</td><td>0100mmmm00001010</td><td>—</td><td>—</td></tr>
<tr><td>LDS Rm,MACL</td><td>Rm → MACL</td><td>0100mmmm00011010</td><td>—</td><td>—</td></tr>
<tr><td>LDS Rm,PR</td><td>Rm → PR</td><td>0100mmmm00101010</td><td>—</td><td>—</td></tr>
<tr><td>LDS.L @Rm+,MACH</td><td>(Rm) → MACH, Rm + 4 → Rm</td><td>0100mmmm00000110</td><td>—</td><td>—</td></tr>
<tr><td>LDS.L @Rm+,MACL</td><td>(Rm) → MACL, Rm + 4 → Rm</td><td>0100mmmm00010110</td><td>—</td><td>—</td></tr>
<tr><td>LDS.L @Rm+,PR</td><td>(Rm) → PR, Rm + 4 → Rm</td><td>0100mmmm00100110</td><td>—</td><td>—</td></tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T bit</th>
</tr>
</thead>
<tbody>
<tr><td>LDTLB</td><td>PTEH/PTEL → TLB</td><td>0000000000111000</td><td>Privileged</td><td>—</td></tr>
<tr><td>MOVCA.L R0,@Rn</td><td>R0 → (Rn) (without fetching cache block)</td><td>0000nnnn11000011</td><td>—</td><td>—</td></tr>
<tr><td>NOP</td><td>No operation</td><td>0000000000001001</td><td>—</td><td>—</td></tr>
<tr><td>OCBI @Rn</td><td>Invalidates operand cache block</td><td>0000nnnn10010011</td><td>—</td><td>—</td></tr>
<tr><td>OCBP @Rn</td><td>Writes back and invalidates operand cache block</td><td>0000nnnn10100011</td><td>—</td><td>—</td></tr>
<tr><td>OCBWB @Rn</td><td>Writes back operand cache block</td><td>0000nnnn10110011</td><td>—</td><td>—</td></tr>
<tr><td>PREF @Rn</td><td>(Rn) → operand cache</td><td>0000nnnn10000011</td><td>—</td><td>—</td></tr>
<tr><td>RTE</td><td>Delayed branch, SSR/SPC → SR/PC</td><td>0000000000101011</td><td>Privileged</td><td>—</td></tr>
<tr><td>SETS</td><td>1 → S</td><td>0000000001011000</td><td>—</td><td>—</td></tr>
<tr><td>SETT</td><td>1 → T</td><td>0000000000011000</td><td>—</td><td>1</td></tr>
<tr><td>SLEEP</td><td>Sleep or standby</td><td>0000000000011011</td><td>Privileged</td><td>—</td></tr>
<tr><td>STC SR,Rn</td><td>SR → Rn</td><td>0000nnnn00000010</td><td>Privileged</td><td>—</td></tr>
<tr><td>STC GBR,Rn</td><td>GBR → Rn</td><td>0000nnnn00010010</td><td>—</td><td>—</td></tr>
<tr><td>STC VBR,Rn</td><td>VBR → Rn</td><td>0000nnnn00100010</td><td>Privileged</td><td>—</td></tr>
<tr><td>STC SSR,Rn</td><td>SSR → Rn</td><td>0000nnnn00110010</td><td>Privileged</td><td>—</td></tr>
<tr><td>STC SPC,Rn</td><td>SPC → Rn</td><td>0000nnnn01000010</td><td>Privileged</td><td>—</td></tr>
<tr><td>STC SGR,Rn</td><td>SGR → Rn</td><td>0000nnnn00111010</td><td>Privileged</td><td>—</td></tr>
<tr><td>STC DBR,Rn</td><td>DBR → Rn</td><td>0000nnnn11111010</td><td>Privileged</td><td>—</td></tr>
<tr><td>STC Rm_BANK,Rn</td><td>Rm_BANK → Rn (m = 0 to 7)</td><td>0000nnnn1mmm0010</td><td>Privileged</td><td>—</td></tr>
<tr><td>STC.L SR,@-Rn</td><td>Rn – 4 → Rn, SR → (Rn)</td><td>0100nnnn00000011</td><td>Privileged</td><td>—</td></tr>
<tr><td>STC.L GBR,@-Rn</td><td>Rn – 4 → Rn, GBR → (Rn)</td><td>0100nnnn00010011</td><td>—</td><td>—</td></tr>
<tr><td>STC.L VBR,@-Rn</td><td>Rn – 4 → Rn, VBR → (Rn)</td><td>0100nnnn00100011</td><td>Privileged</td><td>—</td></tr>
</tbody>
</table>

**Notes:**
- **Privileged:** Indicates whether the instruction can be executed only in privileged mode.
- **T bit:** Gives the value of the T bit after instruction execution; — indicates no change.

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T bit</th>
</tr>
</thead>
<tbody>
<tr><td>STC.L SSR,@-Rn</td><td>Rn – 4 → Rn, SSR → (Rn)</td><td>0100nnnn00110011</td><td>Privileged</td><td>—</td></tr>
<tr><td>STC.L SPC,@-Rn</td><td>Rn – 4 → Rn, SPC → (Rn)</td><td>0100nnnn01000011</td><td>Privileged</td><td>—</td></tr>
<tr><td>STC.L SGR,@-Rn</td><td>Rn – 4 → Rn, SGR → (Rn)</td><td>0100nnnn00110010</td><td>Privileged</td><td>—</td></tr>
<tr><td>STC.L DBR,@-Rn</td><td>Rn – 4 → Rn, DBR → (Rn)</td><td>0100nnnn11110010</td><td>Privileged</td><td>—</td></tr>
<tr><td>STC.L Rm_BANK,@-Rn</td><td>Rn – 4 → Rn, Rm_BANK → (Rn) (m = 0 to 7)</td><td>0100nnnn1mmm0011</td><td>Privileged</td><td>—</td></tr>
<tr><td>STS MACH,Rn</td><td>MACH → Rn</td><td>0000nnnn00001010</td><td>—</td><td>—</td></tr>
<tr><td>STS MACL,Rn</td><td>MACL → Rn</td><td>0000nnnn00011010</td><td>—</td><td>—</td></tr>
<tr><td>STS PR,Rn</td><td>PR → Rn</td><td>0000nnnn00101010</td><td>—</td><td>—</td></tr>
<tr><td>STS.L MACH,@-Rn</td><td>Rn – 4 → Rn, MACH → (Rn)</td><td>0100nnnn00000010</td><td>—</td><td>—</td></tr>
<tr><td>STS.L MACL,@-Rn</td><td>Rn – 4 → Rn, MACL → (Rn)</td><td>0100nnnn00010010</td><td>—</td><td>—</td></tr>
<tr><td>STS.L PR,@-Rn</td><td>Rn – 4 → Rn, PR → (Rn)</td><td>0100nnnn00100010</td><td>—</td><td>—</td></tr>
<tr><td>TRAPA #imm</td><td>PC + 2 → SPC, SR → SSR, #imm &lt;&lt; 2 → TRA, 0x160 → EXPEVT, VBR + 0x0100 → PC</td><td>11000011iiiiiiii</td><td>—</td><td>—</td></tr>
</tbody>
</table>

Table 44: System control instructions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T bit</th>
</tr>
</thead>
<tbody>
<tr><td>FLDI0 FRn</td><td>0x00000000 → FRn</td><td>1111nnnn10001101</td><td>—</td><td>—</td></tr>
<tr><td>FLDI1 FRn</td><td>0x3F800000 → FRn</td><td>1111nnnn10011101</td><td>—</td><td>—</td></tr>
<tr><td>FMOV FRm,FRn</td><td>FRm → FRn</td><td>1111nnnnmmmm1100</td><td>—</td><td>—</td></tr>
<tr><td>FMOV.S @Rm,FRn</td><td>(Rm) → FRn</td><td>1111nnnnmmmm1000</td><td>—</td><td>—</td></tr>
<tr><td>FMOV.S @(R0,Rm),FRn</td><td>(R0 + Rm) → FRn</td><td>1111nnnnmmmm0110</td><td>—</td><td>—</td></tr>
<tr><td>FMOV.S @Rm+,FRn</td><td>(Rm) → FRn, Rm + 4 → Rm</td><td>1111nnnnmmmm1001</td><td>—</td><td>—</td></tr>
<tr><td>FMOV.S FRm,@Rn</td><td>FRm → (Rn)</td><td>1111nnnnmmmm1010</td><td>—</td><td>—</td></tr>
<tr><td>FMOV.S FRm,@-Rn</td><td>Rn – 4 → Rn, FRm → (Rn)</td><td>1111nnnnmmmm1011</td><td>—</td><td>—</td></tr>
<tr><td>FMOV.S FRm,@(R0,Rn)</td><td>FRm → (R0 + Rn)</td><td>1111nnnnmmmm0111</td><td>—</td><td>—</td></tr>
<tr><td>FMOV DRm,DRn</td><td>DRm → DRn</td><td>1111nnn0mmm01100</td><td>—</td><td>—</td></tr>
<tr><td>FMOV @Rm,DRn</td><td>(Rm) → DRn</td><td>1111nnn0mmmm1000</td><td>—</td><td>—</td></tr>
<tr><td>FMOV @(R0,Rm),DRn</td><td>(R0 + Rm) → DRn</td><td>1111nnn0mmmm0110</td><td>—</td><td>—</td></tr>
<tr><td>FMOV @Rm+,DRn</td><td>(Rm) → DRn, Rm + 8 → Rm</td><td>1111nnn0mmmm1001</td><td>—</td><td>—</td></tr>
<tr><td>FMOV DRm,@Rn</td><td>DRm → (Rn)</td><td>1111nnnnmmm01010</td><td>—</td><td>—</td></tr>
<tr><td>FMOV DRm,@-Rn</td><td>Rn – 8 → Rn, DRm → (Rn)</td><td>1111nnnnmmm01011</td><td>—</td><td>—</td></tr>
<tr><td>FMOV DRm,@(R0,Rn)</td><td>DRm → (R0 + Rn)</td><td>1111nnnnmmm00111</td><td>—</td><td>—</td></tr>
<tr><td>FLDS FRm,FPUL</td><td>FRm → FPUL</td><td>1111mmmm00011101</td><td>—</td><td>—</td></tr>
<tr><td>FSTS FPUL,FRn</td><td>FPUL → FRn</td><td>1111nnnn00001101</td><td>—</td><td>—</td></tr>
<tr><td>FABS FRn</td><td>FRn &amp; 0x7FFF FFFF → FRn</td><td>1111nnnn01011101</td><td>—</td><td>—</td></tr>
<tr><td>FADD FRm,FRn</td><td>FRn + FRm → FRn</td><td>1111nnnnmmmm0000</td><td>—</td><td>—</td></tr>
<tr><td>FCMP/EQ FRm,FRn</td><td>When FRn = FRm, 1 → T; otherwise, 0 → T</td><td>1111nnnnmmmm0100</td><td>—</td><td>Comparison result</td></tr>
</tbody>
</table>

Table 45: Floating-point single-precision instructions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T bit</th>
</tr>
</thead>
<tbody>
<tr><td>FCMP/GT FRm,FRn</td><td>When FRn &gt; FRm, 1 → T; otherwise, 0 → T</td><td>1111nnnnmmmm0101</td><td>—</td><td>Comparison result</td></tr>
<tr><td>FDIV FRm,FRn</td><td>FRn / FRm → FRn</td><td>1111nnnnmmmm0011</td><td>—</td><td>—</td></tr>
<tr><td>FLOAT FPUL,FRn</td><td>(float) FPUL → FRn</td><td>1111nnnn00101101</td><td>—</td><td>—</td></tr>
<tr><td>FMAC FR0,FRm,FRn</td><td>FR0 * FRm + FRn → FRn</td><td>1111nnnnmmmm1110</td><td>—</td><td>—</td></tr>
<tr><td>FMUL FRm,FRn</td><td>FRn * FRm → FRn</td><td>1111nnnnmmmm0010</td><td>—</td><td>—</td></tr>
<tr><td>FNEG FRn</td><td>FRn ^ 0x80000000 → FRn</td><td>1111nnnn01001101</td><td>—</td><td>—</td></tr>
<tr><td>FSQRT FRn</td><td>√FRn → FRn</td><td>1111nnnn01101101</td><td>—</td><td>—</td></tr>
<tr><td>FSUB FRm,FRn</td><td>FRn − FRm → FRn</td><td>1111nnnnmmmm0001</td><td>—</td><td>—</td></tr>
<tr><td>FTRC FRm,FPUL</td><td>(long) FRm → FPUL</td><td>1111mmmm00111101</td><td>—</td><td>—</td></tr>
</tbody>
</table>
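FABS, FNEG and FLDI1 are defined purely in terms of the 32-bit pattern held in FRn. The sketch below illustrates this on single-precision bit patterns, using the standard `struct` module only to convert between a pattern and a Python float for display. All function names are hypothetical.

```python
import struct


def bits_to_float(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def float_to_bits(value: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of value."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def fabs_bits(bits: int) -> int:
    """FABS: clear the sign bit (FRn & 0x7FFFFFFF)."""
    return bits & 0x7FFFFFFF


def fneg_bits(bits: int) -> int:
    """FNEG: invert the sign bit (FRn ^ 0x80000000)."""
    return bits ^ 0x80000000


FLDI1_PATTERN = 0x3F800000  # pattern loaded by FLDI1; represents 1.0
```

Because only the sign bit is touched, these two operations never round and behave identically for all patterns, including zeros, infinities and NaNs.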

### Table 46: Floating-point double-precision instructions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T bit</th>
</tr>
</thead>
<tbody>
<tr><td>FABS DRn</td><td>DRn &amp; 0x7FFF FFFF FFFF FFFF → DRn</td><td>1111nnn001011101</td><td>—</td><td>—</td></tr>
<tr><td>FADD DRm,DRn</td><td>DRn + DRm → DRn</td><td>1111nnn0mmm00000</td><td>—</td><td>—</td></tr>
<tr><td>FCMP/EQ DRm,DRn</td><td>When DRn = DRm, 1 → T; otherwise, 0 → T</td><td>1111nnn0mmm00100</td><td>—</td><td>Comparison result</td></tr>
<tr><td>FCMP/GT DRm,DRn</td><td>When DRn &gt; DRm, 1 → T; otherwise, 0 → T</td><td>1111nnn0mmm00101</td><td>—</td><td>Comparison result</td></tr>
<tr><td>FDIV DRm,DRn</td><td>DRn / DRm → DRn</td><td>1111nnn0mmm00011</td><td>—</td><td>—</td></tr>
<tr><td>FCNVDS DRm,FPUL</td><td>double_to_float [DRm] → FPUL</td><td>1111mmm010111101</td><td>—</td><td>—</td></tr>
<tr><td>FCNVSD FPUL,DRn</td><td>float_to_double [FPUL] → DRn</td><td>1111nnn010101101</td><td>—</td><td>—</td></tr>
<tr><td>FLOAT FPUL,DRn</td><td>(float) FPUL → DRn</td><td>1111nnn000101101</td><td>—</td><td>—</td></tr>
<tr><td>FMUL DRm,DRn</td><td>DRn * DRm → DRn</td><td>1111nnn0mmm00010</td><td>—</td><td>—</td></tr>
<tr><td>FNEG DRn</td><td>DRn ^ 0x8000 0000 0000 0000 → DRn</td><td>1111nnn001001101</td><td>—</td><td>—</td></tr>
<tr><td>FSQRT DRn</td><td>√DRn → DRn</td><td>1111nnn001101101</td><td>—</td><td>—</td></tr>
<tr><td>FSUB DRm,DRn</td><td>DRn − DRm → DRn</td><td>1111nnn0mmm00001</td><td>—</td><td>—</td></tr>
<tr><td>FTRC DRm,FPUL</td><td>(long) DRm → FPUL</td><td>1111mmm000111101</td><td>—</td><td>—</td></tr>
</tbody>
</table>

Table 46: Floating-point double-precision instructions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction code</th>
<th>Privileged</th>
<th>T bit</th>
</tr>
</thead>
<tbody>
<tr><td>LDS Rm,FPSCR</td><td>Rm → FPSCR</td><td>0100mmmm01101010</td><td>—</td><td>—</td></tr>
<tr><td>LDS Rm,FPUL</td><td>Rm → FPUL</td><td>0100mmmm01011010</td><td>—</td><td>—</td></tr>
<tr><td>LDS.L @Rm+,FPSCR</td><td>(Rm) → FPSCR, Rm + 4 → Rm</td><td>0100mmmm01100110</td><td>—</td><td>—</td></tr>
<tr><td>LDS.L @Rm+,FPUL</td><td>(Rm) → FPUL, Rm + 4 → Rm</td><td>0100mmmm01010110</td><td>—</td><td>—</td></tr>
<tr><td>STS FPSCR,Rn</td><td>FPSCR → Rn</td><td>0000nnnn01101010</td><td>—</td><td>—</td></tr>
<tr><td>STS FPUL,Rn</td><td>FPUL → Rn</td><td>0000nnnn01011010</td><td>—</td><td>—</td></tr>
<tr><td>STS.L FPSCR,@-Rn</td><td>Rn − 4 → Rn, FPSCR → (Rn)</td><td>0100nnnn01100010</td><td>—</td><td>—</td></tr>
<tr><td>STS.L FPUL,@-Rn</td><td>Rn − 4 → Rn, FPUL → (Rn)</td><td>0100nnnn01010010</td><td>—</td><td>—</td></tr>
</tbody>
</table>

Table 47: Floating-point control instructions

### Table 48: Floating-point graphics acceleration instructions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operation</th>
<th>Instruction Code</th>
<th>Privileged</th>
<th>T Bit</th>
</tr>
</thead>
<tbody>
<tr><td>FMOV DRm,XDn</td><td>DRm → XDn</td><td>1111nnn1mmm01100</td><td>—</td><td>—</td></tr>
<tr><td>FMOV XDm,DRn</td><td>XDm → DRn</td><td>1111nnn0mmm11100</td><td>—</td><td>—</td></tr>
<tr><td>FMOV XDm,XDn</td><td>XDm → XDn</td><td>1111nnn1mmm11100</td><td>—</td><td>—</td></tr>
<tr><td>FMOV @Rm,XDn</td><td>(Rm) → XDn</td><td>1111nnn1mmmm1000</td><td>—</td><td>—</td></tr>
<tr><td>FMOV @Rm+,XDn</td><td>(Rm) → XDn, Rm + 8 → Rm</td><td>1111nnn1mmmm1001</td><td>—</td><td>—</td></tr>
<tr><td>FMOV @(R0,Rm),XDn</td><td>(R0 + Rm) → XDn</td><td>1111nnn1mmmm0110</td><td>—</td><td>—</td></tr>
<tr><td>FMOV XDm,@Rn</td><td>XDm → (Rn)</td><td>1111nnnnmmm11010</td><td>—</td><td>—</td></tr>
<tr><td>FMOV XDm,@-Rn</td><td>Rn − 8 → Rn, XDm → (Rn)</td><td>1111nnnnmmm11011</td><td>—</td><td>—</td></tr>
<tr><td>FMOV XDm,@(R0,Rn)</td><td>XDm → (R0 + Rn)</td><td>1111nnnnmmm10111</td><td>—</td><td>—</td></tr>
<tr><td>FIPR FVm,FVn</td><td>inner_product [FVm, FVn] → FR[n+3]</td><td>1111nnmm11101101</td><td>—</td><td>—</td></tr>
<tr><td>FTRV XMTRX,FVn</td><td>transform_vector [XMTRX, FVn] → FVn</td><td>1111nn0111111101</td><td>—</td><td>—</td></tr>
<tr><td>FRCHG</td><td>~FPSCR.FR → FPSCR.FR</td><td>1111101111111101</td><td>—</td><td>—</td></tr>
<tr><td>FSCHG</td><td>~FPSCR.SZ → FPSCR.SZ</td><td>1111001111111101</td><td>—</td><td>—</td></tr>
</tbody>
</table>

Instruction specification

8.1 Overview

The behavior of instructions is specified using a simple notational language to describe the effects of each instruction on the architectural state of the machine.

The language consists of the following features:

• A simple variable and type system.
• Expressions.
• Statements.
• Notation for the architectural state of the machine.
• An abstract sequential model of instruction execution.

These features are described in the following sections. Additional mechanisms are defined to model memory, synchronization instructions, cache instructions and floating-point. The final section gives example instruction specifications.

Each instruction is described using informal text as well as the formal notational language. Sometimes one of these descriptions alone cannot convey the full semantics. In such cases the two descriptions must be taken together to constitute the full specification.

8.2 Variables and types

Variables are used to hold state. The type of a variable determines the set of values that the variable can take and the available operators to manipulate that variable. The supported scalar types are integers, booleans and bit-fields. One-dimensional arrays of the scalar types are also supported.

The architectural state of the machine is represented by a set of variables. Each of these variables has an associated type, which is either a bit-field or an array of bit-fields. Bit-fields are used to give a bit-accurate representation.

Additional variables are used to hold temporary values. The type of temporary variables is implicit, and determined by their context rather than explicit declaration. The type of a temporary variable is an integer, a boolean or an array of these.

8.2.1 Integer

An integer variable can take the value of any mathematical integer. No limits are imposed on the range of integers supported. Integers obey their standard mathematical properties. Integer operations do not overflow. The integer operators are defined so that singularities do not occur. For example, no definition is given to the result of divide by zero; the operator is simply not available when the divisor is zero.

The representation of literal integer values is achieved using the following notations:

- Decimal numbers are represented by the regular expression: {0-9}+
- Hexadecimal numbers are represented by the regular expression: 0x{0-9a-fA-F}+
- Binary numbers are represented by the regular expression: 0b{0-1}+

These notations are standard and map onto integer values in the obvious way. Underscore characters (‘_’) can be inserted into any of the above literal representations. These do not change the represented value but can be used as spacers to aid readability.
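As an illustration of these literal notations, the following Python sketch parses decimal, hexadecimal and binary literals and ignores underscore spacers. The function name `parse_literal` is hypothetical. The sketch assumes that underscores may appear anywhere after the `0x` or `0b` prefix, which the text above does not state precisely.

```python
import re

# Prefix (if any) followed by digits of the matching radix; underscores are
# accepted as spacers and carry no value.
_LITERAL = re.compile(r"(0x[0-9a-fA-F_]+|0b[01_]+|[0-9_]+)\Z")


def parse_literal(text: str) -> int:
    """Return the non-negative integer denoted by a literal."""
    if not _LITERAL.match(text):
        raise ValueError(f"not a literal: {text!r}")
    digits = text.replace("_", "")  # spacers do not change the value
    if digits.startswith("0x"):
        return int(digits[2:], 16)
    if digits.startswith("0b"):
        return int(digits[2:], 2)
    return int(digits, 10)
```

Negative values are not literals in this notation; they would be derived by negating a parsed value, for example `-parse_literal("4")`.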

The notations allow only zero and positive numbers to be represented directly. A monadic integer negation operator can subsequently be used to derive a negative value.
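The literal notations above can be illustrated with a small parser. This is a minimal sketch in Python, which is not part of the specification language; the name `parse_literal` is hypothetical, and the sketch assumes that underscores may appear anywhere in a literal and does not police their placement.

```python
import re

# Hypothetical helper: maps a literal written in the notation above to an int.
_PATTERNS = [
    (re.compile(r"0x([0-9a-fA-F]+)"), 16),  # hexadecimal
    (re.compile(r"0b([01]+)"), 2),          # binary
    (re.compile(r"([0-9]+)"), 10),          # decimal
]


def parse_literal(text: str) -> int:
    digits = text.replace("_", "")  # underscores are spacers only
    for pattern, base in _PATTERNS:
        match = pattern.fullmatch(digits)
        if match:
            return int(match.group(1), base)
    raise ValueError(f"not a valid literal: {text!r}")
```

A negative value is then obtained by negating the parsed result, for example `-parse_literal("5")`.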
8.2.2 Boolean

A boolean variable can take two values:

- Boolean false. The literal representation of boolean false is ‘FALSE’.
- Boolean true. The literal representation of boolean true is ‘TRUE’.

8.2.3 Bit-fields

Bit-fields are provided to define ‘bit-accurate’ storage.

Bit-fields containing arbitrary numbers of bits are supported. A bit-field of b bits contains bits numbered from 0 (the least significant bit) up to b-1 (the most significant bit). Each bit can take the value 0 or the value 1. Bit-fields are mapped to, and from, integers in the usual way. If bit i of a b-bit bit-field, where i is in [0, b), is set then it contributes $2^i$ to the integral value of the bit-field. The integral value of the bit-field as a whole is an integer in the range $[0, 2^b)$.

When a bit-field is read, it gives its integral value. When a bit-field is written with an integral value, the integer must be in the range of values supported by the bit-field. Typically, the only operations applied directly to bit-fields are conversions to other types.

8.2.4 Arrays

One-dimensional arrays of the above types are also available. Indexing into an n-element array A is achieved using the notation A[i] where A is an array of some type and i is an integer in the range [0, n). This selects the i$^{th}$ element of the array A. If i is zero this selects the first entry, and if i is n-1 then this selects the last entry. The type of the selected element is the base type of the array.

Multi-dimensional arrays are not provided.

8.2.5 Floating-point values

Floating-point types and operators are not provided. Instead, the value in a floating-point register is represented as a bit-field. The organization of the bit-field is consistent with an IEEE 754 format.

When a floating-point register is read, an integral representation of that bit-pattern is returned. When an integral value is written into a floating-point register, the value written is the bit-pattern of that integer. Thus, reading and writing is achieved as bit-pattern transfers, and not by interpreting the bit-patterns as real numbers.

The language does not provide direct means to interpret these bit-patterns as real numbers. Instead, functions are provided which give the required functionality. For example, arithmetic on real numbers is represented using a function notation.
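The distinction between a bit-pattern and the real number it encodes can be shown outside the specification language. The following Python sketch uses the standard `struct` module to move between the two views; the function names are hypothetical and are not the conversion functions defined by this manual.

```python
import struct


def float_to_bits32(x: float) -> int:
    """Return the IEEE 754 single-precision bit-pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits32_to_float(bits: int) -> float:
    """Interpret the low 32 bits of an integer as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def float_to_bits64(x: float) -> int:
    """Return the IEEE 754 double-precision bit-pattern of x as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]
```

Reading a register corresponds to obtaining the integer on the left-hand side of these mappings; no arithmetic on real numbers is implied by the transfer itself.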

8.3 Expressions

Expressions are constructed from monadic operators, dyadic operators and functions applied to variable and literal values.

There are no defined precedence and associativity rules for the operators. Parentheses are used to specify the expression unambiguously.

Sub-expressions can be evaluated in any order. If a particular evaluation order is required, then sub-expressions must be split into separate statements.

8.3.1 Integer arithmetic operators

Since the notation uses straightforward mathematical integers, the set of standard mathematical operators is available and already defined.

The standard dyadic operators are listed in Table 49.

<table>
<thead>
<tr>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>i + j</td>
<td>Integer addition</td>
</tr>
<tr>
<td>i - j</td>
<td>Integer subtraction</td>
</tr>
<tr>
<td>i × j</td>
<td>Integer multiplication</td>
</tr>
</tbody>
</table>

Table 49: Standard dyadic operators

The division operator truncates towards zero. The remainder operator is consistent with this. The sign of the result of the remainder operator follows the sign of the dividend. Division or remainder with a divisor of zero results in a singularity, and its behavior is not defined.

For a numerator \( n \) and a denominator \( d \), the following properties hold where \( d \neq 0 \):

\[
\begin{aligned}
n &= d \times (n/d) + (n \mod d) \\
(-n)/d &= -(n/d) = n/(-d) \\
(-n) \mod d &= -(n \mod d) \\
n \mod (-d) &= n \mod d \\
0 &\leq (n \mod d) < d \text{ where } n \geq 0 \text{ and } d > 0
\end{aligned}
\]
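These properties can be checked mechanically. The sketch below is written in Python, whose built-in `//` and `%` round towards minus infinity, so truncating versions are defined explicitly; the names `div_trunc` and `rem_trunc` are hypothetical.

```python
def div_trunc(n: int, d: int) -> int:
    """Integer division that truncates towards zero; d must be non-zero."""
    if d == 0:
        raise ZeroDivisionError("division is not defined for a zero divisor")
    q = abs(n) // abs(d)
    # The quotient is non-negative when n and d have the same sign.
    return q if (n >= 0) == (d > 0) else -q


def rem_trunc(n: int, d: int) -> int:
    """Remainder consistent with div_trunc; its sign follows the dividend."""
    return n - d * div_trunc(n, d)
```

For example, `div_trunc(-7, 2)` gives -3 with remainder -1, whereas Python's `-7 // 2` gives -4.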
8.3.2 Integer shift operators

The available integer shift operators are listed in Table 51.

<table>
<thead>
<tr>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>n &lt;&lt; b</td>
<td>Integer left shift</td>
</tr>
<tr>
<td>n &gt;&gt; b</td>
<td>Integer right shift</td>
</tr>
</tbody>
</table>

Table 51: Shift operators

The shift operators are defined on integers as follows where \( b \geq 0 \):

\[
\begin{aligned}
    n \ll b &= n \times 2^b \\
    n \gg b &= \begin{cases} 
        n/2^b & \text{where } n \geq 0 \\
        (n - 2^b + 1)/2^b & \text{where } n < 0
    \end{cases}
\end{aligned}
\]

Note that right shifting rounds the result towards minus infinity. This contrasts with division, which rounds towards zero, and is the reason why the right shift definition is separate for positive and negative \( n \).

8.3.3 Integer bitwise operators

The available integer bitwise operators are listed in Table 52.

<table>
<thead>
<tr>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>i ∧ j</td>
<td>Integer bitwise AND</td>
</tr>
<tr>
<td>i ∨ j</td>
<td>Integer bitwise OR</td>
</tr>
<tr>
<td>i ⊕ j</td>
<td>Integer bitwise XOR</td>
</tr>
<tr>
<td>~ i</td>
<td>Integer bitwise NOT</td>
</tr>
<tr>
<td>n&lt;b FOR m&gt;</td>
<td>Integer bit-field extraction: extract m bits starting at bit b from integer n</td>
</tr>
<tr>
<td>n&lt;b&gt;</td>
<td>Integer bit-field extraction: extract 1 bit starting at bit b from integer n</td>
</tr>
</tbody>
</table>

Table 52: Bitwise operators

In order to define bitwise operations all integers are considered as having an infinitely long two's complement representation. Bit 0 is the least significant bit of this representation, bit 1 is the next higher bit, and so on. The value of bit b, where \( b \geq 0 \), in integer n is given by:

\[
\text{BIT}(n, b) = \left( \frac{n}{2^b} \right) \mod 2 \quad \text{where} \quad n \geq 0
\]

\[
\text{BIT}(-n, b) = 1 - \text{BIT}(n - 1, b) \quad \text{where} \quad n > 0
\]

Care must be taken whenever the infinitely long two's complement representation of a negative number is constructed. This representation will contain an infinite number of higher bits with the value 1 representing the sign. Typically, a subsequent conversion operation is used to discard these upper bits and return the result back to a finite value.
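The BIT definition can be transcribed almost literally. The following Python sketch (the function name `bit` is hypothetical) follows the two cases above and can be compared against Python's own arbitrary-precision two's complement behavior.

```python
def bit(n: int, b: int) -> int:
    """Value of bit b in the infinitely long two's complement form of n."""
    if b < 0:
        raise ValueError("b must be non-negative")
    if n >= 0:
        return (n // 2 ** b) % 2
    # For a negative value -k (k > 0): BIT(-k, b) = 1 - BIT(k - 1, b).
    return 1 - bit(-n - 1, b)
```

For any negative input, every sufficiently high bit evaluates to 1, which is the infinite run of sign bits mentioned above.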

Bitwise AND (\( \land \)), OR (\( \lor \)), XOR (\( \oplus \)) and NOT (\( \neg \)) are defined on integers as follows, where \( b \) takes all values such that \( b \geq 0 \):

\[
\text{BIT}(i \land j, b) = \text{BIT}(i, b) \times \text{BIT}(j, b)
\]

\[
\text{BIT}(i \lor j, b) = \text{BIT}(i \land j, b) + \text{BIT}(i \oplus j, b)
\]

\[
\text{BIT}(i \oplus j, b) = (\text{BIT}(i, b) + \text{BIT}(j, b)) \mod 2
\]

\[
\text{BIT}(\neg i, b) = 1 - \text{BIT}(i, b)
\]

**Note:** Bitwise NOT of any finite positive \( i \) will result in a value containing an infinite number of higher bits with the value 1 representing the sign.

Bitwise extraction is defined on integers as follows, where \( b \geq 0 \) and \( m > 0 \):

\[
\text{n}_{\langle b \text{ FOR } m \rangle} = (n \gg b) \land (2^m - 1)
\]

\[
\text{n}_{\langle b \rangle} = \text{n}_{\langle b \text{ FOR } 1 \rangle}
\]

The result of \( \text{n}_{\langle b \text{ FOR } m \rangle} \) is an integer in the range \([0, 2^m)\).
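Bit-field extraction is a shift followed by a mask. A minimal Python sketch, with the hypothetical name `extract`, shows how extraction returns a finite value even when the source has an infinite run of sign bits:

```python
def extract(n: int, b: int, m: int = 1) -> int:
    """Extract m bits starting at bit b from integer n (n<b FOR m>)."""
    if b < 0 or m <= 0:
        raise ValueError("requires b >= 0 and m > 0")
    return (n >> b) & ((1 << m) - 1)
```

For instance, `extract(~5, 0, 4)` discards the upper bits produced by the bitwise NOT and leaves the 4-bit value 0b1010.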
8.3.4 Relational operators

Relational operators are defined to compare integral values and give a boolean result.

<table>
<thead>
<tr>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>i = j</td>
<td>Result is true if i is equal to j, otherwise false</td>
</tr>
<tr>
<td>i ≠ j</td>
<td>Result is true if i is not equal to j, otherwise false</td>
</tr>
<tr>
<td>i &lt; j</td>
<td>Result is true if i is less than j, otherwise false</td>
</tr>
<tr>
<td>i &gt; j</td>
<td>Result is true if i is greater than j, otherwise false</td>
</tr>
<tr>
<td>i ≤ j</td>
<td>Result is true if i is less than or equal to j, otherwise false</td>
</tr>
<tr>
<td>i ≥ j</td>
<td>Result is true if i is greater than or equal to j, otherwise false</td>
</tr>
</tbody>
</table>

Table 53: Relational operators

8.3.5 Boolean operators

Boolean operators are defined to perform logical AND, OR, XOR and NOT. These operators have boolean sources and result. Additionally, the conversion operator INT is defined to convert a boolean source into an integer result.

<table>
<thead>
<tr>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>i AND j</td>
<td>Result is true if i and j are both true, otherwise false</td>
</tr>
<tr>
<td>i OR j</td>
<td>Result is true if either or both of i and j are true, otherwise false</td>
</tr>
<tr>
<td>i XOR j</td>
<td>Result is true if exactly one of i and j is true, otherwise false</td>
</tr>
<tr>
<td>NOT i</td>
<td>Result is true if i is false, otherwise false</td>
</tr>
<tr>
<td>INT i</td>
<td>Result is 0 if i is false, otherwise 1</td>
</tr>
</tbody>
</table>

Table 54: Boolean operators

8.3.6 Single-value functions

In some cases it is inconvenient or inappropriate to describe an expression directly in the specification language. In these cases a function call is used to reference the undescribed behavior.

A single-value function evaluates to a single value (the result), which can be used in an expression. The type of the result value can be determined by the expression context from which the function is called. There are also multiple-value functions which evaluate to multiple values. These are only available in an assignment context, and are described in Section 8.4.2: Assignment on page 190.

Functions can contain side-effects.

Scalar conversions

Two monadic functions are defined to support conversions between integral representations of finite-precision signed and unsigned number spaces. These functions are often used to convert between bit-fields and integer values.

<table>
<thead>
<tr>
<th>Function</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ZeroExtend_n(i)</td>
<td>Convert integer i to an n-bit 2's complement unsigned range</td>
</tr>
<tr>
<td>SignExtend_n(i)</td>
<td>Convert integer i to an n-bit 2's complement signed range</td>
</tr>
</tbody>
</table>

Table 55: Integer conversion operators

These two functions are defined as follows, where \(n > 0\):

\[
\text{ZeroExtend}_n(i) = i_{\langle 0 \text{ FOR } n \rangle}
\]

\[
\text{SignExtend}_n(i) = \begin{cases} 
  i_{\langle 0 \text{ FOR } n \rangle} & \text{where } i_{\langle n-1 \rangle} = 0 \\
  i_{\langle 0 \text{ FOR } n \rangle} - 2^n & \text{where } i_{\langle n-1 \rangle} = 1
\end{cases}
\]
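As an illustration, the two conversions can be sketched in Python using the usual two's complement interpretation: the low n bits are kept, and $2^n$ is subtracted when bit n-1 is set. The function names and the argument order `(n, i)` are assumptions made for this sketch.

```python
def zero_extend(n: int, i: int) -> int:
    """Map integer i into the unsigned n-bit range [0, 2**n)."""
    if n <= 0:
        raise ValueError("n must be greater than 0")
    return i & ((1 << n) - 1)


def sign_extend(n: int, i: int) -> int:
    """Map integer i into the signed n-bit range [-2**(n-1), 2**(n-1))."""
    low = zero_extend(n, i)
    if (low >> (n - 1)) & 1:
        return low - (1 << n)  # sign bit set: value is negative
    return low
```

A typical use is `sign_extend(8, value)` applied to a byte that has been read from a bit-field and is to be treated as signed.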
For syntactic convenience, conversion functions are also defined for converting an integer to a single bit and to a 32-bit register. Table 56 shows the additional functions provided.

<table>
<thead>
<tr>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Bit(i)</td>
<td>Convert lowest bit of integer i to a 1-bit value. This is a convenient notation for i&lt;0&gt;</td>
</tr>
<tr>
<td>Register(i)</td>
<td>Convert lowest 32 bits of integer i to a 32-bit value. This is a convenient notation for i&lt;0 FOR 32&gt;</td>
</tr>
</tbody>
</table>

Table 56: Conversion operators from integers to bit-fields

**Floating-point conversions**

The specification language manipulates floating-point values as integers containing the associated IEEE 754 bit-pattern. The layout of these bit-patterns is described in Chapter 6: Floating-point unit on page 145. The language does not support a floating-point type.

Conversion functions are defined to support floating-point. Floating-point values are held as either scalar values in a single register, or vector values in multiple registers. The available register formats are:

- One 32-bit value in a single-precision register.
- One 64-bit value in a double-precision register.
- Two 32-bit values in a pair of single-precision registers.
- Four 32-bit values in a four-entry vector of single-precision registers.
- Sixteen 32-bit values in a four-by-four matrix of single-precision registers.

Conversions are available to convert between register bit-fields in these formats and integers or arrays of integers holding the appropriate IEEE 754 bit-patterns.

The following conversions are provided to convert from floating-point registers:

<table>
<thead>
<tr>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>FloatValue_32(r)</td>
<td>Convert a single-precision floating-point register into a 32-bit integer bit-pattern.</td>
</tr>
<tr>
<td>FloatValue_64(r)</td>
<td>Convert a double-precision floating-point register into a 64-bit integer bit-pattern.</td>
</tr>
<tr>
<td>FloatValuePair_32(r)</td>
<td>Convert a pair of single-precision floating-point registers into an array of 2 x 32-bit integer bit-patterns.</td>
</tr>
<tr>
<td>FloatValueVector_32(r)</td>
<td>Convert a 4-entry vector of single-precision floating-point registers into an array of 4 x 32-bit integer bit-patterns.</td>
</tr>
<tr>
<td>FloatValueMatrix_32(r)</td>
<td>Convert a 16-entry matrix of single-precision floating-point registers into an array of 16 x 32-bit integer bit-patterns.</td>
</tr>
</tbody>
</table>

Table 57: Conversion from floating-point register formats

The following conversions are provided to convert to floating-point registers:

<table>
<thead>
<tr>
<th>Operation</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>FloatRegister_32(i)</td>
<td>Convert a 32-bit integer bit-pattern into a single-precision floating-point register.</td>
</tr>
<tr>
<td>FloatRegister_64(i)</td>
<td>Convert a 64-bit integer bit-pattern into a double-precision floating-point register.</td>
</tr>
<tr>
<td>FloatRegisterPair_32(a)</td>
<td>Convert an array of 2 x 32-bit integer bit-patterns into a pair of single-precision floating-point registers.</td>
</tr>
<tr>
<td>FloatRegisterVector_32(a)</td>
<td>Convert an array of 4 x 32-bit integer bit-patterns into a 4-entry vector of single-precision floating-point registers.</td>
</tr>
<tr>
<td>FloatRegisterMatrix_32(a)</td>
<td>Convert an array of 16 x 32-bit integer bit-patterns into a 16-entry matrix of single-precision floating-point registers.</td>
</tr>
</tbody>
</table>

Table 58: Conversion to floating-point register formats

8.4 Statements

An instruction specification consists of a sequence of statements. These statements are processed sequentially in order to specify the effect of the instruction on the architectural state of the machine. The available statements are discussed in this section.

Each statement has a semi-colon terminator. A sequence of statements can be aggregated into a statement block using '{' to introduce the block and '}' to terminate the block. A statement block can be used anywhere that a statement can.

8.4.1 Undefined behavior

The statement:

UNDEFINED();

indicates that the resultant behavior is architecturally undefined.

A particular implementation can choose to specify an implementation-defined behavior in such cases. It is very likely that any implementation-defined behavior will vary from implementation to implementation. Exploitation of implementation-defined behavior should be avoided to allow software to be portable between implementations.

In cases where architecturally undefined behavior can occur in user mode, the implementation will ensure that implemented behavior does not break the protection model. Thus, the implemented behavior will be some execution flow that is permitted for that user mode thread.

8.4.2 Assignment

The ‘←’ operator is used to denote assignment of an expression to a variable. An example assignment statement is:

variable ← expression;

The expression can be constructed from variables, literals, operators and functions as described in Section 8.3: Expressions on page 182. The expression is fully evaluated before the assignment takes place. The variable can be an integer, a boolean, a bit-field or an array of one of these types.

Assignment to architectural state

This is where the variable is part of the architectural state (as described in Table 59: Scalar architectural state on page 194). The type of the expression and the type of the variable must match.

Assignment to a temporary

Alternatively, if the variable is not part of the architectural state, then it is a temporary variable. The type of the variable is determined by the type of the expression. A temporary variable must be assigned to before it is used in the instruction specification.

Assignment of an undefined value

An assignment of the following form results in a variable being initialized with an architecturally undefined value:

```plaintext
variable ← UNDEFINED;
```

After assignment the variable will hold a value which is valid for its type. However, the value is architecturally undefined. The actual value can be unpredictable; that is to say the value indicated by UNDEFINED can vary with each use of UNDEFINED. Architecturally-undefined values can occur in both user and privileged modes.

A particular implementation can choose to specify an implementation-defined value in such cases. It is very likely that any implementation-defined values will vary from implementation to implementation. Exploitation of implementation-defined values should be avoided to allow software to be portable between implementations.

Assignment of multiple values

Multi-value functions are used to return multiple values, and are only available when used in a multiple assignment context. The syntax consists of a list of comma-separated variables, an assignment symbol followed by a function call. The function is evaluated and returns multiple results into the variables listed. The number of variables and the number of results of the function must match. The assigned variables must all be distinct (i.e. no aliases).

For example, a two-valued assignment from a function call with 3 parameters can be represented as:

```plaintext
variable1, variable2 ← call(param1, param2, param3);
```
8.4.3 Conditional

Conditional behavior is specified using ‘IF’, ‘ELSE IF’ and ‘ELSE’.

Conditions are expressions that result in a boolean value. If the condition after an ‘IF’ is true, then its block of statements is executed and the whole conditional then completes. If the condition is false, then any ‘ELSE IF’ clauses are processed, in turn, in the same fashion. If no conditions are met and there is an ‘ELSE’ clause then its block of statements is executed. Finally, if no conditions are met and there is no ‘ELSE’ clause, then the statement has no effect apart from the evaluation of the condition expressions.

The ‘ELSE IF’ and ‘ELSE’ clauses are optional. In ambiguous cases, the ‘ELSE’ matches with the nearest ‘IF’.

For example:

```
IF (condition1)
    block1
ELSE IF (condition2)
    block2
ELSE
    block3
```

8.4.4 Repetition

Repetitive behavior is specified using the following construct:

```
REPEAT i FROM m FOR n STEP s block
```

The block of statements is iterated n times, with the integer i taking the values:

\[ m, m + s, m + 2s, m + 3s, \ldots, m + (n - 1) \times s. \]

The behavior is equivalent to textually writing the block n times with i being substituted with the appropriate value in each copy of the block.

The value of n must be greater than or equal to 0, and the value of s must be non-zero. The values of the expressions for m, n and s must be constant across the iteration. The integer i must not be assigned to within the iterated block. The ‘STEP s’ can be omitted, in which case the step-size takes the default value of 1.
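The sequence of values taken by i can be written out explicitly. The following Python sketch (the name `repeat_values` is hypothetical) lists them for given m, n and s, applying the constraints stated above.

```python
def repeat_values(m: int, n: int, s: int = 1) -> list:
    """Values taken by i in 'REPEAT i FROM m FOR n STEP s'."""
    if n < 0:
        raise ValueError("n must be greater than or equal to 0")
    if s == 0:
        raise ValueError("s must be non-zero")
    return [m + k * s for k in range(n)]
```

For example, `repeat_values(10, 3, -2)` yields 10, 8 and 6, and `repeat_values(5, 0)` yields no values, so the block is not executed at all.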
8.4.5 Exceptions

Exception handling is triggered by a `THROW` statement. When an exception is thrown, no further statements are executed from the instruction specification and control passes to an exception handler. The actions associated with the launch of the handler are not shown in the instruction specification, but are described separately in Chapter 5: Exceptions on page 105.

There are two forms of throw statement:

```plaintext
THROW type;
```

and:

```plaintext
THROW type, value;
```

where `type` indicates the type of exception which is launched, and `value` is an optional argument to the exception handling sequence.

The full set of exceptions is described in Chapter 5: Exceptions on page 105.

8.4.6 Procedures

Procedure statements contain a procedure name followed by a list of comma-separated arguments contained within parentheses followed by a semi-colon. The execution of procedures typically causes side-effects to the architectural state of the machine.

Procedures are generally used where it is difficult or inappropriate to specify the effect of an instruction using the abstract execution model. A fuller description of the effect of the instruction will be given in the surrounding text.

An example procedure with two parameters is:

```plaintext
proc(param1, param2);
```
8.5 Architectural state

The architectural state is described in Chapter 2: Programming model on page 21. The notations used in the model to refer to this state are summarized in Table 59 and Table 60. Each item of scalar architectural state is a bit-field of a particular width. Each item of array architectural state is an array of bit-fields of a particular width.

<table>
<thead>
<tr>
<th>Architectural state</th>
<th>Type is a bit-field containing:</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>MD (SR.MD)</td>
<td>1 bit</td>
<td>User (0) or privileged (1) mode</td>
</tr>
<tr>
<td>PC</td>
<td>32 bits</td>
<td>32-bit program counter</td>
</tr>
<tr>
<td>MMUCR</td>
<td>32 bits</td>
<td>For details of the MMU control register see Chapter 3: Memory management unit (MMU) on page 41.</td>
</tr>
<tr>
<td>FPSCR</td>
<td>32 bits</td>
<td>32-bit floating-point status and control register</td>
</tr>
<tr>
<td>GBR</td>
<td>32 bits</td>
<td>Global base register</td>
</tr>
<tr>
<td>MACL</td>
<td>32 bits</td>
<td>Multiply-accumulate low</td>
</tr>
<tr>
<td>MACH</td>
<td>32 bits</td>
<td>Multiply-accumulate high</td>
</tr>
<tr>
<td>PR</td>
<td>32 bits</td>
<td>Procedure link register</td>
</tr>
<tr>
<td>T</td>
<td>1 bit</td>
<td>Condition code flag</td>
</tr>
<tr>
<td>S</td>
<td>1 bit</td>
<td>Multiply-accumulate saturation flag</td>
</tr>
<tr>
<td>M</td>
<td>1 bit</td>
<td>Divide-step M flag</td>
</tr>
<tr>
<td>Q</td>
<td>1 bit</td>
<td>Divide-step Q flag</td>
</tr>
<tr>
<td>FPUL</td>
<td>32 bits</td>
<td>FPU communication register</td>
</tr>
<tr>
<td>Ri where i is in [0, 15]</td>
<td>32 bits</td>
<td>16 x 32-bit general purpose registers</td>
</tr>
<tr>
<td>FRi where i is in [0, 31]</td>
<td>32 bits</td>
<td>32 x 32-bit floating-point registers</td>
</tr>
<tr>
<td>DRi where i is in [0, 15]</td>
<td>64 bits</td>
<td>16 x 64-bit floating-point registers</td>
</tr>
</tbody>
</table>

Table 59: Scalar architectural state

Note: FR, FP, FV, MTRX and DR provide different views of the same architectural state.

There is no implicit meaning to the value held by the collection of bits in a register. The interpretation of the register is supplied by each instruction that reads or writes the register value.

PC denotes the program counter of the currently executing instruction. PC’ denotes the program counter of the next instruction that is to be executed.

8.6 Memory model

The instruction specification uses a simple model of memory. It assumes, for example, that any caches have no architectural visibility. For typical well-disciplined instruction sequences, cache effects are not architecturally visible. The behavior in other cases is described more fully by the text of the architecture manual.

MEM is an array of bytes indexed by an effective address. Elements in arrays are selected using array indexing notation: MEM[i] selects the i$^{th}$ entry in the MEM array. The total range of array indices into MEM is $[0, 2^{32})$, though not all of this memory is available on all implementations.

Array slicing can be used to view an array as consisting of elements of a larger size. The notation MEM[s FOR n], where n > 0, denotes a memory slice containing the elements MEM[s], MEM[s+1] through to MEM[s+n-1]. The type of this slice is a bit-field exactly large enough to contain a concatenation of the n selected elements. In this case it contains 8n bits since the base type of MEM is byte.

The order of the concatenation depends on the endianness of the processor:

- If the processor is operating in a little-endian mode, the concatenation order obeys the following condition as i (the byte number) varies in the range [0, n):
  \[(\text{MEM}[s \text{ FOR } n])_{\langle 8i \text{ FOR } 8 \rangle} = \text{MEM}[s + i]\]

  This equivalence states that byte number i, using little-endian byte numbering (i.e. byte 0 is bits 0 to 7), in the bit-field MEM[s FOR n] is the i$^{th}$ byte in memory counting upwards from MEM[s].

- If the processor is operating in a big-endian mode, the concatenation order obeys the following condition as i (the byte number) varies in the range [0, n):
  \[(\text{MEM}[s \text{ FOR } n])_{\langle 8(n - 1 - i) \text{ FOR } 8 \rangle} = \text{MEM}[s + i]\]

  This equivalence states that byte number i, using big-endian byte numbering (i.e. byte 0 is bits 8n-8 to 8n-1), in the bit-field MEM[s FOR n] is the i$^{th}$ byte in memory counting upwards from MEM[s].
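The two concatenation orders can be illustrated with a short Python sketch. It models MEM as a `bytes` object and returns the integral value of the slice; the function name `mem_slice` and the `little_endian` flag are assumptions made for the illustration.

```python
def mem_slice(mem: bytes, s: int, n: int, little_endian: bool) -> int:
    """Integral value of MEM[s FOR n] under the selected endianness."""
    if n <= 0:
        raise ValueError("n must be greater than 0")
    if s < 0 or s + n > len(mem):
        raise IndexError("slice lies outside the modelled memory")
    order = "little" if little_endian else "big"
    return int.from_bytes(mem[s:s + n], order)
```

With memory bytes 0x22, 0x33, 0x44, 0x55 at ascending addresses, the little-endian slice has the value 0x55443322 and the big-endian slice has the value 0x22334455.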

For syntactic convenience, functions and procedures are provided to read, write and swap memory. The basic primitives support aligned accesses. Misaligned read and write primitives support the instructions for misaligned load and store.

Additionally, mechanisms are provided for reading and writing pairs of values. Pair access requires that each half of the pair is endianness converted separately, and that the lower half is written into memory at the provided address while the upper half is written into that address plus the object size. This maintains the ordering of the halves of the pair as they are transferred between registers and memory. Pair access is used only for loading and storing pairs of single-precision floating-point registers (see Chapter 6: Floating-point unit on page 145).

### 8.6.1 Support functions

The specification of the memory instructions relies on the support functions listed in Table 61. These functions are used to model the behavior of the memory management unit described in Chapter 3: Memory management unit (MMU) on page 41.

<table>
<thead>
<tr>
<th>Function</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>AddressUnavailable(address)</td>
<td>Returns true if the provided address is outside of the available part of the effective address space. For further details refer to Chapter 3: Memory management unit (MMU) on page 41.</td>
</tr>
<tr>
<td>MMU()</td>
<td>Returns true if the MMU is enabled.</td>
</tr>
<tr>
<td>DataAccessMiss(address)</td>
<td>Returns true if the provided address does not have a mapping for a data access.</td>
</tr>
<tr>
<td>InstFetchMiss(address)</td>
<td>Returns true if the provided address does not have a mapping for an instruction fetch.</td>
</tr>
<tr>
<td>InstInvalidateMiss(address)</td>
<td>Returns true if the provided address does not have a mapping for an instruction invalidation.</td>
</tr>
<tr>
<td>ReadProhibited(address)</td>
<td>Returns true if the provided address has no read permission for the current privilege.</td>
</tr>
<tr>
<td>WriteProhibited(address)</td>
<td>Returns true if the provided address has no write permission for the current privilege.</td>
</tr>
<tr>
<td>ExecuteProhibited(address)</td>
<td>Returns true if the provided address has no execute permission for the current privilege.</td>
</tr>
</tbody>
</table>

Table 61: Support functions for memory access

More detailed properties of translation miss detection are not modelled here. The conditions that determine whether an access is a translation miss or a hit depend on the MMU and cache.

DataAccessMiss is used to check for the absence of a data translation. This function is used for all data accesses when the MMU is enabled. InstFetchMiss is used to check for instruction fetches.

### 8.6.2 Reading memory

Functions are provided to read memory.

<table>
<thead>
<tr>
<th>Function</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ReadMemory_n(address)</td>
<td>Aligned memory read of an n-bit value</td>
</tr>
<tr>
<td>ReadMemoryPair_n(address)</td>
<td>Aligned memory read of a pair of n-bit values</td>
</tr>
</tbody>
</table>

The ReadMemory\(_n\) function takes an integer parameter to indicate the address being accessed. The number of bits being read (\(n\)) is one of 8, 16 or 32 bits. The required bytes are read from memory, interpreted according to endianness, and an integer result returns the read bit-field value. If the read memory value is to be interpreted as signed, then a sign-extension should be used on the result.

The assignment:

\[
result \leftarrow \text{ReadMemory}_n(a);
\]

is equivalent to:

\[
\begin{aligned}
&\text{width} \leftarrow n \gg 3; \\
&\text{IF } (\text{AddressUnavailable}(a) \text{ OR } ((a \land (\text{width} - 1)) \neq 0)) \text{ THROW RADDERR}, a; \\
&\text{IF } (\text{MMU}() \text{ AND DataAccessMiss}(a)) \text{ THROW RTLBMISS}, a; \\
&\text{IF } (\text{MMU}() \text{ AND ReadProhibited}(a)) \text{ THROW READPROT}, a; \\
&\text{result} \leftarrow \text{MEM}[a \text{ FOR width}];
\end{aligned}
\]
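The aligned read can be approximated in executable form. The Python sketch below is a toy model only: the class names `Throw` and `SimpleMemory` are hypothetical, the MMU is assumed to be disabled so the RTLBMISS and READPROT checks never fire, and every address inside the backing array is treated as available.

```python
class Throw(Exception):
    """Stands in for 'THROW type, value' in the specification language."""

    def __init__(self, kind, value=None):
        super().__init__(kind, value)
        self.kind = kind
        self.value = value


class SimpleMemory:
    """Toy flat memory with the MMU assumed disabled (MMU() is false)."""

    def __init__(self, size, little_endian=True):
        if size % 4 != 0:
            raise ValueError("size must be a multiple of 4 in this sketch")
        self.mem = bytearray(size)
        self.little_endian = little_endian

    def address_unavailable(self, a):
        # Assumption: every address inside the bytearray is available.
        return not (0 <= a < len(self.mem))

    def read_memory(self, n, a):
        if n not in (8, 16, 32):
            raise ValueError("n must be 8, 16 or 32")
        width = n >> 3  # access width in bytes
        if self.address_unavailable(a) or (a & (width - 1)) != 0:
            raise Throw("RADDERR", a)
        # The RTLBMISS and READPROT checks are skipped: no MMU is modelled.
        order = "little" if self.little_endian else "big"
        return int.from_bytes(self.mem[a:a + width], order)
```

A 32-bit read from address 2 raises `Throw("RADDERR", 2)` in this model because the address is not a multiple of the access width.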

ReadMemoryPair_n reads a pair of n-bit values. The alignment check requires alignment for a 2n-bit access. The access maintains the ordering of the two halves of the pair, with endianness applied separately to each half. The assignment:

\[
result \leftarrow \text{ReadMemoryPair}_n(a);
\]

is equivalent to:

\[
\begin{aligned}
&\text{width} \leftarrow n \gg 3; \\
&\text{pairwidth} \leftarrow \text{width} \ll 1; \\
&\text{IF } (\text{AddressUnavailable}(a) \text{ OR } ((a \land (\text{pairwidth} - 1)) \neq 0)) \text{ THROW RADDERR}, a; \\
&\text{IF } (\text{MMU}() \text{ AND DataAccessMiss}(a)) \text{ THROW RTLBMISS}, a; \\
&\text{IF } (\text{MMU}() \text{ AND ReadProhibited}(a)) \text{ THROW READPROT}, a; \\
&\text{low} \leftarrow \text{MEM}[a \text{ FOR width}]; \\
&\text{high} \leftarrow \text{MEM}[a+\text{width} \text{ FOR width}]; \\
&\text{result} \leftarrow \text{low} + (\text{high} \ll n);
\end{aligned}
\]
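The effect of converting each half separately is easiest to see with concrete bytes. The following Python sketch is a simplified model with hypothetical names; it omits the MMU checks and assumes the alignment mask is expressed in bytes (twice the per-element width), which matches the stated requirement of alignment for a 2n-bit access.

```python
class AddressError(Exception):
    """Stands in for the RADDERR exception in this sketch."""


def read_memory_pair(mem: bytes, n: int, a: int, little_endian: bool) -> int:
    width = n >> 3           # bytes per element
    pairwidth = width << 1   # bytes per pair (assumed unit: bytes)
    out_of_range = a < 0 or a + pairwidth > len(mem)
    if out_of_range or (a & (pairwidth - 1)) != 0:
        raise AddressError(a)
    order = "little" if little_endian else "big"
    low = int.from_bytes(mem[a:a + width], order)
    high = int.from_bytes(mem[a + width:a + pairwidth], order)
    # The element at the lower address always lands in the low half.
    return low + (high << n)
```

In big-endian mode the result differs from a single 2n-bit big-endian read: the bytes within each half are reversed relative to little-endian mode, but the half at the lower address still occupies the low n bits of the result.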
8.6.3 Prefetching memory

A function is provided to denote memory prefetch.

<table>
<thead>
<tr>
<th>Function</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>PrefetchMemory(address)</td>
<td>Memory prefetch</td>
</tr>
</tbody>
</table>

Table 63: Support procedure to prefetch memory

This is used for a software-directed data prefetch from a specified effective address. This is a hint to give advance notice that particular data will be required. It is implementation-specific as to whether a prefetch will be performed.

The statement:

\[
\text{result} \leftarrow \text{PrefetchMemory}(a);
\]

is equivalent to:

\[
\begin{aligned}
&\text{IF } (\text{NOT AddressUnavailable}(a)) \\
&\quad \text{IF } (\text{NOT } (\text{MMU}() \text{ AND DataAccessMiss}(a))) \\
&\quad\quad \text{IF } (\text{NOT } (\text{MMU}() \text{ AND ReadProhibited}(a))) \\
&\quad\quad\quad \text{PREF}(a); \\
&\text{result} \leftarrow 0;
\end{aligned}
\]

where PREF is a cache operation defined in Section 8.7: Cache model on page 202. This function does not raise exceptions. PrefetchMemory evaluates to zero for syntactic convenience.

8.6.4 Writing memory

Procedures are provided to write memory.

<table>
<thead>
<tr>
<th>Function</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>WriteMemory_n(address, value)</td>
<td>Aligned memory write to an n-bit value</td>
</tr>
<tr>
<td>WriteMemoryPair_n(address, value)</td>
<td>Aligned memory write to a pair of n-bit values</td>
</tr>
</tbody>
</table>

Table 64: Support procedures to write memory

The WriteMemory\(_n\) procedure takes an integer parameter to indicate the address being accessed, followed by an integer parameter containing the value to be written.
The number of bits being written (n) is one of 8, 16 or 32 bits. The written value is interpreted as a bit-field of the required size; all higher bits of the value are discarded. The bytes are written to memory, ordered according to endianness. The statement:

```
WriteMemory_n(a, value);
```

is equivalent to:

```
width ← n >> 3;
IF (AddressUnavailable(a) OR ((a∧(width-1)) ≠ 0)) THROW WADDERR,a;
IF (MMU() AND DataAccessMiss(a)) THROW WTLBMISS,a;
IF (MMU() AND WriteProhibited(a)) THROW WRITEPROT,a;
IF (MMU() AND NOT DirtyBit(a)) THROW FIRSTWRITE,a;
MEM[a FOR width] ← value<0 FOR n>;
```

WriteMemoryPair\(_n\) writes a pair of n-bit values. The alignment check requires alignment for a 2n-bit access. The access maintains the ordering of the two halves of the pair, with endianness applied separately to each half. The statement:

```
WriteMemoryPair_n(a, value);
```

is equivalent to:

```
width ← n >> 3;
pairwidth ← width << 1;
IF (AddressUnavailable(a) OR ((a∧(pairwidth-1)) ≠ 0)) THROW WADDERR,a;
IF (MMU() AND DataAccessMiss(a)) THROW WTLBMISS,a;
IF (MMU() AND WriteProhibited(a)) THROW WRITEPROT,a;
IF (MMU() AND NOT DirtyBit(a)) THROW FIRSTWRITE,a;
MEM[a FOR width] ← value<0 FOR n>;
MEM[a+width FOR width] ← value<n FOR n>;
```
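
As an informal illustration of the two write procedures, the Python sketch below models memory as a `bytearray`. It is not the manual's definition: the MMU checks are omitted, an out-of-range address stands in for AddressUnavailable, and the names `AddressError`, `write_memory` and `write_memory_pair` are invented for this example. The pair alignment is taken to be 2n bits, as stated in the text.

```python
class AddressError(Exception):
    """Stands in for THROW WADDERR in this sketch."""


def write_memory(mem, a, value, n, little_endian=True):
    """Aligned write of the low n bits of value at byte address a."""
    width = n >> 3                                  # access width in bytes
    if a + width > len(mem) or (a & (width - 1)) != 0:
        raise AddressError(a)
    field = value & ((1 << n) - 1)                  # higher bits are discarded
    mem[a:a + width] = field.to_bytes(width, "little" if little_endian else "big")


def write_memory_pair(mem, a, value, n, little_endian=True):
    """Aligned write of two n-bit halves; endianness applies to each half."""
    width = n >> 3
    pairwidth = width << 1                          # 2n-bit alignment, in bytes
    if a + pairwidth > len(mem) or (a & (pairwidth - 1)) != 0:
        raise AddressError(a)
    order = "little" if little_endian else "big"
    low = value & ((1 << n) - 1)                    # value<0 FOR n>
    high = (value >> n) & ((1 << n) - 1)            # value<n FOR n>
    mem[a:a + width] = low.to_bytes(width, order)
    mem[a + width:a + pairwidth] = high.to_bytes(width, order)


mem = bytearray(8)
write_memory_pair(mem, 0, (0xAABBCCDD << 32) | 0x11223344, 32)
```

Note that the low half always lands at the lower address regardless of endianness; only the byte order inside each half changes.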

Sleep operations

The SLEEP operation is used to enter sleep mode. The effects of this operation are beyond the scope of the specification language, and it is therefore modelled using procedure calls. The behavior of these procedure calls is elaborated in the text of the manual.

<table>
<thead>
<tr>
<th>Procedure</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>SLEEP()</td>
<td>Procedure to enter sleep mode</td>
</tr>
</tbody>
</table>

Table 65: Procedures to model sleep operation

### 8.7 Cache model

Cache operations are used to allocate, prefetch and cohere lines in caches. The effects of these operations are beyond the scope of the specification language, and are therefore modelled using procedure calls. The behavior of these procedure calls is elaborated in the text of the manual.

<table>
<thead>
<tr>
<th>Procedure</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>ALLOCO(address)</td>
<td>Procedure to allocate an operand cache block.</td>
</tr>
<tr>
<td>OCBI(address)</td>
<td>Procedure to invalidate an operand cache block.</td>
</tr>
<tr>
<td>OCBP(address)</td>
<td>Procedure to purge an operand cache block.</td>
</tr>
<tr>
<td>OCBWB(address)</td>
<td>Procedure to write-back an operand cache block.</td>
</tr>
<tr>
<td>PREF(address)</td>
<td>Procedure to prefetch an operand cache block.</td>
</tr>
</tbody>
</table>

Table 66: Procedures to model cache operations

### 8.8 Floating-point model

The floating-point specification is abstracted using functions to hide the low-level details. Additional information is provided in a tabular form to describe special and exceptional cases. Chapter 6: Floating-point unit on page 145 provides a textual description of floating-point operation.

#### 8.8.1 Functions to access SR and FPSCR

The floating-point instruction specifications use a function notation to access SR and FPSCR state. The functions used are described in Table 67.

### Table 67: SR and FPSCR access

<table>
<thead>
<tr>
<th>Function</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>FpuIsDisabled(SR)</td>
<td>True if SR.FD is 1, otherwise false</td>
</tr>
<tr>
<td>FpuFlagI(FPSCR)</td>
<td>True if FPSCR.FLAG.I (sticky flag for inexact) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuFlagU(FPSCR)</td>
<td>True if FPSCR.FLAG.U (sticky flag for underflow) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuFlagO(FPSCR)</td>
<td>True if FPSCR.FLAG.O (sticky flag for overflow) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuFlagZ(FPSCR)</td>
<td>True if FPSCR.FLAG.Z (sticky flag for divide by zero) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuFlagV(FPSCR)</td>
<td>True if FPSCR.FLAG.V (sticky flag for invalid) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuCauseI(FPSCR)</td>
<td>True if FPSCR.CAUSE.I (cause flag for inexact) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuCauseU(FPSCR)</td>
<td>True if FPSCR.CAUSE.U (cause flag for underflow) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuCauseO(FPSCR)</td>
<td>True if FPSCR.CAUSE.O (cause flag for overflow) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuCauseZ(FPSCR)</td>
<td>True if FPSCR.CAUSE.Z (cause flag for divide by zero) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuCauseV(FPSCR)</td>
<td>True if FPSCR.CAUSE.V (cause flag for invalid) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuEnableI(FPSCR)</td>
<td>True if FPSCR.ENABLE.I (exception enable for inexact) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuEnableU(FPSCR)</td>
<td>True if FPSCR.ENABLE.U (exception enable for underflow) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuEnableO(FPSCR)</td>
<td>True if FPSCR.ENABLE.O (exception enable for overflow) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuEnableZ(FPSCR)</td>
<td>True if FPSCR.ENABLE.Z (exception enable for divide by zero) is 1, otherwise false</td>
</tr>
<tr>
<td>FpuEnableV(FPSCR)</td>
<td>True if FPSCR.ENABLE.V (exception enable for invalid) is 1, otherwise false</td>
</tr>
</tbody>
</table>

#### 8.8.2 Functions to model floating-point behavior

Functions are used to model almost all of the floating-point behavior. Each function is associated with a list of results and a list of parameters. The functions encapsulate the computation associated with the instruction. This includes handling of input denormalized values, special case detection, exceptional cases and the floating-point arithmetic.

The following tables summarize the functions used by each instruction. The table shows how the parameters are interpreted and how the results are computed. The nth parameter is denoted as Pn and the nth result as RESn.

The parameters and results of these functions are all modeled as integer values. For floating-point parameters and results, these values are integer bit-patterns representing the IEEE754 formats. Multi-value results are used to return two results: the computed result and a new value for FPSCR. If the new value of FPSCR causes an exception to be raised, then the destination register will not be updated with the computed result.

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Function</th>
<th>RES0</th>
<th>RES1</th>
<th>P0, P1</th>
<th>P2</th>
</tr>
</thead>
<tbody>
<tr>
<td>FADD.S</td>
<td>FADD_S</td>
<td>Single result of $(P0 +_{\text{IEEE754}} P1)$</td>
<td>New FPSCR</td>
<td>Single</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FADD.D</td>
<td>FADD_D</td>
<td>Double result of $(P0 +_{\text{IEEE754}} P1)$</td>
<td>New FPSCR</td>
<td>Double</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FSUB.S</td>
<td>FSUB_S</td>
<td>Single result of $(P0 -_{\text{IEEE754}} P1)$</td>
<td>New FPSCR</td>
<td>Single</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FSUB.D</td>
<td>FSUB_D</td>
<td>Double result of $(P0 -_{\text{IEEE754}} P1)$</td>
<td>New FPSCR</td>
<td>Double</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FMUL.S</td>
<td>FMUL_S</td>
<td>Single result of $(P0 \times_{\text{IEEE754}} P1)$</td>
<td>New FPSCR</td>
<td>Single</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FMUL.D</td>
<td>FMUL_D</td>
<td>Double result of $(P0 \times_{\text{IEEE754}} P1)$</td>
<td>New FPSCR</td>
<td>Double</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FDIV.S</td>
<td>FDIV_S</td>
<td>Single result of $(P0 /_{\text{IEEE754}} P1)$</td>
<td>New FPSCR</td>
<td>Single</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FDIV.D</td>
<td>FDIV_D</td>
<td>Double result of $(P0 /_{\text{IEEE754}} P1)$</td>
<td>New FPSCR</td>
<td>Double</td>
<td>Old FPSCR</td>
</tr>
</tbody>
</table>

Table 68: Floating-point dyadic arithmetic

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Function</th>
<th>RES0</th>
<th>RES1</th>
<th>P0</th>
<th>P1</th>
</tr>
</thead>
<tbody>
<tr>
<td>FABS.S</td>
<td>FABS_S</td>
<td>Single result of absolute (P0)</td>
<td>(not used)</td>
<td>Single</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FABS.D</td>
<td>FABS_D</td>
<td>Double result of absolute (P0)</td>
<td>(not used)</td>
<td>Double</td>
<td>Old FPSCR</td>
</tr>
</tbody>
</table>

Table 69: Floating-point monadic arithmetic

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Function</th>
<th>RES0</th>
<th>RES1</th>
<th>P0</th>
<th>P1</th>
</tr>
</thead>
<tbody>
<tr>
<td>FNEG.S</td>
<td>FNEG_S</td>
<td>Single result of negating P0</td>
<td>(not used)</td>
<td>Single</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FNEG.D</td>
<td>FNEG_D</td>
<td>Double result of negating P0</td>
<td>(not used)</td>
<td>Double</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FSQRT.S</td>
<td>FSQRT_S</td>
<td>Single result of $\sqrt{P0}$</td>
<td>New FPSCR</td>
<td>Single</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FSQRT.D</td>
<td>FSQRT_D</td>
<td>Double result of $\sqrt{P0}$</td>
<td>New FPSCR</td>
<td>Double</td>
<td>Old FPSCR</td>
</tr>
</tbody>
</table>
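
Because parameters and results are modelled as integer bit patterns, the sign operations in Table 69 can be illustrated directly on 32-bit integers. The Python sketch below is an assumption-laden illustration, not the manual's definition: the function names `fabs_s` and `fneg_s` merely mirror the table, the helpers `bits` and `value` are invented here, and the FPSCR parameter is accepted but ignored because RES1 is marked as not used.

```python
import struct

SIGN32 = 0x80000000  # sign bit of an IEEE754 single-precision pattern


def bits(x):
    """Hypothetical helper: float -> 32-bit IEEE754 bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def value(b):
    """Hypothetical helper: 32-bit IEEE754 bit pattern -> float."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def fabs_s(p0, fpscr):
    """Absolute value: clear the sign bit of the pattern."""
    return p0 & ~SIGN32 & 0xFFFFFFFF


def fneg_s(p0, fpscr):
    """Negation: invert the sign bit of the pattern."""
    return (p0 ^ SIGN32) & 0xFFFFFFFF


print(hex(fabs_s(bits(-1.5), 0)))  # pattern of 1.5
```

The arithmetic functions such as FSQRT_S additionally return a new FPSCR value; modelling that would require the FPSCR field layout, which this sketch deliberately leaves out.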

### Table 70: Floating-point comparisons

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Function</th>
<th>RES0</th>
<th>RES1</th>
<th>P0, P1</th>
<th>P2</th>
</tr>
</thead>
<tbody>
<tr>
<td>FCMPEQ.S</td>
<td>FCMPEQ_S</td>
<td>Boolean result of $(P0 =_{\text{IEEE754}} P1)$</td>
<td>New FPSCR</td>
<td>Single</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FCMPEQ.D</td>
<td>FCMPEQ_D</td>
<td>Boolean result of $(P0 =_{\text{IEEE754}} P1)$</td>
<td>New FPSCR</td>
<td>Double</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FCMPGT.S</td>
<td>FCMPGT_S</td>
<td>Boolean result of $(P0 &gt;_{\text{IEEE754}} P1)$</td>
<td>New FPSCR</td>
<td>Single</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FCMPGT.D</td>
<td>FCMPGT_D</td>
<td>Boolean result of $(P0 &gt;_{\text{IEEE754}} P1)$</td>
<td>New FPSCR</td>
<td>Double</td>
<td>Old FPSCR</td>
</tr>
</tbody>
</table>

### Table 71: Floating-point conversions

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Function</th>
<th>RES0</th>
<th>RES1</th>
<th>P0</th>
<th>P1</th>
</tr>
</thead>
<tbody>
<tr>
<td>FCNV.SD</td>
<td>FCNV_SD</td>
<td>P0 is converted to double result</td>
<td>New FPSCR</td>
<td>Single</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FCNV.DS</td>
<td>FCNV_DS</td>
<td>P0 is converted to single result</td>
<td>New FPSCR</td>
<td>Double</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FTRC.SL</td>
<td>FTRC_SL</td>
<td>P0 is converted to signed 32-bit integer result</td>
<td>New FPSCR</td>
<td>Single</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FTRC.DL</td>
<td>FTRC_DL</td>
<td>P0 is converted to signed 32-bit integer result</td>
<td>New FPSCR</td>
<td>Double</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FLOAT.LS</td>
<td>FLOAT_LS</td>
<td>P0 is converted to single result</td>
<td>New FPSCR</td>
<td>32-bit int</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FLOAT.LD</td>
<td>FLOAT_LD</td>
<td>P0 is converted to double result</td>
<td>New FPSCR</td>
<td>32-bit int</td>
<td>Old FPSCR</td>
</tr>
</tbody>
</table>

### Table 72: Floating-point multiply-accumulate

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Function</th>
<th>RES0</th>
<th>RES1</th>
<th>P0, P1, P2</th>
<th>P3</th>
</tr>
</thead>
<tbody>
<tr>
<td>FMAC.S</td>
<td>FMAC_S</td>
<td>Single result of fused $(P0 \times P1) + P2$</td>
<td>New FPSCR</td>
<td>Single</td>
<td>Old FPSCR</td>
</tr>
</tbody>
</table>

### Table 73: Special-purpose floating-point dyadic arithmetic

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Function</th>
<th>RES0</th>
<th>RES1</th>
<th>P0</th>
<th>P1</th>
<th>P2</th>
</tr>
</thead>
<tbody>
<tr>
<td>FIPR.S</td>
<td>FIPR_S</td>
<td>Single result of inner product of P0 with P1</td>
<td>New FPSCR</td>
<td>Array of 4 singles</td>
<td>Array of 4 singles</td>
<td>Old FPSCR</td>
</tr>
<tr>
<td>FTRV.S</td>
<td>FTRV_S</td>
<td>Array of 4 single results of matrix transform of P0 with P1</td>
<td>New FPSCR</td>
<td>Array of 16 singles</td>
<td>Array of 4 singles</td>
<td>Old FPSCR</td>
</tr>
</tbody>
</table>

#### 8.8.3 Floating-point special cases and exceptions

A special-case table is provided for each floating-point instruction that is considered an operation and has at least one input that is interpreted as a floating-point value. This table enumerates all different possible combinations of input values and the results returned by the instruction in the absence of an exception being raised.

The entries in the table are IEEE754 floating-point values as described in Chapter 6: Floating-point unit on page 145. Each cell entry in the table describes the result returned for a particular combination of floating-point inputs. If the result is invariant, its value is given in the cell. If the result is variable, the name of the appropriate operation is entered in the cell. If the cell contains 'n/a' then this indicates that an exception is always raised for that combination of inputs and that the implementation does not associate any value with the result.

### 8.9 Abstract sequential model

This section describes the abstract sequential model that is used to specify how instructions are executed on the SH4. It is described in terms of transitions in the explicit architectural state of the device plus some hidden internal state held in PC'' and PR'', which are used to keep track of delayed state changes.

Section 8.9.1 describes the initial values taken by the internal state.

Section 8.9.2 describes the steps taken to execute each SH compact instruction in the abstract sequential model. Section 8.9.3 describes the mechanisms used to model delayed branching.

#### 8.9.1 Initial conditions

The hidden internal state used to keep track of delayed state changes is automatically set to appropriate initial conditions at the beginning of a sequence of instructions.

The initial state is set as follows:
- PC'' is set to PC+2
- PR'' is set to the same value as PR

#### 8.9.2 Instruction execution loop

The steps associated with executing each instruction are:

1. Check for asynchronous events, such as interrupt or reset, and initiate handling if required. Asynchronous events are not accepted between a delayed branch and a delay slot. They are delayed until after the delay slot.
2. Check the current program counter (PC) for instruction address exceptions, and initiate handling if required.
3. Fetch the instruction bytes from the address in memory indicated by the current program counter. Two bytes need to be fetched for each instruction.
4. Calculate the default values of PC' and PR'. PC' is set to the value of PC'', PR' is set to the value of PR''.
5. Calculate the default values of PC'' and PR'' assuming continued sequential execution without procedure call or mode switch: PC'' is PC'+2, while PR'' is unchanged.
6. Decode and execute the instruction. This includes checks for synchronous events, such as exceptions and panics, and initiation of handling if required. Synchronous events are not accepted between a delayed branch and a delay slot. They are detected either before the delayed branch or after the delay slot.

   The execution of an instruction can update the PC and PR state as follows:
   - The instruction can change PC' to achieve a branch after this instruction has completed. It must also update PC'' to the value of PC'+2 to ensure correct sequential execution after the control flow.
   - The instruction can change PR' to load the procedure link register. It must also update PR'' to the same value as PR'.
   - The instruction can change PC'' and PR'' to achieve a branch or procedure call after the next instruction has completed.

   Any changes made to PC', PR', PC'' or PR'' over-ride the default values.

7. Set the current program counter (PC) to the value of the next program counter (PC') and PR to the value of PR'.

The actions associated with the handling of asynchronous and synchronous events are described in Chapter 5: Exceptions on page 105. The actions required by step 6 depend on the instruction, and are specified by the instruction specification for that instruction. Step 7 specifies the behavior for PC overflow.

#### 8.9.3 State changes

Non-delayed and delayed state changes are used to model the branch mechanism. These correspond to non-delayed and delayed branches.

In the model, PC and PR are never written directly by an instruction. Instead, an instruction writes to PC' or PR' to cause a non-delayed state change, or to PC'' or PR'' to cause a delayed state change:

- A non-delayed state change is achieved by updating PC' or PR' to over-ride their default values. After the execution of this instruction, PC' and PR' get copied to PC and PR respectively, and then influence instruction execution. Hence, there is no delay slot before the values of PC' and PR' propagate through to PC and PR.

- A delayed state change is achieved by updating PC'' or PR'' to over-ride their default values. After the execution of this instruction, PC'' and PR'' get copied to PC' and PR' respectively. After the execution of the next instruction, PC' and PR' get copied to PC and PR respectively, and then influence instruction execution. Hence, there is a delay slot before the values of PC'' and PR'' propagate through to PC and PR.

There are potential ambiguities when one instruction makes a delayed state change and the immediately following instruction (which is in a delay slot) makes a non-delayed state change. These are handled as follows:

- The case of a delayed state change to PC immediately followed by a non-delayed state change to PC does not occur. This is because delay slot instructions that write to PC are illegal and cause an ILLSLOT exception.
- The case of a delayed state change to PR immediately followed by a non-delayed state change to PR can occur. The ambiguous cases are when a BSR, BSRF or JSR instruction is followed by an LDS that writes to PR. In this case the PR observed by the instruction that dynamically follows the LDS instruction is the value written by LDS, not the value written by the sub-routine call. This behavior follows from the model described above.

There are also potential ambiguities when one instruction makes a delayed state change and the immediately following instruction (which is in a delay slot) reads from that state. These are handled as follows:

- The case of a delayed state change to PC immediately followed by a read of PC does not occur. This is because delay slot instructions that read from PC are illegal and cause an ILLSLOT exception.

- The case of a delayed state change to PR immediately followed by a read from PR can occur. The ambiguous cases are when a BSR, BSRF or JSR instruction is followed by an STS that reads from PR. In this case the PR observed by the STS instruction is the value written by the sub-routine call and not the previous value. This behavior is modeled explicitly in the definition of the STS instruction. It reads the value from PR' (rather than the intuitive read from PR).
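
The interaction between PC, PC' and PC'' can be illustrated with a small Python sketch of the execution loop. It is a simplification made under stated assumptions: only the PC chain is modelled (PR would follow the same pattern), events and exceptions are ignored, and the `program` dictionary with its `"branch"` and `"delayed"` entries is an invented encoding for this example rather than anything defined by the manual.

```python
def run(program, pc, steps):
    """Trace the addresses executed under the PC / PC' / PC'' model.

    program maps an address to ("branch", target) for a non-delayed branch
    or ("delayed", target) for a delayed branch; other addresses are
    treated as ordinary sequential instructions.
    """
    pc2 = pc + 2                      # initial condition: PC'' = PC + 2
    trace = []
    for _ in range(steps):
        trace.append(pc)              # steps 2-3: fetch at the current PC
        pc1 = pc2                     # step 4: default PC' is PC''
        pc2 = pc1 + 2                 # step 5: default PC'' is PC' + 2
        kind, target = program.get(pc, ("seq", None))
        if kind == "branch":          # non-delayed: write PC', then PC'' = PC' + 2
            pc1 = target
            pc2 = pc1 + 2
        elif kind == "delayed":       # delayed: write PC'' only
            pc2 = target
        pc = pc1                      # step 7: PC takes the value of PC'
    return trace


delayed = run({0x100: ("delayed", 0x200)}, 0x100, 3)
direct = run({0x100: ("branch", 0x200)}, 0x100, 3)
print([hex(a) for a in delayed])      # the instruction at 0x102 is the delay slot
```

In the delayed case the instruction at 0x102 still executes before control reaches 0x200, whereas the non-delayed branch reaches 0x200 immediately.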

### 8.10 Example instructions

#### 8.10.1 ADD #imm, Rn

An example specification for this instruction is shown below.

ADD #imm, Rn

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0111</td>
<td>n</td>
<td>s</td>
</tr>
</tbody>
</table>

imm ← SignExtend8(s);
op2 ← SignExtend32(Rn);
op2 ← op2 + imm;
Rn ← Register(op2);

The top half of this figure shows the assembly syntax and the binary encoding of the instruction. Particular fields within the encoding are identified by single characters. The opcode field, and any extension field, contain the literal encoding values associated with that instruction. Reserved fields must be encoded with the literal value given in the figure. Operand fields contain register designators or immediate constants.

The lower half of this figure specifies the effects of the execution of the instruction on the architectural state of the machine. The specification statements are organized into 3 stages as follows:

1. The first two statements read all required source information:
   
   \[
   \begin{align*}
   &\text{imm} \leftarrow \text{SignExtend}_8(s); \\
   &\text{op2} \leftarrow \text{SignExtend}_{32}(Rn);
   \end{align*}
   \]

   The first statement reads the value of \( s \), interprets it as a sign-extended 8-bit integer value and assigns this to a temporary integer called \( \text{imm} \). The name \( \text{imm} \) corresponds to the name of the immediate used in the assembly syntax. The second statement reads the value of \( R_n \) register, interprets it as a sign-extended 32-bit integer value and assigns this to a temporary integer called \( \text{op2} \).

2. The next statement implements the addition:
   
   \[
   \text{op2} \leftarrow \text{op2} + \text{imm};
   \]

   This statement does not refer to any architectural state. It adds the 2 integers \( \text{imm} \) and \( \text{op2} \) together, and assigns the result to a temporary integer called \( \text{op2} \). Note that since this is a conventional mathematical addition, the result can contain more significant bits of information than the sources.

3. The final statement updates the architectural state:
   
   \[
   \text{R}_n \leftarrow \text{Register} (\text{op2});
   \]

   The integer \( \text{op2} \) is converted back to a register bit-field and assigned to the \( \text{R}_n \) register.
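
The three stages can be mirrored in a short Python sketch. The helper names `sign_extend` and `register` are chosen here to echo SignExtend and Register from the specification language; treating Register as truncation to 32 bits is an assumption of this sketch.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def register(value):
    """Convert an unbounded integer back to a 32-bit register bit-field."""
    return value & 0xFFFFFFFF


def add_imm(rn, s):
    """ADD #imm, Rn: returns the new 32-bit contents of Rn."""
    imm = sign_extend(s, 8)      # stage 1: read the sources
    op2 = sign_extend(rn, 32)
    op2 = op2 + imm              # stage 2: unbounded mathematical addition
    return register(op2)         # stage 3: write back, discarding extra bits


print(hex(add_imm(0x7FFFFFFF, 0x01)))
```

Python integers are unbounded, so the intermediate `op2` can hold more significant bits than the sources, just as the text describes; only `register` narrows it again.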

#### 8.10.2 FADD FRm, FRn

An example specification for this instruction is shown below.

**FADD FRm, FRn**

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-4</th>
<th>3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1111</td>
<td>n</td>
<td>m</td>
<td>0000</td>
</tr>
</tbody>
</table>

Available only when PR=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue32(FRm);
op2 ← FloatValue32(FRn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
  THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
  THROW FPUDIS;
op2, fps ← FADD_S(op1, op2, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
  THROW FPUEXC, fps;
IF (FpuCauseE(fps))
  THROW FPUEXC, fps;
IF ((FpuEnableI(fps) OR FpuEnableO(fps)) OR FpuEnableU(fps))
  THROW FPUEXC, fps;
FRn ← FloatRegister32(op2);
FPSCR ← ZeroExtend32(fps);

The specification statements are organized as follows:

1. Read all required source information:
   
   sr ← ZeroExtend32(SR);
   fps ← ZeroExtend32(FPSCR);
   op1 ← FloatValue32(FRm);
   op2 ← FloatValue32(FRn);

2. Execute the instruction:
   
   IF (FpuIsDisabled(sr) AND IsDelaySlot())
     THROW SLOTFPUDIS;
   IF (FpuIsDisabled(sr))
     THROW FPUDIS;
   op2, fps ← FADD_S(op1, op2, fps);
   IF (FpuEnableV(fps) AND FpuCauseV(fps))
     THROW FPUEXC, fps;
   IF (FpuCauseE(fps))
     THROW FPUEXC, fps;
   IF ((FpuEnableI(fps) OR FpuEnableO(fps)) OR FpuEnableU(fps))
     THROW FPUEXC, fps;

The behavior of the floating-point single-precision addition is modelled by the
FADD_S procedure. This procedure is given the two source operands and the
current value of FPSCR, and calculates the result and the new value of FPSCR. It is
responsible for detecting special cases and exceptions, and setting the result and
new FPSCR values accordingly.

This instruction contains exception cases. These are detected by IF statements and
are raised by THROW statements. When a THROW statement is executed, no
further statements from the specification are processed. In exception cases, this
specification makes no updates to architectural state. Instead, a handler is launched
for the exception as described in Chapter 5: Exceptions on page 105. The THROW
statement includes arguments to specify the kind of exception and any necessary
parameters of that exception. For an FPUEXC exception, the THROW statement
includes an updated value of ‘fps’ which the exception handler uses to initialize
FPSCR during the launch sequence.

3. Update the architectural state:

   FRn ← FloatRegister32(op2);
   FPSCR ← ZeroExtend32(fps);

Instruction descriptions

9.1 Alphabetical list of instructions

Instructions are listed in this section in alphabetical order.
ADD Rm, Rn

**Description**

This instruction adds Rm to Rn and places the result in Rn.

**Operation**

```
ADD Rm, Rn
```

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-4</th>
<th>3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0011</td>
<td>n</td>
<td>m</td>
<td>1100</td>
</tr>
</tbody>
</table>

op1 ← SignExtend32(Rm);
op2 ← SignExtend32(Rn);
op2 ← op2 + op1;
Rn ← Register(op2);

**Note**
ADD #imm, Rn

Description
This instruction adds Rn to the sign-extended 8-bit immediate s and places the result in Rn.

Operation

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0111</td>
<td>n</td>
<td>s</td>
</tr>
</tbody>
</table>

imm ← SignExtend8(s);
op2 ← SignExtend32(Rn);
op2 ← op2 + imm;
Rn ← Register(op2);

Note
The ‘#imm’ in the assembly syntax represents the immediate s after sign extension.
ADDC Rm, Rn

Description

This instruction adds Rm, Rn and the T-bit. The result of the addition is placed in Rn, and the carry-out from the addition is placed in the T-bit.

Operation

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-4</th>
<th>3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0011</td>
<td>n</td>
<td>m</td>
<td>1110</td>
</tr>
</tbody>
</table>

\[
\begin{align*}
& t \leftarrow \text{ZeroExtend}_1(T); \\
& \text{op1} \leftarrow \text{ZeroExtend}_{32}(\text{SignExtend}_{32}(R_m)); \\
& \text{op2} \leftarrow \text{ZeroExtend}_{32}(\text{SignExtend}_{32}(R_n)); \\
& \text{op2} \leftarrow (\text{op2} + \text{op1}) + t; \\
& t \leftarrow \text{op2}_{\langle 32 \ \text{FOR} \ 1 \rangle}; \\
& R_n \leftarrow \text{Register}(\text{op2}); \\
& T \leftarrow \text{Bit}(t);
\end{align*}
\]
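
A minimal Python sketch of this add-with-carry behavior is shown below. The function name `addc` and the convention of returning the pair (new Rn, new T) are choices made for this example; the carry-out is taken to be bit 32 of the unsigned sum, as the description implies.

```python
MASK32 = 0xFFFFFFFF


def addc(rm, rn, t):
    """Add Rm, Rn and the T-bit; return (new Rn, new T)."""
    op1 = rm & MASK32                 # operands as unsigned 32-bit values
    op2 = rn & MASK32
    op2 = (op2 + op1) + (t & 1)       # up to 33 significant bits
    carry = (op2 >> 32) & 1           # carry-out of the 32-bit addition
    return op2 & MASK32, carry


# Usage: a 64-bit addition built from two 32-bit halves.
lo, t = addc(0x00000001, 0xFFFFFFFF, 0)   # low words
hi, t = addc(0x00000000, 0x00000002, t)   # high words plus the carry
```

Chaining the T-bit from the low-word addition into the high-word addition is the usual reason for providing an add-with-carry instruction.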

Note
ADDV Rm, Rn

**Description**

This instruction adds Rm to Rn and places the result in Rn. The T-bit is set to 1 if the addition result is outside the 32-bit signed range, otherwise the T-bit is set to 0.

**Operation**

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-4</th>
<th>3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0011</td>
<td>n</td>
<td>m</td>
<td>1111</td>
</tr>
</tbody>
</table>

\[
\begin{align*}
& \text{op1} \leftarrow \text{SignExtend}_{32}(R_m); \\
& \text{op2} \leftarrow \text{SignExtend}_{32}(R_n); \\
& \text{op2} \leftarrow \text{op2} + \text{op1}; \\
& t \leftarrow \text{INT}((\text{op2} < (-2^{31})) \ \text{OR} \ (\text{op2} \geq 2^{31})); \\
& R_n \leftarrow \text{Register}(\text{op2}); \\
& T \leftarrow \text{Bit}(t);
\end{align*}
\]
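
The overflow test can be checked with a small Python sketch. The names `sign_extend32` and `addv` are invented for this illustration, and the function returns the pair (new Rn, new T) purely for convenience.

```python
def sign_extend32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def addv(rm, rn):
    """Signed add with overflow detection; return (new Rn, new T)."""
    op1 = sign_extend32(rm)
    op2 = sign_extend32(rn)
    op2 = op2 + op1                                   # exact mathematical sum
    t = int(op2 < -(2 ** 31) or op2 >= 2 ** 31)       # outside signed 32-bit range
    return op2 & 0xFFFFFFFF, t


print(addv(0x00000001, 0x7FFFFFFF))   # largest positive value plus one overflows
```

Because the sum is computed on unbounded integers, the range test can be written exactly as in the specification, with no need to inspect sign bits.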

**Note**
**AND Rm, Rn**

**Description**
This instruction performs bitwise AND of \( R_m \) with \( R_n \) and places the result in \( R_n \).

**Operation**

\[
\text{AND } R_m, R_n
\]

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-4</th>
<th>3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0010</td>
<td>n</td>
<td>m</td>
<td>1001</td>
</tr>
</tbody>
</table>

\[
\begin{align*}
op1 & \leftarrow \text{ZeroExtend}_{32}(R_m); \\
op2 & \leftarrow \text{ZeroExtend}_{32}(R_n); \\
op2 & \leftarrow \text{op2} \land \text{op1}; \\
R_n & \leftarrow \text{Register}(\text{op2});
\end{align*}
\]

**Note**
This instruction performs a 32-bit bitwise AND.
**AND #imm, R0**

**Description**

This instruction performs bitwise AND of R0 with the zero-extended 8-bit immediate i and places the result in R0.

**Operation**

AND #imm, R0

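<table>
<thead>
<tr>
<th>15-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>11001001</td>
<td>i</td>
</tr>
</tbody>
</table>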

```
r0 ← ZeroExtend32(R0);
imm ← ZeroExtend8(i);
r0 ← r0 ∧ imm;
R0 ← Register(r0);
```

**Note**

This instruction performs a 32-bit bitwise AND. The ‘#imm’ in the assembly syntax represents the immediate i after zero extension.
AND.B #imm, @(R0, GBR)

Description
This instruction performs a bitwise AND of an immediate constant with 8 bits of
data held in memory. The effective address is calculated by adding R0 and GBR. The
8 bits of data at the effective address are read. A bitwise AND is performed of the
read data with the zero-extended 8-bit immediate i. The result is written back to the
8 bits of data at the same effective address.

Operation

AND.B #imm, @(R0, GBR)

<table>
<thead>
<tr>
<th>15-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>11001101</td>
<td>i</td>
</tr>
</tbody>
</table>

r0 ← SignExtend32(R0);
gbr ← SignExtend32(GBR);
imm ← ZeroExtend8(i);
address ← ZeroExtend32(r0 + gbr);
value ← ZeroExtend8(ReadMemory8(address));
value ← value ∧ imm;
WriteMemory8(address, value);

Exceptions
WADDERR, WTLBMISS, READPROT, WRITEPROT, FIRSTWRITE

Note
Zero-extension is performed on the effective address computation allowing wrap
around to occur.

The ‘#imm’ in the assembly syntax represents the immediate i after zero extension.
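
The read-modify-write sequence, including the address wrap-around mentioned in the note, can be sketched in Python as below. Memory is modelled as a dictionary from byte address to byte value, which is an assumption of this example; exceptions and MMU checks are not modelled, and the name `and_b` is invented here.

```python
def sign_extend32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def and_b(mem, r0, gbr, i):
    """AND.B #imm, @(R0, GBR) on a dict-based memory; returns the address used."""
    address = (sign_extend32(r0) + sign_extend32(gbr)) & 0xFFFFFFFF  # wraps to 32 bits
    value = mem.get(address, 0) & 0xFF     # read 8 bits
    value = value & (i & 0xFF)             # AND with the zero-extended immediate
    mem[address] = value                   # write 8 bits back to the same address
    return address


mem = {0x10: 0xF3}
and_b(mem, 0x8, 0x8, 0x0F)
```

The same effective address is used for the read and the write, so only the addressed byte changes.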
BF label

Description

This instruction is a conditional branch. The 8-bit displacement $s$ is sign-extended, doubled and added to PC+4 to form the target address. If the T-bit is 1, the branch is not taken. If the T-bit is 0, the target address is copied to the PC.

Operation

$\text{BF label}$

<table>
<thead>
<tr>
<th>15-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>10001011</td>
<td>s</td>
</tr>
</tbody>
</table>

\[
\begin{align*}
& t \leftarrow \text{ZeroExtend}_1(T); \\
& \text{pc} \leftarrow \text{SignExtend}_{32}(PC); \\
& \text{newpc} \leftarrow \text{SignExtend}_{32}(PC'); \\
& \text{delayedpc} \leftarrow \text{SignExtend}_{32}(PC''); \\
& \text{label} \leftarrow \text{SignExtend}_8(s) \ll 1; \\
& \text{IF (IsDelaySlot())} \\
& \quad \text{THROW ILLSLOT;} \\
& \text{IF} \ (t = 0) \\
& \{ \\
& \quad \text{temp} \leftarrow \text{ZeroExtend}_{32}(\text{pc} + 4 + \text{label}); \\
& \quad \text{newpc} \leftarrow \text{temp}; \\
& \quad \text{delayedpc} \leftarrow \text{temp} + 2; \\
& \} \\
& PC' \leftarrow \text{Register}(\text{newpc}); \\
& PC'' \leftarrow \text{Register}(\text{delayedpc});
\end{align*}
\]

Exceptions

ILLSLOT

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

This is not a delayed branch instruction. An ILLSLOT exception is raised if this instruction is executed in a delay slot.

The ‘label’ in the assembly syntax represents the immediate s after sign extension and scaling.

If the branch target address is invalid then the IADDERR trap is not delivered until after the branch instruction completes its execution and the PC has advanced to the target address; that is, the exception is associated with the target instruction, not the branch.
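
The target address arithmetic for this branch can be illustrated in Python. The sketch assumes the instruction is not in a delay slot (so ILLSLOT is not modelled) and that the not-taken case simply continues at PC+2; the function name `bf_next_pc` is invented for this example.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def bf_next_pc(pc, s, t):
    """Return the address executed after BF, given the displacement s and T-bit."""
    label = sign_extend(s, 8) << 1            # sign-extend, then double
    if t == 0:
        return (pc + 4 + label) & 0xFFFFFFFF  # taken: wrap to 32 bits
    return (pc + 2) & 0xFFFFFFFF              # not taken: next sequential instruction


print(hex(bf_next_pc(0x1000, 0xFE, 0)))       # displacement -2 branches back to 0x1000
```

A displacement of 0xFE encodes -2, so the scaled offset of -4 cancels the +4 bias and the branch targets its own address.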
BF/S label

Description

This instruction is a delayed conditional branch. The 8-bit displacement s is sign-extended, doubled and added to PC+4 to form the target address. If the T-bit is 1, the branch is not taken. If the T-bit is 0, the delay slot is executed and then the target address is copied to the PC.

Operation

BF/S label

<table>
<thead>
<tr>
<th>15-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>10001111</td>
<td>s</td>
</tr>
</tbody>
</table>

\[
\begin{align*}
& t \leftarrow \text{ZeroExtend}_1(T); \\
& \text{pc} \leftarrow \text{SignExtend}_{32}(PC); \\
& \text{delayedpc} \leftarrow \text{SignExtend}_{32}(PC''); \\
& \text{label} \leftarrow \text{SignExtend}_8(s) \ll 1; \\
& \text{IF (IsDelaySlot())} \\
& \quad \text{THROW ILLSLOT;} \\
& \text{IF} \ (t = 0) \\
& \{ \\
& \quad \text{temp} \leftarrow \text{ZeroExtend}_{32}(\text{pc} + 4 + \text{label}); \\
& \quad \text{delayedpc} \leftarrow \text{temp}; \\
& \} \\
& PC'' \leftarrow \text{Register}(\text{delayedpc});
\end{align*}
\]

Exceptions

ILLSLOT

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The delay slot is executed before branching. An ILLSLOT exception is raised if this instruction is executed in a delay slot.

The ‘label’ in the assembly syntax represents the immediate s after sign extension and scaling.

If the branch target address is invalid then the IADDERR trap is not delivered until after the instruction in the delay slot has executed and the PC has advanced to the target address; that is, the exception is associated with the target instruction, not the branch.
BRA label

**Description**

This instruction is a delayed unconditional branch. The 12-bit displacement $s$ is sign-extended, doubled and added to $PC+4$ to form the target address. The delay slot is executed and then the target address is copied to the PC.

**Operation**

```
BRA label
```

```
1010 s
  15  12  11  0

pc ← SignExtend32(PC);
label ← SignExtend12(s) << 1;
IF (IsDelaySlot())
  THROW ILLSLOT;
temp ← ZeroExtend32(pc + 4 + label);
delayedpc ← temp;
PC'' ← Register(delayedpc);
```

**Exceptions**

ILLSLOT

**Note**

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The delay slot is executed before branching. An ILLSLOT exception is raised if this instruction is executed in a delay slot.

The ‘label’ in the assembly syntax represents the immediate $s$ after sign extension and scaling.

If the branch target address is invalid, the IADDERR trap is not delivered until after the instruction in the delay slot has executed and the PC has advanced to the target address; that is, the exception is associated with the target instruction, not the branch.
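
Because the 12-bit displacement is doubled, a BRA can reach targets from PC+4-4096 to PC+4+4094 in steps of 2. The following Python sketch shows how an assembler might derive s from a label and how the target is recovered from the opcode. The function names `bra_encode` and `bra_target` are hypothetical and chosen only for this illustration.

```python
MASK32 = 0xFFFFFFFF

def bra_encode(pc, target):
    """Build the 16-bit BRA opcode for a branch at `pc` to `target`."""
    disp = target - (pc + 4)
    if disp % 2:
        raise ValueError("target must be 2-byte aligned relative to PC+4")
    s = disp >> 1                       # undo the doubling
    if not -2048 <= s <= 2047:
        raise ValueError("target out of range for a 12-bit displacement")
    return 0xA000 | (s & 0xFFF)         # 1010 in bits 15-12, s in bits 11-0

def bra_target(pc, opcode):
    """Recover the branch target from a BRA opcode located at `pc`."""
    s = opcode & 0xFFF
    if s & 0x800:                       # sign-extend the 12-bit field
        s -= 0x1000
    return (pc + 4 + (s << 1)) & MASK32
```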
**BRAF Rn**

**Description**

This instruction is a delayed unconditional branch. The target address is calculated by adding Rn to PC+4. If the least significant bit of the target address is set, an IADDERR exception is raised, otherwise, the delay slot is executed.

**Operation**

```
BRAF Rn
```

```
0000 n 00100011
```

```
15 12 11  8  7  0
```

```
pc ← SignExtend32(PC);
op1 ← SignExtend32(Rn);
IF (IsDelaySlot())
    THROW ILLSLOT;
target ← ZeroExtend32(pc + 4 + op1);
delayedpc ← target ∧ (~ 0x1);
PC'' ← Register(delayedpc);
```

**Exceptions**

ILLSLOT

**Note**

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The delay slot is executed before branching occurs. An ILLSLOT exception is raised if this instruction is executed in a delay slot.

If the branch target address is invalid, the IADDERR trap is not delivered until after the instruction in the delay slot has executed and the PC has advanced to the target address; that is, the exception is associated with the target instruction, not the branch.
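
The register-relative target calculation can be sketched as follows. The sketch follows the operation text literally, including the masking of bit 0 of the target; it does not model exception delivery. The name `braf_target` is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def braf_target(pc, rn):
    """Value written to PC'' by a BRAF at `pc` with register value `rn`."""
    target = (pc + 4 + rn) & MASK32     # 32-bit wrap-around
    return target & ~0x1 & MASK32       # delayedpc <- target AND (NOT 1)
```

A negative offset is simply a large unsigned register value: with Rn = 0xFFFFFFF0 (-16) and the BRAF at 0x1000, the sum wraps to 0x0FF4.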
**BRK**

**Description**

The BRK instruction causes a pre-execution BREAK exception. This exception is generated even if BRK is executed in a delay slot. BRK is typically reserved for use by the debugger.

**Operation**

<table>
<thead>
<tr>
<th>BRK</th>
<th>0000000000111011</th>
</tr>
</thead>
<tbody>
<tr>
<td>15</td>
<td>0</td>
</tr>
</tbody>
</table>

```
THROW BREAK;
```

**Exceptions**

BREAK
BSR label

**Description**

This instruction is a delayed unconditional branch used for branching to a subroutine. The 12-bit displacement \( s \) is sign-extended, doubled and added to PC + 4 to form the target address. The delay slot is executed and then the target address is copied to the PC. The address of the instruction immediately following the delay slot is copied to PR to indicate the return address.

**Operation**

BSR label

<table>
<thead>
<tr><th>1011</th><th>s</th></tr>
</thead>
<tbody>
<tr><td>15-12</td><td>11-0</td></tr>
</tbody>
</table>

\[
\begin{align*}
\text{pc} & \leftarrow \text{SignExtend}_{32}(\text{PC}); \\
\text{label} & \leftarrow \text{SignExtend}_{12}(s) \ll 1; \\
\text{IF} \ (\text{IsDelaySlot}()) \\
\quad \text{THROW ILLSLOT}; \\
\text{delayedpr} & \leftarrow \text{pc} + 4; \\
\text{temp} & \leftarrow \text{ZeroExtend}_{32}(\text{pc} + 4 + \text{label}); \\
\text{delayedpc} & \leftarrow \text{temp}; \\
\text{PR}'' & \leftarrow \text{Register}(\text{delayedpr}); \\
\text{PC}'' & \leftarrow \text{Register}(\text{delayedpc});
\end{align*}
\]

**Exceptions**

ILLSLOT

**Note**

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The delay slot is executed before branching. An ILLSLOT exception is raised if this instruction is executed in a delay slot.

The 'label' in the assembly syntax represents the immediate \( s \) after sign extension and scaling.

If the branch target address is invalid, the IADDERR trap is not delivered until after the instruction in the delay slot has executed and the PC has advanced to the target address; that is, the exception is associated with the target instruction, not the branch.
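
A BSR produces two results: the return address for PR and the new PC. The sketch below models both under the operation text above; the function name `bsr` and the tuple return convention are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF

def bsr(pc, opcode):
    """Return (pr, new_pc) for a BSR opcode located at `pc`."""
    if (opcode >> 12) != 0b1011:
        raise ValueError("not a BSR encoding")
    s = opcode & 0xFFF
    if s & 0x800:                        # sign-extend the 12-bit field
        s -= 0x1000
    pr = (pc + 4) & MASK32               # instruction after the delay slot
    new_pc = (pc + 4 + (s << 1)) & MASK32
    return pr, new_pc
```

For a BSR at 0x2000 with s = 0x010, the subroutine starts at 0x2024 and returns to 0x2004.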
BSRF Rn

Description

This instruction is a delayed unconditional branch used for branching to a far subroutine. The target address is calculated by adding \( R_n \) to PC+4. If the least significant bit of the target address is set, an IADDERR exception is raised, otherwise, the delay slot is executed. The address of the instruction immediately following the delay slot is copied to PR to indicate the return address.

Operation

\[
\text{BSRF Rn}
\]

```
0000 n 00000011
15 12 11  8  7  0
```

```
pc ← SignExtend32(PC);
op1 ← SignExtend32(Rn);
IF (IsDelaySlot())
    THROW ILLSLOT;
delayedpr ← pc + 4;
target ← ZeroExtend32(pc + 4 + op1);
delayedpc ← target ∧ (~ 0x1);
PR'' ← Register(delayedpr);
PC'' ← Register(delayedpc);
```

Exceptions

ILLSLOT

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The delay slot is executed before branching and before PR is updated. An ILLSLOT exception is raised if this instruction is executed in a delay slot.

If the branch target address is invalid, the IADDERR trap is not delivered until after the instruction in the delay slot has executed and the PC has advanced to the target address; that is, the exception is associated with the target instruction, not the branch.
BT label

Description
This instruction is a conditional branch. The 8-bit displacement $s$ is sign-extended, doubled and added to PC+4 to form the target address. If the T-bit is 0, the branch is not taken. If the T-bit is 1, the target address is copied to the PC.

Operation

BT label

<table>
<thead>
<tr><th>10001001</th><th>$s$</th></tr>
</thead>
<tbody>
<tr><td>15-8</td><td>7-0</td></tr>
</tbody>
</table>

$t \leftarrow \text{ZeroExtend}_1(T)$;
$pc \leftarrow \text{SignExtend}_{32}(PC)$;
$newpc \leftarrow \text{SignExtend}_{32}(PC')$;
$delayedpc \leftarrow \text{SignExtend}_{32}(PC'')$;
$label \leftarrow \text{SignExtend}_8(s) \ll 1$;
IF (IsDelaySlot())
    THROW ILLSLOT;
IF ($t = 1$)
{
    temp $\leftarrow \text{ZeroExtend}_{32}(pc + 4 + label)$;
    newpc $\leftarrow$ temp;
    delayedpc $\leftarrow$ temp + 2;
}
PC' $\leftarrow$ Register(newpc);
PC'' $\leftarrow$ Register(delayedpc);

Exceptions
ILLSLOT

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
This is not a delayed branch instruction. An ILLSLOT exception is raised if this instruction is executed in a delay slot.

The 'label' in the assembly syntax represents the immediate s after sign extension and scaling.

If the branch target address is invalid, the IADDERR trap is not delivered until after the branch instruction completes its execution and the PC has advanced to the target address; that is, the exception is associated with the target instruction, not the branch.
BT/S label

Description
This instruction is a delayed conditional branch. The 8-bit displacement s is sign-extended, doubled and added to PC+4 to form the target address. If the T-bit is 0, the branch is not taken. If the T-bit is 1, the delay slot is executed and then the target address is copied to the PC.

Operation

\[
\text{BT/S label}
\]

\[
\begin{array}{|c|c|}
\hline
10001101 & s \\
\hline
15 \quad 8 & 7 \quad 0 \\
\hline
\end{array}
\]

\[
\begin{array}{l}
t \leftarrow \text{ZeroExtend}_{1}(T); \\
pc \leftarrow \text{SignExtend}_{32}(PC); \\
delayedpc \leftarrow \text{SignExtend}_{32}(PC''); \\
label \leftarrow \text{SignExtend}_{8}(s) \ll 1; \\
\text{IF (IsDelaySlot())} \\
\quad \text{THROW ILLSLOT;} \\
\text{IF } (t = 1) \\
\text{\{} \\
\quad temp \leftarrow \text{ZeroExtend}_{32}(pc + 4 + label); \\
\quad delayedpc \leftarrow temp; \\
\text{\}} \\
PC'' \leftarrow \text{Register}(delayedpc);
\end{array}
\]

Exceptions
ILLSLOT

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The delay slot is executed before branching. An ILLSLOT exception is raised if this instruction is executed in a delay slot.

The ‘label’ in the assembly syntax represents the immediate s after sign extension and scaling.

If the branch target address is invalid, the IADDERR trap is not delivered until after the instruction in the delay slot has executed and the PC has advanced to the target address; that is, the exception is associated with the target instruction, not the branch.
CLRMAC

Description
This instruction clears MACL and MACH.

Operation

```
CLRMAC

0000000000101000

15
0

macl ← 0;
mach ← 0;
MACL ← ZeroExtend32(macl);
MACH ← ZeroExtend32(mach);
```
CLRS

Description
This instruction clears the S-bit.

Operation

CLRS

```
s ← 0;
S ← Bit(s);
```

```plaintext
0000000001001000
```
CLRT

Description
This instruction clears the T-bit.

Operation

CLRT

\[
\begin{array}{c}
\text{t} \leftarrow 0; \\
\text{T} \leftarrow \text{Bit}(t);
\end{array}
\]
CMP/EQ Rm, Rn

Description
This instruction sets the T-bit if the value of Rn is equal to the value of Rm, otherwise it clears the T-bit.

Operation

\[
\text{CMP/EQ Rm, Rn}
\]

\[
\begin{array}{cccccc}
0011 & n & m & 0000 \\
\hline
15 & 12 & 11 & 8 & 7 & 4 & 3 & 0 \\
\end{array}
\]

\[
\text{op1} \leftarrow \text{SignExtend}_{32}(Rm);
\text{op2} \leftarrow \text{SignExtend}_{32}(Rn);
\]

\[
t \leftarrow \text{INT}(op2 = op1);
T \leftarrow \text{Bit}(t);
\]

Note
**CMP/EQ #imm, R0**

**Description**
This instruction sets the T-bit if the value of R0 is equal to the sign-extended 8-bit immediate s, otherwise it clears the T-bit.

**Operation**

CMP/EQ #imm, R0

<table>
<thead>
<tr><th>10001000</th><th>s</th></tr>
</thead>
<tbody>
<tr><td>15-8</td><td>7-0</td></tr>
</tbody>
</table>

d0 ← SignExtend32(R0);
imm ← SignExtend8(s);
t ← INT (d0 = imm);
T ← Bit(t);

**Note**
The ‘#imm’ in the assembly syntax represents the immediate s after sign extension.
**CMP/GE Rm, Rn**

**Description**
This instruction sets the T-bit if the signed value of Rn is greater than or equal to the signed value of Rm, otherwise it clears the T-bit.

**Operation**

<table>
<thead>
<tr><th>0011</th><th>n</th><th>m</th><th>0011</th></tr>
</thead>
<tbody>
<tr><td>15-12</td><td>11-8</td><td>7-4</td><td>3-0</td></tr>
</tbody>
</table>

\[
\begin{array}{l}
op1 \leftarrow \text{SignExtend}_{32}(R_m); \\
op2 \leftarrow \text{SignExtend}_{32}(R_n); \\
t \leftarrow \text{INT}(op2 \geq op1); \\
T \leftarrow \text{Bit}(t);
\end{array}
\]

**Note**
CMP/GT Rm, Rn

Description
This instruction sets the T-bit if the signed value of Rn is greater than the signed value of Rm, otherwise it clears the T-bit.

Operation

<table>
<thead>
<tr>
<th></th>
<th>0011</th>
<th>n</th>
<th>m</th>
<th>0111</th>
</tr>
</thead>
<tbody>
<tr>
<td>15</td>
<td>12</td>
<td>11</td>
<td>8</td>
<td>7</td>
</tr>
</tbody>
</table>

\[
\text{op1} \leftarrow \text{SignExtend}_{32}(R_m);
\text{op2} \leftarrow \text{SignExtend}_{32}(R_n);
\text{t} \leftarrow \text{INT}\ (\text{op2} > \text{op1});
\text{T} \leftarrow \text{Bit}(t);
\]

Note
**CMP/HI Rm, Rn**

**Description**

This instruction sets the T-bit if the unsigned value of \( R_n \) is greater than the unsigned value of \( R_m \), otherwise it clears the T-bit.

**Operation**

\[
\begin{array}{l}
op1 \leftarrow \text{ZeroExtend}_{32}(\text{SignExtend}_{32}(R_m)); \\
op2 \leftarrow \text{ZeroExtend}_{32}(\text{SignExtend}_{32}(R_n)); \\
t \leftarrow \text{INT}(op2 > op1); \\
T \leftarrow \text{Bit}(t);
\end{array}
\]

**Note**
CMP/HS Rm, Rn

Description
This instruction sets the T-bit if the unsigned value of Rn is greater than or equal to
the unsigned value of Rm, otherwise it clears the T-bit.

Operation

<table>
<thead>
<tr><th>0011</th><th>n</th><th>m</th><th>0010</th></tr>
</thead>
<tbody>
<tr><td>15-12</td><td>11-8</td><td>7-4</td><td>3-0</td></tr>
</tbody>
</table>

\[
\text{op1} \leftarrow \text{ZeroExtend}_{32}(\text{SignExtend}_{32}(R_m)) ;
\text{op2} \leftarrow \text{ZeroExtend}_{32}(\text{SignExtend}_{32}(R_n)) ;
\text{t} \leftarrow \text{INT}(\text{op2} \geq \text{op1}) ;
\text{T} \leftarrow \text{Bit}(t) ;
\]

Note
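
The signed comparisons (CMP/GE, CMP/GT) and the unsigned comparisons (CMP/HI, CMP/HS) differ only in how the same 32-bit patterns are interpreted. A minimal Python sketch of the four T-bit results follows; the function names are hypothetical, and each function returns the value that would be written to T.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def cmp_ge(rm, rn):
    return int(to_signed32(rn) >= to_signed32(rm))

def cmp_gt(rm, rn):
    return int(to_signed32(rn) > to_signed32(rm))

def cmp_hs(rm, rn):
    return int((rn & MASK32) >= (rm & MASK32))

def cmp_hi(rm, rn):
    return int((rn & MASK32) > (rm & MASK32))
```

With Rn = 0xFFFFFFFF and Rm = 1, CMP/GT clears T because -1 is not greater than 1, whereas CMP/HI sets T because 4294967295 is greater than 1.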
**CMP/PL Rn**

**Description**

This instruction sets the T-bit if the signed value of \( R_n \) is greater than 0, otherwise it clears the T-bit.

**Operation**

\[
\begin{array}{|c|c|c|c|}
\hline
& 0100 & n & 00010101 \\
\hline
15 & 12 & 11 & 8 \quad 7 \quad 0 \\
\hline
\end{array}
\]

\[
\text{op1} \leftarrow \text{SignExtend}_{32}(R_n); \\
\text{t} \leftarrow \text{INT}(\text{op1} > 0); \\
T \leftarrow \text{Bit}(t);
\]

**Note**
**CMP/PZ Rn**

**Description**
This instruction sets the T-bit if the signed value of Rn is greater than or equal to 0, otherwise it clears the T-bit.

**Operation**

<table>
<thead>
<tr><th>0100</th><th>n</th><th>00010001</th></tr>
</thead>
<tbody>
<tr><td>15-12</td><td>11-8</td><td>7-0</td></tr>
</tbody>
</table>

op1 ← SignExtend32(Rn);

t ← INT (op1 ≥ 0);

T ← Bit(t);

**Note**
CMP/STR Rm, Rn

Description
This instruction sets the T-bit if any byte in Rn has the same value as the corresponding byte in Rm, otherwise it clears the T-bit.

Operation

<table>
<thead>
<tr><th>0010</th><th>n</th><th>m</th><th>1100</th></tr>
</thead>
<tbody>
<tr><td>15-12</td><td>11-8</td><td>7-4</td><td>3-0</td></tr>
</tbody>
</table>

\[
\begin{array}{l}
op1 \leftarrow \text{SignExtend}_{32}(R_m); \\
op2 \leftarrow \text{SignExtend}_{32}(R_n); \\
temp \leftarrow op1 \oplus op2; \\
t \leftarrow \text{INT}(temp_{<0 \text{ FOR } 8>} = 0); \\
t \leftarrow (\text{INT}(temp_{<8 \text{ FOR } 8>} = 0)) \lor t; \\
t \leftarrow (\text{INT}(temp_{<16 \text{ FOR } 8>} = 0)) \lor t; \\
t \leftarrow (\text{INT}(temp_{<24 \text{ FOR } 8>} = 0)) \lor t; \\
T \leftarrow \text{Bit}(t);
\end{array}
\]

Note
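
The byte-wise test works by XORing the operands and checking each 8-bit field of the result for zero. The Python sketch below mirrors that procedure; `cmp_str` is a hypothetical name and the return value stands for the T-bit.

```python
MASK32 = 0xFFFFFFFF

def cmp_str(rm, rn):
    """T-bit result of CMP/STR: 1 if any corresponding byte pair is equal."""
    temp = (rm ^ rn) & MASK32
    # A zero byte in the XOR means the two source bytes were identical.
    return int(any(((temp >> shift) & 0xFF) == 0 for shift in (0, 8, 16, 24)))
```

One common use is scanning a word for a terminating zero byte: comparing against a register holding 0 sets T if any byte of the other register is zero.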
DIV0S Rm, Rn

Description

This instruction initializes the divide-step state for a signed division. The Q-bit is initialized with the sign-bit of the dividend, and the M-bit with the sign-bit of the divisor. The T-bit is initialized to 0 if the Q-bit and the M-bit are the same, otherwise it is initialized to 1.

Operation

\[
\begin{array}{c|c|c|c|c|c}
  & 0010 & n & m & 0111 \\
\hline
15 & 12 & 11 & 8 & 7 & 4 & 3 & 0 \\
\end{array}
\]

\[
\begin{align*}
  \text{op1} & \leftarrow \text{SignExtend}_{32}(R_m) \\
  \text{op2} & \leftarrow \text{SignExtend}_{32}(R_n) \\
  q & \leftarrow \text{op2}_{<31 \text{ FOR } 1>} \\
  m & \leftarrow \text{op1}_{<31 \text{ FOR } 1>} \\
  t & \leftarrow m \oplus q \\
  Q & \leftarrow \text{Bit}(q) \\
  M & \leftarrow \text{Bit}(m) \\
  T & \leftarrow \text{Bit}(t)
\end{align*}
\]

Note
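
The flag initialization can be sketched as a small Python function. The name `div0s` and the `(q, m, t)` return order are assumptions made for this illustration.

```python
def div0s(rm, rn):
    """Return (Q, M, T) after DIV0S Rm, Rn."""
    q = (rn >> 31) & 1      # sign bit of the dividend
    m = (rm >> 31) & 1      # sign bit of the divisor
    t = m ^ q               # 1 when the signs differ
    return q, m, t
```

For a dividend of -10 (0xFFFFFFF6) and a divisor of 3, the result is Q = 1, M = 0, T = 1.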
**DIV0U**

**Description**
This instruction initializes the divide-step state for an unsigned division. The Q-bit, M-bit and T-bit are all set to 0.

**Operation**

```
DIV0U

\[
\begin{array}{c}
q \leftarrow 0; \\
m \leftarrow 0; \\
t \leftarrow 0; \\
Q \leftarrow \text{Bit}(q); \\
M \leftarrow \text{Bit}(m); \\
T \leftarrow \text{Bit}(t);
\end{array}
\]
```
DIV1 Rm, Rn

Description
This instruction is used to perform a single-bit divide-step for the division of a dividend held in Rn by a divisor held in Rm. The Q-bit, M-bit and T-bit are used to hold additional state through a divide-step sequence. Each DIV1 consumes 1 bit of the dividend from Rn and produces 1 bit of result. The divide initialization and step instructions do not detect divide-by-zero nor overflow. If required, these cases should be checked using additional instructions.

Operation

DIV1 Rm, Rn

<table>
<thead>
<tr><th>0011</th><th>n</th><th>m</th><th>0100</th></tr>
</thead>
<tbody>
<tr><td>15-12</td><td>11-8</td><td>7-4</td><td>3-0</td></tr>
</tbody>
</table>

\[
\begin{align*}
q & \leftarrow \text{ZeroExtend}_1(Q); \\
m & \leftarrow \text{ZeroExtend}_1(M); \\
t & \leftarrow \text{ZeroExtend}_1(T); \\
op1 & \leftarrow \text{ZeroExtend}_{32}(\text{SignExtend}_{32}(R_m)); \\
op2 & \leftarrow \text{ZeroExtend}_{32}(\text{SignExtend}_{32}(R_n)); \\
oldq & \leftarrow q; \\
q & \leftarrow \text{op2}_{<31 \text{ FOR } 1>}; \\
op2 & \leftarrow \text{ZeroExtend}_{32}(\text{op2} \ll 1) \lor t; \\
\text{IF } (\text{oldq} = m) \\
\quad \text{op2} & \leftarrow \text{op2} - \text{op1}; \\
\text{ELSE} \\
\quad \text{op2} & \leftarrow \text{op2} + \text{op1}; \\
q & \leftarrow (q \oplus m) \oplus \text{op2}_{<32 \text{ FOR } 1>}; \\
t & \leftarrow 1 - (q \oplus m); \\
R_n & \leftarrow \text{Register(op2)}; \\
Q & \leftarrow \text{Bit}(q); \\
T & \leftarrow \text{Bit}(t);
\end{align*}
\]

Note
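
To show how the divide step composes into a full division, the following Python sketch simulates an unsigned 32-bit by 32-bit divide built from DIV0U, ROTCL and DIV1. Several things here are assumptions rather than statements from this section: ROTCL is modelled as a rotate left through the T-bit, the loop of 32 ROTCL/DIV1 pairs followed by a final ROTCL is one commonly used sequence, and all function names are hypothetical. The `div1` function follows the operation text above line by line.

```python
MASK32 = 0xFFFFFFFF

def rotcl(reg, t):
    """Assumed ROTCL behavior: rotate left through T; returns (reg, t)."""
    return ((reg << 1) | t) & MASK32, (reg >> 31) & 1

def div1(rm, rn, q, m, t):
    """One DIV1 step; returns the new (Rn, Q, T)."""
    op1, op2 = rm & MASK32, rn & MASK32
    oldq = q
    q = (op2 >> 31) & 1                  # bit shifted out of Rn
    op2 = ((op2 << 1) & MASK32) | t      # shift in the next dividend bit
    op2 = op2 - op1 if oldq == m else op2 + op1
    q = (q ^ m) ^ ((op2 >> 32) & 1)      # bit 32 is the carry/borrow
    t = 1 - (q ^ m)                      # quotient bit for this step
    return op2 & MASK32, q, t

def udiv32(dividend, divisor):
    """Unsigned quotient of two 32-bit values; zero divisors are rejected here."""
    if not 0 < divisor <= MASK32:
        raise ValueError("divisor must be in 1..0xFFFFFFFF")
    q = m = t = 0                        # state after DIV0U
    high, low = 0, dividend & MASK32     # 64-bit dividend with a zero high word
    for _ in range(32):
        low, t = rotcl(low, t)           # next dividend bit into T, last quotient bit in
        high, q, t = div1(divisor, high, q, m, t)
    low, t = rotcl(low, t)               # collect the final quotient bit
    return low
```

The explicit divisor check reflects the statement above that the divide instructions themselves do not detect divide-by-zero.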
DMULS.L Rm, Rn

Description

This instruction multiplies the signed 32-bit value held in Rm with the signed 32-bit value held in Rn to give a full 64-bit result. The lower half of the result is placed in MACL and the upper half in MACH.

Operation

<table>
<thead>
<tr>
<th>0011</th>
<th>n</th>
<th>m</th>
<th>1101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15</td>
<td>12</td>
<td>11</td>
<td>8</td>
</tr>
</tbody>
</table>

op1 ← SignExtend32(Rm);
op2 ← SignExtend32(Rn);
mac ← op2 × op1;
macl ← mac;
mach ← mac >> 32;
MACL ← ZeroExtend32(macl);
MACH ← ZeroExtend32(mach);

Note
DMULU.L Rm, Rn

Description

This instruction multiplies the unsigned 32-bit value held in Rm with the unsigned 32-bit value held in Rn to give a full 64-bit result. The lower half of the result is placed in MACL and the upper half in MACH.

Operation

DMULU.L Rm, Rn

<table>
<thead>
<tr><th>0011</th><th>n</th><th>m</th><th>0101</th></tr>
</thead>
<tbody>
<tr><td>15-12</td><td>11-8</td><td>7-4</td><td>3-0</td></tr>
</tbody>
</table>

\[
\begin{align*}
\text{op1} & \leftarrow \text{ZeroExtend}_{32}(\text{SignExtend}_{32}(R_m)); \\
\text{op2} & \leftarrow \text{ZeroExtend}_{32}(\text{SignExtend}_{32}(R_n)); \\
\text{mac} & \leftarrow \text{op2} \times \text{op1}; \\
\text{macl} & \leftarrow \text{mac}; \\
\text{mach} & \leftarrow \text{mac} \gg 32; \\
\text{MACL} & \leftarrow \text{ZeroExtend}_{32}(\text{macl}); \\
\text{MACH} & \leftarrow \text{ZeroExtend}_{32}(\text{mach});
\end{align*}
\]

Note
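
DMULS.L and DMULU.L produce the same MACL for the same inputs but can differ in MACH, because the operands are interpreted differently before the 64-bit multiply. A minimal Python sketch, with hypothetical function names and `(MACL, MACH)` tuples as an assumed return convention:

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def dmuls_l(rm, rn):
    """Signed 32x32 -> 64 multiply; returns (MACL, MACH)."""
    mac = to_signed32(rn) * to_signed32(rm)
    return mac & MASK32, (mac >> 32) & MASK32

def dmulu_l(rm, rn):
    """Unsigned 32x32 -> 64 multiply; returns (MACL, MACH)."""
    mac = (rn & MASK32) * (rm & MASK32)
    return mac & MASK32, (mac >> 32) & MASK32
```

Multiplying 0xFFFFFFFF by 2 gives MACH = 0xFFFFFFFF with DMULS.L (the product is -2) and MACH = 1 with DMULU.L (the product is 0x1FFFFFFFE); MACL is 0xFFFFFFFE in both cases.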
**DT Rn**

**Description**

This instruction subtracts 1 from \( R_n \) and places the result in \( R_n \). The T-bit is set if the result is zero, otherwise the T-bit is cleared.

**Operation**

\[
\text{DT Rn}
\]

\[
\begin{array}{|c|c|c|}
\hline
0100 & n & 00010000 \\
\hline
15 \quad 12 & 11 \quad 8 & 7 \quad 0 \\
\hline
\end{array}
\]

\[
\text{op1} \leftarrow \text{SignExtend}_{32}(R_n); \\
\text{op1} \leftarrow \text{op1} - 1; \\
t \leftarrow \text{INT} (\text{op1} = 0); \\
R_n \leftarrow \text{Register}(\text{op1}); \\
T \leftarrow \text{Bit}(t);
\]

**Note**
EXTS.B Rm, Rn

Description
This instruction reads the 8 least significant bits of Rm, sign-extends, and places the result in Rn.

Operation

op1 ← SignExtend8(Rm);
op2 ← op1;
Rn ← Register(op2);

Note
**EXTS.W Rm, Rn**

**Description**

This instruction reads the 16 least significant bits of $R_m$, sign-extends, and places the result in $R_n$.

**Operation**

$$\text{EXTS.W } R_m, R_n$$

<table>
<thead>
<tr><th>0110</th><th>$n$</th><th>$m$</th><th>1111</th></tr>
</thead>
<tbody>
<tr><td>15-12</td><td>11-8</td><td>7-4</td><td>3-0</td></tr>
</tbody>
</table>

$$op1 \leftarrow \text{SignExtend}_{16}(R_m);$$
$$op2 \leftarrow op1;$$
$$R_n \leftarrow \text{Register}(op2);$$

**Note**
EXTU.B Rm, Rn

Description
This instruction reads the 8 least significant bits of R_m, zero-extends, and places the result in R_n.

Operation

EXTU.B Rm, Rn

<table>
<thead>
<tr><th>0110</th><th>n</th><th>m</th><th>1100</th></tr>
</thead>
<tbody>
<tr><td>15-12</td><td>11-8</td><td>7-4</td><td>3-0</td></tr>
</tbody>
</table>

op1 ← ZeroExtend8(R_m);
op2 ← op1;
R_n ← Register(op2);

Note
EXTU.W Rm, Rn

Description
This instruction reads the 16 least significant bits of Rm, zero-extends, and places the result in Rn.

Operation
EXTU.W Rm, Rn

\[
\begin{array}{|c|c|c|c|}
\hline
0110 & n & m & 1101 \\
\hline
15 \quad 12 & 11 \quad 8 & 7 \quad 4 & 3 \quad 0 \\
\hline
\end{array}
\]

\begin{align*}
op1 & \leftarrow \text{ZeroExtend}_{16}(R_m); \\
op2 & \leftarrow \text{op1}; \\
R_n & \leftarrow \text{Register}(\text{op2});
\end{align*}

Note
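
The four extension instructions differ only in field width and in whether the upper bits are filled with the sign bit or with zeros. A short Python sketch, with hypothetical function names that return the 32-bit value written to Rn:

```python
MASK32 = 0xFFFFFFFF

def exts_b(rm):
    """Sign-extend the low 8 bits of rm to 32 bits."""
    v = rm & 0xFF
    return (v - 0x100) & MASK32 if v & 0x80 else v

def exts_w(rm):
    """Sign-extend the low 16 bits of rm to 32 bits."""
    v = rm & 0xFFFF
    return (v - 0x10000) & MASK32 if v & 0x8000 else v

def extu_b(rm):
    """Zero-extend the low 8 bits of rm."""
    return rm & 0xFF

def extu_w(rm):
    """Zero-extend the low 16 bits of rm."""
    return rm & 0xFFFF
```

For Rm = 0x12345680, EXTS.B yields 0xFFFFFF80 while EXTU.B yields 0x00000080.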
FABS DRn

Description
This floating-point instruction computes the absolute value of a double-precision floating-point number. It reads DRn, clears the sign bit and places the result in DRn.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FABS DRn

<table>
<thead>
<tr><th>1111</th><th>n</th><th>0</th><th>01011101</th></tr>
</thead>
<tbody>
<tr><td>15-12</td><td>11-9</td><td>8</td><td>7-0</td></tr>
</tbody>
</table>

Available only when PR=1 and SZ=0

```
sr ← ZeroExtend32(SR);
op1 ← FloatValue64(DR2n);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
op1 ← FABS_D(op1);
DR2n ← FloatRegister64(op1);
```

Exceptions
SLOTFPUDIS, FPUDIS
FABS FRn

Description

This floating-point instruction computes the absolute value of a single-precision floating-point number. It reads FRn, clears the sign bit and places the result in FRn.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FABS FRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>01011101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15</td>
<td>12</td>
<td>11</td>
</tr>
<tr>
<td>8</td>
<td>7</td>
<td>0</td>
</tr>
</tbody>
</table>

Available only when PR=0

sr ← ZeroExtend32(SR);
op1 ← FloatValue32(FRn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op1 ← FABS_S(op1);
FRn ← FloatRegister32(op1);

Exceptions

SLOTFPUDIS, FPUDIS
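
Because FABS only clears the sign bit, it can be modelled as a mask on the raw bit pattern, which is also why NaN inputs pass through without signaling. The sketch below is illustrative; the function names are hypothetical, and Python's `struct` module is used only to convert host floats to IEEE 754 bit patterns.

```python
import struct

def float_to_bits32(x):
    """IEEE 754 single-precision bit pattern of a Python float."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def fabs_s(bits32):
    """Single-precision FABS: clear bit 31."""
    return bits32 & 0x7FFFFFFF

def fabs_d(bits64):
    """Double-precision FABS: clear bit 63."""
    return bits64 & 0x7FFFFFFFFFFFFFFF
```

Applied to the pattern for -1.5 (0xBFC00000), `fabs_s` returns the pattern for 1.5 (0x3FC00000).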
FADD DRm, DRn

Description
This floating-point instruction performs a double-precision floating-point addition. It adds DRm to DRn and places the result in DRn. The rounding mode is determined by FPSCR.RM.

Operation

FADD DRm, DRn

<table>
<thead>
<tr><th>1111</th><th>n</th><th>0</th><th>m</th><th>0</th><th>0000</th></tr>
</thead>
<tbody>
<tr><td>15-12</td><td>11-9</td><td>8</td><td>7-5</td><td>4</td><td>3-0</td></tr>
</tbody>
</table>

Available only when PR=1 and SZ=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue64(DR2m);
op2 ← FloatValue64(DR2n);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
     THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
     THROW FPUDIS;
op2, fps ← FADD_D(op1, op2, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
     THROW FPUEXC, fps;
IF (FpuCauseE(fps))
     THROW FPUEXC, fps;
IF ((FpuEnableI(fps) OR FpuEnableO(fps)) OR FpuEnableU(fps))
     THROW FPUEXC, fps;
DR2n ← FloatRegister64(op2);
FPSCR ← ZeroExtend32(fps);

Exceptions
SLOTFPUDIS, FPUDIS, FPUEXC
FADD FRm, FRn

Description
This floating-point instruction performs a single-precision floating-point addition. It adds FRm to FRn and places the result in FRn. The rounding mode is determined by FPSCR.RM.

Operation

<table>
<thead>
<tr>
<th>FADD FRm, FRn</th>
</tr>
</thead>
<tbody>
<tr>
<td>1111 n m 0000</td>
</tr>
<tr>
<td>15 12 11  8 7  4  3  0</td>
</tr>
</tbody>
</table>

Available only when PR=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue32(FRm);
op2 ← FloatValue32(FRn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
THROW FPUDIS;
op2, fps ← FADD_S(op1, op2, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
THROW FPUEXC, fps;
IF (FpuCauseE(fps))
THROW FPUEXC, fps;
IF ((FpuEnableI(fps) OR FpuEnableO(fps)) OR FpuEnableU(fps))
THROW FPUEXC, fps;
FRn ← FloatRegister32(op2);
FPSCR ← ZeroExtend32(fps);

Exceptions
SLOTFPUDIS, FPUDIS, FPUEXC
FADD Special Cases:

When FPSCR.DN is 1, a positive denormalized number is treated as +0 and a negative denormalized number as -0. This flush-to-zero treatment is applied before exception detection and special case handling.

Exceptional conditions are checked in the order given below. Execution of the instruction is terminated once any check detects an exceptional condition.

1. Disabled: an exception is raised if the FPU is disabled.
2. Invalid: an invalid operation is signaled if either input is a signaling NaN, or if the inputs are differently signed infinities.
3. Error: an FPU error is signaled if FPSCR.DN is zero, neither input is a NaN and either input is a denormalized number.
4. Inexact, underflow and overflow: these are checked together and can be signaled in combination. When inexact, underflow or overflow exceptions are requested by the user, an exception is always raised regardless of whether that condition arose.

If the instruction does not raise an exception, a result is generated according to the following table.

<table>
<thead>
<tr><th>op2 ↓ / op1 →</th><th>+NORM, -NORM</th><th>+0</th><th>-0</th><th>+INF</th><th>-INF</th><th>+DNORM, -DNORM</th><th>qNaN</th><th>sNaN</th></tr>
</thead>
<tbody>
<tr><td>+NORM, -NORM</td><td>ADD</td><td>op2</td><td>op2</td><td>+INF</td><td>-INF</td><td>n/a</td><td>qNaN</td><td>qNaN</td></tr>
<tr><td>+0</td><td>op1</td><td>+0</td><td>+0</td><td>+INF</td><td>-INF</td><td>n/a</td><td>qNaN</td><td>qNaN</td></tr>
<tr><td>-0</td><td>op1</td><td>+0</td><td>-0</td><td>+INF</td><td>-INF</td><td>n/a</td><td>qNaN</td><td>qNaN</td></tr>
<tr><td>+INF</td><td>+INF</td><td>+INF</td><td>+INF</td><td>+INF</td><td>qNaN</td><td>n/a</td><td>qNaN</td><td>qNaN</td></tr>
<tr><td>-INF</td><td>-INF</td><td>-INF</td><td>-INF</td><td>qNaN</td><td>-INF</td><td>n/a</td><td>qNaN</td><td>qNaN</td></tr>
<tr><td>+DNORM, -DNORM</td><td>n/a</td><td>n/a</td><td>n/a</td><td>n/a</td><td>n/a</td><td>n/a</td><td>qNaN</td><td>qNaN</td></tr>
<tr><td>qNaN</td><td>qNaN</td><td>qNaN</td><td>qNaN</td><td>qNaN</td><td>qNaN</td><td>qNaN</td><td>qNaN</td><td>qNaN</td></tr>
<tr><td>sNaN</td><td>qNaN</td><td>qNaN</td><td>qNaN</td><td>qNaN</td><td>qNaN</td><td>qNaN</td><td>qNaN</td><td>qNaN</td></tr>
</tbody>
</table>

FPU error is indicated by heavy shading and always raises an exception. Invalid operations are indicated by light shading and raise an exception if enabled. FPU disabled, inexact, underflow and overflow cases are not shown.

The behavior of the normal ‘ADD’ case is described by the IEEE 754 specification.
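
The ordering of the first three FADD checks can be sketched as a small classifier. This is a simplified model under stated assumptions: operands are represented by class strings such as "+NORM" or "sNaN" (a convention invented for this example), the function name `fadd_first_condition` is hypothetical, and the inexact, underflow and overflow step is not modelled.

```python
NANS = {"qNaN", "sNaN"}

def flush(cls, dn):
    """Apply the FPSCR.DN = 1 flush-to-zero treatment to an operand class."""
    if dn == 1 and cls in ("+DNORM", "-DNORM"):
        return "+0" if cls[0] == "+" else "-0"
    return cls

def fadd_first_condition(fpu_disabled, dn, op1, op2):
    """Return the first exceptional condition detected, or None."""
    if fpu_disabled:
        return "disabled"                            # check 1
    op1, op2 = flush(op1, dn), flush(op2, dn)        # flush before detection
    if "sNaN" in (op1, op2) or {op1, op2} == {"+INF", "-INF"}:
        return "invalid"                             # check 2
    has_nan = op1 in NANS or op2 in NANS
    has_dnorm = op1.endswith("DNORM") or op2.endswith("DNORM")
    if dn == 0 and not has_nan and has_dnorm:
        return "error"                               # check 3
    return None
```

Note that a signaling NaN paired with a denormalized number reports "invalid", not "error", because the invalid check comes first and the error check excludes NaN inputs.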
**FCMP/EQ DRm, DRn**

**Description**

This floating-point instruction performs a double-precision floating-point equality comparison. It sets the T-bit to 1 if DRm is equal to DRn, and otherwise sets the T-bit to 0.

**Operation**

<table>
<thead>
<tr>
<th></th>
<th>1111</th>
<th>n</th>
<th>0</th>
<th>m</th>
<th>00100</th>
</tr>
</thead>
<tbody>
<tr>
<td>15</td>
<td>12</td>
<td>11</td>
<td>9</td>
<td>7</td>
<td>5</td>
</tr>
</tbody>
</table>

Available only when PR=1 and SZ=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue64(DR2m);
op2 ← FloatValue64(DR2n);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
t, fps ← FCMP_EQ_D(op1, op2, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
    THROW FPUEXC, fps;
FPSCR ← ZeroExtend32(fps);
T ← Bit(t);

**Exceptions**

SLOTFPUDIS, FPUDIS, FPUEXC
FCMP/EQ FRm, FRn

Description

This floating-point instruction performs a single-precision floating-point equality comparison. It sets the T-bit to 1 if FRm is equal to FRn, and otherwise sets the T-bit to 0.

Operation

<table>
<thead>
<tr>
<th></th>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>0100</th>
</tr>
</thead>
<tbody>
<tr>
<td>15</td>
<td>12</td>
<td>11</td>
<td>7</td>
<td>4</td>
</tr>
</tbody>
</table>

Available only when PR=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue32(FRm);
op2 ← FloatValue32(FRn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
t, fps ← FCMP_EQ_S(op1, op2, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
    THROW FPUEXC, fps;
FPSCR ← ZeroExtend32(fps);
T ← Bit(t);

Exceptions

SLOTFPUDIS, FPUDIS, FPUEXC

FCMP/EQ Special Cases:

When FPSCR.DN is 1, a positive denormalized number is treated as +0 and a negative denormalized number as -0. This flush-to-zero treatment is applied before exception detection and special case handling.
Exceptional conditions are checked in the order given below. Execution of the instruction is terminated once any check detects an exceptional condition.

1. Disabled: an exception is raised if the FPU is disabled.
2. Invalid: an invalid operation is signaled if either input is a signaling NaN.

If the instruction does not raise an exception, a result is generated according to the following table.

<table>
<thead>
<tr>
<th>op1 →</th>
<th>+NORM, -NORM</th>
<th>+0</th>
<th>-0</th>
<th>+INF</th>
<th>-INF</th>
<th>+DNORM, -DNORM</th>
<th>qNaN</th>
<th>sNaN</th>
</tr>
</thead>
<tbody>
<tr>
<td>+NORM</td>
<td>CMPEQ</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>+0</td>
<td>false</td>
<td>true</td>
<td>true</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>-0</td>
<td>false</td>
<td>true</td>
<td>true</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>+INF</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>true</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>-INF</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>true</td>
<td>false</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>+DNORM</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>true</td>
<td>false</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>qNaN</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>sNaN</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
</tr>
</tbody>
</table>

Invalid operations are indicated by light shading and raise an exception if enabled. FPU disabled cases are not shown.

The behavior of the normal ‘CMPEQ’ case is described by the IEEE 754 specification.
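
The equality result for single-precision operands can be sketched directly on IEEE 754 bit patterns. The sketch covers only the T-bit result: signaling of invalid operations is not modelled, any NaN is treated alike, and the name `fcmp_eq_s` and its `dn` parameter (standing for FPSCR.DN) are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF

def fcmp_eq_s(bits1, bits2, dn=0):
    """T-bit result of a single-precision equality compare on raw bit patterns."""
    def is_nan(b):
        return (b & 0x7F800000) == 0x7F800000 and (b & 0x007FFFFF) != 0

    def flush(b):
        # With DN = 1, a denormalized number is treated as a zero of the same sign.
        if dn == 1 and (b & 0x7F800000) == 0:
            return b & 0x80000000
        return b

    a, b = flush(bits1 & MASK32), flush(bits2 & MASK32)
    if is_nan(a) or is_nan(b):
        return 0                        # NaN never compares equal
    if ((a | b) & 0x7FFFFFFF) == 0:
        return 1                        # +0 and -0 compare equal
    return int(a == b)
```

For instance, +0 (0x00000000) and -0 (0x80000000) compare equal, while a NaN pattern compared with itself gives 0.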
FCMP/GT DRm, DRn

Description
This floating-point instruction performs a double-precision floating-point greater-than comparison. It sets the T-bit to 1 if DRn is greater than DRm, and otherwise sets the T-bit to 0.

Operation

\[
\text{FCMP/GT DRm, DRn}
\]

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>0</th>
<th>m</th>
<th>00101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-5</td>
<td>4-0</td>
</tr>
</tbody>
</table>

Available only when PR=1 and SZ=0

\[
\begin{align*}
sr & \leftarrow \text{ZeroExtend}_{32}(SR); \\
fps & \leftarrow \text{ZeroExtend}_{32}(FPSCR); \\
op1 & \leftarrow \text{FloatValue}_{64}(DR_{2m}); \\
op2 & \leftarrow \text{FloatValue}_{64}(DR_{2n}); \\
\text{IF} \ (\text{FpuIsDisabled}(sr) \ \text{AND} \ \text{IsDelaySlot}()) \\
& \quad \text{THROW SLOTFPUDIS}; \\
\text{IF} \ (\text{FpuIsDisabled}(sr)) \\
& \quad \text{THROW FPUDIS}; \\
t, fps & \leftarrow \text{FCMPGT}_D(op2, op1, fps); \\
\text{IF} \ (\text{FpuEnableV}(fps) \ \text{AND} \ \text{FpuCauseV}(fps)) \\
& \quad \text{THROW FPUEXC, fps}; \\
FPSCR & \leftarrow \text{ZeroExtend}_{32}(fps); \\
T & \leftarrow \text{Bit}(t);
\end{align*}
\]

Exceptions
SLOTFPUDIS, FPUDIS, FPUEXC
FCMP/GT FRm, FRn

Description

This floating-point instruction performs a single-precision floating-point greater-than comparison. It sets the T-bit to 1 if FRn is greater than FRm, and otherwise sets the T-bit to 0.

Operation

```
1111  n  m  0101
15  12  11  8  7  4  3  0
```

Available only when PR=0

```
sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue32(FRm);
op2 ← FloatValue32(FRn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
t, fps ← FCMPGT_S(op2, op1, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
    THROW FPUEXC, fps;
FPSCR ← ZeroExtend32(fps);
T ← Bit(t);
```

Exceptions

SLOTFPUDIS, FPUDIS, FPUEXC
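
The 16-bit encoding shown above (1111 in bits 15 to 12, n in bits 11 to 8, m in bits 7 to 4, 0101 in bits 3 to 0) can be packed and unpacked as in the following minimal sketch. The function names are illustrative only and are not part of the architecture definition.

```python
def encode_fcmp_gt_single(m: int, n: int) -> int:
    """Pack FCMP/GT FRm, FRn as the bit pattern 1111 nnnn mmmm 0101."""
    if not (0 <= m <= 15 and 0 <= n <= 15):
        raise ValueError("register numbers must be in 0..15")
    return (0xF << 12) | (n << 8) | (m << 4) | 0x5


def decode_fcmp_gt_single(opcode: int):
    """Return (m, n) if the opcode matches the pattern, otherwise None."""
    if opcode & 0xF00F != 0xF005:
        return None
    return (opcode >> 4) & 0xF, (opcode >> 8) & 0xF
```

For example, `encode_fcmp_gt_single(1, 2)` yields `0xF215`, and decoding that value gives back `(1, 2)`.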

FCMP/GT Special Cases:

When FPSCR.DN is 1, a positive denormalized number is treated as +0 and a negative denormalized number as -0. This flush-to-zero treatment is applied before exception detection and special case handling.
Exceptional conditions are checked in the order given below. Execution of the instruction is terminated once any check detects an exceptional condition.

1. Disabled: an exception is raised if the FPU is disabled.
2. Invalid: an invalid operation is signaled if either input is a NaN.

If the instruction does not raise an exception, a result is generated according to the following table.

<table>
<thead>
<tr>
<th>op2 →</th>
<th>+NORM, -NORM</th>
<th>+0</th>
<th>-0</th>
<th>+INF</th>
<th>-INF</th>
<th>+DNORM, -DNORM</th>
<th>qNaN</th>
<th>sNaN</th>
</tr>
</thead>
<tbody>
<tr>
<td>+NORM, -NORM</td>
<td>CMPGT</td>
<td>CMPGT</td>
<td>CMPGT</td>
<td>true</td>
<td>false</td>
<td>CMPGT</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>+0</td>
<td>CMPGT</td>
<td>false</td>
<td>false</td>
<td>true</td>
<td>false</td>
<td>CMPGT</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>-0</td>
<td>CMPGT</td>
<td>false</td>
<td>false</td>
<td>true</td>
<td>false</td>
<td>CMPGT</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>+INF</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>-INF</td>
<td>true</td>
<td>true</td>
<td>true</td>
<td>true</td>
<td>false</td>
<td>true</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>+,-DNORM</td>
<td>CMPGT</td>
<td>CMPGT</td>
<td>CMPGT</td>
<td>true</td>
<td>false</td>
<td>CMPGT</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>qNaN</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
</tr>
<tr>
<td>sNaN</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
<td>false</td>
</tr>
</tbody>
</table>

Invalid operations are indicated by light shading and raise an exception if enabled. FPU disabled cases are not shown.

The behavior of the normal 'CMPGT' case is described by the IEEE 754 specification.
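
The special cases above can be modelled in a few lines. This is a sketch under stated assumptions, not the hardware algorithm: operands are Python floats that hold single-precision values, signaling and quiet NaNs are not distinguished (either kind signals invalid for FCMP/GT), and the names `flush_denormal` and `fcmp_gt` are hypothetical.

```python
import math
import struct


def flush_denormal(x: float) -> float:
    """Replace a single-precision denormal by a zero of the same sign (DN=1)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    if (bits & 0x7F800000) == 0 and (bits & 0x007FFFFF) != 0:
        return -0.0 if bits & 0x80000000 else 0.0
    return x


def fcmp_gt(frm: float, frn: float, dn: bool = False):
    """Return (t, invalid): t is the T-bit value, invalid the cause flag."""
    if dn:
        # Flush-to-zero is applied before exception detection.
        frm, frn = flush_denormal(frm), flush_denormal(frn)
    if math.isnan(frm) or math.isnan(frn):
        # Any NaN input signals invalid; the table result is false.
        return False, True
    return frn > frm, False
```

With DN=0 the smallest denormal compares greater than +0, while with DN=1 it is flushed and the comparison yields false.
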
**FCNVDS DRm, FPUL**

**Description**

This floating-point instruction performs a double-precision to single-precision floating-point conversion. It reads a double-precision value from DRm, converts it to single-precision and places the result in FPUL. The rounding mode is determined by FPSCR.RM.

**Operation**

```
FCNVDS DRm, FPUL
```

<table>
<thead>
<tr>
<th>1111</th>
<th>m</th>
<th>010111101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8-0</td>
</tr>
</tbody>
</table>

Available only when PR=1 and SZ=0

```
sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue64(DR2m);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
fpul, fps ← FCNV_DS(op1, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
    THROW FPUEXC, fps;
IF (FpuCauseE(fps))
    THROW FPUEXC, fps;
IF ((FpuEnableI(fps) OR FpuEnableO(fps)) OR FpuEnableU(fps))
    THROW FPUEXC, fps;
FPSCR ← ZeroExtend32(fps);
FPUL ← ZeroExtend32(fpul);
```

**Exceptions**

SLOTFPUDIS, FPUDIS, FPUEXC
FCNVSD FPUL, DRn

**Description**

This floating-point instruction performs a single-precision to double-precision floating-point conversion. It reads a single-precision value from FPUL, converts it to double-precision and places the result in DRn. FPSCR.RM has no effect since the conversion is exact.

**Operation**

\[
\text{FCNVSD FPUL, DRn}
\]

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>010101101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8-0</td>
</tr>
</tbody>
</table>

Available only when PR=1 and SZ=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
fpul ← SignExtend32(FPUL);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op1, fps ← FCNV_SD(fpul, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
    THROW FPUEXC, fps;
IF (FpuCauseE(fps))
    THROW FPUEXC, fps;
DR2n ← FloatRegister64(op1);
FPSCR ← ZeroExtend32(fps);

**Exceptions**

SLOTFPUDIS, FPUDIS, FPUEXC
FCNVDS and FCNVSD Special Cases:

When FPSCR.DN is 1, a positive denormalized number is treated as +0 and a negative denormalized number as -0. This flush-to-zero treatment is applied before exception detection and special case handling.

Exceptional conditions are checked in the order given below. Execution of the instruction is terminated once any check detects an exceptional condition.

1. Disabled: an exception is raised if the FPU is disabled.
2. Invalid: an invalid operation is signaled if the input is a signaling NaN.
3. Error: an FPU error is signaled if FPSCR.DN is zero and the input is a denormalized number.
4. Inexact, underflow and overflow: these are checked together and can be signaled in combination. These cases occur for FCNVDS but not for FCNVSD. When inexact, underflow or overflow exceptions are requested by the user, an exception is always raised for FCNVDS regardless of whether that condition arose.

If the instruction does not raise an exception, a result is generated according to the following table.

<table>
<thead>
<tr>
<th>op1 →</th>
<th>+NORM</th>
<th>-NORM</th>
<th>+0</th>
<th>-0</th>
<th>+INF</th>
<th>-INF</th>
<th>+DNORM, -DNORM</th>
<th>qNaN</th>
<th>sNaN</th>
</tr>
</thead>
<tbody>
<tr>
<td>Result</td>
<td>CNV</td>
<td>CNV</td>
<td>+0</td>
<td>-0</td>
<td>+INF</td>
<td>-INF</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
</tbody>
</table>

FPU error is indicated by heavy shading and always raises an exception. Invalid operations are indicated by light shading and raise an exception if enabled. FPU disabled, inexact, underflow and overflow cases are not shown.

The behavior of the normal ‘CNV’ case is described by the IEEE 754 specification.
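
As an illustration of the inexact and overflow conditions that FCNVDS can signal, the sketch below narrows a double to single precision using Python's `struct` module. It assumes round-to-nearest only, does not model underflow detection, denormalized inputs or signaling NaNs, and the name `fcnvds_nearest` is hypothetical.

```python
import math
import struct


def fcnvds_nearest(x: float):
    """Narrow a double to single precision.

    Returns (result, inexact, overflow) for round-to-nearest.
    """
    if math.isnan(x) or math.isinf(x):
        return x, False, False
    try:
        y = struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        # Magnitude exceeds the single-precision range: signed infinity.
        return math.copysign(math.inf, x), True, True
    # The conversion is inexact when the narrowed value differs from x.
    return y, y != x, False
```

A value such as 1.5 converts exactly, 0.1 is inexact, and 1e40 overflows to +INF. The opposite conversion (FCNVSD) is always exact because every single-precision value is representable in double precision.
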
FDIV DRm, DRn

Description

This floating-point instruction performs a double-precision floating-point division. It divides DRn by DRm and places the result in DRn. The rounding mode is determined by FPSCR.RM.

Operation

FDIV DRm, DRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>0</th>
<th>m</th>
<th>00011</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-5</td>
<td>4-0</td>
</tr>
</tbody>
</table>

Available only when PR=1 and SZ=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue64(DR2m);
op2 ← FloatValue64(DR2n);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
  THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
  THROW FPUDIS;
op2, fps ← FDIV_D(op2, op1, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
  THROW FPUEXC, fps;
IF (FpuEnableZ(fps) AND FpuCauseZ(fps))
  THROW FPUEXC, fps;
IF (FpuCauseE(fps))
  THROW FPUEXC, fps;
IF ((FpuEnableI(fps) OR FpuEnableO(fps)) OR FpuEnableU(fps))
  THROW FPUEXC, fps;
DR2n ← FloatRegister64(op2);
FPSCR ← ZeroExtend32(fps);

Exceptions

SLOTFPUDIS, FPUDIS, FPUEXC
FDIV FRm, FRn

Description
This floating-point instruction performs a single-precision floating-point division. It divides FRn by FRm and places the result in FRn. The rounding mode is determined by FPSCR.RM.

Operation

FDIV FRm, FRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>0011</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue32(FRm);
op2 ← FloatValue32(FRn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op2, fps ← FDIV_S(op2, op1, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
    THROW FPUEXC, fps;
IF (FpuEnableZ(fps) AND FpuCauseZ(fps))
    THROW FPUEXC, fps;
IF (FpuCauseE(fps))
    THROW FPUEXC, fps;
IF ((FpuEnableI(fps) OR FpuEnableO(fps)) OR FpuEnableU(fps))
    THROW FPUEXC, fps;
FRn ← FloatRegister32(op2);
FPSCR ← ZeroExtend32(fps);

Exceptions
SLOTFPUDIS, FPUDIS, FPUEXC
**FDIV Special Cases:**

When FPSCR.DN is 1, a positive denormalized number is treated as +0 and a negative denormalized number as -0. This flush-to-zero treatment is applied before exception detection and special case handling.

Exceptional conditions are checked in the order given below. Execution of the instruction is terminated once any check detects an exceptional condition.

1. **Disabled:** an exception is raised if the FPU is disabled.
2. **Invalid:** an invalid operation is signaled if either input is a signaling NaN, or if the division is of a zero by a zero, or of an infinity by an infinity.
3. **Divide-by-zero:** a divide-by-zero is signaled if the divisor is zero and the dividend is a finite non-zero number.
4. **Error:** an FPU error is signaled if FPSCR.DN is zero, neither input is a NaN and either of the following conditions is true: the divisor is a denormalized number, or the dividend is a denormalized number and the divisor is not a zero.
5. **Inexact, underflow and overflow:** these are checked together and can be signaled in combination. When inexact, underflow or overflow exceptions are requested by the user, an exception is always raised regardless of whether that condition arose.

If the instruction does not raise an exception, a result is generated as follows:

<table>
<thead>
<tr>
<th>op2 → / op1 ↓</th>
<th>+NORM, -NORM</th>
<th>+0</th>
<th>-0</th>
<th>+INF</th>
<th>-INF</th>
<th>+DNORM, -DNORM</th>
<th>qNaN</th>
<th>sNaN</th>
</tr>
</thead>
<tbody>
<tr>
<td>+NORM, -NORM</td>
<td>DIV</td>
<td>+0, -0</td>
<td>-0, +0</td>
<td>+INF, -INF</td>
<td>-INF, +INF</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>+0</td>
<td>+INF, -INF</td>
<td>qNaN</td>
<td>qNaN</td>
<td>+INF</td>
<td>-INF</td>
<td>+INF, -INF</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>-0</td>
<td>-INF, +INF</td>
<td>qNaN</td>
<td>qNaN</td>
<td>-INF</td>
<td>+INF</td>
<td>-INF, +INF</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>+INF</td>
<td>+0, -0</td>
<td>+0</td>
<td>-0</td>
<td>qNaN</td>
<td>qNaN</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>-INF</td>
<td>-0, +0</td>
<td>-0</td>
<td>+0</td>
<td>qNaN</td>
<td>qNaN</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>+DNORM, -DNORM</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>sNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
</tbody>
</table>

FPU error is indicated by heavy shading and always raises an exception. Invalid operations and divide-by-zero are indicated by light shading and raise an exception if enabled. FPU disabled, inexact, underflow and overflow cases are not shown.
The behavior of the normal 'DIV' case is described by the IEEE 754 specification.
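
The order of the FDIV checks can be expressed as a small decision function. The sketch below is illustrative only: operands are given as hypothetical class labels ("NORM", "ZERO", "INF", "DNORM", "QNAN", "SNAN") rather than bit patterns, and the final inexact, underflow and overflow stage is not modelled.

```python
def fdiv_condition(dividend: str, divisor: str,
                   dn: bool = False, fpu_disabled: bool = False) -> str:
    """Return the first exceptional condition detected for dividend / divisor."""
    # 1. Disabled
    if fpu_disabled:
        return "disabled"
    if dn:
        # Flush-to-zero is applied before exception detection.
        dividend = "ZERO" if dividend == "DNORM" else dividend
        divisor = "ZERO" if divisor == "DNORM" else divisor
    classes = (dividend, divisor)
    # 2. Invalid: signaling NaN, zero by zero, infinity by infinity
    if "SNAN" in classes:
        return "invalid"
    if dividend == divisor == "ZERO" or dividend == divisor == "INF":
        return "invalid"
    # 3. Divide-by-zero: zero divisor, finite non-zero dividend
    if divisor == "ZERO" and dividend in ("NORM", "DNORM"):
        return "divide-by-zero"
    # 4. Error: denormalized input (DN=0) and no NaN input
    if "QNAN" not in classes and (
        divisor == "DNORM" or (dividend == "DNORM" and divisor != "ZERO")
    ):
        return "error"
    return "none"
```

Note that a denormalized dividend with a zero divisor reports divide-by-zero rather than FPU error, because the divide-by-zero check comes first.
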
FIPR FVm, FVn

Description
This floating-point instruction computes the dot-product of two vectors, FVm and FVn, and places the result in element 3 of FVn. Each vector contains four single-precision floating-point values. The dot-product is specified as:

\[ FR_{n+3} = \sum_{i=0}^{3} FR_{m+i} \times FR_{n+i} \]

This is an approximate computation. The specified error in the result value is:

\[ \text{spec\_error} = \begin{cases} 0 & \text{if } e_{pm} = e_{z} \\ 2^{e_{pm} - 24} + 2^{E - 24 + r_m} & \text{if } e_{pm} \neq e_{z} \end{cases} \]

where

\[ r_m = \begin{cases} 0 & \text{if round-to-nearest} \\ 1 & \text{if round-to-zero} \end{cases} \]

\[ E = \text{unbiased exponent value of the result} \]

\[ e_{z} < -252 \]

\[ e_{pm} = \max(e_{p0}, e_{p1}, e_{p2}, e_{p3}) \]

\[ e_{pi} = \text{pre-normalized exponent of the product of } FR_{m+i} \text{ and } FR_{n+i} \]

\[ e_{FR_{m+i}} = \text{biased exponent value of } FR_{m+i} \]

\[ e_{FR_{n+i}} = \text{biased exponent value of } FR_{n+i} \]

\[ e_{pi} = \begin{cases} e_{z} & \text{if } (FR_{m+i} = 0.0) \text{ OR } (FR_{n+i} = 0.0) \\ \max(e_{FR_{m+i}}, 1) + \max(e_{FR_{n+i}}, 1) - 254 & \text{otherwise} \end{cases} \]
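
The error bound above can be evaluated directly. The following sketch assumes that the inputs are Python floats holding single-precision values, that the unbiased exponent E of the result is supplied by the caller, and that e_z may be represented by any value below -252 (here -253). All names are hypothetical.

```python
import struct

E_Z = -253  # assumption: any value below -252 serves as the "zero" marker


def biased_exponent(x: float) -> int:
    """Return the 8-bit biased exponent field of a single-precision value."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return (bits >> 23) & 0xFF


def product_exponent(a: float, b: float) -> int:
    """Pre-normalized exponent e_pi of the product a * b."""
    if a == 0.0 or b == 0.0:
        return E_Z
    return max(biased_exponent(a), 1) + max(biased_exponent(b), 1) - 254


def spec_error(fvm, fvn, result_exponent: int, round_to_zero: bool = False) -> float:
    """Specified error for the dot-product of two 4-element vectors."""
    e_pm = max(product_exponent(a, b) for a, b in zip(fvm, fvn))
    if e_pm == E_Z:
        return 0.0
    r_m = 1 if round_to_zero else 0
    return 2.0 ** (e_pm - 24) + 2.0 ** (result_exponent - 24 + r_m)
```

For two vectors of four 1.0 values, each e_pi is 127 + 127 - 254 = 0 and the result 4.0 has E = 2, so the bound in round-to-nearest is 2^-24 + 2^-22.
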
Operation

FIPR FVm, FVn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>11101101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-10</td>
<td>9-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValueVector32(FV4m);
op2 ← FloatValueVector32(FV4n);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
  THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
  THROW FPUDIS;
op2[3], fps ← FIPR_S(op1, op2, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
  THROW FPUEXC, fps;
IF ((FpuEnableI(fps) OR FpuEnableO(fps)) OR FpuEnableU(fps))
  THROW FPUEXC, fps;
FV4n ← FloatRegisterVector32(op2);
FPSCR ← ZeroExtend32(fps);

Exceptions

SLOTFPUDIS, FPUDIS, FPUEXC

FIPR Special Cases:

FIPR is an approximate instruction. Denormalized numbers are supported:

- When FPSCR.DN is 0, denormalized numbers are treated as their denormalized value in the FIPR_S calculation. This instruction never signals an FPU error.
- When FPSCR.DN is 1, a positive denormalized number is treated as +0 and a negative denormalized number as -0. This flush-to-zero treatment is applied before exception detection and special case handling.

Exceptional conditions are checked in the order given below. Execution of the instruction is terminated once any check detects an exceptional condition.
1. Disabled: an exception is raised if the FPU is disabled.

2. Invalid: an invalid operation is signaled if any of the following arise:
   - Any of the inputs is a signaling NaN.
   - Multiplication of a zero by an infinity.
   - Addition of differently signed infinities where none of the inputs is a qNaN.

   The multiplication is performed with sufficient precision to avoid overflow, and therefore the multiplication of any two finite numbers does not produce an infinity. The multiplication result will be an infinity only if there is a multiplication of an infinity with a normalized number, an infinity with a denormalized number or an infinity with an infinity.

   The addition of differently signed infinities is detected if there is (at least) one positive infinity and (at least) one negative infinity in the set of 4 multiplication results.

3. Inexact, underflow and overflow: these are checked together and can be signaled in combination. This is an approximate instruction and inexact is signaled except where special cases occur. Precise details of the approximate inner-product algorithm, including the detection of underflow and overflow cases, are implementation dependent. When inexact, underflow or overflow exceptions are requested by the user, an exception is always raised regardless of whether that condition arose.

   If the instruction does not raise an exception, a result is generated according to the following tables. Where the behavior is not a special case, the instruction computes an approximate result using an implementation-dependent algorithm.
FIPR Special Cases (continued):

Each of the 4 pairs of multiplication operands (op1 and op2) is selected from corresponding elements of the two 4-element source vectors and multiplied:

<table>
<thead>
<tr>
<th>op1 → / op2 ↓</th>
<th>+,-NORM, +,-DENORM</th>
<th>+0</th>
<th>-0</th>
<th>+INF</th>
<th>-INF</th>
<th>qNaN</th>
<th>sNaN</th>
</tr>
</thead>
<tbody>
<tr>
<td>+,-NORM, +,-DENORM</td>
<td>FIPRMUL</td>
<td>+0, -0</td>
<td>-0, +0</td>
<td>+INF, -INF</td>
<td>-INF, +INF</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>+0</td>
<td>+0, -0</td>
<td>+0</td>
<td>-0</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>-0</td>
<td>-0, +0</td>
<td>-0</td>
<td>+0</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>+INF</td>
<td>+INF, -INF</td>
<td>qNaN</td>
<td>qNaN</td>
<td>+INF</td>
<td>-INF</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>-INF</td>
<td>-INF, +INF</td>
<td>qNaN</td>
<td>qNaN</td>
<td>-INF</td>
<td>+INF</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>sNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
</tbody>
</table>

If any of the multiplications evaluates to qNaN, then the result of the instruction is qNaN and no further analysis need be performed. In the 'FIPRMUL', +0, -0, +INF and -INF cases, the 4 addition operands (labelled intermediate 0 to 3) are summed:

<table>
<thead>
<tr>
<th>Intermediates 0 to 3 contain</th>
<th>Result</th>
</tr>
</thead>
<tbody>
<tr>
<td>only FIPRMUL, +0 or -0 values</td>
<td>FIPRADD</td>
</tr>
<tr>
<td>at least one +INF and no -INF</td>
<td>+INF</td>
</tr>
<tr>
<td>at least one -INF and no +INF</td>
<td>-INF</td>
</tr>
<tr>
<td>at least one +INF and at least one -INF</td>
<td>qNaN</td>
</tr>
</tbody>
</table>

Inexact is signaled in the 'FIPRADD' case. Invalid operations are indicated by light shading and raise an exception if enabled. FPU disabled, inexact, underflow and overflow cases are not shown.
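
The special-case resolution described above can be sketched as follows. This is a simplified model and not the hardware algorithm: inputs are Python floats, so signaling NaNs cannot be represented and every NaN is treated as a qNaN; the function name and the string results are hypothetical.

```python
import math


def fipr_special(fvm, fvn):
    """Return (result_class, invalid) for two 4-element input vectors."""
    pairs = list(zip(fvm, fvn))
    # A zero multiplied by an infinity is an invalid operation.
    if any((a == 0.0 and math.isinf(b)) or (math.isinf(a) and b == 0.0)
           for a, b in pairs):
        return "qNaN", True
    # A qNaN input propagates without signaling invalid.
    if any(math.isnan(a) or math.isnan(b) for a, b in pairs):
        return "qNaN", False
    # Only products with an infinite operand can be infinite.
    infinite = [a * b for a, b in pairs if math.isinf(a) or math.isinf(b)]
    has_pos = any(p > 0 for p in infinite)
    has_neg = any(p < 0 for p in infinite)
    if has_pos and has_neg:
        return "qNaN", True      # addition of differently signed infinities
    if has_pos:
        return "+INF", False
    if has_neg:
        return "-INF", False
    return "FIPRADD", False      # approximate sum, implementation dependent
```

The "FIPRADD" outcome stands for the implementation-dependent approximate sum, which this sketch does not attempt to reproduce.
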
FLDS FRm, FPUL

Description
This floating-point instruction copies FRm to FPUL.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations.

Operation

FLDS FRm, FPUL

<table>
<thead>
<tr>
<th>1111</th>
<th>m</th>
<th>00011101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

sr ← ZeroExtend32(SR);
op1 ← FloatValue32(FRm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
fpul ← op1;
FPUL ← ZeroExtend32(fpul);

Exceptions
SLOTFPUDIS, FPUDIS
FLDI0 FRn

Description
This floating-point instruction loads a constant representing the single-precision floating-point value of 0.0 into FRn.

Operation

FLDI0 FRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>10001101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

sr ← ZeroExtend32(SR);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
op1 ← 0x00000000;
FRn ← FloatRegister32(op1);

Exceptions
SLOTFPUDIS, FPUDIS
**FLDI1 FRn**

**Description**

This floating-point instruction loads a constant representing the single-precision floating-point value of 1.0 into FRn.

**Operation**

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>10011101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

```
sr ← ZeroExtend32(SR);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op1 ← 0x3F800000;
FRn ← FloatRegister32(op1);
```

**Exceptions**

SLOTFPUDIS, FPUDIS
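
The constants loaded by FLDI0 and FLDI1 (0x00000000 and 0x3F800000) are the IEEE 754 single-precision bit patterns for 0.0 and 1.0. A short check, using a hypothetical helper name, is:

```python
import struct


def single_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


zero = single_from_bits(0x00000000)  # constant loaded by FLDI0
one = single_from_bits(0x3F800000)   # constant loaded by FLDI1
```

Here `zero` equals 0.0 and `one` equals 1.0.
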
FLOAT FPUL, DRn

Description
This floating-point instruction performs a signed 32-bit integer to double-precision floating-point conversion. It reads a signed 32-bit integer value from FPUL, converts it to a double-precision value and places the result in DRn. Every signed 32-bit integer is exactly representable in the double-precision format, so the conversion is exact and FPSCR.RM has no effect.

Operation

FLOAT FPUL, DRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>000101101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8-0</td>
</tr>
</tbody>
</table>

Available only when PR=1 and SZ=0

fpul ← SignExtend32(FPUL);
sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op1, fps ← FLOAT_LD(fpul, fps);
DR2n ← FloatRegister64(op1);

Exceptions

SLOTFPUDIS, FPUDIS, FPUEXC
FLOAT FPUL, FRn

Description

This floating-point instruction performs a signed 32-bit integer to single-precision floating-point conversion. It reads a signed 32-bit integer value from FPUL, converts it to a single-precision value and places the result in FRn. In cases where the integer value cannot be exactly represented in the destination floating-point format, the rounding mode is determined by FPSCR.RM.

Operation

FLOAT FPUL, FRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>00101101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
fpul ← SignExtend32(FPUL);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
op1, fps ← FLOAT_LS(fpul, fps);
IF (FpuEnableI(fps))
   THROW FPUEXC, fps;
FRn ← FloatRegister32(op1);
FPSCR ← ZeroExtend32(fps);

Exceptions

SLOTFPUDIS, FPUDIS, FPUEXC

FLOAT Special Cases:

Exceptional conditions are checked in the order given below. Execution of the instruction is terminated once any check detects an exceptional condition.
1. Disabled: an exception is raised if the FPU is disabled.

2. Inexact: inexact can occur for FLOAT FPUL, FRn but not for FLOAT FPUL, DRn. When inexact exceptions are requested by the user, an exception is always raised for FLOAT FPUL, FRn regardless of whether that condition arose. Overflow and underflow do not occur for either of these instructions.

If the instruction does not raise an exception, the conversion is performed as indicated by the IEEE 754 specification.
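
To illustrate why inexact can arise only for the single-precision form, the sketch below converts a signed 32-bit integer to single precision and reports whether the value was altered. It assumes round-to-nearest only, and the name `float_ls_nearest` is hypothetical.

```python
import struct


def float_ls_nearest(value: int):
    """Convert a signed 32-bit integer to single precision.

    Returns (result, inexact) for round-to-nearest.
    """
    if not -2**31 <= value < 2**31:
        raise ValueError("value must fit in a signed 32-bit integer")
    # float(value) is exact: a double holds every 32-bit integer.
    result = struct.unpack("<f", struct.pack("<f", float(value)))[0]
    return result, result != value
```

A single-precision significand holds 24 bits, so 16777216 (2^24) converts exactly, whereas 16777217 is rounded to 16777216 and is therefore inexact.
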
FMAC FR0, FRm, FRn

Description

This floating-point instruction performs a single-precision floating-point multiply-accumulate. It multiplies FR0 by FRm, adds this intermediate to FRn, and places the result back in FRn. The multiplication and addition are performed as if the exponent and precision ranges were unbounded, followed by a single rounding to single-precision format. The rounding mode is determined by FPSCR.RM.

Operation

FMAC FR0, FRm, FRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>1110</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
fr0 ← FloatValue32(FR0);
op1 ← FloatValue32(FRm);
op2 ← FloatValue32(FRn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op2, fps ← FMAC_S(fr0, op1, op2, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
    THROW FPUEXC, fps;
IF (FpuCauseE(fps))
    THROW FPUEXC, fps;
IF ((FpuEnableI(fps) OR FpuEnableO(fps)) OR FpuEnableU(fps))
    THROW FPUEXC, fps;
FRn ← FloatRegister32(op2);
FPSCR ← ZeroExtend32(fps);

Exceptions

SLOTFPUDIS, FPUDIS, FPUEXC
**FMAC Special Cases:**

When FPSCR.DN is 1, a positive denormalized number is treated as +0 and a negative denormalized number as -0. This flush-to-zero treatment is applied before exception detection and special case handling.

Exceptional conditions are checked in the order given below. Execution of the instruction is terminated once any check detects an exceptional condition.

1. **Disabled:** an exception is raised if the FPU is disabled.
2. **Invalid:** an invalid operation is signaled if any of the three inputs is a signaling NaN, there is a multiplication of a zero by an infinity, or there is an addition of differently signed infinities.

   The multiplication is performed with sufficient precision to avoid overflow, and therefore the multiplication of any two finite numbers does not produce an infinity. The multiplication result will be an infinity only if there is a multiplication of an infinity with a normalized number, an infinity with a denormalized number or an infinity with an infinity.

3. **Error:** an FPU error is signaled if FPSCR.DN is 0 and none of the inputs are a NaN and at least one of the inputs is a denormalized number.

4. **Inexact, underflow and overflow:** these are checked together and can be signaled in combination. The multiply-accumulate is implemented using a fused-mac algorithm, and these are detected during the conversion of the exactly evaluated intermediate to the single-precision result. When inexact, underflow or overflow exceptions are requested by the user, an exception is always raised regardless of whether that condition arose.

If the instruction does not raise an exception, a result is generated according to the following tables.

Firstly, the operands are checked for sNaN:

<table>
<thead>
<tr>
<th>fr0 →</th>
<th colspan="2">other</th>
<th colspan="2">sNaN</th>
</tr>
<tr>
<th>op1 →</th>
<th>other</th>
<th>sNaN</th>
<th>other</th>
<th>sNaN</th>
</tr>
</thead>
<tbody>
<tr>
<td>op2 = other</td>
<td></td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>op2 = sNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
</tbody>
</table>
FMAC Special Cases (continued):

If the result of the previous table is a qNaN, no further analysis is performed. In all other cases, fr0 and op1 are checked for a zero multiplied by an infinity:

<table>
<thead>
<tr>
<th>fr0 → / op1 ↓</th>
<th>other</th>
<th>+0</th>
<th>-0</th>
<th>+INF</th>
<th>-INF</th>
</tr>
</thead>
<tbody>
<tr>
<td>other</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>+0</td>
<td></td>
<td></td>
<td></td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>-0</td>
<td></td>
<td></td>
<td></td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>+INF</td>
<td></td>
<td>qNaN</td>
<td>qNaN</td>
<td></td>
<td></td>
</tr>
<tr>
<td>-INF</td>
<td></td>
<td>qNaN</td>
<td>qNaN</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

If the result of the previous table is a qNaN, no further analysis is performed. In all other cases, the operands are checked for input qNaN values:

<table>
<thead>
<tr>
<th>fr0 →</th>
<th colspan="2">other</th>
<th colspan="2">qNaN</th>
</tr>
<tr>
<th>op1 →</th>
<th>other</th>
<th>qNaN</th>
<th>other</th>
<th>qNaN</th>
</tr>
</thead>
<tbody>
<tr>
<td>op2 = other</td>
<td></td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>op2 = qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
</tbody>
</table>

By this stage all operations involving sNaN or qNaN operands have been dealt with. If the result of the previous table is a qNaN, no further analysis is performed. In all other cases, the operands are checked for the addition of differently signed infinities:

<table>
<thead>
<tr>
<th>(fr0*op1) → / op2 ↓</th>
<th>other</th>
<th>+INF</th>
<th>-INF</th>
</tr>
</thead>
<tbody>
<tr>
<td>other</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>+INF</td>
<td></td>
<td></td>
<td>qNaN</td>
</tr>
<tr>
<td>-INF</td>
<td></td>
<td>qNaN</td>
<td></td>
</tr>
</tbody>
</table>

Here fr0*op1 is an infinity when either fr0 or op1 is an infinity, and its sign is the product of the signs of fr0 and op1.
FMAC Special Cases (continued):

If the result of the previous table is a qNaN, no further analysis is performed. In all other cases, fr0 and op1 are multiplied:

<table>
<thead>
<tr>
<th>fr0 → / op1 ↓</th>
<th>+NORM, -NORM</th>
<th>+0</th>
<th>-0</th>
<th>+INF</th>
<th>-INF</th>
<th>+DNORM, -DNORM</th>
</tr>
</thead>
<tbody>
<tr>
<td>+NORM, -NORM</td>
<td>FULLMUL</td>
<td>+0, -0</td>
<td>-0, +0</td>
<td>+INF, -INF</td>
<td>-INF, +INF</td>
<td>n/a</td>
</tr>
<tr>
<td>+0</td>
<td>+0, -0</td>
<td>+0</td>
<td>-0</td>
<td></td>
<td></td>
<td>n/a</td>
</tr>
<tr>
<td>-0</td>
<td>-0, +0</td>
<td>-0</td>
<td>+0</td>
<td></td>
<td></td>
<td>n/a</td>
</tr>
<tr>
<td>+INF</td>
<td>+INF, -INF</td>
<td></td>
<td></td>
<td>+INF</td>
<td>-INF</td>
<td>n/a</td>
</tr>
<tr>
<td>-INF</td>
<td>-INF, +INF</td>
<td></td>
<td></td>
<td>-INF</td>
<td>+INF</td>
<td>n/a</td>
</tr>
<tr>
<td>+DNORM, -DNORM</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
</tr>
</tbody>
</table>

The empty cells in this table correspond to cases that have already been dealt with. If either source is denormalized, no further analysis is performed. In the ‘FULLMUL’ case, a multiplication is performed without loss of precision. There is no rounding nor overflow, and this multiplication cannot produce an intermediate infinity.

In the ‘FULLMUL’, +0, -0, +INF and -INF cases, the 2 addition operands (fr0*op1 and op2) are summed:

<table>
<thead>
<tr>
<th>(fr0*op1) → / op2 ↓</th>
<th>FULLMUL</th>
<th>+0</th>
<th>-0</th>
<th>+INF</th>
<th>-INF</th>
</tr>
</thead>
<tbody>
<tr>
<td>+NORM, -NORM</td>
<td>FULLADD</td>
<td>op2</td>
<td>op2</td>
<td>+INF</td>
<td>-INF</td>
</tr>
<tr>
<td>+0</td>
<td>FULLADD</td>
<td>+0</td>
<td>+0</td>
<td>+INF</td>
<td>-INF</td>
</tr>
<tr>
<td>-0</td>
<td>FULLADD</td>
<td>+0</td>
<td>-0</td>
<td>+INF</td>
<td>-INF</td>
</tr>
<tr>
<td>+INF</td>
<td>+INF</td>
<td>+INF</td>
<td>+INF</td>
<td>+INF</td>
<td></td>
</tr>
<tr>
<td>-INF</td>
<td>-INF</td>
<td>-INF</td>
<td>-INF</td>
<td></td>
<td>-INF</td>
</tr>
<tr>
<td>+DNORM, -DNORM</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
</tr>
</tbody>
</table>

The two empty cells in this table correspond to cases that have already been dealt with. In the ‘FULLADD’ cases the fully-precise addition intermediate is rounded to give a single-precision result.
In the above tables, FPU error is indicated by heavy shading and always raises an exception. Invalid operations are indicated by light shading and raise an exception if enabled. FPU disabled, inexact, underflow and overflow cases are not shown.
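
The effect of rounding only once can be seen by comparing a fused evaluation with a multiply followed by a separate add. The sketch below uses exact rational arithmetic and a hypothetical helper `round_to_single` that rounds to the nearest single-precision value (ties to even). It assumes round-to-nearest and results within the normalized single-precision range; it does not model exceptions or denormalized numbers.

```python
from fractions import Fraction


def round_to_single(x: Fraction) -> float:
    """Round an exact rational to single precision (nearest, ties to even)."""
    if x == 0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    # Find e such that 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    # Scale to a 24-bit integer significand and round half to even.
    significand = round(x / Fraction(2) ** (e - 23))
    return sign * significand * 2.0 ** (e - 23)


def fmac_fused(fr0: float, frm: float, frn: float) -> float:
    """Exact fr0 * frm + frn, rounded once."""
    return round_to_single(Fraction(fr0) * Fraction(frm) + Fraction(frn))


def fmac_unfused(fr0: float, frm: float, frn: float) -> float:
    """Product rounded to single precision, then sum rounded again."""
    product = round_to_single(Fraction(fr0) * Fraction(frm))
    return round_to_single(Fraction(product) + Fraction(frn))
```

With fr0 = FRm = 1 + 2^-12 and FRn = -1, the fused form returns 2^-11 + 2^-24, while the two-step form loses the 2^-24 term when the product is rounded and returns 2^-11.
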
FMOV DRm, DRn

Description
This floating-point instruction reads a pair of single-precision floating-point values from DRm and copies them to DRn. This is a bit-by-bit copy with no interpretation or conversion of the values.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV DRm, DRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>0</th>
<th>m</th>
<th>01100</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-5</td>
<td>4-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
op1 ← FloatValuePair32(FP2m);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op2 ← op1;
FP2n ← FloatRegisterPair32(op2);

Exceptions
SLOTFPUDIS, FPUDIS
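
The four register-to-register pair moves (DRm/XDm to DRn/XDn) share one opcode layout and differ only in bit 8 and bit 4. The sketch below decodes that layout. It is an illustration only: the function name and the returned tuple are hypothetical, and the register number is taken to be twice the 3-bit field value, following the 2m and 2n indexing used in the operation pseudocode.

```python
def decode_fmov_pair(opcode):
    """Decode a 16-bit FMOV pair move (PR=0, SZ=1).

    Returns (src_bank, src_reg, dst_bank, dst_reg), or None when the
    opcode does not match the 1111 .... .... 1100 pattern.
    """
    if (opcode & 0xF00F) != 0xF00C:
        return None
    n = (opcode >> 9) & 0x7                         # bits 11-9
    dst_bank = "XD" if opcode & 0x0100 else "DR"    # bit 8
    m = (opcode >> 5) & 0x7                         # bits 7-5
    src_bank = "XD" if opcode & 0x0010 else "DR"    # bit 4
    return (src_bank, 2 * m, dst_bank, 2 * n)
```

With this layout, `decode_fmov_pair(0xF42C)` describes a move from DR2 to DR4.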
FMOV DRm, XDn

Description

This floating-point instruction reads a pair of single-precision floating-point values from DRm and copies them to XDn. This is a bit-by-bit copy with no interpretation or conversion of the values.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV DRm, XDn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>1</th>
<th>m</th>
<th>01100</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-5</td>
<td>4-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
op1 ← FloatValuePair32(DR2m);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op2 ← op1;
XD2n ← FloatRegisterPair32(op2);

Exceptions

SLOTFPUDIS, FPUDIS
**FMOV DRm, @Rn**

**Description**
This floating-point instruction stores a pair of single-precision floating-point registers to memory using register indirect with zero-displacement addressing. DRm is written as two consecutive 32-bit values to the effective address specified in Rn.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

**Operation**

FMOV DRm, @Rn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>01010</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-5</td>
<td>4-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValuePair32(FP2m);
op2 ← SignExtend32(Rn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
address ← ZeroExtend32(op2);
WriteMemoryPair32(address, op1);

**Exceptions**
SLOTFPUDIS, FPUDIS, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

**Note**
FMOV DRm, @-Rn

Description

This floating-point instruction stores a pair of single-precision floating-point registers to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 8 to give the effective address. DRm is written as two consecutive 32-bit values to the effective address.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV DRm, @-Rn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>01011</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-5</td>
<td>4-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValuePair32(FP2m);
op2 ← SignExtend32(Rn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
address ← ZeroExtend32(op2 - 8);
WriteMemoryPair32(address, op1);
op2 ← address;
Rn ← Register(op2);

Exceptions
SLOTFPUDIS, FPUDIS, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
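
The wrap-around described in the note amounts to arithmetic modulo 2^32. A minimal sketch, with a hypothetical helper name and the transfer size passed as a parameter (8 for pair stores, 4 for single-precision stores):

```python
MASK32 = 0xFFFFFFFF

def predecrement_address(rn, size):
    """Effective address for @-Rn: subtract the transfer size, then
    zero-extend to 32 bits so the result wraps at the address-space bounds."""
    return (rn - size) & MASK32
```

A store through `@-Rn` with Rn = 4 and an 8-byte transfer therefore targets 0xFFFFFFFC rather than a negative address.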
FMOV DRm, @(R0, Rn)

Description

This floating-point instruction stores a pair of single-precision floating-point registers to memory using register indirect addressing. The effective address is formed by adding R0 to Rn. DRm is written as two consecutive 32-bit values to the effective address.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV DRm, @(R0, Rn)

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>00111</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-5</td>
<td>4-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
r0 ← SignExtend32(R0);
op1 ← FloatValuePair32(FP2m);
op2 ← SignExtend32(Rn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
address ← ZeroExtend32(r0 + op2);
WriteMemoryPair32(address, op1);

Exceptions

SLOTFPUDIS, FPUDIS, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
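
For the indexed form, both R0 and Rn are sign-extended before the sum is zero-extended to 32 bits, so a "negative" index simply wraps. The sketch below illustrates this under the assumption that register contents are passed in as unsigned 32-bit integers; the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def indexed_address(r0, rn):
    """Effective address for @(R0, Rn): signed sum, wrapped to 32 bits."""
    return (to_signed32(r0) + to_signed32(rn)) & MASK32
```

With R0 = 0xFFFFFFF8 (that is, -8) and Rn = 0x1000, the effective address is 0xFF8.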
**FMOV.S FRm, FRn**

**Description**

This floating-point instruction reads a single-precision floating-point value from FRm and copies it to FRn. This is a bit-by-bit copy with no interpretation or conversion of the value.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

**Operation**

```
FMOV.S FRm, FRn
```

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>1100</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when SZ=0

```
sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue32(FRm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
op2 ← op1;
FRn ← FloatRegister32(op2);
```

**Exceptions**

SLOTFPUDIS, FPUDIS
FMOV.S FRm, @Rn

Description

This floating-point instruction stores a single-precision floating-point register to memory using register indirect with zero-displacement addressing. The 32-bit value of FRm is written to the effective address specified in Rn.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV.S FRm, @Rn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>1010</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when SZ=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue32(FRm);
op2 ← SignExtend32(Rn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
address ← ZeroExtend32(op2);
WriteMemory32(address, op1);

Exceptions

SLOTFPUDIS, FPUDIS, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
FMOV.S FRm, @-Rn

Description
This floating-point instruction stores a single-precision floating-point register to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 4 to give the effective address. The 32-bit value of FRm is written to the effective address.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

```
FMOV.S FRm, @-Rn
```

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>1011</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when SZ=0

```
sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue32(FRm);
op2 ← SignExtend32(Rn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
address ← ZeroExtend32(op2 - 4);
WriteMemory32(address, op1);
op2 ← address;
Rn ← Register(op2);
```
Exceptions
SLOTFPUDIS, FPUDIS, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
**FMOV.S FRm, @(R0, Rn)**

**Description**

This floating-point instruction stores a single-precision floating-point register to memory using register indirect addressing. The effective address is formed by adding R0 to Rn. The 32-bit value of FRm is written to the effective address.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

**Operation**

```
FMOV.S FRm, @(R0, Rn)
```

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>0111</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when SZ=0

```
sr ← ZeroExtend32(SR);
r0 ← SignExtend32(R0);
op1 ← FloatValue32(FRm);
op2 ← SignExtend32(Rn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
address ← ZeroExtend32(r0 + op2);
WriteMemory32(address, op1);
```

**Exceptions**

SLOTFPUDIS, FPUDIS, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

**Note**

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
FMOV XDm, DRn

Description
This floating-point instruction reads a pair of single-precision floating-point values from XDm and copies them to DRn. This is a bit-by-bit copy with no interpretation or conversion of the values.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV XDm, DRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>0</th>
<th>m</th>
<th>11100</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-5</td>
<td>4-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValuePair32(XD2m);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op2 ← op1;
DR2n ← FloatRegisterPair32(op2);

Exceptions
SLOTFPUDIS, FPUDIS
FMOV XDm, XDn

Description
This floating-point instruction reads a pair of single-precision floating-point values from XDm and copies them to XDn. This is a bit-by-bit copy with no interpretation or conversion of the values.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV XDm, XDn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>1</th>
<th>m</th>
<th>11100</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-5</td>
<td>4-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue64(XD2m);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op2 ← op1;
XD2n ← FloatRegister64(op2);

Exceptions
SLOTFPUDIS, FPUDIS
FMOV XDm, @Rn

Description

This floating-point instruction stores a pair of single-precision floating-point registers to memory using register indirect with zero-displacement addressing. XDm is written as two consecutive 32-bit values to the effective address specified in Rn.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV XDm, @Rn

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValuePair32(XD2m);
op2 ← SignExtend32(Rn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
  THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
  THROW FPUDIS;
address ← ZeroExtend32(op2);
WriteMemoryPair32(address, op1);

Exceptions

SLOTFPUDIS, FPUDIS, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
FMOV XDm, @-Rn

Description

This floating-point instruction stores a pair of single-precision floating-point registers to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 8 to give the effective address. XDm is written as two consecutive 32-bit values to the effective address.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV XDm, @-Rn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>11011</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-5</td>
<td>4-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValuePair32(XD2m);
op2 ← SignExtend32(Rn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
address ← ZeroExtend32(op2 - 8);
WriteMemoryPair32(address, op1);
op2 ← address;
Rn ← Register(op2);
FPSCR ← ZeroExtend32(fps);

Exceptions
SLOTFPUDIS, FPUDIS, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
FMOV XDm, @(R0, Rn)

Description
This floating-point instruction stores a pair of single-precision floating-point registers to memory using register indirect addressing. The effective address is formed by adding R0 to Rn. XDm is written as two consecutive 32-bit values to the effective address.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV XDm, @(R0, Rn)

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>10111</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-5</td>
<td>4-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
r0 ← SignExtend32(R0);
op1 ← FloatValuePair32(XD2m);
op2 ← SignExtend32(Rn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
address ← ZeroExtend32(r0 + op2);
WriteMemoryPair32(address, op1);

Exceptions
SLOTFPUDIS, FPUDIS, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
FMOV @Rm, DRn

Description
This floating-point instruction loads a pair of single-precision floating-point registers from memory using register indirect with zero-displacement addressing. Two consecutive 32-bit values are read from the effective address specified in Rm and loaded into DRn.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV @Rm, DRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>0</th>
<th>m</th>
<th>1000</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
op1 ← SignExtend32(Rm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
address ← ZeroExtend32(op1);
op2 ← ReadMemoryPair32(address);
FP2n ← FloatRegisterPair32(op2);

Exceptions
SLOTFPUDIS, FPUDIS, RADDERR, RTLBMISS, READPROT

Note
FMOV @Rm+, DRn

Description

This floating-point instruction loads a pair of single-precision floating-point registers from memory using register indirect with post-increment addressing. Two consecutive 32-bit values are read from the effective address specified in Rm and loaded into DRn. Rm is post-incremented by 8.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV @Rm+, DRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>0</th>
<th>m</th>
<th>1001</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← SignExtend32(Rm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
address ← ZeroExtend32(op1);
op2 ← ReadMemoryPair32(address);
op1 ← op1 + 8;
Rm ← Register(op1);
FP2n ← FloatRegisterPair32(op2);
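
The ordering in the pseudocode matters: the memory read comes before the update of Rm, so a faulting read leaves Rm unchanged. The following is a minimal sketch of that sequence, assuming a hypothetical `read32` callback for memory access and a plain list for the general registers. The pair is returned in memory order, and how the two words map onto the register pair is deliberately left out.

```python
MASK32 = 0xFFFFFFFF

def fmov_load_pair_postinc(read32, regs, m):
    """Model of FMOV @Rm+, DRn: read two consecutive 32-bit words at the
    address in regs[m], then post-increment regs[m] by 8.

    read32 may raise to model an address or TLB exception; in that case
    regs[m] is left untouched because the update happens afterwards.
    """
    address = regs[m] & MASK32
    pair = (read32(address), read32((address + 4) & MASK32))
    regs[m] = (regs[m] + 8) & MASK32
    return pair
```

A dictionary keyed by address is enough to try it out, for example `fmov_load_pair_postinc(mem.__getitem__, regs, 3)`.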

Exceptions

SLOTFPUDIS, FPUDIS, RADDERR, RTLBMISS, READPROT

Note
FMOV @(R0, Rm), DRn

Description
This floating-point instruction loads a pair of single-precision floating-point registers from memory using register indirect addressing. The effective address is formed by adding R0 to Rm. Two consecutive 32-bit values are read from the effective address and loaded into DRn.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV @(R0, Rm), DRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>0</th>
<th>m</th>
<th>0110</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
r0 ← SignExtend32(R0);
op1 ← SignExtend32(Rm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
address ← ZeroExtend32(r0 + op1);
op2 ← ReadMemoryPair32(address);
FP2n ← FloatRegisterPair32(op2);

Exceptions
SLOTFPUDIS, FPUDIS, RADDERR, RTLBMISS, READPROT

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
FMOV.S @Rm, FRn

Description

This floating-point instruction loads a single-precision floating-point register from memory using register indirect with zero-displacement addressing. A 32-bit value is read from the effective address specified in Rm and loaded into FRn.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV.S @Rm, FRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>1000</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when SZ=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← SignExtend32(Rm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
address ← ZeroExtend32(op1);
op2 ← ReadMemory32(address);
FRn ← FloatRegister32(op2);

Exceptions

SLOTFPUDIS, FPUDIS, RADDERR, RTLBMISS, READPROT

Note
FMOV.S @Rm+, FRn

Description

This floating-point instruction loads a single-precision floating-point register from memory using register indirect with post-increment addressing. A 32-bit value is read from the effective address specified in Rm and loaded into FRn; Rm is post-incremented by 4.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV.S @Rm+, FRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>1001</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when SZ=0

sr ← ZeroExtend32(SR);
op1 ← SignExtend32(Rm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
address ← ZeroExtend32(op1);
op2 ← ReadMemory32(address);
op1 ← op1 + 4;
Rm ← Register(op1);
FRn ← FloatRegister32(op2);

Exceptions

SLOTFPUDIS, FPUDIS, RADDERR, RTLBMISS, READPROT

Note
FMOV.S @(R0, Rm), FRn

Description

This floating-point instruction loads a single-precision floating-point register from memory using register indirect addressing. The effective address is formed by adding R0 to Rm. A 32-bit value is read from the effective address and loaded into FRn.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV.S @(R0, Rm), FRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>0110</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when SZ=0

sr ← ZeroExtend32(SR);
r0 ← SignExtend32(R0);
op1 ← SignExtend32(Rm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
address ← ZeroExtend32(r0 + op1);
op2 ← ReadMemory32(address);
FRn ← FloatRegister32(op2);

Exceptions

SLOTFPUDIS, FPUDIS, RADDERR, RTLBMISS, READPROT

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
FMOV @Rm, XDn

Description
This floating-point instruction loads a pair of single-precision floating-point registers from memory using register indirect with zero-displacement addressing. Two consecutive 32-bit values are read from the effective address specified in Rm and loaded into XDn.
This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV @Rm, XDn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>1</th>
<th>m</th>
<th>1000</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← SignExtend32(Rm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
address ← ZeroExtend32(op1);
op2 ← ReadMemoryPair32(address);
XD2n ← FloatRegisterPair32(op2);

Exceptions
SLOTFPUDIS, FPUDIS, RADDERR, RTLBMISS, READPROT

Note
FMOV @Rm+, XDn

Description
This floating-point instruction loads a pair of single-precision floating-point registers from memory using register indirect with post-increment addressing. Two consecutive 32-bit values are read from the effective address specified in Rm and loaded into XDn. Rm is post-incremented by 8.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FMOV @Rm+, XDn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>1</th>
<th>m</th>
<th>1001</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← SignExtend32(Rm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
address ← ZeroExtend32(op1);
op2 ← ReadMemoryPair32(address);
op1 ← op1 + 8;
Rm ← Register(op1);
XD2n ← FloatRegisterPair32(op2);

Exceptions
SLOTFPUDIS, FPUDIS, RADDERR, RTLBMISS, READPROT

Note
**FMOV @(R0, Rm), XDn**

**Description**

This floating-point instruction loads a pair of single-precision floating-point registers from memory using register indirect addressing. The effective address is formed by adding R0 to Rm. Two consecutive 32-bit values are read from the effective address and loaded into XDn.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

**Operation**

```
FMOV @(R0, Rm), XDn
```

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>1</th>
<th>m</th>
<th>0110</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when PR=0 and SZ=1

sr ← ZeroExtend32(SR);
r0 ← SignExtend32(R0);
op1 ← SignExtend32(Rm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
address ← ZeroExtend32(r0 + op1);
op2 ← ReadMemoryPair32(address);
XD2n ← FloatRegisterPair32(op2);

**Exceptions**

SLOTFPUDIS, FPUDIS, RADDERR, RTLBMISS, READPROT

**Note**

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
FMUL DRm, DRn

Description

This floating-point instruction performs a double-precision floating-point multiplication. It multiplies DRm by DRn and places the result in DRn. The rounding mode is determined by FPSCR.RM.

Operation

FMUL DRm, DRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>0</th>
<th>m</th>
<th>00010</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-5</td>
<td>4-0</td>
</tr>
</tbody>
</table>

Available only when PR=1 and SZ=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue64(DRm);
op2 ← FloatValue64(DRn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
  THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
  THROW FPUDIS;
op2, fps ← FMUL_D(op1, op2, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
  THROW FPUEXC, fps;
IF (FpuCauseE(fps))
  THROW FPUEXC, fps;
IF ((FpuEnableI(fps) OR FpuEnableO(fps)) OR FpuEnableU(fps))
  THROW FPUEXC, fps;
DRn ← FloatRegister64(op2);
FPSCR ← ZeroExtend32(fps);

Exceptions

SLOTFPUDIS, FPUDIS, FPUEXC
FMUL FRm, FRn

Description

This floating-point instruction performs a single-precision floating-point multiplication. It multiplies FRm by FRn and places the result in FRn. The rounding mode is determined by FPSCR.RM.

Operation

```
FMUL FRm, FRn
```

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>0010</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

```
sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue32(FRm);
op2 ← FloatValue32(FRn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op2, fps ← FMUL_S(op1, op2, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
    THROW FPUEXC, fps;
IF (FpuCauseE(fps))
    THROW FPUEXC, fps;
IF ((FpuEnableI(fps) OR FpuEnableO(fps)) OR FpuEnableU(fps))
    THROW FPUEXC, fps;
FRn ← FloatRegister32(op2);
FPSCR ← ZeroExtend32(fps);
```

Exceptions

SLOTFPUDIS, FPUDIS, FPUEXC
FMUL Special Cases:

When FPSCR.DN is 1, a positive denormalized number is treated as +0 and a negative denormalized number as -0. This flush-to-zero treatment is applied before exception detection and special case handling.

Exceptional conditions are checked in the order given below. Execution of the instruction is terminated once any check detects an exceptional condition.

1. Disabled: an exception is raised if the FPU is disabled.
2. Invalid: an invalid operation is signaled if either input is a signaling NaN, or if this is a multiplication of a zero by an infinity.
3. Error: an FPU error is signaled if FPSCR.DN is zero, neither input is a NaN and either input is a denormalized number.
4. Inexact, underflow and overflow: these are checked together and can be signaled in combination. When inexact, underflow or overflow exceptions are requested by the user, an exception is always raised regardless of whether that condition arose.
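
The check order above, together with the THROW sequence in the operation pseudocode, can be sketched as follows. This is an illustration only: the function names and boolean parameters are hypothetical stand-ins for `FpuIsDisabled`, `IsDelaySlot`, `FpuCauseV`, `FpuEnableV`, `FpuCauseE` and the three enable tests.

```python
def fpu_error(dn, has_nan, has_denorm):
    """Check 3: FPU error when FPSCR.DN is zero, no input is a NaN and
    at least one input is denormalized."""
    return (not dn) and (not has_nan) and has_denorm

def fmul_exception(*, fpu_disabled=False, in_delay_slot=False,
                   cause_v=False, enable_v=False, cause_e=False,
                   enable_i=False, enable_o=False, enable_u=False):
    """Return the name of the first exception thrown, or None."""
    if fpu_disabled:                      # 1. Disabled
        return "SLOTFPUDIS" if in_delay_slot else "FPUDIS"
    if enable_v and cause_v:              # 2. Invalid, only if enabled
        return "FPUEXC"
    if cause_e:                           # 3. Error, always raised
        return "FPUEXC"
    if enable_i or enable_o or enable_u:  # 4. Raised whenever requested
        return "FPUEXC"
    return None
```

Note that step 4 tests only the enable bits, which mirrors the statement that a requested inexact, underflow or overflow exception is raised whether or not the condition arose.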

If the instruction does not raise an exception, a result is generated according to the following table.

<table>
<thead>
<tr>
<th>op2 ↓ / op1 →</th>
<th>+NORM, -NORM</th>
<th>+0</th>
<th>-0</th>
<th>+INF</th>
<th>-INF</th>
<th>+DNORM, -DNORM</th>
<th>qNaN</th>
<th>sNaN</th>
</tr>
</thead>
<tbody>
<tr>
<td>+NORM, -NORM</td>
<td>MUL</td>
<td>+0, -0</td>
<td>-0, +0</td>
<td>+INF, -INF</td>
<td>-INF, +INF</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>+0</td>
<td>+0, -0</td>
<td>+0</td>
<td>-0</td>
<td>qNaN</td>
<td>qNaN</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>-0</td>
<td>-0, +0</td>
<td>-0</td>
<td>+0</td>
<td>qNaN</td>
<td>qNaN</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>+INF</td>
<td>+INF, -INF</td>
<td>qNaN</td>
<td>qNaN</td>
<td>+INF</td>
<td>-INF</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>-INF</td>
<td>-INF, +INF</td>
<td>qNaN</td>
<td>qNaN</td>
<td>-INF</td>
<td>+INF</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>+DNORM, -DNORM</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>sNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
</tbody>
</table>

FPU error is indicated by heavy shading and always raises an exception. Invalid operations are indicated by light shading and raise an exception if enabled. FPU disabled, inexact, underflow and overflow cases are not shown.

The behavior of the normal 'MUL' case is described by the IEEE754 specification.
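
For the cases that do not raise an exception, the special-case results follow the usual IEEE 754 rules: any NaN input gives qNaN, zero times infinity gives qNaN, and otherwise the sign of the result is the exclusive OR of the operand signs. A minimal sketch under those assumptions, using hypothetical string labels for the operand classes and leaving denormalized inputs out:

```python
def fmul_special(op1, op2):
    """Classify an FMUL result.

    op1 and op2 are one of '+NORM', '-NORM', '+0', '-0', '+INF', '-INF',
    'qNaN' or 'sNaN'. Returns a result class, or 'MUL' when the ordinary
    IEEE 754 multiplication applies.
    """
    if "NaN" in op1 or "NaN" in op2:
        return "qNaN"
    kinds = {op1[1:], op2[1:]}
    if kinds == {"0", "INF"}:
        return "qNaN"                      # zero times infinity is invalid
    sign = "+" if op1[0] == op2[0] else "-"
    if "INF" in kinds:
        return sign + "INF"
    if "0" in kinds:
        return sign + "0"
    return "MUL"
```

For example, `fmul_special("-NORM", "+0")` gives `"-0"`.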
FNEG DRn

Description

This floating-point instruction computes the negated value of a double-precision floating-point number. It reads DRn, inverts the sign bit and places the result in DRn.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FNEG DRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>001001101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8-0</td>
</tr>
</tbody>
</table>

Available only when PR=1 and SZ=0

sr ← ZeroExtend32(SR);
op1 ← FloatValue64(DR2n);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op1 ← FNEG_D(op1);
DR2n ← FloatRegister64(op1);

Exceptions

SLOTFPUDIS, FPUDIS
FNEG FRn

Description

This floating-point instruction computes the negated value of a single-precision floating-point number. It reads FRn, inverts the sign bit and places the result in FRn.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations. There are no special floating-point cases for this instruction.

Operation

FNEG FRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>01001101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue32(FRn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op1 ← FNEG_S(op1);
FRn ← FloatRegister32(op1);

Exceptions

SLOTFPUDIS, FPUDIS
FRCHG

Description

This floating-point instruction toggles the FPSCR.FR bit. This has the effect of switching the basic and extended banks of the floating-point register file.

Operation

FRCHG

<table>
<thead>
<tr>
<th>1111101111111101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

```
sr ← ZeroExtend32(SR);
fr ← ZeroExtend1(FPSCR.FR);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
  THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
  THROW FPUDIS;
fr ← fr ⊕ 1;
FPSCR.FR ← Bit(fr);
```

Exceptions

SLOTFPUDIS, FPUDIS
**FSCHG**

**Description**

This floating-point instruction toggles the FPSCR.SZ bit. This has the effect of changing the size of the data transfer for subsequent floating-point loads, stores and moves. Two transfer sizes are available: FPSCR.SZ = 0 indicates 32-bit transfer and FPSCR.SZ = 1 indicates 64-bit transfer.
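
The toggle can be sketched as an exclusive-OR on the FPSCR value. This is a minimal model, not the hardware definition: the function names are hypothetical, and the position of SZ at bit 20 is an assumption taken from the FPSCR register layout, which is not reproduced in this section.

```python
FPSCR_SZ = 1 << 20  # assumption: SZ occupies bit 20 of FPSCR


def fschg(fpscr: int) -> int:
    """Return FPSCR with the SZ bit toggled, all other bits unchanged."""
    return (fpscr ^ FPSCR_SZ) & 0xFFFFFFFF


def transfer_size_bits(fpscr: int) -> int:
    """SZ = 0 selects 32-bit transfers, SZ = 1 selects 64-bit transfers."""
    return 64 if fpscr & FPSCR_SZ else 32
```

Executing the instruction twice restores the original transfer size, so a routine can bracket a run of 64-bit moves with a pair of FSCHG instructions.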

**Operation**

```
FSCHG
```

<table>
<thead>
<tr>
<th>1111001111111101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

```
sr ← ZeroExtend32(SR);
sz ← ZeroExtend1(FPSCR.SZ);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
sz ← sz ⊕ 1;
FPSCR.SZ ← Bit(sz);
```

**Exceptions**

SLOTFPUDIS, FPUDIS
FSQRT DRn

Description
This floating-point instruction performs a double-precision floating-point square root. It extracts the square root of DRn and places the result in DRn. The rounding mode is determined by FPSCR.RM.

Operation

```
FSQRT DRn
```

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>0</th>
<th>01101101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

Available only when PR=1 and SZ=0

```
sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue64(DRn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op1, fps ← FSQRT_D(op1, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
    THROW FPUEXC, fps;
IF (FpuCauseE(fps))
    THROW FPUEXC, fps;
IF (FpuEnableI(fps))
    THROW FPUEXC, fps;
DRn ← FloatRegister64(op1);
FPSCR ← ZeroExtend32(fps);
```

Exceptions
SLOTFPUDIS, FPUDIS, FPUEXC
**FSQRT FRn**

**Description**

This floating-point instruction performs a single-precision floating-point square root. It extracts the square root of FRn and places the result in FRn. The rounding mode is determined by FPSCR.RM.

**Operation**

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>01101101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

```
sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue32(FRn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op1, fps ← FSQRT_S(op1, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
    THROW FPUEXC, fps;
IF (FpuCauseE(fps))
    THROW FPUEXC, fps;
IF (FpuEnableI(fps))
    THROW FPUEXC, fps;
FRn ← FloatRegister32(op1);
FPSCR ← ZeroExtend32(fps);
```

**Exceptions**

SLOTFPUDIS, FPUDIS, FPUEXC

**FSQRT Special Cases:**
When FPSCR.DN is 1, a positive denormalized number is treated as +0 and a negative denormalized number as -0. This flush-to-zero treatment is applied before exception detection and special case handling.

Exceptional conditions are checked in the order given below. Execution of the instruction is terminated once any check detects an exceptional condition.

1. **Disabled:** an exception is raised if the FPU is disabled.
2. **Invalid:** an invalid operation is signaled if the input is a signaling NaN, or if this is a square root of a number less than zero (including negative infinity and negative normalized/denormalized numbers, but excluding negative zero).
3. **Error:** an FPU error is signaled if FPSCR.DN is zero and the input is a positive denormalized number.
4. **Inexact:** only inexact is checked. When inexact exceptions are requested by the user, an exception is always raised regardless of whether that condition arose. Overflow and underflow do not occur.

If the instruction does not raise an exception, a result is generated according to the following table.

<table>
<thead>
<tr>
<th>op1 →</th>
<th>+NORM</th>
<th>-NORM</th>
<th>+0</th>
<th>-0</th>
<th>+INF</th>
<th>-INF</th>
<th>+DNORM</th>
<th>-DNORM</th>
<th>qNaN</th>
<th>sNaN</th>
</tr>
</thead>
<tbody>
<tr>
<td>Result</td>
<td>SQRT</td>
<td>qNaN</td>
<td>+0</td>
<td>-0</td>
<td>+INF</td>
<td>qNaN</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
</tbody>
</table>

FPU error is indicated by heavy shading and always raises an exception. Invalid operations are indicated by light shading and raise an exception if enabled. FPU disabled and inexact cases are not shown.

The behavior of the normal ‘SQRT’ case is described by the IEEE 754 specification.
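
The ordering of the FSQRT checks above can be sketched as a small classifier. This is a reading of the rules in this section, not a definitive model: the function name and the string labels are hypothetical, the FPU-disabled and inexact checks are omitted, and an "invalid" outcome stands for either an exception (if enabled) or a qNaN result.

```python
def fsqrt_outcome(kind: str, dn: int) -> str:
    """Classify an FSQRT input.

    kind is one of: +NORM, -NORM, +0, -0, +INF, -INF,
    +DNORM, -DNORM, qNaN, sNaN. dn is the value of FPSCR.DN.
    """
    # Flush-to-zero is applied before any exception detection.
    if dn == 1 and kind.endswith("DNORM"):
        kind = kind[0] + "0"
    # Invalid: signaling NaN, or square root of a number below zero
    # (negative zero is excluded).
    if kind in ("sNaN", "-NORM", "-DNORM", "-INF"):
        return "invalid"
    # Error: positive denormalized input while DN is zero.
    if dn == 0 and kind == "+DNORM":
        return "fpu_error"
    # Remaining cases produce a result without a special exception.
    results = {"+NORM": "SQRT", "+0": "+0", "-0": "-0",
               "+INF": "+INF", "qNaN": "qNaN"}
    return results[kind]
```

Note how a negative denormalized input is invalid when DN is 0 but becomes -0 (and therefore valid) when DN is 1, because the flush happens first.
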
FSTS FPUL, FRn

**Description**
This floating-point instruction copies FPUL to FRn.

This instruction is not considered an arithmetic operation, and it does not signal invalid operations.
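
A sketch of what "copies" means here: the 32-bit pattern in FPUL is transferred unchanged and is then interpreted as a single-precision value, with no integer-to-float conversion. The helper name `fsts` is hypothetical.

```python
import struct


def fsts(fpul: int) -> float:
    """Reinterpret the 32-bit FPUL pattern as a single-precision value."""
    return struct.unpack("<f", struct.pack("<I", fpul & 0xFFFFFFFF))[0]
```

For example, an FPUL value of 0x3F800000 reads back as 1.0, whereas an FPUL value of 1 reads back as a tiny denormalized number rather than 1.0.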

**Operation**

```
FSTS FPUL, FRn
```

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>00001101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

```
sr ← ZeroExtend32(SR);
fpul ← SignExtend32(FPUL);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
op1 ← fpul;
FRn ← FloatRegister32(op1);
```

**Exceptions**

SLOTFPUDIS, FPUDIS
**FSUB DRm, DRn**

**Description**

This floating-point instruction performs a double-precision floating-point subtraction. It subtracts DRm from DRn and places the result in DRn. The rounding mode is determined by FPSCR.RM.

**Operation**

```
FSUB DRm, DRn
```

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>0</th>
<th>m</th>
<th>0</th>
<th>0001</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-5</td>
<td>4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when PR=1 and SZ=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue64(DR_{2m});
op2 ← FloatValue64(DR_{2n});
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op2, fps ← FSUB_D(op2, op1, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
    THROW FPUEXC, fps;
IF (FpuCauseE(fps))
    THROW FPUEXC, fps;
IF ((FpuEnableI(fps) OR FpuEnableO(fps)) OR FpuEnableU(fps))
    THROW FPUEXC, fps;
DR_{2n} ← FloatRegister64(op2);
FPSCR ← ZeroExtend32(fps);

**Exceptions**

SLOTFPUDIS, FPUDIS, FPUEXC
FSUB FRm, FRn

Description
This floating-point instruction performs a single-precision floating-point subtraction. It subtracts FRm from FRn and places the result in FRn. The rounding mode is determined by FPSCR.RM.

Operation
FSUB FRm, FRn

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>m</th>
<th>0001</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

```
sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue32(FRm);
op2 ← FloatValue32(FRn);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op2, fps ← FSUB_S(op2, op1, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
    THROW FPUEXC, fps;
IF (FpuCauseE(fps))
    THROW FPUEXC, fps;
IF ((FpuEnableI(fps) OR FpuEnableO(fps)) OR FpuEnableU(fps))
    THROW FPUEXC, fps;
FRn ← FloatRegister32(op2);
FPSCR ← ZeroExtend32(fps);
```

Exceptions
SLOTFPUDIS, FPUDIS, FPUEXC
FSUB Special Cases:

When FPSCR.DN is 1, a positive denormalized number is treated as +0 and a negative denormalized number as -0. This flush-to-zero treatment is applied before exception detection and special case handling.

Exceptional conditions are checked in the order given below. Execution of the instruction is terminated once any check detects an exceptional condition.

1. Disabled: an exception is raised if the FPU is disabled.
2. Invalid: an invalid operation is signaled if either input is a signaling NaN, or if the inputs are similarly signed infinities.
3. Error: an FPU error is signaled if FPSCR.DN is zero, neither input is a NaN and either input is a denormalized number.
4. Inexact, underflow and overflow: these are checked together and can be signaled in combination. When inexact, underflow or overflow exceptions are requested by the user, an exception is always raised regardless of whether that condition arose.

If the instruction does not raise an exception, a result is generated according to the following table.

<table>
<thead>
<tr>
<th>op1 ↓ / op2 →</th>
<th>+NORM, -NORM</th>
<th>+0</th>
<th>-0</th>
<th>+INF</th>
<th>-INF</th>
<th>+DNORM, -DNORM</th>
<th>qNaN</th>
<th>sNaN</th>
</tr>
</thead>
<tbody>
<tr>
<td>+,-NORM</td>
<td>SUB</td>
<td>SUB</td>
<td>SUB</td>
<td>+INF</td>
<td>-INF</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>+0</td>
<td>op2</td>
<td>+0</td>
<td>-0</td>
<td>+INF</td>
<td>-INF</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>-0</td>
<td>op2</td>
<td>+0</td>
<td>+0</td>
<td>+INF</td>
<td>-INF</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>+INF</td>
<td>-INF</td>
<td>-INF</td>
<td>-INF</td>
<td>qNaN</td>
<td>-INF</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>-INF</td>
<td>+INF</td>
<td>+INF</td>
<td>+INF</td>
<td>+INF</td>
<td>qNaN</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>+,-DNORM</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>n/a</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
<tr>
<td>sNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
<td>qNaN</td>
</tr>
</tbody>
</table>

FPU error is indicated by heavy shading and always raises an exception. Invalid operations are indicated by light shading and raise an exception if enabled. FPU disabled, inexact, underflow and overflow cases are not shown.

The behavior of the normal ‘SUB’ case is described by the IEEE 754 specification.
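
The operand order and the infinity rule can be sketched with host floating-point arithmetic. This is only an illustration under stated assumptions: the function name is hypothetical, Python floats are double precision, and signaling NaNs, denormalized inputs and the FPSCR flags are not modelled. The result is op2 - op1, where op2 is the destination register (FRn or DRn) and op1 is the source (FRm or DRm).

```python
import math


def fsub(op2: float, op1: float):
    """Return (op2 - op1, invalid) for non-signaling inputs.

    invalid is True when both inputs are infinities of the same sign,
    in which case the arithmetic result is a NaN.
    """
    invalid = (math.isinf(op1) and math.isinf(op2)
               and (op1 > 0) == (op2 > 0))
    return op2 - op1, invalid
```

Differently signed infinities are not invalid: (+INF) - (-INF) is +INF, which agrees with the -INF row of the table.
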
**FTRC DRm, FPUL**

**Description**

This floating-point instruction performs a double-precision floating-point to signed 32-bit integer conversion. It reads a double-precision value from DR<sub>m</sub>, converts it to a signed 32-bit integral range and places the result in FPUL. The conversion is achieved by rounding to zero (truncation) with saturation to the limits of the target signed integral range. The value of FPSCR.RM is ignored.

**Operation**

FTRC DRm, FPUL

<table>
<thead>
<tr>
<th>1111</th>
<th>m</th>
<th>0</th>
<th>00111101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-9</td>
<td>8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

Available only when PR=1 and SZ=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue64(DRm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
fpul, fps ← FTRC_DL(op1, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
   THROW FPUEXC, fps;
FPUL ← ZeroExtend32(fpul);
FPSCR ← ZeroExtend32(fps);

**Exceptions**

SLOTFPUDIS, FPUDIS, FPUEXC
FTRC FRm, FPUL

Description

This floating-point instruction performs a single-precision floating-point to signed 32-bit integer conversion. It reads a single-precision value from FRm, converts it to a signed 32-bit integral range and places the result in FPUL. The conversion is achieved by rounding to zero (truncation) with saturation to the limits of the target signed integral range. The value of FPSCR.RM is ignored.

Operation

FTRC FRm, FPUL

<table>
<thead>
<tr>
<th>1111</th>
<th>m</th>
<th>00111101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
op1 ← FloatValue32(FRm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
   THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
   THROW FPUDIS;
fpul, fps ← FTRC_SL(op1, fps);
IF (FpuEnableV(fps) AND FpuCauseV(fps))
   THROW FPUEXC, fps;
FPSCR ← ZeroExtend32(fps);
FPUL ← ZeroExtend32(fpul);

Exceptions

SLOTFPUDIS, FPUDIS, FPUEXC

FTRC Special Cases:

Regardless of FPSCR.DN, denormalized numbers are treated as 0. These instructions do not cause FPU Error.

Exceptional conditions are checked in the order given below. Execution of the instruction is terminated once any check detects an exceptional condition.

1. Disabled: an exception is raised if the FPU is disabled.
2. Invalid: an invalid operation is signaled if the conversion overflows the target range. This is caused by out-of-range normalized numbers, infinities and NaNs.

If the instruction does not raise an exception, a result is generated according to the following table.

<table>
<thead>
<tr>
<th>op1 →</th>
<th>+NORM (in range)</th>
<th>-NORM (in range)</th>
<th>+0</th>
<th>-0</th>
<th>+DNORM, -DNORM</th>
<th>+INF, +NORM (out of range)</th>
<th>-INF, -NORM (out of range)</th>
<th>qNaN</th>
<th>sNaN</th>
</tr>
</thead>
<tbody>
<tr>
<td>Result</td>
<td>TRC</td>
<td>TRC</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>2^31 - 1</td>
<td>-2^31</td>
<td>-2^31</td>
<td>-2^31</td>
</tr>
</tbody>
</table>

Invalid operations are indicated by light shading and raise an exception if enabled. FPU disabled cases are not shown.

The behavior of the normal ‘TRC’ case is described by the IEEE 754 specification, though only the round to zero rounding mode is supported by this instruction.
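
Truncation with saturation can be sketched as follows. The function name is hypothetical, the model works on Python floats rather than register bit patterns, and the value returned for NaN inputs (the most negative integer) is an assumption of this sketch rather than something the text above states.

```python
import math

INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def ftrc(x: float):
    """Return (fpul, invalid) for a round-to-zero conversion to int32."""
    if math.isnan(x):
        return INT32_MIN, True  # assumption: NaN saturates to -2^31
    if math.isinf(x) or not (INT32_MIN <= math.trunc(x) <= INT32_MAX):
        # Out-of-range values saturate toward the sign of the input.
        return (INT32_MAX if x > 0 else INT32_MIN), True
    # In-range values, zeros and denormalized numbers truncate directly.
    return math.trunc(x), False
```

Rounding is always toward zero, so 2.9 converts to 2 and -2.9 converts to -2 regardless of FPSCR.RM.
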
FTRV XMTRX, FVn

Description

This floating-point instruction multiplies the matrix, XMTRX, with a vector, FVn, and places the resulting vector in FVn. The matrix contains sixteen single-precision floating-point values. The vector contains four single-precision floating-point values. The matrix-vector multiplication is specified as:

\[
\begin{aligned}
FR_{n} &= \sum_{i=0}^{3} XF_{4i} \times FR_{n+i} \\
FR_{n+1} &= \sum_{i=0}^{3} XF_{1+4i} \times FR_{n+i} \\
FR_{n+2} &= \sum_{i=0}^{3} XF_{2+4i} \times FR_{n+i} \\
FR_{n+3} &= \sum_{i=0}^{3} XF_{3+4i} \times FR_{n+i}
\end{aligned}
\]
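
The indexing of the matrix elements can be sketched in a few lines. This is an exact reference computation in host arithmetic, not the approximate hardware algorithm; the function name is hypothetical, and it assumes that element p of the result sums the products of XF(p+4i) and FR(n+i) for i from 0 to 3.

```python
def ftrv(xf, fv):
    """Multiply XMTRX by FVn.

    xf: sequence of 16 values, XF0 to XF15.
    fv: sequence of 4 values, FRn to FRn+3.
    Returns the 4 new values for FRn to FRn+3.
    """
    return [sum(xf[p + 4 * i] * fv[i] for i in range(4)) for p in range(4)]
```

With this layout, XF0 to XF3 hold the first column of the matrix, so a vector of (1, 0, 0, 0) selects exactly those four elements.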

This is an approximate computation. The specified error in the \(p^{th}\) element value of the result vector is:

\[
\text{spec\_error}_p = \begin{cases}
0 & \text{if } (ep_m = ez) \\
2^{ep_m - 24} + 2^{E - 24 + r_m} & \text{if } (ep_m \neq ez)
\end{cases}
\]

where

\[
r_m = \begin{cases}
0 & \text{if (round to nearest)} \\
1 & \text{if (round to zero)}
\end{cases}
\]

\(E\) = unbiased exponent value of the result

\(ez < -252\)

\(ep_m = \max \{ep_0, ep_1, ep_2, ep_3\}\)

\(ep_i\) = pre-normalized exponent of the product of \(XF_{p+4i}\) and \(FR_{n+i}\)

\(eXF_{p+4i}\) = biased exponent value of \(XF_{p+4i}\)

\(eFR_{n+i}\) = biased exponent value of \(FR_{n+i}\)

\[
ep_i = \begin{cases}
ez & \text{if } (XF_{p+4i} = 0.0) \text{ OR } (FR_{n+i} = 0.0) \\
\max(eXF_{p+4i}, 1) + \max(eFR_{n+i}, 1) - 254 & \text{otherwise}
\end{cases}
\]

**Operation**

**FTRV XMTRX, FVn**

<table>
<thead>
<tr>
<th>1111</th>
<th>n</th>
<th>01</th>
<th>11111101</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-10</td>
<td>9-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

Available only when PR=0

sr ← ZeroExtend32(SR);
fps ← ZeroExtend32(FPSCR);
xmtrx ← FloatValueMatrix32(XMTRX);
op1 ← FloatValueVector32(FV_{4n});
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
op1, fps ← FTRV_S(xmtrx, op1, fps);
IF (FpuEnableV(fps) OR FpuEnableI(fps) OR FpuEnableO(fps) OR FpuEnableU(fps))
    THROW FPUEXC, fps;
FV_{4n} ← FloatRegisterVector32(op1);
FPSCR ← ZeroExtend32(fps);

**Exceptions**

SLOTFPUDIS, FPUDIS, FPUEXC

**FTRV Special Cases:**

FTRV is an approximate instruction. Denormalized numbers are supported:

- When FPSCR.DN is 0, denormalized numbers are treated as their denormalized value in the FTRV_S calculation. This instruction never signals an FPU error.
- When FPSCR.DN is 1, a positive denormalized number is treated as +0 and a negative denormalized number as -0. This flush-to-zero treatment is applied before exception detection and special case handling.

Exceptional conditions are checked in the order given below. Execution of the instruction is terminated once any check detects an exceptional condition.

1. Disabled: an exception is raised if the FPU is disabled.

2. Invalid: an invalid operation is signaled if any of the inputs is a signaling NaN, there is a multiplication of a zero by an infinity, or there is an addition of differently signed infinities where none of the inputs is a qNaN.

   The multiplication is performed with sufficient precision to avoid overflow, and therefore the multiplication of any two finite numbers does not produce an infinity. The multiplication result will be an infinity only if there is a multiplication of an infinity with a normalized number, an infinity with a denormalized number or an infinity with an infinity.

   The addition of differently signed infinities is detected if there is (at least) one positive infinity and (at least) one negative infinity in the set of 4 multiplication results in any of the 4 inner-products calculated by this instruction.

   This instruction is not capable of checking its inputs for invalid operations and raising an invalid operation exception accordingly. Instead, this instruction always raises an invalid operation exception if this exception is requested by the user. If this exception is not requested by the user, then qNaN results are correctly produced for invalid operations as described above.

3. Inexact, underflow and overflow: these are checked together and can be signaled in combination. This is an approximate instruction and inexact is signaled except where special cases occur. Precise details of the approximate inner-product algorithm, including the detection of underflow and overflow cases, are implementation dependent. When inexact, underflow or overflow exceptions are requested by the user, an exception is always raised regardless of whether that condition arose.

If the instruction does not raise an exception, results are generated according to the following tables. The special case tables are applied separately with the appropriate vector operands to each of the four inner-products calculated by this instruction.

Each of the 4 pairs of multiplication operands \( \text{op1 and op2} \) is selected from corresponding elements of the two 4-element source vectors and multiplied:

| \( \text{op1} \rightarrow \) | \( \rightarrow \) | \( \downarrow \text{op2} \) | \( +0 \) | \( -0 \) | \( +\infty \) | \( -\infty \) | \( q\text{NaN} \) | \( s\text{NaN} \) |
|---|---|---|---|---|---|---|---|
| \(+0\) | \(+0\) | \(+0\) | \(+0\) | \( q\text{NaN} \) | \( q\text{NaN} \) | \( q\text{NaN} \) | \( q\text{NaN} \) |
| \(-0\) | \(-0\) | \(-0\) | \(-0\) | \( q\text{NaN} \) | \( q\text{NaN} \) | \( q\text{NaN} \) | \( q\text{NaN} \) |
| \(+\infty\) | \(+\infty\) | \( q\text{NaN} \) | \( q\text{NaN} \) | \(+\infty\) | \(+\infty\) | \( q\text{NaN} \) | \(+\infty\) |
| \(-\infty\) | \(-\infty\) | \( q\text{NaN} \) | \( q\text{NaN} \) | \(-\infty\) | \(+\infty\) | \( q\text{NaN} \) | \(+\infty\) |
| \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) |
| \(s\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) |

If any of the multiplications evaluates to \( q\text{NaN} \), then the result of the instruction is \( q\text{NaN} \) and no further analysis need be performed. In the ‘FTRVMUL’, \(+0\), \(-0\), \(+\infty\) and \(-\infty\) cases, the 4 addition operands (labelled intermediate 0 to 3) are summed:

| Intermediate 0 | \( \text{FTRVMUL}, +0, -0 \) | \(+\infty\) | \(-\infty\) | \( \text{FTRVMUL}, +0, -0 \) | \(+\infty\) | \(-\infty\) | \( \text{FTRVMUL}, +0, -0 \) | \(+\infty\) | \(-\infty\) |
|---|---|---|---|---|---|---|---|---|
| \(+0\) | \(+0\) | \(+0\) | \(+0\) | \( q\text{NaN} \) | \( q\text{NaN} \) | \( q\text{NaN} \) | \( q\text{NaN} \) |
| \(-0\) | \(-0\) | \(-0\) | \(-0\) | \( q\text{NaN} \) | \( q\text{NaN} \) | \( q\text{NaN} \) | \( q\text{NaN} \) |
| \(+\infty\) | \(+\infty\) | \( q\text{NaN} \) | \( q\text{NaN} \) | \(+\infty\) | \(+\infty\) | \( q\text{NaN} \) | \(+\infty\) |
| \(-\infty\) | \(-\infty\) | \( q\text{NaN} \) | \( q\text{NaN} \) | \(-\infty\) | \(+\infty\) | \( q\text{NaN} \) | \(+\infty\) |
| \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) |
| \(s\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) | \(q\text{NaN}\) |

Inexact is signaled in the ‘FTRVADD’ case. Exception cases are not indicated by shading for this instruction. Where the behavior is not a special case, the instruction computes an approximate result using an implementation-dependent algorithm.
JMP @Rn

Description
This instruction is a delayed unconditional branch used for jumping to the target address specified in Rn.

Operation

JMP @Rn

<table>
<thead>
<tr>
<th>0100</th>
<th>n</th>
<th>00101011</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

op1 ← SignExtend32(Rn);
IF (IsDelaySlot())
    THROW ILLSLOT;
target ← op1;
delayedpc ← target ∧ (~ 0x1);
PC' ← Register(delayedpc);

Exceptions
ILLSLOT

Note
The delay slot is executed before branching. An ILLSLOT exception is raised if this instruction is executed in a delay slot.

If the branch target address is invalid, the IADDERR trap is not delivered until after the instruction in the delay slot has executed and the PC has advanced to the target address. That is, the exception is associated with the target instruction, not the branch.
JSR @Rn

Description
This instruction is a delayed unconditional branch used for jumping to the subroutine starting at the target address specified in Rn. The address of the instruction immediately following the delay slot is copied to PR to indicate the return address.

Operation

```
JSR @Rn
```

pc ← SignExtend32(PC);
op1 ← SignExtend32(Rn);
IF (IsDelaySlot())
    THROW ILLSLOT;
delayedpr ← pc + 4;
target ← op1;
delayedpc ← target ∧ (~ 0x1);
PR* ← Register(delayedpr);
PC* ← Register(delayedpc);

Exceptions
ILLSLOT

Note
The delay slot is executed before branching and before PR is updated. An ILLSLOT exception is raised if this instruction is executed in a delay slot.
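
The two values computed by JSR can be sketched as below. The function name is hypothetical. The sketch relies on instructions being 16 bits wide, so pc + 4 skips both the JSR itself and its delay slot, and it clears bit 0 of the register value to form the branch target, as the operation above specifies.

```python
MASK32 = 0xFFFFFFFF


def jsr(pc: int, rn: int):
    """Return (delayed_pc, delayed_pr) for a JSR located at address pc."""
    delayed_pr = (pc + 4) & MASK32   # instruction after the delay slot
    delayed_pc = rn & ~0x1 & MASK32  # target with bit 0 cleared
    return delayed_pc, delayed_pr
```

Both values take effect only after the delay-slot instruction has executed, so code in the delay slot still observes the previous PR.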

If the branch target address is invalid, the IADDERR trap is not delivered until after the instruction in the delay slot has executed and the PC has advanced to the target address. That is, the exception is associated with the target instruction, not the branch.
LDC Rm, GBR

Description
This instruction copies $R_m$ to GBR.

Operation

LDC Rm, GBR

<table>
<thead>
<tr>
<th>0100</th>
<th>m</th>
<th>00011110</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

op1 ← SignExtend32(Rm);
gbr ← op1;
GBR ← Register(gbr);

Note
LDC Rm, SR

Description
This instruction copies Rm to SR. It is a privileged instruction.

Operation
LDC Rm, SR

```
0100 m 00001110
```

```
md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
op1 ← SignExtend32(Rm);
sr ← op1;
SR ← Register(sr);
```

Exceptions
RESINST
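
The privilege check shared by the privileged LDC forms can be sketched as follows. The exception class and function name are hypothetical; the sketch assumes only what the operation above states, namely that the instruction raises RESINST when the MD bit is 0 and otherwise copies the 32-bit register value.

```python
class ResInst(Exception):
    """Stands in for the RESINST exception."""


def ldc_privileged(md: int, rm: int) -> int:
    """Return the value written to the control register, or raise ResInst."""
    if md == 0:
        # User mode: the register is left unchanged and RESINST is raised.
        raise ResInst("privileged instruction executed with MD = 0")
    return rm & 0xFFFFFFFF
```

The check comes first in the operation, so no part of the destination register is modified when the exception is raised.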

Note
LDC Rm, VBR

Description
This instruction copies Rm to VBR. It is a privileged instruction.

Operation
LDC Rm, VBR

<table>
<thead>
<tr>
<th>0100</th>
<th>m</th>
<th>00101110</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
op1 ← SignExtend32(Rm);
vbr ← op1;
VBR ← Register(vbr);

Exceptions
RESINST

Note
LDC Rm, SSR

Description
This instruction copies Rm to SSR. It is a privileged instruction.

Operation
LDC Rm, SSR

<table>
<thead>
<tr>
<th>0100</th>
<th>m</th>
<th>00111110</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
op1 ← SignExtend32(Rm);
ssr ← op1;
SSR ← Register(ssr);

Exceptions
RESINST

Note
LDC Rm, SPC

Description
This instruction copies Rm to SPC. It is a privileged instruction.

Operation

LDC Rm, SPC

<table>
<thead>
<tr>
<th>0100</th>
<th>m</th>
<th>01001110</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
op1 ← SignExtend32(Rm);
spc ← op1;
SPC ← Register(spc);

Exceptions
RESINST

Note
**LDC  Rm, DBR**

**Description**
This instruction copies Rm to DBR. It is a privileged instruction.

**Operation**

```plaintext
LDC Rm, DBR
```

<table>
<thead>
<tr>
<th>0100</th>
<th>m</th>
<th>11111010</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

```plaintext
md ← ZeroExtend1(MD);
IF (md = 0)
  THROW RESINST;
op1 ← SignExtend32(Rm);
dbr ← op1;
DBR ← Register(dbr);
```

**Exceptions**

RESINST

**Note**
LDC  Rm, Rn_BANK

Description
This instruction copies Rm to Rn_BANK. It is a privileged instruction.

Operation

LDC Rm, Rn_BANK

<table>
<thead>
<tr>
<th>0100</th>
<th>m</th>
<th>1</th>
<th>n</th>
<th>1110</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7</td>
<td>6-4</td>
<td>3-0</td>
</tr>
</tbody>
</table>

md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
op1 ← SignExtend32(Rm);
rn_bank ← op1;
Rn_BANK ← Register(rn_bank);

Exceptions
RESINST
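
The field layout of this encoding (0100, a 4-bit m, a fixed 1, a 3-bit n, then 1110) can be sketched as an encoder. The function name is hypothetical and the sketch is illustrative only.

```python
def encode_ldc_rn_bank(m: int, n: int) -> int:
    """Assemble the 16-bit opcode for LDC Rm, Rn_BANK."""
    if not 0 <= m <= 15:
        raise ValueError("m must select one of R0 to R15")
    if not 0 <= n <= 7:
        raise ValueError("n must select one of R0_BANK to R7_BANK")
    # 0100 mmmm 1nnn 1110
    return 0x4000 | (m << 8) | 0x0080 | (n << 4) | 0x000E
```

Only three bits are available for n because only R0 to R7 are banked.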

Note
**LDC.L @Rm+, GBR**

**Description**

This instruction loads GBR from memory using register indirect with post-increment addressing. A 32-bit value is read from the effective address specified in R_m and loaded into GBR. R_m is post-incremented by 4.
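
Register indirect with post-increment addressing can be sketched as follows. The function name and the dictionary-based memory model (address mapped to a 32-bit word) are hypothetical simplifications; address translation, alignment checks and the associated exceptions are not modelled.

```python
MASK32 = 0xFFFFFFFF


def ldc_l_postinc(regs, mem, m):
    """Load a 32-bit word from the address in regs[m], then add 4 to regs[m].

    Returns the loaded value, which the instruction writes to GBR.
    """
    address = regs[m] & MASK32
    value = mem[address] & MASK32
    regs[m] = (regs[m] + 4) & MASK32  # post-increment by the access size
    return value
```

Because the pointer advances by 4 after each load, a sequence of LDC.L instructions using the same Rm restores several registers from consecutive words, for example when unwinding a saved context from a stack.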

**Operation**

LDC.L @Rm+, GBR

<table>
<thead>
<tr>
<th>0100</th>
<th>m</th>
<th>00010111</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

op1 ← SignExtend32(Rm);
address ← ZeroExtend32(op1);
gbr ← SignExtend32(ReadMemory32(address));
op1 ← op1 + 4;
Rm ← Register(op1);
GBR ← Register(gbr);

**Exceptions**

RADDERR, RTLBMISS, READPROT

**Note**
**LDC.L @Rm+, SR**

**Description**

This instruction loads SR from memory using register indirect with post-increment addressing. A 32-bit value is read from the effective address specified in Rm and loaded into SR. Rm is post-incremented by 4. This is a privileged instruction.

**Operation**

```
LDC.L @Rm+, SR
```

<table>
<thead>
<tr>
<th>0100</th>
<th>m</th>
<th>00000111</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
op1 ← SignExtend32(Rm);
address ← ZeroExtend32(op1);
sr ← SignExtend32(ReadMemory32(address));
op1 ← op1 + 4;
Rm ← Register(op1);
SR ← Register(sr);

**Exceptions**

RESINST, RADDERR, RTLBMISS, READPROT

**Note**
LDC.L @Rm+, VBR

Description

This instruction loads VBR from memory using register indirect with post-increment addressing. A 32-bit value is read from the effective address specified in Rm and loaded into VBR. Rm is post-incremented by 4. This is a privileged instruction.

Operation

LDC.L @Rm+, VBR

<table>
<thead>
<tr>
<th>0100</th>
<th>m</th>
<th>00100111</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
op1 ← SignExtend32(Rm);
address ← ZeroExtend32(op1);
vbr ← SignExtend32(ReadMemory32(address));
op1 ← op1 + 4;
Rm ← Register(op1);
VBR ← Register(vbr);

Exceptions

RESINST, RADDERR, RTLBMISS, READPROT

Note
LDC.L @Rm+, SSR

Description
This instruction loads SSR from memory using register indirect with post-increment addressing. A 32-bit value is read from the effective address specified in $R_m$ and loaded into SSR. $R_m$ is post-incremented by 4. This is a privileged instruction.

Operation

LDC.L @Rm+, SSR

<table>
<thead>
<tr>
<th>0100</th>
<th>m</th>
<th>00110111</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

```
md ← ZeroExtend1(MD);
IF (md = 0)
  THROW RESINST;
op1 ← SignExtend32(Rm);
address ← ZeroExtend32(op1);
ssr ← SignExtend32(ReadMemory32(address));
op1 ← op1 + 4;
Rm ← Register(op1);
SSR ← Register(ssr);
```

Exceptions
RESINST, RADDERR, RTLBMISS, READPROT

Note
**LDC.L @Rm+, SPC**

**Description**

This instruction loads SPC from memory using register indirect with post-increment addressing. A 32-bit value is read from the effective address specified in \( R_m \) and loaded into SPC. \( R_m \) is post-incremented by 4. This is a privileged instruction.

**Operation**

LDC.L @Rm+, SPC

<table>
<thead>
<tr>
<th>0100</th>
<th>m</th>
<th>01000111</th>
</tr>
</thead>
<tbody>
<tr>
<td>15-12</td>
<td>11-8</td>
<td>7-0</td>
</tr>
</tbody>
</table>

md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
op1 ← SignExtend32(Rm);
address ← ZeroExtend32(op1);
spc ← SignExtend32(ReadMemory32(address));
op1 ← op1 + 4;
Rm ← Register(op1);
SPC ← Register(spc);

**Exceptions**

RESINST, RADDERR, RTLBMISS, READPROT

**Note**
**LDC.L @Rm+, DBR**

**Description**

This instruction loads DBR from memory using register indirect with post-increment addressing. A 32-bit value is read from the effective address specified in \( R_m \) and loaded into DBR. \( R_m \) is post-incremented by 4. This is a privileged instruction.

**Operation**

\[
\text{LDC.L @Rm+, DBR}
\]

<table>
<thead>
<tr>
<th></th>
<th></th>
<th>11</th>
<th>8</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>15</td>
<td>12</td>
<td>11</td>
<td>8</td>
<td>7</td>
</tr>
</tbody>
</table>

\[
\begin{aligned}
\text{md} & \leftarrow \text{ZeroExtend}_{1}(\text{MD}); \\
\text{IF (md} = 0) & \quad \text{THROW RESINST;} \\
\text{op1} & \leftarrow \text{SignExtend}_{32}(R_m); \\
\text{address} & \leftarrow \text{ZeroExtend}_{32}(\text{op1}); \\
\text{dbr} & \leftarrow \text{SignExtend}_{32}(\text{ReadMemory}_{32}(\text{address})); \\
\text{op1} & \leftarrow \text{op1} + 4; \\
R_m & \leftarrow \text{Register}(\text{op1}); \\
\text{DBR} & \leftarrow \text{Register}(\text{dbr});
\end{aligned}
\]

**Exceptions**

RESINST, RADDERR, RTLBMISS, READPROT

**Note**
LDC.L @Rm+, Rn_BANK

Description
This instruction loads Rn_BANK from memory using register indirect with post-increment addressing. A 32-bit value is read from the effective address specified in Rm and loaded into Rn_BANK. Rm is post-incremented by 4. This is a privileged instruction.

Operation

LDC.L @Rm+, Rn_BANK

<table>
<thead>
<tr>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>0100</td>
<td>m</td>
<td>1</td>
<td>n</td>
<td>0111</td>
</tr>
</tbody>
</table>

md ← ZeroExtend1(MD);
IF (md = 0)
  THROW RESINST;
op1 ← SignExtend32(Rm);
address ← ZeroExtend32(op1);
rn_bank ← SignExtend32(ReadMemory32(address));
op1 ← op1 + 4;
Rm ← Register(op1);
Rn_BANK ← Register(rn_bank);

Exceptions
RESINST, RADDERR, RTLBMISS, READPROT

Note
LDS Rm, FPSCR

Description

This floating-point instruction copies Rm to FPSCR. The setting of FPSCR does not cause any floating-point exceptional conditions to be signaled.

Operation

LDS Rm, FPSCR

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0100</td>
<td>m</td>
<td>01101010</td>
</tr>
</tbody>
</table>

sr ← ZeroExtend32(SR);
op1 ← SignExtend32(Rm);
IF (FpuIsDisabled(sr) AND IsDelaySlot())
    THROW SLOTFPUDIS;
IF (FpuIsDisabled(sr))
    THROW FPUDIS;
fps, pr, sz, fr ← UnpackFPSCR(op1);
FPSCR ← ZeroExtend32(fps);
SR.PR ← Bit(pr);
SR.SZ ← Bit(sz);
SR.FR ← Bit(fr);

Exceptions

SLOTFPUDIS, FPUDIS

Note
LDS.L @Rm+, FPSCR

Description

This floating-point instruction loads FPSCR from memory using register indirect with post-increment addressing. A 32-bit value is read from the effective address specified in Rm and loaded into FPSCR. Rm is post-incremented by 4. The setting of FPSCR does not cause any floating-point exceptional conditions to be signaled.

Operation

LDS.L @Rm+, FPSCR

\[
\begin{array}{ccc}
0100 & m & 01100110 \\
\text{15-12} & \text{11-8} & \text{7-0}
\end{array}
\]

\[
\begin{aligned}
& sr \leftarrow \text{ZeroExtend}_{32}(SR); \\
& op1 \leftarrow \text{SignExtend}_{32}(R_m); \\
& \text{IF} \ (\text{FpuIsDisabled}(sr) \ \text{AND} \ \text{IsDelaySlot}()) \\
& \quad \text{THROW SLOTFPUDIS}; \\
& \text{IF} \ (\text{FpuIsDisabled}(sr)) \\
& \quad \text{THROW FPUDIS}; \\
& \text{address} \leftarrow \text{ZeroExtend}_{32}(op1); \\
& \text{value} \leftarrow \text{ReadMemory}_{32}(\text{address}); \\
& fps, pr, sz, fr \leftarrow \text{UnpackFPSCR}(\text{value}); \\
& op1 \leftarrow op1 + 4; \\
& R_m \leftarrow \text{Register}(op1); \\
& \text{FPSCR} \leftarrow \text{ZeroExtend}_{32}(fps); \\
& SR.PR \leftarrow \text{Bit}(pr); \\
& SR.SZ \leftarrow \text{Bit}(sz); \\
& SR.FR \leftarrow \text{Bit}(fr);
\end{aligned}
\]

Exceptions

SLOTFPUDIS, FPUDIS, RADDERR, RTLBMISS, READPROT

Note
LDS Rm, FPUL

Description
This floating-point instruction copies Rm to FPUL.

Operation

\[
\begin{array}{ccc}
0100 & m & 01011010 \\
\text{15-12} & \text{11-8} & \text{7-0}
\end{array}
\]

\[
\begin{aligned}
& sr \leftarrow \text{ZeroExtend}_{32}(SR); \\
& op1 \leftarrow \text{SignExtend}_{32}(Rm); \\
& \text{IF (FpuIsDisabled}(sr) \text{ AND IsDelaySlot())} \\
& \quad \text{THROW SLOTFPUDIS;} \\
& \text{IF (FpuIsDisabled}(sr)) \\
& \quad \text{THROW FPUDIS;} \\
& fpul \leftarrow op1; \\
& FPUL \leftarrow \text{ZeroExtend}_{32}(fpul);
\end{aligned}
\]

Exceptions
SLOTFPUDIS, FPUDIS

Note
**LDS.L @Rm+, FPUL**

**Description**

This floating-point instruction loads FPUL from memory using register indirect with post-increment addressing. A 32-bit value is read from the effective address specified in \( R_m \) and loaded into FPUL. \( R_m \) is post-incremented by 4.

**Operation**

\[
\text{LDS.L @Rm+, FPUL}
\]

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0100</td>
<td>m</td>
<td>01010110</td>
</tr>
</tbody>
</table>

\( sr \leftarrow \text{ZeroExtend}_{32}(SR); \)
\( \text{op1} \leftarrow \text{SignExtend}_{32}(R_m); \)
\( \text{IF (FpuIsDisabled}(sr) \text{ AND IsDelaySlot())} \)
\( \quad \text{THROW SLOTFPUDIS;} \)
\( \text{IF (FpuIsDisabled}(sr)) \)
\( \quad \text{THROW FPUDIS;} \)
\( \text{address} \leftarrow \text{ZeroExtend}_{32}(\text{op1}); \)
\( \text{fpul} \leftarrow \text{ReadMemory}_{32}(\text{address}); \)
\( \text{op1} \leftarrow \text{op1} + 4; \)
\( R_m \leftarrow \text{Register}(\text{op1}); \)
\( \text{FPUL} \leftarrow \text{ZeroExtend}_{32}(\text{fpul}); \)

**Exceptions**

SLOTFPUDIS, FPUDIS, RADDERR, RTLBMISS, READPROT

**Note**
LDS Rm, MACH

Description
This instruction copies \( R_m \) to MACH.

Operation

\[
\begin{array}{ccc}
0100 & m & 00001010 \\
\text{15-12} & \text{11-8} & \text{7-0}
\end{array}
\]

\[ \text{op1} \leftarrow \text{SignExtend}_{32}(R_m); \]
\[ \text{mach} \leftarrow \text{op1}; \]
\[ \text{MACH} \leftarrow \text{ZeroExtend}_{32}(\text{mach}); \]

Note
LDS.L @Rm+, MACH

Description

This instruction loads MACH from memory using register indirect with post-increment addressing. A 32-bit value is read from the effective address specified in $R_m$ and loaded into MACH. $R_m$ is post-incremented by 4.

Operation

```
LDS.L @Rm+, MACH
```

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0100</td>
<td>m</td>
<td>00000110</td>
</tr>
</tbody>
</table>

```
op1 ← SignExtend32(Rm);
address ← ZeroExtend32(op1);
mach ← SignExtend32(ReadMemory32(address));
op1 ← op1 + 4;
Rm ← Register(op1);
MACH ← ZeroExtend32(mach);
```

Exceptions

RADDERR, RTLBMISS, READPROT

Note
LDS Rm, MACL

**Description**

This instruction copies R<sub>m</sub> to MACL.

**Operation**

LDS Rm, MACL

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0100</td>
<td>m</td>
<td>00011010</td>
</tr>
</tbody>
</table>

\[
op1 \leftarrow \text{SignExtend}_{32}(R_m);
\]
\[
\text{macl} \leftarrow \text{op1};
\]
\[
\text{MACL} \leftarrow \text{ZeroExtend}_{32}(\text{macl});
\]

**Note**
LDS.L @Rm+, MACL

Description

This instruction loads MACL from memory using register indirect with post-increment addressing. A 32-bit value is read from the effective address specified in Rm and loaded into MACL. Rm is post-incremented by 4.

Operation

LDS.L @Rm+, MACL

<table>
<thead>
<tr>
<th>15</th>
<th>12</th>
<th>11</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0100</td>
<td>m</td>
<td>00010110</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

\[
\begin{aligned}
\text{op1} & \leftarrow \text{SignExtend}_{32}(R_m); \\
\text{address} & \leftarrow \text{ZeroExtend}_{32}(\text{op1}); \\
\text{macl} & \leftarrow \text{SignExtend}_{32}(\text{ReadMemory}_{32}(\text{address})); \\
\text{op1} & \leftarrow \text{op1} + 4; \\
R_m & \leftarrow \text{Register}(\text{op1}); \\
\text{MACL} & \leftarrow \text{ZeroExtend}_{32}(\text{macl});
\end{aligned}
\]

Exceptions

RADDERR, RTLBMISS, READPROT

Note
LDS Rm, PR

Description
This instruction copies Rm to PR.

Operation

LDS Rm, PR

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0100</td>
<td>m</td>
<td>00101010</td>
</tr>
</tbody>
</table>

op1 ← SignExtend32(Rm);
newpr ← op1;
delayedpr ← newpr;
PR' ← Register(newpr);
PR'' ← Register(delayedpr);

Note
LDS.L @Rm+, PR

Description
This instruction loads PR from memory using register indirect with post-increment addressing. A 32-bit value is read from the effective address specified in Rm and loaded into PR. Rm is post-incremented by 4.

Operation

LDS.L @Rm+, PR

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0100</td>
<td>m</td>
<td>00100110</td>
</tr>
</tbody>
</table>


op1 ← SignExtend32(Rm);
address ← ZeroExtend32(op1);
newpr ← SignExtend32(ReadMemory32(address));
delayedpr ← newpr;
op1 ← op1 + 4;
Rm ← Register(op1);
PR' ← Register(newpr);
PR'' ← Register(delayedpr);

Exceptions
RADDERR, RTLBMISS, READPROT

Note
LDTLB

Description
This instruction loads the contents of the PTEH/PTEL registers into the UTLB (unified translation lookaside buffer) entry specified by MMUCR.URC (the random counter field in the MMU control register).

LDTLB is a privileged instruction, and can only be used in privileged mode. Use of this instruction in user mode will cause a RESINST trap.

Operation

LDTLB

```
0000000000111000
```

```
md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
UTLB[MMUCR.URC].ASID ← PTEH.ASID
UTLB[MMUCR.URC].VPN ← PTEH.VPN
UTLB[MMUCR.URC].PPN ← PTEL.PPN
UTLB[MMUCR.URC].SZ ← (PTEL.SZ1 << 1) + PTEL.SZ0
UTLB[MMUCR.URC].SH ← PTEL.SH
UTLB[MMUCR.URC].PR ← PTEL.PR
UTLB[MMUCR.URC].WT ← PTEL.WT
UTLB[MMUCR.URC].C ← PTEL.C
UTLB[MMUCR.URC].D ← PTEL.D
UTLB[MMUCR.URC].V ← PTEL.V
```

Exceptions
RESINST

Note
As this instruction loads the contents of the PTEH/PTEL registers into a UTLB entry, it should be used either with the MMU disabled, or in the P1 or P2 virtual space with the MMU enabled (see Chapter 3: Memory management unit (MMU) on page 41, for details). After this instruction is issued, there must be at least one instruction between the LDTLB instruction and the execution of an instruction from the areas P0, U0, and P3 (i.e. via a BRAF, BSRF, JMP, JSR, RTS, or RTE).
MAC.L @Rm+, @Rn+

Description
This instruction reads the signed 32-bit value at the effective address specified in Rn, and then post-increments Rn by 4. It also reads the signed 32-bit value at the effective address specified in Rm, and then post-increments Rm by 4. These 2 values are multiplied together to give a 64-bit result, and this result is added to the 64-bit accumulator held in MACL and MACH. This accumulation gives an output with 65 bits of precision.

If the S-bit is 0, the result is the lower 64 bits of the accumulation. If the S-bit is 1, the result is calculated by saturating the accumulation to the signed range \([-2^{47}, 2^{47})\). In either case, the 64-bit result is split into low and high halves, which are placed into MACL and MACH respectively.

Exceptions
RADDERR, RTLBMISS, READPROT

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

If Rm and Rn refer to the same register (i.e. m = n), then this register will be post-incremented twice. The instruction will read two long-words from consecutive memory locations.

Operation

MAC.L @Rm+, @Rn+

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-4</th>
<th>3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0000</td>
<td>n</td>
<td>m</td>
<td>1111</td>
</tr>
</tbody>
</table>

STMicroelectronics and Hitachi, Ltd.

ADCS 7182230F


```plaintext
macl ← ZeroExtend32(MACL);
mach ← ZeroExtend32(MACH);
s ← ZeroExtend1(S);
m_field ← ZeroExtend4(m);
n_field ← ZeroExtend4(n);
m_address ← SignExtend32(Rm);
n_address ← SignExtend32(Rn);
value2 ← SignExtend32(ReadMemory32(ZeroExtend32(n_address)));
n_address ← n_address + 4;
IF (n_field = m_field)
{
    m_address ← m_address + 4;
    n_address ← n_address + 4;
}
value1 ← SignExtend32(ReadMemory32(ZeroExtend32(m_address)));
m_address ← m_address + 4;
mul ← value2 × value1;
mac ← (mach << 32) + macl;
result ← mac + mul;
IF (s = 1)
    IF (((result ⊕ mac) ∧ (result ⊕ mul))<63 FOR 1> = 1)
        IF (mac<63 FOR 1> = 0)
            result ← 2^47 - 1;
        ELSE
            result ← -2^47;
    ELSE
        result ← SignedSaturate48(result);
macl ← result;
mach ← result >> 32;
Rm ← Register(m_address);
Rn ← Register(n_address);
MACL ← ZeroExtend32(macl);
MACH ← ZeroExtend32(mach);
```
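
The accumulate step of MAC.L can be checked with a short executable model. This is a sketch under stated assumptions: `mac_l` and `to_signed` are hypothetical names, the two operands are passed in directly (the memory reads and post-increments are left out), and Python's unbounded integers replace the bit-63 overflow test, so the saturating case is written as a clamp of the exact sum to 48 signed bits.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac_l(mach, macl, value1, value2, s):
    """Return the new (MACH, MACL) pair after one MAC.L accumulate step."""
    mac = to_signed((mach << 32) | macl, 64)            # 64-bit accumulator
    mul = to_signed(value1, 32) * to_signed(value2, 32)  # signed 64-bit product
    result = mac + mul                                    # exact sum
    if s == 1:
        # Saturate to 48 signed bits; this also covers the 64-bit
        # overflow cases handled separately in the pseudocode.
        result = max(-(1 << 47), min((1 << 47) - 1, result))
    result &= MASK64                                      # keep the lower 64 bits
    return (result >> 32) & MASK32, result & MASK32
```

For example, with the accumulator at 2^47 - 1 and a product of 1, the S = 1 case leaves the accumulator unchanged, while the S = 0 case carries into bit 47.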

MAC.W @Rm+, @Rn+

Description

This instruction reads the signed 16-bit value at the effective address specified in Rn, and then post-increments Rn by 2. It also reads the signed 16-bit value at the effective address specified in Rm, and then post-increments Rm by 2. These 2 values are multiplied together to give a 32-bit result.

If the S-bit is 0, the 32-bit multiply result is added to the 64-bit accumulator held in MACL and MACH. This accumulation gives an output with 65 bits of precision, and the result is the lower 64 bits of the accumulation. The result is split into low and high halves, which are placed into MACL and MACH respectively.

If the S-bit is 1, the 32-bit multiply result is added to the 32-bit accumulator held in MACL. This accumulation gives an output with 33 bits of precision, and is saturated to the signed range \([-2^{31}, 2^{31})\), and then placed in MACL. If the accumulation overflows this signed range, then MACH is set to 1 to denote overflow; otherwise MACH is unchanged.

Exceptions
RADDERR, RTLBMISS, READPROT

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

If Rm and Rn refer to the same register (i.e. m = n), then this register will be post-incremented twice. The instruction will read two words from consecutive memory locations.

Operation

\[
\text{MAC.W } @\text{Rm+}, @\text{Rn+}
\]

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-4</th>
<th>3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0100</td>
<td>n</td>
<td>m</td>
<td>1111</td>
</tr>
</tbody>
</table>


```
macl ← ZeroExtend32(MACL);
mach ← ZeroExtend32(MACH);
s ← ZeroExtend1(S);
m_field ← ZeroExtend4(m);
n_field ← ZeroExtend4(n);
m_address ← SignExtend32(Rm);
n_address ← SignExtend32(Rn);
value2 ← SignExtend16(ReadMemory16(ZeroExtend32(n_address)));
n_address ← n_address + 2;
IF (n_field = m_field)
{
    m_address ← m_address + 2;
    n_address ← n_address + 2;
}
value1 ← SignExtend16(ReadMemory16(ZeroExtend32(m_address)));
m_address ← m_address + 2;
mul ← value2 × value1;
IF (s = 1)
{
    macl ← SignExtend32(macl) + mul;
    temp ← SignedSaturate32(macl);
    IF (macl = temp)
        result ← (mach << 32) ∨ ZeroExtend32(macl);
    ELSE
        result ← (0x1 << 32) ∨ ZeroExtend32(temp);
}
ELSE
    result ← ((mach << 32) + macl) + mul;
macl ← result;
mach ← result >> 32;
Rm ← Register(m_address);
Rn ← Register(n_address);
MACL ← ZeroExtend32(macl);
MACH ← ZeroExtend32(mach);
```
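
The two accumulation modes of MAC.W differ in both width and overflow reporting, which a small model makes easier to compare. The sketch below is illustrative only: `mac_w` and `to_signed` are hypothetical names, and the two 16-bit operands are passed in directly rather than read from memory.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac_w(mach, macl, value1, value2, s):
    """Return the new (MACH, MACL) pair after one MAC.W accumulate step."""
    mul = to_signed(value1, 16) * to_signed(value2, 16)  # signed 32-bit product
    if s == 1:
        acc = to_signed(macl, 32) + mul                   # 33-bit precision sum
        sat = max(-(1 << 31), min((1 << 31) - 1, acc))    # SignedSaturate32
        if sat != acc:
            mach = 1                                      # overflow marker
        return mach & MASK32, sat & MASK32
    # S = 0: full 64-bit accumulation, lower 64 bits kept.
    result = (((mach << 32) | macl) + mul) & MASK64
    return (result >> 32) & MASK32, result & MASK32
```

Note that in the S = 1 case MACH is only written on overflow, so a previously set overflow marker persists across later non-overflowing steps.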

**MOV Rm, Rn**

**Description**
This instruction copies the value of R<sub>m</sub> to R<sub>n</sub>.

**Operation**

```
0110  n  m  0011
```

```
op1 ← ZeroExtend32(Rm);
op2 ← op1;
Rn ← Register(op2);
```

**Note**
**MOV #imm, Rn**

**Description**
This instruction sign-extends the 8-bit immediate `s` and places the result in `Rn`.

**Operation**

```
MOV #imm, Rn
```

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>1110</td>
<td>n</td>
<td>s</td>
</tr>
</tbody>
</table>

```
imm ← SignExtend8(s);
op2 ← imm;
Rn ← Register(op2);
```

**Note**
The ‘#imm’ in the assembly syntax represents the immediate `s` after sign extension.
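
As a quick illustration of the sign extension described above, the sketch below computes the 32-bit register value produced for a given 8-bit field `s`. The function names `sign_extend8` and `mov_imm` are hypothetical and chosen only for this example.

```python
MASK32 = 0xFFFFFFFF


def sign_extend8(s):
    """Sign-extend the low 8 bits of s to a Python integer."""
    s &= 0xFF
    return s - 0x100 if s & 0x80 else s


def mov_imm(s):
    """Return the 32-bit value written to Rn by MOV #imm, Rn."""
    return sign_extend8(s) & MASK32
```

An encoded field of 0x80 therefore loads 0xFFFFFF80, which is why the assembler accepts immediates in the range -128 to 127 for this instruction.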
**MOV.B Rm, @Rn**

**Description**

This instruction stores a byte to memory using register indirect with zero-displacement addressing. The effective address is specified in Rn. The byte to be stored is held in the lowest 8 bits of Rm.

**Operation**

\[
\text{MOV.B Rm, @Rn}
\]

\[
\begin{array}{cccc}
0010 & n & m & 0000 \\
\text{15-12} & \text{11-8} & \text{7-4} & \text{3-0}
\end{array}
\]

\[
\text{op1} \leftarrow \text{SignExtend}_{32}(Rm);
\]
\[
\text{op2} \leftarrow \text{SignExtend}_{32}(Rn);
\]
\[
\text{address} \leftarrow \text{ZeroExtend}_{32}(\text{op2});
\]
\[
\text{WriteMemory}_{8}(\text{address, op1});
\]

**Exceptions**

WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

**Note**
MOV.B Rm, @-Rn

Description
This instruction stores a byte to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 1 to give the effective address. The byte to be stored is held in the lowest 8 bits of Rm.

Operation

```
MOV.B Rm, @-Rn
```

```
op1 ← SignExtend32(Rm);
op2 ← SignExtend32(Rn);
address ← ZeroExtend32(op2 - 1);
WriteMemory8(address, op1);
op2 ← address;
Rn ← Register(op2);
```

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
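
The wrap-around behaviour can be seen by pre-decrementing a register that holds 0. The following is a minimal sketch, assuming a register dictionary and a byte-addressed `mem` dictionary; `mov_b_predec` is a hypothetical name and the write exceptions are not modeled.

```python
MASK32 = 0xFFFFFFFF


def mov_b_predec(regs, mem, m, n):
    """Model MOV.B Rm, @-Rn on dict-based registers and memory (assumed)."""
    op1 = regs[m]                       # source value, read before Rn changes
    address = (regs[n] - 1) & MASK32    # ZeroExtend32(op2 - 1) wraps at 0
    mem[address] = op1 & 0xFF           # only the lowest 8 bits are stored
    regs[n] = address                   # Rn holds the decremented address
```

With Rn = 0 the store goes to address 0xFFFFFFFF, and Rn is left holding that address.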
MOV.B Rm, @(R0, Rn)

Description
This instruction stores a byte to memory using register indirect addressing. The effective address is formed by adding R₀ to Rₙ. The byte to be stored is held in the lowest 8 bits of Rₘ.

Operation
MOV.B Rm, @(R0, Rn)

\[
\begin{array}{cccc}
0000 & n & m & 0100 \\
\text{15-12} & \text{11-8} & \text{7-4} & \text{3-0}
\end{array}
\]

\[
r0 \leftarrow \text{SignExtend}_{32}(R0);
\]
\[
op1 \leftarrow \text{SignExtend}_{32}(Rm);
\]
\[
op2 \leftarrow \text{SignExtend}_{32}(Rn);
\]
\[
\text{address} \leftarrow \text{ZeroExtend}_{32}(r0 + op2);
\]
\[
\text{WriteMemory}_{8}(\text{address}, op1);
\]

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
MOV.B R0, @(disp, GBR)

Description
This instruction stores a byte to memory using GBR-relative with displacement addressing. The effective address is formed by adding GBR to the zero-extended 8-bit immediate i. The byte to be stored is held in the lowest 8 bits of R0.

Operation
MOV.B R0, @(disp, GBR)

<table>
<thead>
<tr>
<th>15-8</th>
<th>7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>11000000</td>
<td>i</td>
</tr>
</tbody>
</table>

\[
\begin{aligned}
& gbr \leftarrow \text{SignExtend}_{32}(\text{GBR}); \\
& r0 \leftarrow \text{SignExtend}_{32}(R0); \\
& disp \leftarrow \text{ZeroExtend}_{8}(i); \\
& \text{address} \leftarrow \text{ZeroExtend}_{32}(\text{disp} + \text{gbr}); \\
& \text{WriteMemory}_{8}(\text{address}, r0);
\end{aligned}
\]

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The ‘disp’ in the assembly syntax represents the immediate i after zero extension.
MOV.B R0, @(disp, Rn)

Description
This instruction stores a byte to memory using register indirect with displacement addressing. The effective address is formed by adding Rn and the zero-extended 4-bit immediate i. The byte to be stored is held in the lowest 8 bits of R0.

Operation
MOV.B R0, @(disp, Rn)

<table>
<thead>
<tr>
<th>15-8</th>
<th>7-4</th>
<th>3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>10000000</td>
<td>n</td>
<td>i</td>
</tr>
</tbody>
</table>

r0 ← SignExtend32(R0);
disp ← ZeroExtend4(i);
op2 ← SignExtend32(Rn);
address ← ZeroExtend32(disp + op2);
WriteMemory8(address, r0);

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The 'disp' in the assembly syntax represents the immediate i after zero extension.
**MOV.B @Rm, Rn**

**Description**

This instruction loads a signed byte from memory using register indirect with zero-displacement addressing. The effective address is specified in Rm. The byte is loaded from the effective address, sign-extended and placed in Rn.

**Operation**

```
MOV.B @Rm, Rn
```

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-4</th>
<th>3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0110</td>
<td>n</td>
<td>m</td>
<td>0000</td>
</tr>
</tbody>
</table>

```
op1 ← SignExtend32(Rm);
address ← ZeroExtend32(op1);
op2 ← SignExtend8(ReadMemory8(address));
Rn ← Register(op2);
```

**Exceptions**

RADDERR, RTLBMISS, READPROT

**Note**
MOV.B @Rm+, Rn

Description
This instruction loads a signed byte from memory using register indirect with post-increment addressing. The byte is loaded from the effective address specified in R_m and sign-extended. R_m is post-incremented by 1, and then the loaded byte is placed in R_n.

Operation

\[
\begin{array}{cccc}
0110 & n & m & 0100 \\
\text{15-12} & \text{11-8} & \text{7-4} & \text{3-0}
\end{array}
\]

\[
\begin{aligned}
& m\_field \leftarrow \text{ZeroExtend}_{4}(m); \\
& n\_field \leftarrow \text{ZeroExtend}_{4}(n); \\
& op1 \leftarrow \text{SignExtend}_{32}(R_m); \\
& address \leftarrow \text{ZeroExtend}_{32}(op1); \\
& op2 \leftarrow \text{SignExtend}_{8}(\text{ReadMemory}_{8}(address)); \\
& \text{IF } (m\_field = n\_field) \\
& \quad op1 \leftarrow op2; \\
& \text{ELSE} \\
& \quad op1 \leftarrow op1 + 1; \\
& R_m \leftarrow \text{Register}(op1); \\
& R_n \leftarrow \text{Register}(op2);
\end{aligned}
\]

Exceptions
RADDERR, RTLBMISS, READPROT

Note
If R_m and R_n refer to the same register (i.e. m = n), the result placed in this register will be the sign-extended byte loaded from memory.
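
The m = n special case is easy to get wrong in an emulator, so a short model may help. This is a sketch under assumptions: `mov_b_postinc` is a hypothetical name, registers and byte-addressed memory are plain dictionaries, and the read exceptions are not modeled.

```python
MASK32 = 0xFFFFFFFF


def mov_b_postinc(regs, mem, m, n):
    """Model MOV.B @Rm+, Rn on dict-based registers and memory (assumed)."""
    op1 = regs[m]
    byte = mem[op1 & MASK32] & 0xFF
    # Sign-extend the loaded byte to 32 bits.
    op2 = (byte - 0x100 if byte & 0x80 else byte) & MASK32
    if m == n:
        op1 = op2                   # the loaded value wins over the increment
    else:
        op1 = (op1 + 1) & MASK32    # normal post-increment by 1
    regs[m] = op1
    regs[n] = op2
```

When m and n differ, the address register advances by one byte; when they are the same, the register ends up holding only the sign-extended byte.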
MOV.B @(R0, Rm), Rn

**Description**

This instruction loads a signed byte from memory using register indirect addressing. The effective address is formed by adding $R_0$ to $R_m$. The byte is loaded from the effective address, sign-extended and placed in $R_n$.

**Operation**

\[
\text{MOV.B @(R0, Rm), Rn}
\]

\[
\begin{array}{cccccc}
0000 & n & m & 1100 \\
15 & 12 & 11 & 8 & 7 & 4 & 3 & 0
\end{array}
\]

\[
r_0 \leftarrow \text{SignExtend}_{32}(R_0);
\]
\[
op1 \leftarrow \text{SignExtend}_{32}(R_m);
\]
\[
\text{address} \leftarrow \text{ZeroExtend}_{32}(r_0 + op1);
\]
\[
op2 \leftarrow \text{SignExtend}_{8}(\text{ReadMemory}_{8}(\text{address})) ;
\]
\[
R_n \leftarrow \text{Register}(op2);
\]

**Exceptions**

RADDERR, RTLBMISS, READPROT

**Note**

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
MOV.B @(disp, GBR), R0

Description
This instruction loads a signed byte from memory using GBR-relative with displacement addressing. The effective address is formed by adding GBR to the zero-extended 8-bit immediate i. The byte is loaded from the effective address, sign-extended and placed in R0.

Operation

\[
\text{MOV.B @(disp, GBR), R0}
\]

\[
\begin{array}{cc}
11000100 & i \\
\text{15-8} & \text{7-0}
\end{array}
\]

\[
\begin{aligned}
& gbr \leftarrow \text{SignExtend}_{32}(GBR); \\
& disp \leftarrow \text{ZeroExtend}_{8}(i); \\
& address \leftarrow \text{ZeroExtend}_{32}(\text{disp} + \text{gbr}); \\
& r0 \leftarrow \text{SignExtend}_{8}(\text{ReadMemory}_{8}(\text{address})); \\
& R0 \leftarrow \text{Register}(r0);
\end{aligned}
\]

Exceptions
RADDERR, RTLBMISS, READPROT

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The 'disp' in the assembly syntax represents the immediate i after zero extension.
MOV.B @(disp, Rm), R0

Description
This instruction loads a signed byte from memory using register indirect with displacement addressing. The effective address is formed by adding $R_m$ to the zero-extended 4-bit immediate $i$. The byte is loaded from the effective address, sign-extended and placed in $R_0$.

Operation

\[
\begin{array}{ccc}
10000100 & m & i \\
\text{15-8} & \text{7-4} & \text{3-0}
\end{array}
\]

\[
\begin{aligned}
& \text{disp} \leftarrow \text{ZeroExtend}_{4}(i); \\
& \text{op2} \leftarrow \text{SignExtend}_{32}(R_m); \\
& \text{address} \leftarrow \text{ZeroExtend}_{32}(\text{disp} + \text{op2}); \\
& r_0 \leftarrow \text{SignExtend}_{8}(\text{ReadMemory}_{8}(\text{address})); \\
& R_0 \leftarrow \text{Register}(r_0);
\end{aligned}
\]

Exceptions
RADDERR, RTLBMISS, READPROT

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The ‘disp’ in the assembly syntax represents the immediate $i$ after zero extension.
MOV.L Rm, @Rn

Description
This instruction stores a long-word to memory using register indirect with zero-displacement addressing. The effective address is specified in Rn. The long-word to be stored is held in Rm.

Operation

\[
\begin{array}{ccccccc}
0010 & n & m & 0010 \\
15 & 12 & 11 & 8 & 7 & 4 & 3 & 0 \\
\end{array}
\]

\[
\begin{aligned}
& \text{op1} \leftarrow \text{SignExtend}_{32}(Rm); \\
& \text{op2} \leftarrow \text{SignExtend}_{32}(Rn); \\
& \text{address} \leftarrow \text{ZeroExtend}_{32}(\text{op2}); \\
& \text{WriteMemory}_{32}(\text{address}, \text{op1});
\end{aligned}
\]

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
MOV.L Rm, @-Rn

**Description**

This instruction stores a long-word to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 4 to give the effective address. The long-word to be stored is held in Rm.

**Operation**

```
MOV.L Rm, @-Rn
```

\[
\begin{array}{cccc}
0010 & n & m & 0110 \\
\text{15-12} & \text{11-8} & \text{7-4} & \text{3-0}
\end{array}
\]

\[
\begin{aligned}
& \text{op1} \leftarrow \text{SignExtend}_{32}(Rm); \\
& \text{op2} \leftarrow \text{SignExtend}_{32}(Rn); \\
& \text{address} \leftarrow \text{ZeroExtend}_{32}(\text{op2} - 4); \\
& \text{WriteMemory}_{32}(\text{address}, \text{op1}); \\
& \text{op2} \leftarrow \text{address}; \\
& Rn \leftarrow \text{Register}(\text{op2});
\end{aligned}
\]

**Exceptions**

WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

**Note**

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
**MOV.L Rm, @(R0, Rn)**

**Description**

This instruction stores a long-word to memory using register indirect addressing. The effective address is formed by adding $R_0$ to $R_n$. The long-word to be stored is held in $R_m$.

**Operation**

\[
\text{MOV.L Rm, @(R0, Rn)}
\]

\[
\begin{array}{cccc}
0000 & n & m & 0110 \\
\text{15-12} & \text{11-8} & \text{7-4} & \text{3-0}
\end{array}
\]

- $r0 \leftarrow \text{SignExtend}_{32}(R0)$;
- $op1 \leftarrow \text{SignExtend}_{32}(Rm)$;
- $op2 \leftarrow \text{SignExtend}_{32}(Rn)$;
- $\text{address} \leftarrow \text{ZeroExtend}_{32}(r0 + op2)$;
- $\text{WriteMemory}_{32}(\text{address}, op1)$;

**Exceptions**

WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

**Note**

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
MOV.L R0, @(disp, GBR)

Description

This instruction stores a long-word to memory using GBR-relative with displacement addressing. The effective address is formed by adding GBR to the zero-extended 8-bit immediate i multiplied by 4. The long-word to be stored is held in R0.

Operation

```
11000010 i
```

- `gbr ← SignExtend32(GBR);`
- `r0 ← SignExtend32(R0);`
- `disp ← ZeroExtend8(i) << 2;`
- `address ← ZeroExtend32(disp + gbr);`
- `WriteMemory32(address, r0);`

Exceptions

WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The ‘disp’ in the assembly syntax represents the immediate i after zero extension and scaling.
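
The scaling and wrap-around of the GBR-relative long-word address can be expressed in a few lines. This is a sketch only; `gbr_long_address` is a hypothetical helper name, and the arguments are the raw GBR value and the encoded 8-bit field i.

```python
MASK32 = 0xFFFFFFFF


def gbr_long_address(gbr, i):
    """Effective address for the long-word @(disp, GBR) forms."""
    disp = (i & 0xFF) << 2          # zero-extend the 8-bit field, scale by 4
    return (disp + gbr) & MASK32    # 32-bit wrap-around


# The largest encodable displacement is 0xFF * 4 = 0x3FC bytes.
MAX_DISP = 0xFF << 2
```

Because the field is scaled by 4, the assembly-level displacement must be a multiple of 4 in the range 0 to 1020.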
MOV.L Rm, @(disp, Rn)

Description
This instruction stores a long-word to memory using register indirect with displacement addressing. The effective address is formed by adding Rn to the zero-extended 4-bit immediate i multiplied by 4. The long-word to be stored is held in Rm.

Operation

\[
\text{MOV.L Rm, @(disp, Rn)} \quad 0001 \quad n \quad m \quad i
\]

\[
\begin{align*}
op1 & \leftarrow \text{SignExtend}_{32}(Rm); \\
disp & \leftarrow \text{ZeroExtend}_4(i) \ll 2; \\
op3 & \leftarrow \text{SignExtend}_{32}(Rn); \\
\text{address} & \leftarrow \text{ZeroExtend}_{32}(\text{disp} + \text{op3}); \\
\text{WriteMemory}_{32}(\text{address}, \text{op1});
\end{align*}
\]

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The ‘disp’ in the assembly syntax represents the immediate i after zero extension and scaling.
MOV.L @Rm, Rn

Description
This instruction loads a signed long-word from memory using register indirect with zero-displacement addressing. The effective address is specified in Rm. The long-word is loaded from the effective address and placed in Rn.

Operation

\[
\text{MOV.L @Rm, Rn}
\]

\[
\begin{array}{cccccc}
0110 & n & m & 0010 \\
15 & 12 & 11 & 10 & 9 & 8 & 7 & 6 & 5 & 4 & 3 & 2 & 1 & 0
\end{array}
\]

\[
\begin{aligned}
& \text{op1} \leftarrow \text{SignExtend}_{32}(Rm); \\
& \text{address} \leftarrow \text{ZeroExtend}_{32}(\text{op1}); \\
& \text{op2} \leftarrow \text{SignExtend}_{32}(\text{ReadMemory}_{32}(\text{address})); \\
& Rn \leftarrow \text{Register(op2)};
\end{aligned}
\]

Exceptions
RADDERR, RTLBMISS, READPROT

Note
**MOV.L @Rm+, Rn**

**Description**

This instruction loads a signed long-word from memory using register indirect with post-increment addressing. The long-word is loaded from the effective address specified in Rm. Rm is post-incremented by 4, and then the loaded long-word is placed in Rn.

**Operation**

```
MOV.L @Rm+, Rn

0110  n  m  0110
15 12 11  8  7  4  3  0

m_field ← ZeroExtend4(m);
n_field ← ZeroExtend4(n);
op1 ← SignExtend32(Rm);
address ← ZeroExtend32(op1);
op2 ← SignExtend32(ReadMemory32(address));
IF (m_field = n_field)
   op1 ← op2;
ELSE
   op1 ← op1 + 4;
Rm ← Register(op1);
Rn ← Register(op2);
```

**Exceptions**

RADDERR, RTLBMISS, READPROT

**Note**

If Rm and Rn refer to the same register (i.e. m = n), the result placed in this register will be the long-word loaded from memory.
MOV.L @(R0, Rm), Rn

Description

This instruction loads a signed long-word from memory using register indirect addressing. The effective address is formed by adding R0 to Rm. The long-word is loaded from the effective address and placed in Rn.

Operation

\[
\text{MOV.L @(R0, Rm), Rn}
\]

<table>
<thead>
<tr>
<th>15-12</th>
<th>11-8</th>
<th>7-4</th>
<th>3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0000</td>
<td>n</td>
<td>m</td>
<td>1110</td>
</tr>
</tbody>
</table>

\[
r0 \leftarrow \text{SignExtend}_{32}(R0);
\]
\[
op1 \leftarrow \text{SignExtend}_{32}(Rm);
\]
\[
\text{address} \leftarrow \text{ZeroExtend}_{32}(r0 + op1);
\]
\[
op2 \leftarrow \text{SignExtend}_{32}(\text{ReadMemory}_{32}(\text{address}));
\]
\[
Rn \leftarrow \text{Register}(op2);
\]

Exceptions

RADDERR, RTLBMISS, READPROT

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
**MOV.L @(disp, GBR), R0**

**Description**

This instruction loads a signed long-word from memory using GBR-relative with displacement addressing. The effective address is formed by adding GBR to the zero-extended 8-bit immediate i multiplied by 4. The long-word is loaded from the effective address and placed in R0.

**Operation**

\[
\text{MOV.L } @(\text{disp}, \text{GBR}), \text{R0}
\]

\[
\begin{array}{|c|c|}
\hline
\text{11000110} & \text{i} \\
\hline
15 \ldots 8 & 7 \ldots 0 \\
\hline
\end{array}
\]

\[
\begin{align*}
gbr & \leftarrow \text{SignExtend}_{32}(\text{GBR}); \\
\text{disp} & \leftarrow \text{ZeroExtend}_8(i) \ll 2; \\
\text{address} & \leftarrow \text{ZeroExtend}_{32}(\text{disp} + \text{gbr}); \\
\text{r0} & \leftarrow \text{SignExtend}_{32}(\text{ReadMemory}_{32}(\text{address})); \\
\text{R0} & \leftarrow \text{Register}(\text{r0});
\end{align*}
\]

**Exceptions**

RADDERR, RTLBMISS, READPROT

**Note**

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The ‘disp’ in the assembly syntax represents the immediate i after zero extension and scaling.
**MOV.L @(disp, PC), Rn**

**Description**

This instruction loads a signed long-word from memory using PC-relative with displacement addressing. The effective address is formed by calculating PC+4, clearing the lowest 2 bits, and adding the zero-extended 8-bit immediate \( i \) multiplied by 4. This address calculation ensures that the effective address is correctly aligned for a long-word access regardless of the PC alignment. The long-word is loaded from the effective address and placed in \( R_n \).

**Operation**

\[
\text{MOV.L @(disp, PC), Rn}
\]

\[
\begin{array}{l}
\text{pc} \leftarrow \text{SignExtend}_{32}(\text{PC}); \\
\text{disp} \leftarrow \text{ZeroExtend}_{8}(i) \ll 2; \\
\text{IF (IsDelaySlot())} \\
\quad \text{THROW ILLSLOT;} \\
\text{address} \leftarrow \text{ZeroExtend}_{32}(\text{disp} + ((\text{pc} + 4) \land (\sim \text{0x3}))); \\
\text{op2} \leftarrow \text{SignExtend}_{32}(\text{ReadMemory}_{32}(\text{address})); \\
R_n \leftarrow \text{Register}(\text{op2});
\end{array}
\]

**Exceptions**

ILLSLOT, RADDERR, RTLBMISS, READPROT

**Note**

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

An ILLSLOT exception is raised if this instruction is executed in a delay slot.

The ‘\( \text{disp} \)’ in the assembly syntax represents the immediate \( i \) after zero extension and scaling.
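
The address calculation described above can be sketched in C. The function name is hypothetical; the use of `uint32_t` arithmetic to model the 32-bit wrap-around is an assumption of the sketch, not a statement about any particular implementation.

```c
#include <stdint.h>

/* Effective address of MOV.L @(disp, PC), Rn for an 8-bit immediate i. */
static uint32_t pc_rel_long_address(uint32_t pc, uint8_t i)
{
    uint32_t disp = (uint32_t)i << 2;   /* zero-extend, then scale by 4 */
    uint32_t base = (pc + 4u) & ~0x3u;  /* PC+4 with the lowest 2 bits cleared */
    return disp + base;                 /* uint32_t arithmetic wraps modulo 2^32 */
}
```

Because the base is masked before the scaled displacement is added, the result is always a multiple of 4, even when the instruction itself sits at an address that is only 2-byte aligned.
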
MOV.L @(disp, Rm), Rn

Description
This instruction loads a signed long-word from memory using register indirect with displacement addressing. The effective address is formed by adding \( R_m \) to the zero-extended 4-bit immediate \( i \) multiplied by 4. The long-word is loaded from the effective address and placed in \( R_n \).

Operation

\[
\text{MOV.L } @(\text{disp}, \ R_m), \ R_n
\]

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–4</th><th>3–0</th></tr>
</thead>
<tbody>
<tr><td>0101</td><td>n</td><td>m</td><td>i</td></tr>
</tbody>
</table>

\[
\text{disp} \leftarrow \text{ZeroExtend}_4(i) \ll 2; \\
\text{op2} \leftarrow \text{SignExtend}_{32}(R_m); \\
\text{address} \leftarrow \text{ZeroExtend}_{32}(\text{disp} + \text{op2}); \\
\text{op3} \leftarrow \text{SignExtend}_{32}(\text{ReadMemory}_{32}(\text{address})); \\
\text{R}_n \leftarrow \text{Register}(\text{op3});
\]

Exceptions
RADDERR, RTLBMISS, READPROT

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The ‘disp’ in the assembly syntax represents the immediate \( i \) after zero extension and scaling.
MOV.W Rm, @Rn

Description
This instruction stores a word to memory using register indirect with zero-displacement addressing. The effective address is specified in Rn. The word to be stored is held in the lowest 16 bits of Rm.

Operation

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–4</th><th>3–0</th></tr>
</thead>
<tbody>
<tr><td>0010</td><td>n</td><td>m</td><td>0001</td></tr>
</tbody>
</table>

op1 ← SignExtend32(Rm);
op2 ← SignExtend32(Rn);
address ← ZeroExtend32(op2);
WriteMemory16(address, op1);

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
MOV.W Rm, @-Rn

Description
This instruction stores a word to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 2 to give the effective address. The word to be stored is held in the lowest 16 bits of Rm.

Operation

```
MOV.W Rm, @-Rn
```

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–4</th><th>3–0</th></tr>
</thead>
<tbody>
<tr><td>0010</td><td>n</td><td>m</td><td>0101</td></tr>
</tbody>
</table>

\[
\text{op1} \leftarrow \text{SignExtend}_{32}(R_{m}); \\
\text{op2} \leftarrow \text{SignExtend}_{32}(R_{n}); \\
\text{address} \leftarrow \text{ZeroExtend}_{32}(\text{op2} - 2); \\
\text{WriteMemory}_{16}(\text{address}, \text{op1}); \\
\text{op2} \leftarrow \text{address}; \\
R_{n} \leftarrow \text{Register}(\text{op2});
\]

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
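
The ordering of the pre-decrement, the store and the register update can be sketched in C. The `regs` array, the `mem` buffer, the little-endian byte order and the function name are assumptions made for illustration only.

```c
#include <stdint.h>

/* Hypothetical model state: 16 general registers and a small flat memory. */
static uint32_t regs[16];
static uint8_t mem[64];

/* MOV.W Rm, @-Rn, following the pseudocode above. */
static void mov_w_predec(unsigned m, unsigned n)
{
    uint32_t address = regs[n] - 2u;          /* pre-decrement by the word size */
    uint16_t value = (uint16_t)regs[m];       /* lowest 16 bits of Rm */
    mem[address] = (uint8_t)(value & 0xFFu);  /* assumed little-endian layout */
    mem[address + 1] = (uint8_t)(value >> 8);
    regs[n] = address;                        /* Rn is written back after the store */
}
```

Both source operands are read before Rn is written back, so this sketch stores the original register value even when m = n.
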
MOV.W Rm, @(R0, Rn)

Description
This instruction stores a word to memory using register indirect addressing. The effective address is formed by adding R0 to Rn. The word to be stored is held in the lowest 16 bits of Rm.

Operation

MOV.W Rm, @(R0, Rn)

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–4</th><th>3–0</th></tr>
</thead>
<tbody>
<tr><td>0000</td><td>n</td><td>m</td><td>0101</td></tr>
</tbody>
</table>

r0 ← SignExtend32(R0);
op1 ← SignExtend32(Rm);
op2 ← SignExtend32(Rn);
address ← ZeroExtend32(r0 + op2);
WriteMemory16(address, op1);

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
MOV.W R0, @(disp, GBR)

Description
This instruction stores a word to memory using GBR-relative with displacement addressing. The effective address is formed by adding GBR to the zero-extended 8-bit immediate i multiplied by 2. The word to be stored is held in the lowest 16 bits of R0.

Operation

```
MOV.W R0, @(disp, GBR)
```

<table>
<thead>
<tr><th>15–8</th><th>7–0</th></tr>
</thead>
<tbody>
<tr><td>11000001</td><td>i</td></tr>
</tbody>
</table>

```
gbr ← SignExtend32(GBR);
r0 ← SignExtend32(R0);
disp ← ZeroExtend8(i) << 1;
address ← ZeroExtend32(disp + gbr);
WriteMemory16(address, r0);
```

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The ‘disp’ in the assembly syntax represents the immediate i after zero extension and scaling.
**MOV.W R0, @(disp, Rn)**

**Description**
This instruction stores a word to memory using register indirect with displacement addressing. The effective address is formed by adding R<sub>n</sub> to the zero-extended 4-bit immediate i multiplied by 2. The word to be stored is held in the lowest 16 bits of R0.

**Operation**

```
MOV.W R0, @(disp, Rn)
```

<table>
<thead>
<tr><th>15–8</th><th>7–4</th><th>3–0</th></tr>
</thead>
<tbody>
<tr><td>10000001</td><td>n</td><td>i</td></tr>
</tbody>
</table>

- r0 ← SignExtend<sub>32</sub>(R0);
- disp ← ZeroExtend<sub>4</sub>(i) &lt;&lt; 1;
- op2 ← SignExtend<sub>32</sub>(R<sub>n</sub>);
- address ← ZeroExtend<sub>32</sub>(disp + op2);
- WriteMemory<sub>16</sub>(address, r0);

**Exceptions**
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

**Note**
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The ‘disp’ in the assembly syntax represents the immediate i after zero extension and scaling.
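
The relationship between the encoded immediate i and the scaled 'disp' can be illustrated by decoding the 16-bit opcode of MOV.W R0, @(disp, Rn). The struct and function names below are hypothetical; the sketch assumes the field layout 10000001 (bits 15 to 8), n (bits 7 to 4), i (bits 3 to 0).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of MOV.W R0, @(disp, Rn). */
typedef struct {
    unsigned n;     /* index of the base register Rn */
    uint32_t disp;  /* immediate i after zero extension and scaling by 2 */
} mov_w_r0_disp_rn;

/* Returns false when the opcode is not MOV.W R0, @(disp, Rn). */
static bool decode_mov_w_r0_disp_rn(uint16_t opcode, mov_w_r0_disp_rn *out)
{
    if ((opcode >> 8) != 0x81u)  /* bits 15..8 must be 10000001 */
        return false;
    out->n = (opcode >> 4) & 0xFu;
    out->disp = (uint32_t)(opcode & 0xFu) << 1;
    return true;
}
```

With a 4-bit immediate scaled by 2, the reachable displacements are the even values from 0 to 30.
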
MOV.W @Rm, Rn

Description
This instruction loads a signed word from memory using register indirect with zero-displacement addressing. The effective address is specified in Rm. The word is loaded from the effective address, sign-extended and placed in Rn.
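
Sign extension of the loaded word can be sketched in C as follows. The function name is hypothetical and the code only illustrates the arithmetic; it is not taken from the specification.

```c
#include <stdint.h>

/* Sign-extend a 16-bit word to 32 bits, as done before the result reaches Rn. */
static uint32_t sign_extend16(uint16_t word)
{
    uint32_t value = word;
    if (value & 0x8000u)       /* bit 15 set: the word is negative */
        value |= 0xFFFF0000u;  /* replicate the sign into bits 31..16 */
    return value;
}
```
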

Operation

\[
\text{MOV.W @Rm, Rn}
\]

\[
\begin{array}{|c|c|c|c|}
\hline
0110 & n & m & 0001 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 4 & 3 \ldots 0 \\
\hline
\end{array}
\]

\[
\begin{array}{l}
\text{op1} \leftarrow \text{SignExtend}_{32}(Rm); \\
\text{address} \leftarrow \text{ZeroExtend}_{32}(\text{op1}); \\
\text{op2} \leftarrow \text{SignExtend}_{16}(\text{ReadMemory}_{16}(\text{address})); \\
R_n \leftarrow \text{Register}(\text{op2});
\end{array}
\]

Exceptions
RADDERR, RTLBMISS, READPROT

Note
**MOV.W @Rm+, Rn**

**Description**

This instruction loads a signed word from memory using register indirect with post-increment addressing. The word is loaded from the effective address specified in Rm, and sign-extended. Rm is post-incremented by 2, and then the loaded word is placed in Rn.

**Operation**

```
MOV.W @Rm+, Rn
```

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–4</th><th>3–0</th></tr>
</thead>
<tbody>
<tr><td>0110</td><td>n</td><td>m</td><td>0101</td></tr>
</tbody>
</table>

```
m_field ← ZeroExtend4(m);
n_field ← ZeroExtend4(n);
op1 ← SignExtend32(Rm);
address ← ZeroExtend32(op1);
op2 ← SignExtend16(ReadMemory16(address));
IF (m_field = n_field)
   op1 ← op2;
ELSE
   op1 ← op1 + 2;
Rm ← Register(op1);
Rn ← Register(op2);
```

**Exceptions**

RADDERR, RTLBMISS, READPROT

**Note**

If Rm and Rn refer to the same register (i.e. m = n), the result placed in this register will be the sign-extended word loaded from memory.
MOV.W @(R0, Rm), Rn

Description
This instruction loads a signed word from memory using register indirect addressing. The effective address is formed by adding R0 to Rm. The word is loaded from the effective address, sign-extended and placed in Rn.

Operation

MOV.W @(R0, Rm), Rn

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–4</th><th>3–0</th></tr>
</thead>
<tbody>
<tr><td>0000</td><td>n</td><td>m</td><td>1101</td></tr>
</tbody>
</table>

\[
r0 \leftarrow \text{SignExtend}_{32}(R0);
\]
\[
op1 \leftarrow \text{SignExtend}_{32}(Rm);
\]
\[
\text{address} \leftarrow \text{ZeroExtend}_{32}(r0 + op1);
\]
\[
op2 \leftarrow \text{SignExtend}_{16}(\text{ReadMemory}_{16}(\text{address}));
\]
\[
Rn \leftarrow \text{Register}(op2);
\]

Exceptions
RADDERR, RTLBMISS, READPROT

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
MOV.W @(disp, GBR), R0

Description
This instruction loads a signed word from memory using GBR-relative with
displacement addressing. The effective address is formed by adding GBR to the
zero-extended 8-bit immediate i multiplied by 2. The word is loaded from the
effective address, sign-extended and placed in R0.

Operation
MOV.W @(disp, GBR), R0

<table>
<thead>
<tr><th>15–8</th><th>7–0</th></tr>
</thead>
<tbody>
<tr><td>11000101</td><td>i</td></tr>
</tbody>
</table>

gbr ← SignExtend32(GBR);
disp ← ZeroExtend8(i) << 1;
address ← ZeroExtend32(disp + gbr);
r0 ← SignExtend16(ReadMemory16(address));
R0 ← Register(r0);

Exceptions
RADDERR, RTLBMISS, READPROT

Note
The effective address calculation is performed using 32-bit zero extension to cause
wrap around if the address-space bounds are exceeded.

The ‘disp’ in the assembly syntax represents the immediate i after zero extension
and scaling.
MOV.W @(disp, PC), Rn

Description
This instruction loads a signed word from memory using PC-relative with displacement addressing. The effective address is formed by calculating PC+4, and adding the zero-extended 8-bit immediate i multiplied by 2. The word is loaded from the effective address, sign-extended and placed in Rn.

Operation

MOV.W @(disp, PC), Rn

\[
\begin{array}{|c|c|c|}
\hline
1001 & n & i \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 0 \\
\hline
\end{array}
\]

\[
\begin{array}{l}
pc \leftarrow \text{SignExtend}_{32}(PC); \\
disp \leftarrow \text{ZeroExtend}_8(i) \ll 1; \\
\text{IF (IsDelaySlot())} \\
\quad \text{THROW ILLSLOT;} \\
\text{address} \leftarrow \text{ZeroExtend}_{32}(\text{disp} + (pc + 4)); \\
op2 \leftarrow \text{SignExtend}_{16}(\text{ReadMemory}_{16}(\text{address})); \\
R_n \leftarrow \text{Register}(op2);
\end{array}
\]

Exceptions
ILLSLOT, RADDERR, RTLBMISS, READPROT

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

An ILLSLOT exception is raised if this instruction is executed in a delay slot.

The 'disp' in the assembly syntax represents the immediate i after zero extension and scaling.
MOV.W @(disp, Rm), R0

Description

This instruction loads a signed word from memory using register indirect with displacement addressing. The effective address is formed by adding \( R_m \) to the zero-extended 4-bit immediate \( i \) multiplied by 2. The word is loaded from the effective address, sign-extended and placed in R0.

Operation

\[
\text{MOV.W @(disp, Rm), R0}
\]

\[
\begin{array}{l}
\text{disp} \leftarrow \text{ZeroExtend}_4(i) \ll 1; \\
\text{op2} \leftarrow \text{SignExtend}_{32}(R_m); \\
\text{address} \leftarrow \text{ZeroExtend}_{32}(\text{disp} + \text{op2}); \\
\text{r0} \leftarrow \text{SignExtend}_{16}(\text{ReadMemory}_{16}(\text{address})); \\
\text{R0} \leftarrow \text{Register}(\text{r0});
\end{array}
\]

Exceptions

RADDERR, RTLBMISS, READPROT

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The ‘disp’ in the assembly syntax represents the immediate \( i \) after zero extension and scaling.
**MOVA @(disp, PC), R0**

**Description**

This instruction calculates an effective address using PC-relative with displacement addressing. The effective address is formed by calculating PC+4, clearing the lowest 2 bits, and adding the zero-extended 8-bit immediate i multiplied by 4. This address calculation ensures that the effective address is correctly aligned for a long-word access regardless of the PC alignment. The effective address is placed in R0.

**Operation**

\[
\text{MOVA @(disp, PC), R0}
\]

<table>
<thead>
<tr><th>15–8</th><th>7–0</th></tr>
</thead>
<tbody>
<tr><td>11000111</td><td>i</td></tr>
</tbody>
</table>

\[
\begin{array}{l}
\text{pc} \leftarrow \text{SignExtend}_{32}(PC); \\
\text{disp} \leftarrow \text{ZeroExtend}_8(i) \ll 2; \\
\text{IF (IsDelaySlot())} \\
\quad \text{THROW ILLSLOT;} \\
\text{r0} \leftarrow \text{disp} + ((\text{pc} + 4) \land (\sim 0x3)); \\
R_0 \leftarrow \text{Register(r0)};
\end{array}
\]

**Exceptions**

ILLSLOT

**Note**

This instruction only computes the effective address; no memory request is made. An ILLSLOT exception is raised if this instruction is executed in a delay slot.

The 'disp' in the assembly syntax represents the immediate i after zero extension and scaling.
**MOVCA.L R0, @Rn**

**Description**

This instruction stores the long-word in R0 to memory at the effective address specified in Rn. It provides a hint to the implementation that it is not necessary to retrieve the data of this operand cache block from memory. It is implementation-specific as to whether the memory access will occur.

The effective address specified in Rn identifies a surrounding block of memory, which starts at an address aligned to the cache block size and has a size equal to the cache block size. The cache block size is implementation dependent.

MOVCA.L checks for address error, translation miss and protection exception cases.

Apart from the written long-word, the value of all other locations in the memory block targeted by a MOVCA.L becomes architecturally undefined. Programs must not rely on these values. For compatibility with other implementations, software must exercise care when using MOVCA.L.
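
The surrounding block identified by an effective address can be computed by masking off the low address bits. The sketch below assumes a 32-byte block purely for illustration; the architecture leaves the size implementation dependent, and `CACHE_BLOCK_SIZE` and the function name are hypothetical.

```c
#include <stdint.h>

/* Assumed for illustration only: a 32-byte operand cache block. The real
   size is implementation dependent and must come from the device manual. */
#define CACHE_BLOCK_SIZE 32u

/* Start address of the cache block that contains the given address. */
static uint32_t cache_block_base(uint32_t address)
{
    return address & ~(CACHE_BLOCK_SIZE - 1u);
}
```

Every location from `cache_block_base(a)` up to `cache_block_base(a) + CACHE_BLOCK_SIZE - 1`, other than the long-word actually written, is what becomes undefined after MOVCA.L under this assumption.
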

**Operation**

MOVCA.L R0, @Rn

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–0</th></tr>
</thead>
<tbody>
<tr><td>0000</td><td>n</td><td>11000011</td></tr>
</tbody>
</table>

\[
\begin{array}{l}
r0 \leftarrow \text{SignExtend}_{32}(R0); \\
\text{op1} \leftarrow \text{SignExtend}_{32}(Rn); \\
\text{IF (AddressUnavailable(op1))} \\
\quad \text{THROW WADDERR, op1;} \\
\text{IF (MMU() AND DataAccessMiss(op1))} \\
\quad \text{THROW WTLBMISS, op1;} \\
\text{IF (MMU() AND WriteProhibited(op1))} \\
\quad \text{THROW WRITEPROT, op1;} \\
\text{IF (MMU() AND NOT DirtyBit(op1))} \\
\quad \text{THROW FIRSTWRITE, op1;} \\
\text{ALLOC}(\text{op1}); \\
\text{address} \leftarrow \text{ZeroExtend}_{32}(\text{op1}); \\
\text{WriteMemory}_{32}(\text{op1}, r0);
\end{array}
\]
Exceptions

WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE
MOVT Rn

Description
This instruction copies the T-bit to Rn.

Operation

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–0</th></tr>
</thead>
<tbody>
<tr><td>0000</td><td>n</td><td>00101001</td></tr>
</tbody>
</table>

\[ t \leftarrow \text{ZeroExtend}_1(T); \]
\[ \text{op1} \leftarrow t; \]
\[ R_n \leftarrow \text{Register(op1)}; \]

Note
MUL.L Rm, Rn

Description

This instruction multiplies the 32-bit value in Rm by the 32-bit value in Rn, and places the least significant 32 bits of the result in MACL. The most significant 32 bits of the result are not provided, and MACH is not modified.

Operation

\[
\begin{array}{|c|c|c|c|}
\hline
0000 & n & m & 0111 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 4 & 3 \ldots 0 \\
\hline
\end{array}
\]

\begin{align*}
op1 & \leftarrow \text{SignExtend}_{32}(R_m); \\
op2 & \leftarrow \text{SignExtend}_{32}(R_n); \\
macl & \leftarrow \text{op1} \times \text{op2}; \\
MACL & \leftarrow \text{ZeroExtend}_{32}(macl);
\end{align*}

Note
MULS.W Rm, Rn

Description

This instruction multiplies the signed lowest 16 bits of Rm by the signed lowest 16 bits of Rn, and places the full 32-bit result in MACL. MACH is not modified.

Operation

MULS.W Rm, Rn

\[
\begin{array}{|c|c|c|c|}
\hline
0010 & n & m & 1111 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 4 & 3 \ldots 0 \\
\hline
\end{array}
\]

\[
\begin{align*}
\text{op1} & \leftarrow \text{SignExtend}_{16} (\text{SignExtend}_{32} (R_m)) ; \\
\text{op2} & \leftarrow \text{SignExtend}_{16} (\text{SignExtend}_{32} (R_n)) ; \\
\text{macl} & \leftarrow \text{op1} \times \text{op2} ; \\
\text{MACL} & \leftarrow \text{ZeroExtend}_{32} (\text{macl}) ;
\end{align*}
\]

Note
MULU.W Rm, Rn

Description

This instruction multiplies the unsigned lowest 16 bits of Rm by the unsigned lowest 16 bits of Rn, and places the full 32-bit result in MACL. MACH is not modified.

Operation

MULU.W Rm, Rn

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–4</th><th>3–0</th></tr>
</thead>
<tbody>
<tr><td>0010</td><td>n</td><td>m</td><td>1110</td></tr>
</tbody>
</table>

- \( op1 \leftarrow \text{ZeroExtend}_{16}(\text{SignExtend}_{32}(R_m)) \);
- \( op2 \leftarrow \text{ZeroExtend}_{16}(\text{SignExtend}_{32}(R_n)) \);
- \( \text{macl} \leftarrow op1 \times op2 \);
- \( \text{MACL} \leftarrow \text{ZeroExtend}_{32}(\text{macl}) \);
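
The difference between the signed and unsigned 16-bit multiplies can be sketched in C. The function names mirror the mnemonics but are otherwise hypothetical, and returning the MACL value from a function is a modelling choice made for this example.

```c
#include <stdint.h>

/* Interpret the lowest 16 bits of a register value as a signed number. */
static int32_t low16_signed(uint32_t reg)
{
    return (int32_t)(reg & 0x7FFFu) - (int32_t)(reg & 0x8000u);
}

/* MULS.W: signed 16 x 16 multiply, result as it would appear in MACL. */
static uint32_t muls_w(uint32_t rm, uint32_t rn)
{
    return (uint32_t)(low16_signed(rm) * low16_signed(rn));
}

/* MULU.W: unsigned 16 x 16 multiply, result as it would appear in MACL. */
static uint32_t mulu_w(uint32_t rm, uint32_t rn)
{
    return (rm & 0xFFFFu) * (rn & 0xFFFFu);
}
```

Only the lowest 16 bits of each source register take part, so the upper halves of Rm and Rn never affect the result. The same operand 0xFFFF is treated as -1 by MULS.W and as 65535 by MULU.W.
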

Note
NEG Rm, Rn

Description
This instruction subtracts Rm from zero and places the result in Rn.

Operation

\[
\text{NEG } R_m, R_n
\]

\[
\begin{array}{|c|c|c|c|}
\hline
0110 & n & m & 1011 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 4 & 3 \ldots 0 \\
\hline
\end{array}
\]

\[
\text{op1} \leftarrow \text{SignExtend}_{32}(R_m);
\]
\[
\text{op2} \leftarrow - \text{op1};
\]
\[
R_n \leftarrow \text{Register(op2)};
\]

Note
**NEGC Rm, Rn**

**Description**

This instruction subtracts Rm and the T-bit from zero and places the result in Rn. The borrow from the subtraction is placed in the T-bit.

**Operation**

NEGC Rm, Rn

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–4</th><th>3–0</th></tr>
</thead>
<tbody>
<tr><td>0110</td><td>n</td><td>m</td><td>1010</td></tr>
</tbody>
</table>

\[
t \leftarrow \text{ZeroExtend}_1(T); \\
\text{op1} \leftarrow \text{ZeroExtend}_{32}(R_m); \\
\text{op2} \leftarrow (-\text{op1}) - t; \\
t \leftarrow \text{op2}_{\langle 32 \text{ FOR } 1 \rangle}; \\
R_n \leftarrow \text{Register(op2)}; \\
T \leftarrow \text{Bit}(t);
\]
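
The borrow handling can be sketched in C, together with one way the instruction might be chained to negate a value wider than 32 bits. The function names and the two-word representation are assumptions made for this example.

```c
#include <stdint.h>

/* NEGC Rm, Rn: returns 0 - rm - T and replaces *t with the borrow. */
static uint32_t negc(uint32_t rm, unsigned *t)
{
    uint64_t result = (uint64_t)0 - rm - (*t & 1u);
    *t = (unsigned)((result >> 32) & 1u);  /* bit 32 of the result is the borrow */
    return (uint32_t)result;
}

/* Negate a 64-bit value held as two 32-bit halves, low half first. */
static void neg64(uint32_t *hi, uint32_t *lo)
{
    unsigned t = 0;       /* start with the T-bit cleared */
    *lo = negc(*lo, &t);  /* borrow out of the low half ... */
    *hi = negc(*hi, &t);  /* ... is consumed by the high half */
}
```

The T-bit ends up set whenever the subtraction from zero had to borrow, which is the case for every input except Rm = 0 with T = 0.
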

**Note**
NOP

Description
This instruction performs no operation.

Operation

```
NOP
```

```
0000000000001001
15             0
```

NOT Rm, Rn

Description
This instruction performs a bitwise NOT on Rm and places the result in Rn.

Operation

\[
\begin{array}{|c|c|c|c|}
\hline
0110 & n & m & 0111 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 4 & 3 \ldots 0 \\
\hline
\end{array}
\]

\[
\text{op1} \leftarrow \text{ZeroExtend}_{32}(R_m); \\
\text{op2} \leftarrow \sim \text{op1}; \\
R_n \leftarrow \text{Register(op2)};
\]

Note
OCBI @Rn

Description
This instruction invalidates an operand cache block (if any) that corresponds to a specified effective address. If the data in the operand cache block is dirty, it is discarded without write-back to memory. Immediately after execution of OCBI, assuming no exception was raised, it is guaranteed that the targeted memory block in physical address space is not present in the operand cache.

The effective address specified in Rn identifies a surrounding block of memory, which starts at an address aligned to the cache block size and has a size equal to the cache block size. The cache block size is implementation dependent.

OCBI invalidates an implementation-dependent amount of data. For compatibility with other implementations, software must exercise care when using OCBI.

OCBI checks for address error, translation miss and protection exception cases.

Operation

```
OCBI @Rn
```

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–0</th></tr>
</thead>
<tbody>
<tr><td>0000</td><td>n</td><td>10010011</td></tr>
</tbody>
</table>

\[
\text{op1} \leftarrow \text{SignExtend}_{32}(R_n);
\]

\[
\begin{align*}
\text{IF} \ (\text{AddressUnavailable}(\text{op1})) \\
& \quad \text{THROW WADDERR, op1;}
\end{align*}
\]

\[
\begin{align*}
\text{IF} \ (\text{MMU}() \ \text{AND DataAccessMiss}(\text{op1})) \\
& \quad \text{THROW WTLBMISS, op1;}
\end{align*}
\]

\[
\begin{align*}
\text{IF} \ (\text{MMU}() \ \text{AND WriteProhibited}(\text{op1})) \\
& \quad \text{THROW WRITEPROT, op1;}
\end{align*}
\]

\[
\begin{align*}
\text{IF} \ (\text{MMU}() \ \text{AND NOT DirtyBit}(\text{op1})) \\
& \quad \text{THROW FIRSTWRITE, op1}
\end{align*}
\]

OCBI(op1);

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
**OCBP @Rn**

**Description**

This instruction purges an operand cache block (if any) that corresponds to a specified effective address. If the data in the operand cache block is dirty, it is written back to memory before being discarded. Immediately after execution of OCBP, assuming no exception was raised, it is guaranteed that the targeted memory block in physical address space is not present in the operand cache.

The effective address specified in Rn identifies a surrounding block of memory, which starts at an address aligned to the cache block size and has a size equal to the cache block size. The cache block size is implementation dependent.

OCBP checks for address error, translation miss and protection exception cases.

**Operation**

```
OCBP @Rn
```

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–0</th></tr>
</thead>
<tbody>
<tr><td>0000</td><td>n</td><td>10100011</td></tr>
</tbody>
</table>

```
op1 ← SignExtend32(Rn);
IF (AddressUnavailable(op1))
    THROW RADDERR, op1;
IF (MMU() AND DataAccessMiss(op1))
    THROW RTLBMISS, op1;
IF (MMU() AND (ReadProhibited(op1) AND WriteProhibited(op1)))
    THROW READPROT, op1;
OCBP(op1);
```

**Exceptions**

RADDERR, RTLBMISS, READPROT

**Note**
**OCBWB @Rn**

**Description**

This instruction writes back an operand cache block (if any) that corresponds to a specified effective address. If the data in the operand cache block is dirty, it is written back to memory but is not discarded. Immediately after execution of OCBWB, assuming no exception was raised, it is guaranteed that the targeted memory block in physical address space will not be dirty in the operand cache.

The effective address specified in Rn identifies a surrounding block of memory, which starts at an address aligned to the cache block size and has a size equal to the cache block size. The cache block size is implementation dependent.

OCBWB checks for address error, translation miss and protection exception cases.
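
The three operand cache block instructions (OCBI, OCBP and OCBWB) differ only in whether dirty data is written back and whether the block stays in the cache. The C sketch below models that difference for a single block; the `cache_block` type and the function names are hypothetical, and exception checks are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical model of one operand cache block. */
typedef struct {
    bool valid;
    bool dirty;
} cache_block;

/* Each function returns true when the block's data is written back to memory. */
static bool ocbi(cache_block *b)   /* invalidate: dirty data is discarded */
{
    b->valid = false;
    b->dirty = false;
    return false;
}

static bool ocbp(cache_block *b)   /* purge: write back if dirty, then discard */
{
    bool wrote_back = b->valid && b->dirty;
    b->valid = false;
    b->dirty = false;
    return wrote_back;
}

static bool ocbwb(cache_block *b)  /* write back if dirty, keep the block */
{
    bool wrote_back = b->valid && b->dirty;
    b->dirty = false;
    return wrote_back;
}
```

In this model OCBI and OCBP always leave the block absent from the cache, while OCBWB leaves a present block in place and merely clean.
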

**Operation**

```
OCBWB @Rn
```

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–0</th></tr>
</thead>
<tbody>
<tr><td>0000</td><td>n</td><td>10110011</td></tr>
</tbody>
</table>

\[
\begin{array}{l}
\text{op1} \leftarrow \text{SignExtend}_{32}(R_n); \\
\text{IF (AddressUnavailable(op1))} \\
\quad \text{THROW RADDERR, op1;} \\
\text{IF (MMU() AND DataAccessMiss(op1))} \\
\quad \text{THROW RTLBMISS, op1;} \\
\text{IF (MMU() AND (ReadProhibited(op1) AND WriteProhibited(op1)))} \\
\quad \text{THROW READPROT, op1;} \\
\text{OCBWB(op1);}
\end{array}
\]

**Exceptions**

RADDERR, RTLBMISS, READPROT

**Note**

OR Rm, Rn

Description
This instruction performs a bitwise OR of Rm with Rn and places the result in Rn.

Operation

```
OR Rm, Rn
```

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–4</th><th>3–0</th></tr>
</thead>
<tbody>
<tr><td>0010</td><td>n</td><td>m</td><td>1011</td></tr>
</tbody>
</table>

```
op1 ← ZeroExtend32(Rm);
op2 ← ZeroExtend32(Rn);
op2 ← op2 ∨ op1;
Rn ← Register(op2);
```

Note
OR #imm, R0

Description
This instruction performs a bitwise OR of R0 with the zero-extended 8-bit immediate i and places the result in R0.

Operation

OR #imm, R0

<table>
<thead>
<tr><th>15–8</th><th>7–0</th></tr>
</thead>
<tbody>
<tr><td>11001011</td><td>i</td></tr>
</tbody>
</table>

\[ r0 \leftarrow \text{ZeroExtend}_{32}(R0); \]
\[ \text{imm} \leftarrow \text{ZeroExtend}_8(i); \]
\[ r0 \leftarrow r0 \lor \text{imm}; \]
\[ R0 \leftarrow \text{Register}(r0); \]

Note
The ‘#imm’ in the assembly syntax represents the immediate i after zero extension.
OR.B #imm, @(R0, GBR)

Description

This instruction performs a bitwise OR of an immediate constant with 8 bits of data held in memory. The effective address is calculated by adding R0 and GBR. The 8 bits of data at the effective address are read. A bitwise OR is performed of the read data with the zero-extended 8-bit immediate i. The result is written back to the 8 bits of data at the same effective address.

Operation

OR.B #imm, @(R0, GBR)

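<table>
<thead>
<tr><th>15–8</th><th>7–0</th></tr>
</thead>
<tbody>
<tr><td>11001111</td><td>i</td></tr>
</tbody>
</table>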

r0 ← SignExtend32(R0);
gbr ← SignExtend32(GBR);
imm ← ZeroExtend8(i);
address ← ZeroExtend32(r0 + gbr);
value ← ZeroExtend8(ReadMemory8(address));
value ← value ∨ imm;
WriteMemory8(address, value);

Exceptions

WADDERR, WTLBMISS, READPROT, WRITEPROT, FIRSTWRITE

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The ‘#imm’ in the assembly syntax represents the immediate i after zero extension.
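
The read-modify-write sequence can be sketched in C. The caller-supplied `memory` buffer and the function name are assumptions made for the example; the sketch performs no exception checks and the caller must ensure that the address falls inside the buffer.

```c
#include <stdint.h>

/* OR.B #imm, @(R0, GBR) on a caller-supplied model memory. */
static void or_b(uint8_t *memory, uint32_t r0, uint32_t gbr, uint8_t imm)
{
    uint32_t address = r0 + gbr;      /* uint32_t addition wraps modulo 2^32 */
    uint8_t value = memory[address];  /* read the 8 bits of data */
    value |= imm;                     /* OR with the zero-extended immediate */
    memory[address] = value;          /* write back to the same address */
}
```
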
**PREF @Rn**

**Description**

This instruction indicates a software-directed data prefetch from the specified effective address. Software can use this instruction to give advance notice that particular data will be required. It is implementation-specific as to whether a prefetch will be performed.

The effective address specified in Rn identifies a surrounding block of memory, which starts at an address aligned to the cache block size and has a size equal to the cache block size. The cache block size is implementation dependent.

Any OTLBMULTI_HIT or RADDERR exception is delivered. All other exceptions are discarded, and in those cases the prefetch has no effect.

The semantics of a PREF instruction applied to an address in the store queues range (0xE0000000 to 0xE3FFFFFF) differ substantially from those elsewhere. For details, refer to Section 4.6: Store queues on page 101.
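
Software that needs to tell the two behaviours apart can test the address against the range quoted above. The helper below is a sketch with a hypothetical name; it only encodes the two bounds given in the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the address lies in the store queues range 0xE0000000 to 0xE3FFFFFF. */
static bool in_store_queue_range(uint32_t address)
{
    return address >= 0xE0000000u && address <= 0xE3FFFFFFu;
}
```
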

**Operation**

\[
\text{PREF @Rn}
\]

\[
\begin{array}{|c|c|c|}
\hline
0000 & n & 10000011 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 0 \\
\hline
\end{array}
\]

\[
\begin{array}{l}
op1 \leftarrow \text{SignExtend}_{32}(Rn); \\
\text{IF (AddressUnavailable(op1))} \\
\quad \text{THROW RADDERR, op1;} \\
\text{IF (NOT (MMU() AND DataAccessMiss(op1)))} \\
\quad \text{IF (NOT (MMU() AND ReadProhibited(op1)))} \\
\quad \quad \text{PREF(op1);}
\end{array}
\]

**Exceptions**

RADDERR, OTLBMULTI_HIT

**Note**
ROTCL Rn

Description
This instruction performs a one-bit left rotation of the bits held in Rn and the T-bit. The 32-bit value in Rn is shifted one bit to the left, the least significant bit is given the old value of the T-bit, and the bit that is shifted out is moved to the T-bit.

Operation

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>00100100</td></tr>
</tbody>
</table>

\[
t \leftarrow \text{ZeroExtend}_1(T); \\
op1 \leftarrow \text{ZeroExtend}_{32}(R_n); \\
op1 \leftarrow (op1 \ll 1) \lor t; \\
t \leftarrow op1_{\langle 32 \text{ FOR } 1 \rangle}; \\
R_n \leftarrow \text{Register}(op1); \\
T \leftarrow \text{Bit}(t);
\]

Note
**ROTCR Rn**

**Description**

This instruction performs a one-bit right rotation of the bits held in Rn and the T-bit. The 32-bit value in Rn is shifted one bit to the right, the most significant bit is given the old value of the T-bit, and the bit that is shifted out is moved to the T-bit.

**Operation**

ROTCR Rn

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>00100101</td></tr>
</tbody>
</table>

```
t ← ZeroExtend1(T);
op1 ← ZeroExtend32(Rn);
oldt ← t;
t ← op1<0 FOR 1 >;
op1 ← (op1 >> 1) ∨ (oldt << 31);
R_n ← Register(op1);
T ← Bit(t);
```
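
The two rotate-through-carry instructions, ROTCL and ROTCR, can be sketched in C as 33-bit rotations over the register and the T-bit. The function names and the use of an `unsigned` to hold T are assumptions made for illustration.

```c
#include <stdint.h>

/* ROTCL Rn: rotate left through T. Returns the new Rn and updates *t. */
static uint32_t rotcl(uint32_t rn, unsigned *t)
{
    unsigned out = (rn >> 31) & 1u;  /* bit shifted out of the top */
    rn = (rn << 1) | (*t & 1u);      /* old T enters at bit 0 */
    *t = out;
    return rn;
}

/* ROTCR Rn: rotate right through T. Returns the new Rn and updates *t. */
static uint32_t rotcr(uint32_t rn, unsigned *t)
{
    unsigned out = rn & 1u;                        /* bit shifted out of the bottom */
    rn = (rn >> 1) | ((uint32_t)(*t & 1u) << 31);  /* old T enters at bit 31 */
    *t = out;
    return rn;
}
```

Because T takes part in the rotation, applying one of these functions and then the other restores both the register value and T.
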
ROTL Rn

Description
This instruction performs a one-bit left rotation of the bits held in Rn. The 32-bit value in Rn is shifted one bit to the left, and the least significant bit is given the value of the bit that is shifted out. The bit that is shifted out of the operand is also copied to the T-bit.

Operation

ROTL Rn

<table>
<thead>
<tr><th>15–12</th><th>11–8</th><th>7–0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>00000100</td></tr>
</tbody>
</table>

\[
\text{op1} \leftarrow \text{ZeroExtend}_{32}(R_n);
\]
\[
t \leftarrow \text{op1}_{\langle 31 \text{ FOR } 1 \rangle};
\]
\[
\text{op1} \leftarrow (\text{op1} \ll 1) \lor t;
\]
\[
R_n \leftarrow \text{Register(op1)};
\]
\[
T \leftarrow \text{Bit}(t);
\]

Note
ROTR Rn

Description

This instruction performs a one-bit right rotation of the bits held in Rn. The 32-bit value in Rn is shifted one bit to the right, and the most significant bit is given the value of the bit that is shifted out. The bit that is shifted out of the operand is also copied to the T-bit.

Operation

```plaintext
ROTR Rn

0100  n  00000101
15 12 11  8  7         0

op1 ← ZeroExtend32(Rn);
t ← op1<0 FOR 1 >;
op1 ← (op1 >> 1) ∨ (t << 31);
Rn ← Register(op1);
T ← Bit(t);
```
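
The ROTL and ROTR operations can be sketched in C as follows. This is an illustrative model under stated assumptions, not a definitive implementation: the names `rotl` and `rotr` and the pointer used to return the T-bit are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of ROTL: bit 31 wraps to bit 0 and is copied to T. */
static uint32_t rotl(uint32_t rn, unsigned *t)
{
    *t = (rn >> 31) & 1u;                 /* t <- op1<31 FOR 1> */
    return (rn << 1) | *t;
}

/* Hypothetical model of ROTR: bit 0 wraps to bit 31 and is copied to T. */
static uint32_t rotr(uint32_t rn, unsigned *t)
{
    *t = rn & 1u;                         /* t <- op1<0 FOR 1>  */
    return (rn >> 1) | ((uint32_t)*t << 31);
}
```

Unlike the rotate-through-T forms, the previous value of the T-bit does not take part in the result.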

Note
RTE

Description

This instruction returns from an exception or interrupt handling routine by
restoring the PC and SR values from SPC and SSR. Program execution continues
from the address specified by the restored PC value.

RTE is a privileged instruction, and can only be used in privileged mode. Use of this
instruction in user mode will cause a RESINST exception.

Operation

RTE

\[
\begin{array}{c}
0000000000101011 \\
\end{array}
\]

\[
\begin{array}{l}
md \leftarrow \text{ZeroExtend}_{1}(MD); \\
\text{IF } (md = 0) \\
\quad \text{THROW RESINST}; \\
ssr \leftarrow \text{SignExtend}_{32}(SSR); \\
pc \leftarrow \text{SignExtend}_{32}(SPC); \\
\text{IF (IsDelaySlot())} \\
\quad \text{THROW ILLSLOT}; \\
target \leftarrow pc; \\
delayedpc \leftarrow target \land (\sim 0x1); \\
PC^* \leftarrow \text{Register}(delayedpc);
\end{array}
\]

Exceptions

RESINST, ILLSLOT

Note

Since this is a delayed branch instruction, the instruction in the delay slot is
executed before branching and must not generate an exception.

An ILLSLOT exception is raised if this instruction is executed in a delay slot.

Interrupts are not accepted between this instruction and the instruction in the
delay slot.

The SR value defined prior to RTE execution is used to fetch the instruction in the RTE delay slot. However, the SR value used during execution of the instruction in the delay slot is the one restored from SSR by the RTE instruction. Because of this, it is recommended that privileged instructions are not placed in the delay slot.

If the branch target address is invalid, the IADDERR trap is not delivered until after the instruction in the delay slot has executed and the PC has advanced to the target address. That is, the exception is associated with the target instruction, not the branch.

The behavior is architecturally undefined if the instruction in an RTE delay slot raises an exception. For this reason, it is recommended that only simple instructions that cannot generate exceptions are placed in RTE delay slots (unless considerable care is taken).
RTS

Description
This instruction is a delayed unconditional branch used for returning from a
subroutine. The value in PR specifies the target address.

Operation

RTS

\[
\begin{array}{c}
0000000000001011 \\
\end{array}
\]

\[
\begin{array}{l}
pr \leftarrow \text{SignExtend}_{32}(PR); \\
\text{IF (IsDelaySlot())} \\
\quad \text{THROW ILLSLOT}; \\
target \leftarrow pr; \\
delayedpc \leftarrow target \land (\sim 0x1); \\
PC^* \leftarrow \text{Register}(delayedpc);
\end{array}
\]

Exceptions
ILLSLOT

Note
Since this is a delayed branch instruction, the delay slot is executed before
branching. An ILLSLOT exception is raised if this instruction is executed in a delay
slot.

If the branch target address is invalid, the IADDERR trap is not delivered until
after the instruction in the delay slot has executed and the PC has advanced to the
target address. That is, the exception is associated with the target instruction, not the
branch.
SETS

Description
This instruction sets the S-bit to 1.

Operation

SETS

\[
\begin{array}{c}
0000000001011000 \\
15 \ldots 0 \\
\end{array}
\]

\[
\begin{array}{l}
s \leftarrow 1; \\
S \leftarrow \text{Bit}(s);
\end{array}
\]
SETT

Description
This instruction sets the T-bit to 1.

Operation
SETT

```
0000000000011000

15  0
```

\[
\begin{array}{l}
t \leftarrow 1; \\
T \leftarrow \text{Bit}(t);
\end{array}
\]
SHAD Rm, Rn

Description

This instruction performs an arithmetic shift of Rn, with the dynamic shift direction and shift amount indicated by Rm, and places the result in Rn. If Rm is zero, no shift is performed. If Rm is greater than zero, this is a left shift and the shift amount is given by the least significant 5 bits of Rm. If Rm is less than zero, this is an arithmetic right shift and the shift amount is given by the least significant 5 bits of Rm subtracted from 32. In the case where Rm indicates an arithmetic right shift by 32, the result is filled with copies of the sign-bit of the original Rn.

Operation

SHAD Rm, Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-4</th><th>Bits 3-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>m</td><td>1100</td></tr>
</tbody>
</table>

op1 ← SignExtend32(Rm);
op2 ← SignExtend32(Rn);
shift_amount ← ZeroExtend5(op1);
IF (op1 ≥ 0)
op2 ← op2 << shift_amount;
ELSE IF (shift_amount ≠ 0)
op2 ← op2 >> (32 - shift_amount);
ELSE IF (op2 < 0)
op2 ← -1;
ELSE
op2 ← 0;
Rn ← Register(op2);
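
The case analysis above can be sketched in C as follows. This is a minimal model for illustration, not a definitive implementation: the function name `shad` is hypothetical, and the sketch assumes the usual two's complement wrap when converting an unsigned value back to `int32_t` (implementation-defined in ISO C). Shifts are done on unsigned values so that no undefined or implementation-defined shift behaviour is relied upon.

```c
#include <stdint.h>

/* Hypothetical model of SHAD Rm, Rn; returns the new value of Rn. */
static int32_t shad(int32_t rm, int32_t rn)
{
    uint32_t amount = (uint32_t)rm & 0x1Fu;      /* ZeroExtend5(op1) */

    if (rm >= 0)                                 /* left shift by 0..31 */
        return (int32_t)((uint32_t)rn << amount);

    if (amount != 0) {                           /* right shift by 1..31 */
        uint32_t n = 32u - amount;
        uint32_t u = (uint32_t)rn >> n;
        if (rn < 0)
            u |= ~(UINT32_MAX >> n);             /* replicate the sign bit */
        return (int32_t)u;
    }

    return (rn < 0) ? -1 : 0;                    /* right shift by 32 */
}
```

For example, under these assumptions `shad(-1, -8)` models an arithmetic right shift by one bit and yields -4, while `shad(-32, -5)` models the shift-by-32 case and yields -1.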

Note
SHAL Rn

Description

This instruction arithmetically shifts \( R_n \) to the left by one bit and places the result in \( R_n \). The bit that is shifted out of the operand is moved to the T-bit.

Operation

SHAL Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>00100000</td></tr>
</tbody>
</table>

\[
\text{op1} \leftarrow \text{SignExtend}_{32}(R_n); \\
\text{t} \leftarrow \text{op1}_{<31 \text{ FOR 1 >}}; \\
\text{op1} \leftarrow \text{op1} \ll 1; \\
R_n \leftarrow \text{Register}(\text{op1}); \\
T \leftarrow \text{Bit}(\text{t}); \\
\]

Note
SHAR Rn

Description
This instruction arithmetically shifts $R_n$ to the right by one bit and places the result in $R_n$. The bit that is shifted out of the operand is moved to the T-bit.

Operation

SHAR Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>00100001</td></tr>
</tbody>
</table>

\[
\begin{array}{l}
\text{op1} \leftarrow \text{SignExtend}_{32}(R_n); \\
t \leftarrow \text{op1}_{\langle 0 \text{ FOR } 1 \rangle}; \\
\text{op1} \leftarrow \text{op1} \gg 1; \\
R_n \leftarrow \text{Register}(\text{op1}); \\
T \leftarrow \text{Bit}(t);
\end{array}
\]
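
As a hedged illustration, the one-bit arithmetic right shift can be modelled in C without relying on the implementation-defined behaviour of `>>` on negative values. The function name `shar` and the pointer used for the T-bit are assumptions of this sketch, as is the two's complement wrap on the final conversion to `int32_t`.

```c
#include <stdint.h>

/* Hypothetical model of SHAR Rn; *t receives the bit shifted out. */
static int32_t shar(int32_t rn, unsigned *t)
{
    uint32_t u = (uint32_t)rn;
    *t = u & 1u;                               /* t <- op1<0 FOR 1>       */
    u = (u >> 1) | (u & 0x80000000u);          /* keep the sign bit fixed */
    return (int32_t)u;
}
```

Under these assumptions the result is the operand divided by two and rounded towards minus infinity, so -3 becomes -2 with the T-bit set.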

Note
SHLD Rm, Rn

Description

This instruction performs a logical shift of Rn, with the dynamic shift direction and shift amount indicated by Rm, and places the result in Rn. If Rm is zero, no shift is performed. If Rm is greater than zero, this is a left shift and the shift amount is given by the least significant 5 bits of Rm. If Rm is less than zero, this is a logical right shift and the shift amount is given by the least significant 5 bits of Rm subtracted from 32. In the case where Rm indicates a logical right shift by 32, the result is 0.

Operation

SHLD Rm, Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-4</th><th>Bits 3-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>m</td><td>1101</td></tr>
</tbody>
</table>

op1 ← SignExtend32(Rm);
op2 ← ZeroExtend32(Rn);
shift_amount ← ZeroExtend5(op1);

IF (op1 ≥ 0)
op2 ← op2 << shift_amount;
ELSE IF (shift_amount ≠ 0)
op2 ← op2 >> (32 - shift_amount);
ELSE
op2 ← 0;
Rn ← Register(op2);
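
A minimal C sketch of the logical dynamic shift follows. It is illustrative only: the function name `shld` is hypothetical, and the right-shift-by-32 case is handled separately because a C shift by the full operand width is undefined.

```c
#include <stdint.h>

/* Hypothetical model of SHLD Rm, Rn; returns the new value of Rn. */
static uint32_t shld(int32_t rm, uint32_t rn)
{
    uint32_t amount = (uint32_t)rm & 0x1Fu;   /* ZeroExtend5(op1) */

    if (rm >= 0)
        return rn << amount;                  /* left shift by 0..31   */
    if (amount != 0)
        return rn >> (32u - amount);          /* right shift by 1..31  */
    return 0;                                 /* right shift by 32     */
}
```

For example, `shld(-4, 0x80000000)` models a logical right shift by four bits and yields 0x08000000.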

Note
SHLL Rn

Description
This instruction performs a logical left shift of Rn by 1 bit and places the result in Rn. The bit that is shifted out is moved to the T-bit.

Operation

SHLL Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>00000000</td></tr>
</tbody>
</table>

\[
\begin{array}{l}
\text{op1} \leftarrow \text{ZeroExtend}_{32}(R_n); \\
t \leftarrow \text{op1}_{\langle 31 \text{ FOR } 1 \rangle}; \\
\text{op1} \leftarrow \text{op1} \ll 1; \\
R_n \leftarrow \text{Register(op1)}; \\
T \leftarrow \text{Bit}(t);
\end{array}
\]

Note
SHLL2 Rn

Description
This instruction performs a logical left shift of \( R_n \) by 2 bits and places the result in \( R_n \). The bits that are shifted out are discarded.

Operation

\[
\begin{array}{|c|c|c|}
\hline
0100 & n & 00001000 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 0 \\
\hline
\end{array}
\]

\[
\text{op1} \leftarrow \text{ZeroExtend}_{32}(R_n); \\
\text{op1} \leftarrow \text{op1} \ll 2; \\
R_n \leftarrow \text{Register(op1)};
\]

Note


SHLL8 Rn

Description
This instruction performs a logical left shift of Rn by 8 bits and places the result in Rn. The bits that are shifted out are discarded.

Operation

\[
\text{SHLL8 Rn}
\]

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>00011000</td></tr>
</tbody>
</table>

\[
\text{op1} \leftarrow \text{ZeroExtend32}(Rn);
\]
\[
\text{op1} \leftarrow \text{op1} \ll 8;
\]
\[
Rn \leftarrow \text{Register(op1)};
\]

Note
**SHLL16 Rn**

**Description**

This instruction performs a logical left shift of Rn by 16 bits and places the result in Rn. The bits that are shifted out are discarded.

**Operation**

\[
\text{SHLL16 Rn}
\]

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>00101000</td></tr>
</tbody>
</table>

\[
op1 \leftarrow \text{ZeroExtend}_{32}(Rn);
\]

\[
op1 \leftarrow \text{op1} \ll 16;
\]

\[
Rn \leftarrow \text{Register}(\text{op1});
\]

**Note**
SHLR Rn

Description
This instruction performs a logical right shift of \( R_n \) by 1 bit and places the result in \( R_n \). The bit that is shifted out is moved to the T-bit.

Operation

SHLR Rn

\[
\begin{array}{|c|c|c|}
\hline
0100 & n & 00000001 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 0 \\
\hline
\end{array}
\]

\[
\begin{array}{l}
\text{op1} \leftarrow \text{ZeroExtend}_{32}(R_n); \\
t \leftarrow \text{op1}_{\langle 0 \text{ FOR } 1 \rangle}; \\
\text{op1} \leftarrow \text{op1} \gg 1; \\
R_n \leftarrow \text{Register(op1)}; \\
T \leftarrow \text{Bit}(t);
\end{array}
\]

Note
SHLR2 Rn

Description

This instruction performs a logical right shift of \( R_n \) by 2 bits and places the result in \( R_n \). The bits that are shifted out are discarded.

Operation

\[
\begin{array}{|c|c|c|}
\hline
0100 & n & 00001001 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 0 \\
\hline
\end{array}
\]

\[
op1 \leftarrow \text{ZeroExtend}_{32}(R_n);
\]
\[
op1 \leftarrow \text{op1 >> 2};
\]
\[
R_n \leftarrow \text{Register(op1)};
\]

Note
SHLR8 Rn

Description
This instruction performs a logical right shift of \( R_n \) by 8 bits and places the result in \( R_n \). The bits that are shifted out are discarded.

Operation

SHLR8 Rn

\[
\begin{array}{|c|c|c|}
\hline
0100 & n & 00011001 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 0 \\
\hline
\end{array}
\]

\[
\begin{array}{l}
op1 \leftarrow \text{ZeroExtend}_{32}(R_n); \\
op1 \leftarrow op1 \gg 8; \\
R_n \leftarrow \text{Register}(op1);
\end{array}
\]

Note
SHLR16 Rn

Description
This instruction performs a logical right shift of \( R_n \) by 16 bits and places the result in \( R_n \). The bits that are shifted out are discarded.

Operation

\[
\text{SHLR16 Rn}
\]

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>00101001</td></tr>
</tbody>
</table>

\[
op1 \leftarrow \text{ZeroExtend}_{32}(R_n);
\]
\[
op1 \leftarrow \text{op1} \gg 16;
\]
\[
R_n \leftarrow \text{Register(op1)};
\]

Note
SLEEP

Description

This instruction places the CPU in the power-down state.

In power-down mode, the CPU retains its internal state, but immediately stops executing instructions and waits for an interrupt request. The PC at the point of sleep is the address of the instruction immediately following the SLEEP instruction. This property ensures that when the CPU receives an interrupt request, and exits the power-down state, the SPC will contain the address of the instruction following the SLEEP.

SLEEP is a privileged instruction, and can only be used in privileged mode. Use of this instruction in user mode will cause a RESINST exception.

Operation

SLEEP

\[
\begin{array}{c}
0000000000011011 \\
15 \ldots 0 \\
\end{array}
\]

md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
SLEEP()

Exceptions

RESINST

Note

The effect of SLEEP on the rest of the system depends on the system architecture. Refer to the system architecture manual of the appropriate product for further details.
STC SR, Rn

Description
This instruction copies SR to Rn. It is a privileged instruction.

Operation
STC SR, Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0000</td><td>n</td><td>00000010</td></tr>
</tbody>
</table>

md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
sr ← SignExtend32(SR);
op1 ← sr;
Rn ← Register(op1);

Exceptions
RESINST

Note
STC VBR, Rn

Description
This instruction copies VBR to Rn. It is a privileged instruction.

Operation

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0000</td><td>n</td><td>00100010</td></tr>
</tbody>
</table>

\[
\begin{array}{l}
\text{md} \leftarrow \text{ZeroExtend}_1(\text{MD}); \\
\text{IF } (\text{md} = 0) \\
\quad \text{THROW RESINST}; \\
\text{vbr} \leftarrow \text{SignExtend}_{32}(\text{VBR}); \\
\text{op1} \leftarrow \text{vbr}; \\
R_n \leftarrow \text{Register(op1)};
\end{array}
\]

Exceptions
RESINST

Note
STC SSR, Rn

Description
This instruction copies SSR to Rn. It is a privileged instruction.

Operation

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0000</td><td>n</td><td>00110010</td></tr>
</tbody>
</table>

\[
\begin{array}{l}
\text{md} \leftarrow \text{ZeroExtend}_1(\text{MD}); \\
\text{IF } (\text{md} = 0) \\
\quad \text{THROW RESINST}; \\
\text{ssr} \leftarrow \text{SignExtend}_{32}(\text{SSR}); \\
\text{op1} \leftarrow \text{ssr}; \\
R_n \leftarrow \text{Register(op1)};
\end{array}
\]

Exceptions
RESINST

Note
STC SPC, Rn

**Description**

This instruction copies SPC to Rn. It is a privileged instruction.

**Operation**

\[
\text{STC SPC, R}_n
\]

<table>
<thead>
<tr>
<th></th>
<th>0000</th>
<th>n</th>
<th>01000010</th>
</tr>
</thead>
<tbody>
<tr>
<td>15</td>
<td>12</td>
<td>11</td>
<td>8</td>
</tr>
</tbody>
</table>

\[
\text{md} \leftarrow \text{ZeroExtend}_1(\text{MD}); \\
\text{IF} (\text{md} = 0) \\
\quad \text{THROW RESINST}; \\
\text{spc} \leftarrow \text{SignExtend}_{32}(\text{SPC}); \\
\text{op1} \leftarrow \text{spc} \\
\text{R}_n \leftarrow \text{Register(op1)};
\]

**Exceptions**

RESINST

**Note**
STC SGR, Rn

**Description**
This instruction copies SGR to \( R_n \). It is a privileged instruction.

**Operation**

\[
\text{STC SGR, } R_n
\]

<table>
<thead>
<tr>
<th>15</th>
<th>12</th>
<th>11</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0000</td>
<td>n</td>
<td></td>
<td></td>
<td></td>
<td>00111010</td>
</tr>
</tbody>
</table>

\[
\text{md} \leftarrow \text{ZeroExtend}_1(\text{MD}); \\
\text{IF } (\text{md} = 0) \\
\quad \text{THROW RESINST}; \\
\text{sgr} \leftarrow \text{SignExtend}_{32}(\text{SGR}); \\
\text{op1} \leftarrow \text{sgr} \\
R_n \leftarrow \text{Register(op1)};
\]

**Exceptions**
RESINST

**Note**
STC DBR, Rn

Description
This instruction copies DBR to Rn. It is a privileged instruction.

Operation

STC DBR, Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0000</td><td>n</td><td>11111010</td></tr>
</tbody>
</table>

\[
\begin{array}{l}
\text{md} \leftarrow \text{ZeroExtend}_1(\text{MD}); \\
\text{IF } (\text{md} = 0) \\
\quad \text{THROW RESINST}; \\
\text{dbr} \leftarrow \text{SignExtend}_{32}(\text{DBR}); \\
\text{op1} \leftarrow \text{dbr}; \\
R_n \leftarrow \text{Register(op1)};
\end{array}
\]

Exceptions
RESINST

Note
STC Rm_BANK, Rn

Description
This instruction copies Rm_BANK to Rn. It is a privileged instruction.

Operation

STC Rm_BANK, Rn

```
 0000  | n    | 1 | m   | 0010
 15 12 | 11 8 | 7 | 6 4 | 3 0
```

```
md ← ZeroExtend1(MD);
IF (md = 0)
   THROW RESINST;
op1 ← SignExtend32(Rm_BANK);
op2 ← op1;
Rn ← Register(op2);
```

Exceptions
RESINST

Note
STC.L SR, @-Rn

Description

This instruction stores SR to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 4 to give the effective address. The 32-bit value of SR is written to the effective address. This is a privileged instruction.

Operation

STC.L SR, @-Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>00000011</td></tr>
</tbody>
</table>

md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
sr ← SignExtend32(SR);
op1 ← SignExtend32(Rn);
address ← ZeroExtend32(op1 - 4);
WriteMemory32(address, sr);
op1 ← address;
Rn ← Register(op1);

Exceptions

RESINST, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
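
The pre-decrement addressing and its wrap-around can be illustrated with a small C model. This is a sketch under stated assumptions, not the hardware behaviour in full: `ToyMemory` and `push32` are hypothetical names, the 64-byte memory is mapped by taking the word address modulo 16, the address is assumed to be 4-byte aligned, and no privilege or MMU exceptions are modelled.

```c
#include <stdint.h>

/* Hypothetical 64-byte memory, addressed as sixteen 32-bit words. */
typedef struct { uint32_t words[16]; } ToyMemory;

/* Models "@-Rn": decrement first, store, then write the address back.
 * Returns the new value of Rn. */
static uint32_t push32(ToyMemory *mem, uint32_t rn, uint32_t value)
{
    uint32_t address = rn - 4u;                  /* wraps modulo 2^32, like
                                                    ZeroExtend32(op1 - 4)   */
    mem->words[(address / 4u) % 16u] = value;    /* toy address mapping     */
    return address;                              /* Rn <- Register(op1)     */
}
```

With Rn equal to 0, the computed address is 0xFFFFFFFC rather than a negative value, which is the wrap-around behaviour referred to in the note above.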
STC.L VBR, @-Rn

Description
This instruction stores VBR to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 4 to give the effective address. The 32-bit value of VBR is written to the effective address. This is a privileged instruction.

Operation

STC.L VBR, @-Rn

<table>
<thead>
<tr>
<th></th>
<th>15</th>
<th>12</th>
<th>11</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>0100</td>
<td>n</td>
<td></td>
<td></td>
<td></td>
<td>00100011</td>
</tr>
</tbody>
</table>

md ← ZeroExtend1(MD);
IF (md = 0)
   THROW RESINST;
vbr ← SignExtend32(VBR);
op1 ← SignExtend32(Rn);
address ← ZeroExtend32(op1 - 4);
WriteMemory32(address, vbr);
op1 ← address;
Rn ← Register(op1);

Exceptions
RESINST, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
STC.L SSR, @-Rn

Description

This instruction stores SSR to memory using register indirect with pre-decrement addressing. \( R_n \) is pre-decremented by 4 to give the effective address. The 32-bit value of SSR is written to the effective address. This is a privileged instruction.

Operation

STC.L SSR, @-Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>00110011</td></tr>
</tbody>
</table>

```plaintext
md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
ssr ← SignExtend32(SSR);
op1 ← SignExtend32(Rn);
address ← ZeroExtend32(op1 - 4);
WriteMemory32(address, ssr);
op1 ← address;
Rn ← Register(op1);
```

Exceptions

RESINST, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
STC.L SPC, @-Rn

Description

This instruction stores SPC to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 4 to give the effective address. The 32-bit value of SPC is written to the effective address. This is a privileged instruction.

Operation

STC.L SPC, @-Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>01000011</td></tr>
</tbody>
</table>

```
md ← ZeroExtend1(MD);
IF (md = 0)
   THROW RESINST;
spc ← SignExtend32(SPC);
op1 ← SignExtend32(Rn);
address ← ZeroExtend32(op1 - 4);
WriteMemory32(address, spc);
op1 ← address;
Rn ← Register(op1);
```

Exceptions

RESINST, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
STC.L SGR, @-Rn

Description
This instruction stores SGR to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 4 to give the effective address. The 32-bit value of SGR is written to the effective address. This is a privileged instruction.

Operation

STC.L SGR, @-Rn

\[
\begin{array}{|c|c|c|}
\hline
0100 & n & 00110010 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 0 \\
\hline
\end{array}
\]

md ← ZeroExtend1(MD);
IF (md = 0)
    THROW RESINST;
sgr ← SignExtend32(SGR);
op1 ← SignExtend32(Rn);
address ← ZeroExtend32(op1 - 4);
WriteMemory32(address, sgr);
op1 ← address;
Rn ← Register(op1);

Exceptions
RESINST, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
STC.L DBR, @-Rn

Description
This instruction stores DBR to memory using register indirect with pre-decrement addressing. \( R_n \) is pre-decremented by 4 to give the effective address. The 32-bit value of DBR is written to the effective address. This is a privileged instruction.

Operation

\[
\begin{array}{|c|c|c|}
\hline
0100 & n & 11110010 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 0 \\
\hline
\end{array}
\]

\[
\begin{array}{l}
\text{md} \leftarrow \text{ZeroExtend}_1(\text{MD}); \\
\text{IF } (\text{md} = 0) \\
\quad \text{THROW RESINST}; \\
\text{dbr} \leftarrow \text{SignExtend}_{32}(\text{DBR}); \\
\text{op1} \leftarrow \text{SignExtend}_{32}(R_n); \\
\text{address} \leftarrow \text{ZeroExtend}_{32}(\text{op1} - 4); \\
\text{WriteMemory}_{32}(\text{address}, \text{dbr}); \\
\text{op1} \leftarrow \text{address}; \\
R_n \leftarrow \text{Register}(\text{op1});
\end{array}
\]

Exceptions
RESINST, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
**STC.L Rm_BANK, @-Rn**

**Description**

This instruction stores Rm_BANK to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 4 to give the effective address. The 32-bit value of Rm_BANK is written to the effective address. This is a privileged instruction.

**Operation**

STC.L Rm_BANK, @-Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bit 7</th><th>Bits 6-4</th><th>Bits 3-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>1</td><td>m</td><td>0011</td></tr>
</tbody>
</table>

```plaintext
md ← ZeroExtend1(MD);
IF (md = 0)
  THROW RESINST;
op1 ← SignExtend32(Rm_BANK);
op2 ← SignExtend32(Rn);
address ← ZeroExtend32(op2 - 4);
WriteMemory32(address, op1);
op2 ← address;
Rn ← Register(op2);
```

**Exceptions**

RESINST, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

**Note**

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
**STC GBR, Rn**

**Description**

This instruction copies GBR to $R_n$.

**Operation**

<table>
<thead>
<tr>
<th>0000</th>
<th>n</th>
<th>00010010</th>
</tr>
</thead>
<tbody>
<tr>
<td>15</td>
<td>12</td>
<td>11</td>
</tr>
</tbody>
</table>

```
gbr ← SignExtend_32(GBR);
op1 ← gbr;
R_n ← Register(op1);
```

**Note**
STC.L GBR, @-Rn

Description
This instruction stores GBR to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 4 to give the effective address. The 32-bit value of GBR is written to the effective address.

Operation

\[
\begin{array}{|c|c|c|}
\hline
0100 & n & 00010011 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 0 \\
\hline
\end{array}
\]

\[
gbr \leftarrow \text{SignExtend}_{32}(\text{GBR}); \\
op1 \leftarrow \text{SignExtend}_{32}(R_n); \\
\text{address} \leftarrow \text{ZeroExtend}_{32}(op1 - 4); \\
\text{WriteMemory}_{32}(\text{address}, gbr); \\
op1 \leftarrow \text{address}; \\
R_n \leftarrow \text{Register}(op1);
\]

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
STS FPSCR, Rn

Description
This floating-point instruction copies FPSCR to Rn.

Operation

STS FPSCR, Rn

\[
\begin{array}{|c|c|c|}
\hline
0000 & n & 01101010 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 0 \\
\hline
\end{array}
\]

\[
\begin{array}{l}
sr \leftarrow \text{ZeroExtend}_{32}(SR); \\
fps \leftarrow \text{ZeroExtend}_{32}(\text{FPSCR}); \\
\text{IF (FpuIsDisabled}(sr) \text{ AND IsDelaySlot())} \\
\quad \text{THROW SLOTFPUDIS;} \\
\text{IF (FpuIsDisabled}(sr)) \\
\quad \text{THROW FPUDIS;} \\
op1 \leftarrow fps; \\
R_n \leftarrow \text{Register}(op1);
\end{array}
\]

Exceptions
SLOTFPUDIS, FPUDIS
STS.L FPSCR, @-Rn

Description

This floating-point instruction stores FPSCR to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 4 to give the effective address. The 32-bit value of FPSCR is written to the effective address.

Operation

STS.L FPSCR, @-Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>01100010</td></tr>
</tbody>
</table>

\[
\text{sr} \leftarrow \text{ZeroExtend}_{32}(\text{SR}); \\
\text{fps} \leftarrow \text{ZeroExtend}_{32}(\text{FPSCR}); \\
\text{op1} \leftarrow \text{SignExtend}_{32}(\text{Rn}); \\
\text{IF } (\text{FpuIsDisabled}(\text{sr}) \text{ AND IsDelaySlot()}) \\
\hspace{1cm} \text{THROW SLOTFPUDIS;} \\
\text{IF } (\text{FpuIsDisabled}(\text{sr})) \\
\hspace{1cm} \text{THROW FPUDIS;} \\
\text{value} \leftarrow \text{fps}; \\
\text{address} \leftarrow \text{ZeroExtend}_{32}(\text{op1} - 4); \\
\text{WriteMemory}_{32}(\text{address, value}); \\
\text{op1} \leftarrow \text{address}; \\
\text{Rn} \leftarrow \text{Register}(\text{op1});
\]

Exceptions

SLOTFPUDIS, FPUDIS, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
STS FPUL, Rn

Description
This floating-point instruction copies FPUL to Rn.

Operation

STS FPUL, Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0000</td><td>n</td><td>01011010</td></tr>
</tbody>
</table>

\[
\begin{array}{l}
sr \leftarrow \text{ZeroExtend}_{32}(SR); \\
fpul \leftarrow \text{SignExtend}_{32}(\text{FPUL}); \\
\text{IF (FpuIsDisabled}(sr) \text{ AND IsDelaySlot())} \\
\quad \text{THROW SLOTFPUDIS;} \\
\text{IF (FpuIsDisabled}(sr)) \\
\quad \text{THROW FPUDIS;} \\
op1 \leftarrow fpul; \\
R_n \leftarrow \text{Register}(op1);
\end{array}
\]

Exceptions
SLOTFPUDIS, FPUDIS
STS.L FPUL, @-Rn

Description
This floating-point instruction stores FPUL to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 4 to give the effective address. The 32-bit value of FPUL is written to the effective address.

Operation

STS.L FPUL, @-Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>01010010</td></tr>
</tbody>
</table>

\[
\begin{array}{l}
sr \leftarrow \text{ZeroExtend}_{32}(SR); \\
fpul \leftarrow \text{SignExtend}_{32}(\text{FPUL}); \\
op1 \leftarrow \text{SignExtend}_{32}(R_n); \\
\text{IF (FpuIsDisabled}(sr) \text{ AND IsDelaySlot())} \\
\quad \text{THROW SLOTFPUDIS;} \\
\text{IF (FpuIsDisabled}(sr)) \\
\quad \text{THROW FPUDIS;} \\
\text{address} \leftarrow \text{ZeroExtend}_{32}(op1 - 4); \\
\text{WriteMemory}_{32}(\text{address}, fpul); \\
op1 \leftarrow \text{address}; \\
R_n \leftarrow \text{Register}(op1);
\end{array}
\]

Exceptions
SLOTFPUDIS, FPUDIS, WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
STS MACH, Rn

Description
This instruction copies MACH to Rn.

Operation

```
  STS MACH, Rn

  mach ← SignExtend32(MACH);
  op1 ← mach;
  Rn ← Register(op1);
```
STS.L MACH, @-Rn

Description
This instruction stores MACH to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 4 to give the effective address. The 32-bit value of MACH is written to the effective address.

Operation

\[
\text{STS.L MACH, @-Rn}
\]

\[
\begin{array}{|c|c|c|}
\hline
0100 & n & 00000010 \\
\hline
15 \ldots 12 & 11 \ldots 8 & 7 \ldots 0 \\
\hline
\end{array}
\]

\[
mach \leftarrow \text{SignExtend}_{32}(\text{MACH});
\]
\[
op1 \leftarrow \text{SignExtend}_{32}(R_n);
\]
\[
\text{address} \leftarrow \text{ZeroExtend}_{32}(op1 - 4);
\]
\[
\text{WriteMemory}_{32}(\text{address}, \text{mach});
\]
\[
op1 \leftarrow \text{address};
\]
\[
R_n \leftarrow \text{Register}(op1);
\]

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
STS MACL, Rn

Description
This instruction copies MACL to Rn.

Operation

```
STS MACL, Rn

0000  n  00011010
 15  12  11  8  7  0

macl ← SignExtend32(MACL);
  op1 ← macl;
  Rn ← Register(op1);
```
STS.L MACL, @-Rn

Description
This instruction stores MACL to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 4 to give the effective address. The 32-bit value of MACL is written to the effective address.

Operation

STS.L MACL, @-Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0100</td><td>n</td><td>00010010</td></tr>
</tbody>
</table>

macl ← SignExtend32(MACL);
op1 ← SignExtend32(Rn);
address ← ZeroExtend32(op1 - 4);
WriteMemory32(address, macl);
op1 ← address;
Rn ← Register(op1);

Exceptions
WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

Note
The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
STS PR, Rn

Description
This instruction copies PR to Rn.

Operation

STS PR, Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-0</th></tr>
</thead>
<tbody>
<tr><td>0000</td><td>n</td><td>00101010</td></tr>
</tbody>
</table>

```
pr ← SignExtend32(PR);
op1 ← pr;
Rn ← Register(op1);
```

Note
**STS.L PR, @-Rn**

**Description**

This instruction stores PR to memory using register indirect with pre-decrement addressing. Rn is pre-decremented by 4 to give the effective address. The 32-bit value of PR is written to the effective address.

**Operation**

STS.L PR, @-Rn

<table>
<thead>
<tr>
<th></th>
<th>0100</th>
<th>n</th>
<th>00100010</th>
</tr>
</thead>
<tbody>
<tr>
<td>15</td>
<td>12</td>
<td>11</td>
<td>8</td>
</tr>
</tbody>
</table>

\[
pr \leftarrow \text{SignExtend}_{32}(PR');
\]

\[
op1 \leftarrow \text{SignExtend}_{32}(R_n);
\]

\[
\text{address} \leftarrow \text{ZeroExtend}_{32}(op1 - 4);
\]

\[
\text{WriteMemory}_{32}(\text{address, } pr);
\]

\[
op1 \leftarrow \text{address};
\]

\[
R_n \leftarrow \text{Register}(op1);
\]

**Exceptions**

WADDERR, WTLBMISS, WRITEPROT, FIRSTWRITE

**Note**

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.
SUB Rm, Rn

Description
This instruction subtracts $R_m$ from $R_n$ and places the result in $R_n$.

Operation

SUB Rm, Rn

<table>
<thead>
<tr><th>Bits 15-12</th><th>Bits 11-8</th><th>Bits 7-4</th><th>Bits 3-0</th></tr>
</thead>
<tbody>
<tr><td>0011</td><td>n</td><td>m</td><td>1000</td></tr>
</tbody>
</table>

```
op1 ← SignExtend32(Rm);

op2 ← SignExtend32(Rn);

op2 ← op2 - op1;

Rn ← Register(op2);
```

Note
SUBC Rm, Rn

Description
This instruction subtracts R_m and the T-bit from R_n and places the result in R_n. The borrow from the subtraction is placed in the T-bit.

Operation

SUBC Rm, Rn

<table>
<thead>
<tr>
<th>Bits 15-12</th>
<th>Bits 11-8</th>
<th>Bits 7-4</th>
<th>Bits 3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0011</td>
<td>n</td>
<td>m</td>
<td>1010</td>
</tr>
</tbody>
</table>

\[
\begin{align*}
t & \leftarrow \text{ZeroExtend}_{1}(T); \\
op1 & \leftarrow \text{ZeroExtend}_{32}(\text{SignExtend}_{32}(R_m)); \\
op2 & \leftarrow \text{ZeroExtend}_{32}(\text{SignExtend}_{32}(R_n)); \\
op2 & \leftarrow (op2 - op1) - t; \\
t & \leftarrow op2_{\langle 32 \text{ FOR } 1 \rangle}; \\
R_n & \leftarrow \text{Register}(op2); \\
T & \leftarrow \text{Bit}(t);
\end{align*}
\]
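
As a rough illustration of the borrow chain, the following Python sketch models SUBC on 32-bit values and uses it to build a 64-bit subtraction from two 32-bit halves. The helper names `subc` and `sub64` are hypothetical and exist only for this example.

```python
MASK32 = 0xFFFFFFFF

def subc(rn, rm, t):
    """Return (new_rn, new_t) for SUBC Rm, Rn on 32-bit register values."""
    op1 = rm & MASK32
    op2 = rn & MASK32
    diff = (op2 - op1) - (t & 1)   # negative when a borrow is needed
    new_t = (diff >> 32) & 1       # bit 32 of the two's-complement result
    return diff & MASK32, new_t

def sub64(a, b):
    """Subtract two 64-bit values using two chained SUBC steps."""
    lo, t = subc(a & MASK32, b & MASK32, 0)   # low words, T cleared first
    hi, t = subc(a >> 32, b >> 32, t)         # high words consume the borrow
    return (hi << 32) | lo, t

print(subc(0, 1, 0))             # (4294967295, 1): borrow out of bit 31
print(sub64(0x100000000, 1))     # (4294967295, 0)
```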

Note
SUBV Rm, Rn

Description

This instruction subtracts Rm from Rn and places the result in Rn. The T-bit is set to 1 if the subtraction result is outside the 32-bit signed range, otherwise the T-bit is set to 0.

Operation

SUBV Rm, Rn

<table>
<thead>
<tr>
<th>15</th>
<th>12</th>
<th>11</th>
<th>8</th>
<th>7</th>
<th>4</th>
<th>3</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0011</td>
<td>n</td>
<td>m</td>
<td>1011</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

\[
\begin{align*}
\text{op1} & \leftarrow \text{SignExtend}_{32}(R_m) \\
\text{op2} & \leftarrow \text{SignExtend}_{32}(R_n) \\
\text{op2} & \leftarrow \text{op2} - \text{op1} \\
t & \leftarrow \text{INT}((\text{op2} < -2^{31}) \text{ OR } (\text{op2} \geq 2^{31})) \\
R_n & \leftarrow \text{Register(op2)} \\
T & \leftarrow \text{Bit}(t)
\end{align*}
\]
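
The overflow test can be checked with a small Python model. This is a sketch under the assumption that register values are passed as unsigned 32-bit integers; `to_signed32` and `subv` are illustrative names, not part of the architecture definition.

```python
def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement signed value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def subv(rn, rm):
    """Return (new_rn, t) for SUBV Rm, Rn."""
    result = to_signed32(rn) - to_signed32(rm)   # exact, unbounded difference
    # T is set when the exact result falls outside the signed 32-bit range
    t = int(result < -(1 << 31) or result >= (1 << 31))
    return result & 0xFFFFFFFF, t

print(subv(5, 3))             # (2, 0)
print(subv(0x80000000, 1))    # (2147483647, 1): most negative value minus 1 overflows
```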

Note
SWAP.B Rm, Rn

Description

This instruction swaps the values of the lower 2 bytes in Rm and places the result in Rn. Bits [0,7] take the value of bits [8,15]. Bits [8,15] take the value of bits [0,7]. Bits [16,31] are unchanged.

Operation

SWAP.B Rm, Rn

<table>
<thead>
<tr>
<th>Bits 15-12</th>
<th>Bits 11-8</th>
<th>Bits 7-4</th>
<th>Bits 3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0110</td>
<td>n</td>
<td>m</td>
<td>1000</td>
</tr>
</tbody>
</table>

op1 ← ZeroExtend32(Rm);
op2 ← ((op1< 16 FOR 16 > << 16) ∨ (op1< 0 FOR 8 > << 8)) ∨ op1< 8 FOR 8 >;
Rn ← Register(op2);
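
A minimal Python sketch of the same bit-field manipulation follows. The helper `field(x, start, width)` is an assumed stand-in for the `x< start FOR width >` notation used above, and `swap_b` is a hypothetical name.

```python
def field(x, start, width):
    """Extract `width` bits of x starting at bit `start` (x< start FOR width >)."""
    return (x >> start) & ((1 << width) - 1)

def swap_b(rm):
    """Return the value SWAP.B Rm, Rn writes to Rn."""
    op1 = rm & 0xFFFFFFFF
    return ((field(op1, 16, 16) << 16)   # upper word unchanged
            | (field(op1, 0, 8) << 8)    # byte 0 moves to byte 1
            | field(op1, 8, 8))          # byte 1 moves to byte 0

print(hex(swap_b(0x12345678)))  # 0x12347856
```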

Note
**SWAP.W Rm, Rn**

**Description**
This instruction swaps the values of the 2 words in $R_m$ and places the result in $R_n$. Bits [0,15] take the value of bits [16,31]. Bits [16,31] take the value of bits [0,15].

**Operation**

SWAP.W Rm, Rn

<table>
<thead>
<tr>
<th>15</th>
<th>12</th>
<th>11</th>
<th>8</th>
<th>7</th>
<th>4</th>
<th>3</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0110</td>
<td>n</td>
<td>m</td>
<td>1001</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

\[
op1 \leftarrow \text{ZeroExtend}_{32}(R_m);
\]
\[
op2 \leftarrow (op1_{\langle 0 \text{ FOR } 16 \rangle} \ll 16) \lor op1_{\langle 16 \text{ FOR } 16 \rangle};
\]
\[
R_n \leftarrow \text{Register}(op2);
\]

**Note**
TAS.B @Rn

Description

This instruction performs a test-and-set operation on the byte data at the effective address specified in Rn. It begins by purging the operand cache block containing the accessed memory location. The 8 bits of data at the effective address are read from memory. If the read data is 0 the T-bit is set, otherwise the T-bit is cleared. The highest bit of the 8-bit data (bit 7) is set, and the result is written back to the memory at the same effective address.

This test-and-set is atomic from the CPU perspective. This instruction cannot be interrupted during its operation.

Operation

TAS.B @Rn

<table>
<thead>
<tr>
<th>Bits 15-12</th>
<th>Bits 11-8</th>
<th>Bits 7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0100</td>
<td>n</td>
<td>00011011</td>
</tr>
</tbody>
</table>

op1 ← SignExtend32(Rn);
address ← ZeroExtend32(op1);
OCBP(address);
value ← ZeroExtend8(ReadMemory8(address));
t ← INT (value = 0);
value ← value ∨ (1 << 7);
WriteMemory8(address, value);
T ← Bit(t);
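
The test-and-set semantics can be sketched as follows. This Python model is an assumption-laden illustration: `tas_b` and `try_acquire` are hypothetical names, memory is a plain `bytearray`, and neither the cache purge nor the atomicity of the real instruction is modeled.

```python
def tas_b(memory, address):
    """Model TAS.B @Rn on a bytearray; returns the new T-bit."""
    value = memory[address] & 0xFF
    t = int(value == 0)                  # T is set only if the byte was 0
    memory[address] = value | (1 << 7)   # bit 7 is always set on write-back
    return t

def try_acquire(memory, lock_address):
    """A lock is taken when the byte was previously 0 (T = 1)."""
    return tas_b(memory, lock_address) == 1

mem = bytearray(4)
print(try_acquire(mem, 2))  # True: the byte was 0, so the lock is taken
print(try_acquire(mem, 2))  # False: the byte is now 0x80
```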

Exceptions

WADDERR, WTLBMISS, READPROT, WRITEPROT, FIRSTWRITE

Note

The TAS.B instruction guarantees atomicity of access to all components of the core but not necessarily the entire address space. Refer to the system architecture manual of the appropriate product to determine the properties of individual targets in the address map.
TRAPA #imm

**Description**

This instruction causes a pre-execution trap. The value of the zero-extended 8-bit immediate i is used by the handler launch sequence to characterize the trap.

**Operation**

TRAPA #imm

<table>
<thead>
<tr>
<th>Bits 15-8</th>
<th>Bits 7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>11000011</td>
<td>i</td>
</tr>
</tbody>
</table>

\[
\text{imm} \leftarrow \text{ZeroExtend}_8(i);
\]

```
IF (IsDelaySlot())
    THROW ILLSLOT;
THROW TRAP, imm;
```

**Exceptions**

ILLSLOT, TRAP

**Note**

An ILLSLOT exception is raised if this instruction is executed in a delay slot.

The ‘#imm’ in the assembly syntax represents the immediate i after zero extension.
TST Rm, Rn

Description
This instruction performs a bitwise AND of \( R_m \) with \( R_n \). If the result is 0, the T-bit is set, otherwise the T-bit is cleared.

Operation
TST Rm, Rn

<table>
<thead>
<tr>
<th>Bits 15-12</th>
<th>Bits 11-8</th>
<th>Bits 7-4</th>
<th>Bits 3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0010</td>
<td>n</td>
<td>m</td>
<td>1000</td>
</tr>
</tbody>
</table>

\[
\begin{align*}
op1 & \leftarrow \text{SignExtend}_{32}(R_m); \\
op2 & \leftarrow \text{SignExtend}_{32}(R_n); \\
t & \leftarrow \text{INT}((op1 \land op2) = 0); \\
T & \leftarrow \text{Bit}(t);
\end{align*}
\]

Note
TST #imm, R0

Description
This instruction performs a bitwise AND of R0 with the zero-extended 8-bit immediate i. If the result is 0, the T-bit is set, otherwise the T-bit is cleared.

Operation

TST #imm, R0

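<table>
<thead>
<tr>
<th>Bits 15-8</th>
<th>Bits 7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>11001000</td>
<td>i</td>
</tr>
</tbody>
</table>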

```
r0 ← SignExtend32(R0);
imm ← ZeroExtend8(i);
t ← INT ((r0 AND imm) = 0);
T ← Bit(t);
```

Note
The ‘#imm’ in the assembly syntax represents the immediate i after zero extension.
TST.B #imm, @(R0, GBR)

Description

This instruction performs a bitwise test of an immediate constant with 8 bits of data held in memory. The effective address is calculated by adding R0 and GBR. The 8 bits of data at the effective address are read. A bitwise AND is performed of the read data with the zero-extended 8-bit immediate i. If the result is 0, the T-bit is set, otherwise the T-bit is cleared.

Operation

TST.B #imm, @(R0, GBR)

<table>
<thead>
<tr>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>11001100</td>
<td>i</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

\[
\begin{align*}
  & \text{r0} \leftarrow \text{SignExtend}_{32}(R0); \\
  & \text{gbr} \leftarrow \text{SignExtend}_{32}(GBR); \\
  & \text{imm} \leftarrow \text{ZeroExtend}_8(i); \\
  & \text{address} \leftarrow \text{ZeroExtend}_{32}(\text{r0} + \text{gbr}); \\
  & \text{value} \leftarrow \text{ZeroExtend}_8(\text{ReadMemory}_8(\text{address})); \\
  & t \leftarrow \text{INT}((\text{value} \land \text{imm}) = 0); \\
  & T \leftarrow \text{Bit}(t);
\end{align*}
\]

Exceptions

RADDERR, RTLBMISS, READPROT

Note

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The ‘#imm’ in the assembly syntax represents the immediate i after zero extension.
## XOR Rm, Rn

### Description
This instruction performs a bitwise XOR of Rm with Rn and places the result in Rn.

### Operation

<table>
<thead>
<tr>
<th>Bits 15-12</th>
<th>Bits 11-8</th>
<th>Bits 7-4</th>
<th>Bits 3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0010</td>
<td>n</td>
<td>m</td>
<td>1010</td>
</tr>
</tbody>
</table>

\[
\text{op1} \leftarrow \text{ZeroExtend}_{32}(R_m);
\text{op2} \leftarrow \text{ZeroExtend}_{32}(R_n);
\text{op2} \leftarrow \text{op2} \oplus \text{op1};
R_n \leftarrow \text{Register(op2)};
\]

### Note
XOR #imm, R0

Description
This instruction performs a bitwise XOR of R0 with the zero-extended 8-bit immediate i and places the result in R0.

Operation
XOR #imm, R0

<table>
<thead>
<tr>
<th>Bits 15-8</th>
<th>Bits 7-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>11001010</td>
<td>i</td>
</tr>
</tbody>
</table>

r0 ← ZeroExtend32(R0);
imm ← ZeroExtend8(i);
r0 ← r0 ⊕ imm;
R0 ← Register(r0);

Note
The ‘#imm’ in the assembly syntax represents the immediate i after zero extension.
**XOR.B #imm, @(R0, GBR)**

**Description**

This instruction performs a bitwise XOR of an immediate constant with 8 bits of data held in memory. The effective address is calculated by adding R0 and GBR. The 8 bits of data at the effective address are read. A bitwise XOR is performed of the read data with the zero-extended 8-bit immediate i. The result is written back to the 8 bits of data at the same effective address.

**Operation**

XOR.B #imm, @(R0, GBR)

<table>
<thead>
<tr>
<th>15</th>
<th>8</th>
<th>7</th>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>11001110</td>
<td>i</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

\[
r0 \leftarrow \text{SignExtend}_{32}(R0);
gbr \leftarrow \text{SignExtend}_{32}(GBR);
imm \leftarrow \text{ZeroExtend}_8(i);
address \leftarrow \text{ZeroExtend}_{32}(r0 + gbr);
value \leftarrow \text{ZeroExtend}_8(\text{ReadMemory}_8(\text{address}));
value \leftarrow value \oplus imm;
\text{WriteMemory}_8(\text{address}, value);
\]
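
The read-modify-write sequence, including the address wrap-around, can be sketched in Python. The function name `xor_b` and the dict-based byte memory are assumptions for illustration; exceptions are not modeled.

```python
MASK32 = 0xFFFFFFFF

def xor_b(memory, r0, gbr, imm):
    """Model XOR.B #imm, @(R0, GBR) on a dict of byte values."""
    address = (r0 + gbr) & MASK32            # ZeroExtend32(r0 + gbr) wraps modulo 2**32
    value = memory[address] & 0xFF           # read 8 bits
    memory[address] = value ^ (imm & 0xFF)   # XOR with the zero-extended immediate, write back
    return address

mem = {0x10: 0xF0}
addr = xor_b(mem, 0xFFFFFFF8, 0x18, 0x0F)
print(hex(addr), hex(mem[addr]))  # 0x10 0xff
```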

**Exceptions**

WADDERR, WTLBMISS, READPROT, WRITEPROT, FIRSTWRITE

**Note**

The effective address calculation is performed using 32-bit zero extension to cause wrap around if the address-space bounds are exceeded.

The ‘#imm’ in the assembly syntax represents the immediate i after zero extension.
XTRCT Rm, Rn

Description

This instruction extracts the lower 16-bit word from Rm and the upper 16-bit word from Rn, swaps their order, and places the result in Rn. Bits [0,15] of Rn take the value of bits [16,31] of the original Rn. Bits [16,31] of Rn take the value of bits [0,15] of Rm.

Operation

XTRCT Rm, Rn

<table>
<thead>
<tr>
<th>Bits 15-12</th>
<th>Bits 11-8</th>
<th>Bits 7-4</th>
<th>Bits 3-0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0010</td>
<td>n</td>
<td>m</td>
<td>1101</td>
</tr>
</tbody>
</table>

op1 ← ZeroExtend32(Rm);
op2 ← ZeroExtend32(Rn);
op2 ← op2<16 FOR 16> ∨ (op1<0 FOR 16> << 16);
Rn ← Register(op2);
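
For a concrete feel of the extraction, here is a small Python sketch. The name `xtrct` is hypothetical and register values are assumed to be unsigned 32-bit integers.

```python
def xtrct(rm, rn):
    """Return the value XTRCT Rm, Rn writes to Rn."""
    op1 = rm & 0xFFFFFFFF
    op2 = rn & 0xFFFFFFFF
    upper_of_rn = (op2 >> 16) & 0xFFFF   # op2< 16 FOR 16 >
    lower_of_rm = op1 & 0xFFFF           # op1< 0 FOR 16 >
    return upper_of_rn | (lower_of_rm << 16)

print(hex(xtrct(0x12345678, 0x9ABCDEF0)))  # 0x56789abc
```

In effect the result is the middle 32 bits of the 64-bit concatenation Rm:Rn.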

Note
Pipelining

The SH-4 CPU core is a dual-issue superscalar pipelined microprocessor. This section gives a high-level description of how this particular implementation of the SH-4 architecture executes instructions. Definitions in this section may not apply to SH-4 Series models other than the SH-4 CPU core.

10.1 Pipelines

Figure 33 shows the basic pipelines. Normally, a pipeline consists of five or six stages: instruction fetch (I), decode and register read (D), execution (EX/SX/F0/F1/F2/F3), data access (NA/MA), and write-back (S/FS). An instruction is executed as a combination of basic pipelines. Figure 34 to Figure 38 show the instruction execution patterns.
### Figure 33: Basic pipelines

| Pipeline | Stages | Functions listed in the figure, in order |
| --- | --- | --- |
| **General Pipeline** | I, D, EX, NA, S | Instruction fetch; instruction decode; issue; register read; operation; operation for PC-relative branch; non-memory data access; write-back |
| **General Load/Store Pipeline** | I, D, EX, MA, S | Instruction fetch; instruction decode; issue; register read; address calculation; memory data access; write-back |
| **Special Pipeline** | I, D, SX, NA, S | Instruction fetch; instruction decode; issue; register read; operation; non-memory data access; write-back |
| **Special Load/Store Pipeline** | I, D, SX, MA, S | Instruction fetch; instruction decode; issue; register read; address calculation; memory data access; write-back |
| **Floating-Point Pipeline** | I, D, F1, F2, FS | Instruction fetch; instruction decode; issue; register read; computation 1; computation 2; computation 3; write-back |
| **Floating-Point Extended Pipeline** | I, D, F0, F1, F2, FS | Instruction fetch; instruction decode; issue; register read; computation 0; computation 1; computation 2; computation 3; write-back |
| **FDIV/FSQRT Pipeline** | F3 | Computation: takes several cycles |

1. 1-step operation: 1 issue cycle
   EXT[SU].[BW], MOV, MOV#, MOVA, MOVT, SWAP.[BW], XTRCT, ADD*, CMP*,
   DIV*, DT, NEG*, SUB*, AND, AND#, NOT, OR, OR#, TST, TST#, XOR, XOR#,
   ROT*, SHA*, SHL*, BF*, BRA, NOP, CLRS, CLRT, SETS, SETT,
   LDS to FPUL, STS from FPUL/FPSCR, FLDI0, FLDI1, FMOV, FLDS, FSTS,
   single-/double-precision FABS/FNEG
   ![Instruction execution patterns](image)

2. Load/store: 1 issue cycle
   MOV.[BW], FMOV*@[d], LDS.L to FPUL, LDTLB, PREF, STS.L from FPUL/FPSCR
   ![Instruction execution patterns](image)

3. GBR-based load/store: 1 issue cycle
   MOV.[BW] @(d,GBR)
   ![Instruction execution patterns](image)

4. JMP, RTS, BRAF: 2 issue cycles
   ![Instruction execution patterns](image)

5. TST.B: 3 issue cycles
   ![Instruction execution patterns](image)

6. AND.B, OR.B, XOR.B: 4 issue cycles
   ![Instruction execution patterns](image)

7. TAS.B: 5 issue cycles
   ![Instruction execution patterns](image)

8. RTE: 5 issue cycles
   ![Instruction execution patterns](image)

9. SLEEP: 4 issue cycles
   ![Instruction execution patterns](image)
Figure 35: Instruction execution patterns (continued)
19. LDC.L to SR: 4 issue cycles
   I  D  EX  MA  S
   D  SX
   D  SX
   D  SX  SX

20. STC from DBR/GBR/Rp_BANK/SR/SSR/SPC/VBR: 2 issue cycles
   I  D  SX  NA  S
   D  SX  NA  S
   D  SX  NA  S

21. STC from SGR: 3 issue cycles
   I  D  SX  NA  S
   D  SX  NA  S
   D  SX  NA  S

22. STC.L from DBR/GBR/Rp_BANK/SR/SSR/SPC/VBR: 2 issue cycles
   I  D  SX  NA  S
   D  SX  MA  S

23. STC.L from SGR: 3 issue cycles
   I  D  SX  NA  S
   D  SX  NA  S
   D  SX  MA  S

24. LDS to PR, JSR, BSRF: 2 issue cycles
   I  D  EX  NA  S
   D  SX
   D  SX

25. LDS.L to PR: 2 issue cycles
   I  D  EX  MA  S
   D  SX
   D  SX

26. STS from PR: 2 issue cycles
   I  D  SX  NA  S
   D  SX  NA  S

27. STS.L from PR: 2 issue cycles
   I  D  SX  NA  S
   D  SX  MA  S

28. MACH/L definition: 1 issue cycle
    CLRMAC, LDS to MACH/L
    I  D  EX  NA  S
    F1  F1  F2  FS

29. LDS.L to MACH/L: 1 issue cycle
    I  D  EX  MA  S
    F1  F1  F2  FS

30. STS from MACH/L: 1 issue cycle
    I  D  EX  NA  S

Figure 36: Instruction execution patterns (continued)
31. STS.L from MACH/L: 1 issue cycle
   I  D  EX  MA  S

32. LDS to FPSCR: 1 issue cycle
   I  D  EX  NA  S

33. LDS.L to FPSCR: 1 issue cycle
   I  D  EX  MA  S

34. Fixed-point multiplication: 2 issue cycles
   DMULS.L, DMULU.L, MUL.L, MULS.W, MULU.W
   I  D  EX  NA  S
   I  D  EX  NA  S
   (CPU)
   (FPU)

35. MAC.W, MAC.L: 2 issue cycles
   I  D  EX  MA  S
   I  D  EX  MA  S
   (CPU)
   (FPU)

36. Single-precision floating-point computation: 1 issue cycle
   FCMP/EQ, FCMP/GT, FADD, FLOAT, FMAC, FMUL, FSUB, FTRC, FRCHG, FSCHG
   I  D  F1  F2  FS

37. Single-precision FDIV/SQRT: 1 issue cycle
   I  D  F1  F2  FS

38. Double-precision floating-point computation 1: 1 issue cycle
   FCNVDS, FCNVSD, FLOAT, FTRC
   I  D  F1  F2  FS
   d  F1  F2  FS

39. Double-precision floating-point computation 2: 1 issue cycle
   FADD, FMUL, FSUB
   I  D  F1  F2  FS
   d  F1  F2  FS
   d  F1  F2  FS
   d  F1  F2  FS
   d  F1  F2  FS

Figure 37: Instruction execution patterns (continued)
40. Double-precision FCMP: 2 issue cycles
   FCMP/EQ, FCMP/GT
   I  D  F1  F2  F3
   D  F1  F2  F3

41. Double-precision FDIV/SQRT: 1 issue cycle
   FDIV, FSQRT
   I  D  F1  F2  F3
   d  F1  F2  F3

42. FIPR: 1 issue cycle
   I  D  F0  F1  F2  F3
   d  F0  F1  F2  F3
   d  F0  F1  F2  F3
   d  F0  F1  F2  F3

43. FTRV: 1 issue cycle
   I  D  F0  F1  F2  F3
   d  F0  F1  F2  F3
   d  F0  F1  F2  F3

Notes:
- ?? : Cannot overlap a stage of the same kind, except when two instructions are executed in parallel.
- D : Locks D-stage
- d : Register read only
- ?? : Locks, but no operation is executed.
- f1 : Can overlap another f1, but not another F1.

Figure 38: Instruction execution patterns (continued)
10.2 Parallel executables

Instructions are categorized into six groups according to the internal function blocks used, as shown in Table 74. Table 75 shows the parallel-executable pairs of instructions in terms of groups. For example, ADD in the EX group and BRA in the BR group can be executed in parallel.

1. MT Group

<table>
<tbody>
<tr><td>CLRT</td><td>CMP/HI Rm,Rn</td><td>MOV Rm,Rn</td></tr>
<tr><td>CMP/EQ #imm,R0</td><td>CMP/HS Rm,Rn</td><td>NOP</td></tr>
<tr><td>CMP/EQ Rm,Rn</td><td>CMP/PL Rn</td><td>SETT</td></tr>
<tr><td>CMP/GE Rm,Rn</td><td>CMP/PZ Rn</td><td>TST #imm,R0</td></tr>
<tr><td>CMP/GT Rm,Rn</td><td>CMP/STR Rm,Rn</td><td>TST Rm,Rn</td></tr>
</tbody>
</table>

2. EX Group

<table>
<tbody>
<tr><td>ADD #imm,Rn</td><td>MOVT Rn</td><td>SHLL2 Rn</td></tr>
<tr><td>ADD Rm,Rn</td><td>NEG Rm,Rn</td><td>SHLL8 Rn</td></tr>
<tr><td>ADDC Rm,Rn</td><td>NEGC Rm,Rn</td><td>SHLR Rn</td></tr>
<tr><td>ADDV Rm,Rn</td><td>NOT Rm,Rn</td><td>SHLR16 Rn</td></tr>
<tr><td>AND #imm,R0</td><td>OR #imm,R0</td><td>SHLR2 Rn</td></tr>
<tr><td>AND Rm,Rn</td><td>OR Rm,Rn</td><td>SHLR8 Rn</td></tr>
<tr><td>DIV0S Rm,Rn</td><td>ROTCL Rn</td><td>SUB Rm,Rn</td></tr>
<tr><td>DIV0U</td><td>ROTCR Rn</td><td>SUBC Rm,Rn</td></tr>
<tr><td>DIV1 Rm,Rn</td><td>ROTL Rn</td><td>SUBV Rm,Rn</td></tr>
<tr><td>DT Rn</td><td>ROTR Rn</td><td>SWAP.B Rm,Rn</td></tr>
<tr><td>EXTS.B Rm,Rn</td><td>SHAD Rm,Rn</td><td>SWAP.W Rm,Rn</td></tr>
<tr><td>EXTS.W Rm,Rn</td><td>SHAL Rn</td><td>XOR #imm,R0</td></tr>
<tr><td>EXTU.B Rm,Rn</td><td>SHAR Rn</td><td>XOR Rm,Rn</td></tr>
<tr><td>EXTU.W Rm,Rn</td><td>SHLD Rm,Rn</td><td>XTRCT Rm,Rn</td></tr>
</tbody>
</table>

Table 74: Instruction groups

<table>
<tbody>
<tr><td>MOV #imm,Rn</td><td>SHLL Rn</td><td></td></tr>
<tr><td>MOVA @(disp,PC),R0</td><td>SHLL16 Rn</td><td></td></tr>
</tbody>
</table>

#### 3. BR Group

<table>
<tbody>
<tr><td>BF disp</td><td>BRA disp</td><td>BT disp</td></tr>
<tr><td>BF/S disp</td><td>BSR disp</td><td>BT/S disp</td></tr>
</tbody>
</table>

#### 4. LS Group

<table>
<thead>
<tr>
<th>FABS</th>
<th>DRn</th>
<th>FMOV.S</th>
<th>@Rm+, FRn</th>
<th>MOV.L</th>
<th>R0, @(disp, GBR)</th>
</tr>
</thead>
<tbody>
<tr>
<td>FABS</td>
<td>FRn</td>
<td>FMOV.S</td>
<td>FRm, @(R0, Rn)</td>
<td>MOV.L</td>
<td>Rm, @(disp, Rn)</td>
</tr>
<tr>
<td>FLDI0</td>
<td>FRn</td>
<td>FMOV.S</td>
<td>FRm, @-Rn</td>
<td>MOV.L</td>
<td>Rm, @(R0, Rn)</td>
</tr>
<tr>
<td>FLDI1</td>
<td>FRn</td>
<td>FMOV.S</td>
<td>FRm, @Rn</td>
<td>MOV.L</td>
<td>Rm, @-Rn</td>
</tr>
<tr>
<td>FLDS</td>
<td>FRm, FPUL</td>
<td>FNEG</td>
<td>DRn</td>
<td>MOV.L</td>
<td>Rm, @Rn</td>
</tr>
<tr>
<td>FMOV</td>
<td>@(R0, Rm), DRn</td>
<td>FNEG</td>
<td>FRn</td>
<td>MOV.W</td>
<td>@(disp, GBR), R0</td>
</tr>
<tr>
<td>FMOV</td>
<td>@(R0, Rm), XDn</td>
<td>FSTS</td>
<td>FPUL, FRn</td>
<td>MOV.W</td>
<td>@(disp, PC), Rn</td>
</tr>
<tr>
<td>FMOV</td>
<td>@Rm, DRn</td>
<td>LDS</td>
<td>Rm, FPUL</td>
<td>MOV.W</td>
<td>@(disp, Rm), R0</td>
</tr>
<tr>
<td>FMOV</td>
<td>@Rm, XDn</td>
<td>MOV.B</td>
<td>@(disp, GBR), R0</td>
<td>MOV.W</td>
<td>@(R0, Rm), Rn</td>
</tr>
<tr>
<td>FMOV</td>
<td>@Rm+, DRn</td>
<td>MOV.B</td>
<td>@(disp, Rm), R0</td>
<td>MOV.W</td>
<td>@Rm, Rn</td>
</tr>
<tr>
<td>FMOV</td>
<td>@Rm+, XDn</td>
<td>MOV.B</td>
<td>@(R0, Rm), Rn</td>
<td>MOV.W</td>
<td>@Rm+, Rn</td>
</tr>
<tr>
<td>FMOV</td>
<td>DRm, @(R0, Rn)</td>
<td>MOV.B</td>
<td>@Rm, Rn</td>
<td>MOV.W</td>
<td>R0, @(disp, GBR)</td>
</tr>
<tr>
<td>FMOV</td>
<td>DRm, @-Rn</td>
<td>MOV.B</td>
<td>@Rm+, Rn</td>
<td>MOV.W</td>
<td>R0, @(disp, Rn)</td>
</tr>
<tr>
<td>FMOV</td>
<td>DRm, @Rn</td>
<td>MOV.B</td>
<td>R0, @(disp, GBR)</td>
<td>MOV.W</td>
<td>Rm, @(R0, Rn)</td>
</tr>
<tr>
<td>FMOV</td>
<td>DRm, DRn</td>
<td>MOV.B</td>
<td>R0, @(disp, Rn)</td>
<td>MOV.W</td>
<td>Rm, @-Rn</td>
</tr>
<tr>
<td>FMOV</td>
<td>DRm, XDn</td>
<td>MOV.B</td>
<td>Rm, @(R0, Rn)</td>
<td>MOV.W</td>
<td>Rm, @Rn</td>
</tr>
<tr>
<td>FMOV</td>
<td>FRm, FRn</td>
<td>MOV.B</td>
<td>Rm, @-Rn</td>
<td>MOVCA.L</td>
<td>R0, @Rn</td>
</tr>
<tr>
<td>FMOV</td>
<td>XDm, @(R0, Rn)</td>
<td>MOV.B</td>
<td>Rm, @Rn</td>
<td>OCBI</td>
<td>@Rn</td>
</tr>
<tr>
<td>FMOV</td>
<td>XDm, @-Rn</td>
<td>MOV.L</td>
<td>@(disp, GBR), R0</td>
<td>OCBP</td>
<td>@Rn</td>
</tr>
</tbody>
</table>

**Table 74: Instruction groups**

<table>
<tbody>
<tr><td>FMOV XDm,@Rn</td><td>MOV.L @(disp,PC),Rn</td><td>OCBWB @Rn</td></tr>
<tr><td>FMOV XDm,DRn</td><td>MOV.L @(disp,Rm),Rn</td><td>PREF @Rn</td></tr>
<tr><td>FMOV XDm,XDn</td><td>MOV.L @(R0,Rm),Rn</td><td>STS FPUL,Rn</td></tr>
<tr><td>FMOV.S @(R0,Rm),FRn</td><td>MOV.L @Rm,Rn</td><td></td></tr>
<tr><td>FMOV.S @Rm,FRn</td><td>MOV.L @Rm+,Rn</td><td></td></tr>
</tbody>
</table>

### 5. FE Group

<table>
<tbody>
<tr><td>FADD DRm,DRn</td><td>FIPR FVm,FVn</td><td>FSQRT DRn</td></tr>
<tr><td>FADD FRm,FRn</td><td>FLOAT FPUL,DRn</td><td>FSQRT FRn</td></tr>
<tr><td>FCMP/EQ FRm,FRn</td><td>FLOAT FPUL,FRn</td><td>FSUB DRm,DRn</td></tr>
<tr><td>FCMP/GT FRm,FRn</td><td>FMAC FR0,FRm,FRn</td><td>FSUB FRm,FRn</td></tr>
<tr><td>FCNVDS DRm,FPUL</td><td>FMUL DRm,DRn</td><td>FTRC DRm,FPUL</td></tr>
<tr><td>FCNVSD FPUL,DRn</td><td>FMUL FRm,FRn</td><td>FTRC FRm,FPUL</td></tr>
<tr><td>FDIV DRm,DRn</td><td>FRCHG</td><td>FTRV XMTRX,FVn</td></tr>
<tr><td>FDIV FRm,FRn</td><td>FSCHG</td><td></td></tr>
</tbody>
</table>

### 6. CO Group

<table>
<tbody>
<tr><td>AND.B #imm,@(R0,GBR)</td><td>LDS Rm,FPSCR</td><td>STC SR,Rn</td></tr>
<tr><td>BRAF Rm</td><td>LDS Rm,MACH</td><td>STC SSR,Rn</td></tr>
<tr><td>BSRF Rm</td><td>LDS Rm,MACL</td><td>STC VBR,Rn</td></tr>
<tr><td>CLRMAC</td><td>LDS Rm,PR</td><td>STC.L DBR,@-Rn</td></tr>
<tr><td>CLRS</td><td>LDS.L @Rm+,FPSCR</td><td>STC.L GBR,@-Rn</td></tr>
<tr><td>DMULS.L Rm,Rn</td><td>LDS.L @Rm+,FPUL</td><td>STC.L Rp_BANK,@-Rn</td></tr>
<tr><td>DMULU.L Rm,Rn</td><td>LDS.L @Rm+,MACH</td><td>STC.L SGR,@-Rn</td></tr>
<tr><td>FCMP/EQ DRm,DRn</td><td>LDS.L @Rm+,MACL</td><td>STC.L SPC,@-Rn</td></tr>
<tr><td>FCMP/GT DRm,DRn</td><td>LDS.L @Rm+,PR</td><td>STC.L SR,@-Rn</td></tr>
<tr><td>JMP @Rn</td><td>LDTLB</td><td>STC.L SSR,@-Rn</td></tr>
</tbody>
</table>

### Table 74: Instruction groups

<table>
<thead>
<tr>
<th>Instruction</th>
<th>Operands</th>
<th>Instruction</th>
<th>Instruction</th>
<th>Operands</th>
</tr>
</thead>
<tbody>
<tr>
<td>JSR</td>
<td>@Rn</td>
<td>MAC.L</td>
<td>STC.L</td>
<td>VBR.@-Rn</td>
</tr>
<tr>
<td>LDC</td>
<td>Rm, DBR</td>
<td>MAC.W</td>
<td>STS</td>
<td>FPSCR.Rn</td>
</tr>
<tr>
<td>LDC</td>
<td>Rm, GBR</td>
<td>MUL.L</td>
<td>STS</td>
<td>MACH.Rn</td>
</tr>
<tr>
<td>LDC</td>
<td>Rm, Rp_BANK</td>
<td>MULS.W</td>
<td>STS</td>
<td>MACL.Rn</td>
</tr>
<tr>
<td>LDC</td>
<td>Rm, SPC</td>
<td>MULU.W</td>
<td>STS</td>
<td>PR.Rn</td>
</tr>
<tr>
<td>LDC</td>
<td>Rm, SR</td>
<td>OR.B</td>
<td>STS.L</td>
<td>FPSCR.@-Rn</td>
</tr>
<tr>
<td>LDC</td>
<td>Rm, SSR</td>
<td>RTE</td>
<td>STS.L</td>
<td>FPUL.@-Rn</td>
</tr>
<tr>
<td>LDC</td>
<td>Rm, VBR</td>
<td>RTS</td>
<td>STS.L</td>
<td>MACH.@-Rn</td>
</tr>
<tr>
<td>LDC.L</td>
<td>@Rm+, DBR</td>
<td>SETS</td>
<td>STS.L</td>
<td>MACL.@-Rn</td>
</tr>
<tr>
<td>LDC.L</td>
<td>@Rm+, GBR</td>
<td>SLEEP</td>
<td>STS.L</td>
<td>PR.@-Rn</td>
</tr>
<tr>
<td>LDC.L</td>
<td>@Rm+, Rp_BANK</td>
<td>STC</td>
<td>STS.L</td>
<td>GBR.Rn</td>
</tr>
<tr>
<td>LDC.L</td>
<td>@Rm+, SPC</td>
<td>TRAPA</td>
<td>#imm</td>
<td></td>
</tr>
<tr>
<td>LDC.L</td>
<td>@Rm+, SR</td>
<td>STC</td>
<td>TST.B</td>
<td>#imm,@(R0, GBR)</td>
</tr>
<tr>
<td>LDC.L</td>
<td>@Rm+, SSR</td>
<td>STC</td>
<td>XOR.B</td>
<td>#imm,@(R0, GBR)</td>
</tr>
<tr>
<td>LDC.L</td>
<td>@Rm+, VBR</td>
<td>STC</td>
<td>SPC.Rn</td>
<td></td>
</tr>
</tbody>
</table>

10.3 Execution cycles and pipeline stalling

Instruction execution cycles are summarized in Table 76: Execution cycles on page 501. Penalty cycles due to a pipeline stall or freeze are not considered in this table.

- Issue rate: Interval between the issue of an instruction and that of the next instruction
- Latency: Interval between the issue of an instruction and the generation of its result (completion)
- Instruction execution pattern (see Figure 34 to Figure 38)
- Locked pipeline stages
- Interval between the issue of an instruction and the start of locking
- Lock time: Period of locking in machine cycle units

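<table>
<thead>
<tr><th>1st Instruction</th><th colspan="6">2nd Instruction</th></tr>
<tr><th></th><th>MT</th><th>EX</th><th>BR</th><th>LS</th><th>FE</th><th>CO</th></tr>
</thead>
<tbody>
<tr><td>MT</td><td>O</td><td>O</td><td>O</td><td>O</td><td>O</td><td>X</td></tr>
<tr><td>EX</td><td>O</td><td>X</td><td>O</td><td>O</td><td>O</td><td>X</td></tr>
<tr><td>BR</td><td>O</td><td>O</td><td>X</td><td>O</td><td>O</td><td>X</td></tr>
<tr><td>LS</td><td>O</td><td>O</td><td>O</td><td>X</td><td>O</td><td>X</td></tr>
<tr><td>FE</td><td>O</td><td>O</td><td>O</td><td>O</td><td>X</td><td>X</td></tr>
<tr><td>CO</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td><td>X</td></tr>
</tbody>
</table>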

Table 75: Parallel executables

O: Can be executed in parallel
X: Cannot be executed in parallel
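
As a rough sketch, the pairing rule can be written as a small Python predicate. It assumes the commonly documented SH-4 rule that a CO-group instruction never pairs, that two instructions of the same group pair only when both are in the MT group, and that any other combination pairs; check this assumption against the table for the product in use. The name `can_pair` is hypothetical.

```python
GROUPS = ("MT", "EX", "BR", "LS", "FE", "CO")

def can_pair(first, second):
    """Return True if the two instruction groups may issue in parallel.

    Assumed rule (verify against Table 75): CO never pairs; identical
    groups pair only for MT; all other combinations pair.
    """
    if first not in GROUPS or second not in GROUPS:
        raise ValueError("unknown instruction group")
    if "CO" in (first, second):
        return False
    if first == second:
        return first == "MT"
    return True

print(can_pair("EX", "BR"))  # True: e.g. ADD with BRA
print(can_pair("EX", "EX"))  # False: e.g. SHAD with ADD
```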

The instruction execution sequence is expressed as a combination of the execution patterns shown in Figure 34 to Figure 38. One instruction is separated from the next by the number of machine cycles given by its issue rate. Normally, the execution, data access, and write-back stages of one instruction cannot overlap the same stages of another instruction; the only exception is when two instructions are executed in parallel under the parallel-executability conditions. Refer to (a) through (d) in Figure 39 for some simple examples.

Latency is the interval between issue and completion of an instruction, and is also the interval between the execution of two instructions with an interdependent relationship. When there is interdependency between two instructions fetched simultaneously, the latter of the two is stalled for the following number of cycles:

- (Latency) cycles when there is flow dependency (read-after-write)
- (Latency - 1) or (latency - 2) cycles when there is output dependency (write-after-write):
  - (Latency - 1) cycles when the preceding instruction is a single- or double-precision FDIV or FSQRT
  - (Latency - 2) cycles when the preceding instruction is any other FE group instruction
- 5 or 2 cycles when there is anti-flow dependency (write-after-read), as in the following cases:
  - 5 cycles when the preceding instruction is FTRV
  - 2 cycles when the preceding instruction is a double-precision FADD, FSUB, or FMUL
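
The stall rules listed above can be condensed into a short Python sketch. The function name `stall_cycles`, its argument names, and the latency values in the usage lines are illustrative assumptions, not figures from the execution-cycle table; the exceptional latency adjustments described in the following paragraphs are not modeled.

```python
def stall_cycles(dependency, latency=0, preceding=""):
    """Stall of the second of two interdependent instructions fetched together.

    dependency: "flow", "output", or "anti"
    latency:    latency of the preceding instruction, in cycles
    preceding:  mnemonic of the preceding instruction (for "anti", FADD,
                FSUB and FMUL are assumed to be the double-precision forms)
    """
    if dependency == "flow":                 # read-after-write
        return latency
    if dependency == "output":               # write-after-write
        reduction = 1 if preceding in ("FDIV", "FSQRT") else 2
        return max(0, latency - reduction)   # clamping at 0 is an assumption
    if dependency == "anti":                 # write-after-read
        if preceding == "FTRV":
            return 5
        if preceding in ("FADD", "FSUB", "FMUL"):
            return 2
        raise ValueError("no anti-flow stall is listed for this instruction")
    raise ValueError("unknown dependency type")

print(stall_cycles("flow", latency=3))                        # 3
print(stall_cycles("output", latency=12, preceding="FDIV"))   # 11
print(stall_cycles("anti", preceding="FTRV"))                 # 5
```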

In the case of flow dependency, latency may be exceptionally increased or decreased, depending on the combination of sequential instructions (Figure 40 (e)).

- When a floating-point (FP) computation is followed by an FP register store, the latency of the FP computation may be decreased by 1 cycle.
- If there is a load of the shift amount immediately before an SHAD/SHLD instruction, the latency of the load is increased by 1 cycle.
- If an instruction with a latency of less than 2 cycles, including write-back to an FP register, is followed by a double-precision FP instruction, FIPR, or FTRV, the latency of the first instruction is increased to 2 cycles.

The number of cycles in a pipeline stall due to flow dependency will vary depending on the combination of interdependent instructions or the fetch timing (see Figure 40 (e)).

Output dependency occurs when the destination operands are the same in a preceding FE group instruction and a following LS group instruction.

For the stall cycles of an instruction with output dependency, the longest latency to the last write-back among all the destination operands must be applied instead of latency-2 (see Figure 41 (f)). A stall due to output dependency with respect to FPSCR, which reflects the result of an FP operation, never occurs. For example, when FADD follows FDIV with no dependency between FP registers, FADD is not stalled even if both instructions update the cause field of FPSCR.

Anti-flow dependency can occur only between a preceding double-precision FADD, FMUL, FSUB, or FTRV and a following FMOV, FLDI0, FLDI1, FABS, FNEG, or FSTS. See Figure 41 (g).

If an executing instruction locks any resource, i.e. a function block that performs a basic operation, a following instruction that attempts to use the locked resource must be stalled (Figure 42 (h)). This kind of stall can be avoided by inserting one or more instructions that are independent of the locked resource between the interfering instructions. For example, when a load instruction and an ADD instruction that references the loaded value are consecutive, the 2-cycle stall of the ADD is eliminated by inserting three instructions without dependency. Software performance can be improved by such instruction scheduling.

Other penalties arise in the event of exceptions or external data accesses, as follows.

- Instruction TLB miss: a penalty of 7 CPU clocks
- Instruction access to external memory (instruction cache miss, etc.)
- Data access to external memory (operand cache miss, etc.)
- Data access to a memory-mapped control register. The penalty differs from register to register, and depends on the kind of operation (read or write), the clock mode, and the bus use conditions when the access is made.

During the penalty cycles of an instruction TLB miss or external instruction access, no instruction is issued, but execution of instructions that have already been issued continues. The penalty for a data access is a pipeline freeze: that is, the execution of uncompleted instructions is interrupted until the arrival of the requested data. The number of penalty cycles for instruction and data accesses is largely dependent on the user’s memory subsystems.

**Figure 39: Examples of pipelined execution**

### (a) Serial execution: non-parallel-executable instructions

Instruction sequence: `SHAD R0,R1`; `ADD R2,R3`. (Pipeline-stage diagram not reproduced; it is annotated "1 cycle".)

**Remark:** EX-group `SHAD` and EX-group `ADD` cannot be executed in parallel. Therefore, `SHAD` is issued first, and the following `ADD` is recombined with the next instruction.

### (b) Parallel execution: parallel-executable and no dependency

Instruction sequence: `ADD R2,R1`; `MOV.L @R4,R5`. (Pipeline-stage diagram not reproduced; it is annotated "1 cycle".)

**Remark:** EX-group `ADD` and LS-group `MOV.L` can be executed in parallel. Overlapping of stages in the 2nd instruction is possible.

### (c) Issue rate: multi-step instruction

Instruction sequence: `AND.B #1,@(R0,GBR)`; `MOV R1,R2`. (Pipeline-stage diagram not reproduced; it is annotated "4 cycles".)

**Remark:** AND.B and MOV are fetched simultaneously, but MOV is stalled due to resource locking. After the lock is released, MOV is refetched together with the next instruction.

### (d) Branch

Instruction sequence: `BT/S L_far`; `ADD R0,R1`; `SUB R2,R3`. (Pipeline-stage diagram not reproduced; it is annotated "2-cycle latency for I-stage of branch destination".)

**Remark:** No stall occurs if the branch is not taken.

Instruction sequence: `BT/S L_far`; `ADD R0,R1`; branch destination `L_far`. (Pipeline-stage diagram not reproduced; it is annotated "1 cycle".)

**Remark:** If the branch is taken, the I-stage of the branch destination is stalled for the period of latency. This stall can be covered with a delay slot instruction which is not parallel-executable with the branch instruction.

Instruction sequence: `BT L_skip`; `ADD #1,R0`; label `L_skip:`. (Pipeline-stage diagram not reproduced; it is annotated "2-cycle latency for I-stage of branch destination".)

**Remark:** Even if the BT/BF branch is taken, the I-stage of the branch destination is not stalled if the displacement is zero.

The following instruction, ADD, is not stalled when executed after an instruction with zero-cycle latency, even if there is a dependency.

ADD and MOV.L are not executed in parallel, since MOV.L references the result of ADD as its destination address.

Because MOV.L and ADD are not fetched simultaneously in this example, ADD is stalled for only 1 cycle even though the latency of MOV.L is 2 cycles.

Due to the flow dependency between the load and the SHAD/SHLD shift amount, the latency of the load is increased to 3 cycles.

The latency of FLOAT is decreased by 1 cycle, only if followed by a lower FR store. This decrease does not apply to an upper FR store.

The latency of FMOV is decreased by 1 cycle, only if followed by a lower FR store.

**Figure 40: Examples of pipelined execution (continued)**

**Figure 41: Examples of pipelined execution (continued)**

**Figure 42: Examples of pipelined execution (continued)**

### Data transfer instructions

<table>
<thead>
<tr>
<th>No</th>
<th>Instruction</th>
<th>Instruction group</th>
<th>Issue rate</th>
<th>Latency</th>
<th>Execution pattern</th>
<th>Lock</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>EXTS.B</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>2</td>
<td>EXTS.W</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>3</td>
<td>EXTU.B</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>4</td>
<td>EXTU.W</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>5</td>
<td>MOV</td>
<td>MT</td>
<td>1</td>
<td>0</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>6</td>
<td>MOV</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>7</td>
<td>MOVA</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>8</td>
<td>MOV.W</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>9</td>
<td>MOV.L</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>10</td>
<td>MOV.B</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>11</td>
<td>MOV.W</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>12</td>
<td>MOV.L</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>13</td>
<td>MOV.B</td>
<td>LS</td>
<td>1</td>
<td>1/2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>14</td>
<td>MOV.W</td>
<td>LS</td>
<td>1</td>
<td>1/2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>15</td>
<td>MOV.L</td>
<td>LS</td>
<td>1</td>
<td>1/2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>16</td>
<td>MOV.B</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>17</td>
<td>MOV.W</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>18</td>
<td>MOV.L</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>19</td>
<td>MOV.B</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>20</td>
<td>MOV.W</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>21</td>
<td>MOV.L</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>22</td>
<td>MOV.B</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#3</td>
<td>-</td>
</tr>
<tr>
<td>23</td>
<td>MOV.W</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#3</td>
<td>-</td>
</tr>
</tbody>
</table>

**Table 76: Execution cycles**

<table>
<thead>
<tr>
<th>No</th>
<th>Instruction</th>
<th>Instruction group</th>
<th>Issue rate</th>
<th>Latency</th>
<th>Execution pattern</th>
<th>Lock stage</th>
<th>Lock start</th>
<th>Lock cycles</th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>24</td>
<td>MOV.L @(disp,GBR),R0</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#3</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>25</td>
<td>MOV.B Rm,@Rn</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>26</td>
<td>MOV.W Rm,@Rn</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>27</td>
<td>MOV.L Rm,@Rn</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>28</td>
<td>MOV.B Rm,@-Rn</td>
<td>LS</td>
<td>1</td>
<td>1/1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>29</td>
<td>MOV.W Rm,@-Rn</td>
<td>LS</td>
<td>1</td>
<td>1/1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>30</td>
<td>MOV.L Rm,@-Rn</td>
<td>LS</td>
<td>1</td>
<td>1/1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>31</td>
<td>MOV.B R0,@(disp,Rn)</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>32</td>
<td>MOV.W R0,@(disp,Rn)</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>33</td>
<td>MOV.L Rm,@(disp,Rn)</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>34</td>
<td>MOV.B Rm,@(R0,Rn)</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>35</td>
<td>MOV.W Rm,@(R0,Rn)</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>36</td>
<td>MOV.L Rm,@(R0,Rn)</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>37</td>
<td>MOV.B R0,@(disp,GBR)</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#3</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>38</td>
<td>MOV.W R0,@(disp,GBR)</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#3</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>39</td>
<td>MOV.L R0,@(disp,GBR)</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#3</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>40</td>
<td>MOVCA.L R0,@Rn</td>
<td>LS</td>
<td>1</td>
<td>3-7</td>
<td>#12</td>
<td>MA</td>
<td>4</td>
<td>3-7</td>
<td></td>
</tr>
<tr>
<td>41</td>
<td>MOVT Rn</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>42</td>
<td>OCBI @Rn</td>
<td>LS</td>
<td>1</td>
<td>1-2</td>
<td>#10</td>
<td>MA</td>
<td>4</td>
<td>1-2</td>
<td></td>
</tr>
<tr>
<td>43</td>
<td>OCBP @Rn</td>
<td>LS</td>
<td>1</td>
<td>1-5</td>
<td>#11</td>
<td>MA</td>
<td>4</td>
<td>1-5</td>
<td></td>
</tr>
<tr>
<td>44</td>
<td>OCBWB @Rn</td>
<td>LS</td>
<td>1</td>
<td>1-5</td>
<td>#11</td>
<td>MA</td>
<td>4</td>
<td>1-5</td>
<td></td>
</tr>
<tr>
<td>45</td>
<td>PREF @Rn</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
<tr>
<td>46</td>
<td>SWAP.B Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
<td></td>
</tr>
</tbody>
</table>


<table>
<thead>
<tr>
<th>Functional category</th>
<th>No</th>
<th>Instruction</th>
<th>Operands</th>
<th>Instruction group</th>
<th>Issue rate</th>
<th>Latency, execution pattern</th>
<th>Lock stage</th>
<th>Lock start, cycles</th>
</tr>
</thead>
<tbody>
<tr>
<td>Data transfer</td>
<td>47</td>
<td>SWAP.W</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>48</td>
<td>XTRCT</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>Fixed-point arithmetic</td>
<td>49</td>
<td>ADD</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>50</td>
<td>ADD</td>
<td>#imm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>51</td>
<td>ADDC</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>52</td>
<td>ADDV</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>53</td>
<td>CMP/EQ</td>
<td>#imm,R0</td>
<td>MT</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>54</td>
<td>CMP/EQ</td>
<td>Rm,Rn</td>
<td>MT</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>55</td>
<td>CMP/GE</td>
<td>Rm,Rn</td>
<td>MT</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>56</td>
<td>CMP/GT</td>
<td>Rm,Rn</td>
<td>MT</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>57</td>
<td>CMP/HI</td>
<td>Rm,Rn</td>
<td>MT</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>58</td>
<td>CMP/HS</td>
<td>Rm,Rn</td>
<td>MT</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>59</td>
<td>CMP/PL</td>
<td>Rn</td>
<td>MT</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>60</td>
<td>CMP/PZ</td>
<td>Rn</td>
<td>MT</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>61</td>
<td>CMP/STR</td>
<td>Rm,Rn</td>
<td>MT</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>62</td>
<td>DIV0S</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>63</td>
<td>DIV0U</td>
<td></td>
<td>EX</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>64</td>
<td>DIV1</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>65</td>
<td>DMULS.L</td>
<td>Rm,Rn</td>
<td>CO</td>
<td>2</td>
<td>4/4 #34</td>
<td>F1</td>
<td>4 2</td>
</tr>
<tr>
<td></td>
<td>66</td>
<td>DMULU.L</td>
<td>Rm,Rn</td>
<td>CO</td>
<td>2</td>
<td>4/4 #34</td>
<td>F1</td>
<td>4 2</td>
</tr>
<tr>
<td></td>
<td>67</td>
<td>DT</td>
<td>Rn</td>
<td>EX</td>
<td>1</td>
<td>1 #1</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>68</td>
<td>MAC.L</td>
<td>@Rm+,@Rn+</td>
<td>CO</td>
<td>2</td>
<td>2/2/4/4 #35</td>
<td>F1</td>
<td>4 2</td>
</tr>
<tr>
<td></td>
<td>69</td>
<td>MAC.W</td>
<td>@Rm+,@Rn+</td>
<td>CO</td>
<td>2</td>
<td>2/2/4/4 #35</td>
<td>F1</td>
<td>4 2</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Functional category / No and instruction</th>
<th>Operands</th>
<th>Instruction group</th>
<th>Issue rate</th>
<th>Latency</th>
<th>Execution pattern</th>
<th>Lock stage</th>
<th>Lock start</th>
<th>Lock cycles</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Fixed-point arithmetic instructions</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>70 MUL.L</td>
<td>Rm,Rn</td>
<td>CO</td>
<td>2</td>
<td>4/4</td>
<td>#34</td>
<td>F1</td>
<td>4</td>
<td>2</td>
</tr>
<tr>
<td>71 MULS.W</td>
<td>Rm,Rn</td>
<td>CO</td>
<td>2</td>
<td>4/4</td>
<td>#34</td>
<td>F1</td>
<td>4</td>
<td>2</td>
</tr>
<tr>
<td>72 MULU.W</td>
<td>Rm,Rn</td>
<td>CO</td>
<td>2</td>
<td>4/4</td>
<td>#34</td>
<td>F1</td>
<td>4</td>
<td>2</td>
</tr>
<tr>
<td>73 NEG</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>74 NEGC</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>75 SUB</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>76 SUBC</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>77 SUBV</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td><strong>Logical instructions</strong></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>78 AND</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>79 AND</td>
<td>#imm,R0</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>80 AND.B</td>
<td>#imm,@(R0,GBR)</td>
<td>CO</td>
<td>4</td>
<td>4</td>
<td>#6</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>81 NOT</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>82 OR</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>83 OR</td>
<td>#imm,R0</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>84 OR.B</td>
<td>#imm,@(R0,GBR)</td>
<td>CO</td>
<td>4</td>
<td>4</td>
<td>#6</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>85 TAS.B</td>
<td>@Rn</td>
<td>CO</td>
<td>5</td>
<td>5</td>
<td>#7</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>86 TST</td>
<td>Rm,Rn</td>
<td>MT</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>87 TST</td>
<td>#imm,R0</td>
<td>MT</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>88 TST.B</td>
<td>#imm,@(R0,GBR)</td>
<td>CO</td>
<td>3</td>
<td>3</td>
<td>#5</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>89 XOR</td>
<td>Rm,Rn</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>90 XOR</td>
<td>#imm,R0</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>91 XOR.B</td>
<td>#imm,@(R0,GBR)</td>
<td>CO</td>
<td>4</td>
<td>4</td>
<td>#6</td>
<td>-</td>
<td>-</td>
<td>-</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>Functional category</th>
<th>No</th>
<th>Instruction</th>
<th>Instruction group</th>
<th>Issue rate</th>
<th>Latency</th>
<th>Execution pattern</th>
<th>Lock</th>
</tr>
</thead>
<tbody>
<tr>
<td>Shift instructions</td>
<td>92</td>
<td>ROTL</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>93</td>
<td>ROTR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>94</td>
<td>ROTCL</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>95</td>
<td>ROTCR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>96</td>
<td>SHAR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>97</td>
<td>SHAR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>98</td>
<td>SHAR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>99</td>
<td>SHAR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>100</td>
<td>SHAR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>101</td>
<td>SHAR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>102</td>
<td>SHAR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>103</td>
<td>SHAR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>104</td>
<td>SHAR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>105</td>
<td>SHAR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>106</td>
<td>SHAR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>107</td>
<td>SHAR</td>
<td>EX</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>Branch instructions</td>
<td>108</td>
<td>BF disp</td>
<td>BR</td>
<td>1</td>
<td>2 (or 1)</td>
<td>#1</td>
<td></td>
</tr>
<tr>
<td></td>
<td>109</td>
<td>BF/S disp</td>
<td>BR</td>
<td>1</td>
<td>2 (or 1)</td>
<td>#1</td>
<td></td>
</tr>
<tr>
<td></td>
<td>110</td>
<td>BT disp</td>
<td>BR</td>
<td>1</td>
<td>2 (or 1)</td>
<td>#1</td>
<td></td>
</tr>
<tr>
<td></td>
<td>111</td>
<td>BT/S disp</td>
<td>BR</td>
<td>1</td>
<td>2 (or 1)</td>
<td>#1</td>
<td></td>
</tr>
<tr>
<td></td>
<td>112</td>
<td>BRA disp</td>
<td>BR</td>
<td>1</td>
<td>2</td>
<td>#1</td>
<td></td>
</tr>
<tr>
<td></td>
<td>113</td>
<td>BRAF Rn</td>
<td>CO</td>
<td>2</td>
<td>3</td>
<td>#4</td>
<td></td>
</tr>
<tr>
<td></td>
<td>114</td>
<td>BSR disp</td>
<td>BR</td>
<td>1</td>
<td>2</td>
<td>#14</td>
<td></td>
</tr>
</tbody>
</table>


<table>
<thead>
<tr>
<th>Functional category</th>
<th>No</th>
<th>Instruction</th>
<th>Instruction group</th>
<th>Issue rate</th>
<th>Latency</th>
<th>Execution pattern</th>
<th>Lock (stage, start, cycles)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Branch instructions</td>
<td>115</td>
<td>BSRF Rn</td>
<td>CO</td>
<td>2</td>
<td>3</td>
<td>#24</td>
<td>SX 3 2</td>
</tr>
<tr>
<td></td>
<td>116</td>
<td>JMP @Rn</td>
<td>CO</td>
<td>2</td>
<td>3</td>
<td>#4</td>
<td>- - -</td>
</tr>
<tr>
<td></td>
<td>117</td>
<td>JSR @Rn</td>
<td>CO</td>
<td>2</td>
<td>3</td>
<td>#24</td>
<td>SX 3 2</td>
</tr>
<tr>
<td></td>
<td>118</td>
<td>RTS</td>
<td>CO</td>
<td>2</td>
<td>3</td>
<td>#4</td>
<td>- - -</td>
</tr>
<tr>
<td>System control instructions</td>
<td>119</td>
<td>NOP</td>
<td>MT</td>
<td>1</td>
<td>0</td>
<td>#1</td>
<td>- - -</td>
</tr>
<tr>
<td></td>
<td>120</td>
<td>CLRMAC</td>
<td>CO</td>
<td>1</td>
<td>3</td>
<td>#28</td>
<td>F1 3 2</td>
</tr>
<tr>
<td></td>
<td>121</td>
<td>CLRS</td>
<td>CO</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>- - -</td>
</tr>
<tr>
<td></td>
<td>122</td>
<td>CLRT</td>
<td>MT</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>- - -</td>
</tr>
<tr>
<td></td>
<td>123</td>
<td>SETS</td>
<td>CO</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>- - -</td>
</tr>
<tr>
<td></td>
<td>124</td>
<td>SETT</td>
<td>MT</td>
<td>1</td>
<td>1</td>
<td>#1</td>
<td>- - -</td>
</tr>
<tr>
<td></td>
<td>125</td>
<td>TRAPA #imm</td>
<td>CO</td>
<td>7</td>
<td>7</td>
<td>#13</td>
<td>- - -</td>
</tr>
<tr>
<td></td>
<td>126</td>
<td>RTE</td>
<td>CO</td>
<td>5</td>
<td>5</td>
<td>#8</td>
<td>- - -</td>
</tr>
<tr>
<td></td>
<td>127</td>
<td>SLEEP</td>
<td>CO</td>
<td>4</td>
<td>4</td>
<td>#9</td>
<td>- - -</td>
</tr>
<tr>
<td></td>
<td>128</td>
<td>LDTLB</td>
<td>CO</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>- - -</td>
</tr>
<tr>
<td></td>
<td>129</td>
<td>LDC Rm,DBR</td>
<td>CO</td>
<td>1</td>
<td>3</td>
<td>#14</td>
<td>SX 3 2</td>
</tr>
<tr>
<td></td>
<td>130</td>
<td>LDC Rm,GBR</td>
<td>CO</td>
<td>3</td>
<td>3</td>
<td>#15</td>
<td>SX 3 2</td>
</tr>
<tr>
<td></td>
<td>131</td>
<td>LDC Rm,Rp_BANK</td>
<td>CO</td>
<td>1</td>
<td>3</td>
<td>#14</td>
<td>SX 3 2</td>
</tr>
<tr>
<td></td>
<td>132</td>
<td>LDC Rm,SR</td>
<td>CO</td>
<td>4</td>
<td>4</td>
<td>#16</td>
<td>SX 3 2</td>
</tr>
<tr>
<td></td>
<td>133</td>
<td>LDC Rm,SSR</td>
<td>CO</td>
<td>1</td>
<td>3</td>
<td>#14</td>
<td>SX 3 2</td>
</tr>
<tr>
<td></td>
<td>134</td>
<td>LDC Rm,SPC</td>
<td>CO</td>
<td>1</td>
<td>3</td>
<td>#14</td>
<td>SX 3 2</td>
</tr>
<tr>
<td></td>
<td>135</td>
<td>LDC Rm,VBR</td>
<td>CO</td>
<td>1</td>
<td>3</td>
<td>#14</td>
<td>SX 3 2</td>
</tr>
<tr>
<td></td>
<td>136</td>
<td>LDC.L @Rm+,DBR</td>
<td>CO</td>
<td>1</td>
<td>1/3</td>
<td>#17</td>
<td>SX 3 2</td>
</tr>
<tr>
<td></td>
<td>137</td>
<td>LDC.L @Rm+,GBR</td>
<td>CO</td>
<td>3</td>
<td>3/3</td>
<td>#18</td>
<td>SX 3 2</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>No</th>
<th>Instruction</th>
<th>Instruction group</th>
<th>Issue rate</th>
<th>Latency</th>
<th>Execution pattern</th>
<th>Lock stage</th>
<th>Lock start</th>
</tr>
</thead>
<tbody>
<tr>
<td>138</td>
<td>LDC.L @Rm+,Rp_BANK</td>
<td>CO</td>
<td>1</td>
<td>1/3</td>
<td>#17</td>
<td>SX</td>
<td>3</td>
</tr>
<tr>
<td>139</td>
<td>LDC.L @Rm+,SR</td>
<td>CO</td>
<td>4</td>
<td>4/4</td>
<td>#19</td>
<td>SX</td>
<td>3</td>
</tr>
<tr>
<td>140</td>
<td>LDC.L @Rm+,SSR</td>
<td>CO</td>
<td>1</td>
<td>1/3</td>
<td>#17</td>
<td>SX</td>
<td>3</td>
</tr>
<tr>
<td>141</td>
<td>LDC.L @Rm+,SPC</td>
<td>CO</td>
<td>1</td>
<td>1/3</td>
<td>#17</td>
<td>SX</td>
<td>3</td>
</tr>
<tr>
<td>142</td>
<td>LDC.L @Rm+,VBR</td>
<td>CO</td>
<td>1</td>
<td>1/3</td>
<td>#17</td>
<td>SX</td>
<td>3</td>
</tr>
<tr>
<td>143</td>
<td>LDS Rm,MACH</td>
<td>CO</td>
<td>1</td>
<td>3</td>
<td>#28</td>
<td>F1</td>
<td>3</td>
</tr>
<tr>
<td>144</td>
<td>LDS Rm,MACL</td>
<td>CO</td>
<td>1</td>
<td>3</td>
<td>#28</td>
<td>F1</td>
<td>3</td>
</tr>
<tr>
<td>145</td>
<td>LDS Rm,PR</td>
<td>CO</td>
<td>2</td>
<td>3</td>
<td>#24</td>
<td>SX</td>
<td>3</td>
</tr>
<tr>
<td>146</td>
<td>LDS.L @Rm+,MACH</td>
<td>CO</td>
<td>1</td>
<td>1/3</td>
<td>#29</td>
<td>F1</td>
<td>3</td>
</tr>
<tr>
<td>147</td>
<td>LDS.L @Rm+,MACL</td>
<td>CO</td>
<td>1</td>
<td>1/3</td>
<td>#29</td>
<td>F1</td>
<td>3</td>
</tr>
<tr>
<td>148</td>
<td>LDS.L @Rm+,PR</td>
<td>CO</td>
<td>2</td>
<td>2/3</td>
<td>#25</td>
<td>SX</td>
<td>3</td>
</tr>
<tr>
<td>149</td>
<td>STC DBR,Rn</td>
<td>CO</td>
<td>2</td>
<td>2</td>
<td>#20</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>150</td>
<td>STC SGR,Rn</td>
<td>CO</td>
<td>3</td>
<td>3</td>
<td>#21</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>151</td>
<td>STC GBR,Rn</td>
<td>CO</td>
<td>2</td>
<td>2</td>
<td>#20</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>152</td>
<td>STC Rp_BANK,Rn</td>
<td>CO</td>
<td>2</td>
<td>2</td>
<td>#20</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>153</td>
<td>STC SR,Rn</td>
<td>CO</td>
<td>2</td>
<td>2</td>
<td>#20</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>154</td>
<td>STC SSR,Rn</td>
<td>CO</td>
<td>2</td>
<td>2</td>
<td>#20</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>155</td>
<td>STC SPC,Rn</td>
<td>CO</td>
<td>2</td>
<td>2</td>
<td>#20</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>156</td>
<td>STC VBR,Rn</td>
<td>CO</td>
<td>2</td>
<td>2</td>
<td>#20</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>157</td>
<td>STC.L DBR,@-Rn</td>
<td>CO</td>
<td>2</td>
<td>2/2</td>
<td>#22</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>158</td>
<td>STC.L SGR,@-Rn</td>
<td>CO</td>
<td>3</td>
<td>3/3</td>
<td>#23</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>159</td>
<td>STC.L GBR,@-Rn</td>
<td>CO</td>
<td>2</td>
<td>2/2</td>
<td>#22</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td>160</td>
<td>STC.L Rp_BANK,@-Rn</td>
<td>CO</td>
<td>2</td>
<td>2/2</td>
<td>#22</td>
<td>-</td>
<td>-</td>
</tr>
</tbody>
</table>


<table>
<thead>
<tr>
<th>No</th>
<th>Instruction</th>
<th>Instruction group</th>
<th>Issue rate</th>
<th>Latency</th>
<th>Execution pattern</th>
<th>Lock</th>
</tr>
</thead>
<tbody>
<tr>
<td>161</td>
<td>STC.L SR, @-Rn</td>
<td>CO</td>
<td>2</td>
<td>2/2</td>
<td>#22</td>
<td>-</td>
</tr>
<tr>
<td>162</td>
<td>STC.L SSR, @-Rn</td>
<td>CO</td>
<td>2</td>
<td>2/2</td>
<td>#22</td>
<td>-</td>
</tr>
<tr>
<td>163</td>
<td>STC.L SPC, @-Rn</td>
<td>CO</td>
<td>2</td>
<td>2/2</td>
<td>#22</td>
<td>-</td>
</tr>
<tr>
<td>164</td>
<td>STC.L VBR, @-Rn</td>
<td>CO</td>
<td>2</td>
<td>2/2</td>
<td>#22</td>
<td>-</td>
</tr>
<tr>
<td>165</td>
<td>STS MACH, Rn</td>
<td>CO</td>
<td>1</td>
<td>3</td>
<td>#30</td>
<td>-</td>
</tr>
<tr>
<td>166</td>
<td>STS MACL, Rn</td>
<td>CO</td>
<td>1</td>
<td>3</td>
<td>#30</td>
<td>-</td>
</tr>
<tr>
<td>167</td>
<td>STS PR, Rn</td>
<td>CO</td>
<td>2</td>
<td>2</td>
<td>#26</td>
<td>-</td>
</tr>
<tr>
<td>168</td>
<td>STS.L MACH, @-Rn</td>
<td>CO</td>
<td>1</td>
<td>1/1</td>
<td>#31</td>
<td>-</td>
</tr>
<tr>
<td>169</td>
<td>STS.L MACL, @-Rn</td>
<td>CO</td>
<td>1</td>
<td>1/1</td>
<td>#31</td>
<td>-</td>
</tr>
<tr>
<td>170</td>
<td>STS.L PR, @-Rn</td>
<td>CO</td>
<td>2</td>
<td>2/2</td>
<td>#27</td>
<td>-</td>
</tr>
</tbody>
</table>

### Single-precision floating-point instructions

<table>
<thead>
<tr>
<th>No</th>
<th>Instruction</th>
<th>Instruction group</th>
<th>Issue rate</th>
<th>Latency</th>
<th>Execution pattern</th>
<th>Lock</th>
</tr>
</thead>
<tbody>
<tr>
<td>171</td>
<td>FLDI0 FRn</td>
<td>LS</td>
<td>1</td>
<td>0</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>172</td>
<td>FLDI1 FRn</td>
<td>LS</td>
<td>1</td>
<td>0</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>173</td>
<td>FMOV FRm, FRn</td>
<td>LS</td>
<td>1</td>
<td>0</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>174</td>
<td>FMOV.S @Rm, FRn</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>175</td>
<td>FMOV.S @Rm+, FRn</td>
<td>LS</td>
<td>1</td>
<td>1/2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>176</td>
<td>FMOV.S @(R0, Rm), FRn</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>177</td>
<td>FMOV.S FRm, @Rn</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>178</td>
<td>FMOV.S FRm, @-Rn</td>
<td>LS</td>
<td>1</td>
<td>1/1</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>179</td>
<td>FMOV.S FRm, @(R0, Rn)</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
</tr>
<tr>
<td>180</td>
<td>FLDS FRm, FPUL</td>
<td>LS</td>
<td>1</td>
<td>0</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>181</td>
<td>FSTS FPUL, FRn</td>
<td>LS</td>
<td>1</td>
<td>0</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>182</td>
<td>FABS FRn</td>
<td>LS</td>
<td>1</td>
<td>0</td>
<td>#1</td>
<td>-</td>
</tr>
<tr>
<td>183</td>
<td>FADD FRm, FRn</td>
<td>FE</td>
<td>1</td>
<td>3/4</td>
<td>#36</td>
<td>-</td>
</tr>
</tbody>
</table>


<table>
<thead>
<tr>
<th>Functional category</th>
<th>No</th>
<th>Instruction</th>
<th>Instruction group, issue rate, latency, execution pattern</th>
<th>Lock (stage, start, cycles)</th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>184</td>
<td>FCMP/EQ FRn,FRm</td>
<td>FE 1 2/4 #36</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>185</td>
<td>FCMP/GT FRn,FRm</td>
<td>FE 1 2/4 #36</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>186</td>
<td>FDIV FRn,FRm</td>
<td>FE 1 12/13 #37</td>
<td>F3 2 10</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>187</td>
<td>FLOAT FPUL,FRn</td>
<td>FE 1 3/4 #36</td>
<td>F1 2 2</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>188</td>
<td>FMAC FR0,FRm,FRn</td>
<td>FE 1 3/4 #36</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>189</td>
<td>FMUL FRm,FRn</td>
<td>FE 1 3/4 #36</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>190</td>
<td>FNEG FRn</td>
<td>LS 1 0 #1</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>191</td>
<td>FSQRT FRn</td>
<td>FE 1 11/12 #37</td>
<td>F3 2 9</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>192</td>
<td>FSUB FRm,FRn</td>
<td>FE 1 3/4 #36</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>193</td>
<td>FTRC FRm,FPUL</td>
<td>FE 1 3/4 #36</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>194</td>
<td>FMOV DRm,DRn</td>
<td>LS 1 0 #1</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>195</td>
<td>FMOV @Rm,DRn</td>
<td>LS 1 2 #2</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>196</td>
<td>FMOV @Rm+,DRn</td>
<td>LS 1 1/2 #2</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>197</td>
<td>FMOV @(R0,Rm),DRn</td>
<td>LS 1 2 #2</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>198</td>
<td>FMOV DRm,@Rn</td>
<td>LS 1 1 #2</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>199</td>
<td>FMOV DRm,@-Rn</td>
<td>LS 1 1/1 #2</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>200</td>
<td>FMOV DRm,@(R0,Rn)</td>
<td>LS 1 1 #2</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Double-precision floating-point instructions</td>
<td>201</td>
<td>FABS DRn</td>
<td>LS 1 0 #1</td>
<td>- - -</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>202</td>
<td>FADD DRm,DRn</td>
<td>FE 1 (7, 8)/9 #39</td>
<td>F1 2 6</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>203</td>
<td>FCMP/EQ DRm,DRn</td>
<td>CO 2 3/5 #40</td>
<td>F1 2 2</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td></td>
<td>204</td>
<td>FCMP/GT DRm,DRn</td>
<td>CO 2 3/5 #40</td>
<td>F1 2 2</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>


<table>
<thead>
<tr>
<th>Functional category</th>
<th>No</th>
<th>Instruction</th>
<th>Operands</th>
<th>Instruction group</th>
<th>Issue rate</th>
<th>Latency</th>
<th>Execution pattern</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>205</td>
<td>FCNVDS</td>
<td>DRm,FPUL</td>
<td>FE</td>
<td>1</td>
<td>4/5</td>
<td>#38</td>
</tr>
<tr>
<td></td>
<td>206</td>
<td>FCNVSD</td>
<td>FPUL,DRn</td>
<td>FE</td>
<td>1</td>
<td>(3, 4)/5</td>
<td>#38</td>
</tr>
<tr>
<td></td>
<td>207</td>
<td>FDIV</td>
<td>DRm,DRn</td>
<td>FE</td>
<td>1</td>
<td>(24, 25)/26</td>
<td>#41</td>
</tr>
<tr>
<td></td>
<td>208</td>
<td>FLOAT</td>
<td>FPUL,DRn</td>
<td>FE</td>
<td>1</td>
<td>(3, 4)/5</td>
<td>#38</td>
</tr>
<tr>
<td></td>
<td>209</td>
<td>FMUL</td>
<td>DRm,DRn</td>
<td>FE</td>
<td>1</td>
<td>(7, 8)/9</td>
<td>#39</td>
</tr>
<tr>
<td></td>
<td>210</td>
<td>FNEG</td>
<td>DRn</td>
<td>LS</td>
<td>1</td>
<td>0</td>
<td>#1</td>
</tr>
<tr>
<td></td>
<td>211</td>
<td>FSQRT</td>
<td>DRn</td>
<td>FE</td>
<td>1</td>
<td>(23, 24)/25</td>
<td>#41</td>
</tr>
<tr>
<td></td>
<td>212</td>
<td>FSUB</td>
<td>DRm,DRn</td>
<td>FE</td>
<td>1</td>
<td>(7, 8)/9</td>
<td>#39</td>
</tr>
<tr>
<td></td>
<td>213</td>
<td>FTRC</td>
<td>DRm,FPUL</td>
<td>FE</td>
<td>1</td>
<td>4/5</td>
<td>#38</td>
</tr>
<tr>
<td></td>
<td>214</td>
<td>LDS</td>
<td>Rm,FPUL</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#1</td>
</tr>
<tr>
<td></td>
<td>215</td>
<td>LDS</td>
<td>Rm,FPSCR</td>
<td>CO</td>
<td>1</td>
<td>4</td>
<td>#32</td>
</tr>
<tr>
<td></td>
<td>216</td>
<td>LDS.L</td>
<td>@Rm+,FPUL</td>
<td>CO</td>
<td>1</td>
<td>1/2</td>
<td>#2</td>
</tr>
<tr>
<td></td>
<td>217</td>
<td>LDS.L</td>
<td>@Rm+,FPSCR</td>
<td>CO</td>
<td>1</td>
<td>1/4</td>
<td>#33</td>
</tr>
<tr>
<td></td>
<td>218</td>
<td>STS</td>
<td>FPUL,Rn</td>
<td>LS</td>
<td>1</td>
<td>3</td>
<td>#1</td>
</tr>
<tr>
<td></td>
<td>219</td>
<td>STS</td>
<td>FPSCR,Rn</td>
<td>CO</td>
<td>1</td>
<td>3</td>
<td>#1</td>
</tr>
<tr>
<td></td>
<td>220</td>
<td>STS.L</td>
<td>FPUL,@-Rn</td>
<td>CO</td>
<td>1</td>
<td>1/1</td>
<td>#2</td>
</tr>
<tr>
<td></td>
<td>221</td>
<td>STS.L</td>
<td>FPSCR,@-Rn</td>
<td>CO</td>
<td>1</td>
<td>1/1</td>
<td>#2</td>
</tr>
<tr>
<td></td>
<td>222</td>
<td>FMOV</td>
<td>DRm,XDn</td>
<td>LS</td>
<td>1</td>
<td>0</td>
<td>#1</td>
</tr>
<tr>
<td></td>
<td>223</td>
<td>FMOV</td>
<td>XDm,DRn</td>
<td>LS</td>
<td>1</td>
<td>0</td>
<td>#1</td>
</tr>
<tr>
<td></td>
<td>224</td>
<td>FMOV</td>
<td>XDm,XDn</td>
<td>LS</td>
<td>1</td>
<td>0</td>
<td>#1</td>
</tr>
</tbody>
</table>

**Table 76: Execution cycles**

<table>
<thead>
<tr>
<th>Functional category</th>
<th>No</th>
<th>Instruction</th>
<th>Instruction group</th>
<th>Issue rate</th>
<th>Latency</th>
<th>Execution pattern</th>
<th>Lock stage</th>
<th>Lock start, cycles</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td></td>
<td>FMOV @Rm,XDn</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>225</td>
<td>FMOV @Rm+,XDn</td>
<td>LS</td>
<td>1</td>
<td>1/2</td>
<td>#2</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>226</td>
<td>FMOV @(R0,Rm),XDn</td>
<td>LS</td>
<td>1</td>
<td>2</td>
<td>#2</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>227</td>
<td>FMOV XDm,@Rn</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>228</td>
<td>FMOV XDm,@-Rn</td>
<td>LS</td>
<td>1</td>
<td>1/1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>229</td>
<td>FMOV XDm,@(R0,Rn)</td>
<td>LS</td>
<td>1</td>
<td>1</td>
<td>#2</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>230</td>
<td>FIPR FVm,FVn</td>
<td>FE</td>
<td>1</td>
<td>4/5</td>
<td>#42</td>
<td>F1</td>
<td>3 1</td>
</tr>
<tr>
<td></td>
<td>231</td>
<td>FIPR Vm,FVn</td>
<td>FE</td>
<td>1</td>
<td>1/4</td>
<td>#36</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>232</td>
<td>FIPR Vm,Fv</td>
<td>FE</td>
<td>1</td>
<td>1/4</td>
<td>#36</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>233</td>
<td>FIPR Vm,Fv</td>
<td>FE</td>
<td>1</td>
<td>1/4</td>
<td>#36</td>
<td>-</td>
<td>-</td>
</tr>
<tr>
<td></td>
<td>234</td>
<td>FTRV XMTRX,FVn</td>
<td>FE</td>
<td>1</td>
<td>(5, 5, 6, 7)/8</td>
<td>#43</td>
<td>F0</td>
<td>2 4</td>
</tr>
<tr>
<td></td>
<td></td>
<td>FTRV XMTRX,FVn</td>
<td>FE</td>
<td>1</td>
<td>(5, 5, 6, 7)/8</td>
<td>#43</td>
<td>F1</td>
<td>3 4</td>
</tr>
</tbody>
</table>

**Table 76: Execution cycles**

1. See **Table 74** for the instruction groups.

2. Latency “L1/L2...”: Latency corresponding to a write to each register, including MACH/MACL/FPSCR. Example: for MOV.B @Rm+,Rn with latency “1/2”, the latency for Rm is 1 cycle and the latency for Rn is 2 cycles.

3. Branch latency: Interval until the branch destination instruction is fetched.

4. Conditional branch latency “2 (or 1)”: The latency is 2 for a nonzero displacement, and 1 for a zero displacement.

5. Double-precision floating-point instruction latency “(L1, L2)/L3”: L1 is the latency for FR[n+1], L2 that for FR[n], and L3 that for FPSCR.

6. FTRV latency “(L1, L2, L3, L4)/L5”: L1 is the latency for FR[n], L2 that for FR[n+1], L3 that for FR[n+2], L4 that for FR[n+3], and L5 that for FPSCR.

7. Latency “L1/L2/L3/L4” of MAC.L and MAC.W instructions: L1 is the latency for Rm, L2 that for Rn, L3 that for MACH, and L4 that for MACL.

8. Latency “L1/L2” of MUL.L, MULS.W, MULU.W, DMULS.L, and DMULU.L instructions: L1 is the latency for MACH, and L2 that for MACL.

9. Execution pattern: The instruction execution pattern number (see figure 8.2).

10. Lock/stage: Stage locked by the instruction.

11. Lock/start: Locking start cycle; 1 is the first D-stage of the instruction.

12. Lock/cycles: Number of cycles locked.
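
The latency notation described in notes 2, 5 and 6 can be decoded mechanically. The following Python sketch is illustrative only and is not part of the SH-4 documentation: the function name `parse_latency` and the shape of its return value are assumptions made for this example. It separates the parenthesised floating-point register latencies (if any) from the remaining slash-separated latencies. It does not handle the conditional branch form “2 (or 1)”.

```python
import re


def parse_latency(entry):
    """Split a Table 76 latency entry into (grouped, rest).

    grouped: latencies written inside parentheses, e.g. the FR[n+1] and FR[n]
             values of a double-precision instruction (empty list if absent).
    rest:    the remaining slash-separated latencies, in the order printed.
    """
    entry = entry.replace(" ", "")
    grouped = []
    match = re.match(r"\(([0-9,]+)\)/?", entry)
    if match:
        # "(L1,L2)" or "(L1,L2,L3,L4)" prefix: one latency per FP register
        grouped = [int(value) for value in match.group(1).split(",")]
        entry = entry[match.end():]
    rest = [int(value) for value in entry.split("/")] if entry else []
    return grouped, rest
```

For example, `parse_latency("(7, 8)/9")` returns `([7, 8], [9])`, which under note 5 reads as 7 cycles for FR[n+1], 8 cycles for FR[n], and 9 cycles for FPSCR.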

Exceptions:

1. When a floating-point computation instruction is followed by an FMOV store, an STS FPUL,Rn instruction, or an STS.L FPUL,@-Rn instruction, the latency of the floating-point computation is decreased by 1 cycle.

2. When the preceding instruction loads the shift amount of the following SHAD/SHLD, the latency of the load is increased by 1 cycle.

3. When an LS group instruction with a latency of less than 3 cycles is followed by a double-precision floating-point instruction, FIPR, or FTRV, the latency of the first instruction is increased to 3 cycles.

   Example: In the case of FMOV FR4,FR0 and FIPR FV0,FV4, FIPR is stalled for 2 cycles.

4. When MAC*/MUL*/DMUL* is followed by an STS.L MAC*,@-Rn instruction, the latency of MAC*/MUL*/DMUL* is 5 cycles.

5. In the case of consecutive executions of MAC*/MUL*/DMUL*, the latency is decreased to 2 cycles.

6. When an LDS to MAC* is followed by an STS.L MAC*,@-Rn instruction, the latency of the LDS to MAC* is 4 cycles.

7. When an LDS to MAC* is followed by MAC*/MUL*/DMUL*, the latency of the LDS to MAC* is 1 cycle.

8. When an FSCHG or FRCHG instruction is followed by an LS group instruction that reads or writes to a floating-point register, that LS group instruction cannot be executed in parallel with it.

9. When a single-precision FTRC instruction is followed by an STS FPUL,Rn instruction, the latency of the single-precision FTRC instruction is 1 cycle.
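
Exception 3 above can be written as a small rule. The following Python sketch is illustrative only: `effective_latency` and its parameter names are hypothetical, and the base latency must be looked up in Table 76. The value 0 used for FMOV FR4,FR0 in the usage line is an assumption, because that row is not reproduced in this part of the table.

```python
def effective_latency(group, base_latency, followed_by_dp_fipr_or_ftrv):
    """Return the latency of one instruction after applying exception 3.

    group: instruction group from Table 76, for example "LS" or "FE".
    base_latency: latency listed in Table 76 for the instruction.
    followed_by_dp_fipr_or_ftrv: True if the next instruction is a
        double-precision floating-point instruction, FIPR, or FTRV.
    """
    if group == "LS" and base_latency < 3 and followed_by_dp_fipr_or_ftrv:
        # Exception 3: the latency of the first instruction is raised to 3
        return 3
    return base_latency


# FMOV FR4,FR0 (LS group, base latency assumed to be 0) followed by FIPR FV0,FV4
fmov_latency = effective_latency("LS", 0, True)
```

Here `fmov_latency` is 3. This is the situation in the example under exception 3, where the document reports that FIPR is stalled for 2 cycles.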
## Address list

<table>
<thead>
<tr>
<th>Module</th>
<th>Register</th>
<th>P4 address</th>
<th>Area 7 address</th>
<th>Size</th>
<th>Power-on reset</th>
<th>Manual reset</th>
<th>Sleep</th>
<th>Standby</th>
<th>Sync clock</th>
</tr>
</thead>
<tbody>
<tr>
<td>CCN</td>
<td>PTEH</td>
<td>0xFF00 0000</td>
<td>0x1F00 0000</td>
<td>32</td>
<td>0x0000 0000</td>
<td>0x0000 0000</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>CCN</td>
<td>PTEL</td>
<td>0xFF00 0004</td>
<td>0x1F00 0004</td>
<td>32</td>
<td>0x0000 0000</td>
<td>0x0000 0000</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>CCN</td>
<td>TTB</td>
<td>0xFF00 0008</td>
<td>0x1F00 0008</td>
<td>32</td>
<td>0x0000 0000</td>
<td>0x0000 0000</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>CCN</td>
<td>TEA</td>
<td>0xFF00 000C</td>
<td>0x1F00 000C</td>
<td>32</td>
<td>0x0000 0000</td>
<td>0x0000 0000</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>CCN</td>
<td>MMUCR</td>
<td>0xFF00 0010</td>
<td>0x1F00 0010</td>
<td>32</td>
<td>0x0000 0000</td>
<td>0x0000 0000</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>CCN</td>
<td>BASRA</td>
<td>0xFF00 0014</td>
<td>0x1F00 0014</td>
<td>8</td>
<td>Undefined</td>
<td>Held</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>CCN</td>
<td>BASRB</td>
<td>0xFF00 0018</td>
<td>0x1F00 0018</td>
<td>8</td>
<td>Undefined</td>
<td>Held</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>CCN</td>
<td>CCR</td>
<td>0xFF00 001C</td>
<td>0x1F00 001C</td>
<td>32</td>
<td>0x0000 0000</td>
<td>0x0000 0000</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>CCN</td>
<td>TRA</td>
<td>0xFF00 0020</td>
<td>0x1F00 0020</td>
<td>32</td>
<td>0x0000 0000</td>
<td>0x0000 0000</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>CCN</td>
<td>EXPEVT</td>
<td>0xFF00 0024</td>
<td>0x1F00 0024</td>
<td>32</td>
<td>0x0000 0000</td>
<td>0x0000 0020</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>CCN</td>
<td>INTEVT</td>
<td>0xFF00 0028</td>
<td>0x1F00 0028</td>
<td>32</td>
<td>0x0000 0000</td>
<td>Held</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>CCN</td>
<td>QACR0</td>
<td>0xFF00 0038</td>
<td>0x1F00 0038</td>
<td>32</td>
<td>Undefined</td>
<td>Undefined</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>CCN</td>
<td>QACR1</td>
<td>0xFF00 003C</td>
<td>0x1F00 003C</td>
<td>32</td>
<td>Undefined</td>
<td>Undefined</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>UBC</td>
<td>BARA</td>
<td>0xFF20 0000</td>
<td>0x1F20 0000</td>
<td>32</td>
<td>Undefined</td>
<td>Held</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
<tr>
<td>UBC</td>
<td>BAMRA</td>
<td>0xFF20 0004</td>
<td>0x1F20 0004</td>
<td>8</td>
<td>Undefined</td>
<td>Held</td>
<td>Held</td>
<td>Held</td>
<td>Iclk</td>
</tr>
</tbody>
</table>

*Table 77: Address list*
a. With control registers, the above addresses in the physical page number field can be accessed by means of a TLB setting. When these addresses are referenced directly without using the TLB, operations are limited.

Note: The address map for peripheral devices is contained in the system manual for the part.
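
In every row of Table 77, the Area 7 address equals the P4 address with its three most significant bits cleared. The following Python sketch illustrates that relationship. The helper name `p4_to_area7` is hypothetical, and the range check (top byte 0xFF) is an assumption based only on the rows listed in the table.

```python
AREA7_MASK = 0x1FFFFFFF  # keeps the low 29 bits of a 32-bit address


def p4_to_area7(p4_address):
    """Map a P4 control-register address to its Area 7 address."""
    if (p4_address >> 24) != 0xFF:
        # Assumption: only 0xFFxx xxxx addresses, as listed in Table 77
        raise ValueError("not a P4 control-register address")
    return p4_address & AREA7_MASK
```

For example, `p4_to_area7(0xFF00001C)` gives `0x1F00001C`, matching the CCR row.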

## Instruction prefetch side effects

The SH-4 is provided with an internal buffer for holding pre-read instructions, and always performs pre-reading. Therefore, program code must not be located in the last 20 bytes of any memory space. If program code is located there, pre-reading runs past the end of the memory space, and a bus access for instruction pre-reading may be initiated outside it. A case in which this is a problem is shown below.

<table>
<thead>
<tr>
<th>Area</th>
<th>Address</th>
<th>Instruction</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>Area 0</td>
<td>0x03FFFFF8</td>
<td>ADD R1,R4</td>
<td>PC (program counter)</td>
</tr>
<tr>
<td></td>
<td>0x03FFFFFA</td>
<td>JMP @R2</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0x03FFFFFC</td>
<td>NOP</td>
<td></td>
</tr>
<tr>
<td></td>
<td>0x03FFFFFE</td>
<td>NOP</td>
<td></td>
</tr>
<tr>
<td>Area 1</td>
<td>0x04000000</td>
<td></td>
<td>Instruction prefetch address</td>
</tr>
<tr>
<td></td>
<td>0x04000002</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Table 78: Example

Table 78 illustrates a case in which the instruction (ADD) indicated by the program counter (PC) and the instruction prefetch at address 0x04000002 are executed simultaneously. It is assumed that the program branches to an area other than Area 1 after executing the following JMP instruction and delay slot instruction.

In this case, the program flow is unpredictable, and a bus access (instruction prefetch) to Area 1 may be initiated.
Instruction prefetch side effects

1. It is possible that an external bus access caused by an instruction prefetch may result in misoperation of an external device, such as a FIFO, connected to the area concerned.

2. If there is no device to reply to an external bus request caused by an instruction prefetch, a hang-up will occur.

Remedies

1. These illegal instruction fetches can be avoided by using the MMU.

2. The problem can be avoided by not locating program code in the last 20 bytes of any area.
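
The second remedy can be checked when laying out code. The following Python sketch is illustrative only. It assumes that each area is 0x04000000 bytes long, as implied by the Area 0 / Area 1 boundary in Table 78, and the names `AREA_SIZE`, `GUARD_BYTES`, `in_prefetch_guard` and `code_range_is_safe` are hypothetical.

```python
AREA_SIZE = 0x04000000   # assumed area size, from the boundary shown in Table 78
GUARD_BYTES = 20         # last bytes of an area that must not hold program code


def in_prefetch_guard(address):
    """True if the address lies in the last GUARD_BYTES bytes of its area."""
    return (address % AREA_SIZE) >= AREA_SIZE - GUARD_BYTES


def code_range_is_safe(start, length):
    """True if no byte of [start, start + length) lies in a guard region."""
    if length <= 0:
        return True
    end = start + length - 1
    if start // AREA_SIZE != end // AREA_SIZE:
        # A range that crosses an area boundary covers the end of an area
        return False
    # Within one area the guard region is at the top, so checking the
    # last byte of the range is sufficient
    return not in_prefetch_guard(end)
```

With these assumptions, the ADD instruction at 0x03FFFFF8 in Table 78 falls inside the guard region, so `in_prefetch_guard(0x03FFFFF8)` is true.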
## Index

### A
- ADD 209, 214-215, 261
- ADDC 216
- ADDV 217
- AND 186, 199-201, 218-220
- AND.B 220

### B
- Backus-Naur Form xiii
- BF 221, 223
- BNF. See Backus-Naur Form.
- BRA 225
- BRAF 226
- BREAK 227
- BRK 227
- BSR 209, 228
- BSRF 209, 230
- BT 231, 233

### C
- CMPGT 267

### D
- DIV0S 247
- DIV1 249
- DMULS.L 250
- DMULU.L 251
- DT 252

### E
- ELSE 192
- EXTS.B 253
- EXTS.W 254
- EXTU.B 255
- EXTU.W 256

### F
- FABS 257-258
- FABS.D 204
- FABS.S 204
- FADD 211, 259-261
- FADD.D 204
- FADD.S 204
- FCMPEQ.D 205
- FCMPEQ.S 205
- FCMPT.D 205
- FCMPT.S 205
- FCNV.DS 205
- FCNV.SD 205
- FCNVDS 268, 270
- FCNVSD 269-270
- FDIV 271-273
- FDIV.D 204
- FDIV.S 204
- FIPR 275-276, 278
- FIPR.S 206, 276
- FLDI 280-281
- FLDS 279
- FLOAT 282-284
- FLOAT.LD 205
- FLOAT.LS 205
- FMAC 285
- FMAC.S 205, 286-287
- FMOV 290-293, 295-298, 300-304, 306-315
- FMOV.S 296-298, 300, 310-312
- FMUL 316-318
- FMUL.D 204
- FMUL.S 204
- FNEG 319-320
- FNEG.D 205
- FNEG.S 205
- FOR 184-185, 188, 192, 196, 199, 201
- FPD 194, 203, 261, 264, 267, 270, 273, 276-277, 284, 286, 318, 325, 329, 331-332, 334-335
- FPUDIS 257-260, 262-263, 265-266, 268-269, 271-272, 276, 279-283, 285, 290-292, 294-297, 299-303, 305-317, 319-324, 326-328, 330-331, 334, 353-356, 459-462
- FPUE 212
- FPUL 194, 268-269, 279, 282-284, 326, 330-331, 355-356, 461-462
- FROM 192
- FSQRT 323-325
- FSQRT.D 205
- FSQRT.S 205
- Function
  - FNEG_D 205
  - FNEG_S 205
  - FpuCauseE() 203
  - FpuCauseI() 203
  - FpuCauseO() 203
  - FpuCauseU() 203
  - FpuCauseV() 203
  - FpuCauseZ() 203
  - FpuEnableI() 203
  - FpuEnableO() 203
  - FpuEnableU() 203
  - FpuEnableV() 203
  - FpuFlagI() 203
  - FpuFlagO() 203
  - FpuFlagU() 203
  - FpuFlagV() 203
  - FpuFlagZ() 203
  - FpuIsDisabled() 203
  - FSQRT_D 205
  - FSQRT_S 205
  - FSUB_D 204
  - FSUB_S 204
  - FTRC_DL 205
  - FTRC_SL 205
  - FTRV_S 206
  - InstFetchMiss(address) 197
  - InstInvalidateMiss(address) 197
  - IsLittleEndian() 198
  - MalformedAddress(address) 197, 199-201
  - MMU() 197, 199-201
  - OCBI(address) 202
  - OCBP(address) 202
  - OCBWB(address) 202
  - PrefetchMemory(address) 200
  - PREFO(address) 200, 202
  - ReadMemoryLown(address) 200
  - ReadMemoryn(address) 198-199
  - ReadMemoryPairn(address) 198-199
  - ReadProhibited(address) 197, 199-200
  - Register(i) 188
  - SignExtendn(i) 187
  - WriteControlRegister(index, value) 201
  - WriteMemoryLown(address, value) 201
  - WriteMemoryn(address, value) 200-201
  - WriteMemoryPairn(address, value) 200-201
  - WriteProhibited(address) 197, 201
  - ZeroExtendn(i) 187

### I
- IADDERR 226, 230
- IF 192, 199-201, 212
- INT 186
- ISA 207-208, 226, 230, 338

### J
- JMP 337
- JSR 209, 338

### L
- LDC 339-352, 443-449
- LDC.L 340, 346-352
- LDS 209, 348, 353-362
- LDS.L 354, 356, 358, 360, 362

### M
- MAC.L 365
- MAC.W 367
- MD 194
- MEM 195-196, 199, 201
- MMU 197-201
- MOV 369-402
- MOV.B 371-380
- MOV.L 381-391
- MOV.W 392-402
- MOVA 403
- MOVCA.L 404
- MOVT 406
- MUL.L 407
- MULS.W 408
- MULU.W 409

### N
- NEG 410
- NEGC 411
- NOT 186, 200, 413

### O
- OCBI 202, 414
- OCBP 202, 415
- OCBWB 202, 416
- OR 186, 199, 201, 417-419
- OR.B 419

### P
- PO 204-206
- PREF 420
- PREFO 200, 202

### R
- RADDERR 199
- READPROT 199
- Register
  - DR 195
  - FPSCR.CAUSE.E 203
  - FPSCR.CAUSE.I 203
  - FPSCR.CAUSE.O 203
  - FPSCR.CAUSE.U 203
  - FPSCR.CAUSE.V 203
  - FPSCR.CAUSE.Z 203
  - FPSCR.DN 261, 263, 266, 270, 273, 276, 286, 318, 325, 329, 331, 334-335
  - FPSCR.ENABLE.I 203
  - FPSCR.ENABLE.O 203
  - FPSCR.ENABLE.U 203
  - FPSCR.ENABLE.V 203
  - FPSCR.ENABLE.Z 203
  - FPSCR.FLAG.I 203
  - FPSCR.FLAG.O 203
  - FPSCR.FLAG.U 203
  - FPSCR.FLAG.V 203
  - FPSCR.FLAG.Z 203
  - FPSCR.FR 321
  - FPSCR.SZ 322
  - FR 285, 321
  - GBR 194, 220, 339-352, 374, 379, 384, 389, 395, 400, 419, 443-458, 478, 481
  - MTRX 195
  - SR 194, 202-203
  - SR.FD 203
- REPEAT 192
- ROTCL 421
- ROTCR 422
- ROTL 423
- ROTR 424
- RTLBMISS 199

### S
- SHAD 430
- SHAL 431
- SHAR 432
- SHLD 433
- SHLL 434-437
- SHLR 438-442
- SLEEP 201-202
- STC 450-458
- STC.L 450-456, 458
- STEP 192
- STS 209, 459-468
- STS.L 460, 462, 464, 466, 468
- SUB 329, 469
- SUBC 470
- SUBV 471
- SuperH SH-Series documentation suite notation xiii
- SWAP.B 472
- SWAP.W 473
- SZ 322

### T
- TAS.B 474
- The appendix 515
- THROW 193, 199, 201, 212
- TRAPA 475
- TST 476-478
- TST.B 478

### U
- UNDEFINED 190-191

### W
- WRITEPROT 201
- WTLBMISS 201

### XYZ
- XMTRX 333-334
- XOR 186, 479-481
- XOR.B 481
- XTRCT 482