Visualizing the ARM64 Instruction Set (2024)

Mapping the machine code

Researcher Zachary Yedidia has mapped the entire ARM64 instruction set into a 2D visualization using a space-filling Hilbert curve. The project turns the 32-bit integers that drive modern mobile and server processors into a readable, color-coded map.

ARM64 encodes every instruction as a 32-bit integer, which allows 4,294,967,296 possible encodings. Yedidia used a Hilbert curve to lay out these roughly 4.3 billion values on a two-dimensional plane. This type of space-filling curve preserves locality: values that are close in numeric order, and therefore share their high-order bits, stay near each other on the map. The resulting images reveal distinct clusters and patterns that reflect how ARM processors decode instructions.

The visualization relies on Arm’s Machine Readable Architecture (MRA) Specification. Yedidia used the June 2023 version of the spec, which includes all architectural extensions up to ARMv8.9. The specification provides the XML and HTML data needed to decode every instruction in the Instruction Set Architecture (ISA).
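
To make the locality property concrete, here is a minimal Python sketch of the classic iterative conversion from a Hilbert index to grid coordinates. It is not taken from Yedidia's tools: the function name `hilbert_d2xy` and the 65,536 x 65,536 grid (one cell per 32-bit value) are illustrative assumptions, and the orientation of the author's curve may differ.

```python
def hilbert_d2xy(side: int, d: int) -> tuple[int, int]:
    """Map index d along a Hilbert curve to (x, y) on a side x side grid.

    side must be a power of two and 0 <= d < side * side.
    """
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = (t >> 1) & 1
        ry = (t ^ rx) & 1
        # Rotate or flip the quadrant built so far.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        # Move into the quadrant selected by this pair of index bits.
        x += s * rx
        y += s * ry
        t >>= 2
        s <<= 1
    return x, y


# Assumed full-resolution grid: 65536 * 65536 == 2**32 cells.
SIDE = 1 << 16

# Two consecutive 32-bit values always land on neighbouring cells.
a = hilbert_d2xy(SIDE, 0x52000020)
b = hilbert_d2xy(SIDE, 0x52000021)
print(a, b)
```

Consecutive indices always land on adjacent cells, so a run of closely numbered encodings forms a compact region on the map instead of a thin stripe.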

Parsing the official specification

Yedidia developed a custom tool to parse the large XML files provided by Arm. The tool identified approximately 3,000 unique instruction encodings within the architecture and extracts metadata for each one, including mnemonics, instruction classes, and specific ARMv8 feature variants. A second tool then iterates through every possible 32-bit value to determine which encoding, if any, it belongs to. The process is computationally intensive because it must test billions of candidate values against the encoding diagrams.

The official specification describes each bit of an encoding as 0, 1, or x, but it also includes parenthesized (0) and (1) values. Yedidia treated these parenthesized bits as "don't care" values (x) to match how existing disassemblers handle them. These bits likely represent recommended but not strictly required values. This mapping forms the foundation for the final image, where each pixel represents a block of the instruction space.
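
The 0/1/x matching described above can be sketched as a mask-and-value comparison. This is a hedged illustration, not the author's parser: the helper names `compile_pattern` and `matches` are hypothetical, and the pattern string is a hand-written rendering of the fixed bits of EOR (immediate), not text copied from Arm's XML.

```python
import re


def compile_pattern(bits: str) -> tuple[int, int]:
    """Turn a 32-position pattern of 0/1/x into a (mask, value) pair.

    Parenthesized (0) and (1) are folded into x, mirroring the choice
    described in the article. Spaces are ignored.
    """
    flat = re.sub(r"\([01]\)", "x", bits).replace(" ", "")
    if len(flat) != 32:
        raise ValueError(f"expected 32 bit positions, got {len(flat)}")
    mask = value = 0
    for ch in flat:
        mask <<= 1
        value <<= 1
        if ch in "01":
            mask |= 1          # this bit is fixed by the encoding
            value |= int(ch)   # and must have this value
        elif ch != "x":
            raise ValueError(f"unexpected character {ch!r}")
    return mask, value


def matches(insn: int, pattern: tuple[int, int]) -> bool:
    """True if the fixed bits of insn agree with the pattern."""
    mask, value = pattern
    return (insn & mask) == value


# Fields, high to low: sf | opc=10 | 100100 | N | immr | imms | Rn | Rd
EOR_IMM = compile_pattern("x 10 100100 x xxxxxx xxxxxx xxxxx xxxxx")

print(hex(EOR_IMM[0]), hex(EOR_IMM[1]))
print(matches(0x52000020, EOR_IMM))  # encoding of: eor w0, w1, #0x1
```

Precompiling each pattern to two integers keeps the per-value test down to one AND and one comparison, which matters when the loop runs over billions of candidates.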

Handling instruction logic errors

The official Arm specification often includes Arm Specification Language (ASL) code that can override the simple bit-string encodings. For example, the EOR (Exclusive OR) instruction becomes undefined when certain encoding fields, such as "sf" and "N", take specific combinations of values. Simple bit-pattern matching cannot catch these edge cases. To solve this, Yedidia added a post-processing pass using the Capstone disassembler. Capstone's decoder already accounts for these additional constraints, so it can identify which encodings are actually valid on real hardware. This step filters out "ghost" instructions that match a bit-string pattern but are rejected by the processor.

The final visualization groups instructions into several distinct classes to make the map readable. These categories include:

General purpose instructions
System and control operations
Float and FPSIMD for floating-point math
SVE and SVE2 for scalable vector extensions
Mortlach and Mortlach2 (the internal names for SME and SME2)
Other miscellaneous encodings
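
Returning to the EOR edge case mentioned earlier in this section, the sketch below shows why bit-pattern matching alone is not enough. It is a hand-written stand-in for one ASL rule, not the Capstone pass the project actually uses. The check assumes the rule for EOR (immediate) reads "undefined when sf is 0 and N is 1", and it deliberately ignores other reserved immediate encodings. All function names are hypothetical.

```python
# Fixed bits of EOR (immediate): bits 30..23 are 1 0 1 0 0 1 0 0.
EOR_IMM_MASK = 0x7F800000
EOR_IMM_VALUE = 0x52000000


def looks_like_eor_imm(insn: int) -> bool:
    """Pure bit-pattern match, as a naive decoder would do it."""
    return (insn & EOR_IMM_MASK) == EOR_IMM_VALUE


def eor_imm_sf_n_undefined(insn: int) -> bool:
    """Assumed extra rule: a 32-bit operation (sf=0) with N=1 is undefined."""
    sf = (insn >> 31) & 1
    n = (insn >> 22) & 1
    return sf == 0 and n == 1


def classify(insn: int) -> str:
    if not looks_like_eor_imm(insn):
        return "other"
    return "undefined" if eor_imm_sf_n_undefined(insn) else "eor"


for word in (0x52000020, 0x52400020, 0xD2400020):
    print(hex(word), classify(word))
```

The second value differs from the first only in the N bit, so it passes the pattern match but fails the extra rule. That is the kind of "ghost" entry the post-processing pass removes.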

Visualizing software sandbox security

The mapping project also supports Yedidia’s research into Lightweight Fault Isolation (LFI). LFI is a software sandboxing technique designed to secure ARM64 systems by restricting which instructions a program can execute. The formal paper on LFI was scheduled for presentation at the ASPLOS conference in April 2024.

LFI uses machine code analysis to verify that an untrusted binary is safe to run. The verifier only permits instructions that obey strict invariants regarding memory access and register modification. If an instruction could leak data or crash the system, the verifier flags the entire program as unsafe.

Yedidia created a security heatmap using the Hilbert curve to show which parts of the ARM64 ISA are "legal" under LFI. In this view, red areas indicate blocks where every instruction is safe, while blue areas show regions heavily restricted by the sandbox. This visualization helps researchers verify that the sandboxing logic correctly identifies dangerous code patterns.
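
One way to picture the heatmap is as a per-block tally: for each block of encodings behind a pixel, count the share the verifier accepts and turn that share into a color. The sketch below is an assumption-laden illustration. The linear red-to-blue ramp, the function names, and the toy predicate are all hypothetical, and the real verifier and color scale are not reproduced here.

```python
from typing import Callable


def legal_fraction(start: int, count: int,
                   is_legal: Callable[[int], bool]) -> float:
    """Share of the encodings in [start, start + count) that pass is_legal."""
    legal = sum(1 for insn in range(start, start + count) if is_legal(insn))
    return legal / count


def heat_color(fraction: float) -> tuple[int, int, int]:
    """Assumed linear ramp: pure red when all pass, pure blue when none do."""
    red = round(255 * fraction)
    return (red, 0, 255 - red)


# Hypothetical stand-in for a real verifier: rejects every fourth value.
def toy_is_legal(insn: int) -> bool:
    return insn % 4 != 0


frac = legal_fraction(0, 256, toy_is_legal)
print(frac, heat_color(frac))
```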

Restricting the instruction space

The LFI verifier is significantly more restrictive than the standard ARM64 architecture. While the full 32-bit encoding space holds roughly 4.3 billion values, the current LFI verifier permits only about 750 million instructions. Most of these restrictions protect specific registers that the sandbox uses to maintain its security boundaries. The verifier monitors and restricts instructions that modify the following registers:

x18 (often used as a platform register)
x21 through x24 (used for sandbox invariants)
sp (the stack pointer)
x30 (the link register)

The visualization shows checkered blue patterns in the load and store regions, representing blocked memory operations. Direct branches usually appear in solid red because their ranges are statically limited, making them inherently safe for the sandbox. Floating-point and SIMD instructions are also generally permitted because they rarely interact with the restricted general-purpose registers or system memory.
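
To illustrate the register restrictions listed above, here is a deliberately simplified check for a single instruction form, ADD (immediate). It is not the LFI verifier: the real rules are more nuanced than "reject every write", and the function name and the blanket-reject policy are assumptions made for this example only.

```python
from typing import Optional

# Register numbers taken from the list above; x30 is the link register.
RESERVED = {18, 21, 22, 23, 24, 30}
SP = 31  # in ADD (immediate), destination number 31 means sp

# Fixed bits of ADD (immediate): bits 30..23 are 0 0 1 0 0 0 1 0.
ADD_IMM_MASK = 0x7F800000
ADD_IMM_VALUE = 0x11000000


def toy_verify_add_imm(insn: int) -> Optional[bool]:
    """Return None if insn is not ADD (immediate); otherwise True when
    the destination is unrestricted and False when it is reserved or sp."""
    if (insn & ADD_IMM_MASK) != ADD_IMM_VALUE:
        return None
    rd = insn & 0x1F  # destination register field, bits 4..0
    return rd not in RESERVED and rd != SP


print(toy_verify_add_imm(0x91000400))  # add x0, x0, #1
print(toy_verify_add_imm(0x91000415))  # add x21, x0, #1
print(toy_verify_add_imm(0x910043FF))  # add sp, sp, #16
```

Because the destination register sits in the lowest five bits, rejected and accepted encodings alternate at a fine grain, which is one plausible source of speckled patterns in a heatmap of this kind.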

Building the interactive tool

Yedidia released an interactive web version of the map that allows users to explore the ARM64 space manually. The web tool uses a version of Capstone compiled to WebAssembly to provide real-time disassembly of any point on the Hilbert curve. Users can hover over pixels to see the exact assembly code and instruction class.

One challenge in the web version is the assembler templates provided by Arm. These templates are designed for human readers rather than automated tools, making it difficult to generate string representations of instructions directly from the spec. The web tool currently falls back to displaying the instruction name if the WebAssembly version of Capstone does not recognize a specific extension.

The source code for these visualization tools is available on GitHub under the "armvis" repository. Yedidia’s future plans include adding support for more ARM extensions and potentially creating a similar map for the RISC-V architecture. These tools provide a new way for security researchers and compiler engineers to understand the massive complexity of modern instruction sets.
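
A hover lookup of this kind needs the reverse mapping, from a pixel back to the encodings it covers. The sketch below is an assumption, not the web tool's code. It uses the standard coordinate-to-index Hilbert conversion and relies on the property that an aligned square block of the curve covers one contiguous index range. The names `hilbert_xy2d` and `pixel_to_range` are hypothetical, and the image width must be a power of two no larger than 65,536.

```python
def hilbert_xy2d(side: int, x: int, y: int) -> int:
    """Map (x, y) on a side x side grid to its index along a Hilbert curve."""
    d = 0
    s = side // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip so the next level sees a canonical orientation.
        if ry == 0:
            if rx == 1:
                x = side - 1 - x
                y = side - 1 - y
            x, y = y, x
        s //= 2
    return d


def pixel_to_range(width: int, px: int, py: int) -> tuple[int, int]:
    """Half-open range of 32-bit encodings behind pixel (px, py)."""
    block = (1 << 32) // (width * width)  # encodings per pixel
    start = hilbert_xy2d(width, px, py) * block
    return start, start + block


lo, hi = pixel_to_range(4096, 100, 200)
print(hex(lo), hex(hi - 1))
```

In a 4,096-pixel-wide image each pixel stands for 256 consecutive encodings, so a tool could disassemble the first value of the range or summarize the whole block.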
