MIPS XIP Kernel
Execute-In-Place Linux kernel for memory-constrained embedded devices. Build firmware, IoT nodes, and router systems that run directly from SPI-NOR flash — no RAM copy required.
Project Overview
This project implements CONFIG_XIP_KERNEL for the MIPS architecture on Linux 6.12.34. It enables the kernel to execute its .text, .rodata, and .init.text sections directly from SPI-NOR flash without copying to RAM or decompressing.
Key Capabilities
Flash Execution
Kernel code runs directly from SPI-NOR flash at KSEG0 virtual address 0x9FC01000 (physical 0x1FC01000). No decompression, no RAM copy.
RAM Optimization
Frees ~2304 KiB of precious RAM on memory-constrained devices. Critical for IoT nodes and budget routers.
Minimal Patches
Only 5 kernel patches required. Uses MIPS KSEG0/KSEG1 direct-mapped segments — no page-table fixups needed.
Verified Boot
Static layout assertions + QEMU smoke tests ensure correctness. CI runs on every push.
Supported Configurations
| Defconfig | Purpose | Networking | Use Case |
|---|---|---|---|
| xip_qemu_malta_defconfig | Minimal XIP test | No | Development & verification |
| lwrt_qemu_malta_defconfig | Router userspace | Full IPv4 + nftables | LWRT router firmware |
Quick Start Guide
Build a bootable XIP kernel image in under 5 minutes.
Prerequisites
# Ubuntu/Debian
sudo apt update
sudo apt install -y clang lld llvm binutils-mipsel-linux-gnu \
    gcc-mipsel-linux-gnu \
    qemu-system-mips flex bison bc libelf-dev libssl-dev \
    python3 python3-pip texinfo
# Verify toolchain
clang --version
mipsel-linux-gnu-ld --version
qemu-system-mipsel --version
Build Everything
# Clone the repository
git clone https://github.com/user/mips-xip-kernel.git
cd mips-xip-kernel
# Full build: download + patch + compile + assemble
make
# Static layout verification
make verify
# Boot test in QEMU
make test
# Interactive QEMU session
make run
Build Outputs
build/
├── linux-6.12.34/ # Patched kernel source tree
│ ├── vmlinux # ELF kernel image
│ └── System.map # Symbol map
├── out/
│ ├── xip-bios.bin # Final flash image (shim + kernel ROM)
│ ├── shim.elf # Boot shim ELF
│ ├── shim.bin # Boot shim raw binary
│ └── kernel.bin # Kernel ROM blob (objcopy -O binary)
└── boot.log # QEMU serial output (if make test/run)
System Architecture
Boot Sequence
- CPU Reset: Jumps to reset vector at 0xBFC00000 (virtual) / 0x1FC00000 (physical)
- Boot Shim (4 KiB): Initializes GT-64120 system controller, fakes YAMON protocol, jumps to kernel_entry
- head.S XIP Data Copy: Copies writable data from flash LMA to RAM VMA (__data_loc → _sdata)
- BSS Clear: Zeros the .bss section in RAM
- setup.c: Memblock accounting reserves only RAM-resident sections ([_sdata, _end))
- Kernel Init: Standard Linux initialization with XIP-aware memory layout
- PID 1: Freestanding init binary executes, outputs markers, powers off
Firmware Fundamentals
Firmware is the software that bridges hardware and operating systems. In embedded Linux systems, firmware typically consists of a bootloader, kernel, device tree, and root filesystem — all packed into a flash image.
Firmware Image Components
- Boot shim: 4 KiB, hardware init + jump to kernel
- Kernel: XIP, .text + .rodata in flash
- Writable data: .data + .bss copied at boot
- Initramfs: root filesystem in RAM
Flash Memory Types
| Flash Type | Typical Size | Speed | XIP Support | Common Devices |
|---|---|---|---|---|
| SPI-NOR | 4–16 MiB | ~50 MB/s | Yes (this project) | Routers, IoT nodes |
| SPI-NAND | 128 MiB–2 GiB | ~200 MB/s | Limited | Set-top boxes, NAS |
| eMMC | 4–64 GiB | ~400 MB/s | No | Phones, tablets |
| NOR (parallel) | 1–32 MiB | ~100 MB/s | Yes | Legacy embedded |
OpenWrt Firmware Image Structure
┌─────────────────────────────────────────────────┐
│ U-Boot Header (64 bytes) │
├─────────────────────────────────────────────────┤
│ Kernel (compressed or XIP) │
│ ├── Entry point │
│ ├── Device Tree Blob (DTB) │
│ └── Initramfs │
├─────────────────────────────────────────────────┤
│ Root Filesystem (SquashFS + JFFS2 overlay) │
├─────────────────────────────────────────────────┤
│ Bootloader (U-Boot / Breed) │
└─────────────────────────────────────────────────┘
Build System
The build system is a Makefile-driven pipeline that orchestrates kernel download, patching, compilation, and flash image assembly.
Makefile Targets
| Target | Command | Description | Time |
|---|---|---|---|
| make | all: image | Full build pipeline | ~5 min |
| make kernel | build-kernel.sh | Download, patch, compile kernel | ~3 min |
| make image | build-image.sh | Assemble flash image | ~10 sec |
| make verify | verify-layout.py | Static ELF assertions | ~1 sec |
| make test | smoke-test.py | QEMU boot test | ~30 sec |
| make run | run-qemu.sh | Interactive QEMU session | manual |
| make clean | rm -rf $(OUT) | Remove build outputs | instant |
| make distclean | rm -rf $(WORK) | Remove entire work directory | instant |
Build Pipeline Flow
Download linux-6.12.34.tar.xz from kernel.org
Untar to build/linux-6.12.34/
Apply 5 XIP patches via patch -p1
Build initramfs (LWRT rootfs or demo init)
Copy defconfig, run olddefconfig
Build vmlinux with clang + mipsel binutils
Concatenate shim + kernel ROM → xip-bios.bin
build-kernel.sh Details
#!/bin/bash
set -euo pipefail
KVER=6.12.34
WORK=build
KDIR=$WORK/linux-$KVER
OUT=$WORK/out
# Which defconfig to build. Defaults to the bare XIP boot-test config; the
# LWRT image build overrides it with DEFCONFIG=lwrt_qemu_malta_defconfig.
DEFCONFIG="${DEFCONFIG:-xip_qemu_malta_defconfig}"
# 1. Fetch kernel tarball
curl -fSL "https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-${KVER}.tar.xz" \
-o "$WORK/linux-${KVER}.tar.xz"
# 2. Extract
tar -xf "$WORK/linux-${KVER}.tar.xz" -C "$WORK"
# 3. Apply XIP patches
for p in patches/*.patch; do
patch -p1 -d "$KDIR" < "$p"
done
# 4. Build the initramfs userspace. With no arguments this builds the tiny
# freestanding demo PID 1; if LWRT_ROOTFS points at a staged rootfs, the
# script instead emits a gen_init_cpio list describing it (this is how the
# LWRT binary is baked in). Either way it writes $OUT/initramfs.list.
bash scripts/build-userspace.sh "$OUT"
# 5. Configure kernel from the selected defconfig, then point the embedded
# initramfs at the generated list.
cp "configs/$DEFCONFIG" "$KDIR/arch/mips/configs/"
make -C "$KDIR" CC=clang ARCH=mips CROSS_COMPILE=mipsel-linux-gnu- "$DEFCONFIG"
"$KDIR/scripts/config" --file "$KDIR/.config" \
--set-str INITRAMFS_SOURCE "$OUT/initramfs.list"
make -C "$KDIR" CC=clang ARCH=mips CROSS_COMPILE=mipsel-linux-gnu- olddefconfig
# 6. Compile
make -C "$KDIR" CC=clang ARCH=mips \
CROSS_COMPILE=mipsel-linux-gnu- \
-j$(nproc) vmlinux
Kernel Configuration
The kernel configuration is carefully tuned for minimal footprint while retaining essential embedded functionality.
Minimal XIP Config
# Architecture
CONFIG_MIPS=y
CONFIG_32BIT=y
CONFIG_CPU_MIPS32_R2=y
CONFIG_CPU_LITTLE_ENDIAN=y
# XIP (the core feature)
CONFIG_XIP_KERNEL=y
CONFIG_XIP_PHYS_ADDR=0x1fc01000
# Optimization
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_EXPERT=y
CONFIG_SLUB_TINY=y
CONFIG_KALLSYMS=n
# Boot
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
# Hardware
CONFIG_MIPS_MALTA=y
CONFIG_PCI=y
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
# Power management
CONFIG_POWER_RESET_PIIX4_POWEROFF=y
# Disabled subsystems
CONFIG_SMP=n
CONFIG_FPU=n
CONFIG_BLOCK=n
CONFIG_PROC_FS=n
CONFIG_SYSFS=n
CONFIG_NET=n
CONFIG_USB=n
CONFIG_DEBUG_INFO=n
LWRT Router Config
Extends the minimal config with just enough for the LWRT userspace to come up as PID 1: the virtual filesystems, sysctl, IPv4 networking and the QEMU malta NIC. The embedded LWRT initramfs is gzip-compressed.
# Networking core (lean: dhcp client / dns / httpd)
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_INET=y
# CONFIG_IPV6 is not set
# Network driver (QEMU malta NIC)
CONFIG_NETDEVICES=y
CONFIG_ETHERNET=y
CONFIG_NET_VENDOR_AMD=y
CONFIG_PCNET32=y
# VFS the LWRT init needs
CONFIG_PROC_FS=y
CONFIG_PROC_SYSCTL=y
CONFIG_SYSFS=y
CONFIG_TMPFS=y
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
# IPC / syscalls the Rust std runtime expects
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_SYSCTL=y
# Compress the embedded initramfs to fit the malta byteswap window
CONFIG_RD_GZIP=y
CONFIG_INITRAMFS_COMPRESSION_GZIP=y
The full firewall (nftables / conntrack / NAT) and the bridge/802.1q datapath are deliberately absent from this config. Two QEMU-malta-only ceilings — neither of which exists on real hardware — make them impossible to fit here:
- R_MIPS_26 window. XIP runs from the malta -bios window at phys 0x1fc01000 → KSEG0 0x9fc01000, leaving only ~4 MiB before the 0xa0000000 segment boundary. MIPS jal cannot jump across it, so the flash-resident text must stay under ~4 MiB.
- Byteswap window. QEMU only byte-swaps the first 0x3e0000 (~3.9 MiB) of a -bios image on mipsel, so the entire flash image (shim + kernel ROM + embedded initramfs) must fit under that.
The complete feature set is deferred to the MT7628 defconfig: that
SoC maps its SPI flash XIP window at 0x1c000000
(KSEG0 0x9c000000) with 64 MiB of headroom and has no
byteswap quirk, so the full firewall links and fits there. The malta
config exists to validate that LWRT boots and runs on an XIP
kernel.
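Both ceilings reduce to address arithmetic. The following is a minimal Python sketch under the addresses quoted in this section; the function name `fits_malta_windows` and its two size parameters are illustrative assumptions and are not part of the project's scripts.

```python
KSEG0_BASE = 0x80000000
XIP_PHYS_ADDR = 0x1FC01000   # CONFIG_XIP_PHYS_ADDR from the minimal config
SEGMENT_END = 0xA0000000     # end of KSEG0, the boundary jal cannot cross
BYTESWAP_WINDOW = 0x3E0000   # bytes of a -bios image QEMU byte-swaps on mipsel


def fits_malta_windows(text_bytes: int, image_bytes: int) -> bool:
    """Check both QEMU-malta ceilings.

    text_bytes  -- size of the flash-resident kernel text (hypothetical input)
    image_bytes -- size of shim + kernel ROM + embedded initramfs
    """
    xip_virt = KSEG0_BASE + XIP_PHYS_ADDR      # 0x9FC01000
    text_headroom = SEGMENT_END - xip_virt     # bytes left before 0xA0000000
    return text_bytes <= text_headroom and image_bytes <= BYTESWAP_WINDOW


# Headroom between the XIP base and the segment boundary.
print(hex(SEGMENT_END - (KSEG0_BASE + XIP_PHYS_ADDR)))  # 0x3ff000
```

A check like this could run before image assembly so that an oversized config fails early instead of at boot.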
Image Assembly
The final flash image is assembled by concatenating the boot shim and the kernel ROM blob.
Image Layout
Assembly Script
#!/bin/bash
set -euo pipefail
WORK=build
KDIR=$WORK/linux-6.12.34
OUT=$WORK/out
mkdir -p "$OUT"
# 1. Extract kernel_entry address from System.map
ENTRY=$(awk '$3 == "kernel_entry" {print "0x"$1}' "$KDIR/System.map")
# 2. Build boot shim (4 KiB at reset vector); all artifacts go to $OUT
mipsel-linux-gnu-gcc -c -o "$OUT/shim.o" shim/shim.S \
    -mips32r2 -EL -mno-abicalls -fno-pic -nostdlib
mipsel-linux-gnu-ld -o "$OUT/shim.elf" "$OUT/shim.o" \
    -T shim/shim.ld --defsym kernel_entry="$ENTRY" \
    -nostdlib --no-dynamic-linker
mipsel-linux-gnu-objcopy -O binary "$OUT/shim.elf" "$OUT/shim.bin"
# 3. Extract kernel ROM blob
mipsel-linux-gnu-objcopy -O binary -j .text -j .rodata \
    -j .init.text -j .data -j .init.data \
    "$KDIR/vmlinux" "$OUT/kernel.bin"
# 4. Concatenate: shim + kernel → flash image
cat "$OUT/shim.bin" "$OUT/kernel.bin" > "$OUT/xip-flash.bin"
# 5. Byte-swap each 32-bit word for QEMU malta (big-endian BIOS on mipsel).
#    Slice assignment avoids rebuilding the output once per word.
python3 -c "
data = open('$OUT/xip-flash.bin', 'rb').read()
whole = len(data) - len(data) % 4   # leave a trailing partial word untouched
swapped = bytearray(data)
for k in range(4):
    swapped[k:whole:4] = data[3 - k:whole:4]
open('$OUT/xip-bios.bin', 'wb').write(swapped)
"
echo "Flash image: $OUT/xip-bios.bin ($(stat -c%s "$OUT/xip-bios.bin") bytes)"
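The swap in step 5 can be tried in isolation. This is a small standalone Python sketch of the same transformation; the helper name `swap_words` is an assumption made for illustration and does not appear in the project's scripts.

```python
def swap_words(data: bytes) -> bytes:
    """Reverse the byte order inside every complete 32-bit word.

    A trailing partial word (fewer than 4 bytes) is copied unchanged,
    matching the behaviour of the inline script in step 5.
    """
    whole = len(data) - len(data) % 4
    out = bytearray(data)
    for k in range(4):
        # Byte k of each output word comes from byte 3-k of the input word.
        out[k:whole:4] = data[3 - k:whole:4]
    return bytes(out)


sample = bytes.fromhex("0011223344556677aabb")
print(swap_words(sample).hex())  # 3322110077665544aabb
```

Applying the swap twice returns the original bytes, which makes it easy to recover the unswapped image from `xip-bios.bin` when debugging.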
Boot Shim
The boot shim is a 4 KiB MIPS assembly program that initializes hardware and bridges the gap between the reset vector and the Linux kernel entry point.
What the Shim Does
- Skip QEMU board ID: Word at offset 0x10 contains a board identifier; skip over it
- Disable interrupts: di instruction prevents any interrupt during init
- GT-64120 init: Programs the system controller registers exactly as YAMON does:
  - Moves GT registers to 0x1BE00000
  - Sets PCI I/O decode at 0x18000000 (needed for serial at 0x180003F8)
  - Configures PCI memory windows
- Fake YAMON protocol: Sets registers a0=argc, a1=argv, a2=envp, a3=memsize
- Jump to kernel: Loads kernel_entry address (injected via linker --defsym) and jumps
ASSERT(SIZEOF(.text) <= 4096, "Shim too large")
Embedded Systems Fundamentals
Embedded systems are specialized computing devices designed for specific tasks, often with strict constraints on power, memory, cost, and real-time performance.
Embedded Linux Architecture
| Layer | Component | Description |
|---|---|---|
| Application | User programs | Domain-specific logic (routing, sensor reading, etc.) |
| Libraries | uClibc, musl | Minimal C library for embedded systems |
| Middleware | BusyBox, procd | Init system, shell, core utilities |
| Kernel | Linux | Process management, drivers, networking |
| Bootloader | U-Boot, Breed | Hardware init, kernel loading |
| Board Support | Device Tree, patches | Hardware-specific configuration |
Resource Constraints
RAM
Typical for budget routers. XIP saves 2.3 MiB — nearly 30% of total RAM.
Flash
SPI-NOR flash stores kernel, rootfs, and configuration.
CPU
No FPU, no SMP, single-core. Optimization is critical.
Power
Wall-powered but thermal constraints in compact enclosures.
Memory Management
Memory management in XIP kernels requires careful attention to the distinction between flash-resident and RAM-resident sections.
MIPS Memory Map
MIPS Virtual Address Space (32-bit)
┌────────────────────────────────────────────────┐ 0xFFFFFFFF
│ KSEG2/KSEG3 (0xC0000000): Cached, mapped       │
│   └── vmalloc, kernel modules                  │
├────────────────────────────────────────────────┤ 0xC0000000
│ KSEG1 (0xA0000000): Uncached, unmapped         │
│   └── I/O registers                            │
│   └── Reset vector: 0xBFC00000                 │
├────────────────────────────────────────────────┤ 0xA0000000
│ KSEG0 (0x80000000): Cached, unmapped           │
│   └── RAM: 0x80000000 – 0x8FFFFFFF             │
│   └── XIP virt: 0x80000000 + phys_addr         │
├────────────────────────────────────────────────┤ 0x80000000
│ USEG (0x00000000): User space (mapped)         │
└────────────────────────────────────────────────┘ 0x00000000
XIP Memory Sections
| Section | Location | Address | Writable | Notes |
|---|---|---|---|---|
| .text | Flash (ROM) | 0x9FC01000 | No | Kernel code, executes in place |
| .rodata | Flash (ROM) | after .text | No | Read-only data (strings, constants) |
| .init.text | Flash (ROM) | after .rodata | No | Init functions (freed after boot) |
| .data | RAM | 0x8xxxxxxx | Yes | Global/static variables |
| .bss | RAM | after .data | Yes | Zero-initialized data |
| .init.data | RAM | after .data | Yes | Init-only data (freed after boot) |
| .data..xip_patchable_text | RAM | after .data | Yes | TLB handlers (uasm-generated) |
Data Copy at Boot
head.S copies the writable image from __data_loc (flash LMA) to _sdata (RAM VMA), continuing through __init_end. This mirrors what ARM and RISC-V XIP heads do.
Cross-Compilation
Cross-compilation builds software on one architecture (x86_64) targeting another (MIPS). This project uses clang as the compiler with mipsel-linux-gnu- binutils for linking.
Toolchain Components
| Tool | Purpose | Package |
|---|---|---|
| clang | C compiler (cross-compilation via --target) | clang |
| mipsel-linux-gnu-ld | Linker for MIPS little-endian | binutils-mipsel-linux-gnu |
| mipsel-linux-gnu-objcopy | Binary format conversion | binutils-mipsel-linux-gnu |
| mipsel-linux-gnu-gcc | Assembly (boot shim) | gcc-mipsel-linux-gnu |
| qemu-system-mips | Emulation for testing | qemu-system-mips |
Why Clang?
- No separate cross-compiler needed: Clang targets any architecture via --target=mipsel-linux-gnu
- Better diagnostics: Clearer error messages for kernel code issues
- Faster compilation: Generally faster than GCC for kernel builds
- Reproducible builds: Deterministic output across machines
Toolchain Setup
Ubuntu/Debian
# Install cross-compilation toolchain
sudo apt install -y \
clang lld llvm \
binutils-mipsel-linux-gnu \
gcc-mipsel-linux-gnu \
qemu-system-mips
# Verify
clang --version
mipsel-linux-gnu-ld --version
qemu-system-mipsel --version
Build Dependencies
# Kernel build dependencies
sudo apt install -y \
flex bison bc libelf-dev libssl-dev \
libncurses-dev cpio wget xz-utils \
python3 python3-pip
Freestanding Programs
Freestanding programs run without any C library or OS support. This project includes a minimal PID 1 init binary that uses raw MIPS syscalls.
userspace/init.c
// Freestanding PID 1: no libc, raw MIPS o32 syscalls
#include <stddef.h>
#define SYS_exit 4001
#define SYS_write 4004
#define SYS_pause 4029
#define SYS_sync 4036
#define SYS_reboot 4088
// Registers the kernel may clobber across an o32 syscall.
// a3 carries the error flag on return, so it is listed as well.
#define SYSCALL_CLOBBERS "memory", "at", "v1", "a3", "hi", "lo", \
    "t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9"
static void syscall1(int n, int a0) {
    register int v0 __asm__("v0") = n;
    register int a0r __asm__("a0") = a0;
    __asm__ volatile("syscall" : "+r"(v0) : "r"(a0r) : SYSCALL_CLOBBERS);
}
static void write_str(const char *s, int len) {
    register int v0 __asm__("v0") = SYS_write;
    register int fd __asm__("a0") = 1; // stdout
    register const char *buf __asm__("a1") = s;
    register int sz __asm__("a2") = len;
    __asm__ volatile("syscall" : "+r"(v0) : "r"(fd), "r"(buf), "r"(sz) : SYSCALL_CLOBBERS);
}
void _start(void) {
    write_str("XIP-USERSPACE-OK\n", 17);
    write_str("XIP-POWEROFF: requesting power off\n", 35);
    syscall1(SYS_sync, 0);
    // reboot(LINUX_REBOOT_CMD_POWER_OFF = 0x4321fedc)
    register int v0 __asm__("v0") = SYS_reboot;
    register int a0 __asm__("a0") = (int)0xfee1dead; // LINUX_REBOOT_MAGIC1
    register int a1 __asm__("a1") = 672274793;       // LINUX_REBOOT_MAGIC2
    register int a2 __asm__("a2") = 0x4321fedc;
    __asm__ volatile("syscall" : "+r"(v0) : "r"(a0), "r"(a1), "r"(a2) : SYSCALL_CLOBBERS);
    for (;;) { } // should never reach here
}
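The three constants passed to reboot() are easier to review in hexadecimal. The sketch below is a host-side Python check, not part of the firmware; the helper `as_int32` is a hypothetical name used only to show what value a 32-bit C `int` register variable ends up holding.

```python
LINUX_REBOOT_MAGIC1 = 0xFEE1DEAD
LINUX_REBOOT_MAGIC2 = 672274793
LINUX_REBOOT_CMD_POWER_OFF = 0x4321FEDC


def as_int32(value: int) -> int:
    """Interpret a 32-bit pattern as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


print(hex(LINUX_REBOOT_MAGIC2))       # 0x28121969
print(as_int32(LINUX_REBOOT_MAGIC1))  # -18751827
```

The first magic value does not fit in a positive 32-bit `int`, which is why init.c only needs the bit pattern to be right, not the sign.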
Execute-In-Place (XIP)
XIP allows code to execute directly from non-volatile storage (flash) without first copying it to RAM. This is fundamentally different from traditional boot where the kernel is decompressed into RAM before execution.
Traditional vs XIP Boot
Traditional Boot
1. Bootloader loads compressed kernel to RAM
2. Kernel decompresses itself (~900 KiB → ~2.3 MiB)
3. Decompressed kernel copied to final RAM location
4. Kernel starts executing from RAM
5. RAM used: ~2.3 MiB for kernel alone
XIP Boot
1. Boot shim initializes hardware
2. Kernel executes directly from flash
3. Only writable data copied to RAM (~200 KiB)
4. TLB handlers allocated in RAM
5. RAM used: ~200 KiB for data only
RAM Savings Breakdown
| Component | Traditional | XIP | Savings |
|---|---|---|---|
| Kernel text (decompressed) | ~2304 KiB | 0 KiB (in flash) | 2304 KiB |
| Kernel data | ~200 KiB | ~200 KiB | 0 KiB |
| TLB handlers | 0 (in .text) | ~4 KiB | -4 KiB |
| Trampolines | 0 | ~0.5 KiB | -0.5 KiB |
| Total | ~2504 KiB | ~204.5 KiB | ~2300 KiB |
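The totals follow from summing the per-component rows. A short Python sketch of that arithmetic, using only the approximate figures from the table (the dictionary keys are shorthand labels chosen here):

```python
# (traditional KiB, XIP KiB) per component, copied from the rows above
components = {
    "kernel text": (2304.0, 0.0),
    "kernel data": (200.0, 200.0),
    "TLB handlers": (0.0, 4.0),
    "trampolines": (0.0, 0.5),
}

traditional = sum(trad for trad, _ in components.values())
xip = sum(x for _, x in components.values())
saved = traditional - xip

print(traditional, xip, saved)  # 2504.0 204.5 2299.5
```

The small RAM cost of the relocated TLB handlers and trampolines is why the net saving is slightly below the size of the kernel text.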
Kernel Patches
Five patches against Linux 6.12.34 enable XIP on MIPS. Each addresses a specific challenge.
Kconfig Additions
Adds CONFIG_XIP_KERNEL (bool) and CONFIG_XIP_PHYS_ADDR (hex) to MIPS architecture options. Dependencies: 32BIT && !RELOCATABLE && !MAPPED_KERNEL && !SMP.
XIP Linker Script (261 lines)
The largest patch. Defines ROM region at XIP_VIRT_ADDR and RAM region at LINKER_LOAD_ADDRESS. Introduces XIP_AT() macro to override asm-generic address macros. Moves .data..ro_after_init to RAM.
XIP Data Copy
Adds a copy loop in head.S that copies writable data from flash LMA to RAM VMA before BSS clearing. Mirrors ARM and RISC-V XIP implementations.
Memblock Accounting
Three changes: reserves only [_sdata, _end) under XIP, validates only RAM-resident sections, sets data_resource.start = _sdata.
TLB Handlers + Uasm RAM Buffers
The hardest part. MIPS TLB handlers are uasm-generated at boot into .text buffers — impossible under XIP. Solution: move patchable buffers to RAM section, add ROM trampolines with cross-segment register jumps.
XIP Linker Script
The linker script is the most complex patch (261 lines). It splits the kernel into ROM and RAM regions.
Key Directives
/* XIP virtual address = KSEG0 + physical offset */
XIP_VIRT_ADDR = 0x80000000 + CONFIG_XIP_PHYS_ADDR;
/* ROM region — executes directly from flash */
SECTIONS {
.text XIP_VIRT_ADDR : AT(CONFIG_XIP_PHYS_ADDR) {
_text = .;
/* ... code sections ... */
_etext = .;
}
/* RAM region — writable data */
.data LINKER_LOAD_ADDRESS : {
_sdata = .;
/* ... data sections ... */
_edata = .;
}
.bss : {
_sbss = .;
/* ... BSS sections ... */
_ebss = .;
}
/* Special section for uasm-generated TLB handlers */
.data..xip_patchable_text : {
/* RAM buffers that kernel writes at boot */
}
}
/* Custom macro to override asm-generic AT() */
#define XIP_AT(addr) (addr) - LOAD_OFFSET + XIP_PHYS_OFFSET
XIP_AT() Macro Explained
The asm-generic section macros hard-code AT(ADDR(x) - LOAD_OFFSET), which would reset RAM-section load addresses to incorrect flash offsets. The XIP_AT() macro overrides this to produce correct LMA values.
TLB Handlers
The TLB (Translation Lookaside Buffer) handler patch is the most technically challenging part of the XIP implementation.
The Problem
MIPS TLB exception handlers (handle_tlbl, handle_tlbs, handle_tlbm) are generated at boot using uasm — a MIPS assembly DSL. The kernel writes machine code into buffers in .text. Under XIP, .text is ROM — writes are silently dropped.
The Solution
/* ROM trampoline in .text (original symbol location) */
handle_tlbl:
PTR_LA t9, xip_handle_tlbmiss_handler_setup_pgd
jr t9
nop
/* RAM buffer in .data..xip_patchable_text */
xip_handle_tlbmiss_handler_setup_pgd:
/* uasm-generated code lives here — writable RAM */
Cross-Segment Jump Problem
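The j/jal instructions (relocation R_MIPS_26) encode only a 26-bit word index, so the target must lie in the same 256 MiB-aligned region as the instruction's delay slot. Flash (0x9fcxxxxx) and RAM (0x80xxxxxx) sit in different regions, so a direct jal between them cannot be encoded. Solution: trampolines use lui/addiu/jr sequences for register-indirect jumps.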
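The reachability rule can be expressed in a few lines. This Python sketch assumes the standard MIPS32 encoding, where the upper four address bits of a j/jal target are taken from the address of the delay slot; the function name `jal_can_reach` and the sample addresses are illustrative.

```python
REGION_MASK = 0xF0000000  # top 4 bits select one 256 MiB region


def jal_can_reach(pc: int, target: int) -> bool:
    """True if a j/jal located at `pc` can encode a jump to `target`.

    The instruction stores target bits 27..2; bits 31..28 are copied from
    the delay-slot address (pc + 4), so both must share a 256 MiB region.
    """
    same_region = ((pc + 4) & REGION_MASK) == (target & REGION_MASK)
    return same_region and target % 4 == 0


print(jal_can_reach(0x9FC01000, 0x9FC20000))  # True: flash to flash
print(jal_can_reach(0x9FC01000, 0x80100000))  # False: flash to RAM
```

A register-indirect `jr` has no such limit because the full 32-bit target is held in the register.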
Memory Layout
Physical Memory Map
Physical Address Space
┌────────────────────────────────────────────┐ 0x20000000 (512 MiB)
│ End of KSEG0/KSEG1 direct-mapped region │
├────────────────────────────────────────────┤
│ ... │
├────────────────────────────────────────────┤ 0x1FC00000
│ SPI-NOR Flash (16 MiB) │
│ ├── 0x1FC00000: Boot Shim (4 KiB) │
│ ├── 0x1FC01000: Kernel .text (XIP) │
│ ├── 0x1FDxxxxx: Kernel .rodata (XIP) │
│ ├── 0x1FExxxxx: Kernel .init.text (XIP) │
│ └── 0x1FFxxxxx: Data LMA (copied to RAM) │
├────────────────────────────────────────────┤ 0x1E000000
│ RAM (8–16 MiB) │
│ ├── 0x80000000: Kernel .data │
│ ├── 0x800xxxxx: Kernel .bss │
│ ├── 0x801xxxxx: TLB handlers (RAM) │
│ ├── 0x802xxxxx: Free memory │
│ └── 0x81FFFFFF: End of 32 MiB window │
└────────────────────────────────────────────┘
Virtual Address Mapping
| Segment | Virtual Range | Physical | Cached | Use |
|---|---|---|---|---|
| KSEG0 | 0x80000000–0x9FFFFFFF | 0x00000000–0x1FFFFFFF | Yes | RAM, XIP kernel |
| KSEG1 | 0xA0000000–0xBFFFFFFF | 0x00000000–0x1FFFFFFF | No | I/O registers |
| KSEG2 | 0xC0000000–0xFFFFFFFF | mapped | Yes | VMalloc, kernel modules |
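Because KSEG0 and KSEG1 are fixed windows onto the same physical range, translation is a mask or an offset. A minimal Python sketch of the mapping in the table; the function names are assumptions made for this example.

```python
KSEG0_BASE = 0x80000000
KSEG1_BASE = 0xA0000000
KSEG2_BASE = 0xC0000000
PHYS_MASK = 0x1FFFFFFF  # both segments cover physical 0x0-0x1FFFFFFF


def virt_to_phys(vaddr: int) -> int:
    """Translate a KSEG0 or KSEG1 address to its physical address."""
    if not KSEG0_BASE <= vaddr < KSEG2_BASE:
        raise ValueError(f"{vaddr:#x} is not in an unmapped segment")
    return vaddr & PHYS_MASK


def phys_to_kseg0(paddr: int) -> int:
    """Cached view of a physical address."""
    if paddr > PHYS_MASK:
        raise ValueError(f"{paddr:#x} is outside the 512 MiB window")
    return KSEG0_BASE + paddr


def phys_to_kseg1(paddr: int) -> int:
    """Uncached view of a physical address."""
    if paddr > PHYS_MASK:
        raise ValueError(f"{paddr:#x} is outside the 512 MiB window")
    return KSEG1_BASE + paddr


print(hex(phys_to_kseg0(0x1FC01000)))  # 0x9fc01000
print(hex(phys_to_kseg1(0x1FC00000)))  # 0xbfc00000
print(hex(virt_to_phys(0xBFC00000)))   # 0x1fc00000
```

The first line reproduces the XIP text address and the second the reset vector used elsewhere in this document.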
IoT Overview
The Internet of Things (IoT) connects embedded devices to networks for data collection, monitoring, and control. This project's XIP technology is directly applicable to IoT devices running on flash-constrained hardware.
IoT Device Categories
Sensor Nodes
Temperature, humidity, motion sensors. Ultra-low power, periodic wake-and-report. XIP saves RAM for sensor buffers.
Smart Home
Lighting, locks, thermostats. Local processing + cloud connectivity. XIP enables larger applications on same hardware.
Industrial IoT
PLC controllers, predictive maintenance. Deterministic timing requirements. XIP eliminates decompression latency.
Edge Routers
Gateway devices bridging IoT protocols to internet. This project targets exactly this category.
IoT + XIP Benefits
- Faster boot: No decompression step — kernel starts executing immediately from flash
- More RAM for application: 2.3 MiB savings means more buffer space for sensor data or network packets
- Lower power: Fewer RAM accesses during boot reduce energy consumption
- Simpler OTA: Flash-only updates — no need to handle RAM decompression during firmware upgrades
IoT Protocols
Common protocols used in IoT deployments, many of which can run on this XIP-enabled embedded Linux platform.
| Protocol | Layer | Transport | Use Case | RAM Footprint |
|---|---|---|---|---|
| MQTT | Application | TCP | Pub/sub messaging | ~10 KiB |
| CoAP | Application | UDP | REST for constrained devices | ~5 KiB |
| HTTP/1.1 | Application | TCP | Web APIs, OTA updates | ~20 KiB |
| WebSocket | Application | TCP | Real-time bidirectional | ~15 KiB |
| LwM2M | Application | CoAP | Device management | ~25 KiB |
| DHCP | Network | UDP | IP address assignment | ~8 KiB |
| mDNS | Network | UDP | Local service discovery | ~12 KiB |
| 802.15.4 | Data Link | — | Low-power wireless (Zigbee/Thread) | ~15 KiB |
| LoRaWAN | Network | — | Long-range, low-power | ~30 KiB |
Router Firmware
Building custom router firmware is one of the primary use cases for this XIP kernel project. The LWRT (Lightweight Router Toolkit) configuration demonstrates a full networking stack.
Router Firmware Components
- Kernel: XIP + networking drivers + nftables
- Network stack: IPv4/IPv6, bridge, VLAN, NAT
- System services: procd, netifd, dnsmasq, hostapd
- Web interface: LuCI or custom admin interface
OpenWrt Build Process
# Standard OpenWrt build (for reference)
git clone https://git.openwrt.org/openwrt/openwrt.git
cd openwrt
./scripts/feeds update -a
./scripts/feeds install -a
make menuconfig # Select Target → MediaTek Ralink → MT76x8
make -j$(nproc) # Builds full firmware image
# Output: bin/targets/ramips/mt76x8/openwrt-ramips-mt76x8-*.bin
Custom Firmware with XIP
To build router firmware with XIP kernel:
- Build the XIP kernel using this project's Makefile
- Stage an LWRT root filesystem with build-userspace.sh
- Configure LWRT_ROOTFS=/path/to/staged/rootfs
- The build system generates a combined initramfs image
- Flash the resulting xip-bios.bin to the router's SPI-NOR
Network Stack
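The LWRT configuration targets a full IPv4 networking stack suitable for router firmware. On QEMU malta only the IPv4 core and the PCnet driver are built; the firewall (conntrack, nftables) and bridge/802.1q entries listed below are deferred to the MT7628 defconfig because of the malta size ceilings described under LWRT Router Config.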
Network Subsystems Enabled
| Subsystem | Config | Purpose |
|---|---|---|
| IPv4 | CONFIG_INET=y | Core Internet Protocol |
| Bridging | CONFIG_BRIDGE=y | LAN port bridging (switch functionality) |
| VLAN | CONFIG_VLAN_8021Q=y | Virtual LAN tagging |
| Multicast | CONFIG_IP_MULTICAST=y | Multicast routing support |
| Advanced routing | CONFIG_IP_ADVANCED_ROUTER=y | Policy routing, multiple tables |
| PCnet NIC | CONFIG_PCNET32=y | QEMU malta network driver |
| Conntrack | CONFIG_NF_CONNTRACK=y | Connection tracking for NAT |
| nftables | CONFIG_NF_TABLES=y | Modern packet filtering |
Firewall & NAT
The nftables-based firewall provides stateful packet inspection and NAT for router deployments.
Netfilter Components
# Connection tracking
CONFIG_NF_CONNTRACK=y
# NAT
CONFIG_NF_NAT=y
CONFIG_NF_NAT_MASQUERADE=y
# nftables framework
CONFIG_NF_TABLES=y
CONFIG_NFT_CT=y # Connection tracking in nftables
CONFIG_NFT_NAT=y # NAT in nftables
CONFIG_NFT_MASQ=y # Masquerading in nftables
Typical Router Firewall Rules
#!/usr/sbin/nft -f
flush ruleset
table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        iif lo accept
        ct state established,related accept
        tcp dport 22 accept # SSH
        icmp type echo-request accept # Ping
    }
    chain forward {
        type filter hook forward priority 0; policy drop;
        ct state established,related accept
        iif "lan" oif "wan" accept # LAN → WAN
    }
    chain output {
        type filter hook output priority 0; policy accept;
    }
}
table inet nat {
    chain prerouting {
        type nat hook prerouting priority -100;
    }
    chain postrouting {
        type nat hook postrouting priority 100;
        oif "wan" masquerade # NAT for LAN clients
    }
}
TP-Link Archer C54
The primary target hardware for this project — a budget router with extremely constrained memory.
Hardware Specifications
| Component | Specification | Impact |
|---|---|---|
| SoC | MediaTek MT7628KN | MIPS24KEc @ 575 MHz |
| RAM | 8 MiB DDR1 | XIP saves ~28% of total RAM |
| Flash | 16 MiB SPI-NOR (W25Q128) | Direct CPU addressing via KSEG1 |
| Wi-Fi | 2.4 GHz 802.11n (2×2) | Requires wireless driver in kernel |
| Ethernet | 4× LAN + 1× WAN (100 Mbps) | MT7530 switch chip |
| USB | None | No USB subsystem needed |
| Power | 9V DC, ~0.6A | ~5.4W consumption |
Flash Memory Map (16 MiB)
Offset Size Partition
0x000000 128 KiB Bootloader (U-Boot / Breed)
0x020000 64 KiB Config (factory defaults)
0x030000 64 KiB Config (user settings)
0x040000 128 KiB Bootloader (backup)
0x060000 15.5 MiB Firmware (kernel + rootfs)
├── 0x060000 4 KiB Boot Shim (XIP)
├── 0x061000 ~900 KiB Kernel .text (XIP)
├── 0x101000 ~200 KiB Kernel data (copied to RAM)
└── 0x111000 ~14 MiB Root filesystem (SquashFS)
0xFE0000 64 KiB Board config (MAC, region)
0xFF0000 64 KiB Boot countdown
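A partition map like this is easy to get wrong by hand, so it helps to derive sizes from the offsets. The Python sketch below uses only the top-level offsets from the map; the short partition labels and the helper `partition_sizes` are assumptions for illustration.

```python
FLASH_SIZE = 16 * 1024 * 1024

# (offset, label) for the top-level partitions in the map above
partitions = [
    (0x000000, "bootloader"),
    (0x020000, "config-factory"),
    (0x030000, "config-user"),
    (0x040000, "bootloader-backup"),
    (0x060000, "firmware"),
    (0xFE0000, "board-config"),
    (0xFF0000, "boot-countdown"),
]


def partition_sizes(parts, flash_size):
    """Derive each partition's size from the gap to the next offset."""
    ends = [offset for offset, _ in parts[1:]] + [flash_size]
    return {name: end - offset for (offset, name), end in zip(parts, ends)}


sizes = partition_sizes(partitions, FLASH_SIZE)
for name, size in sizes.items():
    print(f"{name:18s} {size // 1024:6d} KiB")
```

With sizes computed this way the partitions are contiguous by construction, and any hand-written size column can be compared against the result.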
QEMU Testing
QEMU provides a safe, reproducible testing environment that closely mirrors real hardware without risk of bricking devices.
Why QEMU Malta?
- GT-64120 system controller: Same as real MIPS development boards
- Serial console: ttyS0 at 0x180003F8 (identical to real hardware)
- NOR flash simulation: -bios flag loads image at reset vector
- PIIX4 poweroff: Clean exit for automated testing
- No network required: Fully self-contained testing
QEMU Launch Command
qemu-system-mipsel \
-M malta \
-m 256M \
-nographic \
-bios build/out/xip-bios.bin \
-serial mon:stdio \
-no-reboot \
-device pci-testdev 2>&1 | tee build/boot.log
Expected Serial Output
Linux version 6.12.34 (builder@x86) (clang 18.0) #1 ...
...
Memory: 245760K/262144K available (1024K kernel code, ...)
...
Freeing unused kernel image(s): 32K freed
XIP-USERSPACE-OK
XIP-POWEROFF: requesting power off
Real Hardware
Deploying to real hardware requires additional considerations beyond QEMU testing.
Deployment Checklist
Serial Console Setup
# Connect USB-to-serial adapter to router UART pins
# TX → RX, RX → TX, GND → GND
# Linux
sudo screen /dev/ttyUSB0 115200
# Or with minicom
sudo minicom -D /dev/ttyUSB0 -b 115200
# Or with picocom
sudo picocom -b 115200 /dev/ttyUSB0
Test Infrastructure
Two test scripts verify the XIP implementation: static layout assertions and dynamic boot testing.
Static Verification (verify-layout.py)
Uses readelf and System.map to verify ELF layout without booting:
- A LOAD segment with VMA == LMA == 0x9FC01000, flags R E (true XIP)
- A RAM segment whose LMA lies inside the ROM image (data shipped in flash)
- All uasm buffers at RAM addresses
- All trampolines in ROM
- ROM image fits within flash budget
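The segment-level checks in this list could look roughly like the following. This is a hedged Python sketch, not the contents of verify-layout.py: it assumes the program headers have already been parsed into simple records, and the `Segment` class, `check_segments` function and the sample budget are all hypothetical.

```python
from dataclasses import dataclass

XIP_ADDR = 0x9FC01000  # XIP address from the first check in the list


@dataclass
class Segment:
    """One PT_LOAD program header (field names are illustrative)."""
    vaddr: int   # VMA
    paddr: int   # LMA
    filesz: int
    flags: str   # readelf-style flags, e.g. "R E" or "RW"


def check_segments(segments, flash_budget):
    """Return a list of layout violations; an empty list means OK."""
    errors = []
    rom_end = XIP_ADDR + flash_budget
    if not any(s.vaddr == s.paddr == XIP_ADDR and s.flags == "R E"
               for s in segments):
        errors.append("no R E LOAD segment with VMA == LMA == XIP address")
    for s in segments:
        if "W" not in s.flags:
            continue
        # Writable data runs from RAM but is shipped inside the ROM image.
        if not (XIP_ADDR <= s.paddr and s.paddr + s.filesz <= rom_end):
            errors.append(f"RAM segment LMA {s.paddr:#x} outside ROM image")
        if XIP_ADDR <= s.vaddr < rom_end:
            errors.append(f"writable segment VMA {s.vaddr:#x} lies in flash")
    image_end = max((s.paddr + s.filesz for s in segments), default=XIP_ADDR)
    if image_end > rom_end:
        errors.append("ROM image exceeds flash budget")
    return errors


good = [Segment(0x9FC01000, 0x9FC01000, 0x200000, "R E"),
        Segment(0x80100000, 0x9FE01000, 0x30000, "RW")]
print(check_segments(good, flash_budget=0x3DF000))  # []
```

Returning a list of messages rather than raising on the first failure lets one run report every violated assertion at once.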
Boot Smoke Test (smoke-test.py)
Boots the image in QEMU and asserts serial markers appear in order:
#!/usr/bin/env python3
"""QEMU boot smoke test for XIP kernel."""
import subprocess, sys, time
MARKERS = [
    "Linux version",          # kernel alive
    "Memory:",                # memblock accounting sane
    "Freeing unused kernel",  # init memory reclaim worked
    "XIP-USERSPACE-OK",       # PID 1 ELF executed
    "XIP-POWEROFF:",          # userspace reached poweroff
]
def test():
    proc = subprocess.Popen(
        ["qemu-system-mipsel", "-M", "malta", "-m", "256M",
         "-nographic", "-bios", "build/out/xip-bios.bin",
         "-no-reboot", "-serial", "mon:stdio"],
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
        text=True
    )
    output = []
    found = 0
    deadline = time.time() + 30
    while time.time() < deadline:
        line = proc.stdout.readline()
        if not line:
            break
        output.append(line)
        # Markers must appear in order; advance only on the next expected one.
        if found < len(MARKERS) and MARKERS[found] in line:
            found += 1
        if "XIP-POWEROFF" in line:
            break
    try:
        proc.wait(timeout=10)
    except subprocess.TimeoutExpired:
        proc.kill()  # QEMU never powered off; returncode becomes non-zero
        proc.wait()
    output_text = "".join(output)
    if found == len(MARKERS) and proc.returncode == 0:
        print("PASS: all markers found, clean poweroff")
        sys.exit(0)
    else:
        print(output_text)  # show the serial log to help diagnose the failure
        print(f"FAIL: found {found}/{len(MARKERS)} markers")
        sys.exit(1)

if __name__ == "__main__":
    test()
CI/CD Pipeline
GitHub Actions runs the full build + test pipeline on every push and pull request.
Pipeline Stages
Install clang, lld, llvm, binutils-mipsel-linux-gnu, qemu-system-mips, flex, bison on ubuntu-24.04
Cache kernel tarball to avoid re-downloading
make — full pipeline
make verify — static assertions
make test — QEMU boot smoke test
Upload xip-bios.bin and boot.log
ci.yml Configuration
name: Build & Test
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-24.04
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: |
          sudo apt update
          sudo apt install -y clang lld llvm \
            binutils-mipsel-linux-gnu \
            qemu-system-mips flex bison bc
      - name: Cache kernel tarball
        uses: actions/cache@v4
        with:
          path: build/linux-*.tar.xz
          key: kernel-${{ hashFiles('patches/*') }}
      - name: Build
        run: make
      - name: Verify layout
        run: make verify
      - name: Boot test
        run: make test
      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: xip-kernel
          path: |
            build/out/xip-bios.bin
            build/boot.log
Verification
Verification happens at multiple levels to ensure correctness.
Verification Levels
| Level | Tool | What It Checks | When |
|---|---|---|---|
| Compile-time | gcc/clang warnings | Syntax, type errors, warnings | During build |
| Link-time | ld assertions | Shim size ≤ 4 KiB, symbol resolution | During link |
| Static layout | verify-layout.py | ELF segments, XIP addresses, ROM/RAM split | After build |
| Boot test | smoke-test.py | Kernel boots, init runs, poweroff succeeds | After build |
| CI | GitHub Actions | Full pipeline on every push | On commit |
File Reference
Complete reference of all project files.
| File | Lines | Purpose |
|---|---|---|
| Makefile | 41 | Top-level build orchestration |
| README.md | 154 | Project documentation |
| configs/xip_qemu_malta_defconfig | 82 | Minimal XIP kernel config |
| configs/lwrt_qemu_malta_defconfig | 123 | LWRT router config with networking |
| patches/0001-mips-add-xip-kconfig.patch | 53 | Kconfig additions |
| patches/0002-mips-xip-linker-script.patch | 293 | XIP linker script (largest patch) |
| patches/0003-mips-head-xip-data-copy.patch | 36 | head.S data copy loop |
| patches/0004-mips-setup-xip-memblock.patch | 52 | Memblock accounting fixes |
| patches/0005-mips-mm-xip-patchable-text.patch | 265 | TLB handlers + RAM buffers |
| scripts/build-kernel.sh | 53 | Kernel download/patch/compile |
| scripts/build-image.sh | 59 | Flash image assembly |
| scripts/build-userspace.sh | 62 | Initramfs generation |
| scripts/run-qemu.sh | 8 | Interactive QEMU launch |
| shim/shim.S | 86 | Boot shim assembly |
| shim/shim.ld | 22 | 4 KiB linker script |
| userspace/init.c | 65 | Freestanding PID 1 |
| tests/verify-layout.py | 106 | Static ELF layout verification |
| tests/smoke-test.py | 109 | QEMU boot smoke test |
| .github/workflows/ci.yml | 46 | GitHub Actions CI pipeline |
Glossary
- XIP (Execute-In-Place): Executing code directly from non-volatile storage without copying it to RAM first.
- SPI-NOR Flash: Serial Peripheral Interface NOR flash memory. Common in embedded systems for firmware storage.
- UASM: The MIPS micro-assembler, an in-kernel assembly DSL (domain-specific language) that the Linux kernel uses to generate TLB handlers at boot.
- TLB (Translation Lookaside Buffer): Hardware cache for virtual-to-physical address translations in the MIPS MMU.
- KSEG0 / KSEG1: MIPS direct-mapped (unmapped) address segments. KSEG0 (virtual 0x80000000–0x9FFFFFFF) is cached; KSEG1 (virtual 0xA0000000–0xBFFFFFFF) is uncached. Both map to physical 0x00000000–0x1FFFFFFF.
- GT-64120: Galileo GT-64120 system controller used in MIPS Malta development boards (and in QEMU's emulation of them).
- YAMON: Yet Another Monitor, the MIPS bootloader/debugger. The boot shim mimics its boot protocol.
- Memblock: The Linux kernel's early memory allocator. Tracks available and reserved memory regions.
- Initramfs: Initial RAM filesystem, a cpio archive loaded into memory at boot and used as the root filesystem.
- nftables: Modern Linux packet filtering framework, replacing iptables.
- LWRT: Lightweight Router Toolkit, a minimal router firmware userspace for embedded Linux.
- OpenWrt: Linux distribution for embedded networking devices. Provides package management and a web interface.
- Device Tree (DT): Hardware description data structure passed to the kernel at boot, describing the board configuration.
- U-Boot: Das U-Boot, the Universal Boot Loader. A widely used bootloader for embedded Linux systems.
- DTB (Device Tree Blob): Binary format of the device tree, passed from the bootloader to the kernel.
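The KSEG0/KSEG1 mapping is simple enough to express as arithmetic. The sketch below is illustrative only: the helper names are made up for this example and are not part of the project or the kernel. It shows how the XIP address quoted in the overview, 0x9FC01000, relates to its physical address and to its uncached KSEG1 alias, using the standard MIPS32 segment bases.

```python
# Illustrative helpers for MIPS32 KSEG0/KSEG1 address arithmetic.
# Function names are hypothetical; the segment constants are the standard MIPS32 values.
KSEG0_BASE = 0x80000000  # cached, unmapped segment
KSEG1_BASE = 0xA0000000  # uncached, unmapped segment
KSEG_SIZE = 0x20000000   # each segment spans 512 MiB
PHYS_MASK = 0x1FFFFFFF   # low 29 bits select the physical address


def kseg_to_phys(vaddr: int) -> int:
    """Translate a KSEG0 or KSEG1 virtual address to its physical address."""
    if not (KSEG0_BASE <= vaddr < KSEG1_BASE + KSEG_SIZE):
        raise ValueError(f"{vaddr:#010x} is not in KSEG0 or KSEG1")
    return vaddr & PHYS_MASK


def phys_to_kseg0(paddr: int) -> int:
    """Return the cached (KSEG0) virtual alias of a physical address."""
    if not (0 <= paddr <= PHYS_MASK):
        raise ValueError(f"{paddr:#010x} is outside the 512 MiB direct-mapped window")
    return KSEG0_BASE | paddr


def phys_to_kseg1(paddr: int) -> int:
    """Return the uncached (KSEG1) virtual alias of a physical address."""
    if not (0 <= paddr <= PHYS_MASK):
        raise ValueError(f"{paddr:#010x} is outside the 512 MiB direct-mapped window")
    return KSEG1_BASE | paddr


xip_vaddr = 0x9FC01000                # KSEG0 address from the overview
paddr = kseg_to_phys(xip_vaddr)
print(hex(paddr))                     # 0x1fc01000
print(hex(phys_to_kseg1(paddr)))      # 0xbfc01000
```

Because the translation is a fixed mask rather than a TLB lookup, code placed in these segments can run before any page tables exist.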
Frequently Asked Questions
On devices with 8–16 MiB RAM, a compressed kernel still needs ~2.3 MiB for the decompressed copy. XIP eliminates this entirely, freeing that RAM for applications, network buffers, and sensor data. The tradeoff is slightly slower execution from flash vs. RAM.
XIP works best with SPI-NOR flash because it supports random access and has consistent read latency. SPI-NAND flash requires ECC management and bad-block handling, which make XIP impractical. Parallel NOR flash also supports XIP but is less common in modern designs.
Additional kernel features can be enabled on top of XIP. The LWRT defconfig demonstrates enabling networking, nftables, and drivers while keeping XIP. Use make menuconfig in the kernel source tree to toggle features. Each addition increases kernel size and RAM usage.
To port the XIP kernel to a new board: 1) Create a new defconfig for your board. 2) Modify the boot shim to initialize your board's system controller. 3) Update CONFIG_XIP_PHYS_ADDR to match your flash's physical address. 4) Adjust the linker script if your board has a different memory geometry. 5) Test with QEMU first if possible.
Flash reads are slower than RAM (~50 MB/s vs ~800 MB/s for DDR1). However, most kernel code is execution-bound, not bandwidth-bound. Real-world benchmarks show 5–15% slowdown for typical router workloads (NAT, firewall rules, packet forwarding). The RAM savings often outweigh this cost.
SMP is not currently supported. The XIP implementation requires !SMP because TLB handler generation and cross-segment trampolines assume single-core execution. Adding SMP support would require significant additional work on cache coherency and TLB shootdown across cores.