This summer I had the privilege of contributing to the CVA6 RISC-V core as part of Google Summer of Code 2025 with the FOSSi Foundation. My project focused on implementing the Svnapot extension, which lets the Page Table Entries (PTEs) of a naturally aligned, contiguous memory region be cached as a single translation instead of one per 4 KiB page. This post reflects on my journey, the technical challenges, and the final results.
Project Overview
The goal was to add support for Svnapot, the RISC-V extension for Naturally Aligned Power-of-Two (NAPOT) translation contiguity, to CVA6. Svnapot lets the operating system mark a naturally aligned 64 KiB block (the only size standardized so far) as one contiguous mapping, so a single Translation Lookaside Buffer (TLB) entry can cover all of it. This significantly reduces TLB pressure and can improve address translation performance.
This required modifying three key components: the Page Table Walker (PTW), the Translation Lookaside Buffers (TLBs), and the Memory Management Unit (MMU).
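To make the TLB-pressure claim concrete, here is a small back-of-the-envelope sketch in Python. The 1 MiB buffer size and the helper name `tlb_entries_needed` are made up for illustration, and the sketch assumes the buffer is 64 KiB aligned and that the TLB holds each 64 KiB NAPOT region in one entry.

```python
BASE_PAGE = 4 * 1024     # 4 KiB base page
NAPOT_PAGE = 64 * 1024   # 64 KiB Svnapot region


def tlb_entries_needed(region_bytes: int, page_bytes: int) -> int:
    """Entries needed to cover region_bytes (ceiling division)."""
    return -(-region_bytes // page_bytes)


# Hypothetical 1 MiB, 64 KiB-aligned buffer.
buffer_bytes = 1024 * 1024

print(tlb_entries_needed(buffer_bytes, BASE_PAGE))   # 256
print(tlb_entries_needed(buffer_bytes, NAPOT_PAGE))  # 16
```

Under these assumptions the same buffer needs 16 times fewer TLB entries, which is where the reduced pressure comes from.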
Image credit: Daniel Mangum
Project Timeline
- May – Early June: Studied CVA6's MMU, PTW, and TLB architecture. Set up Verilator and Spike environment.
- Mid June – July: Implemented napot detection in PTW, added new flags and updated the TLB logic.
- Late July – August: Debugged physical address patching logic and validated against napot.S and custom edge-case tests.
- September: Finalized PR #3094*, addressed reviews, and ensured CI stability.
*PR is still under review, and I will promptly address any feedback.
Technical Details
Supporting Svnapot required three main modifications:
1. Page Table Walker (PTW)
The PTW (cva6_ptw.sv) was enhanced to recognize Svnapot PTEs. It checks if the n bit is set in a leaf PTE and verifies that the lower 4 bits of the PPN match the specific encoding for a 64 KiB page (4'b1000). Any other encoding, or setting the n bit on a megapage/gigapage, results in a page fault.
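The decision described above can also be written as a small software model. This is a Python sketch, not the RTL: `LeafPte` and `classify_leaf` are names invented for illustration, the model assumes Svnapot is enabled, and the level constant assumes that level 2 is the 4 KiB leaf level, mirroring the `ptw_lvl_q[0] == 2` comparison in the PTW code.

```python
from dataclasses import dataclass

LEAF_4K_LEVEL = 2        # assumption: level index of the 4 KiB leaf level
NAPOT_64K_CODE = 0b1000  # ppn[3:0] encoding for a 64 KiB NAPOT page


@dataclass
class LeafPte:
    n: bool   # Svnapot N bit
    ppn: int  # physical page number as stored in the PTE


def classify_leaf(pte: LeafPte, level: int) -> str:
    """Classify a leaf PTE the way the paragraph above describes."""
    if not pte.n:
        return "regular"
    if (pte.ppn & 0xF) == NAPOT_64K_CODE and level == LEAF_4K_LEVEL:
        return "napot_64k"
    # N set with any other encoding, or on a megapage/gigapage.
    return "page_fault"


print(classify_leaf(LeafPte(n=True, ppn=0x80018), level=2))   # napot_64k
print(classify_leaf(LeafPte(n=True, ppn=0x80010), level=2))   # page_fault
print(classify_leaf(LeafPte(n=True, ppn=0x80018), level=1))   # page_fault
print(classify_leaf(LeafPte(n=False, ppn=0x80015), level=2))  # regular
```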
// In cva6_ptw.sv
if (CVA6Cfg.SvnapotEn && pte.n) begin
  // Svnapot: Check if the leaf PTE represents a 64KiB NAPOT page
  is_napot_64k = (pte.ppn[3:0] == 4'b1000) && (ptw_lvl_q[0] == 2);
  // Fault if N is set on a megapage or gigapage or any other reserved encoding
  if (!is_napot_64k) begin
    state_d = PROPAGATE_ERROR;
  end
end
2. TLB Modifications & Address Patching
A new flag, is_napot_64k, was added to the TLB tags. When a TLB lookup hits a NAPOT entry, the lower 4 bits of the physical page number (PPN) are dynamically replaced with bits from the virtual address. This on-the-fly patching ensures correct address translation within the 64 KiB block.
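The arithmetic behind the patching is easy to check in software. The following Python sketch is only a model of the idea, with a made-up function name and example addresses: it replaces the low 4 bits of the stored PPN with `vaddr[15:12]` and then appends the 12-bit page offset.

```python
PAGE_SHIFT = 12  # 4 KiB base pages


def napot_translate(stored_ppn: int, vaddr: int) -> int:
    """Physical address for a hit on a 64 KiB NAPOT entry."""
    vpn_low = (vaddr >> PAGE_SHIFT) & 0xF        # vaddr[15:12]
    patched_ppn = (stored_ppn & ~0xF) | vpn_low  # overwrite ppn[3:0]
    return (patched_ppn << PAGE_SHIFT) | (vaddr & 0xFFF)


# Hypothetical entry: the stored PPN still carries the 1000 NAPOT encoding.
stored_ppn = 0x80018

print(hex(napot_translate(stored_ppn, 0x4000_3ABC)))  # 0x80013abc
print(hex(napot_translate(stored_ppn, 0x4000_FFFF)))  # 0x8001ffff
```

Without the patch, every access in the region would land in the page selected by the `1000` encoding bits, so only one of the sixteen 4 KiB pages would translate correctly.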
// In cva6_tlb.sv: Patching the PPN on a TLB hit
if (tags_q[i].is_napot_64k && CVA6Cfg.SvnapotEn) begin
  patched_pte = content_q[i].pte;
  patched_pte.ppn[3:0] = lu_vaddr_i[15:12];
  lu_content_o = patched_pte;
end else begin
  lu_content_o = content_q[i].pte;
end
3. MMU Integration
The top-level MMU (cva6_mmu.sv) defines the core-wide data structures for memory management. I added the Svnapot n bit to the pte_cva6_t struct. Additionally, the tlb_update_cva6_t struct, which carries data between the PTW and the TLBs, was modified to include the is_napot_64k flag, ensuring this information is propagated correctly during a TLB fill.
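For readers less familiar with where the N bit sits architecturally, the sketch below builds and decodes a 64-bit Sv39-style PTE in Python. It follows the privileged-spec layout (N at bit 63, PPN at bits 53:10, permission flags in the low 8 bits) and says nothing about the field order inside CVA6's packed struct. The function names and the example address are invented for illustration.

```python
PAGE_SHIFT = 12
PTE_PPN_SHIFT = 10
PTE_PPN_MASK = (1 << 44) - 1
PTE_N = 1 << 63           # Svnapot N bit
NAPOT_64K_CODE = 0b1000   # ppn[3:0] encoding for 64 KiB


def make_napot_64k_pte(region_base_pa: int, flags: int) -> int:
    """Encode a leaf PTE for a 64 KiB NAPOT region starting at region_base_pa."""
    if region_base_pa % (64 * 1024) != 0:
        raise ValueError("a 64 KiB NAPOT region must be 64 KiB aligned")
    ppn = (region_base_pa >> PAGE_SHIFT) | NAPOT_64K_CODE
    return PTE_N | ((ppn & PTE_PPN_MASK) << PTE_PPN_SHIFT) | (flags & 0xFF)


def decode_pte(pte: int):
    """Return (n, ppn, flags) extracted from a 64-bit PTE."""
    n = bool(pte & PTE_N)
    ppn = (pte >> PTE_PPN_SHIFT) & PTE_PPN_MASK
    return n, ppn, pte & 0xFF


# Flags 0xCF = D | A | X | W | R | V.
pte = make_napot_64k_pte(0x8001_0000, 0xCF)
print(hex(pte))         # 0x80000000200060cf
print(decode_pte(pte))  # (True, 524312, 207)
```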
// In cva6_mmu.sv
localparam type pte_cva6_t = struct packed {
  logic n;  // Svnapot: N bit for NAPOT extension
  // ... other fields
};
localparam type tlb_update_cva6_t = struct packed {
  logic is_napot_64k;  // Svnapot: Flag indicating a 64KiB NAPOT page
  // ... other fields
};
Verification Methodology
The Svnapot implementation was verified with the OpenHW CVA6 simulation framework (verif/sim/cva6.py), which supports multiple simulators (e.g., Verilator and VCS). The tests were run with the veri-testharness and spike simulator targets. The verification plan had three parts:
- Directed Testing: The napot.S assembly test from the riscv-tests suite was used to validate baseline Svnapot functionality.
- Regression Testing: The complete RISC-V ISA compliance suite was executed to ensure that Svnapot integration preserved correctness across existing instructions and features.
- Edge-Case Testing: Two custom hand-written assembly programs, napot_dual.S and napot_strict.S, were developed. napot_dual.S runs the NAPOT check on two different NAPOT mappings. napot_strict.S applies stricter checks: it does not execute the exception handler from napot.S, so the test fails whenever a page fault is raised or any functionality does not work correctly.
All three parts were run with the shared TLB both enabled and disabled, and the tests passed in both configurations.
Challenges Faced
The main challenges were understanding the complex CVA6 codebase and ensuring that the new Svnapot logic integrated seamlessly without introducing regressions. Debugging TLB misses and page faults required careful analysis of the PTW state machine and TLB hit logic. Ensuring compliance with the RISC-V specification for Svnapot was also critical. Finally, debugging the physical address patching logic was challenging, especially within the shared TLB, where it required careful tracing of address translations.
Results & Learnings
The final implementation passed napot.S and related tests in riscv-tests. With the PR applied, CVA6 supports Svnapot with or without the shared TLB, reducing TLB pressure for large memory regions.
Through this project I gained:
- A deep understanding of virtual memory in RISC-V.
- Hands-on RTL modification and debugging in a large industrial-grade codebase.
- Experience in open-source workflows: rebasing, CI pipelines, and PR reviews.
Acknowledgements
I sincerely thank my mentors and reviewers at the OpenHW Foundation for their guidance and feedback, and my friends for their encouragement. I also want to acknowledge the broader RISC-V community for maintaining the specifications, test suites, and tools that made this project possible.