diff --git a/0C_virtual_memory/Cargo.lock b/0C_virtual_memory/Cargo.lock
index 16ef5e23..8d47f89a 100644
--- a/0C_virtual_memory/Cargo.lock
+++ b/0C_virtual_memory/Cargo.lock
@@ -1,3 +1,5 @@
+# This file is automatically @generated by Cargo.
+# It is not intended for manual editing.
 [[package]]
 name = "cortex-a"
 version = "2.3.1"
diff --git a/0C_virtual_memory/README.md b/0C_virtual_memory/README.md
index c06ef170..9110f025 100644
--- a/0C_virtual_memory/README.md
+++ b/0C_virtual_memory/README.md
@@ -1,42 +1,134 @@
 # Tutorial 0C - Virtual Memory
-**This is a stub**
+Virtual memory is an immensely complex, but exciting topic. In this first
+lesson, we start slow and switch on the MMU and use static page tables. We will
+only be concerned about the first `1 GiB` of address space. That is the amount
+of `DRAM` the usual Raspberry Pi 3 has. As we already know, the upper `16 MiB`
+of this gigabyte-window are occupied by the Raspberry's peripherals such as the
+UART.
+
+## MMU and paging theory
+
+At this point, we will not reinvent the wheel again and go into detailed
+descriptions of how paging in modern application processors works. The internet
+is full of great resources regarding this topic, and we encourage you to read
+some of it to get a high-level understanding of the topic.
+
+To follow the rest of this `AArch64` specific tutorial, we strongly recommend
+that you stop right here and first read `Chapter 12` of the [ARM Cortex-A Series
+Programmer's Guide for
+ARMv8-A](http://infocenter.arm.com/help/topic/com.arm.doc.den0024a/DEN0024A_v8_architecture_PG.pdf)
+before you continue. This will set you up with all the `AArch64`-specific
+knowledge needed to follow along.
+
+Back from reading `Chapter 12` already? Good job :+1:!
+
+## Approach
+
+The following partitioning will be used for the static page tables:
+- The first `2 MiB` will be mapped using a Level 3 table with `4 KiB` granule.
+  - This aperture includes, among others, the kernel's code, read-only data, and
+    mutable data. All of which will be `identity mapped` to make our life easy
+    for now.
+  - In the past, we already made sure that the linker script aligns the
+    respective regions to `4 KiB` boundaries.
+  - This way, we can conveniently flag corresponding regions in distinct page
+    table entries. E.g. marking the code regions executable, while the mutable
+    data regions are not.
+- All the rest will be mapped using `2 MiB` granule.
+
+The actual code is divided into two files: `memory.rs` and `memory/mmu.rs`.
+
+### memory.rs
+
+This file is used to describe our kernel's memory layout in a high-level
+abstraction using our own descriptor format. We can define ranges of arbitrary
+length and set respective attributes, for example if the bits and bytes in this
+range should be executable or not.
+
+The descriptors we use here are agnostic of the hardware `MMU`'s actual
+descriptors, and we are also agnostic of the paging granule the `MMU` will use.
+Having this distinction is less of a technical need and more a convenience
+feature for us in order to easily describe the kernel's memory layout, and
+hopefully it makes the whole concept a bit more graspable for the reader.
+
+The file contains a global `static KERNEL_VIRTUAL_LAYOUT` array which
+stores these descriptors. The policy is to only store regions that are **not**
+ordinary, normal cacheable DRAM. However, nothing prevents you from defining
+those too if you wish to. Here is an example for the device MMIO region:
-TODO: Write rest of tutorial.
+```rust
+// Device MMIO
+Descriptor {
+    virtual_range: || RangeInclusive::new(map::physical::MMIO_BASE, map::physical::MMIO_END),
+    translation: Translation::Identity,
+    attribute_fields: AttributeFields {
+        mem_attributes: MemAttributes::Device,
+        acc_perms: AccessPermissions::ReadWrite,
+        execute_never: true,
+    },
+},
+```
-Virtual memory is an immensely complex, but exciting topic. In this first
-lesson, we start slow and switch on the MMU using handcrafted page tables for
-the first `1 GiB` of memory. That is the amount of `DRAM` the usual Raspberry Pi
-3 has. As we already know, the upper `16 MiB` of this gigabyte-window are
-occupied by the Raspberry's peripherals such as the UART.
+Finally, the file contains the following function:
+
+```rust
+fn get_virt_addr_properties(virt_addr: usize) -> Result<(usize, AttributeFields), &'static str>
+```
+
+It will be used by code in `mmu.rs` to request attributes for a virtual address
+and the translation of the address. The function scans `KERNEL_VIRTUAL_LAYOUT`
+for a descriptor that contains the queried address, and returns the respective
+findings for the first entry that is a hit. If no entry is found, it returns
+default attributes for normal cacheable DRAM and the input address, hence
+telling the `MMU` code that the requested address should be `identity mapped`.
-The page tables we install alternate between `2 MiB` blocks and `4 KiB` blocks.
+Due to this default return, there is no need to define normal cacheable DRAM
+regions in `KERNEL_VIRTUAL_LAYOUT`.
-The first `2 MiB` of memory are identity mapped, and therefore contain our code
-and the stack. We use a single table with a `4 KiB` granule to differentiate
-between code, RO-data and RW-data. The linker script was adapted to adhere to
-the pagetable sizes.
+### mmu.rs
-Next, we map the UART into the second `2 MiB` block to show the effects of
-virtual memory.
+This file contains the `AArch64` specific code. It is a driver, if you like, and
+the split in paging granule mentioned before is hardcoded here (`4 KiB` page
+descriptors for the first `2 MiB` and `2 MiB` block descriptors for everything
+else).
-Everyting else is, for reasons of convenience, again identity mapped using `2
-MiB` blocks.
+Two static page table arrays are instantiated, `LVL2_TABLE` and `LVL3_TABLE`,
+and they are populated using `get_virt_addr_properties()` and a bunch of utility
+functions that convert our own descriptors to the actual `64 bit` descriptor
+entries needed by the MMU hardware for the page table arrays.
-Hopefully, in a later tutorial, we will write or use (e.g. from the `cortex-a`
-crate) proper modules for page table handling, that, among others, cover topics
-such as using recursive mapping for maintenace.
+Afterwards, the [Translation Table Base Register 0 - EL1](https://docs.rs/crate/cortex-a/2.4.0/source/src/regs/ttbr0_el1.rs) is set up with the base address of the `LVL2_TABLE` and
+the [Translation Control Register - EL1](https://docs.rs/crate/cortex-a/2.4.0/source/src/regs/tcr_el1.rs) is
+configured.
-## Adress translation with the 4 KiB LVL3 table
+Finally, the MMU is turned on through the [System Control Register - EL1](https://docs.rs/crate/cortex-a/2.4.0/source/src/regs/sctlr_el1.rs). The last step also enables caching for data and instructions.
-The following block diagram shows address translation by example of the UART's
-Control Register (CR).
+## Address translation examples
+
+For educational purposes, in `memory.rs`, a layout is defined which allows
+accessing the `UART` via two different virtual addresses:
+- Since we identity map the whole `Device MMIO` region, it is accessible by
+asserting its physical base address (`0x3F20_1000`) after the `MMU` is turned
+on.
+- Additionally, it is also mapped into the last `4 KiB` entry of the `LVL3`
+table, making it accessible through base address `0x001F_F000`.
+
+The following two block diagrams visualize the underlying translations for the
+two mappings, accessing the UART's Control Register (`CR`, offset `0x30`).
+
+### Address translation using a 2 MiB block descriptor
+
+![2 MiB translation block diagram](../doc/page_tables_2MiB.png)
+
+### Address translation using a 4 KiB page descriptor
 ![4 KiB translation block diagram](../doc/page_tables_4KiB.png)
+
 ## Zero-cost abstraction
-The MMU init code is a good example to see the great potential of Rust's
+The MMU init code is also a good example to see the great potential of Rust's
 zero-cost abstractions[[1]](https://blog.rust-lang.org/2015/05/11/traits.html)[[2]](https://ruudvanasseldonk.com/2016/11/30/zero-cost-abstractions) for embedded programming.
 Take this piece of code for setting up the `MAIR_EL1` register using the
@@ -57,7 +149,7 @@ MAIR_EL1.write(
 );
 ```
-This piece of code is super expressive, and it makes us of `traits`, different
+This piece of code is super expressive, and it makes use of `traits`, different
 `types` and `constants` to provide type-safe register manipulation.
 In the end, this code sets the first four bytes of the register to certain
@@ -84,6 +176,5 @@ ferris@box:~$ make raspboot
 [i] MMU: Up to 40 Bit physical address range supported!
 [2] MMU online.
-Writing through the virtual mapping at 0x00000000001FF000.
-
+Writing through the virtual mapping at base address 0x00000000001FF000.
 ```
diff --git a/0C_virtual_memory/kernel8 b/0C_virtual_memory/kernel8
index 44757979..7ab1d87c 100755
Binary files a/0C_virtual_memory/kernel8 and b/0C_virtual_memory/kernel8 differ
diff --git a/0C_virtual_memory/kernel8.img b/0C_virtual_memory/kernel8.img
index 93ff1e1c..7e2f49fb 100755
Binary files a/0C_virtual_memory/kernel8.img and b/0C_virtual_memory/kernel8.img differ
diff --git a/0C_virtual_memory/src/main.rs b/0C_virtual_memory/src/main.rs
index 45ad5cb9..8a719330 100644
--- a/0C_virtual_memory/src/main.rs
+++ b/0C_virtual_memory/src/main.rs
@@ -60,6 +60,8 @@ fn kernel_entry() -> ! {
                 uart.puts(s);
                 uart.puts("\n");
             }
+            // The following write is already using the identity mapped
+            // translation in the LVL2 table.
             Ok(()) => uart.puts("[2] MMU online.\n"),
         }
     } // After this closure, the UART instance is not valid anymore.
@@ -68,7 +70,7 @@ fn kernel_entry() -> ! {
     // again, though.
     let uart = uart::Uart::new(memory::map::virt::REMAPPED_UART_BASE);
-    uart.puts("\nWriting through the virtual mapping at 0x");
+    uart.puts("\nWriting through the virtual mapping at base address 0x");
     uart.hex(memory::map::virt::REMAPPED_UART_BASE as u64);
     uart.puts(".\n");
diff --git a/doc/page_tables_2MiB.png b/doc/page_tables_2MiB.png
new file mode 100644
index 00000000..930d3b8e
Binary files /dev/null and b/doc/page_tables_2MiB.png differ
diff --git a/doc/page_tables_2MiB.svg b/doc/page_tables_2MiB.svg
new file mode 100644
index 00000000..c98d9b4b
--- /dev/null
+++ b/doc/page_tables_2MiB.svg
@@ -0,0 +1,1303 @@
[SVG markup not recoverable. Figure: address translation using a 2 MiB block descriptor. Recoverable text labels:
 "Virtual Address: UART_BASE + CR_Offset = 0x3F20_1000 + 0x30 = 0x3F20_1030 = 0b111111001_000000001000000110000";
 "TTBR_EL1", "LVL2 table base address", "LVL3 Table base addr", "static mut LVL2_TABLE";
 table entries "0", "...", "504", "505", "511" with the labels "Device MMIO" and "UART";
 "0b1_1111_1001 = 505: Select entry 505 of LVL2 table";
 descriptor fields "VALID", "TYPE", "MAIR_EL1 index", "AP Access Permissions", "SH Shareability", "AF Access Flag", "output address", "PXN Privileged execute never";
 "The LVL2 Block entry points to the start address of a 2 MiB frame. The last 21 Bit of the VA are the index into the frame.";
 "Final 48 Bit Physical Address: 0x3F201030"]
diff --git a/doc/page_tables_4KiB.png b/doc/page_tables_4KiB.png
index a06e7929..349e9037 100644
Binary files a/doc/page_tables_4KiB.png and b/doc/page_tables_4KiB.png differ
diff --git a/doc/page_tables_4KiB.svg b/doc/page_tables_4KiB.svg
index 4d61d43e..99c1c911 100644
--- a/doc/page_tables_4KiB.svg
+++ b/doc/page_tables_4KiB.svg
@@ -16,7 +16,6 @@
    id="svg8"
    inkscape:version="0.92.3 (2405546, 2018-03-11)"
    sodipodi:docname="page_tables_4KiB.svg"
-   inkscape:export-filename="/home/arichter/repos/rust-raspi3-tutorial/doc/page_tables.svg.png"
    inkscape:export-xdpi="95.999992"
    inkscape:export-ydpi="95.999992">
[Remaining SVG hunks not recoverable. Surviving text labels: "47", "The LVL3 PT entry points to the start address of a 4 KiB frame. The last ...", "output address", "1"]
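
The two translations described in the README's "Address translation examples" section can be checked with a little arithmetic. The following is a minimal sketch, assuming a `4 KiB` granule where the LVL2 index is taken from bits 29..21 of the virtual address and the LVL3 index from bits 20..12. The helper names (`lvl2_index`, `lvl3_index`, `block_offset`, `page_offset`) and the constants are hypothetical and are not part of the tutorial's code.

```rust
// Hypothetical constants, with values taken from the README text above.
const UART_PHYS_BASE: u64 = 0x3F20_1000;
const REMAPPED_UART_BASE: u64 = 0x001F_F000;
const CR_OFFSET: u64 = 0x30;

/// Index into the LVL2 table: one entry per 2 MiB, i.e. VA bits 29..21.
fn lvl2_index(va: u64) -> u64 {
    (va >> 21) & 0x1FF
}

/// Index into the LVL3 table: one entry per 4 KiB, i.e. VA bits 20..12.
fn lvl3_index(va: u64) -> u64 {
    (va >> 12) & 0x1FF
}

/// Offset into a 2 MiB block: the low 21 bits of the VA.
fn block_offset(va: u64) -> u64 {
    va & ((1 << 21) - 1)
}

/// Offset into a 4 KiB page: the low 12 bits of the VA.
fn page_offset(va: u64) -> u64 {
    va & ((1 << 12) - 1)
}

fn main() {
    // Identity mapping through a 2 MiB block descriptor.
    let va = UART_PHYS_BASE + CR_OFFSET;
    println!(
        "VA {:#x}: LVL2 index {}, block offset {:#x}",
        va,
        lvl2_index(va),
        block_offset(va)
    );

    // Remapping through the last 4 KiB page descriptor of the LVL3 table.
    let va = REMAPPED_UART_BASE + CR_OFFSET;
    println!(
        "VA {:#x}: LVL2 index {}, LVL3 index {}, page offset {:#x}",
        va,
        lvl2_index(va),
        lvl3_index(va),
        page_offset(va)
    );
}
```

For the identity mapping, the index works out to entry 505 of the LVL2 table, matching the 2 MiB diagram. For the remapped address, LVL2 entry 0 leads to the LVL3 table, whose last entry (511) is selected, and the remaining `0x30` is the offset of `CR` within the page.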