You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

447 lines
14 KiB
Markdown

# Tutorial 10 - Privilege Level
## tl;dr
4 years ago
In early boot code, we transition from the `Hypervisor` privilege level (`EL2`
in AArch64) to the `Kernel` (`EL1`) privilege level.
## Table of Contents
- [Introduction](#introduction)
- [Scope of this tutorial](#scope-of-this-tutorial)
- [Checking for EL2 in the entrypoint](#checking-for-el2-in-the-entrypoint)
- [Transition preparation](#transition-preparation)
- [Returning from an exception that never happened](#returning-from-an-exception-that-never-happened)
- [Are we stackless?](#are-we-stackless)
- [Test it](#test-it)
- [Diff to previous](#diff-to-previous)
## Introduction
Application-grade CPUs have so-called `privilege levels`, which have different purposes:
| Typically used for | AArch64 | RISC-V | x86 |
| ------------- | ------------- | ------------- | ------------- |
| Userspace applications | EL0 | U/VU | Ring 3 |
| OS Kernel | EL1 | S/VS | Ring 0 |
| Hypervisor | EL2 | HS | Ring -1 |
| Low-Level Firmware | EL3 | M | |
`EL` in AArch64 stands for `Exception Level`. If you want more information regarding the other
architectures, please have a look at the following links:
- [x86 privilege rings](https://en.wikipedia.org/wiki/Protection_ring).
- [RISC-V privilege modes](https://content.riscv.org/wp-content/uploads/2017/12/Tue0942-riscv-hypervisor-waterman.pdf).
At this point, I strongly recommend that you glimpse over `Chapter 3` of the [Programmers Guide for
ARMv8-A](http://infocenter.arm.com/help/topic/com.arm.doc.den0024a/DEN0024A_v8_architecture_PG.pdf)
before you continue. It gives a concise overview about the topic.
## Scope of this tutorial
If you set up your SD Card exactly like mentioned in [tutorial 06], the Rpi will always start
executing in `EL2`. Since we are writing a traditional `Kernel`, we have to transition into the more
appropriate `EL1`.
[tutorial 06]: https://github.com/rust-embedded/rust-raspi3-OS-tutorials/tree/master/06_drivers_gpio_uart#boot-it-from-sd-card
## Checking for EL2 in the entrypoint
First of all, we need to ensure that we actually execute in `EL2` before we can call respective code
to transition to `EL1`:
```rust
pub unsafe extern "C" fn _start() -> ! {
const CORE_MASK: u64 = 0x3;
// Expect the boot core to start in EL2.
if (bsp::BOOT_CORE_ID == MPIDR_EL1.get() & CORE_MASK)
&& (CurrentEL.get() == CurrentEL::EL::EL2.value)
{
el2_to_el1_transition()
} else {
// If not core0, infinitely wait for events.
wait_forever()
}
}
```
If this is the case, we continue with preparing the `EL2` -> `EL1` transition in
`el2_to_el1_transition()`.
## Transition preparation
Since `EL2` is more privileged than `EL1`, it has control over various processor features and can
allow or disallow `EL1` code to use them. One such example is access to timer and counter registers.
We are already using them since [tutorial 08](../08_timestamps/), so of course we want to keep them.
Therefore we set the respective flags in the [Counter-timer Hypervisor Control register] and
additionally set the virtual offset to zero so that we get the real physical value everytime:
[Counter-timer Hypervisor Control register]: https://docs.rs/cortex-a/2.4.0/src/cortex_a/regs/cnthctl_el2.rs.html
```rust
// Enable timer counter registers for EL1.
CNTHCTL_EL2.write(CNTHCTL_EL2::EL1PCEN::SET + CNTHCTL_EL2::EL1PCTEN::SET);
// No offset for reading the counters.
CNTVOFF_EL2.set(0);
```
Next, we configure the [Hypervisor Configuration Register] such that `EL1` should actually run in
`AArch64` mode, and not in `AArch32`, which would also be possible.
[Hypervisor Configuration Register]: https://docs.rs/cortex-a/2.4.0/src/cortex_a/regs/hcr_el2.rs.html
```rust
// Set EL1 execution state to AArch64.
HCR_EL2.write(HCR_EL2::RW::EL1IsAarch64);
```
## Returning from an exception that never happened
There is actually only one way to transition from a higher EL to a lower EL, which is by way of
executing the [ERET] instruction.
[ERET]: https://docs.rs/cortex-a/2.4.0/src/cortex_a/asm.rs.html#49-62
This instruction will copy the contents of the [Saved Program Status Register - EL2] to `Current
Program Status Register - EL1` and jump to the instruction address that is stored in the [Exception
Link Register - EL2].
This is basically the reverse of what is happening when an exception is taken. You'll learn about it
in an upcoming tutorial.
[Saved Program Status Register - EL2]: https://docs.rs/cortex-a/2.4.0/src/cortex_a/regs/spsr_el2.rs.html
[Exception Link Register - EL2]: https://docs.rs/cortex-a/2.4.0/src/cortex_a/regs/elr_el2.rs.html
```rust
// Set up a simulated exception return.
//
// First, fake a saved program status, where all interrupts were masked and SP_EL1 was used as a
// stack pointer.
SPSR_EL2.write(
SPSR_EL2::D::Masked
+ SPSR_EL2::A::Masked
+ SPSR_EL2::I::Masked
+ SPSR_EL2::F::Masked
+ SPSR_EL2::M::EL1h,
);
// Second, let the link register point to init().
ELR_EL2.set(crate::runtime_init::init as *const () as u64);
```
As you can see, we are populating `ELR_EL2` with the address of the [init()] function that we
earlier used to call directly from the entrypoint.
Finally, we set the stack pointer for `SP_EL1` and call `ERET`:
[init()]: src/runtime_init.rs
```rust
// Set up SP_EL1 (stack pointer), which will be used by EL1 once we "return" to it.
SP_EL1.set(bsp::BOOT_CORE_STACK_START);
// Use `eret` to "return" to EL1. This will result in execution of `reset()` in EL1.
asm::eret()
```
## Are we stackless?
We just wrote a big inline rust function, `el2_to_el1_transition()`, that is executed in a context
where we do not have a stack yet. We should double-check the generated machine code:
```console
make objdump
[...]
Disassembly of section .text:
0000000000080000 _start:
80000: mrs x8, MPIDR_EL1
80004: tst x8, #0x3
80008: b.ne #0x10 <_start+0x18>
8000c: mrs x8, CurrentEL
80010: cmp w8, #0x8
80014: b.eq #0xc <_start+0x20>
80018: wfe
8001c: b #-0x4 <_start+0x18>
80020: mov x8, xzr
80024: mov w9, #0x3
80028: msr CNTHCTL_EL2, x9
8002c: msr CNTVOFF_EL2, x8
80030: adrp x8, #0x0
80034: mov w10, #-0x80000000
80038: mov w11, #0x3c5
8003c: mov w12, #0x80000
80040: msr HCR_EL2, x10
80044: msr SPSR_EL2, x11
80048: add x8, x8, #0xdd0
8004c: msr ELR_EL2, x8
80050: msr SP_EL1, x12
80054: eret
```
Looks good! Thanks zero-overhead abstractions in the
[cortex-a](https://github.com/rust-embedded/cortex-a) crate! :heart_eyes:
4 years ago
## Test it
In `main.rs`, we additionally inspect if the mask bits in `SPSR_EL2` made it to `EL1` as well:
```console
make chainbot
[...]
### Listening on /dev/ttyUSB0
__ __ _ _ _ _
| \/ (_)_ _ (_) | ___ __ _ __| |
| |\/| | | ' \| | |__/ _ \/ _` / _` |
|_| |_|_|_||_|_|____\___/\__,_\__,_|
Raspberry Pi 3
[ML] Requesting binary
### sending kernel kernel8.img [16480 byte]
### finished sending
[ML] Loaded! Executing the payload now
[ 1.459973] Booting on: Raspberry Pi 3
[ 1.462256] Current privilege level: EL1
[ 1.466163] Exception handling state:
[ 1.469810] Debug: Masked
[ 1.473023] SError: Masked
[ 1.476235] IRQ: Masked
[ 1.479447] FIQ: Masked
[ 1.482661] Architectural timer resolution: 52 ns
[ 1.487349] Drivers loaded:
[ 1.490127] 1. GPIO
[ 1.492731] 2. PL011Uart
[ 1.495770] Timer test, spinning for 1 second
[ 2.500114] Echoing input now
```
## Diff to previous
```diff
diff -uNr 09_hw_debug_JTAG/src/arch/aarch64/exception.rs 10_privilege_level/src/arch/aarch64/exception.rs
--- 09_hw_debug_JTAG/src/arch/aarch64/exception.rs
+++ 10_privilege_level/src/arch/aarch64/exception.rs
5 years ago
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: MIT OR Apache-2.0
+//
+// Copyright (c) 2018-2020 Andre Richter <andre.o.richter@gmail.com>
+
+//! Exception handling.
+
+use cortex_a::regs::*;
+
5 years ago
+//--------------------------------------------------------------------------------------------------
+// Arch-public
+//--------------------------------------------------------------------------------------------------
+
+pub trait DaifField {
+ fn daif_field() -> register::Field<u32, DAIF::Register>;
+}
+
+pub struct Debug;
+pub struct SError;
+pub struct IRQ;
+pub struct FIQ;
+
+impl DaifField for Debug {
+ fn daif_field() -> register::Field<u32, DAIF::Register> {
+ DAIF::D
+ }
+}
+
+impl DaifField for SError {
+ fn daif_field() -> register::Field<u32, DAIF::Register> {
+ DAIF::A
+ }
+}
+
+impl DaifField for IRQ {
+ fn daif_field() -> register::Field<u32, DAIF::Register> {
+ DAIF::I
+ }
+}
+
+impl DaifField for FIQ {
+ fn daif_field() -> register::Field<u32, DAIF::Register> {
+ DAIF::F
+ }
+}
+
+pub fn is_masked<T: DaifField>() -> bool {
+ DAIF.is_set(T::daif_field())
+}
diff -uNr 09_hw_debug_JTAG/src/arch/aarch64.rs 10_privilege_level/src/arch/aarch64.rs
--- 09_hw_debug_JTAG/src/arch/aarch64.rs
+++ 10_privilege_level/src/arch/aarch64.rs
@@ -4,6 +4,7 @@
//! AArch64.
+mod exception;
pub mod sync;
mod time;
@@ -21,15 +22,56 @@
pub unsafe extern "C" fn _start() -> ! {
const CORE_MASK: u64 = 0x3;
- if bsp::BOOT_CORE_ID == MPIDR_EL1.get() & CORE_MASK {
- SP.set(bsp::BOOT_CORE_STACK_START);
- crate::runtime_init::runtime_init()
+ // Expect the boot core to start in EL2.
+ if (bsp::BOOT_CORE_ID == MPIDR_EL1.get() & CORE_MASK)
+ && (CurrentEL.get() == CurrentEL::EL::EL2.value)
+ {
+ el2_to_el1_transition()
} else {
// If not core0, infinitely wait for events.
wait_forever()
}
}
+/// Transition from EL2 to EL1.
+///
+/// # Safety
+///
+/// - The HW state of EL1 must be prepared in a sound way.
+/// - Exception return from EL2 must must continue execution in EL1 with ´runtime_init::init()`.
+#[inline(always)]
+unsafe fn el2_to_el1_transition() -> ! {
+ // Enable timer counter registers for EL1.
+ CNTHCTL_EL2.write(CNTHCTL_EL2::EL1PCEN::SET + CNTHCTL_EL2::EL1PCTEN::SET);
+
+ // No offset for reading the counters.
+ CNTVOFF_EL2.set(0);
+
+ // Set EL1 execution state to AArch64.
+ HCR_EL2.write(HCR_EL2::RW::EL1IsAarch64);
+
+ // Set up a simulated exception return.
+ //
+ // First, fake a saved program status, where all interrupts were masked and SP_EL1 was used as a
+ // stack pointer.
+ SPSR_EL2.write(
+ SPSR_EL2::D::Masked
+ + SPSR_EL2::A::Masked
+ + SPSR_EL2::I::Masked
+ + SPSR_EL2::F::Masked
+ + SPSR_EL2::M::EL1h,
+ );
+
+ // Second, let the link register point to init().
+ ELR_EL2.set(crate::runtime_init::runtime_init as *const () as u64);
+
+ // Set up SP_EL1 (stack pointer), which will be used by EL1 once we "return" to it.
+ SP_EL1.set(bsp::BOOT_CORE_STACK_START);
+
+ // Use `eret` to "return" to EL1. This will result in execution of `reset()` in EL1.
+ asm::eret()
+}
+
//--------------------------------------------------------------------------------------------------
// Global instances
//--------------------------------------------------------------------------------------------------
@@ -61,3 +103,39 @@
asm::wfe()
}
}
+
+/// Information about the HW state.
+pub mod state {
+ use crate::arch::PrivilegeLevel;
+ use cortex_a::regs::*;
+
5 years ago
+ /// The processing element's current privilege level.
+ pub fn current_privilege_level() -> (PrivilegeLevel, &'static str) {
+ let el = CurrentEL.read_as_enum(CurrentEL::EL);
+ match el {
+ Some(CurrentEL::EL::Value::EL2) => (PrivilegeLevel::Hypervisor, "EL2"),
+ Some(CurrentEL::EL::Value::EL1) => (PrivilegeLevel::Kernel, "EL1"),
+ Some(CurrentEL::EL::Value::EL0) => (PrivilegeLevel::User, "EL0"),
+ _ => (PrivilegeLevel::Unknown, "Unknown"),
+ }
+ }
+
+ /// Print the AArch64 exceptions status.
+ #[rustfmt::skip]
+ pub fn print_exception_state() {
+ use super::{
+ exception,
+ exception::{Debug, SError, FIQ, IRQ},
+ };
+ use crate::info;
+
+ let to_mask_str = |x| -> _ {
+ if x { "Masked" } else { "Unmasked" }
+ };
+
+ info!(" Debug: {}", to_mask_str(exception::is_masked::<Debug>()));
+ info!(" SError: {}", to_mask_str(exception::is_masked::<SError>()));
+ info!(" IRQ: {}", to_mask_str(exception::is_masked::<IRQ>()));
+ info!(" FIQ: {}", to_mask_str(exception::is_masked::<FIQ>()));
+ }
+}
diff -uNr 09_hw_debug_JTAG/src/arch.rs 10_privilege_level/src/arch.rs
--- 09_hw_debug_JTAG/src/arch.rs
+++ 10_privilege_level/src/arch.rs
@@ -9,3 +9,13 @@
#[cfg(any(feature = "bsp_rpi3", feature = "bsp_rpi4"))]
pub use aarch64::*;
+
+/// Architectural privilege level.
+#[allow(missing_docs)]
+#[derive(PartialEq)]
+pub enum PrivilegeLevel {
+ User,
+ Kernel,
+ Hypervisor,
+ Unknown,
+}
diff -uNr 09_hw_debug_JTAG/src/main.rs 10_privilege_level/src/main.rs
--- 09_hw_debug_JTAG/src/main.rs
+++ 10_privilege_level/src/main.rs
@@ -64,9 +64,16 @@
/// The main function running after the early init.
fn kernel_main() -> ! {
use core::time::Duration;
- use interface::time::Timer;
+ use interface::{console::All, time::Timer};
info!("Booting on: {}", bsp::board_name());
+
+ let (_, privilege_level) = arch::state::current_privilege_level();
+ info!("Current privilege level: {}", privilege_level);
+
+ info!("Exception handling state:");
+ arch::state::print_exception_state();
+
info!(
"Architectural timer resolution: {} ns",
arch::timer().resolution().as_nanos()
@@ -77,11 +84,12 @@
info!(" {}. {}", i + 1, driver.compatible());
}
- // Test a failing timer case.
- arch::timer().spin_for(Duration::from_nanos(1));
+ info!("Timer test, spinning for 1 second");
+ arch::timer().spin_for(Duration::from_secs(1));
+ info!("Echoing input now");
loop {
- info!("Spinning for 1 second");
- arch::timer().spin_for(Duration::from_secs(1));
+ let c = bsp::console().read_char();
+ bsp::console().write_char(c);
}
}
```