[GSOC] Implement hardware nested page table support for vkernels

Mihai Carabas mihai.carabas at gmail.com
Sun Apr 21 14:42:24 PDT 2013


My name is Mihai Carabas and I am a second-year student in a master's program
at the Politehnica University of Bucharest, Romania, Computer Science and
Engineering Department.

I was involved last year in the GSoC program with the DragonFly BSD
scheduler ("SMT/HT awareness to DragonFlyBSD scheduler"). For those who
aren't familiar with what I accomplished last year, here [1] is a
summary of the work and results. Meanwhile Matthew refactored the
scheduler and, together with other improvements, the results got much
better.

You can find more about me in last year's proposal [2]. In the past half
year I worked on virtualizing Android on top of the L4 microkernel. The
goal is to have two Androids running on a Galaxy Nexus. My role in this
project was to port (virtualize) a flavour of the Linux 3.0.8 kernel from
Samsung (tuna/maguro) on top of the microkernel. Most problems came from
the fact that there were three layers of addresses (physical addresses used
by the microkernel, microkernel virtual addresses used by the kernel, and
Linux virtual addresses used by user-space Linux programs). The Linux
kernel was running in a virtual space (the microkernel address space), and
one type of problem arose when allocating "physical" memory for devices
(e.g. the GPU) and passing the device the address to read/write from. We
had to take care to pass the actual physical address, not the virtual one;
otherwise we would get an interconnect error, which is hard to trace.

After browsing through all of this year's projects, and because I have
been working a lot with memory mapping and address translations, I would
like to work this year on "Implementing hardware nested page table support
for vkernels".

Before I begin, I must get a strong understanding of how the current
virtual page table is implemented, starting from the vmspace implementation
[3]. Another place to see how vmspace works is the "Page Faults" section
of the "Virtual Kernel Peek" [4]. Another resource is Matthew Dillon's
article here [1]. It's worth mentioning that last year I worked a little
with the vkernels: I implemented CPU topology support for them (basically
you can create any kind of topology you want).

I have started documenting how nested page tables work. It's worth
mentioning that both AMD and Intel have implemented this virtualization
extension, but under different names: NPT (nested page tables) for AMD and
EPT (extended page tables) for Intel. As far as I have read, they differ
in some important details (for example, EPT doesn't support accessed/dirty
bits; I have to see how this would influence the implementation).

A brief description of how NPT works can be found in the AMD System
Programming manual [5] at page 491. Basically, instead of using only the
CR3 register, which indicates where the page tables are, there are two
registers: gCR3 (guest CR3), which points to the guest page tables (mapping
guest virtual pages to guest physical pages), and nCR3 (nested CR3), which
points to the host page tables (mapping guest physical pages to physical
memory). The TLB keeps the direct mappings (guest virtual pages to
physical memory pages), and the guests can be differentiated by an Address
Space ID (ASID). A more extended description can be found in a paper
published by AMD [7].

My plan is as follows:

1) Create a mechanism to detect which virtualization extensions the CPU
supports. For this I can use the cpuid instruction. For example, on AMD:
check CPUID Fn8000_0001_ECX[SVM] to see if it supports virtualization at
all and, if so, check CPUID Fn8000_000A_EDX[NP] to see if the NPT extension
is available [8].

2) While exposing the information discovered at 1), document the flow of
calls regarding virtual memory allocation/creation when a vkernel starts
(and, further, when a process is created inside the vkernel).

3) Document how the virtual page tables are walked and propose a design
for a hardware implementation using NPT/EPT.

4) Pick a platform (probably an Intel Core i3) and write a stub
implementation for activating/using the virtualization extension (this can
be done by looking at the normal implementation with one CR3 register).
Here we have to take care to fall back to the current implementation if no
NPT/EPT is present.

5) After having a stable vkernel with NPT/EPT, begin testing to see what
the gain is with the virtualization extension enabled. Here we must pick
programs that allocate/free and access a lot of pages, invalidating
mappings and creating new ones. This way we force frequent page-table
updates.

6) Extend the implementation to all platforms (32/64-bit, AMD/Intel).

I would be glad to know your opinion on the above.

Best regards,

[1] http://lists.dragonflybsd.org/pipermail/kernel/2012-August/015478.html
[2] http://leaf.dragonflybsd.org/mailarchive/kernel/2012-03/msg00066.html
[4] http://www.dragonflybsd.org/docs/developer/VirtualKernelPeek/
[5] http://support.amd.com/us/Embedded_TechDocs/24593.pdf
[6] http://www.freebsd.org/doc/en/articles/vm-design/
[8] http://support.amd.com/us/Embedded_TechDocs/25481.pdf