hvt: x86_64: Allow for up to 4GB guest size#577
Conversation
Allow for up to 4GB guest size on x86_64 by using up to four PDE entries. TODO: Lightly tested only, not sure if this arrangement will conflict with any platform memory holes that "plain" KVM may map into guest memory.
|
I reviewed the construction of the new page table mappings and they look good to me. While looking at the construction of
|
|
Interesting observation. I have used this patch for some unikernels on FreeBSD, it worked nicely. But I don't quite understand the semantics (and FreeBSD mmap(2) man page doesn't guarantee zeroed out memory (also not for MAP_ANONYMOUS). tenders/hvt/hvt_freebsd.c: hvt->mem = mmap(NULL, mem_size, PROT_READ | PROT_WRITE, MAP_SHARED, hvb->vmfd, 0); tenders/hvt/hvt_kvm.c: hvt->mem = mmap(NULL, mem_size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0); tenders/hvt/hvt_openbsd.c: p = mmap(NULL, vmr->vmr_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANON, -1, 0); So, why does (a) FreeBSD use the vmfd (/dev/vmm/solo5-)? Why does OpenBSD use MAP_PRIVATE (instead of MAP_SHARED)? Is it worth to unify the flags across the platforms? |
One PDE is 0x1000, but we use four.
|
I think this is ready to review. I added a minor comment why Another question is "why 4 GB?" -- and I think the answer is partly for simplicity, and maybe partly because this is the limit for aarch64 (for reasons I don't think apply to x86). |
According to David Chisnall, the current usage guarantees zeroized memory, see this thread. |
The most helpful reference would be Intel SDM Volume 3A, section 4.5 4-Level Paging and 5-Level Paging and Figure 4-9. Linear-Address Translation to a 2-MByte Page using 4-Level Paging. I think it is helpful to have this reference somewhere. If not in the code then at least in one of the commit messages but a comment would probably preferable.
I do not know what the reasoning was but probably that it is a nice round number and "should be enough for almost anyone" (tm). On Since there were no real code changes, my conclusion from the first review still stands and from my side I think it is looking good. |
|
This PR still has a WIP prefix -- but I consider it to be done, and ready for being merged. Any ideas what is missing? |
|
I marked it as ready for review (non-WIP) a while ago, but I didn't realize it still had WIP in the title. The only thing maybe missing would be some comments or documentation. |
On how to build the page table mappings. Co-authored-by: Adrian-Ken Rueegsegger <[email protected]>
I added the references (without looking them up myself! :/) as you wrote them. |
|
I just checked the references against the latest Intel SDM Volume 3A, from June 2024 (Order Number: 253668-084US). They are still correct. |
|
Thanks! |
The following is a commit done by @mato. I am creating this draft PR to open up for review and hopefully eventually have it merged. I may add comments or changes that I find helpful in understanding the code.
Allow for up to 4GB guest size on x86_64 by using up to four PDE entries.
TODO: Lightly tested only, not sure if this arrangement will conflict with any platform memory holes that "plain" KVM may map into guest memory.