Mmap’d IO is still a shit show because of clearing the CR3 register on page faul...

slashdev · on Feb 10, 2018

What is a good source to read more about that?

amluto · on Feb 10, 2018

What precisely do you mean by "clearing the CR3 register?"

vardump · on Feb 10, 2018

Although the wording is slightly inaccurate, he clearly means a full TLB flush, which is what reloading CR3 triggers when PCID is not in use.

That needs to be done on Intel CPUs older than Haswell, i.e. those without INVPCID support.

With INVPCID you can invalidate only part of the TLB.
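
For anyone who wants to check what their own machine supports, here is a rough sketch that looks for the `pcid` and `invpcid` feature flags in `/proc/cpuinfo`. It assumes an x86 Linux box; the helper names `cpu_flags` and `tlb_features` are made up for illustration, and on other architectures the `flags` line is simply absent, so everything reports as missing.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Match the key exactly so lines such as 'vmx flags' or 'bugs' are skipped.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def tlb_features(cpuinfo_text):
    """Report whether the PCID and INVPCID flags are advertised (hypothetical helper)."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in ("pcid", "invpcid")}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(tlb_features(cpuinfo.read_text()))
```

This only reports what the CPU advertises. Whether the kernel actually uses either feature depends on the kernel version and boot options.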

amluto · on Feb 10, 2018

Having recently rewritten Linux's TLB code, I can say this is quite wrong. For an ordinary page fault, there's no flush at all -- changing a page from not present to present doesn't require a flush on x86. Removing a page from the page cache can be done with INVLPG, which has been around for a long, long time.
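
If you want to see how many faults an mmap'd read actually takes, a minimal sketch along these lines works on Linux. It touches one byte per page of a mapped temporary file and reads the process fault counters from `getrusage` around each pass. The function name and the 16 MiB default are arbitrary choices for illustration. It says nothing about TLB flushes, which are not visible from user space this way; it only counts faults.

```python
import mmap
import resource
import tempfile


def count_faults_for_passes(size_bytes=16 * 1024 * 1024, passes=2):
    """Touch one byte per page of an mmap'd temp file, `passes` times.

    Returns a list of (minor_faults, major_faults) deltas, one per pass.
    """
    results = []
    with tempfile.TemporaryFile() as f:
        f.truncate(size_bytes)  # sparse file; reads back as zeros
        with mmap.mmap(f.fileno(), size_bytes, access=mmap.ACCESS_READ) as m:
            for _ in range(passes):
                before = resource.getrusage(resource.RUSAGE_SELF)
                checksum = 0
                for offset in range(0, size_bytes, mmap.PAGESIZE):
                    checksum += m[offset]  # first touch of a page may fault
                after = resource.getrusage(resource.RUSAGE_SELF)
                results.append(
                    (after.ru_minflt - before.ru_minflt,
                     after.ru_majflt - before.ru_majflt)
                )
    return results


if __name__ == "__main__":
    for i, (minor, major) in enumerate(count_faults_for_passes(), start=1):
        print(f"pass {i}: minor={minor} major={major}")
```

The first pass should usually account for most of the faults, with later passes showing few or none once the pages are mapped. The counters are process-wide, so the interpreter's own allocations add some noise, and the kernel may map several neighbouring pages per fault, so the count is typically lower than the number of pages.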

From 4.14 on, Linux has used PCID to improve context switches, independently of PTI. While writing that code, I did a bunch of benchmarking. INVPCID is not terribly useful, even with PCID. In fact, Linux only uses INVPCID on user pages to assist with a PTI corner case. It's not entirely clear to me what Intel had in mind when INVPCID was added.

bsdnoob · on Feb 10, 2018

Why would that be the case? I don't think you'd be changing the page directory very often for mmio.

vardump · on Feb 10, 2018

I think he means page fault every time a page is not present.

They're slower because the kernel needs to be mapped in and out of the virtual address space, just like for syscalls.

If the access pattern is sufficiently local, perhaps this could be mitigated by using large (2 MB) pages. That would be a bad idea for a random access pattern, of course.
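
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. It assumes 4 KiB base pages and 2 MiB large pages, and the worst case where every random touch lands on a different page that is not yet present, so the whole page has to be populated. The helper names are hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_needed(region_bytes, page_bytes):
    """Number of pages (and so, at most, first-touch faults) covering a region."""
    return -(-region_bytes // page_bytes)  # ceiling division


def bytes_populated(touches, page_bytes):
    """Worst case: every touch hits a distinct page that is not yet present."""
    return touches * page_bytes


if __name__ == "__main__":
    print(pages_needed(GIB, 4 * KIB))   # 262144 pages with 4 KiB pages
    print(pages_needed(GIB, 2 * MIB))   # 512 pages with 2 MiB pages
    # 1000 scattered one-byte reads, in MiB brought in:
    print(bytes_populated(1000, 4 * KIB) / MIB)  # about 3.9 MiB
    print(bytes_populated(1000, 2 * MIB) / MIB)  # 2000.0 MiB
```

So a sequential scan of 1 GiB takes up to 512 times fewer faults with large pages, while scattered single-byte reads can pull in up to 512 times more data per touch.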