I read two papers from ASPLOS '22 and found both interesting: "TMO: Transparent Memory Offloading in Datacenters" and "IOCost: Block IO Control for Containers in Datacenters" (both from Meta).
In short, both describe the setting of a containerized datacenter: the infrastructure, the heterogeneous devices, and the dynamic application demands. I made some (hand-drawn) slides for the reading group meeting, which can be found here.
TMO aims to offload a portion of memory from RAM to a backend tier (compressed memory, NVMe SSDs, NVM, etc.) because RAM is relatively expensive and power-hungry. It provides a generic solution split into an in-kernel mechanism and a user-space policy. To handle dynamic demands and device heterogeneity, it adopts an app-centric metric that estimates the slowdown an application would suffer if memory were offloaded to the backend. One interesting part is what information the in-kernel metrics module (the mechanism) needs, and how Linux's existing page reclaim tracking has to be improved accordingly. Overall, it is simple and generic (neat!), and makes no assumptions about application workloads.
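To make the mechanism/policy split concrete, here is a minimal sketch of what a user-space policy step could look like: given a container's current memory footprint and a measured slowdown (pressure) percentage, it decides how many bytes to try offloading in the next interval. The function name, the proportional rule, and the default constants are all my own assumptions for illustration, not values or code taken from the paper.

```python
def reclaim_target(current_bytes: int,
                   pressure_pct: float,
                   threshold_pct: float = 1.0,    # hypothetical tolerated slowdown
                   reclaim_ratio: float = 0.001   # hypothetical max fraction per step
                   ) -> int:
    """Return how many bytes to try offloading in this interval.

    The target shrinks linearly as the measured pressure approaches the
    threshold and drops to zero once the threshold is reached or exceeded.
    """
    if threshold_pct <= 0:
        raise ValueError("threshold_pct must be positive")
    # Fraction of the pressure budget still unused, clamped at zero.
    headroom = max(0.0, 1.0 - pressure_pct / threshold_pct)
    return int(current_bytes * reclaim_ratio * headroom)


if __name__ == "__main__":
    footprint = 8 * 1024**3  # an 8 GiB container, purely illustrative
    for pressure in (0.0, 0.5, 1.0, 2.0):
        print(pressure, reclaim_target(footprint, pressure))
```

The point of a rule like this is that the policy never needs to know what the backend is: a slow backend simply shows up as higher pressure for the same amount of offloading, so the target shrinks on its own.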
IOCost tackles block IO scheduling in a similar setting, where SSDs have quite diverse performance characteristics and the scheduling unit is the workload container. It has an offline module that builds a device performance model, so each block IO request can be translated into an estimate of its service time, which makes it easier to support SLOs. The combination of an offline model and online queries is intuitive. It is also interesting to see their cgroup hierarchy and how the memory subsystem and the block IO controller should make coherent decisions.
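As a rough illustration of the "offline model, online query" idea, the sketch below turns a handful of offline-measured device parameters into a per-request service-time estimate using a simple linear model: a fixed per-IO cost plus a size-proportional cost. The class name, field names, and the exact way the per-IO cost is derived are my own simplifying assumptions, not the paper's or the kernel's implementation.

```python
from dataclasses import dataclass

PAGE = 4096  # bytes; the IOPS figures below are assumed to be for 4 KiB IOs


@dataclass(frozen=True)
class DeviceModel:
    """Hypothetical offline-measured parameters for one direction (read or write)."""
    bps: float        # sustained bytes per second
    seq_iops: float   # sequential 4 KiB IOs per second
    rand_iops: float  # random 4 KiB IOs per second


def io_cost_seconds(model: DeviceModel, size_bytes: int, sequential: bool) -> float:
    """Estimate the device time one request occupies under a linear model."""
    per_byte = 1.0 / model.bps
    iops = model.seq_iops if sequential else model.rand_iops
    # Fixed overhead: time of one 4 KiB IO minus its transfer time, floored at zero.
    per_io = max(1.0 / iops - PAGE * per_byte, 0.0)
    return per_io + size_bytes * per_byte


if __name__ == "__main__":
    # Made-up numbers, not measurements of any real SSD.
    ssd = DeviceModel(bps=2e9, seq_iops=400_000, rand_iops=150_000)
    print(io_cost_seconds(ssd, 4096, sequential=True))
    print(io_cost_seconds(ssd, 1 << 20, sequential=False))
```

With costs expressed as device time rather than raw IOPS or bytes, a scheduler can compare requests of different sizes and access patterns on one scale, and the same container weights carry over to a different SSD once its model is swapped in.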