Background
Twitter engineers found and fixed a Linux kernel bug in the memory shrinker that caused out-of-memory (OOM) failures for us.
Hasnain says:
Interesting technical debugging story.
“Memory reclamation is one of the most complicated parts of the Linux kernel. It is full of heuristic algorithms, complex corner cases, complicated data structures, and convoluted interaction with other subsystems. Memory reclamation is the core part of the Linux kernel and is relied upon by other subsystems. However, the bugs or suboptimal behaviors of memory reclamation may take a long time to get discovered. The fixes may be quite subtle and the validation may take substantial efforts to guarantee no regressions.”
Posted on 2021-05-10T04:28:46+0000