Copy Fail CVE-2026-31431: Linux local privilege escalation risk for containers and CI

Summary: Copy Fail, tracked as CVE-2026-31431, is a Linux kernel vulnerability in the algif_aead path. It is not a remote code execution issue by itself, but it can allow an unprivileged local user to gain root on vulnerable systems. The risk is highest on multi-tenant hosts, CI runners, shared-kernel container environments, and platforms that execute untrusted code.

What was disclosed

The technical site copy.fail describes a kernel logic flaw involving authencesn, AF_ALG, and splice() that can lead to controlled writes into the page cache. NVD records CVE-2026-31431 as a Linux kernel vulnerability resolved by reverting the in-place optimization in algif_aead; the CNA severity from kernel.org is 7.8 HIGH.

According to the researchers, affected kernels span the period from 2017 until the patch, with impact verified across multiple mainstream Linux distributions. The operational question is not only which distribution you run, but whether the host has a vulnerable kernel plus local access or untrusted code execution.

How the primitive works

Copy Fail differs from many classic local privilege escalation bugs: it does not rely on a fragile race condition and does not need kernel-specific offsets, so an exploit is less tied to one particular kernel build.

The public chain can be summarized as:

an unprivileged process uses the AF_ALG socket interface to reach kernel crypto primitives;
the authencesn template enters a decrypt path that can treat source and destination pages unsafely;
splice() brings page-cache pages into the flow without writing the file on disk in the traditional way;
the result is a controlled write of a few bytes into the cached copy of a readable file;
if the target is a setuid binary, the kernel may execute the altered cached version and produce privilege escalation.

The defensive implication is that the file on disk may never change. Controls that rely only on write events, inotify, or hashes taken after the fact may miss a temporary page-cache manipulation.

Why it is dangerous in containers

Containers share the host kernel. If an untrusted workload can reach AF_ALG and the kernel is vulnerable, the boundary is no longer the container. It is the host kernel.
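Whether a workload can reach AF_ALG can be checked from inside the container or CI job itself. The following is a minimal Python sketch, not a vulnerability test: the function name af_alg_reachable is hypothetical, and it only reports whether creating a socket in the AF_ALG family succeeds. It says nothing about whether the kernel is patched.

```python
import socket


def af_alg_reachable() -> bool:
    """Return True if this process can create an AF_ALG socket."""
    family = getattr(socket, "AF_ALG", None)
    if family is None:
        # Non-Linux platform, or a Python build without AF_ALG support.
        return False
    try:
        sock = socket.socket(family, socket.SOCK_SEQPACKET, 0)
    except OSError:
        # Creation was refused, for example by a seccomp filter, or the
        # kernel does not offer AF_ALG to this process.
        return False
    sock.close()
    return True


if __name__ == "__main__":
    print("AF_ALG reachable:", af_alg_reachable())
```

Running it once on the host and once inside a representative container shows whether the container runtime's policy actually removes the interface. A True result inside an untrusted workload means the kernel patch level is the only remaining barrier.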

This makes Copy Fail closer to a container escape primitive than to a traditional "local user" issue. Risk increases when:

the container runtime does not block the AF_ALG socket family;
seccomp profiles are permissive;
CI runners execute unreviewed code;
the same node hosts workloads with different trust levels;
kernel inventory does not distinguish host nodes from container images.
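One way to tighten a permissive seccomp profile is a rule that rejects socket() calls whose first argument is AF_ALG. The sketch below generates such a rule in Python, assuming a container runtime that accepts Docker-style seccomp JSON profiles. The helper names deny_af_alg_rule and build_profile are hypothetical, and the allow-by-default profile is for illustration only: in practice the rule would be merged into the runtime's existing default profile rather than replacing it.

```python
import json

AF_ALG = 38  # numeric value of the AF_ALG address family on Linux


def deny_af_alg_rule() -> dict:
    """Seccomp rule: fail socket() when argument 0 (the family) is AF_ALG."""
    return {
        "names": ["socket"],
        "action": "SCMP_ACT_ERRNO",
        "args": [{"index": 0, "value": AF_ALG, "op": "SCMP_CMP_EQ"}],
    }


def build_profile() -> dict:
    """Illustrative profile: allow everything except AF_ALG socket creation."""
    return {"defaultAction": "SCMP_ACT_ALLOW", "syscalls": [deny_af_alg_rule()]}


if __name__ == "__main__":
    # Write the profile to stdout so it can be reviewed before use.
    print(json.dumps(build_profile(), indent=2))
```

The filter matches only the socket() system call. Workloads that legitimately use kernel crypto through AF_ALG will break under this rule, so it should be tested against real jobs before rollout.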

Why containers and CI are exposed first

On a single-tenant server, Copy Fail still requires local access or chaining with another vulnerability. In shared environments, the risk is different:

a CI runner executing untrusted pull requests can turn user code into root on the runner;
a container that can reach AF_ALG can target the shared host kernel;
a multi-tenant build host can become a pivot point;
a compromised shell account can move from user to root;
sandbox and agent platforms executing customer code should treat this as urgent.

For SMEs and agencies, the practical question is: "Do we execute third-party code on our infrastructure?" If the answer is yes, priority goes up.

Immediate actions

Identify Linux servers with multiple local users, containers, CI runners, build jobs, or sandboxes.
Check whether your distribution vendor has shipped a fixed kernel.
Update the kernel and schedule a controlled reboot where required.
Before patching, evaluate disabling the algif_aead module as a temporary mitigation, following vendor guidance and testing AF_ALG dependencies.
For untrusted workloads, block AF_ALG socket creation with seccomp policies where applicable.
After patching, review users, cron jobs, SSH keys, systemd services, and setuid binaries on hosts exposed to untrusted code.
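As a starting point for the inventory and module checks above, the following Python sketch records the running kernel release and whether algif_aead appears in /proc/modules. The function names loaded_modules and host_report are hypothetical. It has two limits that matter: a module that is built into the kernel, or one that is not loaded yet but can still be loaded on demand, will not appear in /proc/modules, so a False result is not proof that the interface is unavailable. Whether a given kernel release is fixed must still be checked against the distribution vendor's advisory.

```python
import platform
from pathlib import Path


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Parse /proc/modules content; the module name is the first field."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def host_report(module: str = "algif_aead",
                proc_modules: str = "/proc/modules") -> dict:
    """Collect the facts needed to compare this host with a vendor advisory."""
    path = Path(proc_modules)
    text = path.read_text() if path.exists() else ""
    return {
        "hostname": platform.node(),
        "kernel_release": platform.release(),
        "module_loaded": module in loaded_modules(text),
    }


if __name__ == "__main__":
    print(host_report())
```

Collected across all hosts, this output separates nodes by kernel release, which is the attribute that decides exposure, rather than by container image.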

Do not run public proof-of-concept code on production systems. Even if the page-cache modification does not persist across a reboot, the resulting privilege escalation is real.

Where to prioritize

High priority:

Kubernetes and container orchestration nodes running customer or supplier workloads;
self-hosted GitHub Actions runners, GitLab Runner, Jenkins, and build farms;
shared shell servers, bastion hosts, and jump hosts;
SaaS platforms that execute user-submitted code or scripts;
Linux servers where a web RCE could be chained into root.

Medium priority:

single-tenant VPS environments with a small admin group;
application servers without third-party code execution but with exposed web services;
developer workstations with many uncontrolled dependencies.

What to monitor after patching

For higher-risk hosts, review:

SSH and sudo activity before the patch window;
new keys in authorized_keys;
unexpected cron jobs and systemd timers;
changes to privileged users;
containers or CI jobs triggered by untrusted sources;
unexpected changes to setuid binaries and security configuration.
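For the setuid item, a baseline comparison can be sketched in Python as below. The function names setuid_inventory and diff_inventories are hypothetical. This catches persistent changes only, such as a new setuid binary or a replaced file: because the hash is computed through normal reads, it cannot reveal a page-cache manipulation that has already been evicted or reverted.

```python
import hashlib
import os
import stat


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large binaries do not load into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def setuid_inventory(root: str) -> dict:
    """Map each regular setuid file under root to its SHA-256 digest."""
    inventory = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
                if stat.S_ISREG(info.st_mode) and info.st_mode & stat.S_ISUID:
                    inventory[path] = sha256_of(path)
            except OSError:
                # Unreadable or vanished files are skipped, not fatal.
                continue
    return inventory


def diff_inventories(baseline: dict, current: dict) -> dict:
    """Report setuid files that were added, removed, or changed."""
    shared = baseline.keys() & current.keys()
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(p for p in shared if baseline[p] != current[p]),
    }
```

A typical use is to store the result of setuid_inventory("/usr") from a known-good host or image and compare it with the same call on the host under review.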

Copy Fail does not replace patch management. It tests it. If you do not have a kernel and CI runner inventory, this is the time to build one.

How SecBox can help

SecBox can help inventory Linux hosts, isolate panels and runners, set access policies, review container exposure, and create a patch plan with a controlled reboot window.
