Original Article Text

NVIDIA shares guidance to defend GDDR6 GPUs against Rowhammer attacks

NVIDIA is warning users to activate the System Level Error-Correcting Code (ECC) mitigation to protect against Rowhammer attacks on graphics processors with GDDR6 memory. The company is reinforcing the recommendation as new research demonstrates a Rowhammer attack against an NVIDIA A6000 GPU (graphics processing unit).

Rowhammer is a hardware fault that can be triggered through software and stems from memory cells being packed too close to each other. The attack was originally demonstrated on CPU-attached DRAM, but it can affect GPU memory, too. It works by accessing a memory row repeatedly and rapidly enough that bits in adjacent rows flip from one to zero or vice versa, changing the information held in memory. The effect could be a denial-of-service condition, data corruption, or even privilege escalation.

System Level ECC preserves data integrity by storing redundant bits alongside the data and using them to correct single-bit errors. In workstation and data center GPUs, where VRAM handles large datasets and precise calculations for AI workloads, ECC must be enabled to prevent critical errors.

NVIDIA's security notice notes that researchers at the University of Toronto showed "a potential Rowhammer attack against an NVIDIA A6000 GPU with GDDR6 Memory" where System Level ECC was not enabled. The academic researchers developed GPUHammer, an attack method to flip bits in GPU memory. Although hammering is harder on GDDR6 because of higher latency and faster refresh compared with CPU-based DDR4, the researchers demonstrated that Rowhammer attacks on GPU memory banks are possible.

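To make the idea of "redundant bits that correct single-bit errors" concrete, here is a minimal sketch using a textbook Hamming(7,4) code. This is an illustration only: the article does not say which ECC scheme NVIDIA's System Level ECC uses, and the function names `encode` and `decode` are hypothetical. The simulated bit flip stands in for a Rowhammer-induced fault.

```python
def encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming(7,4) codeword.

    Positions are numbered 1..7; parity bits sit at 1, 2 and 4,
    data bits at 3, 5, 6 and 7.
    """
    c = [0] * 8  # index 0 unused so indices match positions
    c[3], c[5], c[6], c[7] = data_bits
    c[1] = c[3] ^ c[5] ^ c[7]
    c[2] = c[3] ^ c[6] ^ c[7]
    c[4] = c[5] ^ c[6] ^ c[7]
    return c[1:]


def decode(codeword):
    """Return (data_bits, syndrome); a nonzero syndrome is the corrected position."""
    c = [0] + list(codeword)
    s1 = c[1] ^ c[3] ^ c[5] ^ c[7]
    s2 = c[2] ^ c[3] ^ c[6] ^ c[7]
    s4 = c[4] ^ c[5] ^ c[6] ^ c[7]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome] ^= 1  # flip the faulty bit back
    return [c[3], c[5], c[6], c[7]], syndrome


def flip_bit(codeword, position):
    """Simulate a single bit flip at a 1-based position."""
    corrupted = list(codeword)
    corrupted[position - 1] ^= 1
    return corrupted
```

For example, `decode(flip_bit(encode([1, 0, 1, 1]), 6))` recovers the original four data bits and reports position 6 as the corrected bit. A plain Hamming(7,4) code like this one handles only a single flipped bit per codeword; two flips in the same codeword would be miscorrected.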
Apart from the RTX A6000, the GPU maker also recommends enabling System Level ECC for products in three categories: data center GPUs, workstation GPUs, and embedded/industrial products.

The GPU maker notes that newer GPUs like Blackwell RTX 50 Series (GeForce), Blackwell Data Center GB200, B200, B100, and Hopper Data Center H100, H200, H20, and GH200 come with built-in on-die ECC protection, which requires no intervention from the user.

One way to check whether System Level ECC is enabled is an out-of-band method that uses the system's BMC (Baseboard Management Controller) and hardware interface software, like the Redfish API, to check the "ECCModeEnabled" status. Tools like NSM Type 3 and NVIDIA SMBPBI can also be used for configuration, though they require access to the NVIDIA Partner Portal. A second, in-band method uses the nvidia-smi command-line utility from the system's CPU to check and enable ECC where supported.

Rowhammer represents a real security concern that could cause data corruption or enable attacks in multi-tenant environments like cloud servers, where vulnerable GPUs may be deployed. However, the real risk is context-dependent, and exploiting Rowhammer reliably is complicated: it requires specific conditions, high access rates, and precise control, making it a difficult attack to execute.
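The in-band check described above can be scripted. The following is a minimal sketch, assuming nvidia-smi is on the PATH and that its CSV output reports ECC mode as "Enabled", "Disabled", or "[N/A]" for GPUs without ECC support. The helper names (`parse_ecc_report`, `gpus_without_ecc`, `query_ecc_status`) are hypothetical and not part of any NVIDIA tooling.

```python
import csv
import io
import subprocess

# nvidia-smi query fields for the current and pending ECC mode, one CSV row per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,ecc.mode.current,ecc.mode.pending",
    "--format=csv,noheader",
]


def parse_ecc_report(csv_text):
    """Turn nvidia-smi CSV output into a list of per-GPU dicts."""
    rows = []
    for fields in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if len(fields) != 4:
            continue  # skip blank or unexpected lines
        index, name, current, pending = (f.strip() for f in fields)
        rows.append(
            {"index": int(index), "name": name, "current": current, "pending": pending}
        )
    return rows


def gpus_without_ecc(rows):
    """Return GPUs whose current ECC mode is anything other than Enabled."""
    return [r for r in rows if r["current"] != "Enabled"]


def query_ecc_status():
    """Run nvidia-smi and parse its output (needs an NVIDIA driver installed)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_ecc_report(result.stdout)
```

Calling `gpus_without_ecc(query_ecc_status())` on a GPU host lists the devices that still need attention. On supported GPUs, ECC can typically be turned on with `nvidia-smi -e 1` run with administrator privileges; the new mode shows up as pending until the GPU is reset or the system is rebooted.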

Daily Brief Summary

CYBERCRIME // NVIDIA Advises on Protecting GDDR6 GPUs from Rowhammer Attacks

NVIDIA is urging users to enable System Level Error-Correcting Code (ECC) to mitigate Rowhammer attacks on GPUs with GDDR6 memory.

Recent research demonstrated a successful Rowhammer attack on an NVIDIA A6000 GPU, prompting NVIDIA's recommendation.

Rowhammer is a hardware fault exploited through frequent memory access, causing data corruption or system disruptions by altering adjacent memory bits.

System Level ECC adds redundant bits to data, correcting single-bit errors and ensuring data integrity and reliability in critical applications.

The attacks are technically challenging due to GDDR6’s higher latency and faster refresh rates compared to traditional DDR4, but they remain feasible.

NVIDIA recommends System Level ECC for several GPU models across its data center, workstation, and embedded/industrial lines to protect against potential vulnerabilities.

Newer GPUs like the Blackwell and Hopper series feature built-in on-die ECC, which provides automatic protection without user intervention.

Users can verify the ECC status via out-of-band and in-band methods, including system tools and command-line utilities like nvidia-smi.