Article Details
Scrape Timestamp (UTC): 2024-01-17 23:26:05.756
Source: https://www.theregister.com/2024/01/17/leftoverlocals_gpu_flaw/
Original Article Text
Apple, AMD, Qualcomm GPU security hole lets miscreants snoop on AI training and chats

So much for isolation

A design flaw in GPUs made by Apple, Qualcomm, AMD, and likely Imagination can be exploited by miscreants on a shared system to snoop on fellow users. That means creeps can, for instance, observe the large language models and other machine-learning software being accelerated by the processors for other users. That will be a worry for those training or running LLMs on a shared server in the cloud. On a non-shared system, malware running on the box could abuse the weakness to spy on the user's GPU activities.

Crucially, the graphics chips are supposed to prevent this kind of monitoring by fully isolating the memory and other resources used by each user process, but in reality many chips do not securely implement this isolation, allowing data to be stolen.

The vulnerability, tracked as CVE-2023-4969 and dubbed LeftoverLocals, was discovered by Tyler Sorensen, a security research engineer on the Trail of Bits AI and ML assurance team and an assistant professor at the University of California, Santa Cruz. Research made public on Tuesday detailed how miscreants can exploit the hole to read arbitrary data in a system's local GPU memory. The researchers also published proof-of-concept code to snoop on an LLM chatbot in conversation with another user on a shared GPU-accelerated server.

To exploit the security oversight, the attacker just needs sufficient access to the shared GPU to run application code on it. On vulnerable silicon, that code can, in spite of any isolation protections in place, read a region of memory that serves as a data cache shared with other code running on the GPU. Exploitation involves monitoring that cache for values written by other programs and exfiltrating anything of interest, allowing data to be stolen from other users.
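The listen-and-leak pattern described above can be illustrated with a deliberately simplified, CPU-side Python analogy. To be clear, this is not the researchers' proof-of-concept: the real exploit reads GPU local memory across kernel dispatches, whereas here a plain bytearray and the names victim_kernel and attacker_kernel are illustrative stand-ins for a scratchpad that is reused between programs without being zeroed.

```python
# Simplified CPU-side analogy of the LeftoverLocals pattern: a scratchpad
# buffer reused between "kernels" without being cleared. All names here
# are illustrative; the actual flaw involves GPU local/shared memory.

SCRATCHPAD = bytearray(64)  # stands in for GPU local memory


def victim_kernel(secret: bytes) -> None:
    # The victim writes working data into local memory and returns
    # without clearing it -- the core of the vulnerability.
    SCRATCHPAD[:len(secret)] = secret


def attacker_kernel() -> bytes:
    # The attacker's "listener" then reads the region and keeps
    # whatever the previous occupant left behind.
    return bytes(SCRATCHPAD)


victim_kernel(b"model weights chunk")
leaked = attacker_kernel()
print(leaked[:19])  # -> b'model weights chunk'
```

On hardware that zeroes or partitions local memory between dispatches, the attacker would only ever see cleared data; the vulnerable GPUs skip that step.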
"This data leaking can have severe security consequences, especially given the rise of ML systems, where local memory is used to store model inputs, outputs, and weights," according to Sorensen and Heidy Khlaaf, Trail of Bits' engineering director for AI and ML assurance.

While the flaw potentially affects all GPU applications on affected chips, it is especially concerning for machine-learning workloads because of the amount of data these models push through GPUs, and therefore the amount of potentially sensitive information that could be swiped by exploiting this issue.

"LeftoverLocals can leak ~5.5 MB per GPU invocation on an AMD Radeon RX 7900 XT which, when running a 7B model on llama.cpp, adds up to ~181 MB for each LLM query," Sorensen and Khlaaf explained. "This is enough information to reconstruct the LLM response with high precision."

The bug hunters have been working with the affected GPU vendors and the CERT Coordination Center to address and disclose the flaws since September 2023. AMD, in a security bulletin issued Tuesday, said it plans to begin rolling out mitigations in March through upcoming driver updates. The chip house also confirmed that many of its products are vulnerable to the memory leak, including multiple versions of its Athlon and Ryzen desktop and mobile processors, Radeon graphics cards, and Radeon and Instinct data center GPUs.

When asked about LeftoverLocals, an AMD spokesperson directed The Register to the bulletin for its mitigation plans, and sent the following statement:

Based on our current understanding from researchers, if a user is running on the same local machine as malicious software then the final contents of the GPU program scratch pad memory that is used for temporary storage of data during operation could be viewable by a bad actor. It is important to note there is no exposure to any other part of the system and no user data is compromised.
Since the exploit requires installing malicious software, AMD recommends users follow security best practices, including protecting access to their systems and not downloading unfamiliar software.

Google pointed out to Trail of Bits that some Imagination GPUs are impacted, and that the processor designer released a fix for its holes last month. Additionally, a Google spokesperson gave The Register this statement:

Google is aware of this vulnerability impacting AMD, Apple, and Qualcomm GPUs. Google has released fixes for ChromeOS devices with impacted AMD and Qualcomm GPUs as part of the 120 and 114 releases in the Stable and LTS channels, respectively. ChromeOS device users who are not pinned to a release do not need to take any action. Customers who are pinning to an unsupported version should update to receive the fix.

Nvidia and Arm are not said to be affected. That's the good news here: loads of AI accelerators in the cloud come from Nvidia, so if you're training or running on those, you'll be OK.

Apple, meanwhile, told The Register its M3 and A17 series processors have fixes for the vulnerability, and declined to comment on the boffins' assessment that "the issue still appears to be present on the Apple MacBook Air (M2)."

"Furthermore, the recently released Apple iPhone 15 does not appear to be impacted as previous versions have been," the Trail of Bits team added. A spokesperson for Apple also told us that the iGiant appreciated the researchers' work as it advances the mega-corp's understanding of these types of threats.

Qualcomm has issued a firmware patch, though according to the researchers it only fixes the issue for some devices. The chip goliath did not respond to The Register's inquiries.
Daily Brief Summary
A security vulnerability in GPUs from Apple, Qualcomm, AMD, and possibly Imagination allows unauthorized access to data on shared systems.
The flaw, tracked as CVE-2023-4969 and dubbed LeftoverLocals, lets attackers spy on machine-learning workloads, including language models, by exploiting failures in GPU memory isolation.
Attackers on shared servers can observe and potentially steal sensitive data used by machine-learning applications, with around 5.5 MB leakable per GPU invocation.
The exploit requires access to run code on the shared GPU and is a concern for cloud-based AI systems due to the volume of sensitive data processed.
The Trail of Bits research team has been working with vendors and the CERT Coordination Center on disclosure since September 2023, and mitigations are being rolled out.
AMD is releasing driver updates with mitigations starting in March, Google has patched affected ChromeOS devices, and Apple has fixes in certain processors.
GPUs from Nvidia and Arm are not reported to be affected by this particular security issue, unlike those from Apple, Qualcomm, AMD, and Imagination.