Lessons from the Recent PyTorch Supply Chain Attack

January 23, 2024
A recent attack by ethical hackers on PyTorch, a popular machine learning library, is a stark reminder of how important it is to secure software supply chains, particularly in widely used open-source ecosystems like Python and JavaScript.

Introduction

With the advent of GenAI, there has never been a more exciting time to work with Open Source Software. However, software supply chain attacks have emerged as a formidable threat in the Artificial Intelligence and Machine Learning space, capable of compromising entire systems through seemingly innocuous channels. 

PyTorch Supply Chain Attack

Beginning in August 2023, PyTorch was subjected to a new class of CI/CD attack, in which ethical hackers infiltrated the library's GitHub repository by injecting malicious code via improperly secured self-hosted GitHub Actions runners. From that foothold, the attackers had the ability to upload malicious PyTorch releases to GitHub, upload releases to AWS, potentially add code to the main repository branch, backdoor PyTorch dependencies, and more.

With that level of access, the attackers were able to cover their tracks by deleting the logs showing their presence in PyTorch systems. Although their write-up did not demonstrate it, they could also have included a backdoor in PyTorch releases to gain access to the systems of this popular library's users.
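To make the runner risk more concrete, here is a minimal sketch of a check a team might run over its own workflow files. It is a text heuristic, not a reconstruction of the researchers' technique: it assumes that a workflow which both mentions a self-hosted runner label and is triggered by pull requests deserves manual review. The function name `flag_risky_workflow` and the sample workflow snippets are hypothetical.

```python
import re

# Heuristic patterns (assumptions for illustration, not a complete YAML parse).
SELF_HOSTED = re.compile(r"\bself-hosted\b")
PR_TRIGGER = re.compile(r"\bpull_request(_target)?\b")


def flag_risky_workflow(workflow_text: str) -> bool:
    """Return True if a workflow mentions a self-hosted runner and is
    triggered by pull requests. Hypothetical helper; review hits by hand."""
    # Drop everything after '#' so that comments do not trigger a match.
    body = "\n".join(line.split("#", 1)[0] for line in workflow_text.splitlines())
    uses_self_hosted = SELF_HOSTED.search(body) is not None
    pr_triggered = PR_TRIGGER.search(body) is not None
    return uses_self_hosted and pr_triggered


if __name__ == "__main__":
    # Hypothetical workflow content, inlined so the sketch is self-contained.
    sample = (
        "on: pull_request\n"
        "jobs:\n"
        "  build:\n"
        "    runs-on: [self-hosted, linux]\n"
    )
    print(flag_risky_workflow(sample))
```

A real audit would parse the YAML and also inspect the repository's approval settings for workflows from forks; the sketch only narrows down which files to look at first.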

For more in-depth analysis, we encourage readers to review the detailed account at John Stawinski's blog. It’s a fascinating read! 

Potential Impact

The potential impact of such attacks is significant, especially given PyTorch's extensive use in AI and machine learning applications. 

As stated in the original article, however, “the issues surrounding these attack paths are not unique to PyTorch. They’re not unique to ML repositories or even to GitHub. We’ve repeatedly demonstrated supply chain weaknesses by exploiting CI/CD vulnerabilities.” For example, previous research showed an attack affecting the largest CI/CD service on the market itself: GitHub Actions.

These incidents serve as another wake-up call regarding the security of dependencies in open-source projects. They highlight the need for more rigorous security protocols in dependency management and offer a case study in how vulnerable software supply chains are without proactive security measures.

Understanding Supply Chain Attacks - Beyond the Basics

Supply chain attacks in software occur when a trusted component, like a third-party library or update mechanism, is compromised to deliver malicious code. In open-source ecosystems, such as Python's, the communal trust and shared responsibility model compound the risk. Attackers exploit this trust, embedding malware in commonly used packages or libraries. 
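One common defense against a tampered package is to compare a downloaded artifact against a digest recorded when it was last reviewed. The sketch below illustrates the idea with the standard library only; the helper names `sha256_of` and `verify_artifact` and the sample bytes are hypothetical, and it assumes the pinned digest was obtained from a trusted source.

```python
import hashlib
import hmac


def sha256_of(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_artifact(data: bytes, expected_sha256: str) -> bool:
    """Return True only if the artifact matches the pinned digest.
    Hypothetical helper; the pinned digest must come from a trusted source."""
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sha256_of(data), expected_sha256.lower())


if __name__ == "__main__":
    # Stand-in for a downloaded package file (hypothetical content).
    artifact = b"example-package-1.0.0"
    pinned = sha256_of(artifact)  # in practice, recorded at review time
    print(verify_artifact(artifact, pinned))
    print(verify_artifact(artifact + b" tampered", pinned))
```

pip offers a hash-checking mode (`--require-hashes`) that applies the same principle to entries in a requirements file, so in practice this check rarely needs to be written by hand.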

As discussed in our article, Introduction to Software Supply Chains for Python Developers, a deeper understanding of this threat landscape is crucial for developers.

The technical fallout from a supply chain attack can be devastating. Beyond the immediate data breach or system compromise, these attacks can silently alter codebases, introduce backdoors, and hijack APIs. The interconnected nature of modern software means that a single compromised component can cascade, affecting multiple systems and applications.
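The cascade effect can be illustrated with a small dependency graph. The sketch below assumes a hypothetical mapping from each package to its direct dependencies and walks the graph in reverse to find everything that directly or transitively depends on a compromised component. The package names and the function `affected_by` are invented for illustration.

```python
from collections import deque


def affected_by(compromised: str, depends_on: dict[str, set[str]]) -> set[str]:
    """Return every package that directly or transitively depends on
    `compromised`. `depends_on` maps a package to its direct dependencies."""
    # Invert the edges: for each dependency, record who depends on it.
    dependents: dict[str, set[str]] = {}
    for package, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(package)

    # Breadth-first walk upward from the compromised package.
    affected: set[str] = set()
    queue = deque([compromised])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    affected.discard(compromised)  # only relevant if the graph has a cycle
    return affected


if __name__ == "__main__":
    # Hypothetical dependency graph.
    graph = {
        "app": {"web-framework", "ml-lib"},
        "web-framework": {"http-lib"},
        "ml-lib": {"http-lib"},
        "cli-tool": {"yaml-lib"},
    }
    print(sorted(affected_by("http-lib", graph)))
```

In this toy graph, compromising the low-level `http-lib` reaches three other packages, while `cli-tool` is untouched because it shares no dependency path with it.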

The Role of Proactive Security Measures - Technical Strategies

Proactive security in software development involves continuous vigilance - monitoring dependencies, auditing code, and integrating automated security testing into the development lifecycle. Our previous blog posts provide practical examples of these strategies at work.
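As a small example of dependency monitoring, the sketch below lists entries in a requirements file that are not pinned to an exact version, since floating versions can silently pull in a newly published release. It is a simplified heuristic: it assumes a plain `requirements.txt` style, skips option lines such as `-r`, and treats anything without `==` as unpinned. The function name `unpinned_requirements` and the sample file content are hypothetical.

```python
import re

# A name, optional extras in brackets, then an exact '==' version (heuristic).
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[^\s;]+")


def unpinned_requirements(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned with '=='.
    Hypothetical helper; it does not cover every requirement syntax."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    # Hypothetical requirements file content.
    sample = (
        "requests==2.31.0\n"
        "numpy>=1.24  # floating lower bound\n"
        "torch\n"
        "-r base.txt\n"
        "uvicorn[standard]==0.27.0\n"
    )
    for entry in unpinned_requirements(sample):
        print(entry)
```

A check like this could run in CI alongside a vulnerability scanner, so that both unpinned and known-vulnerable dependencies are surfaced before a release.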

How Safety CLI 3 Addresses Supply Chain Security

Safety CLI 3 is a Python dependency vulnerability scanner engineered to integrate seamlessly into existing Python development workflows and enable the secure use of Python packages. Our team of cybersecurity analysts uses AI and manual analysis to identify novel vulnerabilities and confirm the efficacy of fixes applied to known vulnerabilities. Drawing on the industry’s most comprehensive vulnerability data, Safety CLI offers unparalleled protection against known vulnerabilities and malicious packages.

Conclusion

The PyTorch supply chain attack serves as a critical lesson in the importance of securing software supply chains. In an age where dependencies are deeply intertwined, a proactive stance on security is not just recommended but essential. Safety CLI offers a powerful solution to identify vulnerabilities at every stage of the software development lifecycle, enabling organizations to safeguard their software supply chains.

We invite you to explore more about Safety CLI and how it can enhance your organization's cybersecurity posture on our website and documentation.

Additional Resources

Safety CLI Website

Safety CLI Documentation

Original Source Article

John Stawinski's blog
