
Understand the path traversal bug in Python’s tarfile module

Recently, a team of security researchers announced their findings on a fifteen-year-old bug in Python’s tar file extraction functionality. The vulnerability was first disclosed in 2007 and is tracked as CVE-2007-4559. A note was added to the official Python documentation, but the bug itself was left unpatched.

This vulnerability could impact thousands of software projects, yet many people are unfamiliar with the situation or how to handle it. That’s why, here at Secure Code Warrior, we’re giving you the opportunity to simulate exploiting this vulnerability yourself. You can see the impact first-hand and get hands-on experience with the mechanics of this persistent bug, so you can better protect your application!

Try the simulated Mission now.

The vulnerability: path traversal during tar file extraction

Path or directory traversal happens when unsanitized user input is used to construct a file path, allowing an attacker to gain access to and overwrite files, and even execute arbitrary code.

The vulnerability exists in Python’s tarfile module. A tar (tape archive) file is a single file, called an archive, that packages together multiple files along with their metadata. It is usually recognized by the .tar extension, or by .tar.gz or .tgz when the archive is compressed with gzip. Each member in the archive can be represented by a TarInfo object, which contains metadata such as the file name, modification time, ownership, and more.
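To make the TarInfo metadata concrete, here is a minimal sketch that builds a small archive in memory and reads its members back. The member name, timestamp, and user name are made up for illustration and do not come from any real archive.

```python
import io
import tarfile

# Build a one-member archive in memory (all values are illustrative).
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    data = b"hello"
    info = tarfile.TarInfo(name="docs/readme.txt")
    info.size = len(data)          # size must match the payload
    info.mtime = 1_600_000_000     # modification time (seconds since epoch)
    info.uname = "alice"           # owner name stored in the header
    tar.addfile(info, io.BytesIO(data))

# Read the archive back and collect the metadata of every member.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r") as tar:
    members = [(m.name, m.size, m.mtime, m.uname) for m in tar.getmembers()]

print(members)
```

Note that the file name is just another metadata field chosen by whoever created the archive, which is what makes it untrusted input.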

The risk arises from the archive’s ability to be extracted again.

When a member is extracted, it needs a path to be written to. This location is created by joining the base path with the member’s file name.
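The join described above can be sketched as follows. This is an illustration rather than the module’s actual source: the helper name member_destination and the example paths are hypothetical, and posixpath is used so the result is the same on every platform.

```python
import posixpath

def member_destination(base_path, member_name):
    # Hypothetical helper: join the extraction directory with the
    # member name exactly as given, with no validation.
    return posixpath.join(base_path, member_name)

benign = member_destination("/srv/uploads", "report.txt")
traversal = member_destination("/srv/uploads", "../../etc/cron.d/job")
absolute = member_destination("/srv/uploads", "/etc/passwd")

# Normalizing shows where the write would really end up.
print(benign)                         # /srv/uploads/report.txt
print(posixpath.normpath(traversal))  # /etc/cron.d/job
print(absolute)                       # /etc/passwd (an absolute name discards the base)
```

A plain join is enough for well-behaved names, but it offers no protection once the name contains ../ or starts with a slash.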

Once this path is created, the TarFile.extract or TarFile.extractall method uses it to perform the extraction.
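In application code, the extraction step typically looks like the sketch below. The helper names build_archive and extract_unchecked, the archive contents, and the temporary directory are assumptions made for this example. The archive here is benign, so nothing harmful happens, but notice that the member names are never checked.

```python
import io
import os
import tarfile
import tempfile

def build_archive(archive_path, members):
    """Write a small tar archive from a {name: bytes} mapping (demo helper)."""
    with tarfile.open(archive_path, "w") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))

def extract_unchecked(archive_path, dest):
    # The pattern under discussion: every member name is trusted as-is.
    with tarfile.open(archive_path) as tar:
        tar.extractall(path=dest)

workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "demo.tar")
dest = os.path.join(workdir, "out")
os.makedirs(dest)

build_archive(archive, {"docs/readme.txt": b"hello"})
extract_unchecked(archive, dest)

extracted = os.path.join(dest, "docs", "readme.txt")
print(os.path.isfile(extracted))
```

Recent Python versions may emit a DeprecationWarning for this call because no extraction filter is given.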

The issue here is the lack of sanitization of the file name. An attacker could craft an archive whose member names include path traversal sequences, such as dot-dot-slash (../), which would cause those files to be written outside the directory they were meant to be in and overwrite arbitrary files. This could eventually lead to remote code execution.
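The following sketch shows what such a crafted archive looks like from the inside, without extracting anything. The function names and the member names are hypothetical. It builds an archive in memory and lists the names that would escape the target directory.

```python
import io
import tarfile

def make_archive_bytes(names):
    """Build an in-memory tar archive with one tiny file per name (demo helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            data = b"demo"
            info = tarfile.TarInfo(name=name)  # the name is stored as given
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def suspicious_names(archive_bytes):
    """Return member names that are absolute or contain a '..' component."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as tar:
        return [
            name for name in tar.getnames()
            if name.startswith("/") or ".." in name.split("/")
        ]

payload = make_archive_bytes(["notes.txt", "../../tmp/evil.sh"])
flagged = suspicious_names(payload)
print(flagged)
```

Listing names this way is only a quick inspection aid; it does not account for links inside the archive.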

The same class of vulnerability appears in other scenarios, if you know how to identify it. In addition to Python’s handling of tar files, it exists in the extraction of zip files. You may be familiar with it under another name, the zip slip vulnerability, which has manifested itself in languages other than Python!


How can you mitigate risk?

Despite the vulnerability being known for years, the Python maintainers consider the extraction functionality to be doing what it’s supposed to do. In this case, some may say “it’s a feature, not a bug.” Unfortunately, developers can’t always avoid extracting tar or zip files from an unknown source. It’s up to them to sanitize the untrusted input to prevent path traversal vulnerabilities as part of secure development practices.
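One way to sanitize the input is to check every member before extracting anything. The sketch below is one possible approach under stated assumptions, not an official fix: the names is_within_directory and safe_extract are hypothetical, and rejecting all links is a deliberately conservative choice for this example. Python 3.12 and later also accept a filter argument on extractall (for example filter="data"), which is worth considering where it is available.

```python
import io
import os
import tarfile
import tempfile

def is_within_directory(base, target):
    """True if target resolves to a location inside base."""
    base = os.path.realpath(base)
    target = os.path.realpath(target)
    return os.path.commonpath([base, target]) == base

def safe_extract(archive_path, dest):
    """Extract only if every member stays inside dest (illustrative helper)."""
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            target = os.path.join(dest, member.name)
            if not is_within_directory(dest, target):
                raise ValueError(f"blocked path traversal: {member.name!r}")
            if member.issym() or member.islnk():
                # Conservative assumption: refuse links entirely.
                raise ValueError(f"blocked link member: {member.name!r}")
        tar.extractall(path=dest)

def _write_archive(path, name):
    # Demo helper: one small file stored under the given member name.
    with tarfile.open(path, "w") as tar:
        data = b"demo"
        info = tarfile.TarInfo(name=name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))

workdir = tempfile.mkdtemp()
dest = os.path.join(workdir, "out")
os.makedirs(dest)
good = os.path.join(workdir, "good.tar")
bad = os.path.join(workdir, "bad.tar")
_write_archive(good, "docs/readme.txt")
_write_archive(bad, "../escaped.txt")

safe_extract(good, dest)      # benign archive extracts normally
try:
    safe_extract(bad, dest)   # crafted archive is rejected before any write
    blocked = False
except ValueError:
    blocked = True
print(blocked)
```

Validating all members first and extracting afterwards means a rejected archive leaves nothing behind on disk.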

Want to learn more about how to write secure code and mitigate risk with Python?

Try out our Python challenge for free.

If you’re interested in getting more free coding guidelines, check out Secure Code Coach to help you stay on top of secure coding practices.

Author

Laura Verheyde is a software developer at Secure Code Warrior focused on researching vulnerabilities and creating content for Missions and Coding labs.
