Coders Conquer Security OWASP Top 10 API Series - Excessive Data Exposure

Published Sep 23, 2020
by Matias Madou, Ph.D.

The excessive data exposure vulnerability is distinct from other API problems on the OWASP list in that it involves a very specific kind of data. The mechanics behind the vulnerability are similar to others, but excessive data exposure, in this case, is defined as involving legally protected or highly sensitive data. This can include personally identifiable information (PII) or payment card industry (PCI) information. It can also include any information that is subject to privacy laws, such as the General Data Protection Regulation (GDPR) in Europe or the Health Insurance Portability and Accountability Act (HIPAA) in the United States.

As you might imagine, this is cause for deep concern, and it's imperative that savvy developers learn how to squash these bugs wherever possible.

What are some examples of excessive data exposure?

One of the primary reasons excessive data exposure happens is that developers don't have enough insight into the kind of data their applications will be handling. As a result, they tend to rely on generic processes that expose all object properties to end users.

Developers also sometimes assume that frontend components will perform data filtering before displaying any information to users. For most generic data, this is rarely a problem. But exposing legally protected or sensitive data to users as part of a session ID, for example, can lead to big problems from both a security and a legal standpoint.

As an example of how easily sensitive data can be accidentally shared, the OWASP report envisions a scenario where a security guard is given access to specific IoT-based cameras in a facility. Perhaps those cameras are watching over sealed and secure areas, while other cameras that view people are supposed to be restricted to guards or supervisors with higher permissions.

To give the guard access to authorized cameras, developers can use an API call like the following:

/api/sites/111/cameras

In response, the app would send details about the cameras that the guard is able to see in the following format:

{ "id":"xxx","live_access_token":"xxxxbbbbb","building_id":"yyy"}

On the surface, this would appear to work just fine. The guard, who is using the graphical user interface on the app, would only see the camera feeds that they are authorized to view. The problem is that because of the generic code used, the actual API response would contain a full list of all cameras throughout the facility. Anyone sniffing the network who captures that data, or compromises the guard's account, would be able to discover the locations and nomenclature for every camera on the network. They could then access that data without restriction.
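
The gap between what the interface shows and what the API returns can be illustrated with a short Python sketch. Everything here is hypothetical and not taken from the OWASP report beyond the three field names: the Camera class, the restricted flag, the sample data, and both function names are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Camera:
    id: str
    live_access_token: str
    building_id: str
    restricted: bool  # hypothetical flag: True means higher permissions are required


# Hypothetical in-memory stand-in for the site's camera store.
CAMERAS = [
    Camera("cam-1", "token-1", "bldg-a", restricted=False),
    Camera("cam-2", "token-2", "bldg-a", restricted=True),
]


def list_cameras_generic() -> str:
    """Vulnerable pattern: serialize every camera and leave filtering to the client."""
    return json.dumps([asdict(camera) for camera in CAMERAS])


def cameras_shown_in_ui(response_body: str) -> List[str]:
    """What the guard's interface displays after client-side filtering."""
    return [c["id"] for c in json.loads(response_body) if not c["restricted"]]


body = list_cameras_generic()
visible_ids = cameras_shown_in_ui(body)
# The raw response still carries the token for the restricted camera.
leaked_tokens = [c["live_access_token"] for c in json.loads(body)]
```

In this sketch the interface lists only the unrestricted camera, yet the response body contains the access token for every camera, which is exactly what an attacker reading the raw traffic would collect.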

Eliminating Excessive Data Exposure

The biggest key to preventing excessive data exposure is an understanding of the data and the protections surrounding it. Creating generic APIs and leaving it up to the client to filter data before displaying it to users is a dangerous choice that leads to many preventable security breaches.

In addition to understanding the relevant data protections, it's also important to stop sending everything to a user through generic APIs. For example, generic serialization methods such as to_json() and to_string() should be avoided. Instead, the code should specifically pick the properties that need to be returned to authorized users and send only that information.
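
One way to pick properties explicitly is an allowlist applied on the server, combined with a server-side authorization filter. The following is a minimal Python sketch under stated assumptions: the Camera class, the min_clearance field, the PUBLIC_FIELDS tuple, and the function names are all hypothetical and chosen only to illustrate the idea.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Camera:
    id: str
    live_access_token: str
    building_id: str
    min_clearance: int  # hypothetical internal field, never meant for clients


# Explicit allowlist: only these properties may leave the server.
PUBLIC_FIELDS = ("id", "live_access_token", "building_id")

# Hypothetical sample data.
CAMERAS = [
    Camera("cam-1", "token-1", "bldg-a", min_clearance=1),
    Camera("cam-2", "token-2", "bldg-a", min_clearance=3),
]


def serialize_camera(camera: Camera) -> Dict[str, str]:
    """Copy only the allowlisted properties instead of dumping the whole object."""
    return {name: getattr(camera, name) for name in PUBLIC_FIELDS}


def list_cameras_for(clearance: int, cameras: List[Camera]) -> str:
    """Filter on the server by authorization, then serialize the allowed fields."""
    allowed = [c for c in cameras if c.min_clearance <= clearance]
    return json.dumps([serialize_camera(c) for c in allowed])


guard_response = list_cameras_for(1, CAMERAS)
```

With this approach, a new internal field added to the object later is not exposed by default, because it has to be added to the allowlist deliberately.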

To ensure that no protected data is accidentally overshared, organizations should consider implementing a schema-based response validation mechanism as an extra layer of security. It should define and enforce the data returned by all API methods, including error responses.

Finally, all data classified as containing PII or PCI, or information that is protected by regulations such as GDPR or HIPAA, should be protected using strong encryption. That way, even if the location of that data slips out as part of an excessive data exposure vulnerability, there is a good secondary line of defense in place that should protect the data even if it lands in the hands of a malicious user or threat actor.
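
A response validation layer of this kind could look roughly like the Python sketch below. It is a hand-rolled stand-in that uses only the standard library. A real deployment would more likely rely on an established JSON Schema or OpenAPI validator. The schema dictionaries, the ResponseSchemaError class, and the validate_response function are hypothetical names introduced for illustration.

```python
from typing import Any, Dict

# Hypothetical schemas: the exact set of keys and types each response may contain.
CAMERA_SCHEMA = {"id": str, "live_access_token": str, "building_id": str}
ERROR_SCHEMA = {"code": str, "message": str}


class ResponseSchemaError(Exception):
    """Raised when an outgoing payload does not match its declared schema."""


def validate_response(payload: Dict[str, Any], schema: Dict[str, type]) -> Dict[str, Any]:
    """Reject payloads with unexpected, missing, or wrongly typed properties."""
    unexpected = set(payload) - set(schema)
    missing = set(schema) - set(payload)
    if unexpected or missing:
        raise ResponseSchemaError(
            f"unexpected={sorted(unexpected)} missing={sorted(missing)}"
        )
    for key, expected_type in schema.items():
        if not isinstance(payload[key], expected_type):
            raise ResponseSchemaError(f"{key} must be of type {expected_type.__name__}")
    return payload


# A conforming payload passes through unchanged.
ok = validate_response(
    {"id": "cam-1", "live_access_token": "token-1", "building_id": "bldg-a"},
    CAMERA_SCHEMA,
)
```

Calling validate_response just before a response is sent means that a payload carrying an extra property, such as an error body that includes a stack trace, raises ResponseSchemaError instead of reaching the client.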

Check out the Secure Code Warrior blog pages for more insight about this vulnerability and how to protect your organization and customers from the ravages of other security flaws. You can also try a demo of the Secure Code Warrior training platform to keep all your cybersecurity skills honed and up-to-date.


Author

Matias Madou, Ph.D.

Matias is a researcher and developer with more than 15 years of hands-on software security experience. He has developed solutions for companies such as Fortify Software and his own company Sensei Security. Over his career, Matias has led multiple application security research projects that have led to commercial products, and he holds more than 10 patents. When he is away from his desk, Matias serves as an instructor for advanced application security training courses and regularly speaks at global conferences including RSA Conference, Black Hat, DefCon, BSIMM, OWASP AppSec and BruCon.

Matias holds a Ph.D. in Computer Engineering from Ghent University, where he studied application security through program obfuscation to hide the inner workings of an application.

