
Secure Coding Technique: Processing XML data, part 1

Extensible Markup Language (XML) is a markup language for encoding documents in a format that is both human-readable and easy for machines to process. However, this widely used format includes multiple security flaws. In this first XML-related blog post, I will explain the basics of handling XML documents securely by using a schema.

OWASP divides the vulnerabilities related to XML and XML schemas into two categories.

Malformed XML documents

Malformed XML documents are documents that do not follow the W3C XML specification. Examples include removing an end tag, rearranging elements so that they no longer nest correctly, or using forbidden characters. Each of these should cause a fatal error, and the document should not undergo any further processing.

To avoid vulnerabilities caused by malformed documents, use a well-tested XML parser that follows the W3C specification and does not take significantly longer to process malformed documents than well-formed ones.
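As a minimal sketch of the "fatal error, no further processing" rule, the snippet below uses Python's standard-library xml.etree.ElementTree parser. The helper name parse_or_reject and the sample documents are assumptions made for illustration, and the sketch only covers well-formedness; it does not harden the parser against other XML attacks.

```python
import xml.etree.ElementTree as ET


def parse_or_reject(xml_text: str):
    """Return the root element, or None if the document is not well formed."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError:
        # A well-formedness error is fatal: stop here and never try to
        # repair or partially process the document.
        return None


# Hypothetical sample documents, one per kind of error named above.
well_formed = "<purchase><id>123</id><price>200</price></purchase>"
missing_end_tag = "<purchase><id>123<price>200</price></purchase>"
bad_nesting = "<purchase><id>123<price></id>200</price></purchase>"
forbidden_char = "<purchase><id>1 < 2</id></purchase>"

for doc in (well_formed, missing_end_tag, bad_nesting, forbidden_char):
    status = "accepted" if parse_or_reject(doc) is not None else "rejected"
    print(status, doc)
```

Only the first document is accepted; the other three raise a ParseError inside the helper and are rejected as a whole.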

Invalid XML documents

Invalid XML documents are well formed but contain unexpected values. Here an attacker may take advantage of applications that do not define an XML schema precisely enough to tell valid documents from invalid ones. Below you can find a simple example of a document that, if not validated correctly, might have unintended consequences.

Consider a web store that stores its transactions as XML data:

<purchase>
  <id>123</id>
  <price>200</price>
</purchase>

The user only has control over the <id> value. Without the right countermeasures, an attacker can then submit 123</id><price>0</price><id> as the id, which produces a document like this:

<purchase>
  <id>123</id>
  <price>0</price>
  <id></id>
  <price>200</price>
</purchase>

If the parser that processes this document only reads the first instance of the <id> and <price> tags, the purchase is recorded with a price of 0.
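To make the problem concrete, here is a small sketch in Python using the standard-library xml.etree.ElementTree module. The function read_purchase and the constant EXPECTED_CHILDREN are hypothetical names; the strict check is a hand-written stand-in for what a schema that allows exactly one <id> and one <price> would enforce.

```python
import xml.etree.ElementTree as ET

# Assumed layout of <purchase>: exactly one <id> followed by one <price>.
EXPECTED_CHILDREN = ("id", "price")

legit = "<purchase><id>123</id><price>200</price></purchase>"
injected = (
    "<purchase><id>123</id><price>0</price>"
    "<id></id><price>200</price></purchase>"
)


def read_purchase(xml_text: str) -> dict:
    """Parse a <purchase> document and reject any unexpected structure."""
    root = ET.fromstring(xml_text)
    if root.tag != "purchase":
        raise ValueError("unexpected root element")
    tags = tuple(child.tag for child in root)
    if tags != EXPECTED_CHILDREN:
        # Duplicated, missing, or reordered elements all end up here.
        raise ValueError(f"unexpected element sequence: {tags}")
    return {child.tag: (child.text or "") for child in root}


# A naive lookup returns the first <price>, which the attacker controls.
naive_price = ET.fromstring(injected).find("price").text
print("naive lookup reads price:", naive_price)

print(read_purchase(legit))
try:
    read_purchase(injected)
except ValueError as err:
    print("rejected:", err)
```

The naive lookup reads the attacker's price of 0, while the strict reader refuses the document because its element sequence does not match the expected one.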


The schema may also be too permissive, or other input validation may be insufficient, so that negative numbers, special floating-point values (such as NaN or Infinity), or excessively large values are accepted where they are not expected, leading to similar unintended behavior.

To avoid vulnerabilities related to invalid XML documents, define a precise and restrictive XML schema and validate every document against it.
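The kind of restriction meant here can be sketched in plain Python. In an XML Schema the same limits would typically be expressed with an xs:decimal type and facets such as minInclusive, maxInclusive, and fractionDigits. The function parse_price, the pattern, and the upper bound MAX_PRICE are assumptions chosen for illustration, not values from the web store example.

```python
import re
from decimal import Decimal

# Hypothetical limits: up to 7 integer digits, at most 2 decimal places,
# and an upper bound of 100000.
PRICE_PATTERN = re.compile(r"[0-9]{1,7}(\.[0-9]{1,2})?")
MAX_PRICE = Decimal("100000")


def parse_price(text: str) -> Decimal:
    """Accept only plain, non-negative decimal prices within range."""
    # fullmatch rejects signs, exponents, NaN, Infinity, and empty input
    # before the text is ever converted to a number.
    if PRICE_PATTERN.fullmatch(text) is None:
        raise ValueError("price is not a plain decimal number")
    value = Decimal(text)
    if value > MAX_PRICE:
        raise ValueError("price is out of range")
    return value


for candidate in ("200", "19.99", "-1", "NaN", "Infinity", "1e9", "9999999"):
    try:
        print(candidate, "->", parse_price(candidate))
    except ValueError as err:
        print(candidate, "-> rejected:", err)
```

Checking the textual form first keeps the accepted value space small and explicit, which is the same idea as a restrictive schema type.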

In the next blog post, we will look at some more advanced attacks on XML documents, such as Jumbo Payloads and the feared number four of the OWASP Top Ten, XXE.

In the meantime you can hone or challenge your skills on XML input validation on our portal.

Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing.

https://www.owasp.org/index.php/XML_Security_Cheat_Sheet


Author

Application Security Researcher - R&D Engineer - PhD Candidate
