
Secure Coding Technique: Processing XML data, part 1

Extensible Markup Language (XML) is a markup language for encoding documents in a format that is both machine-readable and human-readable. However, this widely used format comes with several security pitfalls. In this first XML-related blog post, I will explain the basics of handling XML documents securely by using a schema.

OWASP divides the vulnerabilities related to XML and XML schemas into two categories.

Malformed XML documents

Malformed XML documents are documents that do not follow the W3C XML specification. Examples include a missing end tag, elements rearranged so that their tags no longer nest properly, or the use of forbidden characters. Each of these must result in a fatal error, and the document should not undergo any further processing.

To avoid vulnerabilities caused by malformed documents, use a well-tested XML parser that follows the W3C specification and does not take significantly longer to process malformed documents than well-formed ones.
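As a minimal sketch of this fail-closed behavior, the Python snippet below uses the standard library's xml.etree.ElementTree and refuses any document the parser reports as malformed. The function name parse_or_reject and the sample documents are illustrative assumptions, not part of any real application. Python's own documentation cautions that its XML modules are not hardened against maliciously constructed data, so read this as an illustration of the principle rather than a parser recommendation.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def parse_or_reject(raw: str) -> Optional[ET.Element]:
    """Return the root element, or None if the document is not well-formed.

    Hypothetical helper: a malformed document is a fatal error, so we do
    not attempt to repair it or process any part of it.
    """
    try:
        return ET.fromstring(raw)
    except ET.ParseError:
        return None


# Illustrative inputs (assumed, based on the error classes named above).
well_formed = "<purchase><id>123</id><price>200</price></purchase>"
missing_end_tag = "<purchase><id>123</id><price>200</price>"
bad_nesting = "<purchase><id>123<price></id>200</price></purchase>"
forbidden_char = "<purchase><id>1 < 2</id></purchase>"

for doc in (well_formed, missing_end_tag, bad_nesting, forbidden_char):
    status = "accepted" if parse_or_reject(doc) is not None else "rejected"
    print(status, doc)
```

Only the first document is accepted; the other three are rejected before any application logic sees them.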

Invalid XML documents

Invalid XML documents are well-formed but contain unexpected values. An attacker can take advantage of applications that do not properly define an XML schema for checking whether documents are valid. Below is a simple example of a document that, if not validated correctly, might have unintended consequences.

Consider a web store that stores its transactions as XML:

<purchase>
  <id>123</id>
  <price>200</price>
</purchase>

Suppose the user controls only the value of the <id> element. Without the right countermeasures, an attacker can supply a value such as 123</id><price>0</price><id>, which produces the following document:

<purchase>
  <id>123</id>
  <price>0</price>
  <id></id>
  <price>200</price>
</purchase>

If the parser that processes this document reads only the first instance of the <id> and <price> elements, the attacker's price of 0 is used instead of the intended 200.
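To make the first-instance problem concrete, here is a small Python sketch using the standard library's xml.etree.ElementTree. The document string is the tampered purchase from above, written as one well-formed document; the variable names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# The tampered purchase: a second <id>/<price> pair was injected
# through the user-controlled id value.
tampered = (
    "<purchase>"
    "<id>123</id><price>0</price>"
    "<id></id><price>200</price>"
    "</purchase>"
)

root = ET.fromstring(tampered)

# findtext() returns the text of the FIRST matching child only,
# so the injected price wins over the legitimate one.
first_price = root.findtext("price")
all_prices = [p.text for p in root.findall("price")]

print(first_price)  # 0
print(all_prices)   # ['0', '200']
```

The document is perfectly well-formed, so the parser raises no error. Nothing flags the duplicate elements unless the application checks the structure itself or validates against a schema.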

It is also possible that the schema is not restrictive enough, or that other input validation is insufficient, so that negative numbers, special values (such as NaN or Infinity) or excessively large values are accepted where they are not expected, leading to similar unintended behavior.

To avoid vulnerabilities related to invalid XML documents, define a precise and restrictive XML schema and validate every document against it before processing.
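A sketch of what a restrictive numeric check could look like in Python is shown below. The parse_price function and the MAX_PRICE limit are hypothetical; a real application would pick bounds that match its own business rules.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical upper bound, chosen only for this illustration.
MAX_PRICE = Decimal("100000")


def parse_price(text: str) -> Decimal:
    """Parse a price and reject anything outside the expected range."""
    try:
        value = Decimal(text)
    except InvalidOperation:
        raise ValueError(f"not a number: {text!r}") from None
    # Decimal happily parses "NaN" and "Infinity", so rule them out
    # before doing any range comparison.
    if not value.is_finite():
        raise ValueError(f"special value not allowed: {text!r}")
    if value < 0 or value > MAX_PRICE:
        raise ValueError(f"price out of range: {text!r}")
    return value


for candidate in ["200", "-1", "NaN", "Infinity", "1e400"]:
    try:
        print(candidate, "->", parse_price(candidate))
    except ValueError as err:
        print(candidate, "-> rejected:", err)
```

Only "200" passes. The finiteness check comes first on purpose, because ordering comparisons against NaN are themselves an error for Decimal.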
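Python's standard library does not ship an XML Schema validator, so the sketch below is a hand-written stand-in for one, not schema validation itself. It enforces the kind of rules a restrictive schema for the purchase example would express: exactly one <id> followed by exactly one <price>, no attributes or nested elements, a digits-only id, and a bounded non-negative price. The names validate_purchase, EXPECTED_CHILDREN and MAX_PRICE are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

EXPECTED_CHILDREN = ["id", "price"]  # assumed layout of <purchase>
MAX_PRICE = Decimal("100000")        # hypothetical business limit


def validate_purchase(raw: str) -> dict:
    """Parse a purchase document and reject anything unexpected."""
    root = ET.fromstring(raw)  # raises ET.ParseError if malformed

    if root.tag != "purchase" or root.attrib:
        raise ValueError("unexpected root element")

    # Exactly the expected children, once each, in the expected order.
    if [child.tag for child in root] != EXPECTED_CHILDREN:
        raise ValueError("unexpected element structure")

    for child in root:
        if child.attrib or len(child):
            raise ValueError(f"<{child.tag}> must contain text only")

    id_el, price_el = list(root)

    id_text = id_el.text or ""
    if not (id_text.isascii() and id_text.isdigit()):
        raise ValueError("id must consist of digits only")

    try:
        price = Decimal(price_el.text or "")
    except InvalidOperation:
        raise ValueError("price is not a number") from None
    if not price.is_finite() or price < 0 or price > MAX_PRICE:
        raise ValueError("price out of range")

    return {"id": int(id_text), "price": price}


good = "<purchase><id>123</id><price>200</price></purchase>"
tampered = (
    "<purchase><id>123</id><price>0</price>"
    "<id></id><price>200</price></purchase>"
)

print(validate_purchase(good))
try:
    validate_purchase(tampered)
except ValueError as err:
    print("rejected:", err)
```

With this check in place, the tampered document is rejected because of its duplicate elements, no matter which instance a later processing step would have read.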

In the next blog post, we will look at some more advanced attacks on XML documents, such as jumbo payloads and the feared number four of the OWASP Top Ten (2017), XXE.

In the meantime, you can hone or challenge your XML input validation skills on our portal.

Specifications for XML and XML schemas include multiple security flaws. At the same time, these specifications provide the tools required to protect XML applications. Even though we use XML schemas to define the security of XML documents, they can be used to perform a variety of attacks: file retrieval, server side request forgery, port scanning, or brute forcing.

https://www.owasp.org/index.php/XML_Security_Cheat_Sheet




Author

Application Security Researcher - R&D Engineer - PhD Candidate


