Environmentally-friendly XML


There's a reason you pay for plastic shopping bags. It is to protect the environment. Durable shopping bags can be re-used, and don't pollute our oceans and landfills.

Re-use is a good thing - and not just for the environment. We know that code re-use is important. And that also applies to data. If we have data that is used in many places, we only want to store it in one place and have one source.

That's the principle behind XML external entities (XEE, more commonly abbreviated XXE). Unfortunately, they open up a potential security loophole.

How XML external entities work

Many developers have only learned the basics of XML — enough to write well-formed XML, and no more. So you might not know what an XML entity is (yet).

An external XML entity lets you include the contents of another (XML) file in the current file. Using one takes two steps:

  1. Declare the entity - give it an identifier, and specify its URI.
    Let's declare an entity called xee inside our XML data file:
    <!DOCTYPE root_element [
    <!ENTITY xee SYSTEM "some_URI" >
    ]>
  2. Reference it by using its identifier where you want the contents.
    Now we can reference xee to get its contents:
    <root_element>&xee;</root_element>

Data reuse! Wonderful!

Many XML processors will automatically fetch the external resource and include the contents where specified.
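To see declare-then-reference in action, here is a minimal Python sketch using the standard library's ElementTree parser. To keep it runnable without touching the file system or the network, it assumes an internal entity (the value is written inline) instead of an external one. The document, the entity name company and the element names are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity "company" is declared once in the
# DOCTYPE and referenced twice in the body.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE order [
  <!ENTITY company "Example Traders">
]>
<order>
  <seller>&company;</seller>
  <invoice_to>&company;</invoice_to>
</order>"""

root = ET.fromstring(DOCUMENT)

# The parser replaces each &company; reference with the declared text,
# so both lookups yield the same value.
seller = root.findtext("seller")
invoice_to = root.findtext("invoice_to")
print(seller, "|", invoice_to)
```

An external entity works the same way from the document's point of view. The difference is that the replacement text comes from the resource named by the URI, so the processor has to go and fetch it.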

What's the problem?

There's a reason that XML external entities had their own entry (A4) in the 2017 edition of the OWASP Top 10 list of web application security risks.

The security problem arises when an attacker uses an XEE to obtain the contents of a file he should not have access to.

As an example, let's imagine a basic web service that uses XML to get information on a vehicle, based on its registration number.

  • The client sends a request in XML format asking for the information. One of the XML elements is the registration number.
  • The server sends back the requested information, including the registration number that was part of the request.

Now an attacker makes one small change. He declares an XEE that points to a file on the server - perhaps an operating system file or a data file. Then he references the XEE in the field for the registration number. The XML processor on the server dutifully "pastes" the contents of the file into that field. The server then sends it back - probably along with some message that the registration number can't be found.

Now the attacker has that information! Not so wonderful!
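The scenario can be simulated safely in Python. This is only a sketch under several assumptions: the request format, the registration element, the function naive_lookup and the "file" contents are all invented, and the server's disk is faked with an in-memory dictionary so nothing real is read. It uses the standard library's expat bindings and deliberately installs a handler that resolves external entities, which is exactly what a vulnerable processor does.

```python
import xml.parsers.expat as expat

# Hypothetical stand-in for files on the server's disk (made-up content).
FAKE_SERVER_FILES = {"file:///srv/app/secrets.txt": "db_password=hunter2"}


def naive_lookup(request_xml: str) -> str:
    """Toy vehicle service that resolves external entities (unsafe)."""
    parser = expat.ParserCreate()
    state = {"in_registration": False, "registration": ""}

    def on_start(name, attrs):
        if name == "registration":
            state["in_registration"] = True

    def on_end(name):
        if name == "registration":
            state["in_registration"] = False

    def on_text(data):
        if state["in_registration"]:
            state["registration"] += data

    def resolve_external_entity(context, base, system_id, public_id):
        # The dangerous part: fetch whatever the request asks for and
        # parse it as if it were part of the document.
        if system_id not in FAKE_SERVER_FILES:
            return 0  # tells expat that the entity could not be handled
        child = parser.ExternalEntityParserCreate(context)
        child.CharacterDataHandler = on_text
        child.Parse(FAKE_SERVER_FILES[system_id], True)
        return 1

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.ExternalEntityRefHandler = resolve_external_entity
    parser.Parse(request_xml, True)

    # The service echoes the registration number back to the client.
    return "No vehicle found with registration number " + state["registration"]


NORMAL_REQUEST = "<request><registration>ABC123GP</registration></request>"

ATTACK_REQUEST = """<?xml version="1.0"?>
<!DOCTYPE request [
  <!ENTITY xee SYSTEM "file:///srv/app/secrets.txt">
]>
<request><registration>&xee;</registration></request>"""

normal_response = naive_lookup(NORMAL_REQUEST)
attack_response = naive_lookup(ATTACK_REQUEST)
print(normal_response)
print(attack_response)
```

The normal request gets its own registration number echoed back. The attack request gets the contents of the fake file echoed back instead, because the parser substituted them for the entity reference before the service ever looked at the field.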

Disclaimer: This is just a simple example. I am not saying that this is what happens when you get your car licence renewed at your local post office.

What's the solution?

Systems that parse XML input will often receive data from an untrusted source. Think of a web service, or an XMLHttpRequest using Ajax.

Fortunately this is quite simple to solve. Make sure your XML processor is correctly configured to disable external entity resolution.
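How you do that depends on your XML processor, so check its documentation. As one illustration, here is a Python sketch using the standard library's expat bindings. It takes a stricter route than switching off entity resolution: it refuses any request that contains a DOCTYPE at all, since an entity can't be declared without one. The function read_registration, the exception DoctypeForbidden and the request format are assumptions made up for this example.

```python
import xml.parsers.expat as expat


class DoctypeForbidden(ValueError):
    """Raised when a request carries a DOCTYPE declaration."""


def read_registration(request_xml: str) -> str:
    """Return the text of <registration>, rejecting any DOCTYPE."""
    parser = expat.ParserCreate()
    state = {"inside": False, "text": ""}

    def forbid_doctype(name, system_id, public_id, has_internal_subset):
        # No DOCTYPE means no entity declarations, internal or external.
        raise DoctypeForbidden("DOCTYPE declarations are not accepted")

    def on_start(name, attrs):
        if name == "registration":
            state["inside"] = True

    def on_end(name):
        if name == "registration":
            state["inside"] = False

    def on_text(data):
        if state["inside"]:
            state["text"] += data

    parser.StartDoctypeDeclHandler = forbid_doctype
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(request_xml, True)
    return state["text"]


SAFE_REQUEST = "<request><registration>ABC123GP</registration></request>"

ATTACK_REQUEST = """<?xml version="1.0"?>
<!DOCTYPE request [
  <!ENTITY xee SYSTEM "file:///etc/passwd">
]>
<request><registration>&xee;</registration></request>"""

print(read_registration(SAFE_REQUEST))

try:
    read_registration(ATTACK_REQUEST)
    attack_rejected = False
except DoctypeForbidden:
    attack_rejected = True
print("attack rejected:", attack_rejected)
```

Rejecting the DOCTYPE outright is only an option if your legitimate clients never need one. If they do, you will have to rely on the parser's own setting for external entities instead.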

And, of course, you should always sanitise data input and validate it as far as possible.

You can read more about this in the OWASP XML External Entity article.
