Server Side Request Forgery (SSRF)

This is a blog post summarising a few notes I’ve gathered around the internet, with the purpose of cementing them in my mind rather than adding anything new or attempting to broadcast them to a wider crowd. If you find it useful, that’s great, but it’s nothing original and it’s been pulled from several sources noted at the end of the post.

The post is organised in several categories: general attacks against URL parsers and implementations, and specific attacks against parsers in specific languages, with the intention of highlighting differences in interpretation of URL strings.

Discrepancies between parsers and HTTP libraries


This is interesting. These three HTTP libraries in Python all interpret the same URL differently:

import urllib
import urllib2
import requests
from urlparse import urlparse  # Python 2 module; needed for the urlparse() call below

url = " &@ @"

urllib.urlopen(url) #
urllib2.urlopen(url) #
requests.get(url) #

urlparse(url) # ParseResult(scheme='http', netloc=' &@', path='', params='', query='', fragment=' @')


In older PHP versions, there was an inconsistency between parse_url and readfile which could lead to a URL parser bypass, either with a URL containing several colons or with a URL containing extraneous characters. In both cases parse_url and readfile interpreted the URL differently; in the second case, the two functions disagree on which part of the URL is the host.

This may affect other programming languages.


cURL is in widespread use, and there are cURL bindings for practically every language under the sun. Discrepancies between a language's URL parser and cURL could lead to SSRF. Consider the following URL, as interpreted by PHP:

# php
<?php print_r(parse_url("http://foo@"));
[scheme] => http
[host] =>
[user] => foo@
[path] => /

And as interpreted by cURL:

curl 'http://foo@'
curl: (7) Failed to connect to port 80: Connection refused

As you can see, cURL attempts to retrieve a different host than the one parse_url reported. This example uses PHP, but these discrepancies have been identified in other languages as well and more are bound to exist. Example vulnerabilities have been found in WordPress, vBulletin and MyBB utilising this technique.
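To make the idea concrete, here is a minimal Python 3 sketch of two parsers disagreeing about the host of one URL. The hostnames are made up, and `naive_host` is a hypothetical hand-rolled validator, not any real library's behaviour; the point is only that "text after the first @" and "text after the last @" are different answers.

```python
from urllib.parse import urlsplit


def naive_host(url: str) -> str:
    """Hypothetical validator: treats whatever follows the FIRST '@' as the host."""
    authority = url.split("://", 1)[1].split("/", 1)[0]
    if "@" in authority:
        authority = authority.split("@")[1]
    return authority


# Made-up URL with two '@' signs in the authority component.
url = "http://user@allowed.example@internal.example/"

print(naive_host(url))         # allowed.example
print(urlsplit(url).hostname)  # internal.example (urlsplit splits on the LAST '@')
```

If the check uses one interpretation and the HTTP client uses the other, the allow-list is validating a host that is never contacted.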


Path traversal bypasses are possible with the special Unicode character U+FF2E. This happens because Node’s internal Unicode handling treats this multibyte character as two separate bytes (\xFF and \x2E) and then proceeds to discard the high one, leaving \x2E, a dot.

Similar results can be observed by injecting U+FF0D and U+FF0A, which are reduced to \x0D and \x0A (a carriage return and a line feed), allowing newline injection.
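The truncation is easy to model. The Python sketch below is not Node's code; it only assumes, as described above, that the conversion keeps the low byte of each code point and drops the rest. `truncate_to_low_byte` is a name I made up for the illustration.

```python
def truncate_to_low_byte(text: str) -> bytes:
    """Model of the lossy conversion: keep only the low 8 bits of each code point."""
    return bytes(ord(ch) & 0xFF for ch in text)


# U+FF2E loses its high byte and becomes 0x2E, a dot.
print(truncate_to_low_byte("\uff2e\uff2e/secret"))  # b'../secret'

# U+FF0D and U+FF0A become 0x0D and 0x0A, i.e. CR and LF.
print(truncate_to_low_byte("\uff0d\uff0a"))  # b'\r\n'
```

A filter that looks for `..` or `\r\n` in the original string sees neither, because they only appear after the conversion.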

General Bugs

On Linux, hostname resolution is generally done with gethostbyname. As per RFC 1035, it supports escaping values with \DDD notation (a backslash followed by three decimal digits). This may allow for additional parser confusion.

# echo or\\
# nslookup or\\

Non-authoritative answer:
Address: ...

Interestingly enough, gethostbyname will remove all backslashes that are not followed by a digit.

echo or\\an\\g\\
# nslookup or\\an\\g\\

Non-authoritative answer:
Address: ...

gethostbyname will also pass input to getaddrinfo at times, which means that it will ignore invalid input as long as it is preceded by a valid address. Examples below:

>>> import socket
>>> socket.gethostbyname(" foo")
>>> socket.gethostbyname("\r\nfoo")

Because invalid trailing content is accepted, an attacker who can inject encoded newlines can perform HTTP header smuggling attacks, such as \r\\r\nAuthorization: blah

Additionally, an attacker can smuggle other protocols (such as SMTP) thanks to TLS’ SNI.


Internationalizing Domain Names in Applications (IDNA) is a set of standards that allows characters outside the ASCII set to be used in domain names. There are two differing standards, IDNA2003 and IDNA2008, which are difficult for client implementations to transition between, which led the Unicode Consortium to release UTS46.

Different HTTP libraries and URL parsing libraries implement different versions of this standard, as well as implementing the standard in different ways. This can be useful to avoid blacklists of disallowed hosts. An example of this is an inconsistency in PHP’s gethostbynamel function and curl’s resolver: PHP’s gethostbynamel fails when provided with a domain with a special character, which can lead to bypasses. cURL will then retrieve the URL and resolve it successfully.
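As a small illustration of how two IDNA versions can disagree, here is a Python 3 sketch. Python's built-in `idna` codec follows IDNA2003, under which `ß` is mapped to `ss`; an IDNA2008 implementation would instead keep `ß` and Punycode-encode the label, so the two would look up different names. The domain is just an example and `idna2003_host` is a helper name I made up.

```python
def idna2003_host(host: str) -> str:
    """Convert a host to its ASCII form using Python's built-in IDNA2003 codec."""
    return host.encode("idna").decode("ascii")


# Under IDNA2003 the sharp s is folded to "ss" before lookup.
print(idna2003_host("faß.de"))  # fass.de
```

If the blacklist check and the HTTP client sit on opposite sides of a difference like this, the host that was checked is not the host that gets resolved.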

Values synonymous to localhost

Besides the obvious examples, the following URLs will all attempt to retrieve localhost.

wget http://0/
wget http://[::1]/
wget http://SERVER_IP/
wget http://127.1/
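The shorthand forms work because the C library's address parser is more lenient than most validators. A minimal Python 3 sketch, assuming a Linux or other POSIX-like system where `socket.inet_aton` delegates to the platform's `inet_aton` (behaviour may differ elsewhere); the helper names are mine:

```python
import ipaddress
import socket


def libc_style(addr: str) -> str:
    """Parse an IPv4 string the lenient way and print it back in dotted-quad form."""
    return socket.inet_ntoa(socket.inet_aton(addr))


def strict_style(addr: str):
    """Parse with the strict ipaddress module; return None if it is rejected."""
    try:
        return str(ipaddress.ip_address(addr))
    except ValueError:
        return None


for candidate in ("127.1", "0"):
    print(candidate, "->", libc_style(candidate), "/", strict_style(candidate))
```

On the systems I'd expect this to run on, the lenient parser expands `127.1` to `127.0.0.1` and `0` to `0.0.0.0`, while the strict parser rejects both. A filter built on the strict parser may therefore treat these strings as "not an IP address" and wave them through.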


Several mechanisms for bypassing SSRF protections through DNS shenanigans exist. I will cover these at a high level below.

Host that resolves to a malicious IP

DNS records may point to an internal IP address. This frequently works because developers check whether a literal IP matches a blocked address range but accept arbitrary DNS names regardless of what they resolve to.
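A check that closes this particular gap has to look at what the name resolves to, not at the name. The following Python 3 sketch is one way to do it, under the assumption that "internal" means private, loopback or link-local; the function names are mine and a real deployment would likely need a longer list of ranges.

```python
import ipaddress
import socket


def is_internal(ip_text: str) -> bool:
    """True if the address is private, loopback or link-local."""
    ip = ipaddress.ip_address(ip_text.split("%")[0])  # drop any IPv6 scope id
    return ip.is_private or ip.is_loopback or ip.is_link_local


def resolved_addresses(host: str) -> set:
    """Every address the system resolver returns for the host."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def host_is_safe(host: str) -> bool:
    """Reject the host if ANY of its addresses is internal."""
    return not any(is_internal(ip) for ip in resolved_addresses(host))
```

Note that this still resolves the name separately from the eventual request, so on its own it does not address the time of check, time of use problem described next.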

Time of check, time of use vulnerabilities (TOCTOU)

A TOCTOU vulnerability can occur if the target application implements host whitelisting or host blacklisting. Imagine the following pseudo-code:

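download_url = request.get('target')
target_host = urlparse.urlparse(download_url).hostname  # hostname rather than netloc, so ports and credentials are excluded
target_ip = socket.gethostbyname(target_host)  # first resolution: time of check

if target_ip in blacklist:
    raise Exception('fail.')

response = urllib2.urlopen(download_url)  # second resolution: time of use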

A TOCTOU vulnerability allows for a bypass of the blacklist check if DNS resolution occurs twice: once for the check, and again for the retrieval. An attacker-controlled DNS server could resolve the first time to a good address and the second time to a malicious IP.
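The race can be simulated without any network. In this Python 3 sketch, `FlippingResolver` is a made-up stand-in for an attacker-controlled DNS server, and the two fetch functions return the address they would connect to rather than actually connecting. The addresses and the hostname are arbitrary examples.

```python
import ipaddress


class FlippingResolver:
    """Hypothetical resolver that returns a different answer on each lookup."""

    def __init__(self, answers):
        self._answers = list(answers)
        self.calls = 0

    def resolve(self, host: str) -> str:
        answer = self._answers[min(self.calls, len(self._answers) - 1)]
        self.calls += 1
        return answer


def check(ip_text: str) -> None:
    if ipaddress.ip_address(ip_text).is_private:
        raise ValueError("internal address blocked: " + ip_text)


def vulnerable_fetch(host: str, resolver) -> str:
    check(resolver.resolve(host))  # time of check
    return resolver.resolve(host)  # time of use: a second, independent lookup


def pinned_fetch(host: str, resolver) -> str:
    ip = resolver.resolve(host)    # resolve once
    check(ip)
    return ip                      # connect to the address that was checked


print(vulnerable_fetch("attacker.example", FlippingResolver(["8.8.8.8", "10.0.0.5"])))
print(pinned_fetch("attacker.example", FlippingResolver(["8.8.8.8", "10.0.0.5"])))
```

The vulnerable version ends up at `10.0.0.5` even though the check passed; the pinned version connects to the address it validated. Pinning is only one possible mitigation, and with a real HTTP client it usually means connecting to the IP while still sending the original Host header.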

Malicious redirect

An SSRF protection bypass may occur if an attacker creates a malicious site that redirects to an internal IP, because the check is performed on the initial address and not on the address the HTTP client gets redirected to. This works against most HTTP clients, as they tend to follow HTTP redirects by default. Here’s an example: -> FAIL, not allowed -> 301 REDIRECT TO -> 200 OK.
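One way to avoid this is to follow redirects by hand and re-run the check on every hop. Below is a minimal Python 3 sketch with the HTTP layer replaced by an injected `fetch` callable (a hypothetical function returning a status code and a Location value), so that the logic can be read and tested in isolation. The URLs are invented, and the host check only understands literal IPs; a real one would also resolve names.

```python
import ipaddress
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def host_is_internal(url: str) -> bool:
    """True if the URL's host is a literal private IP (names are not resolved here)."""
    host = urlsplit(url).hostname or ""
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        return False


def fetch_with_checked_redirects(url: str, fetch, max_hops: int = 5) -> str:
    """Follow redirects manually, validating every hop before requesting it."""
    for _ in range(max_hops):
        if host_is_internal(url):
            raise ValueError("blocked hop: " + url)
        status, location = fetch(url)
        if status not in REDIRECT_CODES:
            return url  # final URL that was actually fetched
        url = urljoin(url, location)
    raise ValueError("too many redirects")


# Fake responses standing in for the attacker's site and the internal target.
pages = {
    "http://attacker.example/": (301, "http://10.0.0.5/admin"),
    "http://10.0.0.5/admin": (200, None),
}

try:
    fetch_with_checked_redirects("http://attacker.example/", pages.__getitem__)
except ValueError as exc:
    print(exc)  # blocked hop: http://10.0.0.5/admin
```

With a real client this corresponds to disabling automatic redirects and looping yourself, rather than validating only the first URL.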

Final notes

The ability to inject \r\n in combination with either spaces or \t characters allows you to inject new headers into the request, which may allow other attacks. Imagine a request that results in the following:

GET /aa HTTP/1.1


A request that looks like true%0Ax-aa: could result in the following:

GET /aa HTTP/1.1
x-aa: HTTP/1.1

This would allow you to attack a lot of plaintext protocols.
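The mechanics are easy to reproduce with a client that builds the request line by string concatenation. This Python 3 sketch is a made-up, deliberately naive client (no real library builds requests this way that I'm claiming), followed by the obvious guard of refusing control characters after decoding.

```python
from urllib.parse import unquote


def naive_request_line(raw_path: str) -> str:
    """Hypothetical client: decodes the path and pastes it into the request line."""
    return "GET /" + unquote(raw_path) + " HTTP/1.1\r\n"


def checked_request_line(raw_path: str) -> str:
    """Same as above, but rejects CR/LF that appear after decoding."""
    decoded = unquote(raw_path)
    if "\r" in decoded or "\n" in decoded:
        raise ValueError("control characters in path")
    return "GET /" + decoded + " HTTP/1.1\r\n"


payload = "aa%20HTTP/1.1%0D%0Ax-aa:"
print(naive_request_line(payload).splitlines())
# ['GET /aa HTTP/1.1', 'x-aa: HTTP/1.1']
```

The injected text closes the request line early, and the protocol version the client appends afterwards ends up as the value of the attacker's header.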


Orange Tsai, "A New Era of SSRF: Exploiting URL Parser in Trending Programming Languages", Black Hat USA 2017 (us-17-Tsai-A-New-Era-Of-SSRF-Exploiting-URL-Parser-In-Trending-Programming-Languages.pdf)
