What is the HTTP Host Header?#
The HTTP Host header is a request header, mandatory as of HTTP/1.1, that specifies the domain name the client is trying to access. For example, when you visit https://vulnerable.com/blog, the browser makes a request like this:
GET /blog HTTP/1.1
Host: vulnerable.com
In some cases, such as when a request is forwarded by an intermediary system, the Host value may be altered before it reaches the intended back-end destination.
The purpose of these headers is to help identify which back-end component the client is trying to communicate with. If these headers aren’t included or are malformed, the request will likely not reach the correct endpoint.
This is also important because there are only so many IP addresses available, so the likes of cloud service providers commonly serve many hostnames from the same IP address. Multiple applications being accessible on the same IP address is usually the result of:
- Virtual hosting
- Routing traffic through intermediary hosts
In both cases, the host header is relied on to specify the intended recipient of the request, much like the apartment number on a letter sent to an apartment building. When a browser sends the request, the target URL resolves to the IP address of a particular server, and that server then uses the host header to forward the request to the right place.
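The routing decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the host table, the application names, and the fallback choice are all hypothetical, and real servers (nginx, Apache, load balancers) implement this in their own configuration.

```python
# Hypothetical virtual-host table: hostname -> application name.
VIRTUAL_HOSTS = {
    "vulnerable.com": "blog-app",
    "shop.vulnerable.com": "shop-app",
}
DEFAULT_APP = "blog-app"  # assumed fallback for unrecognized hosts


def route(host_header: str) -> str:
    """Pick a back-end application based on the Host header value."""
    # Drop an optional port and normalize case before the lookup.
    # (IPv6 literals are ignored here to keep the sketch short.)
    hostname = host_header.split(":", 1)[0].strip().lower()
    return VIRTUAL_HOSTS.get(hostname, DEFAULT_APP)
```

Note that the fallback branch is what makes an unrecognized host header still reach an application, which becomes relevant when probing with arbitrary values.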
What is an HTTP Host Header Attack?#
Host header attacks often take advantage of websites that handle the host header in an unsafe way. If the server is trusting of that host header and doesn’t validate it properly, we might be able to use this to inject some harmful payloads that change how the server behaves. (I imagine this to be sending a letter to an apartment and specifying an apartment number that does not exist or something similar.)
If we inject an arbitrary value into the host header, we call this a host header injection attack. Many off-the-shelf web applications don’t know which domain they are deployed on unless it is specified in a configuration file during setup. When the site needs to know the current domain (like for generating a link to a support page), the application might default to using the host header to get this value:
<a href="https://<?= $_SERVER['HTTP_HOST'] ?>/support">Contact support</a>
The header might also be used for a variety of interactions between the different systems that make up the website’s infrastructure.
The host header is user controllable, which means that if there is not proper validation in place we can sometimes take advantage of this to exploit some other vulnerabilities like:
- Web cache poisoning
- Business logic flaws
- Routing-based SSRF
- Server-side injection attacks
These vulnerabilities typically arise because developers or administrators don’t realize that the host header is user-controllable. Even where the host header itself is validated or sanitized, you can sometimes override its value by injecting other headers.
Identification and Exploitation#
To find these vulnerabilities, you’ll need a proxy to view requests from the target site. We need to determine if we can modify the host header and still reach the target application with the request.
Supplying Arbitrary Host Headers#
First, we want to see what happens if we supply some arbitrary unrecognized domain name in the host header. Some proxies derive the target IP address from the host header, which makes this test impossible. Burp Suite is one of a handful of proxies that keeps the target address separate from the host header, so we are able to send the request to the correct target even though the host header has changed. (Imagine this like writing a certain address on a letter but then hand-delivering it to another place.)
If we are still able to access the target site when we supply a strange host header, then we need to figure out why that might be happening. Some servers are configured to use a default fallback option if the host header isn’t recognized. If the target website is this default fallback domain then we are in luck and can probe further.
Keep in mind that this header is often pretty important to a web application and changing it will often lead to errors that stop us from even reaching the target application.
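Outside of a proxy, the same test can be sketched with a raw socket, which keeps the connection target and the host header fully independent. This is a minimal sketch assuming a plain-HTTP target (a TLS target would additionally need the socket wrapped with the `ssl` module); the function names are my own and not part of any tool.

```python
import socket


def build_request(path: str, host_header: str) -> bytes:
    """Build a minimal HTTP/1.1 request with an arbitrary Host header."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host_header}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def send_with_host(target_ip: str, port: int, path: str, host_header: str) -> bytes:
    """Connect to target_ip directly, regardless of what the Host header says."""
    with socket.create_connection((target_ip, port), timeout=5) as sock:
        sock.sendall(build_request(path, host_header))
        chunks = []
        # Read until the server closes the connection.
        while chunk := sock.recv(4096):
            chunks.append(chunk)
    return b"".join(chunks)
```

Comparing the response for the real hostname against the response for an unrecognized one shows whether the target is the default fallback.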
Checking For Flawed Validation#
If your request with a strange header is blocked by some security feature, there might be a few ways around it. If you can figure out how the site is parsing the host header, you might be able to bypass validation.
Imagine a site whose validation only checks the domain name and ignores whatever follows the colon where the port would normally go. If the rest of the application still uses the full value, the following payload might get past that check:
GET /example HTTP/1.1
Host: vulnerable.com:evil-payload
Some sites might allow arbitrary subdomains, in which case you could try registering one of those domain names. If the whitelisting is set up poorly and only checks that the value ends with a certain domain name, you might be able to add arbitrary characters before the host domain like this:
GET /example HTTP/1.1
Host: evil-payloadvulnerable.com
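To make the two bypasses above concrete, here is a hedged sketch of what such flawed checks might look like, next to a stricter one. The function names and the single-domain allowlist are hypothetical; a real application's validation logic will differ.

```python
ALLOWED_DOMAIN = "vulnerable.com"  # hypothetical allowlisted domain


def flawed_port_check(host_header: str) -> bool:
    """Validates only the part before the first colon and ignores the rest."""
    return host_header.split(":", 1)[0] == ALLOWED_DOMAIN


def flawed_suffix_check(host_header: str) -> bool:
    """Accepts anything that merely ends with the allowed domain."""
    return host_header.endswith(ALLOWED_DOMAIN)


def stricter_check(host_header: str) -> bool:
    """Exact hostname match plus an optional, purely numeric port."""
    hostname, _, port = host_header.partition(":")
    port_ok = port == "" or (port.isascii() and port.isdigit())
    return hostname == ALLOWED_DOMAIN and port_ok
```

Both payloads from this section pass their respective flawed check, while the stricter check rejects them and still accepts `vulnerable.com:443`.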
Sending Ambiguous Requests#
The code that validates the host header and the code that does something vulnerable with it often live in different components, or even on different servers. If those components read the host header in different ways, we might be able to craft a request whose host looks different depending on which system views it.
For example, if we inject more than one host header, different systems might handle the request differently because it is unclear which header should take precedence. Imagine the front-end prefers the first header but the back-end prefers the second header - you might be able to get your payload past some security measures.
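The precedence problem can be illustrated with a tiny parser sketch. The helper names are hypothetical, and the "first wins" versus "last wins" behaviors are assumptions about how two different components might be written, not a description of any specific server.

```python
def parse_headers(raw_request: str) -> list[tuple[str, str]]:
    """Return (name, value) pairs in order, keeping duplicates."""
    headers = []
    for line in raw_request.split("\r\n")[1:]:  # skip the request line
        if not line:  # blank line ends the header section
            break
        name, _, value = line.partition(":")
        headers.append((name.strip().lower(), value.strip()))
    return headers


def first_host(headers: list[tuple[str, str]]) -> str:
    """A front end that trusts the first Host header it sees."""
    return [value for name, value in headers if name == "host"][0]


def last_host(headers: list[tuple[str, str]]) -> str:
    """A back end that lets later headers overwrite earlier ones."""
    return [value for name, value in headers if name == "host"][-1]
```

Given a request carrying `Host: vulnerable.com` followed by `Host: evil-payload`, the front end validates the first value while the back end acts on the second.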
You could also try supplying an absolute URL and a host header to create a similar kind of ambiguity for the server to process:
GET https://vulnerable.com/ HTTP/1.1
Host: evil-payload
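You could also try indenting one of the headers with a leading space. Some servers treat an indented header as a wrapped continuation of the previous line, while others ignore it or parse it as a normal header, so the two host headers may be interpreted differently:
GET /example HTTP/1.1
    Host: evil-payload
Host: vulnerable.com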
You can probably see where this is going: just try to cast a wide net and hopefully you’ll see something that catches your eye.
Injecting Host Override Headers#
If we can’t use an ambiguous request to get the server to misbehave, we could try to override the host value while leaving the Host header itself intact. This means injecting our payload through another header that serves a similar purpose to the host header.
You could try using the X-Forwarded-Host header to pass a payload through:
GET /example HTTP/1.1
Host: vulnerable.com
X-Forwarded-Host: evil-payload
Of course there are quite a few headers that fulfill this purpose, such as X-Host, X-Forwarded-Server, X-HTTP-Host-Override, and Forwarded, and there are quite a few websites that unintentionally support this behavior.
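As a rough sketch of why this works, assume a framework or middleware that resolves the "effective" host by preferring X-Forwarded-Host when it is present. The function below is hypothetical and only mimics that assumed behavior.

```python
def effective_host(headers: dict[str, str]) -> str:
    """Mimic a component that trusts X-Forwarded-Host over Host."""
    # Header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get("x-forwarded-host") or normalized.get("host", "")
```

The Host header can pass every validation check on the front end, yet the application ends up using the attacker-supplied override.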
Password Reset Poisoning#
Password reset poisoning is (in my opinion) a kind of niche attack but it basically involves getting a legitimate password reset link to point to a domain that we control.
Password resets typically follow these steps:
- User enters their username or email address and submits a reset request
- The website checks that the user exists and makes a high-entropy token to associate with the user on the back-end
- The website sends an email to the user that contains a link to use when resetting their password
- When the user visits this URL, the website checks if the token is valid and uses it to see which account is being reset - then the new password is entered and the token is destroyed
Of course, the security of this reset model depends on the user being the only one with access to their email address.
But let’s look a bit deeper, how is that URL being constructed? If it is being generated using some controllable input, then couldn’t we get the site to make a reset link that points to the wrong domain?
This would essentially send the victim a genuine reset email whose link points at our domain. When the victim follows the link, the reset token arrives at our server, so we can then construct the real URL and reset their password.
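The difference between a poisonable reset link and a safe one comes down to where the domain comes from. The sketch below assumes a hypothetical `/reset?token=` URL scheme and a configured canonical host; neither is taken from a specific application.

```python
import secrets

CANONICAL_HOST = "vulnerable.com"  # assumed to come from a config file


def new_reset_token() -> str:
    """Generate a high-entropy, URL-safe token."""
    return secrets.token_urlsafe(32)


def reset_link_unsafe(host_header: str, token: str) -> str:
    """Vulnerable: the domain is taken from attacker-controllable input."""
    return f"https://{host_header}/reset?token={token}"


def reset_link_safe(token: str) -> str:
    """Safer: the domain comes from configuration, not from the request."""
    return f"https://{CANONICAL_HOST}/reset?token={token}"
```

With the unsafe version, submitting the victim's username alongside a host header pointing at our server yields an email whose link delivers the token to us.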
Web Cache Poisoning via Host Headers#
Sometimes we see behavior that seems vulnerable but not very exploitable. For example, if the host header is reflected in the response we can try XSS, but we have no way to make another user’s browser send our fake host header.
If the target is using a web cache though, we might be in luck. Say we find that host header injection gives us some reflected XSS. By itself that isn’t worth much, but if we can get the response cached and served to a victim user, we could do a lot more dangerous things.
Web cache poisoning hinges on our input being reflected in the response while the cache key stays the same as the one that other users’ requests map to.
Once we can get that input reflected, we just need to get it cached and wait for a victim user to view it.
For example, suppose we find a site that reflects our input when we send two host headers, using the second one to build a URL in the response.
If we can store some malicious JS on our domain and change this second host header to point there, then once we get this response cached any user who is served that cached response will be executing our malicious JS.
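A toy simulation can show why the cache key matters here. Everything in it is an assumption made for illustration: the cache keys on the first host header plus the path, the back end reflects the last host header into a script URL, and the `/static/app.js` path is invented.

```python
# Hypothetical cache: (first Host header, path) -> stored response body.
cache: dict[tuple[str, str], str] = {}


def render(reflected_host: str) -> str:
    """Back end that builds a script URL from the host it was given."""
    return f'<script src="https://{reflected_host}/static/app.js"></script>'


def handle(path: str, host_headers: list[str]) -> str:
    """Serve from cache when possible, otherwise render and store."""
    key = (host_headers[0], path)  # the cache only looks at the first Host
    if key not in cache:
        cache[key] = render(host_headers[-1])  # the back end uses the last Host
    return cache[key]
```

After one request with the headers `["vulnerable.com", "evil.example"]`, an ordinary request carrying only `vulnerable.com` maps to the same key and receives the poisoned body.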
Accessing Restricted Functionality#
It is pretty common for websites to restrict functionality to internal users only - but those access control features might assume that the host header can’t be modified.
If you have a list of virtual hosts or IP addresses or see them leaked somewhere you could possibly use those to trick the authentication mechanisms in place.
Routing-Based SSRF#
Classic SSRF vulnerabilities are based on XXE or exploitable logic flaws that send HTTP requests to a URL derived from some controllable input. Routing-based SSRF however relies on exploiting the intermediary components that are common in lots of cloud-based environments.
Lots of these intermediary services forward front-end requests to some other back-end service, so they might be configured to forward requests based on a poorly validated host header that we can manipulate.
If we can get one of these intermediary hosts or services to query our host, we could probably pull off some SSRF by triggering behavior intended for internal systems only.
Connection State Attacks#
For performance reasons, lots of sites reuse a single connection for multiple request-response cycles with the same client. Poorly implemented servers sometimes assume that the host header is identical for all HTTP/1.1 requests sent over the same connection - which is true if the requests are sent by a browser but not if we use a proxy to modify the requests.
We might find some servers that only validate based on the first request from a certain connection. We could bypass this by sending an innocent initial request then following it up with an evil one.
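The flaw can be modeled with a small class that remembers, per connection, whether validation has already happened. This is a simulation only; the class name, the allowlist, and the response strings are hypothetical.

```python
class Connection:
    """Simulated keep-alive connection on a front end that validates once."""

    def __init__(self, allowed_hosts: set[str]) -> None:
        self.allowed_hosts = allowed_hosts
        self.validated = False  # flips to True after the first good request

    def route(self, host_header: str) -> str:
        # Flaw: the Host header is only checked on the first request.
        if not self.validated:
            if host_header not in self.allowed_hosts:
                return "403 rejected"
            self.validated = True
        return f"forwarded to {host_header}"
```

Sending the internal host first is rejected, but sending it as the second request on an already validated connection is forwarded.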
SSRF via Malformed Request Line#
In some cases customized proxies will fail to validate the actual request line properly, which could cause issues. For example, imagine a proxy that prepends http://backend-host to the path from every request line and uses the result to route the request. This will work fine if the path starts with a / character but it might break if we use other characters.
If the request target starts with the @ character instead, the upstream URL is parsed with backend-host as the login (userinfo) portion and private-host as the actual host:
http://backend-host@private-host/example
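Python's standard URL parser makes the effect easy to see. The `upstream_url` helper below is a hypothetical stand-in for the proxy's naive string concatenation.

```python
from urllib.parse import urlsplit


def upstream_url(request_target: str) -> str:
    """Naive proxy logic: prepend the back-end origin to the request target."""
    return "http://backend-host" + request_target


normal = urlsplit(upstream_url("/example"))
abused = urlsplit(upstream_url("@private-host/example"))

# For the normal path the host is backend-host as intended.
# For the abused target, backend-host is parsed as the username
# and the request would be routed to private-host instead.
print(normal.hostname, abused.username, abused.hostname)
```

A proxy that validated the request target to start with `/` before concatenating would not be affected by this particular trick.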
HTB - Forgot#
We start with a port scan and see that HTTP is being used and when we visit the page we are greeted with a login page. If we read the page source we will see what looks like a username in an inline comment.
If we try to use the forgot password functionality on the web application, it lets us determine which users exist because the response will tell you if a reset link was sent or if the username was invalid.
We try the developer’s username and it seems to send the reset link. If we change the Host header in the request, we will get that reset link sent to our server. We can then use this reset link to change the developer’s password - a nice and quick example of a host header attack.
Prevention#
The most straightforward approach is to avoid using the host header altogether in server-side code. Most of the time you can use relative URLs instead of ones built from the host header anyway. Other things you can do are:
- Protecting Absolute URLs: Require the current domain to be specified in a config file and use that value instead of a host header.
- Validating Host Headers: If you need to use a host header then at least make sure that it is validated against a whitelist and that all other requests are rejected or redirected.
- Whitelist Permitted Domains: Preventing routing-based attacks can be done most of the time by configuring a load-balancer with a whitelist.
- Avoid Internal-Only Virtual Hosts: It is usually best to avoid hosting internal-only content on the same servers as public-facing content, as host header manipulation could allow attackers to reach those internal hosts.
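The whitelist validation suggested above might look something like the following sketch. The allowed hostnames, the status codes, and the function name are assumptions for illustration; most frameworks and load balancers offer their own built-in setting for this, which is usually preferable to hand-rolled checks.

```python
from typing import Optional

# Hypothetical whitelist of hostnames this deployment should answer for.
ALLOWED_HOSTS = frozenset({"vulnerable.com", "www.vulnerable.com"})


def validate_host(host_header: Optional[str]) -> tuple[int, str]:
    """Return (status, detail); only whitelisted hosts get a 200."""
    if not host_header:
        return 400, "missing Host header"
    hostname, sep, port = host_header.strip().lower().partition(":")
    # If a colon is present, the port must be purely numeric.
    if sep and not (port.isascii() and port.isdigit()):
        return 400, "malformed port"
    if hostname not in ALLOWED_HOSTS:
        return 400, "unrecognized host"
    return 200, hostname
```

Rejecting unknown hosts outright, rather than falling back to a default application, also removes the fallback behavior that makes arbitrary host headers useful for probing.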