This Time Self-Hosted

Free Idea: a filtering HTTP proxy for securing web applications

This post is part of a series of free ideas that I’m posting on my blog in the hope that someone with more time can implement them. It’s effectively a very rough proposal with no design attached, but if you have time you would like to spend learning something new and no idea what to do with it, it may be a good fit for you.

Going back to a previous topic I wrote about, and the fact that I’m trying to set up a secure WordPress instance, I would like to throw out another idea I won’t have time to implement myself any time soon.

When running complex web applications, such as WordPress, defense-in-depth is a good security practice. This means that in addition to locking down what the code itself can do to the state of the local machine, it also makes sense to limit what it can do to external state and the Internet at large. Indeed, even if an attacker cannot drop a shell on a remote server, there is value (negative for the world, positive for the attacker) in at least being able to use it for DDoS (e.g. through an amplification attack).

With that in mind, if your app does not require network access at all, or the network dependency can be sacrificed (like I did for Typo), just blocking the user from making outgoing connections with iptables would be enough. The --uid-owner option makes it very easy to figure out who’s trying to open new connections, and thus stop a single user from transmitting unwanted traffic. Unfortunately, this does not always work because sometimes the application really needs network support. In the case of WordPress, there is a definite need to contact the WordPress servers, both to install plugins and to check if it should self-update.
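As a rough illustration of the owner match, the sketch below builds the iptables invocations for a single application user. The user name `wordpress`, the helper name `owner_block_rules`, keeping loopback traffic allowed, and using REJECT rather than DROP are all assumptions made for the example; the code only assembles argument lists and does not execute anything.

```python
def owner_block_rules(user: str) -> list[list[str]]:
    """Build iptables argument lists that block outgoing traffic from `user`.

    Assumption: loopback traffic stays allowed so the app can still reach
    local services; everything else from this user is rejected.
    """
    match = ["-m", "owner", "--uid-owner", user]
    return [
        ["iptables", "-A", "OUTPUT", "-o", "lo", *match, "-j", "ACCEPT"],
        ["iptables", "-A", "OUTPUT", *match, "-j", "REJECT"],
    ]


# "wordpress" is a hypothetical system user running the web application.
for rule in owner_block_rules("wordpress"):
    print(" ".join(rule))
```

The printed lines can be reviewed before being applied by hand or through whatever firewall tooling the server already uses.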

You could try to limit which hosts the user can access. But that’s not easy to implement right either. Take WordPress as an example still: if you wanted to limit access to the WordPress infrastructure, you would effectively have to allow it to access *.wordpress.org, and this can’t really be done in iptables, as far as I know, since those connections go to literal IP addresses. You could rely on FCrDNS (forward-confirmed reverse DNS) to verify the connections, but that can be slow, and if an attacker is able to poison the DNS cache of the server, they are effectively in control of this kind of ACL. I ignored the option of just using “standard” reverse DNS resolution, because in that case you don’t even need to poison DNS, you can just decide what your IP will reverse-resolve to.
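To make the difference between plain reverse DNS and FCrDNS concrete, here is a minimal sketch of the forward-confirmed check. The function name `fcrdns_hostnames` and the injectable `reverse`/`forward` parameters are hypothetical and exist only so the logic can be exercised without real DNS; by default they are the standard `socket.gethostbyaddr` and `socket.getaddrinfo`.

```python
import socket


def fcrdns_hostnames(ip, reverse=socket.gethostbyaddr, forward=socket.getaddrinfo):
    """Return the reverse-DNS names of `ip` that also resolve forward to `ip`."""
    try:
        name, aliases, _addresses = reverse(ip)
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return []
    confirmed = []
    for candidate in [name, *aliases]:
        try:
            infos = forward(candidate, None)
        except OSError:
            continue
        # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the textual address.
        if any(info[4][0] == ip for info in infos):
            confirmed.append(candidate)
    return confirmed
```

A name claimed only by the reverse record is dropped because the forward lookup does not lead back to the same IP. The check still costs two DNS round trips per connection and trusts the resolver cache, which is exactly the weakness described above.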

So what you need to do is actually filter at the connection-request level, which is what proxies are designed for. I’ll be assuming we want to have a non-terminating proxy (because terminating proxies are hard), but even in that case you can now know what (forward) address you want to connect to, and in that case *.wordpress.org becomes a valid ACL to use. And this is something you can actually do relatively easily with Squid, for instance. Indeed, this is the whole point of tools such as ufdbguard (which I used to maintain for Gentoo), and the ICP protocol. But Squid is particularly designed as a caching proxy, it’s not lightweight at all, and it can easily become a liability to have it in your server stack.
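For a non-terminating proxy, the forward address arrives in the request line of the HTTP `CONNECT` method, so the host ACL can be checked before any tunnel is opened. The sketch below is one possible way to do that, using shell-style patterns from Python's `fnmatch` module; the function names `connect_target` and `host_allowed` are made up for the example and are not taken from Squid or any other existing proxy.

```python
from fnmatch import fnmatchcase


def connect_target(request_line: str) -> tuple[str, int]:
    """Parse 'CONNECT host:port HTTP/1.1' and return (host, port)."""
    method, target, _version = request_line.split()
    if method.upper() != "CONNECT":
        raise ValueError("not a CONNECT request")
    host, _, port = target.rpartition(":")
    # Normalise case and a trailing dot so patterns compare reliably.
    return host.lower().rstrip("."), int(port)


def host_allowed(host: str, patterns: list[str]) -> bool:
    """True if host matches any shell-style pattern such as '*.wordpress.org'."""
    return any(fnmatchcase(host, pattern) for pattern in patterns)


host, port = connect_target("CONNECT api.wordpress.org:443 HTTP/1.1")
print(host, port, host_allowed(host, ["*.wordpress.org"]))
```

Note that `*.wordpress.org` requires a literal `.wordpress.org` suffix, so lookalikes such as `evilwordpress.org` or `wordpress.org.example.net` do not match, and neither does the bare `wordpress.org` unless it is listed separately.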

Up to now, what I have done to reduce the attack surface of my webapps is to set them behind a tinyproxy, which does not really allow for per-connection ACLs. This only provides isolation against random non-proxied connections, but it’s a starting point. And here is where I want to provide a free idea for anyone who has the time and would like to provide better security tools for server-side defense-in-depth.

A server-side proxy for this kind of security usage would have to be able to provide ACLs, with both positive and negative lists. You may want to provide all access to *.wordpress.org, but at the same time block all non-TLS-encrypted traffic, to avoid the possibility of downgrade (given that WordPress has a silent downgrade for requests to api.wordpress.org, that I talked about before).
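A policy with both positive and negative lists could be modelled roughly as below. This is only a sketch under stated assumptions: the `Policy` class and its fields are hypothetical, the deny list is assumed to take precedence over the allow list, and "TLS only" is approximated as "CONNECT to port 443", since a non-terminating proxy cannot verify what is actually spoken inside the tunnel.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Policy:
    allow: list[str] = field(default_factory=list)  # positive list of host patterns
    deny: list[str] = field(default_factory=list)   # negative list, checked first
    tls_only: bool = True                           # refuse anything but CONNECT to 443

    def permits(self, method: str, host: str, port: int) -> bool:
        # Plain-HTTP requests (GET http://...) are refused outright, so a
        # silent downgrade fails instead of going out unencrypted.
        if self.tls_only and not (method == "CONNECT" and port == 443):
            return False
        if any(fnmatchcase(host, pattern) for pattern in self.deny):
            return False
        # Default deny: a host must match the allow list to get through.
        return any(fnmatchcase(host, pattern) for pattern in self.allow)


# "blocked.wordpress.org" is a made-up host used to show the negative list.
policy = Policy(allow=["*.wordpress.org"], deny=["blocked.wordpress.org"])
print(policy.permits("CONNECT", "api.wordpress.org", 443))
print(policy.permits("GET", "api.wordpress.org", 80))
```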

Even better, such a proxy should have the ability to distinguish the ACLs based on which user (i.e. which webapp) is making the request. The obvious way would be to provide separate usernames to authenticate to the proxy, which again Squid can do, but it’s designed for clients for which the validation of username and password is actually important. Indeed, for this target usage, I would ignore the password altogether, and just take the username at face value, since the connection should always only be local. I would be even happier if instead of pseudo-authenticating to the proxy, the proxy could figure out which (local) user the connection came from, by inspecting the TCP socket connection, kind of like how querying the ident protocol used to work for IRC.

So to summarise, what I would like to have is an HTTP(S) proxy that focuses on securing server-side web applications. It does not have to support TLS transport (because it should only accept local connections), nor should it be a terminating proxy. It should support ACLs that allow/deny access to a subset of hosts, possibly per-user, without needing a user database of any sort, and even better if it can tell by itself which user the connection came from. I’m more than happy if someone tells me this already exists, or if not, someone starts writing this… thank you!

Comments 7
  1. We also need something like this where I work, something small and lightweight that can be easily plugged into an application stack that lets us implement outbound deny by default. More often you can’t get away with IP firewalls because of cloud services and CDNs, you need an application firewall in those cases. Ideally the solution would integrate tightly with the application, allowing policy to be specified in the application source repo instead of some monolithic firewall sitting somewhere on the network.

  2. I like the idea but something bothers me. All the local webapps need to know how to talk to a proxy server then, right? I know that WordPress does, but a transparent proxy would be better in many situations since it doesn’t involve any support from the application or any prior configuration. However, it won’t work for proxying HTTPS requests because you have no way to read the Host header from the outgoing connection. TL;DR: either the webapp has proxy support or you just can’t proxy outgoing HTTPS requests, which is a major issue IMHO.

  3. Transparent non-terminating proxies are not feasible, and terminating proxies are too risky to implement, so yes you need to support proxies in your HTTP client. Which is not hard at all, and for instance works out of the box on Ruby and (afaict) on Python, by setting the `http_proxy` environment variable. Of course this is more complicated for PHP, because it’s PHP.

  4. Dnsmasq will add the results of certain queries to an ipset. You can then use this in iptables.

  5. Well, for example the following stanza will track all the IPs looked up for WhatsApp and put them in an ipset called chat_whatsapp: ipset=/e.whatsapp.net/e1.whatsapp… Now I can apply various rules (block/allow) to the ipset in order to control access. In your case you would do something like ipset=/wordpress.net/ips_wordpress. Now when your server resolves some wordpress domain, it’s added to that ipset and you can then do anything you like with iptables rules. This is quite a reasonable technique for allowing/blocking apps with highly dynamic server ranges. It will break if the app gets passed IP addresses inside the protocol layer, i.e. not via DNS, but that seems rare enough to be a minor hiccup.

  6. Ah that does make a lot more sense now! I didn’t realize you could do that per-domain, and that’s why I was confused :) Yes, that sounds like it would be an interesting way to implement ACLs without requiring a full proxy implementation, I definitely need to give it a try!
