High Performance TCP Proxy Server (partow.net)
75 points by ArashPartow on April 22, 2017 | 59 comments



Very nice, but this is more of a demo code for boost::asio than something that is production-worthy.

For example, it doesn't relay FINs between connections, doesn't disable Nagle's algorithm on the upstream socket, doesn't wait for pending writes to complete before tearing down the connection, doesn't handle congestion at all (potentially leading to unbounded memory use), etc.
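To make the Nagle point concrete, here is a minimal Python sketch (not the linked project's code). The helper name `make_upstream_socket` is hypothetical; the socket option itself is the standard `TCP_NODELAY`.

```python
import socket


def make_upstream_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 asks the kernel to send small writes immediately
    # instead of coalescing them while waiting for outstanding ACKs.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

Whether disabling Nagle actually helps depends on the traffic pattern; for a proxy relaying small request/response messages it usually lowers latency.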


Could you point to an open source TCP proxy that you'd consider production worthy with an approachable code base?


HAProxy is the gold standard.

nginx and apache only do HTTP(S). Don't support TCP.


> nginx and apache only do HTTP(S). Don't support TCP.

That's not entirely accurate. For nginx, for example, there's the ngx_stream_core_module[1]. It's not particularly well-documented though, so I don't know much about it.

[1]: https://nginx.org/en/docs/stream/ngx_stream_core_module.html


It's totally experimental. It was only added a few months ago. It's not even compiled into the official distribution, you gotta compile nginx yourself with special flags.

For the sake of god, please don't have your entire site run on experimental software like that. ^^


haproxy is quite well documented


stud is pretty approachable and production worthy, and I assume hitch [1] remains so. If you dropped the TLS termination, it would just be a TCP proxy as well.

[1] https://hitch-tls.org/



I assumed (without looking) that because of all of the features, the nginx codebase wouldn't be that easy to read. I'll give it a try!


Why not just?

Linux: iptables -t nat -A PREROUTING -p tcp -s 192.168.20.200/0 -d 192.168.0.100/0 --dport 8080 -j REDIRECT --to-ports 20000

Windows: netsh interface portproxy add v4tov4 listenaddress=192.168.20.200 listenport=8080 connectaddress=192.168.0.100 connectport=20000 protocol=tcp


Or use the layer-4 load balancer built into the Linux kernel - IPVS (http://kb.linuxvirtualserver.org/wiki/IPVS).

Use Keepalived (http://www.keepalived.org/) for HA and health checking or use Gorb (https://github.com/kobolog/gorb) and you can dynamically change services / backends using a REST API.


Because people don't learn the OS.


    -s 192.168.20.200/0 -d 192.168.0.100/0
That sounds wrong.

You probably meant /24 or /32. Or you meant to not write anything to not filter it at all.


Yeah, sorry for that, was copy-pasting too quickly :-)


Socat is excellent too. Not as high performance as iptables but the command is very simple to use. More of a direct replacement for the program above.


Those require root/admin access.

Still, I agree.


Because it's not as interesting :)


As someone who has also written a TCP proxy (along with many others...), after thoroughly reading the page I'm still unsure of how exactly this is "high performance". It is also curious that, despite the fact that it uses a separate library for networking, the source is already quite a bit longer than some other proxies which don't. I found the explanation overly complex.

Around half the code in this implementation could probably be removed by the realisation that, after a connection is established, both ends are completely symmetric: all it needs to do is try to read from A and write to B, then try to read from B and write to A. If A closes, close B. If B closes, close A.


It is symmetrical, but there are nuances.

If A closes or errors out, the proxy should first push out any pending data and only then close B.

The same goes for when A sends a FIN - it should flush any data queued at application level before calling shutdown() on B's socket.

If A becomes unwritable, it should stop reading from B.
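A rough sketch of those three rules using Python's asyncio streams, assuming nothing about the posted proxy's internals. The names `pump` and `proxy_connection` are made up for illustration. `drain()` provides the "stop reading while the other side is unwritable" behaviour, and `write_eof()` relays the FIN only after buffered data has been flushed.

```python
import asyncio


async def pump(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy one direction until EOF, then half-close the destination."""
    try:
        while True:
            chunk = await reader.read(65536)
            if not chunk:  # the source sent a FIN
                break
            writer.write(chunk)
            # Suspends while the destination's buffer is above its high-water
            # mark, so we do not read more from the source in the meantime.
            await writer.drain()
        if writer.can_write_eof():
            # Queued data is flushed first, then the write side is shut down.
            writer.write_eof()
    except OSError:
        writer.close()


async def proxy_connection(client_r, client_w, host: str, port: int) -> None:
    """Relay both directions between a client and a hypothetical upstream."""
    up_r, up_w = await asyncio.open_connection(host, port)
    try:
        await asyncio.gather(pump(client_r, up_w), pump(up_r, client_w))
    finally:
        # Close only after both directions have finished.
        for w in (client_w, up_w):
            w.close()
```

It could be wired up with something like `asyncio.start_server(lambda r, w: proxy_connection(r, w, "127.0.0.1", 20000), "0.0.0.0", 8080)`, where the addresses are placeholders.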


> If A closes, close B. If B closes, close A.

On EOF you shutdown() the write side rather than close(), with refcounting so that close() happens once EOF has been detected in both directions (I have also written a TCP proxy). But yeah, TCP proxies are trivial; it would be more interesting to see something like a tunneling proxy that sends data over multiple connections to maximize performance.
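For anyone who wants to see the shutdown-then-refcounted-close idea spelled out, here is a small blocking-socket sketch in Python. The `Relay` class and its method names are hypothetical, and error handling is deliberately crude: on any socket error both sides are shut down so the other thread unblocks.

```python
import socket
import threading


class Relay:
    """Relay bytes between sockets a and b; close both only after both directions end."""

    def __init__(self, a: socket.socket, b: socket.socket) -> None:
        self.a, self.b = a, b
        self._open_directions = 2  # the "refcount": a->b and b->a
        self._lock = threading.Lock()

    def _copy(self, src: socket.socket, dst: socket.socket) -> None:
        try:
            while True:
                chunk = src.recv(65536)
                if not chunk:  # EOF on src
                    break
                dst.sendall(chunk)
            # Relay the FIN: half-close dst, but keep reading from it elsewhere.
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            for s in (src, dst):
                try:
                    s.shutdown(socket.SHUT_RDWR)
                except OSError:
                    pass
        finally:
            self._direction_done()

    def _direction_done(self) -> None:
        with self._lock:
            self._open_directions -= 1
            last = self._open_directions == 0
        if last:  # both directions finished, now it is safe to close()
            self.a.close()
            self.b.close()

    def start(self) -> list:
        threads = [
            threading.Thread(target=self._copy, args=(self.a, self.b)),
            threading.Thread(target=self._copy, args=(self.b, self.a)),
        ]
        for t in threads:
            t.start()
        return threads
```

The point of the counter is that a client can send its request, half-close, and still receive the full response before either socket is closed.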


Hey, I am looking for a way of packing multiple TCP streams over a single TCP connection to a backend server. I can unpack them in application logic if necessary. Do you know what that is called? Kinda like SCTP


The quick and dirty approach would be an ssh tunnel or vpn. There's also GRE tunnels which would be faster, but not encrypted. Any of these would work without having to change the application.

Probably, though, it's best to start with why you would want to do that. You could be, for example, trying to solve something where a pub/sub model would work better. Or just two separate apps, on different ports. What's driving the idea of multiplexing?


Oh that's interesting, can we speak via email:- tom dot larkworthy at Google's popular email provider?


Multiplexing?
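If you end up unpacking in application logic, the core of it is just framing. Below is a toy Python sketch with a made-up wire format (4-byte stream id, 2-byte payload length, then the payload); it is not any standard protocol, and `encode_frame` and `Demuxer` are hypothetical names.

```python
import struct
from typing import List, Tuple

# Assumed header layout: network byte order, uint32 stream id, uint16 length.
HEADER = struct.Struct("!IH")


def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Prefix a payload with its stream id and length."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a single frame")
    return HEADER.pack(stream_id, len(payload)) + payload


class Demuxer:
    """Incrementally split a byte stream back into (stream_id, payload) frames."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> List[Tuple[int, bytes]]:
        self._buf += data
        frames = []
        while len(self._buf) >= HEADER.size:
            stream_id, length = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # wait for the rest of this frame
            frames.append((stream_id, bytes(self._buf[HEADER.size:end])))
            del self._buf[:end]
        return frames
```

A real multiplexer also needs per-stream flow control and open/close signalling, otherwise one slow stream stalls all the others on the shared connection.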


As a side note, if you don't have a requirement to compile with C++03, I would recommend using the standalone asio library free from boost[1]. The only things you need to modify are the includes and namespaces.

[1] http://think-async.com/Asio/Download


Why would you recommend the upstream Asio library instead of the one in Boost?


Because it has zero dependencies -- There's no reason to pull in boost::shared_ptr when you can use std::shared_ptr


A similar application that I'm fond of is Pen, which also does simple load balancing and has udp support:

http://siag.nu/pen/


Ironically, the site appears down (I am redirected to google)


I thought it was some kind of a joke. I guess you could say that google.com is a high performance tcp proxy server.


Yep, I second that. I'm getting redirected to google.


I got redirected when using chrome but not on firefox.


How does this compare to using HAProxy in TCP mode?


HAProxy is 90k loc while this is 300 loc, so they really are different beasts. I would say HAProxy is a great general purpose proxy that has most of the features you want, while this proxy server is an MVP proxy you can grow off of if you want to do something that HAProxy can't provide.


Netty is a more mature foundation for this sort of thing and likely much faster


Again, this is a really cool 300 loc snippet. No need to pull in the jvm if you're going to do something simple.

As for performance, a reproducible benchmark is the minimum requirement to even start the conversation.


And yet by "pulling in the JVM" (which is a rounding error in 2017), it's remarkably easy to generate a proxy that actually works as people expect a TCP proxy to work. This doesn't even so much as wait for connections to drain before terminating. This "300-line snippet" (which is reliant on Asio, which is not 300 lines by any stretch) is, as near as I can tell, not viable in real-world conditions and the spirited defenses of it that put it as competitive with real-world, battle-tested solutions are profoundly weird.

(Similarly, because I don't really care about JVM versus not, HAProxy--which I'd probably use for something like this because I have better things to do with my programming time--is 90Kloc because it has stuff to do and does it right. Simple is only better if simple can actually get the job done.)


Not sure if the 300 loc is really a fair comparison, as it is using ASIO (https://think-async.com/) under the covers, which is far more than 300 loc.


Can you expand on your reasons for making that claim - perhaps with some benchmarks?


Check out the techempower framework benchmarks. Netty can do over a million http responses per second on a reasonable machine.

On Linux it uses an epoll native driver and is asynchronous. The framework makes it possible to write proxies in a few lines.

If you want to beat netty by a significant margin you'll probably need to use kernel bypass


C++ 17 Networking TS is based on ASIO. It will be very interesting if netty performs better than ASIO based code for the same task.


They both use epoll underneath. Netty will likely perform better because it has a robust thread pool implementation.


I'd suggest looking at how thread pooling works in ASIO before drawing conclusions.


ASIO does all that. And it compiles to machine code. What you're basically trying to argue is that a well-written C++ application will be slower than a well-written Java application. That's not going to happen -- at best they will be the same performance.


Netty certainly has a more impressive list of projects using it in real-world high-volume scenarios. That's probably what wins in this case...enough high volume end users such that you've gotten enough edge cases to tweak the software and iron out bugs.

If I had to pick something in the C/C++ space to implement a custom proxy, I would probably stick to something where I could find a similar list of established high volume real world users. Facebook's Proxygen, or some customized HAProxy maybe.


I was trying to say in a nice way that it's unlikely you've created something as fast as netty because a lot of people spent a lot of time optimizing it.

Netty is also a lot easier to extend and more portable


Netty was first released in 2016?

HAproxy was first released around 2001.

Probably irrelevant to the question of speed, but there is a comment in another thread about hype driven development on HN first page right now where a commenter states they prefer Netty to the alternatives apparently because the alternatives are older or more cumbersome to use, although I may have misread.

Edit: This was a hasty, dumb comment. Please accept my apologies. Netty is not new and I should have known better. For whatever irrational reason, I have a bias against Java and deliberately avoid it. I do know it helps professional programmers get things done easier and faster. I'm an HAproxy user and have probably developed an HAproxy bias.


Netty is hardly the example of a hype driven framework. I'm not sure when it was first released but I've found references from version 2.0 in 2004. It may be older than HAProxy.

Netty is a far more robust, faster, and easier to use framework for TCP proxies than the one the author cooked up and I'm getting downvoted like crazy for saying it.

It's also used internally by Google, Twitter, and Netflix. It's embedded in the gRPC library, Cassandra's database driver, Play framework, and Vert.x among many others. Check their related projects page https://netty.io/wiki/related-projects.html

Netty is a phenomenal project, and had the author known about it, I doubt he would have spent the time writing his own TCP proxy.


> Netty was first released in 2016?

Netty 2 (the current version that underlies WebSphere and Vertx) was first released in 2004. This stuff is pretty well-bulletproofed. And a lot of folks who know how to write high-performance Java are naturally going to prefer Netty to the C++ alternatives (I am ambivalent; I can do either and I'd probably just use HAProxy to begin with because life is short) because you get competitive performance while ruling out entire classes of errors.


The great thing about battletested, JIT VMs like CLR/JVM/HiPE is that the user code can be compiled once and the providers can keep optimizing them in future versions and also with uptime. As long as memory usage, GC behavior and performance vs. optimized C is close, it's usually a win.


having had the brain blown out trying to read the documentation of haproxy I can't believe that there's something haproxy can't provide


Are there benchmarks available too? Could you please share them for an objective comparison with other implementations? Thank you!


If it meets the guidelines, this might make a good 'Show HN'. Show HN guidelines: https://news.ycombinator.com/showhn.html


I felt kinda bad tearing it apart and only did so because it wasn't posted as a ShowHN.

It's cool to show people an example of networking using Boost but the article and post have no mention of this being alpha quality software.


Maybe the author was proud of shipping.


Darn...opening 2 sockets, read from one and write to the other, that qualifies as "show me mama...no hands" :) ?


It blows my mind that this is on the front page. It's not useful for anything, doesn't fully work, and can be replicated with a few lines on the console of BSD/Linux/windows.


Reminds me of the time a friend wanted my opinion on a quote he received that it would take a week to open a socket in Unix using C code.


Sounds about right. Takes even longer to close it. That's why you don't manage the socket yourself and just use boost ;)



