Very nice, but this is more of a demo code for boost::asio than something that is production-worthy.
For example, it doesn't relay FINs between connections, doesn't disable Nagle's algorithm on the upstream socket, doesn't wait for pending writes to complete before tearing down the connection, doesn't handle congestion at all (potentially leading to unbounded memory use), etc.
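To make one item on that list concrete, here is a minimal sketch of disabling Nagle's algorithm on an upstream socket. It is written in Python rather than boost::asio, and the helper name `make_upstream_socket` is hypothetical. `TCP_NODELAY` is the standard socket option for this.

```python
import socket


def make_upstream_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY=1 sends small writes immediately instead of coalescing them,
    # which usually matters for a proxy relaying interactive traffic.
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return s
```

The option is set before connecting, so the caller would still need to call `connect()` on the returned socket.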
> nginx and apache only do HTTP(S). Don't support TCP.
That's not entirely accurate. For nginx, for example, there's the ngx_stream_core_module[1]. It's not particularly well-documented though, so I don't know much about it.
It's totally experimental. It was only released a few months ago. It's not even compiled into the official distribution; you gotta compile nginx yourself with special flags.
For God's sake, please don't have your entire site run on experimental software like that. ^^
stud is pretty approachable and production-worthy; I assume hitch [1] remains so. If you dropped the TLS termination, it would just be a TCP proxy as well.
As someone who has also written a TCP proxy (along with many others...), after thoroughly reading the page I'm still unsure of how exactly this is "high performance". It is also curious that, despite the fact that it uses a separate library for networking, the source is already quite a bit longer than some other proxies which don't. I found the explanation overly complex.
Around half the code in this implementation could probably be removed by the realisation that, after a connection is established, both ends are completely symmetric: all it needs to do is try to read from A and write to B, then try to read from B and write to A. If A closes, close B. If B closes, close A.
On EOF you shutdown() the write side of the other socket rather than close() it, with refcounting so that close() also happens once EOF has been detected in both directions (I have also written a TCP proxy). But yeah, TCP proxies are trivial; it would be more interesting to see something like a tunneling proxy that sends data over multiple connections to maximize performance.
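The symmetric relay with half-close described above can be sketched roughly as follows. This is a blocking, thread-per-direction illustration in Python, not the article's boost::asio code. The names `pump`, `relay`, and `CHUNK` are hypothetical, and waiting for both threads to finish stands in for the refcounting mentioned above.

```python
import socket
import threading

CHUNK = 64 * 1024  # at most this many bytes are buffered per direction


def pump(src: socket.socket, dst: socket.socket) -> None:
    """Copy src -> dst until EOF, then forward the EOF as a half-close."""
    while True:
        data = src.recv(CHUNK)
        if not data:  # peer sent FIN on this direction
            break
        # sendall blocks until the kernel has taken the data, so we never
        # read faster than we can write (simple backpressure).
        dst.sendall(data)
    try:
        dst.shutdown(socket.SHUT_WR)  # relay the FIN, keep reading possible
    except OSError:
        pass  # the other side is already gone


def relay(a: socket.socket, b: socket.socket) -> None:
    """Relay both directions; close only after EOF was seen on both."""
    threads = [
        threading.Thread(target=pump, args=(a, b)),
        threading.Thread(target=pump, args=(b, a)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    a.close()
    b.close()
```

Because each direction reads one chunk and then blocks in `sendall`, memory use stays bounded and pending writes finish before the sockets are closed. An event-loop design would need explicit buffering limits to get the same effect.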
Hey, I am looking for a way of packing multiple TCP streams over a single TCP connection to a backend server. I can unpack them in application logic if necessary. Do you know what that is called? Kinda like SCTP.
The quick and dirty approach would be an ssh tunnel or VPN. There are also GRE tunnels, which would be faster but not encrypted. Any of these would work without having to change the application.
Probably, though, it's best to start with why you would want to do that. You could be, for example, trying to solve something where a pub/sub model would work better. Or just two separate apps, on different ports. What's driving the idea of multiplexing?
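If unpacking in application logic is acceptable, the usual term is multiplexing, and a minimal sketch of it is length-prefixed framing with a stream id. Everything here is an assumption for illustration: the 6-byte header layout and the names `encode_frame` and `Demuxer` are hypothetical, not any existing protocol.

```python
import struct

# Hypothetical header: 4-byte stream id, 2-byte payload length, network order.
HEADER = struct.Struct("!IH")


def encode_frame(stream_id: int, payload: bytes) -> bytes:
    """Wrap one chunk of a logical stream for the shared connection."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a 2-byte length field")
    return HEADER.pack(stream_id, len(payload)) + payload


class Demuxer:
    """Reassemble frames from arbitrary slices of the shared byte stream."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, data: bytes) -> list:
        """Add received bytes; return the complete (stream_id, payload) frames."""
        self._buf += data
        frames = []
        while len(self._buf) >= HEADER.size:
            stream_id, length = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # frame not fully received yet
            frames.append((stream_id, bytes(self._buf[HEADER.size:end])))
            del self._buf[:end]
        return frames
```

Unlike SCTP, all streams here share one TCP connection, so a single lost packet stalls every stream until it is retransmitted.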
As a side note, if you don't have a requirement to compile with C++03, I would recommend using the standalone asio library, free from boost[1]. The only things you need to modify are the includes and namespaces.
HAProxy is 90k LOC while this is 300 LOC, so they really are different beasts. I would say HAProxy is a great general-purpose proxy that has most of the features you want, while this proxy server is an MVP proxy you can grow off of if you want to do something that HAProxy can't provide.
And yet by "pulling in the JVM" (which is a rounding error in 2017), it's remarkably easy to generate a proxy that actually works as people expect a TCP proxy to work. This doesn't even so much as wait for connections to drain before terminating. This "300-line snippet" (which is reliant on Asio, which is not 300 lines by any stretch) is, as near as I can tell, not viable in real-world conditions and the spirited defenses of it that put it as competitive with real-world, battle-tested solutions are profoundly weird.
(Similarly, because I don't really care about JVM versus not, HAProxy--which I'd probably use for something like this because I have better things to do with my programming time--is 90Kloc because it has stuff to do and does it right. Simple is only better if simple can actually get the job done.)
ASIO does all that. And it compiles to machine code. What you're basically trying to argue is that a well-written C++ application will be slower than a well-written Java application. That's not going to happen -- at best they will have the same performance.
Netty certainly has a more impressive list of projects using it in real-world high-volume scenarios. That's probably what wins in this case...enough high volume end users such that you've gotten enough edge cases to tweak the software and iron out bugs.
If I had to pick something in the C/C++ space to implement a custom proxy, I would probably stick to something where I could find a similar list of established high volume real world users. Facebook's Proxygen, or some customized HAProxy maybe.
I was trying to say in a nice way that it's unlikely you've created something as fast as netty because a lot of people spent a lot of time optimizing it.
Netty is also a lot easier to extend and more portable.
Probably irrelevant to the question of speed, but there is a comment in another thread, about hype-driven development, on the HN front page right now, where a commenter states they prefer Netty to the alternatives, apparently because the alternatives are older or more cumbersome to use. I may have misread, though.
Edit: This was a hasty, dumb comment. Please accept my apologies. Netty is not new and I should have known better. For whatever irrational reason, I have a bias against Java and deliberately avoid it. I do know it helps professional programmers get things done easier and faster. I'm an HAProxy user and have probably developed an HAProxy bias.
Netty is hardly an example of a hype-driven framework. I'm not sure when it was first released, but I've found references to version 2.0 from 2004. It may be older than HAProxy.
Netty is a far more robust, faster, and easier-to-use framework for TCP proxies than the one the author cooked up, and I'm getting downvoted like crazy for saying it.
It's also used internally by Google, Twitter, and Netflix. It's embedded in the gRPC library, Cassandra's database driver, Play framework, and Vert.x, among many others. Check their related projects page: https://netty.io/wiki/related-projects.html
Netty is a phenomenal project, and had the author known about it, I doubt he would have spent the time writing his own TCP proxy.
Netty 2 (the current version that underlies WebSphere and Vertx) was first released in 2004. This stuff is pretty well-bulletproofed. And a lot of folks who know how to write high-performance Java are naturally going to prefer Netty to the C++ alternatives (I am ambivalent; I can do either and I'd probably just use HAProxy to begin with because life is short) because you get competitive performance while ruling out entire classes of errors.
The great thing about battle-tested JIT VMs like CLR/JVM/HiPE is that the user code can be compiled once and the providers can keep optimizing them in future versions and also with uptime. As long as memory usage, GC behavior and performance are close to optimized C, it's usually a win.
It blows my mind that this is on the front page. It's not useful for anything, doesn't fully work, and can be replicated with a few lines on the console of BSD/Linux/Windows.