
Even with things like Python, CGI is pretty fast these days. If your CGI script takes a generous 400 milliseconds of CPU to start up and your server has 64 cores, you can serve 160 requests per second, which is 14 million hits per day per server. That's a high-traffic site.

That is, if your web service struggles to handle single-digit millions of requests per day, not counting static "assets", CGI process startup is not the bottleneck.
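Spelling out the arithmetic (a trivial sketch; the 400ms and 64 cores are the assumed figures from above, not measurements):

    # Back-of-envelope for the figures above; not a benchmark.
    startup_cpu_seconds = 0.4      # generous per-request CPU cost to start the script
    cores = 64

    requests_per_second = cores / startup_cpu_seconds     # 160
    requests_per_day = requests_per_second * 86_400       # ~13.8 million

    print(f"{requests_per_second:.0f} req/s, {requests_per_day / 1e6:.1f}M req/day")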

A few years ago I would have said, "and of course it's boring technology that's been supported in the Python standard library forever," but apparently the remaining Python maintainers are the ones who think that code stability and backwards compatibility with boring technology are actively harmful things, so they've been removing modules from the standard library if they are too boring and stable. I swear I am not making this up. The cgi module is removed in 3.13.

I'm still in the habit of using Python for prototyping, since I've been using it daily for most of the past 25 years, but now I regret that. I'm kind of torn between JS and Lua.






Here's the justification for removing cgi - https://peps.python.org/pep-0594/#cgi

Amusingly that links to https://peps.python.org/pep-0206/ from 14th July 2000 (25 years ago!) which, even back then, described the cgi package as "designed poorly and are now near-impossible to fix".

Looks like the https://github.com/jackrosenthal/legacy-cgi package provides a drop-in replacement for the standard library module.
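If I'm reading the package right, it installs modules under the old names, so existing scripts shouldn't need any changes; something like this (the "name" form field is just an illustration):

    # Assumes legacy-cgi ships drop-in `cgi` and `cgitb` modules
    # (pip install legacy-cgi), as its description suggests.
    import cgi
    import cgitb

    cgitb.enable()                  # same debugging helper as before
    form = cgi.FieldStorage()       # same form-parsing API as the removed module
    name = form.getfirst("name", "world")
    print("Content-Type: text/plain\r\n\r\nHello,", name)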


That fails pretty hard at providing a rationale. Basically it says that CGI is an inefficient interface because it involves creating a new process! Even if that were true, "You shouldn't want to do such an inefficient thing" is very, very rarely a reasonable answer to a technical question like "How do I write a CGI script in Python?" or "How do I parse a CSV file in Python?"

There are certainly some suboptimal design choices in the cgi module's calling interface, things you did a much better job of in Django, but what made them "near-impossible to fix" was that at the time everyone reading and writing PEPs considered backwards compatibility to be not a bad thing, or even a mildly good thing, but an essential thing that was worth putting up with pain for. Fixing a badly designed interface is easy if you know what it should look like and aren't constrained by backwards compatibility.


Not to mention that if efficiency is the goal, Python probably isn't the right language anyway, so it is a very strange argument coming from Python developers.

It would have been a less strange argument 25 years ago, before the manycore era, when using Python involved less of a performance sacrifice. And there are still cases where Python is acceptably performant. However, the argument is from only 6 years ago, which makes it ridiculous.

Python is still a 10x or more performance sacrifice for anything that's actually CPU-throughput-limited. Or, alternatively, your VM hosting cost will be 10x larger on Python than on something top of the line, if your workload is CPU-throughput-limited. Whether you're actually CPU-limited, and whether VM hosting is your largest cost, is a totally different question :-)

Typically, computation you do in Python code costs you about 40× as much CPU as if you did it in C. But with Numpy I usually see only about a 4× single-core slowdown after a little optimization work. Many database-backed web services are bottlenecked on the database or text template instantiation, neither of which are really related to Python's CPU efficiency.
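A crude illustration of the kind of ratio I mean (a toy comparison; the exact factors vary a lot by workload and hardware):

    # Pure-Python loop vs. one vectorized NumPy call over the same data.
    import time
    import numpy as np

    xs = list(range(1_000_000))
    arr = np.arange(1_000_000, dtype=np.float64)

    t0 = time.perf_counter()
    total = 0.0
    for x in xs:                        # interpreter overhead on every element
        total += x * x
    t1 = time.perf_counter()

    t2 = time.perf_counter()
    total_np = float(np.dot(arr, arr))  # the loop runs in compiled code
    t3 = time.perf_counter()

    print(f"pure Python: {t1 - t0:.3f}s  NumPy: {t3 - t2:.3f}s")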

As a side note, the most popular databases are getting only a tiny fraction of the available performance on current hardware. I wrote a couple of comments with more details about this a week ago: https://news.ycombinator.com/item?id=44408654

In the manycore world, Python's GIL makes some approaches to scaling across cores unavailable, though that is changing. But I don't think those are usually relevant to web server throughput, just (potentially) latency.


I work with Python quite a bit. Basically, you're either in a world where a web request takes 40ms to process instead of 20ms and that just doesn't matter, or you're in a situation where the request takes 2000ms instead of 500ms, which is not as acceptable (though it might be, depending on the UI on top of that request). At that point your first stop is something like numpy or another C or Rust module that will do the brunt of the CPU-intensive work. Past that, yeah, you gotta look at different runtimes. But Python is so fast for prototyping that it might not even be worth it.

I haven't tried it yet but I do wonder about the feasibility of writing code in Python and then having an LLM transcode it to something like C, especially since I know C well enough to do what I want in that directly so I could check the resultant code by hand.


That is in agreement with my experience.

I've had much better luck with LLMs translating code from one language to another than with writing it from scratch.


We are not yet there; however, I do see a future where the target language might as well be machine code, and slowly we will leave the current languages behind.

Moving stuff out of the standard library seems like a reasonable goal. However, I think this is all a weird mix of arguments. IMHO new process spawning is a feature and not a bug in the use cases where CGI is used. Most of the stuff is low-traffic config interfaces or remotely invocable scripts. There was this trend to move stuff to fcgi. We had tons of cases of memory leaks in long-running but really seldom-used stuff like mailing list servers. To me CGI is the poor man's alternative to serverless. However, I also do not really completely understand why a standard library has to support it. I have bash scripts running using the Apache CGI mod.

I would have phrased it, "serverless is a marketing term for CGI scripts."

I have bash CGI scripts too, though Shellshock and bash's general bug-proneness make me doubt that this was wise.

There are some advantages of having the CGI protocol implemented in a library. There are common input-handling bugs the library can avoid, it means that simple CGI programs can be really simple, and it lets you switch away from CGI when desired.

That said, XSS was a huge problem with really simple CGI programs, and an HTML output library that avoids that by default is more important than the input parsing—another thing absent from Python's standard library but done right by Django.
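Even without a framework, the bare minimum such an output library should enforce by default looks something like this (a sketch; `greeting_page` and the payload are just illustrations):

    # Escape everything interpolated into HTML unless it's explicitly marked safe.
    from html import escape

    def greeting_page(name: str) -> str:
        # escape() neutralizes markup in user-supplied values
        return f"<!DOCTYPE html><p>Hello, {escape(name)}!</p>"

    print(greeting_page("<script>alert(1)</script>"))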


I used bash cgi scripts all the time. Haven’t used a python cgi module but the main benefit of perl’s cgi module (also removed) is the query parameter parsing.

I think CGI.pm is still in CPAN? It was never in the Perl standard library, was it?

As mentioned elsewhere in the thread, the query parameter parsing is still in the Python standard library, just invoked differently.
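For example, something like this covers the common cases (a rough sketch: it skips multipart/form-data uploads, and the "name" field is just an illustration):

    #!/usr/bin/env python3
    # Query/form parsing with only the standard library.
    import os
    import sys
    from urllib.parse import parse_qs

    params = parse_qs(os.environ.get("QUERY_STRING", ""))

    if (os.environ.get("REQUEST_METHOD") == "POST" and
            os.environ.get("CONTENT_TYPE", "").startswith("application/x-www-form-urlencoded")):
        length = int(os.environ.get("CONTENT_LENGTH") or 0)
        params.update(parse_qs(sys.stdin.read(length)))

    print("Content-Type: text/plain\r\n\r\nname =", params.get("name", ["?"])[0])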


Yes, it's still in CPAN, but previously it was in the standard library.

I didn't realize! Well, shame on P5P.

The main rationale is earlier in the PEP: https://peps.python.org/pep-0594/#rationale

Right. Isn't that insane? If I hadn't found it by means of having modules removed that my code depended on, I would have thought it was satire.

That policy, and the heinous character assassination the PSF carried out against Tim Peters, mean I can no longer recommend in good conscience that anyone adopt Python.


I really understand your frustration. Everyone developing in Python for a long time has felt it a bit too often when breaking changes (even between minor version updates) once again ruins the day.

But I also understand that the world is not perfect. We all need to prioritize all the time. As they write in the rationale: "The team has limited resources, reduced maintenance cost frees development time for other improvements". And the cgi module is apparently even unmaintained.

I guess a "batteries included" philosophy sooner or later is caught up by reality.

What do you mean by "character assassination" carried out against Tim Peters? Not anything in the linked article I presume?


He was banned for 3 months for opposing a change to the PSF bylaws that would allow the board to remove members with a simple majority vote.

https://www.theregister.com/2024/08/09/core_python_developer...

https://tim-one.github.io/psf/ban

https://chrismcdonough.substack.com/p/the-shameful-defenestr...


Alright. Another case where a "code of conduct" trumps manners or actually being a grownup. It really is a shame. Happened to a friend of mine on a rather big technical mailing list, just for arguing for something that some people disagreed with. It would be nice to get back to a system based on manners and respect. That system worked for years.

Maintenance costs... that only exist because other parts of Python do not keep a stable and backwards-compatible API? Same problem as everywhere else, but particularly silly when different parts of the same organization are ruining it for each other internally. Not that I think it is ever defensible. A small cost-saving in one place that is causing more extra work in many other places.

On top of that, backward incompatibility creates a cost for everyone using Python. I would prefer a slower rate of change and fewer breaking changes.

It does make me wonder whether Python is still the best choice for what I use it for, and whether I should be moving to something else.


They have limited resources because the inner circle chased away most active people in order to secure their own corporate positions (which hilariously failed since companies caught on and fired some of them anyway).

So the remaining people periodically launch some deprecation PEPs or other bureaucratic things in order to give the appearance of active development.


No, it was an unrelated scandal. I don't have my bookmarks handy at the moment, so hopefully you can find a link.

As for prioritizing, I think the right choice is to deprioritize Python.


Python is for everyone, not just the PSF Cabal. Like the Democratic party, there is a huge need for new leadership. We have all seen what a little brigading can do.

> Everyone developing in Python for a long time has felt it a bit too often when breaking changes (even between minor version updates) once again ruins the day

No, not everyone. I've been using Python as my primary language since 2000 (that's 1.5.2 days). It has been the least troublesome language that I work with, and I work with (or have worked with) a bunch (shell, perl, python, ruby, lua, tcl, c, objective-c, swift, java, javascript, groovy, go and probably others I'm forgetting).

Even all the complaints about the Python packaging ecosystem over the years... I just don't get it. Like, have you ever tried working with CPAN or Maven or Gradle or even, FFS, Ruby Gems/bundler? The Python virtual environment concept is easy to understand and pip mostly does its job just fine, and these days, uv makes all that even faster and easier.

Anywho, just dropping a contrarian comment here because maybe I'm part of the generally silent majority that is just able to use Python day in and day out to get their job done.


I have not had problem with Python packaging myself, so I agree with that bit.

I have not yet had major problems with breaking changes, but they do happen more often than they used to, and it makes me nervous.


> There are only two kinds of languages: the ones people complain about and the ones nobody uses. --Bjarne Stroustrup

I've used CPAN, Maven, gem, and bundler, so I'm also always a little puzzled when people complain about Python's packaging system. However, I've also used npm, so I can kind of understand it.

Python was great in 02000, but some of the things that made it great then are gone now. Stability was one of those; simplicity another; reasonable performance a third; but the biggest issue is really social rather than technical. And I feel like we have alternatives now that we didn't have then.


I wouldn't recommend Python for new projects regardless. New scripts maybe, sure, but not new projects. Python has a lot of problems and it's just not really worth it because the experience it provides is not unique.

That's where I'm ending up, but I don't even want to use it for new scripts! What are its most important problems, as you see it?

Performance is a big one. GIL is still a thorn in Python's side, although somewhat ironically CGI side-steps that. Then there's the environment configuration, which is just one big footgun. And then there's error handling. I find python scripts with ticking time bombs all the time.

The maintenance burden of Python projects is just so much higher than it has any right to be. The language is neat, but not that neat. I think too often we think of performance as a sort of "tradeoff" for having a bad, unergonomic language, but that's not necessarily true. Plenty of languages have poor performance and are also a pain in the ass. We no longer live in a world where our options are C++ or scripting languages. We have mature environments with fantastic tooling. We have fast compilers with amazing error messages. We have great runtimes with more than adequate performance.


That makes sense to me. Which alternatives are you favoring, especially for the kind of prototyping stuff that is Python's strong point?

I do think there are some inherent tradeoffs in the space.


Go is a great choice these days pretty much exclusively due to the tooling. Turns out having a good compiler that's really, really fast is a big deal.

My main issue with prototyping as a concept is that it doesn't exist in most workplaces. Prototypes quickly devolve into applications. Discarding code is risky. Your best bet IMO is choosing a language that's ergonomic in the long run, because odds are you're in for the long run.

dotnet is another great choice because of the tooling and batteries included, although you do have to deal with a fairly slow compiler. Java is okay too, but Java is very restrictive and high-friction, which might not lend itself to prototyping.

In the world of scripting languages, ironically PHP is a decent choice. It has better progressive typing than Python and it's reasonably safe these days. We've sort of come full circle on PHP. The downside is that PHP programmers tend to throw everything in an array, especially when going fast. That hurts readability and the IDE a lot.

And then, of course, typescript and node. I don't like typescript. There's something about scripting languages with build steps that pisses me off. But, it's got a wide developer pool and it's not the worst language ever. Although there's a bit too much teeth-pulling IMO with typescript.


Thank you! There seems to be a TypeScript REPL at https://typestrong.org/ts-node/, so it's at least feasible...

Neat, thank you, didn't know about the TS REPL. I heard there's some talk to standardize TS in browsers, which would be nice.

If you rely on ‘cgi’ in your Python application, you are probably fine using 3.12 until mid-2028 when it stops being maintained (and probably beyond).

You guys are all really getting worked up over very little.


I don't even use Python, and even I've read Tim Peters' works and think highly of him! To have him so unceremoniously booted for upsetting a committee is absolutely insane.

This is a bit like Apple firing Steve Jobs for wearing sneakers to work because it violates some dress code.


Idk about the internal affairs, I just really don't like Python for web backend kind of things. It's taking them way too long to sort out parallelism and packaging, while NodeJS got both right from the start and gracefully upgraded (no 2->3 mess).

Also I used Python way before JS, and I still like JS's syntax better. Especially not using whitespace for scope, which makes even less sense in a scripting language since it's hard to type that into a REPL.


Node.js actually had no parallelism at the start, other than the ability to manually spawn new processes. Worker threads were only added in 2018 with v10.5.0, and only stabilized in 2019 with v12.

What Node.js had from the start was concurrency via asynchronous IO. And before Node.js was around to be JavaScript's async IO framework, there was a robust async IO framework for Python called Twisted. Node.js was influenced by Twisted[0], and this is particularly evident in the design of its Promise abstraction (which you had to use directly when it was first released, because JavaScript didn't add the async/await keywords until years later).

[0] https://nodejs.org/en/about


I was referring to the async io from the start, not worker threads. Other langs had their own frameworks for this, including Twisted for Python, but it really makes a difference having that stuff built-in and default.

Async IO is concurrency, not parallelism. And Node.JS is simply a framework for JavaScript, like Twisted is a framework for Python. If you compare a framework to a language, then of course the framework has more stuff built in; but that's hardly a fair comparison.

NodeJS is a separate runtime. It's not really the same language either since import syntax differs from web JS and the standard libs are different. Or in practical terms, you can't copy a lot of browser JS code and expect it to work in NodeJS as-is.

But that's beside the point. Performant web backends are way easier to deal with in NodeJS than in Python. I'm not comparing to Twisted because, even though it looks good, every Python backend I've ever seen was either plain Python or Django, which was also a mess compared to Express.


The amount of breakage in the Node land when doing major package upgrades far exceeds anything seen in Python. And it happens more often, too, because the stdlib is so thin you need way more packages to do anything interesting.

Not saying that Python is great, but Node is even worse.


Yeah, generally I feel like the indentation sensitivity was the right idea (the alternative evidently being worse compiler error messages, bugs like the `goto fail` vulnerability, and greater verbosity) but it causes real difficulties with the REPL, as well as with shell one-liners.

Jupyter fixes the REPL problem, and it's a major advance in REPLs in a number of other ways, but it has real problems of its own.


I agree async is a mess, but for web backends what is wrong with multi-process?

I do not think JS got it right. Node did, by doing async, but the reason for that was that JS did not do threads! It was making a virtue of a deficiency.

I love whitespace for scope.


JS didn't do threads for a reason, though. It's not that the people working on JS had never heard of threads. Java, which JS was named after, was pervasively multithreaded from the beginning. The Microsoft IE folks lived and breathed threads. Opera even had multithreaded JS in 02000 before they took it out.

JS didn't do threads because threads are an error-prone way to write concurrent software. Crockford was a huge influence on its development in the early 02000s, and he had been at Electric Communities; he was part of the capabilities cabal, centered around Mark S. Miller, who literally wrote his dissertation on why you shouldn't use threads and how to structure async code comprehensibly. Promises came from his work, by way of Twisted. Unfortunately, a lot of that work didn't get into JS until well after Node had defined a lot of its APIs.

But this wasn't "making a virtue of a deficiency". JS was intentionally exploring a different approach to structuring concurrent software. You might argue that that approach didn't pay off, but it wasn't some kind of an accident.


I would say that it would be better to offer both options, depending on what you are doing. Not so much for JS's original role in the browser, but threads can be the right approach for a lot of backend tasks.

TCL, which promoted the same approach for the same reasons long before Node, eventually added threading.


That's an excellent point about Tcl; I'm sure it was a strong influence on Brendan's design for JS, though maybe not as strong as Perl, Java, and Scheme.

We clearly need some way to take advantage of manycore, but I'm still not convinced that threading with implicitly shared mutable state is the right default. It isn't even what the hardware implements! Every core has its own cache! It's a better fit to the hardware than a single giant single-threaded event loop is, and I think that accounts for its current dominance, but there are a lot of other possibilities out there, like transactional memory, explicit asynchronous message-passing interfaces, or (similarly) lots of tiny single-threaded event loops like Erlang (or, maybe, like Web Workers).


Web backends tend to have lots of concurrent connections doing IO-bound things. Processes or OS threads have too much overhead for this; they're more for when you want CPU parallelism with some communication between threads. Thread pools are still only a compromise. So JS with event loops made a lot of sense.

Greenthreading like in Golang is even better cause you get the advantages of OS threads, but that requires a runtime, which is why Rust didn't do that. And I guess it's hard to implement, cause Java didn't have it until very recently.


> Web backends tend to have lots of concurrent connections doing IO-bound things. Processes or OS threads have too much overhead for this

Depends what you are doing, how you are doing it, and how careful you need to be with resources.

The article is about the fact that it is often OK even when done in a particularly inefficient way.


Well yeah it still works with enough hardware, but what I said is why it's usually inefficient.

Java had green threads since the beginning. That's where the term "green threads" came from and why there's a yield() method.

An early Java version had greenthreads, but they were soon removed in like the year 2000. That's pretty much the reason all our Java code at work uses some kind of cooperative multitasking with that painful promises syntax (foo().then(...))

It seems Python has firmly reached its "Wikipedia notability" era, with busybodies that code little but discuss much dominating and ruining all progress. They make up stuff that reads insane to anyone doing actual work, like the "maintenance burden" of the cgi module:

https://github.com/python/cpython/commits/3.12/Lib/cgi.py

Turns out most of the maintenance this thing received was the various attempts at removing it.


The Python maintainers are removing the module _named_ cgi, but they're not removing the support for implementing CGI scripts, which is CGIHTTPRequestHandler in the http.server module.

All that was in the cgi module was a few functions for parsing HTML form data.
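For instance, something like this serves executables out of a ./cgi-bin/ directory (though, as the reply below points out, this handler is itself deprecated):

    # Minimal CGI-serving sketch with the standard library; scripts go in ./cgi-bin/.
    from http.server import CGIHTTPRequestHandler, HTTPServer

    HTTPServer(("", 8000), CGIHTTPRequestHandler).serve_forever()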


It would be very difficult indeed to make it impossible to implement CGI scripts in Python; you'd have to remove its ability to either read environment variables or perform stdio, crippling it for many other purposes, so I didn't think they had done that. Even if they removed the whole http package, you could just copy its contents into your codebase. It's not about making Python less powerful.

As a side note, though, CGIHTTPRequestHandler is for launching CGI programs (perhaps written in Rust) from a Python web server, not for writing CGI programs in Python, which is what the cgi module is for. And CGIHTTPRequestHandler is slated for removal in Python 3.15.

The problem is gratuitous changes that break existing code, so you have to debug your code base and fix the new problems introduced by each new Python release. It's usually fairly straightforward and quick, but it means you can't ship the code to someone who has Python installed but doesn't know it (they're dependent on you for continued fixes), and you can't count on being able to run code you wrote yourself on an earlier Python version without a half-hour interruption to fix it. Which may break it on the older Python version.


My mistake.

The support for writing CGI programs in Python is in wsgiref.handlers.CGIHandler .
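A minimal example, assuming the script runs under an ordinary CGI-capable web server:

    #!/usr/bin/env python3
    # CGI script via the WSGI-over-CGI bridge in the standard library.
    from wsgiref.handlers import CGIHandler

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"Hello from a WSGI app running under CGI\n"]

    CGIHandler().run(app)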


Yes, thanks.

[flagged]


I can't wait to live in the world where openly admitting your mistakes is considered evidence of disingenuousness.


with the rise of docker, I assume the current gen of progress just assumes you're going to containerise your solutions. How many projects are you actively upgrading python for?

On the other hand, you can't carry old stuff to infinity.

You can't carry anything to infinity, by definition. Please argue constructively.

Then remove the hyperbole and be merry.

You can't carry everything to very long term horizons, especially for categories of "everything" whose user base is 2 people and 1 squirrel.

People who want otherwise should volunteer to maintain what they want kept.


The social problem is that maintaining backward compatibility with boring technology is considered harmful by the current Python community. There was an active campaign to extract pledges from popular Python library authors to break compatibility with Python 2 by a certain date. This means that if you are volunteering to maintain what you want kept, you had better not tell anyone about it.

Do you have any URIs or backup links? Encouragement to actively break compatibility sounds like "inciting murder" vs. a more tame "don't expend effort (go to medical school)" to save folks.

{ Oh, and before anyone jumps on me, this is only an analogy as it relates to freshman moral philosophy courses, not an attempt by me to over-dramatize - that is more the fault of said courses trying to engage 18 year olds. :-) I'm mostly interested in the active-passive details of the pledge campaign. }



Thanks!

Sure!

Why not? Was it broken? If it was, is it easily fixable?

Well, for one, security concerns, especially for an internet oriented component.

Secondly, you have to find a reliable maintainer or several.

A lot of people want stuff to be maintained indefinitely for them by unspecified "others".


You don't have to find a maintainer.

Not updating the system is usually a solution to such problems.

At best there is an nginx or an API in front that acts as a reverse proxy to clean up/normalize the incoming requests and prevent directly exposing the service.

Example: banks, airlines, hospitals, air traffic controllers, electricity companies, etc

All critical services that nobody wants to touch, as it works +/-


Guess what, all those places can just use Python 3.12 for as long as it's maintained and if they REALLY can't update, they can:

a) make the system air gapped

b) pay a Python consulting company to back port security fixes

c) hire a Python core dev to do the system, directly

OOOOR, they can just update to Python 3.13 and migrate to the equivalent Python package that's not part of the core. For sure they already use other Python packages anyway.

We're making a mountain out of a molehill, also on behalf of places that have plenty of money to spend if push comes to shove.


I think it may be easier to backport CGI to a new version of Python rather than backport security fixes

I agree.

It takes time, and this means that instead of working on something else, their time is locked on this.

The CGI standard hasn't changed… what changes did the module need?

I don't get it. Having a complaint about Python removing CGI from the stdlib is well and fine. But then you say you'd rather consider JS, which doesn't even have a std lib? Lua doesn't have a CGI module in stdlib either.

I think it's fine to not have functionality in the standard library if it can be implemented by including some code in my project. It's not fine to have a standard library that stuff disappears from over time.

Ruby has been removing stuff from stdlib for some time now. But "moving" is the correct word, because it is simply moved to a stand-alone gem, and with packaging situation in Ruby being so good, it feels completely seamless.

Whenever code is removed from the Java standard library it is announced ages ahead of time and then typically it becomes available in a separate artefact so you can still use it if you depended on it.

I wrote an online securities trading system in Java, with a little Jython. Java is reasonably good at stability, but I find it unappealing for other reasons, especially for prototyping. Kotlin might be okay.

Jython no longer works with basically any current Python libraries because it never made the leap to Python 3, and the Python community stigmatizes maintaining Python 2 compatibility in your libraries. This basically killed Jython, and from my point of view, Jython was one of the best things about Java.


Note that Jython was replaced by GraalPy, which does target Python 3

I hadn't heard about GraalPy! Thanks for the note.

It is fine though. CGI for python is one pip install away, as it is for the other languages you listed.

Most rational people are ok with code being removed that 99.99% of users have absolutely no use for, especially if it is unmaintained, a burden, or potentially contains security issues. If you are serious about cgi you’ll probably be looking at 3rd party modules anyway.


I don't think it's reasonable to expect that most python installs use pip

Why not?

I'm not sure what @LtWorf means, exactly, but one reason I can think of is that on Linux (Gentoo & Debian at least), the system package managers are not putting pip in place by default with python itself, the way they used to. The rationale, I believe, is to steer users towards using the system package manager or only doing ~/.local style things.

EDIT: So, you get threads like this https://stackoverflow.com/questions/65651040/what-is-the-rec... and so on


Presumably there’s a distro maintained python3-cgi package to install then which handles the need


Because it's much easier to use apt and install the module system-wide, and then I don't need to mess around with venv, requirements.txt, and there's someone that will fix any CVEs or malicious backdoors that get found out, automatically. Unlike what happens with venv.

It's much easier until you run into a package that doesn't have a .deb for it, like, well, CGI?

That is very much a self-inflicted wound, though. If you insist on not using the standard packaging solution for the language, you have to own the complications of that.


If you do insist on using pip, you will often find that it works very poorly or not at all if you are using an old version of either Python or some Python module. This is another aspect of the social backward compatibility problems in the current Python community.

https://packages.debian.org/sid/python3-legacy-cgi

Python has no standard packaging. They even deprecated and removed distutils (another terrible idea that caused a lot of busywork). The only way that python supports packages is via 3rd party external solutions.


Wheels are the standard Python packaging.

Source?

PEP 427

Describing a format and saying it's the standard are not one and the same.

Yeah, the distutils clusterfuck is another excellent reason someone might "insist on not using the standard packaging solution for the language": it's specifically and only the packages that did use the standard packaging solution for the language, distutils, that got broken when distutils was removed from the standard library.

I mean, I understand the desire to remove distutils. It sucked. It was the least Pythonic package in the whole Python standard library. But removing it was even worse, because it means you can't use old versions of most Python libraries with recent versions of Python.


You must be living in a different reality.

Yes mine. At work I don't use pip. We have thousands of servers all using python and none using pip or modules obtained via pip before imaging.

Personally… I don't use pip. Why? apt is there.


It’s easier for you to rely on the volunteers that maintain python packages for Debian. That’s fine, but if you need something niche like cgi, you might have to package it yourself.

Because pypi is managed by paid people?

Also I'm part of the python team in debian so I can package what's missing or update out of date things if I need.


I appreciate your work! The Python team in Debian has saved me an enormous amount of effort over the years.

I doubt I did anything you use ;)

The argument isn't about who has the standard lib; what I think Kragen is saying is that the Python leadership has no qualms about removing functionality that people rely on and making up lame reasons to do so.

This feels like the spacebar heating argument. Sure, there’s probably a fraction of a percent that uses Python for CGI like that, but it’s not worth making all other support harder to have it.

> Lua doesn't have a CGI module in stdlib either.

Lua barely has any stdlib to speak of, most notably in terms of OS interfaces. I'm not even talking about chmod or sockets; there's no setenv or readdir.

You have to install C modules for any of that, which kinda kills it for having a simple language for CGI or scripting.

Don't get me wrong, I love Lua, but you won't get far without scaffolding.


Right, you need something more specific than Lua to actually write most complete programs in. The LuaJIT REPL does provide a C FFI by default, for example, so you don't need to install C modules.

But my concern is mostly not about needing to bring my own batteries; it's about instability of interfaces resulting from evaporating batteries.


Honestly, I'm not worried about the batteries. Thanks to FFI, you can just talk to libc, and vendor any native Lua code you need. (That's my approach for LÖVE.)

LuaJIT, release-wise, has been stuck in a very weird spot for a long time, before officially announcing it's now a "rolling release" - which was making a lot of package maintainers anxious about shipping newer versions.

It also seems like it's going to be forever stuck on the 5.1 revision of the language, while continuing to pick a few cherries from 5.2 and 5.3. It's nice to have a "boring" language, but most distros (certainly Alpine, Debian, NixOS) just ship each release branch between 5.1 and 5.4 anyway. No "whatever was master 3 years ago" lottery.


Node.js provides the de facto standard library for JS backends, and it's got a good feature set.

That said these days I'd rather use Go.


Golang seems pretty comfortable from the stuff I've done in it, but it's not as oriented toward prototyping. It's more oriented toward writing code that's long-term maintainable even if that makes it more verbose, which is bad for throwaway code. And it's not clear how you'd use Golang to do the kind of open-ended exploration you can do in a Jupyter notebook, for example. How would you load new code into a Golang program that's already running?

Admittedly Python is not great at this either (reload has interacted buggily with isinstance since the beginning), but it does attempt it.
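The classic pitfall looks like this (a sketch; `mymod` and `Thing` are hypothetical names):

    # Objects created before a reload belong to the *old* class object, so
    # isinstance checks against the reloaded class fail.
    import importlib
    import mymod

    obj = mymod.Thing()
    importlib.reload(mymod)
    print(isinstance(obj, mymod.Thing))   # False after the reload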


There are live reload utilities for Go but yeah it comes down to recompiling and restarting - which is quick but any state is gone.

I agree it's not a rapid-prototyping kind of language. AI assistance can help though.


Can you load new code into a Node.js program that's running?

`eval()` to run new code and `delete require.cache` (to reload a module)

Hmmm... Can you eval entire files?

That's essentially what browsers do.

This is a question about the details of the Node.js runtime, for which it is not relevant what browsers do, because they do not use Node.js.

Yes.

Yeah, high-performance web used to be an art. Now it's: find the stupidly wasteful thing you did in order to ship fast, and stop doing that thing.

Your app could add almost no latency beyond storage if you try.


I'd rather stick with PHP or JS, due to having a JIT in the box for such cases.

Since I learnt Python starting in version 1.6, it has mostly been for OS scripting stuff.

Too many hard learnt lessons with using Tcl in Apache and IIS modules, continuously rewriting modules in C, back in 1999 - 2003.


I don't think the JIT will help that much as each request will need to be JITed again. Unless Node and PHP are caching JIT output

Yes, both do.

On the file system? CGI is a whole new process per request

edit: Looks like yes for Node JS. I can't tell for PHP as I keep getting results for opcache, which is different and in-memory.


Not sure what the current situation is, but PHP used to ship with APC, a shared memory cache, that enables this exactly. Even without filesystem overhead!

I still miss PHP's simple deployment, execution and parallelization model, in these over-engineered asyncy JavaScripty days.


My point wasn't about using them with CGI, rather that they have a JIT in the box: Node.js with V8, and PHP with Zend's own JIT since Facebook's efforts with HipHop and later Hack (HHVM), both with server support as well, so there's no need for the CGI approach.

Python got a JIT recently (an experimental / off by default feature in 3.13).

Finally, though it is still far away from the competition.

Also, let's see the impact of Microsoft's Python team layoffs on it, given that CPython developers only started caring about performance due to Facebook and Microsoft; so far, the JITs in Python have been largely ignored by the community.


Consider Perl. It's not quite as batteries-included as Python, but it is preinstalled almost everywhere and certainly more stable than JS and Lua. (And Python.)

You may be interested in an article I wrote in ;login: 23 years ago: https://www.usenix.org/publications/login/june-2002-volume-2...

At the time Perl was the thing I used in the way I use Python now. I spent a couple of years after that working on a mod_perl codebase using an in-house ORM. I still occasionally reach for Perl for shell one-liners. So, it's not that I haven't considered it.

Lua is in a sense absolutely stable unless your C compiler changes under it, because projects just bundle whatever version of Lua they use. That's because new versions of Lua don't attempt backwards compatibility at all. But there isn't the kind of public shaming problem that the Python community has where people criticize you for using an old version.

JS is mostly very good at backwards compatibility, retaining compatibility with even very bad ideas like dynamically-typed `with` statements. I don't know if that will continue; browser vendors also seem to think that backwards compatibility with boring technology like FTP is harmful.


Ha, fun bit of history! Many of the listed problems with Perl can be configured away these days. I don't have time for a full list, but as two early examples:

- `perl -de 0` provides a REPL. With a readline wrapper, it gives you history and command editing. (I use comint-mode for this, but there are other alternatives.)

- syscalls can automatically raise exceptions if you `use autodie`.

Why is this not the default? Because Perl maintainers value backward compatibility. Improvements will always sit behind a line of config, preventing your scripts from breaking if you accidentally rely on functionality that later turns out to be a mistake.

https://entropicthoughts.com/you-want-technology-with-warts

https://entropicthoughts.com/why-perl


That's a great read, thanks! And I didn't know about autodie, though I do use perl -de1 from time to time.

Perl feels clumsy and bug-prone to me these days. I do miss things like autovivification from time to time, but it's definitely bug-prone, and there are a lot of DWIM features in Perl that usually do the wrong thing, and then I waste time debugging a bug that would have been automatically detected in Python. If the default Python traceback doesn't make the problem obvious, I use cgitb.enable(format='text') to get a verbose stack dump, which does. cgitb is being removed from the Python standard library, though, because the maintainers don't know it can do that.
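In case it's unfamiliar, the trick is just this (the failing function is made up for illustration):

    # cgitb's plain-text mode: verbose tracebacks that dump the local variables
    # in every frame, not just the call sites. (cgitb is also slated for removal.)
    import cgitb
    cgitb.enable(format='text')

    def fail(denominator):
        numerator = 42
        return numerator / denominator

    fail(0)   # the dump shows numerator and denominator at the failing frame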

Three years ago, a friend told me that a Perl CGI script I wrote last millennium was broken: http://canonical.org/~kragen/sw/rfc-index.cgi. I hadn't looked at the code in, I think, 20 years. I forget what the problem was, but in half an hour I fixed it and updated its parser to be able to use the updated format IETF uses for its source file. I was surprised that it was so easy, because I was worse at writing maintainable code then.

Maybe we could do a better job of designing a prototyping language today than Larry did in 01994, though? We have an additional 31 years of experience with Perl, Python, JS, Lua, Java, C#, R, Excel, Haskell, OCaml, TensorFlow, Tcl, Groovy, and HTML to draw lessons from.


We can definitely do better than Perl. The easy proof is that modern Perl projects are supposed to start with a bunch of config to make Perl more sane, and many of them also include the same third-party libraries that e.g. improve exception handling and tweak the datetime functionality in the standard library.

One benefit Perl had that I think not many of the other languages do was being designed by a linguist. That makes it different -- hard to understand at first glance -- but also unusually suitable for prototyping.


What do you think a better design would look like?

It would be nice if there was one single incantation that you could use to basically get "Perl, but with sensible defaults for the modern age", rather than having to use individual packages to deal with specific idiosyncrasies one by one.

Basically something like "use strict" in JS.


This has been floated before and many people expect such a pragma to arrive at some point in the future.

For now, the relevant committees think some more experimentation and deprecation needs to happen before locking in the set of features to be considered modern.


I know this is probably not what you meant, but it's amusing to think about someone wondering if Perl has an equivalent to JS's 'use strict'!

Yep, I'm well aware that this is where it came from, so I guess I really should be asking for "use very strict" or something like that.

Come to think of it, that's a nice extensibility scheme, too - whenever you want to update it, just add another "very". ~


I wonder how many other language syntax things would be better in unary.

That article is a pretty good overview at the time.

Only one benchmark on one system, but over in day before yesterday's HN thread on this (https://news.ycombinator.com/item?id=44464272), I report a rather significant slowdown in Perl start up overhead: https://news.ycombinator.com/item?id=44467268 . Of course, at least for me, Python3 is worse than Python2 by an even larger factor and Python2 worse than Perl today by an even larger factor.

FWIW, in Nim, you can get a CGI that probably runs faster than the Go of this article with simply:

    import std/cgi                  # By default both ..
    for (key, val) in decodeData(): #.. $QUERY_STRING & POST
      if key == "something":
         do_something(val)
I don't know of a `cgitb` equivalent even in the Nimbleverse. Some of the many web frameworks in Nim like jester seem to have that kind of thing built into them, though I realize a framework is not the same as CGI (and that's one of the charms of CGI).

Hey, thanks!

What's the landscape like, for when you need to scale your project up? As in: your project needs more structure, third-party integrations, atomic deployments, etc - not necessarily more performance.

Python has Werkzeug, Flask, or at the heavier end Django. With Werkzeug, you can translate your CGI business logic one small step at a time - it's pretty close to speaking raw HTTP, but has optional components like a router or debugger.
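A rough sketch of that intermediate step with Werkzeug (the handler is a plain WSGI app, so it can even keep running under CGI via wsgiref while you migrate); the "name" parameter is just an illustration:

    # Werkzeug wraps the raw WSGI environ/response in convenient objects
    # but stays close to HTTP; no router or ORM required.
    from werkzeug.wrappers import Request, Response

    @Request.application
    def app(request):
        name = request.args.get("name", "world")
        return Response(f"Hello, {name}\n", mimetype="text/plain")

    if __name__ == "__main__":
        from werkzeug.serving import run_simple
        run_simple("127.0.0.1", 8000, app, use_debugger=True)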


The scary thing about CGI for me is more the shell insanity than the forking. Although I think after the shell shock RCE most servers probably switched to directly execing the CGI process.

I think they were already doing that. I'd be very surprised if Apache's mod_cgi in 01997 had been spinning up a whole bash process to spawn off the CGI script as a subprocess! Certainly I haven't seen the passels of useless bash processes in ps that you would get with that approach.

Indeed. There is no reason why CGI would need shells or scripting languages though, you can just write them in any programming language. It's not that hard; I wrote this pastebin clone in C: https://github.com/gsliepen/cbin/

It's not an issue with the actual CGI program. It's hard to make exec alone work the way people expect without doing something like exec('sh', '-c',...) so a lot of servers were doing that.

> If your CGI script takes a generous 400 milliseconds of CPU to start up

then that endpoint will have at least 400ms response times, not great


I have cgi scripts which take dozens of seconds to run. Load time isn’t a problem for many uses.

Zen 6c is about to get 256 cores per socket (512 threads), or 1024 vCPUs in a dual-socket system. At 400ms per request, that is 2,560 requests per second (or page views), and this doesn't even include caching.

If I remember correctly, that is about half of what StackExchange served on daily average over 8 servers. I am sure using Go or Crystal would scale this at least 10x, if not 20x.

The problem I see is that memory cost isn't dropping, which means that somewhere along the graph the aggregate per-process memory cost will outweigh whatever advantage this has.

Still, sounds like a fun thing to do. At least for those of us who lived through CGI-Bin and Perl era.


People use crystal?

Kagi is built with Crystal.

Yeah, but 400ms is unacceptable these days

That's intended as an unreasonably high upper bound. On my cellphone, in Termux, python3 -m cgi takes 430–480ms. On my laptop it takes 90–150ms. On your server it probably takes less.

I agree that tens of milliseconds of latency is significant to the user experience, but it's not always the single most important consideration. My ping time to news.ycombinator.com is 162–164ms because I'm in Argentina, and I do unfortunately regularly have the experience of web page loads taking 10 seconds or more because of client-side JS.


If the whole site takes 5 seconds to fully hydrate and load its 20megs of JS I'll gladly take a server side rendered page that has finished loading in a second.

I would rather take a server side rendered page that finishes loading in 100ms than 400ms.

On a site like Total Real Returns [1], running on Crystal, response time could be sub-10ms, but you are fundamentally limited by the latency between you and the server, which could be 150ms if you visit a US server from the Valeriepieris circle, where 50% of the world's population lives.

[1] https://totalrealreturns.com


Average website loading time nowadays is measured in seconds rather than milliseconds...

Now, using the OP's solution, the JS-loaded website takes 5.4s instead of 5s, roughly an 8% slowdown that the users have to pay, and that will increase server costs.

Surprised this is the top comment 12 hours in, should be intuitive to HN that 160 req/s on 64 cores is...not great!...and that's before the errors it takes to get that # (ex. all we're doing is starting the per-request executable)

Obviously you can do orders of magnitude better than 160 req/s! But if you only have tens of thousands of users or less, like all but the top thousand web sites in the world, there's at best very marginal benefit in doing so. It's spending effort to solve a problem that no longer exists.

I agree that the number I gave is erroneous, but not in the direction you imply. 400ms is extremely conservative, and it's not just starting an executable. I took my 100-millisecond estimate for the time to start up Python and `import cgi`, which loads 126 modules in the version I have here on my cellphone, and multiplied it by a safety factor of 4. Even on my cellphone it doesn't take that long, although `python3 -m cgi` does, because it loads all those modules and then runs a debug dump script. The time I measured for `python3 -m cgi` on my laptop is 90–150ms.

If all you are doing is starting a per-request executable, that will typically take more like 1 millisecond than 100 milliseconds.

Perhaps you meant to suggest that the actual logic triggered by the request would increase this 400ms number significantly.

Consider first the case where it would, for example because you do 200ms of computation in response to each request. In this case the best you could do by switching from CGI to a more efficient mechanism like FastCGI is a 3× increase in req/s. If that allows you to increase from 10 million requests per day on one server to 30 million, that could be worthwhile if you actually have tens of millions of requests per day. But it's kind of a marginal improvement; it saves you from needing two or three servers, probably.

Now consider the case—far more common, I think—where your computation per request is small compared to our hypothetical 400 ms Python CGI overhead. Maybe you do 20ms of computation: 80 million machine instructions or so, enough to tie an IBM PC up for a few minutes. In such a case, the total one-core request time has gone from 400ms to 420ms, so 400ms is a very good estimate, and your criticism is invalid.


> But if you only have tens of thousands of users or less

Even on HN#1, you get like two requests per second for maybe 18 hours. That's three orders of magnitude below 200M/day. Shifting the argument to say that no hobby project needs this is easy. PHP being in common use already proves that starting up and tearing down all context is fine for >99% of websites. But the context we're in is an article saying one can do 200M requests per day with CGI-bin, not whether 99% of sites need that. CGI-bin is simply wasteful if the process takes 400ms before it starts doing useful work, not to mention a noticeable amount of lag for users. (Thankfully TFA specifies they're not doing that but are speaking of compiled languages)

> your computation per request is [commonly] small compared to our hypothetical 400 ms Python CGI overhead. Maybe [that makes it go] from 400ms to 420ms

At "only" 420ms total request time, I'd still want to take that prototype code out of this situation. Might be as easy as wrapping the code so that the former process entrypoint (__main__ in python) gets, instead, called by some webserver within the same process (Flask as the first python example that comes to mind). Going from 420ms request handling time to slightly over 20ms seems rather worth an hour of development time to me if you have this sort of traffic volume


It's simply wasteful, just like spawning off a new process for every shell command. But we can afford to waste the thing it's wasting—unless it produces a noticeable amount of lag for users, as you're saying it would

The problem child for classic CGI is Windows, where process creation is 100x slower than any other POSIX implementation.

You can measure this easily, get a copy of Windows busybox and write a shell script that forks off a process a few thousand times. The performance difference is stark.


That's interesting. I hadn't thought about that. Still, fork() (plus _exit() and wait()) takes 0.7ms in Termux on my phone for small processes, as measured by http://canonical.org/~kragen/sw/dev3/forkovh.c.

Are you really saying that it takes 70ms on Microsoft Windows? I don't have an installed copy here to test.

Even if it does, that would still be about 15% of the time required for `python3 -m cgi`, so it seems unlikely to be an overriding concern for CGI programs written in Python, at least on manycore servers serving less than tens of millions of hits per day. Or does it also fail to scale across cores?


Let's consider ten thousand forks.

  $ cat winforktest.sh
  #!/home/busybox-1.35 sh

  i=1; while [ $i -lt 10000 ]; do (echo $i); i=$((i+1)); done
On Linux, this takes:

  $  cat /etc/redhat-release
  Red Hat Enterprise Linux release 8.10 (Ootpa)

  $ time ./winforktest.sh > /dev/null
  real    0m0.792s
  user    0m0.490s
  sys     0m0.361s
On Windows,

  c:\users\x>ver

  Microsoft Windows [Version 10.0.22631.5472]

  c:\users\x>busybox | busybox head -2
  BusyBox v1.37.0-FRP-5236-g7dff7f376 (2023-12-06 10:31:32 GMT)
  (mingw64-gcc 13.2.1-5.fc39; mingw64-crt 11.0.0-2.fc39; glob)

  c:\users\x>busybox sh

  ~ $ time ./winforktest > /dev/null
  real    3m 44.44s
  user    0m 0.32s
  sys     0m 4.90s
Windows is quite a bit slower, at least with Busybox.

I understand that WSLv1 presents a more efficient fork(), which is not really available to shims like Cygwin or mingw.


Thank you very much!

So, that does 19999 fork()s in 224 seconds, which works out to about 11 milliseconds per fork(). (On my Linux box, it's actually doing clone(), and also does a bunch of wait4(), rt_sigaction(), exit_group(), etc., but let's assume the fork() is the bottleneck.)

This is pretty slow, but it's still about 6× faster than the 70 milliseconds I had inferred from your "100× slower than any other POSIX".

Also note that your Red Hat machine is evidently forking in 39μs, which is several times faster than I've ever seen Linux fork.


Let me know how your tests go.

I did!

That’s more an argument against using Windows for a web server than against using CGI.

If you're necessarily going to go with Lua (not recommended, IMHO) you should at least try luau.

Thanks! What do you like and dislike about Luau and Lua in general?

> If your CGI script takes a generous 400 milliseconds of CPU to start up and your server has 64 cores, you can serve 160 requests per second

...and then you're wasting a 64-core server at 100% CPU load on just starting up and tearing down script instances, not yet doing any useful work. This is doing 160 startups per second, not requests per second

Would be a pretty big program for it to require 400ms on a fast CPU though, but the python interpreter is big and if you have one slow import it's probably already not far off



> I'm kind of torn between JS and Lua.

why lua?


Six months ago here, in response to a Lua-advocacy post that I didn't like very much, I wrote a comment where I gave an overview of what I like about Lua: https://news.ycombinator.com/item?id=42519070

Shortly after that, we had a lengthy thread about "how Lua compares to Python": https://news.ycombinator.com/item?id=42655158

Two years ago I commented https://news.ycombinator.com/item?id=38862372 listing Lua's worst flaws from my point of view.


> LuaJIT is motherfucking alien technology from the future

haha, keep writing, you're good


Thanks!

<PHP> I would like to have a word.

The PHP interpreter (+mods) don't take anywhere near 400ms to start up, even on very old CPUs, though? Not sure what you mean

It was a response to the person who complained about python's cgi being removed from stdlib and them wanting to go js or lua (both of which don't even have any supporting code for cgi)

The problem isn't that you can't write a CGI script in Python, or that it's hard to do so. The problem is that the Python standard library is being maintained under a policy of breaking new things intentionally every release, so nothing you wrote five or ten years ago works today. The `cgi` module is one thing that got broken, but there are a lot of them.


