Hacker News
Waiting for SQL:202y: Group by All (eisentraut.org)
52 points by ingve 19 hours ago | 32 comments




Let me reference fields as I create them:

  select xxxxx as a
       , a * 2 as b
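
Until something like that lands, the usual workaround is to define the first expression in a CTE or subquery and build on it in the outer query. A minimal sketch, run through Python's sqlite3 module so it is self-contained; the table `t`, its column `x`, and the expression `x + 10` are made up here to stand in for the elided `xxxxx`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")  # hypothetical table
conn.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (2,), (3,)])

# Define `a` once in a CTE, then derive `b` from it in the outer query.
query = """
WITH step AS (
    SELECT x + 10 AS a   -- placeholder for the elided expression
    FROM t
)
SELECT a, a * 2 AS b
FROM step
ORDER BY a
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(11, 22), (12, 24), (13, 26)]
```

It works, but it costs an extra query level for every alias you want to reuse, which is exactly the clumsiness being complained about.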

This will be great! One of the things ClickHouse has had since 2016.

SQL needs to have `select` as the _last_ part, not the first. LINQ has had this for 2 decades by now: "from table_a as a, table_b as b where ... select a.blah, b.duh".

The Pipe Query Syntax in GoogleSQL implements this elegantly as well:

https://docs.cloud.google.com/bigquery/docs/reference/standa...


This is not relevant to GP's point. This is a separate topic, which... I don't really care, but I know a lot of people want to be able to write SQL as you suggest, and it's not hard to implement, so, sure.

Though, I think it might have to be table sources, then `SELECT`, then `WHERE`, then ... because you might want to refer to output columns in the `WHERE` clause.


WHERE clauses are pushed down into the query planner before the SELECT list is processed; that’s why HAVING exists.

The logical order, in full, is:

FROM/JOIN (you can still join using WHERE conditions with FROM a, b)

WHERE

GROUP BY

HAVING

SELECT

ORDER BY
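
A small sketch of the WHERE/HAVING split, assuming SQLite via Python's sqlite3 module: WHERE filters individual rows before grouping, HAVING filters whole groups after aggregation. The table mirrors the `test` table used elsewhere in this thread, with integer values chosen here for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (a_key INTEGER PRIMARY KEY, a_group INTEGER, a_val INTEGER)"
)
conn.executemany(
    "INSERT INTO test (a_key, a_group, a_val) VALUES (?, ?, ?)",
    [(1, 1, 55), (2, 1, 26), (3, 2, 11), (4, 2, 65)],
)

# WHERE drops the row with a_val = 11 before any group is formed.
where_rows = conn.execute("""
    SELECT a_group, SUM(a_val) AS total
    FROM test
    WHERE a_val > 20
    GROUP BY a_group
    ORDER BY a_group
""").fetchall()

# HAVING sees the finished groups and drops group 2 (total 76).
having_rows = conn.execute("""
    SELECT a_group, SUM(a_val) AS total
    FROM test
    GROUP BY a_group
    HAVING SUM(a_val) > 80
    ORDER BY a_group
""").fetchall()

print(where_rows)   # [(1, 81), (2, 65)]
print(having_rows)  # [(1, 81)]
```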


That's the order in which the processing happens, but this doesn't need to be reflected in the language. The language uses this ordering so that it reads like natural language, which is what SQL was designed to resemble.

Ideally, it needs to be "from", then an arbitrary number of something like `let` statements that can introduce new variables, maybe interspersed with where-s, and then finally "select".

"select" can also be replaced with annotations, something like: `from table_1 t1 let t1.column_1 as @output_1 where ...` and then just collect all the @-annotated variables.

I need to write a lot of SQL, and it's so clumsy. Every time I need a CTE, I have to look into the documentation for the exact syntax.


> Ideally, it needs to be "from", then arbitrary number of something like `let` statements

Isn't that what a CTE is?
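
For reference, here is roughly how chained CTEs read as a sequence of "let" bindings. This is a sketch on SQLite through Python's sqlite3 module; the `orders` table and its data are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")  # hypothetical
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("ann", 10), ("ann", 30), ("bob", 5), ("cy", 50)],
)

query = """
WITH totals AS (            -- first binding: per-customer totals
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
),
big AS (                    -- second binding: builds on the first
    SELECT customer, total
    FROM totals
    WHERE total >= 40
)
SELECT customer, total
FROM big
ORDER BY customer
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('ann', 40), ('cy', 50)]
```

Each binding is a whole table expression rather than a single named value, which may be why it feels heavier than a `let`.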


Also in the Kusto Query Language (KQL) as used by Azure Log Analytics.

What about reusing a CTE? Let me import a CTE definition so that it can be used throughout my app, not just in the current context.
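
The closest existing feature is probably a view, though it lives in the database schema rather than being imported by the application. A sketch on SQLite through Python's sqlite3 module; the `orders` table and the view name `customer_totals` are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("ann", 10), ("ann", 30), ("bob", 5), ("cy", 50)],
)

# Store the query once under a name instead of repeating it as a CTE.
conn.execute("""
    CREATE VIEW customer_totals AS
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
""")

# Any later query on this database can now use the definition.
top = conn.execute(
    "SELECT customer FROM customer_totals ORDER BY total DESC LIMIT 1"
).fetchone()
count = conn.execute("SELECT COUNT(*) FROM customer_totals").fetchone()

print(top, count)  # ('cy',) (3,)
```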

So, why not a SORT BY ALL or a GROUPSORT BY ALL, too? Not always what you want (e.g., when you're ranking on a summarized column), but often alphabetic order on the GROUP BY columns is just what the doctor ordered! :-)

The working group also discussed ORDER BY ALL, but for some reason most participants really did not like it.

Snowflake has that; once you start using it, it's painful to go back.

Also just let me reference the damn alias in a group by, FUCK

The problem with this and similar requests is that it would change the identifier scoping in incompatible ways and therefore potentially break a lot of existing SQL code.
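
A sketch of the kind of query that would change meaning, run on SQLite through Python's sqlite3 module with a made-up table. The alias deliberately shadows an input column. SQLite resolves the GROUP BY name to the input column, and PostgreSQL documents the same rule; if aliases took precedence instead, the first query would silently return two rows rather than three:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a_group INTEGER, a_val INTEGER)")  # hypothetical
conn.executemany(
    "INSERT INTO test (a_group, a_val) VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30)],
)

# The output alias has the same name as the input column.
# GROUP BY a_group resolves to the column, so there are three groups.
ambiguous = conn.execute("""
    SELECT a_group % 2 AS a_group, SUM(a_val) AS total
    FROM test
    GROUP BY a_group
    ORDER BY total
""").fetchall()

# What an "alias wins" rule would compute: grouping by the expression.
by_expr = conn.execute("""
    SELECT a_group % 2 AS parity, SUM(a_val) AS total
    FROM test
    GROUP BY a_group % 2
    ORDER BY total
""").fetchall()

print(ambiguous)  # [(1, 10), (0, 20), (1, 30)]
print(by_expr)    # [(0, 20), (1, 40)]
```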

At least in PostgreSQL, both by alias and ordinal are possible:

  localhost(from SCB-MUSE-BOXX).postgres.scb.5432 [Sun Nov 16 12:02:15 PST 2025]
  > create table test (a_key integer primary key, a_group integer, a_val numeric);
  CREATE TABLE
  Time: 3.102 ms

  localhost(from SCB-MUSE-BOXX).postgres.scb.5432 [Sun Nov 16 12:02:25 PST 2025]
  > insert into test (a_key, a_group, a_val) values (1, 1, 5.5), (2, 1, 2.6), (3, 2, 1.1), (4, 2, 6.5);
  INSERT 0 4
  Time: 2.302 ms

  localhost(from SCB-MUSE-BOXX).postgres.scb.5432 [Sun Nov 16 12:02:58 PST 2025]
  > select a_group AS my_group, sum(a_val) from test group by my_group;
   my_group | sum
  ----------+-----
          2 | 7.6
          1 | 8.1
  (2 rows)
  
  Time: 4.124 ms

  localhost(from SCB-MUSE-BOXX).postgres.scb.5432 [Sun Nov 16 12:03:15 PST 2025]
  > select a_group AS my_group, sum(a_val) from test group by 1;
   my_group | sum
  ----------+-----
          2 | 7.6
          1 | 8.1
  (2 rows)
  
  Time: 0.360 ms

I think it should work not only in GROUP BY, but in every context, e.g., inside expressions in SELECT, WHERE, etc.

PostgreSQL and DuckDB support this, which makes MSSQL feel like a dinosaur in context.

Some do. It would also be nice to reference by ordinal number, similar to ORDER BY. Very handy for quick and dirty queries. I can see the issue, though: people may start to lean on it too much.

this seems to ignore the fact that you can group by a column that isn't in the select statement.

it's not something that i've found a particular use for, but it IS a thing you can do.
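
For what it's worth, a sketch of that case on SQLite through Python's sqlite3 module, with a made-up table: the grouping column never appears in the select list, so a grouping list derived from the select list would have nothing to group on and the query would collapse to a single total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a_group INTEGER, a_val INTEGER)")  # hypothetical
conn.executemany(
    "INSERT INTO test (a_group, a_val) VALUES (?, ?)",
    [(1, 55), (1, 26), (2, 11), (2, 65)],
)

# Grouped by a column that is not selected: one total per group.
per_group = conn.execute("""
    SELECT SUM(a_val) AS total
    FROM test
    GROUP BY a_group
    ORDER BY total
""").fetchall()

# No grouping at all: a single total over the whole table.
overall = conn.execute("SELECT SUM(a_val) AS total FROM test").fetchall()

print(per_group)  # [(76,), (81,)]
print(overall)    # [(157,)]
```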


BigQuery has that and I've been loving using it since they introduced it


What's wrong with GROUP BY 1,2,3?
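
Functionally, nothing: ordinals give the same result as spelling the columns out, as this sketch on SQLite (through Python's sqlite3 module, with an invented `sales` table) shows. The catch is only that the numbers have to be kept in sync by hand whenever the select list is reordered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sales (region, product, qty) VALUES (?, ?, ?)",
    [("eu", "a", 1), ("eu", "a", 2), ("eu", "b", 3), ("us", "a", 4)],
)

# Ordinals refer to positions in the select list.
by_ordinal = conn.execute("""
    SELECT region, product, SUM(qty) AS total
    FROM sales
    GROUP BY 1, 2
    ORDER BY 1, 2
""").fetchall()

# The same query with the grouping columns written out.
by_name = conn.execute("""
    SELECT region, product, SUM(qty) AS total
    FROM sales
    GROUP BY region, product
    ORDER BY region, product
""").fetchall()

print(by_ordinal == by_name)  # True
print(by_name)  # [('eu', 'a', 3), ('eu', 'b', 3), ('us', 'a', 4)]
```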

Not directly related, but I recently came across this data language project from Google, which is quite cool: https://www.malloydata.dev/

What? No! I want GROUP BY * and more importantly GROUP BY mytable.*

would be nice

SELECT * EXCEPT(col_name) next please.
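
In the meantime this can be emulated on the client by building the column list from the catalog. A sketch for SQLite through Python's sqlite3 module; the helper `select_all_except` and the `users` table are hypothetical, and the table name is interpolated directly, so it must come from trusted code rather than user input:

```python
import sqlite3

def select_all_except(conn, table, excluded):
    """Build a SELECT naming every column of `table` except those in `excluded`."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    kept = [c for c in columns if c not in excluded]
    if not kept:
        raise ValueError("no columns left to select")
    column_list = ", ".join(f'"{c}"' for c in kept)
    return f"SELECT {column_list} FROM {table}"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, password_hash TEXT)")
conn.execute("INSERT INTO users (id, name, password_hash) VALUES (1, 'ann', 'x')")

sql = select_all_except(conn, "users", {"password_hash"})
rows = conn.execute(sql).fetchall()

print(sql)   # SELECT "id", "name" FROM users
print(rows)  # [(1, 'ann')]
```

The generated statement still changes whenever a column is added to the table, so it shares the brittleness of a plain `SELECT *`.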

This was also discussed at the last SQL WG meeting but was postponed for further refinement. But it’s likely to be added soon.

That might be nice for manual experimentation, but for application use, this seems brittle compared to specifying the columns you really want to have and process.


Yes, it needs to be in the standard, though.


