Waiting for SQL:202y: Group by All

elygre · 2025-11-16T20:08:21 1763323701

Let me reference fields as I create them:

  select xxxxx as a
       , a * 2 as b
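
Until something like that is standard, the portable workaround is to define the alias one level down and reference it from the outer query. A minimal sketch, assuming a made-up table `t` with a column `x`, and run through Python's `sqlite3` only so that it is self-contained:

```python
import sqlite3

# Hypothetical table `t` with one column `x`, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (x integer)")
conn.executemany("insert into t (x) values (?)", [(1,), (2,), (3,)])

# The alias `a` is defined in the derived table, so the outer
# select list is allowed to reuse it when computing `b`.
rows = conn.execute(
    """
    select a
         , a * 2 as b
      from (select x + 1 as a from t) as s
     order by a
    """
).fetchall()
print(rows)
```

The extra nesting level is exactly the noise the request is trying to get rid of.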

zX41ZdbW · 2025-11-16T21:49:58 1763329798

This will be great! One of the things ClickHouse has had since 2016.

cyberax · 2025-11-16T23:29:49 1763335789

SQL needs to have `select` as the _last_ part, not the first. LINQ has had this for 2 decades by now: "from table_a as a, table_b as b where ... select a.blah, b.duh".

agnosticmantis · 2025-11-17T04:30:11 1763353811

The Pipe Query Syntax in GoogleSQL implements this elegantly as well:

https://docs.cloud.google.com/bigquery/docs/reference/standa...

cryptonector · 2025-11-17T00:19:54 1763338794

This is not relevant to GP's point. This is a separate topic, which... I don't really care, but I know a lot of people want to be able to write SQL as you suggest, and it's not hard to implement, so, sure.

Though, I think it might have to be table sources, then `SELECT`, then `WHERE`, then ... because you might want to refer to output columns in the `WHERE` clause.

snuxoll · 2025-11-17T02:21:01 1763346061

WHERE clauses are pushed down into the query planner before the SELECT list is processed; that’s why HAVING exists.

The logical order, in full, is:

FROM

WHERE/JOIN (you can join using WHERE clauses and do FROM a,b still)

SELECT

HAVING

1718627440 · 2025-11-17T10:48:55 1763376535

That's the order in which the processing happens, but the language doesn't need to reflect it. SQL puts SELECT first so that a query reads like natural language, which is what SQL was designed for.

cyberax · 2025-11-17T01:33:42 1763343222

Ideally, it needs to be "from", then an arbitrary number of something like `let` statements that can introduce new variables, maybe interspersed with where-s, and then finally "select".

"select" can also be replaced with annotations, something like: `from table_1 t1 let t1.column_1 as @output_1 where ...` and then just collect all the @-annotated variables.

I need to write a lot of SQL, and it's so clumsy. Every time I need a CTE, I have to look into the documentation for the exact syntax.

1718627440 · 2025-11-17T10:49:42 1763376582

> Ideally, it needs to be "from", then arbitrary number of something like `let` statements

Isn't that what a CTE is?
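
For reference, a chain of CTEs does read much like a series of `let` bindings with filters in between. A minimal sketch, assuming a hypothetical `orders` table (the table, columns, and CTE names are invented for illustration), wrapped in Python's `sqlite3` so it runs as-is:

```python
import sqlite3

# Hypothetical `orders` table; integer prices keep the arithmetic exact.
conn = sqlite3.connect(":memory:")
conn.execute("create table orders (id integer primary key, price integer, qty integer)")
conn.executemany(
    "insert into orders (id, price, qty) values (?, ?, ?)",
    [(1, 50, 1), (2, 60, 2), (3, 40, 5)],
)

rows = conn.execute(
    """
    with priced as (   -- roughly "let total = price * qty"
        select id, price * qty as total
          from orders
    ),
    large as (         -- "where total > 100", then "let fee = total / 10"
        select id, total, total / 10 as fee
          from priced
         where total > 100
    )
    select id, total, fee
      from large
     order by id
    """
).fetchall()
print(rows)
```

Each CTE can see the names introduced by the ones before it, which is the `let`-like part. The final `select` still comes last in the statement, just after the `with` block rather than after a `from`.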

jiggawatts · 2025-11-17T09:56:12 1763373372

Also in the Kusto Query Language (KQL) as used by Azure Log Analytics.

sixtram · 2025-11-17T05:45:26 1763358326

What about reusing a CTE? Let me import a CTE definition so that it can be used throughout my app, not just in the current context.

theodpHN · 2025-11-17T00:14:46 1763338486

So, why not a SORT BY ALL or a GROUPSORT BY ALL, too? Not always what you want (e.g., when you're ranking on a summarized column), but often alphabetic order on the GROUP BY columns is just what the doctor ordered! :-)

petereisentraut · 2025-11-17T11:29:10 1763378950

The working group also discussed ORDER BY ALL, but for some reason most participants really did not like it.

cm2187 · 2025-11-16T21:42:44 1763329364

Snowflake has that. Once you start using it, it's painful to go back.

Exuma · 2025-11-16T19:27:06 1763321226

Also just let me reference the damn alias in a group by, FUCK

petereisentraut · 2025-11-17T11:32:19 1763379139

The problem with this and similar requests is that it would change the identifier scoping in incompatible ways and therefore potentially break a lot of existing SQL code.

sbuttgereit · 2025-11-16T20:05:54 1763323554

At least in PostgreSQL, both by alias and ordinal are possible:

  localhost(from SCB-MUSE-BOXX).postgres.scb.5432 [Sun Nov 16 12:02:15 PST 2025]
  > create table test (a_key integer primary key, a_group integer, a_val numeric);
  CREATE TABLE
  Time: 3.102 ms

  localhost(from SCB-MUSE-BOXX).postgres.scb.5432 [Sun Nov 16 12:02:25 PST 2025]
  > insert into test (a_key, a_group, a_val) values (1, 1, 5.5), (2, 1, 2.6), (3, 2, 1.1), (4, 2, 6.5);
  INSERT 0 4
  Time: 2.302 ms

  localhost(from SCB-MUSE-BOXX).postgres.scb.5432 [Sun Nov 16 12:02:58 PST 2025]
  > select a_group AS my_group, sum(a_val) from test group by my_group;
   my_group | sum
  ----------+-----
          2 | 7.6
          1 | 8.1
  (2 rows)
  
  Time: 4.124 ms

  localhost(from SCB-MUSE-BOXX).postgres.scb.5432 [Sun Nov 16 12:03:15 PST 2025]
  > select a_group AS my_group, sum(a_val) from test group by 1;
   my_group | sum
  ----------+-----
          2 | 7.6
          1 | 8.1
  (2 rows)
  
  Time: 0.360 ms

zX41ZdbW · 2025-11-16T21:50:55 1763329855

I think it should be not only in GROUP BY, but in every context, e.g., inside expressions in SELECT, WHERE, etc.

kermatt · 2025-11-16T21:31:35 1763328695

PostgreSQL and DuckDB support this, which makes MSSQL feel like a dinosaur in context.

mberning · 2025-11-16T19:30:57 1763321457

Some do. It would also be nice to reference by ordinal number similar to order by. Very handy for quick and dirty queries. I can see the issue though that people start to lean on it too much.

parpfish · 2025-11-17T01:54:44 1763344484

this seems to ignore the fact that you can group by a column that isn't in the select list.

it's not something that i've found a particular use for, but it IS a thing you can do.
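
A minimal sketch of that case, assuming a made-up `sales` table and using Python's `sqlite3` to keep it runnable. The grouping key `region` never appears in the output, so a GROUP BY ALL that derives its keys from the select list (which is how the existing implementations appear to behave) would have nothing to derive here, and the explicit form is still needed:

```python
import sqlite3

# Hypothetical `sales` table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("create table sales (region text, amount integer)")
conn.executemany(
    "insert into sales (region, amount) values (?, ?)",
    [("east", 10), ("east", 5), ("west", 7)],
)

# One output row per region, but the region itself is not selected.
rows = conn.execute(
    """
    select sum(amount) as total
      from sales
     group by region
     order by total
    """
).fetchall()
print(rows)
```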

chewxy · 2025-11-16T22:57:41 1763333861

BigQuery has that and I've been loving using it since they introduced it

elchief · 2025-11-16T22:13:47 1763331227

duckdb has it

https://duckdb.org/docs/stable/sql/query_syntax/groupby

Inviz · 2025-11-16T23:57:44 1763337464

What's wrong with GROUP BY 1,2,3?

oulipo2 · 2025-11-16T22:42:37 1763332957

Not directly related, but I recently saw this project, a data language by Google, which is quite cool: https://www.malloydata.dev/

wvbdmp · 2025-11-17T01:16:17 1763342177

What? No! I want GROUP BY * and more importantly GROUP BY mytable.*

dorianmariecom · 2025-11-16T20:21:45 1763324505

would be nice

SigmundA · 2025-11-16T23:34:40 1763336080

SELECT * EXCEPT(col_name) next please.

petereisentraut · 2025-11-17T11:26:58 1763378818

This was also discussed at the last SQL WG meeting but was postponed for further refinement. But it’s likely to be added soon.

1718627440 · 2025-11-17T10:46:56 1763376416

That might be nice for manual experimentation, but for application use, this seems brittle compared to specifying the columns you really want to have and process.

azurezyq · 2025-11-17T01:51:51 1763344311

BigQuery has it! https://docs.cloud.google.com/bigquery/docs/reference/standa...

SigmundA · 2025-11-17T03:10:33 1763349033

Yes, but it needs to be in the standard.