Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

There's really only one thing I dislike about pgsql. It seems overly difficult to return two disjointed result sets from a stored procedure/function. This is fairly common and easy in MSSQL. It also is easy in MySQL. Something like this:

SELECT int_col_1, int_col_2 FROM table1

SELECT varchar_1, varchar_2, int_col_2 FROM table2

In pgsql if the columns are the same type and number then we can use a union, otherwise the suggested alternative seems to be to just make 2 queries.



What is a good example of when you'd want to do that? I've been doing database work on the 4 majors for about 15 years and this is the first I've ever heard of somebody wanting to return two totally different result sets from a single procedure.

I have no issue with putting logic in your database. In many cases it's the ideal solution...but your example is something I've never even heard mentioned in 15 years much less advocated for or used as criticism for not being available.


One straightforward example would be feeding some kind of dashboard. Imagine a time-consuming query using a complex series of CTEs to build up or filter or reduce a very large set of rows, then running a bunch of "final" selects to transform and carve up that set in different ways to feed a bunch of tables and graphs. Obviously this could be done in various ways, but making one request to a procedure and receiving one response with multiple result sets can be a very pragmatic (and performant) approach.


One place I worked used a disconnected data set model. The idea was a single stored procedure returned multiple result sets all relating to a single business entity; specifically, an insurance quote, along with all the drivers, vehicles, accidents, convictions, etc. The entity was locked by inserting a row in a table.

Thereafter, the application didn't communicate with the database until it was time to save the record. The disconnected data set was small enough to store in shared session store, or even in a cookie (encrypted, naturally).

This architecture had a number of interesting knock-on effects. The entire state of a client's conversation with the server was tiny and entirely encapsulated by this disconnected data set, so you could record and play back each request to recreate a bug.

It didn't have to use multiple result sets from the stored procedure, but it did save a bunch of round trips.


>... or even in a cookie (encrypted, naturally).

I won't comment on anything else because I don't know the case, but you should never put user data in cookies. Even with encryption you are exposing it unnecessarily, not to mention you have to double check data integrity... Why? Cookies are not meant for this.


One query, multiple result sets--so you can get a customer and the customer's orders in one shot without any join awkwardness. It's part of the jdbc spec and somewhat common in mssql land.


I know for a fact we have reports which do this. I didn't write them though, our old data guy did.


> What is a good example of when you'd want to do that?

When you've got a long round trip.


Or when you can do common subexpression elimination, where you can compute something such as a temporary table, then use that to produce result R, then use it to produce result S.


So this would be used more in BI/data mining type workflows?

That would explain why I haven't had to deal with it.


This is my number one gripe about PGSQL. It's never had "real stored procedures", a feature which has been on the TODO list for a long, long time and nobody is even working on it. See Server-Side Languages/Implement stored procedures items and associated notes/threads here - http://wiki.postgresql.org/wiki/Todo

Beyond that, my other gripes about PGSQL are all surrounding the fact that it's simply not as polished or as widely used as SQL Server. I can't just drop PG in at some business site and expect the IT staff to be able to handle maintenance.

The Postgres GUI tools are extremely lackluster compared to the standard Microsoft SQL Server tools. Even the third party apps like Navicat are simply not as robust or as trustworthy as SQL Server Management Studio and the SQL Server Profiler. And there is literally nothing out that that compares with SQL Server Data Tools.


What's the difference between "real" stored procedures and postgres functions?


I would imagine PG doesn't place much emphasis on GUI tools, being primarily hosted on Linux/BSD... Does your criticism apply to CLI tools too?


> In pgsql if the columns are the same type and number then we can use a union, otherwise the suggested alternative seems to be to just make 2 queries.

If the two commands have nothing to do with each other then yes that's usually the best option. That does open you up to understanding transaction semantics if you want to ensure both see exactly the same snapshot.

Another option is to use JSON as the great normalizer. Any row can be converted to JSON via row_to_json(...) or manually via the more low level functions. That allows you to "stack" multiple, distinct, JSON responses atop each other. Probably a bad idea 99% of the time to use something like that but it can come in handy that one time you want to jam things through in one query.


You didn't state a single reason why you think that grabbing multiple resultsets in a single trip to the database is a bad idea though...

The only real argument I've ever seen against it has to do with "Separation of Concerns". Some people don't want any logic in their database. However, there's never any reason why.

(I bet you've UPDATEd multiple tables in one query though.)

Reasons are often given, sure...but they're almost never based on that persons experience because PGSQL users don't have this feature. So, they'll never know just how awesome it is to have real stored procedures like TSQL has.

Anyway, it's not about "jamming" things into one query. It's about composability and PGSQL is lacking it.


> You didn't state a single reason why you think that grabbing multiple resultsets in a single trip to the database is a bad idea though...

It's a bad idea for the same reason that having a function that returns back multiple values is a bad idea. Yes it's useful sometimes but on the whole it ends up being confusing. From a result set processing perspective it's also a pain in the ass as your app code is now tied to handling multiple results in a particular order, i.e., more/tighter coupling.

> (I bet you've UPDATEd multiple tables in one query though.)

You'd win that bet. There are a lot of ways to do this and it's a legit operation because you want N things to change at once (where N isn't in the same table).

> Reasons are often given, sure...but they're almost never based on that persons experience because PGSQL users don't have this feature. So, they'll never know just how awesome it is to have real stored procedures like TSQL has.

> Anyway, it's not about "jamming" things into one query. It's about composability and PGSQL is lacking it.

What's your goto "killer example" of what this could be used for?

I've used SQL Server quite a bit[1] and while it's not a terrible database, on the whole I don't like it. Rattling off at random: lack of MVCC (sure they have it now but it's not the default and still has warts), explicit locking that drives you nuts, defaulting to case insensitive string comparison, lack of built in functions. The inane licensing options don't help either.

I'd take Postgres over SQL Server any day of the week and twice on Sundays (i.e. side projects).

[1]: Officer I swear! It was already like that when I got there!


> It's a bad idea for the same reason that having a function that returns back multiple values is a bad idea.

OK, what's the reason then? I didn't hear you give one.

Every PGSQL query already return multiple values because you get a command status along with the result set.

Every Golang call returns 2 values. They even make it very easy for you to return even more values. So what is your reasoning?


He did give you one, you just aren't listening.

Having a function return the minimum and maximum of an array is returning one value. The value is an array of size 2. Having a function return an array of the minimum, maximum and the number of states in a given country is returning two values.

You are asking PG to return multiple unrelated data sets. The number of use cases that this is useful and good is pretty much for displaying reports. Do two queries. The problem is already solved.


"You are asking PG to return multiple unrelated data sets

Who talked about 'unrelated'?

Just as one can return min, max, average and standard deviation of column C in table T in one call, one can, for example, return that, the top ten records with highest C value and the records with the top ten most common values for C.

Yes, you can do two queries, but doing them in one go can be faster, some times much faster.

The risk, of course, is that one gets "one stored procedure per screen", but as long as one is aware of that, I don't see anything inherently wrong with that.


> Do two queries. The problem is already solved.

I can think of a handful of scenarios where the second result set needs to be based on something from the first, and combining them can make some sense in that case, but it's a stretch. For "long running process" or "slow query time" arguments... eh... that may be a problem, but it'd be very much an edge case for most users of the DB engine in question.

Dunno if Oracle supports this or not, but pretty sure MySQL doesn't and PG doesn't. MSSQL is the only DB engine I know that supports multiple result sets from one stored procedure. If you want to tie yourself to features of just one DB, that's great - most people do in one way or another, just don't expect everyone else to support that particular feature or syntax.


> OK, what's the reason then? I didn't hear you give one.

Read the rest of that paragraph. You only quoted the first sentence.

> Every PGSQL query already return multiple values because you get a command status along with the result set.

And most generic database driver interfaces (ex: JDBC) return that out of band. Errors get turned into exceptions and update counts get returned back as integers. That's far from perfect (particularly with JDBC) but it's usually done to fit the programming paradigms of the driver's native language.

> Every Golang call returns 2 values. They even make it very easy for you to return even more values. So what is your reasoning?

That's because Golang doesn't have exceptions. Every function that could fail needs to indicate if there's an error. It's like errno in C but per-function.


He gave an opinion. That is not a reason.


> It's a bad idea for the same reason that having a function that returns back multiple values is a bad idea.

Every single language that lacks multiple return values has some kind of hack to compensate, like varying parameters, that breaks encapsulation and makes code inconsistent.


You could also wrap your output in a struct, or tuple if so inclined


On a tuple, yes. That's how most language implement returning multiple values, but you need good accessors. On a struct you'll need to declare it first.


You can write a plpgsql function that returns a setof refcursor (pointer to cursors). You can then iterate these cursors to retrieve the results from an arbitrary number of tables.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: