I used postgres for the first time last week, and was wondering, "WHERE IS UPSERT?" Ended up going with a "UPDATE table SET field='C', field2='Z' WHERE id=3;
INSERT INTO table (id, field, field2)
SELECT 3, 'C', 'Z'
WHERE NOT EXISTS (SELECT 1 FROM table WHERE id=3);" (as seen on http://stackoverflow.com/questions/1109061/insert-on-duplica... - although, not the chosen answer). Anyone come up with better or their own solutions?
I built the pgscript version of the some upsert example and use it as a stored function. I'm no pgsql pro, so sorry if this is total shit. Works for what I needed. This is controlling a list of access rules to different services for doling out access on the controller level globally, or down to an individual action inside that controller. If I rewrote this on newer PG, I'd store the permissions text as a JSON datatype.
CREATE OR REPLACE FUNCTION merge_privileges(key integer, data_controller text, data_permissions text)
RETURNS void AS
$BODY$
BEGIN
LOOP
-- first try to update the key
UPDATE privileges SET controller = data_controller, permissions = data_permissions WHERE user_id = key AND controller = data_controller;
IF found THEN
RETURN;
END IF;
-- not there, so try to insert the key
-- if someone else inserts the same key concurrently,
-- we could get a unique-key failure
BEGIN
INSERT INTO privileges(user_id, controller, permissions) VALUES (key, data_controller, data_permissions);
RETURN;
EXCEPTION WHEN unique_violation THEN
-- do nothing, and loop to try the UPDATE again
END;
END LOOP;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
I prefer to use CTE in postgres to do upserts, this is a non complete example, but you get the general idea:
WITH upsert AS (
UPDATE some_table
SET attrib = 'foo'
WHERE id = 123
RETURNING *
) INSERT INTO some_table ( id , attrib )
SELECT 123 as id, 'foo' as attrib
WHERE 123 NOT IN (SELECT id FROM upsert );
Nope, that's exactly what I use when I really need it, generalized to simple table/PK schemes.[1]
At first I wanted to try and do it within PL/pgSQL (e.g., from your SO link), but I couldn't figure out how to generalize it like I could with my the UPDATE ... INSERT WHERE. (I haven't really studied PL/pgSQL too hard yet.)
I wanted to update a record if it existed, otherwise use insert. So if a name is in the database, update the times I've seen it. If the name does not exist, insert. I was keeping track of the number of times a name is seen.
Without knowing much about your actual application (i.e. treat the following with utmost skepticism!), my instinct would be to separate the fact of observing a name from the fact of the observation count. I would keep a log table and a summary table like this:
CREATE TABLE namelog (
name TEXT
);
CREATE TABLE namesummary (
name TEXT,
seen int
);
Then each time you see a name, log it:
INSERT INTO namelog VALUES (%s)
Then on a regular basis, run a housekeeping/summary process like this:
CREATE TEMPORARY TABLE prevsummary AS
SELECT * FROM namesummary;
TRUNCATE namesummary;
INSERT INTO namesummary
SELECT coalesce(a.name, b.name) as name,
coalesce(a.seen,0) + coalesce(b.seen, 0) as seen
FROM
prevsummary a
FULL OUTER JOIN
(SELECT name, count(*) as seen
FROM namelog
GROUP BY name) b
ON a.name=b.name
;
DELETE FROM namelog;
This of course is grossly space-inefficient, but making the logging process INSERT-only should enable a great degree of concurrency - and you can tune the summary process to run as often as necessary to keep memory/disk usage within your available resources.
If concurrency isn't an issue, then your initial solution is probably fine.
Very nice approach, I didn't think of trying to log names with only INSERT. I see how it handles concurrency. My solution is likely to fail if my database gets bombarded...My app was supposed to be a simple one I made for a few friends and also for me to learn a few things in node / postgres.
For more detail, here is my initial table:
CREATE TABLE names(name_id serial, name text, count integer)
I was planning to access the name count a good numbers of times, so I went with a simple SELECT from names. I don't necessarily need "on demand" counts so having the summary process run 5 or 10 mins would be a better!