Returning data in PostgreSQL

In or­der to get the best per­for­mance from our data­base en­gine, most of the times, we have to take ad­van­tage of its ex­clu­sive fea­tures. This has the dis­ad­van­tage of get­ting far­ther away from stan­dard, AN­SI code, and in the case of RDBM en­gi­nes, this al­so mean­s, most of the ORMs will not work.

Be­sides that, it could some­times be worth ex­plor­ing what we can gain by adapt­ing to the en­gine.

This is some­thing learnt from the feed­back of my talk giv­en at Eu­roPy­thon 2016. Af­ter the talk, an at­tendee told me that he liked it, and asked me about the code por­trayed in one of the slides, the one about dec­o­ra­tors with the ex­am­ple of mov­ing some records from one ta­ble to an­oth­er. He asked me if I ran code like that one in pro­duc­tion, to which I an­swered no: the code is en­tire­ly fic­tion­al, and it was done for the mere pur­pos­es of the ex­am­ples, so I just need­ed an ex­cuse for a task that could be done in two SQL state­ments, so the ex­am­ple with the dec­o­ra­tor can be shown.

The code of the ex­am­ple fol­lows:

INSERT INTO archive_orders
SELECT * from orders
WHERE order_date < '2016-01-01';

DELETE from orders WHERE order_date < '2016-01-01';

Then I learnt that there is an­oth­er way of writ­ing that sort of state­ments, with the “re­turn­ing” sen­tence of Post­greSQL.

It would be re-writ­ten like:

WITH deleted as (
    DELETE FROM orders WHERE order_date < '2016-01-01'
    RETURNING *
)
INSERT INTO archive_orders select * from deleted;

Or… the oth­er way around:

WITH to_delete as(
    INSERT INTO archive_orders
    SELECT id, description, order_date FROM orders WHERE order_date < '2016-01-01'
    RETURNING id
)
DELETE FROM orders where id in (select id from to_delete);

The in­ter­est­ing point here, is that it en­tails a sin­gle com­mand, in­stead of two. So this can be done with a sin­gle call to the database, sav­ing an ex­tra round trip.

The point here is that the delete state­ment of Post­greSQL (as well as the rest of the state­ments, IN­SERT, UP­DATE, for in­stance), can be spec­i­fied to re­turn the da­ta they af­fec­t, and this can be used in an in­ter­me­di­ate ta­ble to pipe it to an­oth­er state­men­t.

By de­fault, if you run the delete state­men­t, it should re­turn the num­ber of af­fect­ed rows, like:

delete from orders where id > 3;
DELETE 4

But, we can make it to re­turn the rows them­selves:

delete from orders where id > 3 returning \*;
id|   description    |     order_date
----+------------------+---------------------
4 | to be archived   | 2014-12-09 00:00:00
5 | First sale order | 2016-07-17 00:00:00
6 | First sale order | 2016-07-20 00:00:00
7 | First sale order | 2016-07-24 00:00:00
(4 rows)

DELETE 4

Or, spe­cif­ic col­umns if we se­lect them:

delete from orders where id > 3 returning id, description;
id |   description
----+------------------
4 | to be archived
5 | First sale order
6 | First sale order
7 | First sale order
(4 rows)

DELETE 4

So, we can use the “re­turn­ing” fea­ture of Post­greSQL, to do in a sin­gle com­mand what we usu­al­ly would do in two or more, and in some cas­es, it might be some­thing worth ex­plor­ing. It was great learn­ing things like this one, and get­ting tips as a re­sult from the feed­back of the talk (it does not change the mean­ing, and the ex­am­ple could re­main the same for the afore­men­tioned rea­son­s; it is just an ex­am­ple :-).