Returning data in PostgreSQL

In or­der to get the best per­for­man­ce from our da­ta­ba­se en­gi­ne, most of the ­ti­me­s, we ha­ve to take ad­van­ta­ge of its ex­clu­si­ve fea­tu­res. This has the ­di­sad­van­ta­ge of ge­tting far­ther away from stan­dar­d, AN­SI co­de, and in the ca­se of RDBM en­gi­nes, this al­so mean­s, most of the OR­Ms wi­ll not wo­rk.

Be­si­des tha­t, it could so­me­ti­mes be wor­th ex­plo­ring what we can gain by a­dap­ting to the en­gi­ne.

This is so­me­thing learnt from the fee­dba­ck of my ta­lk gi­ven at Eu­ro­P­y­thon 2016. After the ta­lk, an atten­dee told me ­that he liked it, and asked me about the co­de por­tra­yed in one of the sli­des, ­the one about de­co­ra­tors wi­th the exam­ple of mo­ving so­me re­cor­ds from one ta­ble ­to ano­the­r. He asked me if I ran co­de like that one in pro­duc­tio­n, to whi­ch I an­swe­red no: the co­de is en­ti­re­ly fic­tio­na­l, and it was do­ne for the me­re ­pur­po­ses of the exam­ple­s, so I just nee­ded an ex­cu­se for a ta­sk that could be­ ­do­ne in two SQL sta­te­men­ts, so the exam­ple wi­th the de­co­ra­tor can be sho­wn.

The co­de of the exam­ple fo­llo­ws:

INSERT INTO archive_orders
SELECT * from orders
WHERE order_date < '2016-01-01';

DELETE from orders WHERE order_date < '2016-01-01';

Then I learnt that the­re is ano­ther way of wri­ting that sort of sta­te­men­ts, wi­th the “re­tur­nin­g” sen­ten­ce of Pos­tgreS­Q­L.

It would be re-w­ri­tten like:

WITH deleted as (
    DELETE FROM orders WHERE order_date < '2016-01-01'
    RETURNING *
)
INSERT INTO archive_orders select * from deleted;

Or… the other way aroun­d:

WITH to_delete as(
    INSERT INTO archive_orders
    SELECT id, description, order_date FROM orders WHERE order_date < '2016-01-01'
    RETURNING id
)
DELETE FROM orders where id in (select id from to_delete);

The in­te­res­ting point he­re, is that it en­tails a sin­gle co­m­man­d, ins­tead of ­two. So this can be do­ne wi­th a sin­gle ca­ll to the da­ta­ba­se, saving an ex­tra ­round tri­p.

The point he­re is that the de­le­te sta­te­ment of Pos­tgreS­QL (as we­ll as the ­rest of the sta­te­men­ts, IN­SER­T, UP­DA­TE, for ins­tan­ce), can be spe­ci­fied to­ ­re­turn the da­ta they affec­t, and this can be us­ed in an in­ter­me­dia­te ta­ble to­ ­pi­pe it to ano­ther sta­te­men­t.

By de­faul­t, if you run the de­le­te sta­te­men­t, it should re­turn the num­ber of a­ffec­ted ro­ws, like:

delete from orders where id > 3;
DELETE 4

Bu­t, we can make it to re­turn the ro­ws the­msel­ve­s:

delete from orders where id > 3 returning \*;
id|   description    |     order_date
----+------------------+---------------------
4 | to be archived   | 2014-12-09 00:00:00
5 | First sale order | 2016-07-17 00:00:00
6 | First sale order | 2016-07-20 00:00:00
7 | First sale order | 2016-07-24 00:00:00
(4 rows)

DELETE 4

Or, spe­ci­fic co­lumns if we se­lect the­m:

delete from orders where id > 3 returning id, description;
id |   description
----+------------------
4 | to be archived
5 | First sale order
6 | First sale order
7 | First sale order
(4 rows)

DELETE 4

So, we can use the “re­tur­nin­g” fea­tu­re of Pos­tgreS­Q­L, to do in a sin­gle co­m­man­d what we usua­lly would do in two or mo­re, and in so­me ca­ses, it mi­ght be­ ­so­me­thing wor­th ex­plo­rin­g. It was great lear­ning things like this one, an­d ­ge­tting tips as a re­sult from the fee­dba­ck of the ta­lk (it does not chan­ge the ­mea­nin­g, and the exam­ple could re­main the sa­me for the afo­re­men­tio­ned rea­son­s; it is just an exam­ple :-).