Running PostgreSQL in memory with docker

Introduction

Interacting with a database is a regular task for a developer, and we would like to develop and test under conditions close to a real deployment; therefore, using the same database as in production can help detect issues early.

However, setting up an entire database server for development can be cumbersome. Fortunately, modern operating systems like Linux have great tools and features that we can take advantage of.

In particular, I would like to set up a simple database locally using docker, storing the data in memory.

The idea is simple: run a docker container with the PostgreSQL image, using a tmpfs [1] [2] as storage for the database (a ramfs could also be used).

Procedure

First, we get the PostgreSQL image for our platform, for example:

docker pull fedora/postgresql

Then, I create a directory for the data and mount a tmpfs on it (limited to 50 MiB in this example):

sudo mkdir /mnt/dbtempdisk
sudo mount -t tmpfs -o size=50m tmpfs /mnt/dbtempdisk
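Before pointing the database at that directory, it can be worth confirming that it really is backed by memory. What follows is only a minimal sketch: is_tmpfs_mount is a hypothetical helper name of my own, and it assumes a Linux host where the kernel exposes the mount table at /proc/mounts (the optional second argument exists only so that a different table can be supplied).

```shell
# Hypothetical helper (not part of docker or PostgreSQL): succeed only if the
# given directory appears in the mount table as a tmpfs mount point.
# Assumption: the mount table is readable at /proc/mounts (Linux).
is_tmpfs_mount() {
    dir=$1
    table=${2:-/proc/mounts}
    # Fields in the table: device, mount point, filesystem type, options, ...
    awk -v dir="$dir" '
        $2 == dir && $3 == "tmpfs" { found = 1 }
        END { exit !found }
    ' "$table"
}

if is_tmpfs_mount /mnt/dbtempdisk; then
    echo "/mnt/dbtempdisk is backed by memory"
else
    echo "/mnt/dbtempdisk is not a tmpfs mount"
fi
```

Keep in mind that everything stored there disappears when the directory is unmounted (sudo umount /mnt/dbtempdisk) or the machine reboots, which is exactly what we want for throwaway data.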

Now we can run the database container using this directory:

docker run --name mempostgres \
    -v "/mnt/dbtempdisk:/var/lib/pgsql/data:Z" \
    -e POSTGRES_USER=<username-for-the-db> \
    -e POSTGRES_PASSWORD=<password-for-the-user> \
    -e POSTGRES_DB=<name-of-the-db> \
    -p 5432:5432 \
    fedora/postgresql

The first line sets the name of the container we are running (if it is not specified, docker will generate one). The second line is the important one, since it maps the directories: the tmpfs directory on the host is mounted as /var/lib/pgsql/data inside the container (the target). The latter directory is the one PostgreSQL uses by default for initializing and storing the data of the database. The Z at the end of the mapping tells docker to relabel that directory in case SELinux is enabled, so the container will not fail with permission errors (containers run confined, and we are mounting something that might be outside that scope) [3].

The next three lines are environment variables that docker passes to the container for the initialization of the database (they are optional, and defaults will be used if they are not provided). Then follows the port mapping, which in this case maps port 5432 inside the container to the same port on the host. Finally comes the name of the docker image we will run.
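Since the database is initialized when the container starts, the first connection attempts may be refused for a few seconds. A small retry loop helps when scripting this. The following is just a sketch: wait_for is a hypothetical helper name, and the one-second pause between attempts is an arbitrary choice of mine.

```shell
# Hypothetical helper: run a command once per second until it succeeds,
# giving up after the given number of attempts.
wait_for() {
    attempts=$1
    shift
    while [ "$attempts" -gt 0 ]; do
        # Success on any attempt ends the wait immediately.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        # Only pause if another attempt is still coming.
        if [ "$attempts" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}
```

For instance, assuming the image ships pg_isready (part of PostgreSQL since 9.3), something like wait_for 30 docker exec mempostgres pg_isready would block until the server accepts connections, or fail after roughly 30 seconds.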

Once this is running, it looks like we have an actual instance of PostgreSQL up and running on our machine (actually we do, but it is inside a container :-), so we can connect with any client (even a Python application, etc.).
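Because the port is published on the host, a client only needs to target localhost:5432. Many clients accept a libpq-style connection URI, so a tiny helper to build one can be handy. This is a sketch under assumptions: pg_uri is a hypothetical name, and the values passed to it stand in for whatever was given to the container.

```shell
# Hypothetical helper: print a libpq-style connection URI.
# Arguments: user, password, database, then optional host and port.
# Note: special characters in the password would need percent-encoding,
# which this sketch does not attempt.
pg_uri() {
    user=$1
    password=$2
    db=$3
    host=${4:-localhost}
    port=${5:-5432}
    printf 'postgresql://%s:%s@%s:%s/%s\n' "$user" "$password" "$host" "$port" "$db"
}
```

With a psql client installed on the host, psql "$(pg_uri myuser mypassword mydb)" would then connect to the container (myuser, mypassword and mydb being placeholders).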

For example, if we want to use the psql client with the container, the command would be:

docker run -it --rm \
    --link mempostgres:postgres \
    fedora/postgresql \
    psql -h mempostgres -U <username-in-db> <db-name>

Applications

If we have PostgreSQL installed, we could simply start a new instance as our user with the postgres command, passing the -D parameter with the desired path where the database is going to store the data (which will be the tmpfs/ramdisk). This would be another way of achieving the same result.
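A rough sketch of that non-docker variant could look as follows. The assumptions are mine: start_mem_postgres is a hypothetical helper name, initdb and postgres are expected to be on the PATH (on some distributions they are not), the data directory must be writable by our user, and port 5433 is an arbitrary choice to avoid clashing with anything on 5432.

```shell
# Hypothetical helper: initialize (once) and start a throwaway PostgreSQL
# instance whose data directory lives on the tmpfs.
start_mem_postgres() {
    datadir=$1          # e.g. a directory inside the tmpfs, writable by our user
    port=${2:-5433}     # assumed port, picked only to stay clear of 5432

    if ! command -v initdb >/dev/null 2>&1 || ! command -v postgres >/dev/null 2>&1; then
        echo "initdb/postgres not found on PATH" >&2
        return 127
    fi

    # Create the cluster only if the directory does not hold one already.
    if [ ! -f "$datadir/PG_VERSION" ]; then
        initdb -D "$datadir" || return 1
    fi

    # Run in the foreground; -k keeps the Unix socket inside the data directory.
    postgres -D "$datadir" -p "$port" -k "$datadir"
}
```

Calling start_mem_postgres /mnt/dbtempdisk/pgdata in one terminal should then allow psql -h /mnt/dbtempdisk/pgdata -p 5433 postgres from another, provided our user is allowed to create that directory.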

Regardless of the implementation, here are some potential applications:

  1. Local development without requiring disk storage, and running faster at the same time.
  2. Unit testing: unit tests should be fast, granted. Sometimes it makes perfect sense to run the tests against an actual database (practicality beats purity), even if this makes them “integration/functional” tests. In this regard, having a lightweight database container running locally could achieve the goal without compromising performance.
  3. Isolation (this only applies to the container approach): running PostgreSQL in a docker container encapsulates the libraries, tools, packages, etc. in docker, so the rest of the system does not need many other packages installed. Think of it as a sort of “virtual environment” for packages.

All in all, I think it’s an interesting approach, worth considering, at least to have alternatives when working on projects that require intense interaction with the database.

[1] : https://www.kernel.org/doc/Documentation/filesystems/tmpfs.txt
[2] : https://en.wikipedia.org/wiki/Tmpfs
[3] : http://www.projectatomic.io/blog/2015/06/using-volumes-with-docker-can-cause-problems-with-selinux/