Running PostgreSQL in memory with docker

Introduction

Interacting with a database can be a regular task for a developer, and we would like to ensure that we are developing and testing in conditions close to a real implementation; therefore, using the same database as in production can help detect issues early.

However, setting up an entire database server for development can be cumbersome. Fortunately, modern operating systems like Linux have great tools and features that we can take advantage of.

In particular, I would like to set up a simple database locally using docker, storing the data in memory.

The idea is simple: run a docker container with the image for PostgreSQL, using a tmpfs [1] [2] as storage for the database (a ramfs could also be used).

Procedure

First, we get the PostgreSQL image for our platform, for example:

docker pull fedora/postgresql

Then I create a directory for the data and mount a tmpfs on it:

sudo mkdir /mnt/dbtempdisk
sudo mount -t tmpfs -o size=50m tmpfs /mnt/dbtempdisk
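Before starting the container, it can be worth confirming that the directory really is backed by memory. The following is a minimal sketch, assuming a Linux host where mounts are listed in /proc/mounts; the is_tmpfs helper and its optional second argument are hypothetical names introduced only for illustration.

```shell
# Hypothetical helper: succeeds only if DIR is listed as a tmpfs mount point.
# Usage: is_tmpfs DIR [MOUNTS_FILE]   (MOUNTS_FILE defaults to /proc/mounts)
is_tmpfs() {
    dir=$1
    mounts=${2:-/proc/mounts}
    # Each line of the mounts table reads: device mountpoint fstype options ...
    awk -v dir="$dir" '$2 == dir && $3 == "tmpfs" { found = 1 } END { exit !found }' "$mounts"
}
```

For example, `is_tmpfs /mnt/dbtempdisk && echo "data directory is in memory"`. When the database is no longer needed, `sudo umount /mnt/dbtempdisk` releases the memory, and everything stored there is lost.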

Now we can run the database container using this directory:

docker run --name mempostgres \
    -v "/mnt/dbtempdisk:/var/lib/pgsql/data:Z" \
    -e POSTGRES_USER=<username-for-the-db> \
    -e POSTGRES_PASSWORD=<password-for-the-user> \
    -e POSTGRES_DB=<name-of-the-db> \
    -p 5432:5432 \
    fedora/postgresql

The first line sets the name of the container we are running (if it is not specified, docker assigns a default one). The second line is the important one, since it maps the directories: the tmpfs directory on the host is mounted as /var/lib/pgsql/data inside the container (the target). The latter directory is the one PostgreSQL uses by default for initializing and storing the data of the database. The Z at the end of the mapping flags that directory for relabeling in case SELinux is enabled, so that it does not fail with a permissions error (because containers run as another user, and we are mounting something that might be out of that scope) [3].

The next three lines are environment variables that are passed to the container and used for the initialization of the database (they are optional, and defaults will be used if they are not provided). Then follows the port mapping, which in this case maps port 5432 inside the container to the same port on the host. Finally comes the name of the docker image to run.
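The container typically needs a moment to initialize the database before it accepts connections, so a script that starts it may have to wait. Here is a minimal sketch of such a wait loop; the retry helper is hypothetical, and probing with pg_isready assumes the PostgreSQL client tools are installed on the host.

```shell
# Hypothetical helper: run a command until it succeeds, at most ATTEMPTS times,
# sleeping one second between tries.
# Usage: retry ATTEMPTS COMMAND [ARGS...]
retry() {
    attempts=$1
    shift
    n=1
    while ! "$@"; do
        # Give up once the last allowed attempt has failed.
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep 1
    done
    return 0
}
```

With the port mapping above, `retry 30 pg_isready -h localhost -p 5432` would wait up to roughly half a minute for the server to come up.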

Once this is running, it would look like we have an actual instance of PostgreSQL up and running on our machine (actually we do, but it is inside a container :-), so we can connect with any client (even a Python application, etc.).

For example, if we want to use the psql client with the container, the command would be:

docker run -it --rm \
    --link mempostgres:postgres \
    fedora/postgresql \
    psql -h mempostgres -U <username-in-db> <db-name>
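Since port 5432 is published on the host, clients outside docker can also connect through localhost. As a sketch, the connection parameters can be combined into a URI of the form that psql and other libpq-based clients accept; the pg_uri helper and the sample values (dev, secret, appdb) are hypothetical and stand in for whatever was passed to the container.

```shell
# Hypothetical helper: build a PostgreSQL connection URI.
# Usage: pg_uri USER PASSWORD DB [HOST] [PORT]
pg_uri() {
    # Host and port default to the mapping used when starting the container (-p 5432:5432).
    printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$2" "${4:-localhost}" "${5:-5432}" "$3"
}

pg_uri dev secret appdb
# prints postgresql://dev:secret@localhost:5432/appdb
```

If a psql client is installed on the host, `psql "$(pg_uri dev secret appdb)"` should then connect. Note that this sketch does not percent-encode special characters, so it only suits simple user names and passwords.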

Applications

If we have PostgreSQL installed, we could simply start a new instance as our own user with the postgres command, passing the -D parameter with the path where the database is going to store the data (which will be the tmpfs/ramdisk). This would be another way of achieving the same result.
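A rough sketch of this container-free variant follows, assuming the initdb and postgres binaries are on the PATH and that the tmpfs is writable by the current user. The start_mem_postgres name, the pgdata subdirectory, port 5433, and the DRY_RUN switch are arbitrary choices for illustration.

```shell
# Hypothetical helper: start a throwaway PostgreSQL instance on the tmpfs.
# Usage: start_mem_postgres [DATADIR] [PORT]
# Set DRY_RUN=1 to print the commands instead of running them.
start_mem_postgres() {
    datadir=${1:-/mnt/dbtempdisk/pgdata}
    port=${2:-5433}
    if [ -n "${DRY_RUN:-}" ]; then
        run=echo
    else
        run=
    fi
    # Create the cluster files on first use only.
    if [ ! -f "$datadir/PG_VERSION" ]; then
        $run initdb -D "$datadir" || return 1
    fi
    # Runs in the foreground; -k keeps the Unix socket inside the data directory.
    $run postgres -D "$datadir" -p "$port" -k "$datadir"
}
```

Running `DRY_RUN=1 start_mem_postgres` shows the two commands without executing them. Once the server is actually running, something like `psql -h localhost -p 5433 postgres` should reach it, and stopping the server and unmounting the tmpfs discards all the data.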

Regardless of the implementation, here are some potential applications:

  1. Local development without requiring disk storage, and running faster at the same time.

  2. Unit testing: unit tests should be fast, granted. Sometimes, it makes perfect sense to run the tests against an actual database (practicality beats purity), even if this makes them “integration/functional” tests. In this regard, having a lightweight database container running locally could achieve the goal without compromising performance.

  3. Isolation (this only applies to the container approach): running PostgreSQL in a docker container encapsulates the libraries, tools, packages, etc. in docker, so the rest of the system does not need to have many other packages installed. Think of it as a sort of “virtual environment” for packages.

All in all, I think it’s an interesting approach, worth considering, at least to have alternatives when working in projects that require intense interaction with the database.

[1] https://www.kernel.org/doc/Documentation/filesystems/tmpfs.txt

[2] https://en.wikipedia.org/wiki/Tmpfs

[3] http://www.projectatomic.io/blog/2015/06/using-volumes-with-docker-can-cause-problems-with-selinux/