Running PostgreSQL in memory with docker

Introduction

Interacting with a database can be a regular task for a developer, and we would like to develop and test in conditions close to a real deployment; therefore, using the same database as in production can help detect issues early.

However, setting up an entire database server for development can be cumbersome. Fortunately, modern operating systems like Linux have great tools and features that we can take advantage of.

In particular, I would like to set up a simple database locally using docker, storing the data in memory.

The idea is simple: run a docker container with the image for PostgreSQL, using a tmpfs [1] [2] as storage for the database (a ramfs could also be used).

Procedure

First, we get the image of PostgreSQL according to the platform, for example:

docker pull fedora/postgresql

Then, we create a directory for the data and mount a tmpfs on it:

sudo mkdir /mnt/dbtempdisk
sudo mount -t tmpfs -o size=50m tmpfs /mnt/dbtempdisk
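
Before pointing the database at this directory, it can be worth confirming that it really is backed by memory. A minimal sketch, assuming GNU stat (standard on most Linux distributions); the helper names fs_type and is_tmpfs are made up for this example:

```shell
# Print the type of the filesystem that holds the given path.
# Assumes GNU stat: -f queries the filesystem, %T prints its type name.
fs_type() {
    stat -f -c %T "$1"
}

# Succeed only when the given path lives on a tmpfs.
is_tmpfs() {
    [ "$(fs_type "$1")" = "tmpfs" ]
}
```

For instance, `is_tmpfs /mnt/dbtempdisk && echo "in memory"` should print the message once the mount above is in place. When the database is no longer needed, `sudo umount /mnt/dbtempdisk` releases the memory and discards the data.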

Now we could run the database container using this directory:

docker run --name mempostgres \
    -v "/mnt/dbtempdisk:/var/lib/pgsql/data:Z" \
    -e POSTGRES_USER=<username-for-the-db> \
    -e POSTGRES_PASSWORD=<password-for-the-user> \
    -e POSTGRES_DB=<name-of-the-db> \
    -p 5432:5432 \
    fedora/postgresql

The first line sets the name of the container we are running (if it is not specified, docker assigns a default one). The second line is the important one, since it maps the directories: the tmpfs directory on the host is mounted as /var/lib/pgsql/data inside the container (the target). The latter directory is the one PostgreSQL uses by default for initializing and storing the data of the database. The Z at the end of the mapping is an internal detail that flags the directory for relabeling in case SELinux is enabled, so the container will not fail due to permission errors (because containers run under a different context, and we are mounting something that might be out of that scope) [3].

The next three lines are environment variables used for the initialization of the database (they are optional, and defaults will be used if they are not provided). Then follows the port mapping, which in this case maps port 5432 inside the container to the same port on the host. Finally comes the name of the docker image we will run.

Once this is running, it would look like we have an actual instance of PostgreSQL up and running on our machine (actually we do, but it is inside a container :-), so we can connect with any client (even a Python application, etc.).

For example, if we want to use the psql client with the container, the command would be:

docker run -it --rm \
    --link mempostgres:postgres \
    fedora/postgresql \
    psql -h mempostgres -U <username-for-the-db> <name-of-the-db>
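
The server needs a moment to initialize before it accepts connections, which matters when the steps above are scripted. A small retry helper is one way to handle that. This is only a sketch: wait_for is a hypothetical name, and the usage line below assumes the pg_isready utility from the PostgreSQL client tools is installed on the host.

```shell
# wait_for ATTEMPTS DELAY CMD...: run CMD until it succeeds,
# trying at most ATTEMPTS times and sleeping DELAY seconds between tries.
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    while [ "$attempts" -gt 0 ]; do
        "$@" && return 0
        attempts=$((attempts - 1))
        # Only sleep if another attempt is still to come.
        if [ "$attempts" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

With the port mapping used earlier, `wait_for 30 1 pg_isready -h localhost -p 5432` would poll the server once per second, for up to 30 attempts, and fail if it never becomes ready.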

Applications

If we have PostgreSQL installed, we could simply start a new instance as our own user with the command (postgres ...), passing the -D parameter with the path where the database is going to store its data (which will be the tmpfs/ramdisk). This would be another way of achieving the same result, without docker.
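
That alternative could look roughly like the following. It is a sketch, not a tested recipe: it assumes initdb and pg_ctl are on the PATH, that the tmpfs is writable by the current (non-root) user, for example after `sudo chown "$USER" /mnt/dbtempdisk`, and it picks a data subdirectory and port 5433 arbitrarily so that it does not clash with the container above. The function names are made up for this example.

```shell
# Hypothetical location of the cluster inside the tmpfs mounted earlier.
PGDATA_DIR=${PGDATA_DIR:-/mnt/dbtempdisk/data}

# Create the cluster on first use, then start the server in the background.
start_mem_postgres() {
    if [ ! -f "$PGDATA_DIR/PG_VERSION" ]; then
        initdb -D "$PGDATA_DIR" || return 1
    fi
    # -l sends the server log to a file; -o passes options to postgres itself.
    pg_ctl -D "$PGDATA_DIR" -l "$PGDATA_DIR/server.log" -o "-p 5433" start
}

# Stop the server; the data disappears when the tmpfs is unmounted.
stop_mem_postgres() {
    pg_ctl -D "$PGDATA_DIR" stop
}
```

pg_ctl is used here instead of calling postgres directly because it takes care of running the server in the background and stopping it cleanly.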

Regardless of the implementation, here are some potential applications:

  1. Local development without requiring disk storage, and running faster at the same time.

  2. Unit testing: unit tests should be fast, granted. Sometimes, it makes perfect sense to run the tests against an actual database (practicality beats purity), even if this makes them “integration/functional” tests. In this regard, having a lightweight database container running locally could achieve the goal without compromising performance.

  3. Isolation (this only applies to the container approach): running PostgreSQL in a docker container encapsulates the libraries, tools, packages, etc. in docker, so the rest of the system does not have to keep many other packages installed. Think of it as a sort of “virtual environment” for packages.
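
For the testing scenario, the container can be made disposable: start it, run the test command, and remove it whatever the outcome. A minimal sketch reusing the image and volume mapping from the procedure; run_with_db and the container name mempostgres-test are hypothetical, and a real script would probably also wait for the server to accept connections before running the tests.

```shell
# run_with_db CMD...: start a throwaway database container, run CMD,
# then remove the container and return CMD's exit status.
run_with_db() {
    docker run -d --name mempostgres-test \
        -v "/mnt/dbtempdisk:/var/lib/pgsql/data:Z" \
        -p 5432:5432 \
        fedora/postgresql >/dev/null || return 1

    status=0
    "$@" || status=$?

    # -f stops the container if it is still running before removing it.
    docker rm -f mempostgres-test >/dev/null
    return "$status"
}
```

It would be called as `run_with_db ./run_tests.sh`, where run_tests.sh stands for whatever command runs the project's tests.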

All in all, I think it’s an interesting approach, worth considering, at least to have alternatives when working on projects that require intense interaction with the database.

[1] https://www.kernel.org/doc/Documentation/filesystems/tmpfs.txt

[2] https://en.wikipedia.org/wiki/Tmpfs

[3] http://www.projectatomic.io/blog/2015/06/using-volumes-with-docker-can-cause-problems-with-selinux/