On returning consistent data types

This post is inspired by an issue I once ran into while using a well-known Python library for parsing YAML files. The problem was that the result of loading a file was not consistent: usually it returned the content as a Python dict, but if the file was empty, the return value was None.

Do you notice something odd here?

What if I want to use the result? I cannot do so safely. For example:

content = yaml.load(...)  # with the correct parameters and file name
for tag, values in content.items():
    pass  # process as required...

If content is None, it will raise an AttributeError saying that None has no attribute called “items” (which is true).

Therefore, the developer has to either catch the exception or avoid the corner case, by doing something like the following:

content = yaml.load(...) or {}  # same parameters as before

That could be a case of “coding defensively”: making sure that the program will not fail under most conditions (it might also require adding an assert or raising an exception, but that is a different topic). I agree with defensive programming, but I think it is better if the library itself behaves more correctly and respects its interface: if a function is going to return a dictionary and there is no content, the logical expectation is an empty dictionary. This should be the default behaviour, not something to be set by parameters.
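As a rough sketch of what "always return a dictionary" could look like from the caller's side, here is a thin wrapper around a loader. Note that fake_yaml_load and load_mapping are hypothetical names invented for this illustration: the first is only a toy stand-in that mimics the "None for empty input" behaviour described above, not the real library.

```python
from typing import Any, Callable, Dict, Optional


def fake_yaml_load(text: str) -> Optional[Dict[str, Any]]:
    """Hypothetical stand-in for the library call: returns None for empty input."""
    pairs = [line.split(":", 1) for line in text.splitlines() if ":" in line]
    if not pairs:
        return None
    return {key.strip(): value.strip() for key, value in pairs}


def load_mapping(loader: Callable[[str], Any], text: str) -> Dict[str, Any]:
    """Call the loader and always hand back a dict (hypothetical helper)."""
    content = loader(text)
    if content is None:
        # "No content" becomes the empty dictionary, so callers can iterate safely.
        return {}
    if not isinstance(content, dict):
        # Anything else is a genuine surprise, so fail loudly instead of hiding it.
        raise TypeError(f"expected a mapping, got {type(content).__name__}")
    return content


# The loop from the earlier example now works for an empty document too.
for tag, values in load_mapping(fake_yaml_load, "").items():
    pass  # process as required...
```

The explicit check for None is a deliberate choice here: `or {}` would also silently replace other falsy results (an empty list, for instance) with an empty dictionary, whereas this version only normalises the "no content" case.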

This can be thought of as an instance of a more general problem that occurs when a function is intended to return “X or Y”. In my opinion, if X and Y do not share the same interface, there is a potential bug (in the object-oriented paradigm we would say that there is no polymorphism, or perhaps that the “contract” is not being respected).
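To make the “X or Y” point concrete, here is a small made-up example (USER_TAGS, tags_or_none and tags are all invented for this sketch). The first function returns a list or None, two types that do not share an interface; the second always returns a list, so "nothing found" is simply the empty list.

```python
from typing import Dict, List, Optional

# Made-up data for the illustration.
USER_TAGS: Dict[str, List[str]] = {"alice": ["admin", "dev"]}


def tags_or_none(user: str) -> Optional[List[str]]:
    """Returns a list or None: every caller has to remember the special case."""
    return USER_TAGS.get(user)


def tags(user: str) -> List[str]:
    """Always returns a list; an unknown user simply has no tags."""
    return list(USER_TAGS.get(user, []))


# With a consistent return type, the caller needs no guard at all.
for tag in tags("bob"):
    print(tag)
```

With tags_or_none, the same loop over an unknown user would raise a TypeError, because None is not iterable.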

This is an example that I wanted to highlight, because it might help you to write cleaner code.