On returning consistent data types

This post is inspired by an issue I once ran into while using a well-known Python library for parsing YAML files. The problem was that the result of loading a file was not consistent: most of the time it returned the content as a Python dict, but if the file was empty, the return value was None.

Do you notice something odd here?

What if I want to use the result? I cannot do so safely. For example:

content = yaml.load(...)  # with the correct parameters and file name
for tag, values in content.items():
    pass  # process as required...

If content is None, this raises an AttributeError saying that the 'NoneType' object has no attribute 'items' (which is true).
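The failure is easy to reproduce without any YAML file at all. This is only a small sketch: the variable content is set to None by hand, as a stand-in for what the loader returns on an empty file.

```python
# Stand-in for the value the loader returns when the file is empty.
content = None

try:
    for tag, values in content.items():
        pass  # process as required...
except AttributeError as error:
    # Keep the message so it can be inspected or logged.
    message = str(error)
    print(message)
```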

Therefore, the developer has to either catch the exception or avoid the corner case, by doing something like the following:

content = yaml.load(...) or {}  # fall back to an empty dict when the file is empty

That could be seen as a case of “coding defensively”: making sure that the program will not fail under most conditions (it would perhaps also require adding an assert or raising an exception, but that is a different topic). I actually agree with defensive programming, but I think it is better if the library itself behaves more correctly and respects its interface (that is: if you are going to return a dictionary and there is no content, then the logical assumption is to expect an empty dictionary). This should be the default behaviour, not something to be enabled through parameters.
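Until the library behaves that way, the same idea can live in a thin wrapper of your own. The following is a minimal sketch, not the library's API: load_mapping and fake_parse are hypothetical names I made up for illustration, and the parser is passed in as an argument so that the example runs on its own (in real code you would presumably pass something like yaml.safe_load instead).

```python
from typing import Any, Callable, Optional


def load_mapping(text: str, parse: Callable[[str], Any]) -> dict:
    """Parse `text` and always return a dict, even for empty input.

    Hypothetical helper: `parse` is any function that turns text into data.
    """
    result = parse(text)
    if result is None:
        # Empty document: honour the "returns a dict" contract.
        return {}
    if not isinstance(result, dict):
        # A list or a scalar also breaks the contract, so fail loudly.
        raise TypeError(f"expected a mapping, got {type(result).__name__}")
    return result


def fake_parse(text: str) -> Optional[dict]:
    """Stand-in parser that mimics the behaviour above: None for empty input."""
    if not text.strip():
        return None
    key, _, value = text.partition(":")
    return {key.strip(): value.strip()}


# The loop is now safe: for empty input it simply does not iterate.
for tag, values in load_mapping("", fake_parse).items():
    print(tag, values)
```

With this in place the caller always iterates over a dictionary, and unexpected top-level values are reported where they originate instead of surfacing later as an AttributeError.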

This can be thought of as an instance of a more general problem, which occurs whenever a function is intended to return “X or Y”. In my opinion, if X and Y do not share the same interface, there is a potential bug (in the Object-Oriented paradigm we would say that there is no polymorphism, or maybe that the “contract” is not being respected).
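To make the “X or Y” problem concrete, here is a small sketch with two versions of the same lookup. Everything in it (the TAGS data and both function names) is invented for illustration. The first version returns a list or None, so every caller has to remember the special case; the second always returns a list, so callers can treat every result the same way.

```python
from typing import Dict, List, Optional

# Hypothetical data used only for this example.
TAGS: Dict[str, List[str]] = {"python": ["yaml", "typing"]}


def find_values_or_none(tag: str) -> Optional[List[str]]:
    """Returns "X or Y": a list when the tag exists, None otherwise."""
    return TAGS.get(tag)


def find_values(tag: str) -> List[str]:
    """Always returns a list; an unknown tag yields an empty one."""
    return list(TAGS.get(tag, []))


# Consistent version: works for known and unknown tags alike.
for value in find_values("missing"):
    print(value)

# Inconsistent version: the caller must guard against None first.
maybe_values = find_values_or_none("missing")
if maybe_values is not None:
    for value in maybe_values:
        print(value)
```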

I wanted to highlight this example because keeping it in mind might help you write cleaner code.