Published on

Professional Software - Explicit Domain Definitions

Authors

Domains are the subject of a piece of software. Any software that does anything productive is operating on an entity, within the context of a domain. Car dealership software operates on customers, cars, salespeople, loans, etc. Software that parses files operate on files, lines, and serialization formats. Domains must exist for software to be productive. Domains will either be implicitly or explicitly expressed in software. Professional software explicitly expresses domains which enables constant time extension, maintanence and verification of the software.

Implicit Domain Definitions

Implicit domains are difficult to identifiy and are often a side effect or implementation detail instead of a first class entity. Implicit domains don't stand on their own. You can't point to an implicit domain and say "THAT is an Account". Implicit domains are often duplicated and exist across multiple places in the software.

The following code shows an example of an implicitly defined domain:

def enroll_user(first_name: str, last_name: str):
    id = first_name + last_name
    insert_user_into_db(id, first_name, last_name)

def remove_user(first_name: str, last_name: str):
    id = first_name + last_name
    delete_user_from_db(id, first_name, last_name)

def notify_user(first_name: str, last_name: str):
    id = first_name + last_name
    send_user_notification(id)

The definition of user (the collection of properties that make up a user in this project) leaks into each of the functions calling parameters. Each function duplicates the definition of user by requring the attributes required to construct a user. The concept of a user exists as an implementation detail of each one of the functions. In this software the definition of a user exists in multiple locations:

Implicit nested domains

Extensibility

At some point the domain will change and the definition of a User will change. This will require updating each individual function where User is implicitly defined. Consider what happens when a User needs a surname added:

def enroll_user(first_name: str, last_name: str, surname: Optional[str] = None):
    id = first_name + last_name
    insert_user_into_db(id, first_name, last_name, surname)

def remove_user(first_name: str, last_name: str, surname: Optional[str] = None):
    id = first_name + last_name
    delete_user_from_db(id, first_name, last_name, surname)

def notify_user(first_name: str, last_name: str, surname: Optional[str] = None):
    id = first_name + last_name
    send_user_notification(id, first_name, last_name, surname)

This code adds the optional surname to every method that implicitly defines a user.

Every impleicit location eneds to be updated and re-verified. What often happens in the case of implicit domain definitions is a business requirement comes into engineering, such as:

"Start capturing user surname"

The business asks for an estimate on how long this will take. This forces engineering to do a full code audit of everywhere that a user is implicitly defined. In large projects this may be 10's (or hundreds) of locations. The amount of effort required to update an implicit definitions is proportional to the number of implicit definitions. This means extending is linear time, O(n), with respect to the number of implicit definitions. Each implicit location that is updated needs to be re-verified. In our example this means that the following functionality needs to be re-verified:

  • User enrollment
  • User removal
  • User notification

This is some serious business functionality that needs to be completely reverified, just for adding one simple optional field!

Maintainability

Maintainability of implicit domain definitions is similiarly impacted. Every time a bug fix comes in around the core user domain, it requires updating all implicit definitions. This means that maintainibility is also linear time, O(n), with respect to the number of implicit definitions. Implicit domain definitions makes bug fixes a large effort. This makes it so bug fixes consume a large portion of a teams capacity. Every piece of software has bugs. Implicit domain definitions create the environment that can create a positive feedback loop for bug load.

Pretend that the 3 functions above were launched into production. Each function builds the User id by concatenting a users first name and last name. The bug is that ids are not actualy unique since many users may have the same first name and last name. A fix is made to use a UUID to generate each ID. The code is deployed. After which another bug is detected because the identifiers do not fetch old users, since the old users use the first_name + last_name identifier. The code is updated to first look for the first_name + last_name and then the id! After the code is deployed it's determined that the concatenation breaks on None strings. Each change to the domain increases the surface area of introducing bugs. This increases the opportunity for code to break. This can easily create cycles where teams update code, code breaks, teams need to update code again, more stuff breaks, until most of the teams capacity is focused on bug fixes.

Verifiability

Impliciltly defined domains couples the domain entities with the actions operating on those entities. In the case of our example, enroll_user constructs users and it performs an action on a user.

def test_enroll_user_user_created(self):
    enroll_user(first_name='test', last_name='test')
    # assert the user was created as a side effect of this function
    # by patching the enroll_user user dependencies

The cost of implicit domain definitions compounds with each duplicate definition. Implicit definitions make it almost impossible to predict the surface area of any given change, which means modification has a linear cost and time instead of a fixed cost and predictable time. Implicit domain definitions are fundamentally not predictable.

Explicit Domain Definitions

Expicit domains model the domain entities as first class stand-alone concepts. Each domain entity has a single explicit definition. An explicit domain definition allows an engineer to point to it and say "THAT's a user, there it is". Explicit domain definitions exist in a single place independent of any functions operating on that user.

class User:
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name
        self.id = None

def create_user(first_name: str, last_name: str) -> User:
    user = User(
        first_name = first_name,
        last_name = last_name
    )
    user.id = first_name + last_name
    return user

It's important to note that Explicit Domain Definitions are not object oriented programming. They don't need type hierarchies, polymorphism, inheritance, or any other relationship expressed in OOP. Explicit domain definitions will be grouped as a collection of associated data attributes, and expressed in your languages way of grouping these attributes, whether that is a class, struct, object, or whatever other primitive.

An explicit definition profoundly impacts the relationship with functions that depend on users:


user = create_user(...)

def enroll_user(user: User):
    # ...
    pass

def remove_user(user: User):
    # ...
    pass

def notify_user(user: User):
    # ...
    pass

Explicit domain definitions pull the definition up and out of the dependend functions create an explicit relationship:

Explisit domain reference

Extensibility

An explicit domain definition makes it trivial to extend the definition. Adding a new attribute is simple and only takes place in a single location:

class User:
    def __init__(self, first_name, last_name, surname):
        self.first_name = first_name
        self.last_name = last_name
        self.surname = surname
        self.id = None

def create_user(first_name: str, last_name: str, surname: Optional[str] = None) -> User:
    user = User(
        first_name = first_name,
        last_name = last_name,
        surname = surname
    )
    user.id = first_name + last_name
    return user

This is also backwards compatible. The domain definition update takes place in a single controlled location which makes it a constant time, O(1), operation. Individual dependent functions can begin to take advantage of the new field in a controlled and isolated manner. notify_user can be updated in a focused and isolated way to start to leverage the new field, at a later date.

Maintainability

Explicit definitions makes it trivial to pinpoint the source of domain errors, since a single definition exists. The user id bug only needs to be fixed in a single location making it a constant, O(1), time operation. A single domain definition also makes it much less likely that bug fixes trigger other bug fixes. This is because the surface area of a change is reduced.

Verifiability

Explicit domain definitions enable "dependency injection". Since the domain entities live on their own, they can be provided to functions that depend on them. This makes testing operations a first class invocation and not a side effect:

def test_create_user(self):
    self.assertEqual(
        'first last',
        create_user(
            first_name='first',
            last_name='last'
        )
    )

def test_enroll_user(self):
    user = create_user('first', 'last')
    ok = enroll_user(user)
    self.assertTrue(ok)

Conclusion

Professional software favors explicit definitions because it enables constant time predictable extendsion and maintenence. This allows companies to control costs in terms of the amount of time required to change software.