Links and Refactoring the Models

by Richard Taylor : 2020-02-28

Implementing the Item object first, rather than the Link, was an obvious choice because the Item is more autonomous. Each Item can mean something by itself, it does not necessarily depend on other objects.

That doesn't mean Items are unconstrained. As we saw in the API implementation, the id, created_at and created_by fields are set internally and cannot be modified. For me this is the difference between an API and a data bucket ... one applies constraints, the other stores anything you ask it to until it fills up.

Lifecycles

The Items are the fundamental objects in this Electric Cat system. They represent real world entities or actions and we never delete them; because even if the thing that they represent in the real world is destroyed or completed, we still want a lasting record of that thing.

The lifecycle of an Item is thus:

Item created
Item modified
Item modified
...

In contrast, a Link represents a relationship between two Items at the current time. It exists to group Items and navigate between them. It may be the case that the relationship is permanent (for example: Task-X "fixes" Bug-Y) or it may be that the relationship is transient (for example: Task-X "blocks" Task-Z).

As such a Link may be deleted, whereas an Item may not.

Link created
Link modified
Link modified
...
Link deleted

You can see at this version of the project that I have implemented Links.

Refactoring

The first phase of implementing Links was entirely test-driven. I wrote the api_tests_link.py tests first and then added code until they passed. Much of that new code was cut and paste from the Items implementation.

Then, with working code, I examined the implementation for repetition and duplication. A simple example is in the translator.py module where to_item and to_link have a common structure that can be factored out into a fill_object method.

Another kind of refactoring that arises when you add more code is the result of noticing that some responsibilities are lodged in the wrong class.

My first attempt at refactoring the API class after adding create_item had a new "private" method _no_immutables to check that incoming data was not trying to sneakily set immutable fields.

def _no_immutables(self, dictionary):
  if 'id' in dictionary:
    raise NotSavedError("id cannot be set externally")

  if 'created_at' in dictionary:
    raise NotSavedError("created_at cannot be set externally")

  if 'created_by' in dictionary:
    raise NotSavedError("created_by cannot be set externally")

This works... but is it the job of the API class to determine which data fields are immutable? With only two types of data object that happen to have the same immutable fields it is perhaps not crystal clear, but imagine adding more objects in the future with different fields and then clearly it should be the responsibility of the data models.

The same thing goes for deciding if a Link has valid endpoints. The API could extract the source and target and check for them in storage. Or it could ask the Link if its endpoints are valid (passing in the storage interface).

We can refactor this:

self._no_immutables(dictionary)

link = Link(user, dictionary)

self._source_and_target_valid(link)

Into this:

try:
    link = Link(user, dictionary)
    link.check_endpoints(storage)

except ValueError as error:
    raise NotSavedError(error)

The trick with refactoring is to ask "Does the design of this code still look right?" after each change. If the answer is no, then not refactoring it means any further changes will be building on that flawed design... which is accumulating a technical debt for the future.

Done?

Now we have an API that can manage Items and Links. Is that enough to start using it? Eager as I am to move on to creating Maps, there is one tiny thing missing. Unless you record the id of each Item and Link you create then how can you find it again?

Obviously we need some search functionality.