Data Models - part 1

by Richard Taylor : 2019-11-12

I am calling this "part 1" because I don't expect to get it right first time, partly because I am only thinking about the external view of the data at this stage: by which I mean the data as it is handled by the API.


That is quite deliberate. I want to get the main concepts sorted out in the API first, so that I can completely decouple the internal representation from whatever UI frameworks I try out later.

Get Started

It's often hard to get started when you are staring at a blank page. So I usually blast down something to get me going and then rewrite it as I think through the implications.

Let's start with the most basic data. We have Items and Links that we want to be trackable. That says to me that we need a Trackable base class to hold the tracking, and other classes that extend it with their own specific data and functionality.

class Trackable: ...
class Item(Trackable): ...
class Link(Trackable): ...
class User(Trackable): ...

Then again, maybe composition is better than inheritance. Which of these statements makes more sense: an Item is a Trackable thing; or an Item has a Trackable thing?

To me, both seem very similar. And the good news is that for now at least the choice is invisible to users of the classes so I'll stick with inheritance and change later if needs be.
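To make the two options concrete, here is a minimal sketch of both. The constructor arguments, the ComposedItem class and its `tracking` attribute are assumptions made for illustration; they are not taken from the actual commit.

```python
import datetime
import uuid


class Trackable:
    """Holds the tracking data: an id, who created the object, and when."""

    def __init__(self, created_by=None):
        self.id = str(uuid.uuid4())
        # Store the creator's id (a plain string), not the creator object.
        self.created_by = created_by.id if created_by is not None else None
        # Timezone-aware UTC timestamp, kept as an ISO 8601 string.
        self.created_at = datetime.datetime.now(datetime.timezone.utc).isoformat()


class User(Trackable):
    def __init__(self, name, created_by=None):
        super().__init__(created_by)
        self.name = name


# Inheritance: an Item *is a* Trackable thing.
class Item(Trackable):
    def __init__(self, created_by, properties=None):
        super().__init__(created_by)
        self.properties = dict(properties or {})


# Composition: an Item *has a* Trackable thing.
class ComposedItem:
    def __init__(self, created_by, properties=None):
        self.tracking = Trackable(created_by)  # hypothetical attribute name
        self.properties = dict(properties or {})

    @property
    def id(self):
        # Delegating to the tracking object keeps the choice invisible to callers.
        return self.tracking.id


user = User("The Electric Cat")
a = Item(user, {"DESCRIPTION": "big"})
b = ComposedItem(user, {"DESCRIPTION": "big"})
```

Callers read `a.id` and `b.id` in exactly the same way, which is why the decision can be deferred.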

I wouldn't normally make lots of commits while playing at this stage, but to illustrate the process here is a very early version of the model.

Notice that I am deliberately populating all fields with built-in types rather than library classes:

self.id = str(uuid.uuid4())
self.created_by = created_by.id
self.created_at = datetime.datetime.utcnow().isoformat()

One reason for this is that I want to be able to convert these models to JSON really easily, without defining custom converters. The second is that I don't want Trackable objects to contain nested Trackable objects; instead I want these objects to reference other Trackable objects.
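As a rough sketch of the first point, an object whose fields are all built-in types can be passed to the standard json module without a custom encoder. The `to_json` helper and the simplified Trackable constructor below are assumptions for illustration, not code from the repository.

```python
import datetime
import json
import uuid


class Trackable:
    def __init__(self, created_by_id=None):
        self.id = str(uuid.uuid4())
        self.created_by = created_by_id
        self.created_at = datetime.datetime.now(datetime.timezone.utc).isoformat()


def to_json(obj):
    # vars() returns the instance's attribute dict. Every value here is a
    # str or None, so json.dumps needs no custom encoder.
    return json.dumps(vars(obj))


t = Trackable()
restored = json.loads(to_json(t))

# By contrast, a raw datetime object is rejected by the default encoder.
try:
    json.dumps({"created_at": datetime.datetime.now(datetime.timezone.utc)})
    raw_datetime_ok = True
except TypeError:
    raw_datetime_ok = False
```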

For example, rather than storing the User who created an Item inside each Item like this:

{
  'id': '1b8d44d0-0fba-11ea-9cb9-b707013a9afa',
  'created_by': {
    'id': '271d411a-0fba-11ea-8c2b-23aad820de64',
    'name': 'The Electric Cat',
    'email': 'electric.cat@example.org',
    'shoe-size': 'N/A'
  },
  'type': 'STORY',
  'properties': {
    'DESCRIPTION': 'Once upon a time there were...'
  }
}

I simply want to record the id of the User like this:

{
  'id': '1b8d44d0-0fba-11ea-9cb9-b707013a9afa',
  'created_by': '271d411a-0fba-11ea-8c2b-23aad820de64',
  'type': 'STORY',
  'properties': {
    'DESCRIPTION': 'Once upon a time there were...'
  }
}

That is more compact, avoids returning data that might not be required, and avoids returning multiple copies of the same data.
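The cost of storing a reference is that a caller who does want the User has to look it up. A minimal sketch of that lookup follows, assuming a hypothetical in-memory `users` dict keyed by id as a stand-in for whatever storage is eventually used.

```python
# Hypothetical store of users, keyed by id.
users = {
    "271d411a-0fba-11ea-8c2b-23aad820de64": {
        "id": "271d411a-0fba-11ea-8c2b-23aad820de64",
        "name": "The Electric Cat",
        "email": "electric.cat@example.org",
    },
}

item = {
    "id": "1b8d44d0-0fba-11ea-9cb9-b707013a9afa",
    "created_by": "271d411a-0fba-11ea-8c2b-23aad820de64",
    "type": "STORY",
    "properties": {"DESCRIPTION": "Once upon a time there were..."},
}


def creator_of(item, users):
    """Resolve the created_by reference, or return None if it is unknown."""
    return users.get(item["created_by"])


creator = creator_of(item, users)
```

Many Items created by the same User all point at the one stored record, so the User's details live in a single place.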

Design Choices

So you can see that, even at this early stage of model design, I am making design choices. I'm thinking about how these classes might be used. And something that always helps with that is writing unit tests.

Take a look at the tests in the linked commit. I am creating objects in different ways and making some assertions about their properties. What should the ID of an empty Item be?

These tests help me think about how I will use the objects I am writing. Do I write all the tests first? TDD? No, not all the time. Here I actually wrote the test_empty tests before the code, but then wrote some of the code before adding the other tests.
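To give a flavour of what such a test can look like, here is a sketch using the standard unittest module. The Item class is a stand-in written only so the tests run, and the answer it encodes (an empty Item still gets a fresh, valid UUID) is an assumption, not necessarily what the linked commit asserts.

```python
import unittest
import uuid


class Item:
    # Stand-in for the real class, with just enough behaviour for the tests.
    def __init__(self, created_by=None, properties=None):
        self.id = str(uuid.uuid4())
        self.created_by = created_by
        self.properties = dict(properties or {})


class TestItem(unittest.TestCase):
    def test_empty(self):
        item = Item()
        # Assumption: an empty Item still gets a valid id of its own.
        self.assertEqual(str(uuid.UUID(item.id)), item.id)
        self.assertIsNone(item.created_by)
        self.assertEqual(item.properties, {})

    def test_ids_are_unique(self):
        self.assertNotEqual(Item().id, Item().id)


# Run the tests programmatically so the script does not exit on completion.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestItem)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```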

Now I am thinking about how these classes look in action. I am wondering if 'type' should be a top-level field of Item and Link after all... or maybe it should just be part of the properties?

# A
item = Item(self.user, 'EPIC', {'DESCRIPTION': 'big'})

# B
item = Item(self.user, {'TYPE': 'EPIC', 'DESCRIPTION': 'big'})

What's the rationale for keeping it separate? It looked fine in the module code, but in use it looks odd. Because it is not fixed, the type of an Item could be changed from EPIC to STORY, just as the DESCRIPTION can be changed. So it looks like it should be a property.
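A small sketch of option B in use, assuming a simplified Item that takes the creator's id directly (the real constructor takes a user object):

```python
class Item:
    def __init__(self, created_by_id, properties=None):
        self.created_by = created_by_id
        # Copy the dict so later changes do not leak back to the caller.
        self.properties = dict(properties or {})


item = Item(
    "271d411a-0fba-11ea-8c2b-23aad820de64",
    {"TYPE": "EPIC", "DESCRIPTION": "big"},
)

# TYPE is changed in exactly the same way as any other property.
item.properties["TYPE"] = "STORY"
item.properties["DESCRIPTION"] = "not so big after all"
```

With TYPE held alongside DESCRIPTION, there is one mechanism for reading and updating everything that can change on an Item.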

With that change made, maybe next we can look at an API to read and write these objects.