Thursday, March 28, 2019

Translating objects simply


How do you translate one object into a different, related object? Simply!

    interface FooBarTranslator {
        Foo translate(Bar bar);
    }

Some people, when confronted with this problem, think "I know, I'll use a framework." Now they have two problems, as the saying goes.

Many people code it themselves using abstractions and design patterns. Their code has organizing principles, but it isn't clear what those principles achieve. This code starts out interesting to read and gets progressively worse as each new developer touches it. It's even harder to maintain when we're integrating with other systems: testing properly against those systems is a pain, so we're reluctant to change the code unless we're forced to.

So here's a simple pattern to use the next time you find yourself translating objects.
  1. Use a single function with a single line for each field on the object to which you are translating.
  2. Whenever you need something new, do the simplest thing that could possibly work.
  3. There is no step 3!
Before any more explanation, let's do an example or two...

For example, if Foo has fields a, b, and c, your translator might look like this:

    Foo translate(Bar bar) {
        Foo foo = new Foo();
        foo.setA(bar.getQux());
        foo.setB(bar.getBaz());
        foo.setC(bar.getQux());
        return foo;
    }

Here's an example that shows simple ways to address multiple aspects of the problem.

    Foo translate(Bar bar) {
        Foo foo = fooFactory.create();
        foo.setA(123);
        foo.setB(bar.getB()+1);
        foo.setC(aFunction(bar.getC()));
        D d = anExpensiveFunction(bar.getD());
        foo.setD(d.getSomePart());
        foo.setE(d.getSomeOtherPart());
        foo.setNestedF(fBarTranslator.translate(bar));
        if (bar.getFloat() != null) foo.setPrimitiveFloat(bar.getFloat());
        return foo;
    } 

Organizing the code in terms of the target object eliminates a whole category of bugs and confusion, in which the responsibility for setting a single field is spread out over multiple places. The code in those different places grows overlapping and conflicting behavior.
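To make that category of bug concrete, here's a minimal sketch. The Foo and Bar shapes, the label field, and the "VIP" rule are all made up for illustration; only the contrast between the two translators matters.

```java
// Hypothetical source and target types, invented for this illustration.
class Bar {
    private final String name;
    private final boolean vip;
    Bar(String name, boolean vip) { this.name = name; this.vip = vip; }
    String getName() { return name; }
    boolean isVip() { return vip; }
}

class Foo {
    private String label;
    String getLabel() { return label; }
    void setLabel(String label) { this.label = label; }
}

// Scattered: two helpers share responsibility for Foo.label.
class ScatteredTranslator {
    Foo translate(Bar bar) {
        Foo foo = new Foo();
        copyBasics(bar, foo);
        applyVipRules(bar, foo);  // swap these two calls and the label changes
        return foo;
    }
    private void copyBasics(Bar bar, Foo foo) {
        foo.setLabel(bar.getName());
    }
    private void applyVipRules(Bar bar, Foo foo) {
        // Reads the half-built target and overwrites what copyBasics wrote.
        if (bar.isVip()) foo.setLabel("VIP " + foo.getLabel());
    }
}

// Organized by target field: one line in translate, one complete function.
class SimpleTranslator {
    Foo translate(Bar bar) {
        Foo foo = new Foo();
        foo.setLabel(label(bar));
        return foo;
    }
    static String label(Bar bar) {
        return bar.isVip() ? "VIP " + bar.getName() : bar.getName();
    }
}
```

Both translators produce the same label today, but only in the second can you answer "where does label come from?" by reading a single function.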

Organizing the code in terms of the target object achieves some "functional programming" goodness. Stakeholders often ask questions like "why does this field have this value? where does it come from?" and our code makes it easy to answer that kind of question. Each of the lines in the translation function is like a definition of the target object field. Though a single field on the source object might contribute to the value of multiple fields on the target object, a single field on the target object is only ever determined by a single complete function.

For fields with nontrivial logic, that logic should reside in a function in a different class, à la the Single Responsibility Principle. (I favor static methods for pure logic and field injection for logic that needs to consult some state, but there are good reasons to do otherwise.)
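Here's a rough sketch of that split. Every name in it (PriceFormatter, CountryLookup, InjectedTranslator, and the fields on Foo and Bar) is hypothetical, and the "injection" is done by hand so the example runs without a framework; with a DI container you would annotate the field and let the container fill it in.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical source and target types, invented for this illustration.
class Bar {
    private final String countryCode;
    private final int priceCents;
    Bar(String countryCode, int priceCents) {
        this.countryCode = countryCode;
        this.priceCents = priceCents;
    }
    String getCountryCode() { return countryCode; }
    int getPriceCents() { return priceCents; }
}

class Foo {
    private String countryName;
    private String displayPrice;
    String getCountryName() { return countryName; }
    void setCountryName(String countryName) { this.countryName = countryName; }
    String getDisplayPrice() { return displayPrice; }
    void setDisplayPrice(String displayPrice) { this.displayPrice = displayPrice; }
}

// Pure logic: a static method in its own class. Assumes cents >= 0.
final class PriceFormatter {
    private PriceFormatter() {}
    static String format(int cents) {
        return String.format(Locale.ROOT, "%d.%02d", cents / 100, cents % 100);
    }
}

// Logic that consults state: an ordinary object the translator depends on.
class CountryLookup {
    private final Map<String, String> namesByCode;
    CountryLookup(Map<String, String> namesByCode) { this.namesByCode = namesByCode; }
    String nameFor(String code) { return namesByCode.getOrDefault(code, "Unknown"); }
}

class InjectedTranslator {
    // A DI container would normally populate this field; here it is assigned by hand.
    CountryLookup countryLookup;

    Foo translate(Bar bar) {
        Foo foo = new Foo();
        foo.setCountryName(countryLookup.nameFor(bar.getCountryCode()));
        foo.setDisplayPrice(PriceFormatter.format(bar.getPriceCents()));
        return foo;
    }
}
```

The translate function stays one line per target field; the interesting logic can be unit tested on its own without building a Bar at all.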

Any additional architecture only ever harms your code! Resist the temptation to add architecture. It will not pay for itself.

An important rule that you can't see in the examples above: never retrieve information from the target while you are translating to it. Another thing that might not be obvious: if your translation is a nice normal stateless translation, then your translator class should not have any member variables that hold translation state (injected collaborators, like fooFactory in the example above, are a different matter). Use local variables, not member variables, for the results of your expensive calculations.
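A small sketch of both rules together, under made-up assumptions: the Pricing helper, the cents fields, and the 10% tax rate are invented purely to give the translator an intermediate value that two target fields depend on.

```java
// Hypothetical source and target types, invented for this illustration.
class Bar {
    private final int grossCents;
    private final int discountPercent;
    Bar(int grossCents, int discountPercent) {
        this.grossCents = grossCents;
        this.discountPercent = discountPercent;
    }
    int getGrossCents() { return grossCents; }
    int getDiscountPercent() { return discountPercent; }
}

class Foo {
    private int netCents;
    private int taxCents;
    int getNetCents() { return netCents; }
    void setNetCents(int netCents) { this.netCents = netCents; }
    int getTaxCents() { return taxCents; }
    void setTaxCents(int taxCents) { this.taxCents = taxCents; }
}

// Pure helpers; the 10% tax rate is an arbitrary stand-in.
final class Pricing {
    private Pricing() {}
    static int net(int grossCents, int discountPercent) {
        return grossCents * (100 - discountPercent) / 100;
    }
    static int tax(int netCents) {
        return netCents / 10;
    }
}

// No member variables: every intermediate value lives in a local.
class StatelessTranslator {
    Foo translate(Bar bar) {
        Foo foo = new Foo();
        int netCents = Pricing.net(bar.getGrossCents(), bar.getDiscountPercent());
        foo.setNetCents(netCents);
        // Avoid: foo.setTaxCents(Pricing.tax(foo.getNetCents()));
        // That would read the target while it is still being built.
        foo.setTaxCents(Pricing.tax(netCents));
        return foo;
    }
}
```

Because nothing is stored on the translator, one instance can be reused for any number of translations and each result depends only on its own Bar.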

If you have more than a handful of fields, you might want to order your setter calls alphabetically. That makes it harder for developers to accidentally set the same field twice in the same function. It makes it easy to find a field on a printout, or to consult multiple fields without having to search for each one. It also makes git merge conflicts more meaningful: you want a conflict when two different developers merge different code for setting the same field, but you don't want a conflict when two different developers merge code for setting two different new fields. Just adding support for new fields to the end of the function yields a conflict in both cases.
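As a tiny illustration, here is what an alphabetized translator might look like. The field names are hypothetical, and so is the "email" field mentioned in the comment.

```java
// Hypothetical source and target types, invented for this illustration.
class Bar {
    private final String city;
    private final long id;
    private final String zip;
    Bar(String city, long id, String zip) { this.city = city; this.id = id; this.zip = zip; }
    String getCity() { return city; }
    long getId() { return id; }
    String getZip() { return zip; }
}

class Foo {
    private String city;
    private long id;
    private String postcode;
    String getCity() { return city; }
    void setCity(String city) { this.city = city; }
    long getId() { return id; }
    void setId(long id) { this.id = id; }
    String getPostcode() { return postcode; }
    void setPostcode(String postcode) { this.postcode = postcode; }
}

class AlphabetizedTranslator {
    Foo translate(Bar bar) {
        Foo foo = new Foo();
        // Setters are ordered by target field name, not by when they were added.
        foo.setCity(bar.getCity());
        // A new "email" field would go here, between city and id.
        foo.setId(bar.getId());
        foo.setPostcode(bar.getZip());
        return foo;
    }
}
```

Two developers adding, say, email and name on separate branches would touch different spots in the function, so their changes are less likely to collide than two appends at the end.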

"Add support for new fields at the end of the function" is not a guideline that is easy to defend, or even one that developers who want to follow it can readily see in the code. If that's your de facto organizing principle, then you'll end up with no organizing principle. In contrast, if you alphabetize and have clear standards around dependency injection etc. as mentioned above, you'll have one canonical simplest possible program for any translation. One right way to do it.

This raises the biggest disadvantage. If you don't have firsthand, repeated experience with the pain of the alternatives at scale, you might not appreciate this pattern. 😉