Sunday, December 03, 2017

DDD Observers

Observers automatically run code when some condition occurs, without the causing code having to care.

Even in an imperative, object oriented environment, we'd like to support "birds sing when the sun comes up" without having to change the code for the sun coming up.

Theoretically, we could do this with simple polling. We could just explicitly check the condition every interval. E.g. is the sun up yet? Five seconds later: is the sun up now? Five seconds after that: what about now? This is not elegant, could be costly, and falls down completely when we need to poll many objects.

We can implement observers cleanly by wrapping our entities!

Instead of triggering on changes to the internal state of an entity (using "properties" as Kotlin does, for example), we can wrap our entities in decorators that intercept calls from client code. Instead of introducing a new event abstraction, we can allow triggered code to compare the visible state of the entity before and after the triggering method.

Even a seemingly innocuous "event observer" class ruins the entity’s nice obliviousness. We step onto the slippery slope of adding notification features because some client might be interested. We shouldn't let our observing clients couple to our commanding clients, nor should we introduce a completely new abstraction in between. Observing clients are really interested in the change of an existing observable field, though not all clients are interested in the same changes.

The decorator solution does require calling code to know about the observer, usually at creation time. For all of the practical use cases for which I've needed observers, it's been sufficient to set them up at creation time. The calling code could be a factory, or even better a repository that wraps the entity at persist time.

This use of observers creates a nice parallel to "Collection Oriented Repositories" as discussed by Vernon in Implementing Domain Driven Design. That is, we explicitly wire our entities once, and then we don't need to subsequently worry about persistence and publishing every time the entity changes. We might even be able to leverage observers to support a Collection Oriented API on top of a Persistence Oriented Repository, and still avoid the morass of the full object relational mapping.

One last important point: be careful not to overuse the observer pattern. By definition, it hides the implementation. This can make it hard to figure out what the system is doing. In particular, never let your observing code make changes to your observed objects. Even if you're not worried about infinite loops, this is wrong because the observed code can't actually be oblivious.

Without further ado, here's the Java implementation of this properly decoupled observer. It uses annotation processing to generate code that supports observers of this interface:
public interface Observer<T, V> {
  default V beforeChange(T t) {
    return null;
  }

  void afterChange(T t, V v);
}
Which may be wired simply by:
 YourEntity entity = new YourEntityImpl();
 entity = new YourEntityObservable(entity, yourObserver);


You could implement this observable decorator manually. It’s kind of boring, so you could instead generate it using something like the linked project above.
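For reference, here is a minimal hand-written sketch of such a decorator. The YourEntity interface, its name() and rename() methods, and the choice of String as the before-change snapshot type are hypothetical, for illustration only:

// Hypothetical entity interface, for illustration.
public interface YourEntity {
  String name();
  void rename(String newName);
}

// A hand-written version of the observable decorator that would otherwise be generated.
public class YourEntityObservable implements YourEntity {
  private final YourEntity delegate;
  private final Observer<YourEntity, String> observer;

  public YourEntityObservable(YourEntity delegate, Observer<YourEntity, String> observer) {
    this.delegate = delegate;
    this.observer = observer;
  }

  @Override
  public String name() {
    return delegate.name(); // queries pass straight through
  }

  @Override
  public void rename(String newName) {
    String before = observer.beforeChange(delegate); // snapshot the visible state
    delegate.rename(newName);                        // run the real command
    observer.afterChange(delegate, before);          // observer compares before and after
  }
}

Commands notify the observer while queries pass straight through, so the entity stays oblivious and observing clients never couple to commanding clients.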

Monday, May 29, 2017

DDD Transactions

The last post ended on a bit of cliffhanger. How do we leverage the compiler to validate before making changes to the domain? We’d like to make it easy to follow this pattern:
  1. Build all new objects, setting fields directly
  2. Validate, and if everything is valid then…
  3. Call the domain methods
The simplest thing is to hide our setters behind an interface. We’ll actually have two interfaces: one "uninitialized interface" for newly created objects, with just the setters; and one "domain interface" with just the proper domain methods. The uninitialized interface prevents taking any action until we’ve validated, and the domain interface encapsulates state. A factory returns an instance of the first interface (setters), and the validating method on the object itself returns the second interface (proper domain methods).

public class ItineraryImpl implements Itinerary, UninitializedItinerary {
  ...
}

public interface Itinerary {
  List<Leg> getLegs();
  boolean isExpected(HandlingEvent event);
}

public interface UninitializedItinerary {
  void setLegs(List<Leg> legs);
  Itinerary validate(ValidationResults r);
}

public class Cargo {
  public void assignToRoute(Itinerary itinerary) {
    ...
  }
}


The next step is to support composing validation for multiple objects. We can do this with a simple local transaction class, used like this for example:

UninitializedItinerary itinerary = itineraryFactory.create();
itinerary.setLegs(...);
txn.with(itinerary).add(i->cargo.assignToRoute(i));
if (txn.isValid()) {
  txn.commit();
} else {
  reject(txn.getValidationResults());
}

With the validation approach described in the previous post, support for these transactions is straightforward.
Within the domain, we use only domain interfaces. We use the transaction class to convert from uninitialized interfaces to domain interfaces. Especially with Java 8 lambda expressions, it's easy to defer actions until after validation. For example, the "cargo.assignToRoute(i)" call above does not run until and unless all validation for the transaction has succeeded.
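For concreteness, here is a minimal sketch of such a transaction class. The DomainTransaction and ItineraryStep names, the no-arg ValidationResults constructor, and its isEmpty() method are assumptions for illustration, not taken from a real project:

import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Validates every registered object first; the deferred domain actions run only
// on commit, and only if all validations succeeded.
public class DomainTransaction {
  private final ValidationResults results = new ValidationResults();
  private final List<Runnable> actions = new ArrayList<>();

  public class ItineraryStep {
    private final UninitializedItinerary uninitialized;

    ItineraryStep(UninitializedItinerary uninitialized) {
      this.uninitialized = uninitialized;
    }

    public void add(Consumer<Itinerary> action) {
      Itinerary validated = uninitialized.validate(results); // validate now
      actions.add(() -> action.accept(validated));           // defer the domain call
    }
  }

  public ItineraryStep with(UninitializedItinerary uninitialized) {
    return new ItineraryStep(uninitialized);
  }

  public boolean isValid() {
    return results.isEmpty(); // assumes ValidationResults can report whether it is empty
  }

  public ValidationResults getValidationResults() {
    return results;
  }

  public void commit() {
    if (!isValid()) {
      throw new IllegalStateException("commit() called on an invalid transaction");
    }
    actions.forEach(Runnable::run);
  }
}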

Using this approach, it's hard to accidentally use an object before it's been initialized. For example, an unadorned:
  cargo.assignToRoute(itinerary);
doesn't compile. Nor does an attempt to modify the private state of an already initialized object:
  cargo.itinerary().setLegs(null);

"Enrichment" or "defaulting" has exactly the same challenge as validation. In fact, any calculation that is both validated and then applied to the domain has the same challenge. We want neither the entity nor clients of the domain to care about defaulting logic. The solution is the same: wire up defaulting services in the object factory and let the transaction wiring ensure that defaulting, as well as validation, is applied at the right time.

These transactions are like local database transactions. Instead of making changes immediately and performing a "rollback" if those changes are not actually valid, DDD transactions validate first and only then proceed to make changes.

This is how to do transactions for Event Sourcing or Prevalent Systems. Enrich and validate whole objects, and use the type system to ensure that validation comes first.

Friday, May 19, 2017

DDD Validation

How should we implement validation for Domain Driven Design?

Our first thought is to put validation in the entity. After all, entities are in charge of maintaining their invariants. They are the heart of our business logic, and validation can also be pretty fundamental to our business logic.

One disadvantage of validating in the entity is that the entity can grow too large. Another disadvantage is that validations often require access to services, so putting those in the entity is not Clean Architecture. The third disadvantage is that entity methods do not naturally provide a powerful enough API. Throwing exceptions does not reasonably handle results for more than a single validation. It's too easy for calling code to neglect to properly call pairs of methods for validation or to check method return values, and these calling conventions detract from the proper focus of an entity.

E.g.

class Creature {
  public void eat(Snack v) {...} //invariant maintained using types
  private void setBloodSugarLevel(int i) {...} //invariant maintained privately
  public void eatIfValid1(Object o) throws RuntimeException {...} //no
  public void eatIfValid2(Object o) throws EatingException {...} //no
  public ValidationResults eatIfValid3(Object o) {...} //no
  public ValidationResults validateEat(Object o) {...} //no
}

The rest of this post describes a different approach to validation, which solves these problems.

We code each validation in a class by itself, thereby satisfying the Single Responsibility Principle. All the validations of an object should implement a common interface. Two interfaces are better than one here; use one interface for indicating that the object is invalid, and another for providing information regarding how or why it's invalid (SRP again). Not only does this help share code for generating validation results, it also causes your code to be cleaner for the cases in which the result is specific to the validation.

E.g.

@Singleton
class ComfySnackValidation implements Predicate<Snack>, Function<Snack, ValidationResult> {
  @Inject
  WeatherService weather;

  public boolean test(Snack snack) {
    int temperature = weather.getCurrentTemperatureInFahrenheit();
    return temperature < 68 || 78 < temperature; // true means the snack is invalid
  }

  public ValidationResult apply(Snack snack) {
    return new ValidationResult(getClass().getSimpleName());
  }
}

There are two important aspects to this approach:
1) we validate whole objects and not individual method calls, and
2) we allow creating invalid objects.

Validating anything other than whole objects requires one of the inelegant APIs mentioned above. Validating only whole objects enables us to leverage the type checker, as we'll see in the next post. The objects that we validate may be entities or value objects. They may be "command objects" that exist solely to serve as arguments to a single method. Often, the object needs a reference to another object which is already valid and persisted. This is fine, so long as nothing in the persistent object graph yet refers back to the new object, the object which is not yet known to be valid.

Creating invalid objects is especially compelling in Java, which doesn't yet support named parameters, and for which entity builders can be challenging. Even in languages which do support named parameters, we often want to use the actual object before we know it's valid, consulting it in defaulting and validation logic. We may even want to publish invalid objects, and it’s better to not have two different code paths for publishing the same fields.

We can achieve “correctness by construction”; there should be no reasonable way to call the domain incorrectly. We can achieve this without the entities having to know about each validation. The essence of the design is that a factory injects a collection of validating services into the object to be validated.

e.g.

@Singleton
public class SnackFactory {
  private final Validator<Snack> validator = new Validator<>();

  @Inject void setComfyValidation(ComfySnackValidation v) {
    validator.add(v);
  }

  // ...other validations to inject...

  public Snack create() {
    return new SnackImpl(validator);
  }
}

With a small generic ValidatorImpl, the boilerplate that we need in the validated object is small:

e.g.

class SnackImpl implements Snack {
  private final Validator<Snack> validator;

  public SnackImpl(Validator<Snack> validator) {
    this.validator = validator;
  }

  public Snack validate(ValidationResults results) {
    return validator.validate(this, results);
  }
}

Here is an example of a generic validator to support this approach.
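One possible minimal sketch (the add and validate signatures, the intersection-type bound, and the ValidationResults.add method are assumptions, not from the original post):

import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Each validation is both a Predicate (true means "invalid") and a Function
// that explains how or why the object is invalid.
public class Validator<T> {
  private final List<Predicate<T>> checks = new ArrayList<>();
  private final List<Function<T, ValidationResult>> explanations = new ArrayList<>();

  public <V extends Predicate<T> & Function<T, ValidationResult>> void add(V validation) {
    checks.add(validation);
    explanations.add(validation);
  }

  public T validate(T target, ValidationResults results) {
    for (int i = 0; i < checks.size(); i++) {
      if (checks.get(i).test(target)) {
        results.add(explanations.get(i).apply(target)); // record how or why it is invalid
      }
    }
    return target;
  }
}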

Next post will discuss how the type checking works.

Thursday, May 18, 2017

DDD Entities

In Domain Driven Design, it's all about the entities.

Entities are the things in your software users' mental model that change.

In Clean Architecture, your entities are independent of all of the rest of your software.

All the rest of your software is defined mostly in relation to entities. Repositories are collections of entities. And nothing else in DDD software changes over time.

Entities are the genuine objects of Object Oriented Programming.

Your software should only change entities by calling their methods, and never by directly modifying their internal state.

E.g. animal.eat(grass) and not animal.setBloodSugarLevel(100)
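As a hedged sketch (Animal and Grass are illustrative, not from any real codebase), that distinction looks like this in code:

// The entity maintains its own invariants; callers express intent (eat)
// and never poke at internal state (blood sugar) directly.
public class Animal {
  private int bloodSugarLevel;

  public void eat(Grass grass) {
    bloodSugarLevel += grass.sugarContent();
  }
}

class Grass {
  int sugarContent() {
    return 100;
  }
}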

Thursday, May 11, 2017

High level code, great performance

https://www.microsoft.com/en-us/research/wp-content/uploads/2016/11/destination-passing-style-1.pdf

Exciting!

Sunday, March 19, 2017

Slicing the cake and different products

The concept of "slicing the cake" is one of the most important lessons to come out of the agile movement. People consistently neglect to apply it across products...

One solution is to have a single developer make all the changes across products. This makes sense if all the products use the same infrastructure, or all have high standards of documentation and devops. E.g. it doesn't work well if different products have different build processes that all require manual intervention. When we reduce barriers to entry this way, "ownership" of individual products might then be meaningful only for code reviews and maintaining long term health of each code base.

The only other solution is to have developers on each of the products all working together to integrate early. If your sprints are two weeks long, each developer gets one week to build an initial implementation, and must be available to integrate immediately on the first day of the second week. Everyone should expect developers to refactor their initial implementations afterwards.

Sources of technical debt

All coding smells get worse over time until they are corrected.

For example, an awkward abstraction might not seem so bad when it is introduced, but as more features are added, it gets increasingly brittle and creates more bugs.

In practical software development these are the most common sources of technical debt:

YAGNI (You aren't gonna need it.)

People build things that don't end up being needed at all, or, more insidiously, they build things whose overall cost exceeds their benefit.

Adapting

Existing code is organized into two parts A and B, and a new use case comes along that needs mostly B. That is, the interface to use B is awkward. Rather than improve that interface, people have their code C start using B via an extra piece of code, a B-C adapter.

Adapters are necessary when the adapted code is owned by a different team, but in that case you'd hope that the interface is well designed in the first place, or at least the integration is part of the application's job. When all the code is owned by a single team, adapters are just debt.

Special casing

A change is desired for a single specific case. The change could apply more generally, but it isn't urgent. People add new code that runs just for the specific case, because they are afraid.

This source of technical debt is particularly tempting. And sometimes it's hard to distinguish from properly avoiding YAGNI. Just as with all refactoring, good automated tests are essential.

Wednesday, February 08, 2017

Schmoperties

Schmoperties is the best of the two Java configuration APIs.

Monday, January 16, 2017

In praise of kafka

Messaging is great because it reduces coupling.

Kafka does it even better.

A message consumer can come up and work just fine after being down all week.

Cool unhyped languages

Pony - the best of low level and high level
Zig - more pragmatic than c
Lamdu - types are friendly and easy
Crema - program without turing completeness

Monday, October 10, 2016

Easier git squash

The traditional way to squash multiple commits into a single commit is to use interactive rebase.

That way involves a lot of merging work, one merge for each commit.

You can avoid all of that work by instead using the "git checkout <branch> -- <path>" form of checkout.

E.g. you'd like to merge feature/foo onto master using a single commit:

git fetch origin master
git checkout -b feature/foo-squashed origin/master
rm -rf *
git checkout feature/foo -- .
git commit -a
git push -f origin feature/foo-squashed:feature/foo

Monday, July 04, 2016

Cool java tools

Power

Dagger
Mapstruct
Autovalue

Safety

Error-prone
Pure4j

Writing your own

Javapoet
Autoservice

Release workflow

This revision control workflow is designed to keep you sane.

There are just two types of long lived branches:
  • master - This branch lives forever, and all changes are eventually merged to it.
  • release branches - These are branched from master for each major release candidate.

Master

New features are implemented against master, by merging feature branches into it. Feature branches should be as short lived as possible, because keeping them up-to-date is expensive. If your feature can't be implemented within a single sprint, consider branching by abstraction.

Release Branches

Release branches are also known as "stabilization branches". Bugs that should be fixed for a release are first merged into the corresponding release branch, and then merged forward to all later releases and then to master.

Change as little code in release branches as possible, because merging to code that has later changed is expensive.

Each published build of a release branch should be tagged so that it can be easily identified and ordered.

Example

In June, we branch master to release/1, and start testing it.
In July, we branch master to release/2, and start testing it.
In August, we find a bug in release/1.
We branch release/1 to bugfix/555, and fix the bug on the bugfix/555 branch.
We merge bugfix/555 to release/1, to release/2, and to master.

Feature development in July and later does not create risk for release/1, and even bugfixes for release/2 do not create risk for release/1.

Tracking Bugfixes

It is straightforward to automatically merge most changes from older releases to newer, and to report regarding changes that have not yet been merged. We want to avoid regressing a bugfix in a later release just because we forgot to merge.

Using the merge history to keep track of what fixes have been applied to releases sometimes requires doing a "trivial merge". Even when there is no change to be made to the later branch, we still have to merge, to inform the revision control software (e.g. git) that the bug is fixed in the later branch.

Cherry picking

If we merge a bugfix into a release branch and only realize afterwards that it should've been fixed in an earlier release, we must cherry-pick it back, and then do a trivial merge as described in the previous section.

Hotfix Branches

It is good to decouple bug fixing from the choice of exactly which build is deployed into production, especially when there are multiple production environments that have different risk profiles. Therefore, the production tag could be different from the tip of the release branch. When this happens and we want to make an urgent fix for that production environment, we don't want to jump to the tip of the release branch. We haven't yet tested all the changes in the release branch, so we create a hotfix branch from the tag that we have tested.

Example

In September, we deploy the tag 1.2 of release/1 to the London production environment.
In October, we make five bugfixes to release/1 in preparation for its 1.7 New York release.
In November, we discover a critical global bug in release/1.
We branch tag 1.2 to hotfix/1.2-LN, and fix the bug on the hotfix/1.2-LN branch.
We deploy tag 1.2-LN1 of hotfix/1.2-LN to the London production environment.
When we have leisure, we merge hotfix/1.2-LN to release/1, etc.

Advanced Topics

Gitflow and ProdFlow

The "gitflow" workflow from nvie has two problems: it confusingly changes the meaning of the "master" branch, and it doesn't clearly justify the cost of an additional long lived branch. Both problems could be resolved by tweaking the workflow to support multiple production environments. This is valuable if you actually need to support multiple production environments :). Let's call this improved workflow "prodflow".

Prodflow has release branches and master-as-trunk. Instead of one branch for release history, it uses a branch for each production environment, prefixed with "prod/". So when we deploy release/1 to production in London, we merge it to prod/LN.

An advantage of this approach is that it simplifies hotfixes. Neither gitflow nor prodflow really needs hotfix branches; you can just merge bugfixes first into the prod branch and then into release branches and master. We don't need to keep track of the version in production.

Instead of: "
We branch tag 1.2 to hotfix/1.2-LN.
We branch hotfix/1.2-LN to bugfix/777...
" in the example above, that'd be just "We branch prod/LN to bugfix/777...".

Deserializing Merges

When there is a conflict, merging bugfixes from one release branch to the next can be a pain. Developers might not do it quickly, and this blocks merging subsequent bugfixes. We accumulate a backlog of unmerged changes, and it becomes increasingly onerous to work it down, especially with proper code review.

The solution is to "deserialize" merges. When bugfixes are made to different parts of the code, one needn't be blocked by the other. We can convey this to git using "rebase --onto", so that the parent commit of the bugfix branch is a commit that is already present on all the long lived branches. If we didn't rebase, ordinarily the parent commit of the bugfix branch would be the tip of its release branch, and the author of that commit might be procrastinating her merge.

Example

We merge bugfix/888 to release/1 but not to master.
Now we'd like to merge bugfix/999 to release/1 and master without being blocked by bugfix/888.
Before merging to release/1, we run: git rebase --onto origin/master release/1 bugfix/999
And it Just Works!

Conclusion

So that's the git release voodoo I learned over the last couple of years. I hope you find it useful!

Wednesday, November 18, 2015

Posting cljs to get around an unkind firewall.

Sunday, December 28, 2014

Extensible Software and Dependency Injection

tl;dr For Dependency Injection of a collection of implementations of an interface, just inject the concrete classes of those implementations.

Supporting change is a big challenge in programming. We want to make the most likely kinds of change quick and easy. How do we do that with Dependency Injection?

So if you have code that does five things that all have the same pattern, you'd like to be able to easily add a sixth. Common examples are programs pulling from multiple sources of data, and programs performing multiple validations. The technical name for this kind of thing is the Open Closed Principle.

If your five things are complicated, their code might organically spread out over your system...
setup();
doWork();
cleanup();

void setup() {
    setupForDataSourceA();
    setupForDataSourceB();
    ...
}

void doWork() {
    workForDataSourceA();
    workForDataSourceB();
    ...
}

...
The idiomatic way to handle this in java is with interfaces:
interface DataSource {
    void setup();
    void doWork();
    void cleanup();
}
So each data source implements the interface, and many parts of the general code loop over a set of instances of the interface.

for (DataSource datasource : datasources) {
    datasource.setup();
}
for (DataSource datasource : datasources) {
    datasource.doWork();
}
...
If you're using Guice for your dependency injection, when you hear the word "set" you might be tempted to use multibindings.

Or you might think, "wouldn't it be cool to add a new datasource without having to change any existing code". You could write a whiz bang code generation framework that automatically wires up implementations, either purely because they implement the interface or when they are additionally annotated.

Keep it simple sweety! Remember the original goal: making software easy to change. When the software is too clever, it becomes harder to change over time, especially as multiple programmers work on it.

There's a simple way to do it in Guice:
@Singleton
class DataSources {
    Set<DataSource> datasources = new HashSet<>();

    public Set<DataSource> getDataSources() {
        return Collections.unmodifiableSet(datasources);
    }
    }

    @Inject void setDataSourceA(DataSourceA datasource) {
        datasources.add(datasource);
    }

    @Inject void setDataSourceB(DataSourceB datasource) {
        datasources.add(datasource);
    }

    ...
}
This works without any additional binding configuration, using Guice's JustInTimeBindings support for eligible constructors.

This wiring is vanilla Java and, perhaps surprisingly, not Guice's embedded domain specific language. These implementations aren't typical injected dependencies, because the design of the application requires that you run lots of them.

Don't be scared by the idea of a concrete class, because the framework code still only ever relies on the interface. The DataSources class above has the same testing burden as Guice modules. If you don't unit test your Guice modules, don't unit test these classes either.

In addition to being simpler than multibindings, this pattern is much more explicit, because the full configuration is in one place. Multibindings allow multiple Guice modules to add to the same binding.

It's one of the few cases where setter injection is best, though it works fine with field and constructor injection too. Setter injection enables you to add new instances by editing a single location in the composing class. It also allows you to simply subclass in order to compose an additional group of instances.
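For instance, composing an additional group of instances by subclassing might look like this (ExtendedDataSources and DataSourceC are hypothetical names):

// Adds one more implementation to the set wired up by DataSources.
// This relies on the datasources field being visible to subclasses in the same package.
@Singleton
class ExtendedDataSources extends DataSources {
    @Inject void setDataSourceC(DataSourceC datasource) {
        datasources.add(datasource);
    }
}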

Thursday, December 26, 2013

Test Driven Development (TDD) recommends first writing a failing test.

Continuous Integration recommends committing early and often.

How do you do both and not break the build?

How do you distinguish between test failures associated with regressions and test failures associated with unfinished development?

Using Junit @Rules!

Instead of just disabling a test using the standard JUnit @Ignore annotation, you can annotate a new test using @NotImplemented. The result of these tests will be inverted; the automated build's JUnit run will succeed if and only if the actual test logic fails.

That way, when the application functionality is still incomplete the actual failing test will not break the build. When the functionality is ready, the inverted test will start breaking the build, so that you don't forget to enable it.

And the @NotImplemented annotation provides a machine-checked way to track known issues.

Here's the implementation:

import org.junit.rules.TestRule;
import org.junit.runner.Description;
import org.junit.runners.model.Statement;
import java.lang.annotation.*;

/*
 * To use in a testcase, include a line:
 *
        @Rule
        public NotImplemented.MustFail notImplementedRule = new NotImplemented.MustFail();
 *
 * and annotate individual test methods:
 *
        @NotImplemented @Test
        public void someTest() {
                ...
        }
 *
 */

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface NotImplemented {
        String value() default ""; //to enable grouping tests; doesn't affect tests
        boolean invertTest() default true;

        public static class MustFail implements TestRule {
                @Override
                public Statement apply(final Statement base, Description description) {
                        NotImplemented annotation = description.getAnnotation(NotImplemented.class);
                        if (annotation == null || !annotation.invertTest()) {
                                return base;
                        } else {
                                return new Statement() {
                                        @Override
                                        public void evaluate() throws Throwable {
                                                try {
                                                        base.evaluate();
                                                } catch (AssertionError e) {
                                                        return;
                                                }
                                                throw new AssertionError();
                                        }
                                };
                        }
                }
        }
}

Sunday, July 18, 2010

Limited video for kids

Few parents can completely resist employing the video-as-babysitter.

Parents that use linux can easily have their computer play just one movie with the following script:

if [ "$1" ]
then
totem --fullscreen "$@" &
fi
xtrlock

It starts playing the video(s) you want, but then prevents the kids from playing other videos or engaging in other unsupervised fiddling with the computer.

xtrlock needs to be installed, easily for example from the standard Ubuntu Software Center in Ubuntu 10.04.

You can create a Launcher in the panel for this script with a %F parameter, and drag videos to it from the File Browser.

Sunday, July 05, 2009

Google Strategy

Google has three main strategic product categories.

Google has login-based services, like Gmail. These are most like other companies' internet services. They also have the most lock-in effect, though Google expends some effort to reduce it (for example allowing downloading and forwarding away email messages). These services enrich Google search data with user identity data.

Google has internet growth services, like Chrome, Android, and News. In general, companies would like to reduce "substitutes" and increase "complements". Since Google has dominant marketshare of internet search, anything that increases internet use is a complement. Google can run ads on all of these services, and to varying degrees their users are more likely to be Google search users. Developing these services is also a little like Military Keynesianism. It gives Google engineers fun projects to work on, helping Google to get and keep top engineers, who help it to maintain its internet search dominance.

Google has the internet. Since Google is the king of search, it will vigorously defend the internet against closed competing networks like the new social networks Facebook, LinkedIn, and Twitter. At least three distinct Google projects, though they also occupy the second category, attempt to open up the social software world: Social Graph API, Open Social, and Wave. This makes Google "the good guys," because they'll always try to bring the fight to the open web, where they have the competitive advantage. Now, if Google didn't have a challenger, it might just release something into the first category (Orkut for example), but when they do have a competitor, everyone will end up being a lot better off. Wave promises to be a wonderful open technology, though it will put a lot of social software companies out of business.

Thursday, July 26, 2007

Temporal and Bitemporal in One Sentence

Temporal database support allows efficiently answering questions like "What was the state of this entry on Monday?", and bitemporal database support, questions like "What was the state of this entry on Monday, if I had asked on Tuesday?"