Wednesday, September 23, 2020

IPython replacing shell

IPython is a good UNIX shell replacement for those who want an interactive shell with a better programming language.

You can use unadorned shell commands, file globbing, and pipes, as well as regular python.

Regular shell: ls *.py | grep -v foo

Regular python: for x in range(3): print(x)

There is special syntax for running shell commands so that they can easily be used from python code...

Shell in python: my_python_var1 = !ls #only for assigning variables

Python in shell: ls $my_python_var2

For anybody pining for python2: print "foo" #works!

Also works for any outermost python function: dir enumerate(list())

But unfortunately doesn't currently work with other code: for x in range(3): print x

For some reason, the maintainers are no longer publishing a profile for this stuff, so you can use the following to get up and running:

pip install ipython

mkdir -p ~/.ipython/profile_default
cat >> ~/.ipython/profile_default/ipython_config.py <<EOF
from IPython.terminal.prompts import Prompts, Token
import os

class MyPrompt(Prompts):
    def cwd(self):
        cwd = os.getcwd()
        if cwd.startswith(os.environ['HOME']):
            cwd = cwd.replace(os.environ['HOME'], '~')
            cwd_list = cwd.split('/')
            for i,v in enumerate(cwd_list):
                if i not in (1,len(cwd_list)-1): #not last and first after ~
                    cwd_list[i] = cwd_list[i][0] #abbreviate
            cwd = '/'.join(cwd_list)
        return cwd

    def in_prompt_tokens(self, cli=None):
        return [
                (Token.Prompt, 'In ['),
                (Token.PromptNum, str(self.shell.execution_count)),
                (Token.Prompt, '] '),
                (Token, self.cwd()),
                (Token.Prompt, ': ')]

c.TerminalInteractiveShell.prompts_class = MyPrompt
c.TerminalInteractiveShell.editing_mode = 'vi'
c.InteractiveShell.autocall = 2 #insert parens around functions as much as possible
c.InteractiveShellApp.exec_lines = ['%rehashx']
EOF

Wednesday, May 27, 2020

Select and Lookup

Select

Computers let you select something you see and do useful things with it. This interaction should be improved and standardized.

Whether it's with a mouse or a finger, when you select text, a little menu should pop up, and its first choices should be: Lookup, Copy, Share.

It's fine if applications want to customize and support additional choices, but those three should always be the first choices. Currently each application has those mixed up differently with other choices, sometimes not visible without an additional action, and sometimes even missing them entirely.

Just like on mobile devices, the menu should pop up immediately on traditional computers, without needing a separate action like a right click. The menu should not interfere with changing the selection.

Lookup

"Lookup" should be customizable, in terms of whether the first screen is a local search, a query to a popular search engine, a dictionary lookup, etc. Other forms of lookup should be linked from that first screen.

For example, selecting the word "customizable" above could show the content of the Wikipedia page for "customizable", with links on the first page there to the dictionary entry, to the Google search, and to the Memex page.

Ideally the first screen would display the page returned by Google's "I'm Feeling Lucky" feature. As the linked page mentions, making this feature more prominent might have revenue implications for Google. As long as we're talking about Google, we should also mention that Google already tries to implement something like what I'm describing. Many Google searches already show a dictionary definition and a Wikipedia link on their first page of results. That would be redundant if operating systems were already presenting that information even more prominently.

Another important possibility for the main lookup page, when the selected text is a hyperlink, would be the actual linked page. For example, the two pages linked above are a lot more useful for understanding how I'm using the terms of the linked text than any of the other sources of information discussed here (at least at the time of writing), so they should be the primary destination for readers looking them up.

Monday, February 24, 2020

Automated tests and new bugs

Can unit tests find unexpected bugs?

Russ Cox mentions offhandedly, in his brilliant recent piece on versioning in Go, that no, unit tests only "make sure bugs you fix are not reintroduced over time".

On the other hand, Hillel Wayne has a nice example using property based testing and contracts. See if you can spot the bug yourself before following that link:
def mode(l):
  max = None
  count = {}
  for x in l:
    if x not in count:
      count[x] = 0
    count[x] += 1
    if not max or count[x] > count[max]:
      max = x
  return max

Hillel analyzes several different approaches to testing. The punch line is that you catch the bug with the following steps:
  • annotate the mode function with its specification:
    
    @ensure("result must occur at least as frequently as any member", 
      lambda a, r: all((a.l.count(r) >= a.l.count(x) for x in a.l)))
    
    
  • also prepend the following:
    
    from hypothesis import given
    from hypothesis.strategies import lists, integers, text
    from dpcontracts import require, ensure
    
    @given(lists(text()))
    def test_mode(l):
        mode(l)
    
    
  • install the prerequisites with: pip3 install hypothesis dpcontracts pytest
  • run pytest on the file that you created with the mode function and its test code
Though this particular specification looks like an alternate implementation, it isn't intended to be. Using one implementation to test another is a kind of "test oracle", but that doesn't feel like an elegant way to find bugs. The test implementation could have its own bugs, and if maintained by the same programmers, it could even have the same bugs as the regular implementation.

In contrast, a specification can be easier to read than any practical implementation. At least with current technology, there are limits on how clear pragmatic code can be. Specification-like code can be too slow. Perhaps the specification could be an oversimplified implementation, and still be useful for testing.

Testing based purely on the specification is not enough however. We need some automated equivalent of "clear box testing".

Thursday, January 16, 2020

Zig

It'd be great for Zig to replace C/C++.

Zig's main advantage over C is robustness. That includes features like:
It aims to keep all of the advantages of C, including:
Its main architectural trick is good support for compile-time code execution. This is even how it implements generic types.

It also includes (or will include) tooling that's as good as any platform's:

Tuesday, January 07, 2020

Beyond Literate Programming

Programs must be written for people to read, and only incidentally for machines to execute.
― Harold Abelson

More important to read a program than to run it? Absurd! How can something even be called a program if it can't be executed by a machine? So soften the hyperbole: programs should first be written to be easy to read, and only afterwards optimized. It's the opposite of premature optimization.

The quote's hyperbole is good, because technical debt arises precisely because customers benefit directly only from machines executing programs, not from anything associated with people reading programs' source code. All maintenance programming starts with people reading. So this could be a programming principle that's more important than extensibility, discoverability, clean code, clean architecture, SOLID, DRY, YAGNI, KISS, least astonishment, defensive programming, offensive programming, cohesion & coherence, consistent style, information hiding, and good separation of concerns. Let's call it the readability principle: programs should be written in such a way as to communicate their function as clearly as possible. "Function" refers to function which is meaningful to the user. "Communicate" refers to informing somebody who is not already familiar with the program, and not even an expert in all of the technologies employed. This implies that programmers should avoid both Rube Goldberg machines and fancy language features.

Knuth's literate programming is the original push to have a single program provide both function and documentation. Its most popular descendant is the Javadoc family, which is now the standard way to document APIs. Jupyter Notebooks and Devcards publish executable documentation. Concordion brings together test code and documentation in a lovely way. All of these involve additional program documentation, which is not itself executable. Of course this should not distract from making properly self-documenting code. All additional documentation must have clear and agreed benefit, to justify the cost of manually keeping it up-to-date with the program itself.

The next big thing in program readability is humble: put design documents and technical specifications in the same repository as your program! And favor the markdown format. The best new source control repositories (Github, Bitbucket, and Gitlab) all render it automatically to the web. Having program and doc together makes it easier to keep them in sync. Versions and links are easier, as well as running find-and-replace across program and doc. Perhaps most importantly, if you keep program and documentation together, you signal that programmers should maintain both.

Simplicity

In programming conversation, people often incorrectly use the word "simple" as a synonym for "good". This is probably because they have on some level internalized the lesson that "complexity kills", so they conflate "simplicity" and "justified complexity".

Simplicity refers to having fewer layers, fewer features, fewer moving parts, and fewer distinctions.

Thursday, March 28, 2019

Translating objects simply


How do you translate one object into a different, related object? Simply!

interface FooBarTranslator {
    Foo translate(Bar bar);
}

Some people, when confronted with this problem, think "I know, I'll use a framework." Now they have two problems, as the saying goes.

Many people code it themselves using abstractions and design patterns. Their code has organizing principles, but it isn't clear what those principles achieve. This code starts out as interesting to read, and gets progressively worse as each new developer touches it. It's even harder to maintain when we're integrating with other systems. It's a pain to test properly with those other systems, so we're reluctant to change the code unless we're forced to.

So here's a simple pattern to use the next time you find yourself translating objects.
  1. Use a single function with a single line for each field on the object to which you are translating.
  2. Whenever you need something new, do the simplest thing that could possibly work.
  3. There is no step 3!
Before any more explanation, let's do an example or two...

E.g. if Foo has fields a and b and c, your translator might look like:

    Foo translate(Bar bar) {
        Foo foo = new Foo();
        foo.setA(bar.getQux());
        foo.setB(bar.getBaz());
        foo.setC(bar.getQux());
        return foo;
    }

Here's an example that shows simple ways to address multiple aspects of the problem.

    Foo translate(Bar bar) {
        Foo foo = fooFactory.create();
        foo.setA(123);
        foo.setB(bar.getB()+1);
        foo.setC(aFunction(bar.getC()));
        D d = anExpensiveFunction(bar.getD());
        foo.setD(d.getSomePart());
        foo.setE(d.getSomeOtherPart());
        foo.setNestedF(fBarTranslator.translate(bar));
        if (bar.getFloat() != null) foo.setPrimitiveFloat(bar.getFloat());
        return foo;
    } 

Organizing the code in terms of the target object eliminates a whole category of bugs and confusion, in which the responsibility for setting a single field is spread out over multiple places. The code in those different places grows overlapping and conflicting behavior.

Organizing the code in terms of the target object achieves some "functional programming" goodness. Stakeholders often ask questions like "why does this field have this value? where does it come from?" and our code makes it easy to answer that kind of question. Each of the lines in the translation function is like a definition of the target object field. Though a single field on the source object might contribute to the value of multiple fields on the target object, a single field on the target object is only ever determined by a single complete function.

For fields with nontrivial logic, that logic should reside in a function in a different class, a la the Single Responsibility Principle. (I favor static methods for pure logic and field injection for logic that needs to consult some state, but there are good reasons to do otherwise.)
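
As a rough sketch of that split (the names here are hypothetical, not from the examples above): a static helper for pure logic, and an injected helper for logic that needs to consult some state.

import java.time.Clock;
import java.time.LocalDate;
import javax.inject.Inject;
import javax.inject.Singleton;

// Pure logic: a static method with no state, trivially unit-testable.
class NameFormatter {
    static String fullName(String first, String last) {
        return (first + " " + last).trim();
    }
}

// Logic that consults state: the stateful dependency is field-injected.
@Singleton
class AgeCalculator {
    @Inject Clock clock;

    int ageInYears(LocalDate birthDate) {
        return birthDate.until(LocalDate.now(clock)).getYears();
    }
}

The translator itself then stays a list of one-line field definitions, e.g. foo.setName(NameFormatter.fullName(bar.getFirst(), bar.getLast())) and foo.setAge(ageCalculator.ageInYears(bar.getBirthDate())).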

Any additional architecture only ever harms your code! Resist the temptation to add architecture. It will not pay for itself.

An important rule that you can't see in the examples above: never retrieve information from the target while you are translating to it. Another thing that might not be obvious: if your translation is a nice normal stateless translation, then your translator class should not have any member variables. Use local variables and not member variables for your expensive calculations.

If you have more than a handful of fields, you might want to order your setter calls alphabetically. That makes it harder for developers to accidentally set the same field twice in the same function. It makes it easy to find a field on a printout or to consult multiple fields without having to search via the app. It also makes git merge conflicts smarter; you want a conflict when two different developers merge different code for setting the same field, but you don't want a conflict when two different developers merge code for setting two different new fields. Just adding support for new fields to the end of the function yields a conflict in both cases.

Adding support for new fields to the end of the function is not a guideline which is easy to defend, or which is even easy for faithful developers to see in the code. If that's your de facto organizing principle, then you'll end up with no organizing principle. In contrast, if you alphabetize and have clear standards around dependency injection etc as mentioned above, you'll have one canonical simplest possible program for any translation. One right way to do it.

This raises the biggest disadvantage. If you don't have firsthand, repeated, experience with the pain of the alternatives at scale, you might not appreciate this pattern. 😉

Friday, December 28, 2018

The Web is a detail?

Should programs be mostly independent of how they communicate with each other?

Yes, if you can do it right.

If you do it wrong, you'll end up with a big hunk of useless complexity. You have been warned. The rest of this post describes one way of "doing it right" in Java. TL;DR use Autovalue and Mapstruct in a separate module.

One of the nicest things about Java is that it allows checking your interfaces at compile time, before running any of your code. So you can have one module that depends on the interfaces of your transport technology, and another module that doesn't. You can even have Maven prevent people from accidentally spreading that dependency.
This enables building and testing the domain code entirely independently of transport technology choice. Ideally, you could add support for another transport later by adding another module and not making any changes to your domain code. If your transport technology involves code generation, you might keep the generated Data Transfer Objects (DTOs) in the same module as your adapter code.

If you need multiple transports from day one, then this architecture is obviously the right way to go. It's more controversial if you're starting with a single transport, and it's also easier to get wrong. There's a strong argument to be made that you aren't going to need it. Besides being disciplined about dependencies, the most important thing is to keep the adapter layer small. We can do that with some open source code generation tools...

Let's say that you have a bunch of DTOs to work with messages that are sent and received by your message encoding technology, e.g. Google Protocol Buffers (Protobuf). Instead of working with those technology specific objects directly, your domain code can work with corresponding lightweight value objects.

Initially, you'll want to have one abstract class corresponding to each Protobuf message. With a little bit of annotation, AutoValue will nicely generate value objects for you, as well as fluent builders to compensate for Java's lack of named parameters. Here's an example, assuming you have a FooRequest message with fields bar and baz:

@AutoValue
abstract class FooRequest {
 abstract int getBar();
 abstract String getBaz();

 static Builder builder() { return new AutoValue_FooRequest.Builder(); }

 @AutoValue.Builder
 static abstract class Builder {
  abstract Builder setBar(int i);
  abstract Builder setBaz(String s);
  abstract FooRequest build();
 }
}

That goes in your domain module, and your domain code can proudly rely on it. Imagine a similar FooResponse. For example, you might have a FooUseCase with a method that accepts a FooRequest. E.g.
@Singleton
class FooUseCase {
 FooResponse handle(FooRequest r) {
  someFunction(r.getBar(), r.getBaz());
  return FooResponse.builder()
   .setBar(...)
   .setBaz(...)
   .build();
 }
 ...
}

To map between the DTO and the domain object, use the following Mapstruct annotated abstract class in the adapter module:

@Singleton
abstract class FooUseCaseAdapter {
 FooRequestMapper requestMapper = new FooUseCaseAdapter$FooRequestMapperImpl(); //generated by mapstruct
 FooResponseMapper responseMapper = new FooUseCaseAdapter$FooResponseMapperImpl(); //generated by mapstruct
 @Inject FooUseCase foo;
 @Inject Sender sender;

 void handle(FooIncomingMessage m) {
  sender.send(responseMapper.map(foo.handle(requestMapper.map(m))));
 }

 @Mapper(unmappedTargetPolicy=ERROR, unmappedSourcePolicy=IGNORE)
 static abstract class FooRequestMapper {
  abstract FooRequest map(FooIncomingMessage m);
 }

 @Mapper(unmappedTargetPolicy=IGNORE, unmappedSourcePolicy=ERROR)
 static abstract class FooResponseMapper {
  abstract FooOutgoingMessage map(FooResponse r);
 }
}

Using Mapstruct means that you don't have to write adapter code for each field. If you have a field of the same name in the DTO class and in the Autovalue class, the generated code will map that field automatically. The "unmapped" policy of ERROR means that you will be stopped at compile time from accidentally renaming a field in the DTO but not in the domain, or vice versa. It also prevents you from forgetting to add a field to the DTO when you add a field to the domain. You'll want to use the "target" and "source" policies as shown above, lest you get lots of false positives as people evolve the DTOs independently of your application.

So our YAGNI overhead is two trivial classes for each DTO. One of them needs two lines for each DTO field used by the application. Considering that the application needs changes anyway to use a new field, those two extra changes do not seem onerous. Besides not having to write and maintain the generated code, there's the advantage that its design doesn't drift. The Autovalue classes may accumulate logic, via interfaces and default values, for example. That's perfectly fine.

This whole exercise assumes that most of your automated tests will be on the Autovalue classes. The unmapped error policy reduces the risk of bugs, even if you don't test the DTO mapping at all, but the paranoid among us might want to have a test or two to exercise each field. Mapstruct is smart enough to map between different types, so that you can use a long integer on your DTO and an Instant in your domain, for example. If you use that feature, it makes sense to test it.
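
For example, a per-field mapping test might look like the following sketch (JUnit 5 assumed; the FooIncomingMessage builder calls assume a Protobuf-style generated DTO, and the field values are made up):

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;
import org.mapstruct.factory.Mappers;

class FooRequestMapperTest {
 FooUseCaseAdapter.FooRequestMapper mapper = Mappers.getMapper(FooUseCaseAdapter.FooRequestMapper.class);

 @Test
 void mapsEachField() {
  FooIncomingMessage dto = FooIncomingMessage.newBuilder()
   .setBar(42)
   .setBaz("qux")
   .build();

  FooRequest domain = mapper.map(dto);

  assertEquals(42, domain.getBar());
  assertEquals("qux", domain.getBaz());
 }
}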

What about generated enum types? Perhaps compromise is best here: generate or pull the enum classes into a separate module, and as long as these don't do strange things in static initialization, just rely on that module in the domain. The opportunity for coupling related to enums should be small. The alternative is to duplicate all the enums manually. Though Mapstruct will automatically map and warn regarding those too, that overhead may not pay for itself, especially for enums that have values that the application just needs to pass on without special handling.
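
For instance, a hedged sketch of letting Mapstruct handle an enum directly (ProtoColor and Color are made-up names; Mapstruct maps enum constants by name and reports unmapped ones when the mapper is generated):

import org.mapstruct.Mapper;

// Maps the generated ProtoColor enum to the hand-written domain Color enum.
// Constants with matching names are mapped automatically.
@Mapper
interface ColorMapper {
 Color map(ProtoColor protoColor);
}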

Notice in the example above that a single adapter handles publishing as well as incoming message processing. We could have the domain code return a tuple-like object which contains references to Autovalue objects for all of the messages which could be published in reaction to the incoming message. This is very easy to test, and doesn't require any mocking at all. Alternatively, we could instead have the domain call some kind of publisher interface, with two implementations: a mock implementation for testing, as well as an adapter of the real transport. The real implementation would be bound to the interfaces using a dependency injection configuration in the adapter. This approach enables the adapters to be quite uniform, and to hardly ever change. If the number of possible messages to publish for each case is small, the first approach is simpler.
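
A sketch of the second approach, with assumed names: the domain depends only on a small publisher interface, the adapter module binds the real transport to it, and tests use a trivial fake.

import java.util.ArrayList;
import java.util.List;

// In the domain module: the only thing the use case knows about publishing.
interface FooPublisher {
 void publish(FooResponse response);
}

// In tests: a recording fake, no mocking framework needed.
class RecordingFooPublisher implements FooPublisher {
 List<FooResponse> published = new ArrayList<>();

 public void publish(FooResponse response) {
  published.add(response);
 }
}

The real implementation would live in the adapter module, wrapping the Sender and response mapper shown earlier.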

(The name of this post is a quote from Bob Martin. How the database might also be a detail is a topic for another post. Event sourcing makes it less relevant.)

Tuesday, February 20, 2018

State

Computer programming is much more difficult than it should be. It's much less elegant than it should be. Programs often resemble logical Rube Goldberg machines, or corporate income tax forms.

This complexity is caused by our inability to properly manage "state", or information that changes over time. Procedural programming, object oriented programming, domain driven design, immutable data structures, and monads are all attempts to better manage state, and none of them are good enough.

The more recent functional reactive technologies are a step in the right direction. React is good, and MobX, Javelin, and Matrix are better. Serious programming can be as easy as spreadsheets. We can have full referential transparency, and we can refactor without fear.


Sunday, December 03, 2017

DDD Observers

Observers automatically run code when some condition occurs, without the causing code having to care.

Even in an imperative, object oriented environment, we'd like to support "birds sing when the sun comes up" without having to change the code for the sun coming up.

Theoretically, we could do this with simple polling. We could just explicitly check the condition every interval. E.g. is the sun up yet? Five seconds later: is the sun up now? Five second after that: what about now? This is not elegant, and could be costly, and completely falls down when we need to poll many objects.

We can implement observers cleanly by wrapping our entities!

Instead of triggering on changes to the internal state of an entity, using  "properties" as Kotlin does for example, we can wrap our entities in decorators that intercept calls from client code. Instead of introducing a new event abstraction, we can allow triggered code to compare the visible state of the entity before and after the triggering method.

Even a seemingly innocuous "event observer" class ruins the entity’s nice obliviousness. We step onto the slippery slope of adding notification features because some client might be interested. We shouldn't let our observing clients couple to our commanding clients, nor should we introduce a completely new abstraction in between. Observing clients are really interested in the change of an existing observable field, though not all clients are interested in the same changes.

The decorator solution does require calling code to know about the observer, usually at creation time. For all of the practical use cases for which I've needed observers, it's been sufficient to set them up at creation time. The calling code could be a factory, or even better a repository that wraps the entity at persist time.

This use of observers creates a nice parallel to "Collection Oriented Repositories" as discussed by Vernon in Implementing Domain Driven Design. That is, we explicitly wire our entities once, and then we don't need to subsequently worry about persistence and publishing every time the entity changes. We might even be able to leverage observers to support a Collection Oriented API on top of a Persistence Oriented Repository, and still avoid the morass of the full object relational mapping.

One last important point: be careful not to overuse the observer pattern. By definition, it hides the implementation. This can make it hard to figure out what the system is doing. In particular, never let your observing code make changes to your observed objects. Even if you're not worried about infinite loops, this is wrong because the observed code can't actually be oblivious.

Without further ado, here's the java implementation of this properly decoupled observer. It uses annotation processing to generate code to support observers of this interface:
public interface Observer<T, V> {
 default V beforeChange(T t) {
  return null;
 };
 void afterChange(T t, V v);
}
Which may be wired simply by:
 YourEntity entity = new YourEntityImpl();
 entity = new YourEntityObservable(entity, yourObserver);


You could implement this observable decorator manually. It’s kind of boring, so you could instead generate it using something like the linked project above.
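
For illustration, here is a hand-written sketch of such a decorator, for a made-up Creature entity with a single command method; the generated version in the linked project is the real thing.

// The entity's domain interface (hypothetical).
interface Creature {
 void eat(String food); // command: changes state
 int bloodSugarLevel(); // query: visible state that observers may compare
}

// The decorator intercepts the command, lets the observer snapshot visible
// state before the change and react after it. The entity stays oblivious.
class CreatureObservable<V> implements Creature {
 private final Creature delegate;
 private final Observer<Creature, V> observer;

 CreatureObservable(Creature delegate, Observer<Creature, V> observer) {
  this.delegate = delegate;
  this.observer = observer;
 }

 public void eat(String food) {
  V before = observer.beforeChange(delegate);
  delegate.eat(food);
  observer.afterChange(delegate, before);
 }

 public int bloodSugarLevel() {
  return delegate.bloodSugarLevel(); // queries pass straight through
 }
}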

Monday, May 29, 2017

DDD Transactions

The last post ended on a bit of a cliffhanger. How do we leverage the compiler to validate before making changes to the domain? We’d like to make it easy to follow this pattern:
  1. Build all new objects, setting fields directly
  2. Validate, and if everything is valid then…
  3. Call the domain methods
The simplest thing is to hide our setters behind an interface. We’ll actually have two interfaces: one "uninitialized interface" for newly created objects, with just the setters; and one "domain interface" with just the proper domain methods. The uninitialized interface prevents taking any action until we’ve validated, and the domain interface encapsulates state. A factory returns an instance of the first interface (setters), and the validating method on the object itself returns the second interface (proper domain methods).

public class ItineraryImpl implements Itinerary, UninitializedItinerary {
  ...
}

public interface Itinerary {
  List getLegs();
  boolean isExpected(HandlingEvent event);
}

public interface UninitializedItinerary {
  void setLegs(List legs);
  Itinerary validate(ValidationResults r);
}

public class Cargo {
  public void assignToRoute(Itinerary itinerary) {
    ...
  }
}


The next step is to support composing validation for multiple objects. We can do this with a simple local transaction class, used like this for example:

UninitializedItinerary itinerary = itineraryFactory.create();
itinerary.setLegs(...);
txn.with(itinerary).add(i->cargo.assignToRoute(i));
if (txn.isValid()) {
  txn.commit();
} else {
  reject(txn.getValidationResults());
}

With the validation approach described in the previous post, support for these transactions is simple and easy.
Within the domain, we use only domain interfaces. We use the transaction class to convert from uninitialized interfaces to domain interfaces. Especially with Java 8 lambda expressions, it's easy to defer actions until after validation. For example, the "cargo.assignToRoute(i)" call above does not run until and unless all validation for the transaction has succeeded.
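
A minimal sketch of such a transaction class (names assumed; it presumes the uninitialized interfaces share a validate(ValidationResults) shape, abstracted here as Validatable, and that ValidationResults can report whether it is empty):

import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Assumed common shape of UninitializedItinerary and friends.
interface Validatable<T> {
 T validate(ValidationResults results);
}

class Transaction {
 private final ValidationResults results = new ValidationResults();
 private final List<Runnable> deferredDomainCalls = new ArrayList<>();

 // Validation (and defaulting) runs immediately; domain calls are only recorded.
 <T> Step<T> with(Validatable<T> uninitialized) {
  return new Step<>(uninitialized.validate(results));
 }

 class Step<T> {
  private final T validated;
  private Step(T validated) { this.validated = validated; }

  void add(Consumer<T> domainCall) {
   deferredDomainCalls.add(() -> domainCall.accept(validated));
  }
 }

 boolean isValid() { return results.isEmpty(); }
 ValidationResults getValidationResults() { return results; }

 // Called only once isValid(); the deferred domain calls finally run.
 void commit() { deferredDomainCalls.forEach(Runnable::run); }
}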

Using this approach, it's hard to accidentally use an object before it's been initialized. For example, an unadorned:
  cargo.assignToRoute(itinerary);
doesn't compile. Nor does an attempt to modify the private state of an already initialized object:
  cargo.itinerary().setLegs(null);

"Enrichment" or "defaulting" has exactly the same challenge as validation. In fact, any calculation that is both validated and then applied to the domain has the same challenge. We want neither the entity nor clients of the domain to care about defaulting logic. The solution is the same: wire up defaulting services in the object factory and let the transaction wiring ensure that defaulting, as well as validation, is applied at the right time.

These transactions are like local database transactions. Instead of making changes immediately and performing a "rollback" if those changes are not actually valid, DDD transactions validate first and only then proceed to make changes.

This is how to do transactions for Event Sourcing or Prevalent Systems. Enrich and validate whole objects, and use the type system to ensure that validation comes first.

Friday, May 19, 2017

DDD Validation

How should we implement validation for Domain Driven Design?

Our first thought is to put validation in the entity. After all, entities are in charge of maintaining their invariants. They are the heart of our business logic, and validation can also be pretty fundamental to our business logic.

One disadvantage of validating in the entity is that the entity can grow too large. Another disadvantage is that validations often require access to services, so putting those in the entity is not Clean Architecture. The third disadvantage is that entity methods do not naturally provide a powerful enough API. Throwing exceptions does not reasonably handle results for more than a single validation. It's too easy for calling code to neglect to properly call pairs of methods for validation or to check method return values, and these calling conventions detract from the proper focus of an entity.

E.g.

class Creature {
 public void eat(Snack v) {...} //invariant maintained using types
 private void setBloodSugarLevel(int i) {...} //invariant maintained privately
 public void eatIfValid1(Object o) throws RuntimeException {...} //no
 public void eatIfValid2(Object o) throws EatingException {...} //no
 public ValidationResults eatIfValid3(Object o) {...} //no
 public ValidationResults validateEat(Object o) {...} //no
}

The rest of this post describes a different approach to validation, which solves these problems.

We code each validation in a class by itself, thereby satisfying the Single Responsibility Principle. All the validations of an object should implement a common interface. Two interfaces are better than one here; use one interface for indicating that the object is invalid, and another for providing information regarding how or why it's invalid (SRP again). Not only does this help share code for generating validation results, it also causes your code to be cleaner for the cases in which the result is specific to the validation.

E.g.

@Singleton
class ComfySnackValidation implements Predicate<Snack>, Function<Snack, ValidationResult> {
 @Inject
 WeatherService weather;

 public boolean test(Snack snack) {
  int temperature = weather.getCurrentTemperatureInFarenheit();
  return temperature < 68 || 78 < temperature;
 }

 public ValidationResult apply(Snack snack) {
  return new ValidationResult(getClass().getSimpleName());
 }
}

There are two important aspects to this approach:
1) we validate whole objects and not individual method calls, and
2) we allow creating invalid objects.

Validating anything other than whole objects requires one of inelegant APIs mentioned above. Validating only whole objects enables us to leverage the type checker, as we'll see in the next post. The objects that we validate may be entities or value objects. They may be "command objects", that exist solely to serve as arguments to a single method. Often, the object needs a reference to another object which is already valid and persisted. This is fine, so long as nothing in the persistent object graph yet refers back to the new object, the object which is not yet known to be valid.

Creating invalid objects is especially compelling in Java, which doesn't yet support named parameters, and for which entity builders can be challenging. Even in languages which do support named parameters, we often want to use the actual object before we know it's valid, consulting it in defaulting and validation logic. We may even want to publish invalid objects, and it’s better to not have two different code paths for publishing the same fields.

We can achieve “correctness by construction”; there should be no reasonable way to call the domain incorrectly. We can achieve this without the entities having to know about each validation. The essence of the design is that a factory injects a collection of validating services into the object to be validated.

e.g.

@Singleton
public class SnackFactory {
  private Validator validator = new Validator();

  @Inject void setComfyValidation(ComfySnackValidation v) {
    validator.add(v);
  }

  ...other validations to inject...

  public Snack create() {
    return new Snack(validator);
  }
}

With a small generic ValidatorImpl, the boilerplate that we need in the validated object is small:

e.g.

class SnackImpl implements Snack {
 private Validator validator;

 public SnackImpl(Validator validator) {
  this.validator = validator;
 }

 public Snack validate(ValidationResults results) {
  return validator.validate(this, results);
 }
}

Here is an example of a generic validator to support this:
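
(A minimal sketch with assumed details: each validation is registered as something that is both a Predicate, answering "is it invalid?", and a Function, explaining how; ValidationResults is assumed to expose add() and isEmpty().)

import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

class ValidatorImpl<T> {
 private final List<Predicate<T>> checks = new ArrayList<>();
 private final List<Function<T, ValidationResult>> explanations = new ArrayList<>();

 <V extends Predicate<T> & Function<T, ValidationResult>> void add(V validation) {
  checks.add(validation);
  explanations.add(validation);
 }

 // Runs every check, collects a result for each failure, and hands back
 // the object only once it is known to be valid.
 T validate(T t, ValidationResults results) {
  for (int i = 0; i < checks.size(); i++) {
   if (checks.get(i).test(t)) {
    results.add(explanations.get(i).apply(t));
   }
  }
  return results.isEmpty() ? t : null;
 }
}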

Next post will discuss how the type checking works.

Thursday, May 18, 2017

DDD Entities

In Domain Driven Design, it's all about the entities.

Entities are the things in your software users' mental model that change.

In Clean Architecture, your entities are independent of all of the rest of your software.

All the rest of your software is defined mostly in relation to entities. Repositories are collections of entities. And nothing else in DDD software changes over time.

Entities are the genuine objects of Object Oriented Programming.

Your software should only change entities by calling their methods, and never by directly modifying their internal state.

E.g. animal.eat(grass) and not animal.setBloodSugarLevel(100)

Thursday, May 11, 2017

High level code, great performance

https://www.microsoft.com/en-us/research/wp-content/uploads/2016/11/destination-passing-style-1.pdf

Exciting!

Sunday, March 19, 2017

Slicing the cake and different products

The concept of "slicing the cake" is one of the most important lessons to come out of the agile movement. People consistently neglect to apply it across products...

One solution is to have a single developer make all the changes across products. This makes sense if all the products use the same infrastructure, or all have high standards of documentation and devops. E.g. it doesn't work well if different products have different build processes that all require manual intervention. When we reduce barriers to entry this way, "ownership" of individual products might then be meaningful only for code reviews and maintaining long term health of each code base.

The only other solution is to have developers on each of the products all working together to integrate early. If your sprints are two weeks long, each developer gets one week to build an initial implementation, and must be available to integrate immediately on the first day of the second week. Everyone should expect developers to refactor their initial implementations afterwards.

Sources of technical debt

All coding smells get worse over time until they are corrected.

For example, an awkward abstraction might not seem so bad when it is introduced, but as more features are added, it gets increasingly brittle and creates more bugs.

In practical software development these are the most common sources of technical debt:

YAGNI (You aren't gonna need it.)

People build things that don't end up being needed at all, or more insidiously people build things that have an overall cost which exceeds their benefit.

Adapting

Existing code is organized into two parts A and B, and a new use case comes along that needs mostly B. That is, the interface to use B is awkward. Rather than improve that interface, people have their code C start using B via an extra piece of code, a B-C adapter.

Adapters are necessary when the adapted code is owned by a different team, but in that case you'd hope that the interface is well designed in the first place, or at least the integration is part of the application's job. When all the code is owned by a single team, adapters are just debt.

Special casing

A change is desired for a single specific case. The change could apply more generally, but it isn't urgent. People add new code that runs just for the specific case, because they are afraid.

This source of technical debt is particularly tempting. And sometimes it's hard to distinguish from properly avoiding YAGNI. Just as with all refactoring, good automated tests are essential.

Wednesday, February 08, 2017

Schmoperties

Schmoperties is the best of the two Java configuration APIs.

Monday, January 16, 2017

In praise of kafka

Messaging is great because it reduces coupling.

Kafka does it even better.

A message consumer can come up and work just fine after being down all week.

Cool unhyped languages

Pony - the best of low level and high level
Zig - more pragmatic than c
Lamdu - types are friendly and easy
Crema - program without turing completeness