Monday, February 24, 2020

Automated tests and new bugs

Can unit tests find unexpected bugs?

Russ Cox says no. In his brilliant recent piece on versioning in Go, he mentions offhandedly that unit tests only "make sure bugs you fix are not reintroduced over time".

On the other hand, Hillel Wayne has a nice example using property-based testing and contracts. See if you can spot the bug yourself before following that link:
def mode(l):
  max = None
  count = {}
  for x in l:
    if x not in count:
      count[x] = 0
    count[x] += 1
    if not max or count[x] > count[max]:
      max = x
  return max

Hillel analyzes several different approaches to testing. The punch line is that you can catch the bug with the following steps:
  • annotate the mode function with its specification:
    
    @ensure("result must occur at least as frequently as any member", 
      lambda a, r: all((a.l.count(r) >= a.l.count(x) for x in a.l)))
    
    
  • also prepend the following:
    
    from hypothesis import given
    from hypothesis.strategies import lists, integers, text
    from dpcontracts import require, ensure
    
    @given(lists(text()))
    def test_mode(l):
        mode(l)
    
    
  • install the prerequisites with: pip3 install hypothesis dpcontracts pytest
  • run pytest on the file that you created with the mode function and its test code
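
For readers without Hypothesis or dpcontracts installed, the same check can be roughly approximated with the standard library alone. This is a minimal sketch, not Hillel's method: it swaps random property-based generation for exhaustive enumeration of short lists. The names `satisfies_spec` and `find_counterexample`, and the two-string alphabet, are assumptions made for illustration.

```python
from itertools import product


def mode(l):
    # The implementation from the post, copied unchanged (bug included).
    max = None
    count = {}
    for x in l:
        if x not in count:
            count[x] = 0
        count[x] += 1
        if not max or count[x] > count[max]:
            max = x
    return max


def satisfies_spec(l, result):
    # Same postcondition as the @ensure contract, written as a plain function
    # (hypothetical helper name).
    return all(l.count(result) >= l.count(x) for x in l)


def find_counterexample(alphabet=("", "a"), max_len=3):
    # Exhaustively try every list over a tiny alphabet, shortest first.
    # This stands in for Hypothesis's random generation and shrinking.
    for n in range(max_len + 1):
        for candidate in product(alphabet, repeat=n):
            l = list(candidate)
            if not satisfies_spec(l, mode(l)):
                return l
    return None


if __name__ == "__main__":
    print(find_counterexample())  # prints ['', '', 'a']
```

The alphabet deliberately includes the empty string, mirroring the `text()` strategy in the test above. An enumeration over non-empty strings only would not find a failing input here, which is one reason a generator library that tries edge-case values on its own is attractive.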
Though this particular specification looks like an alternate implementation, it isn't intended to be. Using one implementation to test another is a kind of "test oracle", but that doesn't feel like an elegant way to find bugs. The test implementation could have its own bugs, and if maintained by the same programmers, it could even have the same bugs as the regular implementation.

In contrast, a specification can be easier to read than any practical implementation. At least with current technology, there are limits on how clear pragmatic code can be, and specification-like code can be too slow to run in production. Perhaps the specification could be an oversimplified implementation and still be useful for testing.
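
As a rough illustration of that trade-off, here is a sketch of a specification-like `mode` that reads almost like the contract but does quadratic work. The names `spec_mode` and `same_frequency` are hypothetical, and nothing here comes from Hillel's article.

```python
def spec_mode(l):
    """Specification-like mode: readable, but l.count rescans the whole
    list for every element, so the total work is quadratic."""
    return max(l, key=l.count, default=None)


def same_frequency(l, a, b):
    # Compare how often two candidates occur rather than the candidates
    # themselves, so that ties between equally frequent values are
    # not reported as failures.
    return l.count(a) == l.count(b)
```

A test could then assert `same_frequency(l, mode(l), spec_mode(l))` on small generated inputs, where the quadratic cost does not matter.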

Testing based purely on the specification is not enough, however. We need some automated equivalent of "clear box testing".