Friday, May 11, 2012

An F# DSL for MbUnit

In F# we typically organize tests much like in C# or VB.NET: writing functions marked with a [<Test>] attribute or similar. Actually, there's a slight advantage in F#: you don't need to write a class marked as a test fixture; you can write the tests directly as let-bound functions. Still, it's fundamentally the same model. (If you're into BDD, there's also TickSpec as an alternative model.)

Since it's the same model, you get the same issues I described in my last post, and then some: for example, as Kurt explains, attributes in F# sometimes aren't treated exactly as they are in C#.

Also in my last post, I wrote about how MbUnit supports first-class tests as an alternative to attribute-defined tests. In F# we can take advantage of this and custom operators to build a very concise DSL to define tests.

First let's see a small test suite with setup/teardown, written with the classic attributes:

[<TestFixture>]
type ``MemoryStream tests``() =
    let mutable ms : MemoryStream = null

    [<SetUp>]
    member x.Setup() =
        ms <- new MemoryStream()

    [<TearDown>]
    member x.Teardown() =
        ms.Dispose()

    [<Test>]
    member x.``Can read``() =
        Assert.IsTrue ms.CanRead

    [<Test>]
    member x.``Can write``() =
        Assert.IsTrue ms.CanWrite

Looks simple enough, right? And yet, the mutable field is a smell, or at least an indicator that this isn't functional. Let's try to get rid of that mutable.

As a first step we'll rewrite this as first-class tests, that is, using [<StaticTestFactory>] as shown in my last post:

[<StaticTestFactory>]
let testFactory() =
    let suite = TestSuite("MemoryStream tests")
    let ms : MemoryStream ref = ref null
    suite.SetUp <- fun () -> ms := new MemoryStream()
    suite.TearDown <- fun () -> (!ms).Dispose()
    let tests = [
        TestCase("Can read", 
            fun () -> Assert.IsTrue (!ms).CanRead)
        TestCase("Can write", 
            fun () -> Assert.IsTrue (!ms).CanWrite)
    ]
    Seq.iter suite.Children.Add tests
    [suite]

Oh great, that's even uglier than what we started with! And we have replaced the mutable field with a ref cell, not much of an improvement.
But bear with me: now that we have first-class tests, there's a lot of room for improvement.

In order to keep refactoring this, we need to realize that the problem is that our test cases should be functions MemoryStream -> unit instead of unit -> unit. That way they wouldn't depend on an external MemoryStream instance; instead, the instance would somehow be pushed to the test. Let's write that:

let tests = [
    "Can read", (fun (ms: MemoryStream) -> Assert.IsTrue ms.CanRead)
    "Can write", (fun ms -> Assert.IsTrue ms.CanWrite)
]

Now we have a list of name and MemoryStream -> unit pairs. What we need is a way to turn these functions into unit -> unit so we can ultimately build TestCases.

In other words, we need a function (MemoryStream -> unit) -> (unit -> unit). This function should create the MemoryStream, pass it to our test function, then dispose of the MemoryStream. Hey, what do you know: turns out that's just what SetUp and TearDown do!

Still with me? It's much easier to see this in code:

let withMemoryStream f () = 
    use ms = new MemoryStream()
    f ms

Now we apply this to our list, building the TestCases and then the TestSuite:

[<StaticTestFactory>]
let testFactory() =
    let suite = TestSuite("MemoryStream tests")
    tests 
    |> Seq.map (fun (n,t) -> TestCase(n, Gallio.Common.Action(withMemoryStream t)))
    |> Seq.iter suite.Children.Add
    [suite]

We've eliminated all mutable references, and also replaced SetUp/TearDown with a simple higher-order function.

But we can still do better in terms of readability. We can define a few custom operators to hide the TestSuite and TestCase constructors:

let inline (=>>) name tests =
    let suite = TestSuite(name)
    Seq.iter suite.Children.Add tests
    suite :> Test

let inline (=>) name (test: unit -> unit) =
    TestCase(name, Gallio.Common.Action test) :> Test

[<StaticTestFactory>]
let testFactory() =
    [
        "MemoryStream tests" =>> [
            "Can read" => 
                withMemoryStream(fun ms -> Assert.IsTrue ms.CanRead)
            "Can write" => 
                withMemoryStream(fun ms -> Assert.IsTrue ms.CanWrite)
        ]
    ]

And with a couple more operators we get rid of the duplicated calls to withMemoryStream:

let inline (+>) f =
    Seq.map (fun (name, partialTest) -> name => f partialTest)

let inline (==>) (name: string) test = name,test

[<StaticTestFactory>]
let testFactory() =
    [
        "MemoryStream tests" =>>
            withMemoryStream +> [
                "Can read" ==> 
                    fun ms -> Assert.IsTrue ms.CanRead
                "Can write" ==>
                    fun ms -> Assert.IsTrue ms.CanWrite
            ]
    ]

Conclusions

Confused by all those kinds of arrows? The good thing about first-class tests is that you can build them any way you want; there's no need to use these operators if you don't like them. That's also precisely one of their downsides: since there is no fixed idiom, they can be harder to read than attribute-based test definitions, where there is a single, well-defined way to do things.

In my last post I showed how first-class tests practically eliminate the concept of parameterized tests. In this post I showed how they eliminate the concept of setup/teardown, replacing them with a higher-order function, a more generic concept.

More generally, I'd say that whatever domain you're modeling (in this case, tests), there is much to gain if the core concepts are representable as first-class values. It should also be noted that different languages have very different notions of which language objects are first-class values. Some are more flexible than others, but that doesn't imply any superiority by itself. However, it does mean that if you're not aware of this, you'll probably misuse your language and end up with ever more complex workarounds to manipulate your domain objects as values. Nice APIs, conventions, configuration, etc., are all secondary and can be built much more easily on top of composable, first-class building blocks.

But I digress. In the next post I'll show a simple testing library built around tests as first-class values and more pros/cons about this approach.

Tuesday, May 8, 2012

First-class tests in MbUnit

Originally, xUnit style testing frameworks used inheritance to define tests. SUnit, the original xUnit framework, builds test cases by inheriting the TestCase class. NUnit 1.0 and JUnit derived from SUnit and also used inheritance. Fast-forward to today, unit testing frameworks in .NET and Java typically organize tests using attributes/annotations instead.

For a few years now, MbUnit has been able to define tests programmatically as an alternative, though it seems this feature isn't used much. Let's compare attributes vs programmatic tests with a simple example in C#:

Attributes

[TestFixture]
public class TestFixture {
    [Test]
    public void Test() {
        Assert.AreEqual(4, 2 + 2);
    }

    [Test]
    public void AnotherTest() {
        Assert.AreEqual(8, 4 + 4);
    }
}

Programmatic

public class TestFixture {
    [StaticTestFactory]
    public static IEnumerable<Test> Tests() {
        yield return new TestCase("Test", () => {
            Assert.AreEqual(4, 2 + 2);
        });

        yield return new TestCase("Another test", () => {
            Assert.AreEqual(8, 4 + 4);
        });
    }
}

At first blush, declaring tests programmatically is more verbose and complex. However, the real difference is that these tests are first-class values. It becomes clearer why this matters with an example of parameterized tests:

Attributes

[TestFixture]
public class TestFixture {
    [Test]
    [Factory("Parameters")]
    public void Parse(string input, DateTime expectedOutput) {
        var r = DateTime.ParseExact(input, "yyyy-MM-dd'T'HH:mm:ss.FFF'Z'", CultureInfo.InvariantCulture);
        Assert.AreEqual(expectedOutput, r);
    }

    IEnumerable<object[]> Parameters() {
        yield return new object[] { "1-01-01T00:00:00Z", new DateTime(1, 1, 1) };
        yield return new object[] { "2004-11-02T04:05:20Z", new DateTime(2004, 11, 2, 4, 5, 20) };
    }
}

Programmatic

public class TestFixture {
    [StaticTestFactory]
    public static IEnumerable<Test> Tests() {
        var parameters = new[] {
            new { input = "1-01-01T00:00:00Z", expectedOutput = new DateTime(1, 1, 1) },
            new { input = "2004-11-02T04:05:20Z", expectedOutput = new DateTime(2004, 11, 2, 4, 5, 20) },
        };

        return parameters.Select(p => new TestCase("Parse " + p.input, () => {
            var r = DateTime.ParseExact(p.input, "yyyy-MM-dd'T'HH:mm:ss.FFF'Z'", CultureInfo.InvariantCulture);
            Assert.AreEqual(p.expectedOutput, r);
        }));
    }
}

Programmatically, we just wrote the parameters and tests in a direct style. With attributes, not only did we lose the types, but it's also more complicated: you have to know (or look up in the documentation) that you need a [Factory] attribute, that its string parameter indicates the name of the method that contains the test parameters, and the format for those parameters (e.g. can they be represented as a property? As a field? Can it be private? Static? Can it be a non-generic IEnumerable? An ArrayList[]?). Fortunately, MbUnit is quite flexible about it. Yet it doesn't handle an ArrayList[].

Something similar happens with JUnit and TestNG. Actually, JUnit did have something close to first-class tests in its original inheritance-based API.

With programmatic tests, you simply return a list of tests; there's no magic about it. It doesn't matter how they're built, whether they're parameterized or not: all you have to know is [StaticTestFactory] public static IEnumerable<Test> Tests(). If they're parameterized, it doesn't matter what kind of parameters they take. Actually, the very concept of "parameterized tests" simply disappears.

With attributes, you may have tried to use [Row] first, only to have the compiler remind you that attribute parameter types are very limited: you can't have a DateTime. Or a function. Or even a decimal. The testing framework gets in the way. Attributes are just not the right tool to model this.

With programmatic tests, you are in control, not the testing framework. It becomes more of a library rather than a framework. Things are conceptually simpler.

What about SetUp and TearDown? Don't worry: MbUnit supports them directly as properties of TestSuite. However, as we'll see in the next post, they're not really necessary. We'll also see a few more pros and cons of first-class tests.

Wednesday, May 2, 2012

Moroco: a minimal mocking library for C# / VB.NET

The more I learn about functional programming, the more I come to question many widely used and accepted practices in mainstream programming.

This time it's the turn of mocks and mocking libraries.

First, since there are so many different definitions for stubs, mocks, fakes, etc., here's my own definition of a mock: an entity (in an object-oriented language, usually an object) used to test an interaction between the entity under test and an external entity (again, in OO languages, these entities are objects).

So mocks are used for interaction-based testing which means testing for side-effects. The original paper on mock objects says this explicitly: "Test code should communicate its intent as simply and clearly as possible. This can be difficult if a test has to set up domain state or the domain code causes side effects."

Side effects are a code smell; or, more precisely, they should be few and isolated. Even the creators of mock objects say that side effects make testing difficult! We should minimize the need for mocking: code without side effects doesn't need mocks. You may still need stubs or fakes, but those should be trivial to build.

Quoting Daniel Cazzulino (author of Moq): "The sole presence of a 'Verify' method on the mock is a smell to me, one that will slowly get you into testing the interactions as opposed to testing the observable state changes caused by a particular behavior." Make those states immutable (i.e. a state change creates a new state) and you're halfway to side-effect-free code.
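To illustrate the state-based alternative, here's a toy F# sketch (mine, not code from any library mentioned here): every "change" returns a new immutable value, so the test just compares inputs and outputs, and there's no interaction to verify.

```fsharp
type Account = { Balance : decimal }

// A "state change" builds a new record instead of mutating the old one.
let deposit amount account =
    { account with Balance = account.Balance + amount }

// The test asserts on the resulting state; no mock or Verify needed.
let before = { Balance = 100m }
let after = deposit 50m before
assert (after.Balance = 150m)
assert (before.Balance = 100m)   // the original state is untouched
```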

Remember that discussion a few years ago where people said that Typemock was too powerful? The argument was that Typemock "doesn't force you to write testable code". What this really means is "it doesn't make you isolate side effects". I'm starting to think that all current mocking libraries are too powerful: instead of making you think about how to write pure (side-effect-free) code, they encourage you to simply mock out your impure code in tests.

There's also the matter of library complexity. .NET mocking libraries are big and complex: Moq is around 17,000 lines of code, NSubstitute 12,000, Rhino Mocks 87,000, FakeItEasy 17,000. And that's not counting the embedded runtime proxy library (usually Castle DynamicProxy).

Many of them have issues running mocks in parallel (1, 2, 3, 4). It's 2012, and most developers have at least a 4-core workstation; using a mocking library that limits the ability to run tests in parallel is getting ridiculous.

In summary, I think mocking libraries support an undesirable practice, aren't worth their code, and should go, except for very specific scenarios. But I still have a lot of existing side-effecting code to test, and I can't just wish it away; refactoring to pure code is not trivial. So I decided to write a minimal mocking library of my own.

By the way, I'm not the first or only one who thinks mocking libraries aren't worth it. Uncle Bob also prefers manual mocking (even in Java, which is much more verbose than any .NET language), though he mostly stresses the argument of simplicity.

In F#, manual mocking is easy thanks to object expressions (1, 2), so you don't have to create a new class for each mock. In C#/VB.NET we're not so lucky, but we can get 80% of the way there with a little boilerplate, making a "semi-manual", reusable mock class with settable Funcs to define behavior. Example:

interface ISomething {
    void DoSomething(int a);
}

class MockSomething: ISomething {
    public Action<int> doSomething;

    public void DoSomething(int a) {
        doSomething(a);
    }
}

class Test {
    void test() {
        var r = new List<int>();
        var s = new MockSomething {
            doSomething = r.Add
        };
        // etc
    }
}
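For comparison, here's a sketch of how the same mock could look in F# using an object expression: the interface is implemented inline at the use site, so no MockSomething class is needed. (The F# declaration of ISomething below mirrors the C# interface above.)

```fsharp
open System.Collections.Generic

// Same ISomething as in the C# example, declared in F#:
type ISomething =
    abstract DoSomething : int -> unit

let test () =
    let r = List<int>()
    // The object expression implements ISomething inline,
    // recording every argument it receives:
    let s =
        { new ISomething with
            member x.DoSomething a = r.Add a }
    s.DoSomething 42
    r   // now contains [42]
```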

This isn't anything new; people have been doing this for years. Downsides: it doesn't play well with overloaded or generic methods, though it still works.

Also, as Uncle Bob explains, manual mocks are prone to break whenever you change the mocked interface; but if you hit this often, it could be a sign that you should have used an abstract base class instead.

I added some code to track the call count of a Func and named the resulting library Moroco. So I wanted to get away from mocking libraries and ended up writing one; talk about hypocrisy! The difference between Moroco and other mocking libraries is that it's truly minimal: fewer than 400 lines of pretty trivial code, fitting in a single file, with no dependencies. And I'm still against mocks: with Moroco I get to count the mocks I use to measure the mock smell. And run tests in parallel.

I'm already using it in SolrNet, where I simply dropped Moroco's source code in the test project and replaced Rhino.Mocks in all tests. Here's the 'before vs after' of one of the tests:

Rhino.Mocks

[Test]
public void Extract() {
    var mocks = new MockRepository();
    var connection = mocks.StrictMock<ISolrConnection>();
    var extractResponseParser = mocks.StrictMock<ISolrExtractResponseParser>();
    var docSerializer = new SolrDocumentSerializer<TestDocumentWithoutUniqueKey>(new AttributesMappingManager(), new DefaultFieldSerializer());
    var parameters = new ExtractParameters(null, "1", "test.doc");
    With.Mocks(mocks)
        .Expecting(() => {
            Expect.On(connection)
                .Call(connection.PostStream("/update/extract", null, parameters.Content, new List<KeyValuePair<string, string>> {
                    new KeyValuePair<string, string>("literal.id", parameters.Id),
                    new KeyValuePair<string, string>("resource.name", parameters.ResourceName),
                }))
                .Repeat.Once()
                .Return(EmbeddedResource.GetEmbeddedString(GetType(), "Resources.responseWithExtractContent.xml"));
            Expect.On(extractResponseParser)
                .Call(extractResponseParser.Parse(null))
                .IgnoreArguments()
                .Return(new ExtractResponse(null));
        })
        .Verify(() => {
            var ops = new SolrBasicServer<TestDocumentWithoutUniqueKey>(connection, null, docSerializer, null, null, null, null, extractResponseParser);
            ops.Extract(parameters);
        });
}


Moroco

[Test]
public void Extract() {
    var parameters = new ExtractParameters(null, "1", "test.doc");
    var connection = new MSolrConnection();
    connection.postStream += (url, contentType, content, param) => {
        Assert.AreEqual("/update/extract", url);
        Assert.AreEqual(parameters.Content, content);
        var expectedParams = new[] {
            KV.Create("literal.id", parameters.Id),
            KV.Create("resource.name", parameters.ResourceName),
        };
        Assert.AreElementsEqualIgnoringOrder(expectedParams, param);
        return EmbeddedResource.GetEmbeddedString(GetType(), "Resources.responseWithExtractContent.xml");
    };
    var docSerializer = new SolrDocumentSerializer<TestDocumentWithoutUniqueKey>(new AttributesMappingManager(), new DefaultFieldSerializer());
    var extractResponseParser = new MSolrExtractResponseParser {
        parse = _ => new ExtractResponse(null)
    };
    var ops = new SolrBasicServer<TestDocumentWithoutUniqueKey>(connection, null, docSerializer, null, null, null, null, extractResponseParser);
    ops.Extract(parameters);
    Assert.AreEqual(1, connection.postStream.Calls);
}

Yes, I do realize this uses an old Rhino.Mocks API, but it's necessary to get thread-safety.

Conclusion

No, I don't expect anyone to use Moroco instead of Moq, Rhino.Mocks, etc. Yes, I know this means more code (though it's not as much as you might think), and I agree that .NET needs another mock library like I need a hole in my head. But I think we should think twice before using a mock and see if we can find a side-effect-free alternative for some piece of code.
Even when you do use mocks, don't just blindly reach for a mocking library. Consider the trade-offs.