Bug squash: October 2011

Saturday, October 29, 2011

Poor man's option type in C#

I've blogged before about the virtues of the Option type. However, it's currently not part of the standard .NET library, so you have to either use F#, code it yourself, or use a library.

But there is something built-in and almost equivalent: lists! You can use any list-like container (List<T>, T[]) as an option type, and all operations are already built-in.

Option has None (no value) and Some (a value). This corresponds exactly to an empty list and a singleton list respectively. Let's see how we map operations on Options (using FSharpx) to operations on arrays using standard .NET 3.5:

	Option	Array / IEnumerable
Constructor: Some	FSharpOption<int>.Some(5) or: 5.Some()	new[] { 5 }
Constructor: None	FSharpOption<int>.None	new int[0]
Check if the option has a value	FSharpOption<int> o = ...; bool hasValue = o.HasValue();	IEnumerable<int> o = ...; bool hasValue = o.Any();
Get the value associated with the option	FSharpOption<int> o = ...; int value = o.Value;	IEnumerable<int> o = ...; int value = o.First();
Pattern matching	FSharpOption<int> o = ...; int b = o.Match(x => x + 2, () => 99);	IEnumerable<int> o = ...; int b = o.Aggregate(99, (_, x) => x + 2); or lazier: int b = o.Any() ? o.First() + 2 : 99;
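
To make the array column of the table concrete, here's a minimal sketch that needs nothing but System.Linq. The class and the ParseInt, AddTwoOrDefault and Sum helpers are made up for this example; they're not part of FSharpx or the BCL.

```csharp
using System;
using System.Linq;

public static class ArrayAsOption {
    // Hypothetical helper: a singleton array plays the role of Some,
    // an empty array plays the role of None.
    public static int[] ParseInt(string s) {
        int n;
        return int.TryParse(s, out n) ? new[] { n } : new int[0];
    }

    // "Pattern matching" via Aggregate: 99 is the None case, x + 2 the Some case.
    public static int AddTwoOrDefault(int[] o) {
        return o.Aggregate(99, (_, x) => x + 2);
    }

    // Monadic composition with LINQ: the result is empty if either parse fails.
    public static int[] Sum(string a, string b) {
        return (from x in ParseInt(a)
                from y in ParseInt(b)
                select x + y).ToArray();
    }

    public static void Main() {
        Console.WriteLine(AddTwoOrDefault(ParseInt("5")));    // 7
        Console.WriteLine(AddTwoOrDefault(ParseInt("five"))); // 99
        Console.WriteLine(Sum("40", "60").First());           // 100
        Console.WriteLine(Sum("40", "x").Any());              // False
    }
}
```

Note that nothing here stops someone from passing an array with two or more elements, in which case Aggregate would quietly use the last one.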

And thanks to LINQ, you also get mapping and monadic syntax for free. Remember that code I refactored to monads a while ago? Here's the last part of it, first using Option and then translated to Array/IEnumerable:

var maxVersion = L.F((string[] parts) => {
    var p = parts.Length == 2 ? parts[1] : parts[0];
    if (string.IsNullOrWhiteSpace(p))
        return FSharpOption<FSharpOption<Version>>.Some(FSharpOption<Version>.None);
    return ParseVersion(p).Select(v => v.ToOption());
});

var singleVersion =
    from v in ParseVersion(value)
    select (IVersionSpec) new VersionSpec {IsMinInclusive = true, MinVersion = v};

var versionRange = L.F(() => from x in checkLength(value)
                             from isMin in minInclusive(value)
                             from isMax in maxInclusive(value)
                             let val = value.Substring(1, value.Length - 2)
                             let parts = val.Split(',')
                             from y in checkParts(parts)
                             from min in minVersion(parts)
                             from max in maxVersion(parts)
                             select (IVersionSpec) new VersionSpec {
                                 IsMinInclusive = isMin,
                                 MinVersion = min.HasValue() ? min.Value : null,
                                 IsMaxInclusive = isMax,
                                 MaxVersion = max.HasValue() ? max.Value : null,
                             });

return singleVersion.OrElse(versionRange)();

var maxVersion = L.F((string[] parts) => {
    var p = parts.Length == 2 ? parts[1] : parts[0];
    if (string.IsNullOrWhiteSpace(p))
        return new[] {new Version[0]};
    return ParseVersion(p).Select(v => new[] {v});
});

var singleVersion =
    from v in ParseVersion(value)
    select (IVersionSpec) new VersionSpec {IsMinInclusive = true, MinVersion = v};

var versionRange = L.F(() => from x in checkLength(value)
                             from isMin in minInclusive(value)
                             from isMax in maxInclusive(value)
                             let val = value.Substring(1, value.Length - 2)
                             let parts = val.Split(',')
                             from y in checkParts(parts)
                             from min in minVersion(parts)
                             from max in maxVersion(parts)
                             select (IVersionSpec) new VersionSpec {
                                 IsMinInclusive = isMin,
                                 MinVersion = min.Any() ? min.First() : null,
                                 IsMaxInclusive = isMax,
                                 MaxVersion = max.Any() ? max.First() : null,
                             });

return singleVersion.Any() ? singleVersion : versionRange();

If you want to compare the whole code: here's the original (using option) and here's the one using arrays.

You've probably even used this already, perhaps without realizing the relation. However, while an option type holds either zero or exactly one value, an array can hold any number of values. And if you see a method returning an IEnumerable<T>, you wouldn't think you're supposed to treat it as an option.

So IEnumerable<T> as a monad (the List monad, that is) is sort of an extension of the option type (i.e. the Maybe monad): instead of just supporting one successful computation, it supports many. I think using the List monad as an Option is acceptable locally, and only if you can't use a proper option type or for some reason don't want to take the dependency on a library. It's a useful hack, but still a hack. They're different things really.

Tuesday, October 18, 2011

10 reasons to use the F# runtime in your C# app

Most people have at least noticed that F# shipped with Visual Studio 2010. I mean, you click File -> New Project and there are the F# project templates; you can't miss them.

What most people probably didn't realize is that even if you don't use the F# language or aren't even interested in it, you can still profit from using the F# runtime in C# / VB.NET projects. The F# runtime is just a regular DLL named FSharp.Core.dll you can reference just like any other assembly. It's available for .NET 2.0 and 4.0 (separate DLLs). This availability for .NET 2.0 is particularly valuable for projects that for some reason can't be upgraded to newer versions of .NET.

Granted, the library is designed to be used from F#, so sometimes it looks weird in C#, but we'll see how to work around some of the oddities with FSharpx.

Let's see some of the things FSharp.Core gives you, in no particular order:

Tuples

So you want to use tuples but you can't upgrade to .NET 4 because of company policies or some obscure dependency that breaks. No problem, FSharp.Core.dll implements them, so you can use tuples in .NET 2.0 with exactly the same API and namespace as .NET 4 tuples. If you then upgrade you don't have to change anything.

Tuples are simple but not trivial to implement. For example, some implementations forget equality / hashing, so you'd end up with "WTF moments" at some point. It's worth using a library that implements them properly.

As usual, keep in mind that tuples are essentially anonymous. Item1, Item2, etc. don't convey any information about what they're holding, only their types.
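
To see why proper equality and hashing matter, here's a small sketch using .NET 4's System.Tuple. I'm assuming the FSharp.Core version for .NET 2.0 behaves the same way, given that it exposes the same API. The class and method names are made up for the demo.

```csharp
using System;
using System.Collections.Generic;

public static class TupleEquality {
    // Two separately created tuples with the same contents compare equal.
    public static bool SameContentsAreEqual() {
        return Tuple.Create(1, "one").Equals(Tuple.Create(1, "one"));
    }

    // Because GetHashCode agrees with Equals, a tuple works as a dictionary key.
    public static string LookupByTupleKey() {
        var cells = new Dictionary<Tuple<int, int>, string>();
        cells[Tuple.Create(2, 3)] = "found";
        string value;
        return cells.TryGetValue(Tuple.Create(2, 3), out value) ? value : "missing";
    }

    public static void Main() {
        Console.WriteLine(SameContentsAreEqual()); // True
        Console.WriteLine(LookupByTupleKey());     // found
    }
}
```

A hand-rolled tuple class that doesn't override Equals and GetHashCode would print False and "missing" here, which is exactly the kind of surprise mentioned above.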

Persistent collections

Persistent lists are one of the most frequently used data structures in functional programming. They're so prevalent that F# has special syntax for them. For example, to define an empty list in F#:

let empty = []

F# infers the list element type. In C# things are more verbose:

var empty = FSharpList<int>.Empty;

To add an element to a list you actually create a new list that has the new element as head and the other list as tail. Again, F# has special syntax:

let a = 1::empty

While in C#:

var a = new FSharpList<int>(1, empty);

or:

var a = FSharpList<int>.Cons(1, empty);

FSharpx helps with a little sugar here:

var a = empty.Cons(1);

You can also create an immutable list from any IEnumerable<T>:

var b = SeqModule.ToList(new[] { 1, 2, 3 });

Again, FSharpx adds some sugar:

var b = new[] { 1, 2, 3 }.ToFSharpList();

or:

var b = FSharpList.Create(1, 2, 3);

How do you use a FSharpList? You can access a particular element just as with a regular mutable list:

Console.WriteLine(b[2]); // prints "3"

Be aware that random access in an immutable linked list is O(n).
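
If it helps to picture what's going on, here's a minimal sketch of a persistent cons list. ConsList<T> is a hypothetical class written for this post, not how FSharpList is actually implemented, but it shows the two points above: consing creates a new node that shares the old list as its tail, and indexing has to walk the nodes one by one.

```csharp
using System;

// Hypothetical minimal cons list, for illustration only.
public sealed class ConsList<T> {
    public static readonly ConsList<T> Empty = new ConsList<T>();

    public readonly T Head;
    public readonly ConsList<T> Tail; // null only for the empty list

    private ConsList() {}
    private ConsList(T head, ConsList<T> tail) { Head = head; Tail = tail; }

    public bool IsEmpty { get { return Tail == null; } }

    // Returns a new list; the receiver is left untouched and becomes the tail.
    public ConsList<T> Cons(T head) { return new ConsList<T>(head, this); }

    // Random access walks `index` nodes, hence O(n).
    public T this[int index] {
        get {
            if (index < 0) throw new ArgumentOutOfRangeException("index");
            var node = this;
            for (var i = 0; i < index && !node.IsEmpty; i++)
                node = node.Tail;
            if (node.IsEmpty) throw new ArgumentOutOfRangeException("index");
            return node.Head;
        }
    }
}

public static class ConsListDemo {
    public static void Main() {
        var shared = ConsList<int>.Empty.Cons(3).Cons(2); // [2; 3]
        var a = shared.Cons(1);                           // [1; 2; 3]
        var b = shared.Cons(10);                          // [10; 2; 3]
        Console.WriteLine(a[2]);                                   // 3
        Console.WriteLine(b[0]);                                   // 10
        Console.WriteLine(object.ReferenceEquals(a.Tail, b.Tail)); // True
    }
}
```

Both a and b reuse the very same tail object, which is safe precisely because nobody can mutate it.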

FSharpList implements IEnumerable<T>, so you can traverse it with foreach and use all LINQ functions (Aggregate, Select, Where, etc).

Functional languages often use pattern matching and recursion to process a list. The F# wikibook has a great chapter explaining it. FSharpx implements basic pattern matching on lists for C#, so you can write this to reverse a list:

[Test]
void Reverse() {
    var a = Enumerable.Range(0, 1000).ToFSharpList();
    var r = Loop(FSharpList<int>.Empty, a);
    Console.WriteLine(r);
}

static FSharpList<T> Loop<T>(FSharpList<T> acc, FSharpList<T> l) {
    return l.Match(() => acc,
                   (head, tail) => Loop(acc.Cons(head), tail));
}

But be careful! F# compiles the equivalent code using tail call optimization, while C# doesn't have that feature, so the above code blows up with a StackOverflowException when given a sufficiently big list (unless you've compiled with optimizations and are running on a 64-bit CLR!)

When recursively processing lists, it's best to use Aggregate() if possible (it's usually called fold in functional languages). It encapsulates the recursion without blowing the stack, and it's also simpler:

var a = Enumerable.Range(0, 1000000).ToFSharpList();
var r = a.Aggregate(FSharpList<int>.Empty, (acc, i) => acc.Cons(i));

Of course, this is just demo code. If you really want to reverse a list just call ListModule.Reverse(a);

FSharp.Core also implements a persistent set and dictionary.

Imperative programmers might wonder why they should use an immutable collection when the BCL already has several perfectly good mutable collections.

One of the most cited reasons for using persistent collections (and functional programming in general) is multithreading. Indeed you can freely and safely pass persistent collections around threads, which makes multithreaded development easier. However, the same can be said about passing collections around regular functions: you can be sure that no function can ever modify a list, therefore you have one less thing to keep track of in your head and you statically eliminate a whole class of bugs. Immutability makes all kinds of programming simpler, multithreaded or not. Of course, for immutable collections to really work as immutable, the underlying element type must also be immutable.

Reactive Extensions also includes an ImmutableList class, although it's internal.

The Option type

I have blogged before about using the F# Option type in C# projects here and here. Options are pervasively used in F#, for example several functions on collections use options. The problem is that these functions take an FSharpFunc (the F# equivalent of a Func) as parameter, which makes them very inconvenient to use from C#.

FSharpx wraps these F# functions so they can be used with System.Func and System.Action. For example:

var a = FSharpList.Create(1, 2, 3);
a.TryFind(x => x > 4) // returns FSharpOption<int>
    .Match(v => Assert.Fail("shouldn't have found value {0}", v),
           () => { /* nothing found */ });

The Unit type

Many functional languages like F# have a type called "Unit", which is just like "void" in C-like languages, except it's actually usable as a proper type.

By "usable" I mean you can actually define something like a Func<Unit> (you can't have a Func<void>, it's not even syntactically correct even though there is a type System.Void). A Func<Unit> is just like an Action, except it's obviously a Func so it can be used for example in a LINQ expression (i.e. a monad).

FSharpx includes a ToFunc() extension method on Action, Action<T>, Action<T1,T2>, etc. to respectively convert them to Func<Unit>, Func<T,Unit>, Func<T1,T2,Unit> and so on.
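
Here's a rough sketch of the idea, under the assumption that a Unit type is just a type with a single value. The Unit struct and the ToFunc() extensions below are hypothetical stand-ins written for this example; they are not the actual FSharp.Core / FSharpx code.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for a Unit type: it has exactly one value.
public struct Unit {
    public static readonly Unit Value = new Unit();
}

public static class ActionExtensions {
    // Wrap an Action so it returns Unit instead of void.
    public static Func<Unit> ToFunc(this Action action) {
        return () => { action(); return Unit.Value; };
    }

    public static Func<T, Unit> ToFunc<T>(this Action<T> action) {
        return x => { action(x); return Unit.Value; };
    }
}

public static class UnitDemo {
    public static int Total;

    // Select needs a Func, so an Action<int> can't be passed to it directly;
    // after ToFunc() it can. ToList() forces the lazy query to run.
    public static int RunAll(int[] xs) {
        Total = 0;
        Action<int> add = n => Total += n;
        return xs.Select(add.ToFunc()).ToList().Count;
    }

    public static void Main() {
        Console.WriteLine(RunAll(new[] { 1, 2, 3 })); // 3
        Console.WriteLine(Total);                     // 6
    }
}
```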

You can also use it for types like FSharpOption<Unit> as I blogged about before.

Reactive Extensions also includes a Unit type.

Discriminated unions

I have blogged before about using F# discriminated unions in C# here and here, in the context of validation. They're very useful to express things like "either this or that" without having to introduce a whole class hierarchy implementing equality / hash / comparison.

Just as with other things, using them in C# is more verbose than in F#.

Let's see an example:

var a = FSharpChoice<int, string>.NewChoice1Of2(1);
if (a.IsChoice1Of2) {
    var x = ((FSharpChoice<int, string>.Choice1Of2)a).Item;
    Console.WriteLine(x + 2);
} else if (a.IsChoice2Of2) {
    var x = ((FSharpChoice<int, string>.Choice2Of2)a).Item;
    Console.WriteLine(x + ";");
}

Now that looks really ugly. And what's with the downcasting?!

FSharpx makes this more usable by implementing pattern matching (basically a visitor) so you can write instead:

var a = FSharpChoice.New1Of2<int, string>(1);
a.Match(x => Console.WriteLine(x + 2),
        x => Console.WriteLine(x + ";"));
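
In case "basically a visitor" sounds too abstract, here's a minimal sketch of how such a Match can be built. Choice<T1, T2> and its First / Second cases are hypothetical names for this example; this is not the FSharpChoice or FSharpx source.

```csharp
using System;

// Hypothetical two-case union. The private constructor means only the
// nested classes can derive from it, so the set of cases is closed.
public abstract class Choice<T1, T2> {
    private Choice() {}

    // The "visitor": one function per case, and exactly one of them is called.
    public abstract R Match<R>(Func<T1, R> first, Func<T2, R> second);

    public sealed class First : Choice<T1, T2> {
        private readonly T1 value;
        public First(T1 value) { this.value = value; }
        public override R Match<R>(Func<T1, R> first, Func<T2, R> second) {
            return first(value);
        }
    }

    public sealed class Second : Choice<T1, T2> {
        private readonly T2 value;
        public Second(T2 value) { this.value = value; }
        public override R Match<R>(Func<T1, R> first, Func<T2, R> second) {
            return second(value);
        }
    }
}

public static class ChoiceDemo {
    public static string Describe(Choice<int, string> c) {
        return c.Match(x => (x + 2).ToString(), s => s + ";");
    }

    public static void Main() {
        Console.WriteLine(Describe(new Choice<int, string>.First(1)));    // 3
        Console.WriteLine(Describe(new Choice<int, string>.Second("a"))); // a;
    }
}
```

The caller never downcasts and can't forget a case, since Match requires a function for each one.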

FSharpx also implements LINQ operators around 2-choice and integrates with Option. Here's an example:

object a = 40;
const string b = "60";
var r = from i in FSharpOption.ParseInt(b).ToFSharpChoice("Invalid value b")
        from j in FSharpChoice.Cast<int>(a).SelectSecond(_ => "Invalid value a")
        select i + j;
r.Match(i => Assert.AreEqual(100, i),
        Assert.Fail);

Just as with tuples, discriminated unions are essentially anonymous. Tuples are the generic, anonymous product types. Discriminated unions are the generic, anonymous sum types.

Reactive Extensions uses an internal Either<TLeft, TRight> type.

Async

Once again, you're stuck with .NET 3.5 drooling over the Task Parallel Library in .NET 4.

Reactive Extensions used to include a backport of System.Threading.dll, but it was unsupported and it's not included in recent releases any more.

F# has asynchronous workflows, which are similar to, yet somewhat different from, C# 5 await/async (see the differences in this series of posts by Tomas Petricek).

FSharpx has LINQ bindings for this so you can write:

static FSharpAsync<string> Get(string u) {
    var web = new WebClient();
    return web.AsyncDownloadString(new Uri(u));
}
var qq = // qq is of type FSharpAsync<string>
    from google in Get("http://www.google.com")
    from bing in Get("http://www.bing.com")
    select google + bing;

string result = qq.Run();

Or you can run multiple requests in parallel:

var urls = FSharpList.Create(
      "http://www.google.com"
    , "http://www.bing.com"
    , "http://www.yahoo.com"
    , "http://www.microsoft.com"
    );
var result = FSharpAsync.Parallel(urls.Select(Get)).Select(s => string.Join("", s)).Run();

It may not be as powerful as F# async workflows, but it's still useful.

BigInteger

Another one for .NET 2.0 / 3.5 users. FSharp.Core includes System.Numerics.BigInteger for arbitrary-precision arithmetic. It doesn't have all of .NET 4 BigInteger's methods, but it implements the basic operations. Want to calculate 23^25 + 4? No problem:

var a = new BigInteger(23);
var b = BigInteger.Pow(a, 25);
b += new BigInteger(4);
Console.WriteLine(b);

Result: 11045767571919545466173812409689947

Lazy

The Lazy<T> type is yet another feature that .NET 4 copied from F#, or so it seems. Are you still writing singletons the "old" way? With Lazy you can just do this in .NET 3.5 (using FSharpx-added sugar):

class MySingleton {
    private MySingleton() {}

    private static readonly Lazy<MySingleton> instance = 
        FSharpLazy.Create(() => new MySingleton());

    public static MySingleton Instance {
        get { return instance.Value; }
    }
}

Although to be honest, I don't think I've ever used this.

Enumerable cache

Sometimes you have a forward-only iterator wrapped in an IEnumerable, like database results or some data parsed lazily from a web request, and you want to traverse it more than once, but you also want to keep it lazy, so ToList() doesn't cut it. With FSharp.Core you can cache it on demand using Seq.cache, named SeqModule.Cache in C# / VB.NET.

System.Interactive also has a function like this, it's called MemoizeAll, although I like the F# name better as it seems to be more an application of caching than memoization.
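
To illustrate what "cache it on demand" means, here's a minimal sketch of the technique. CachedEnumerable<T> is a hypothetical class written for this post; it is not how Seq.cache is implemented, it isn't thread-safe, and it never disposes the source enumerator, which a real implementation would need to handle.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper: pulls from the source only when a consumer asks for
// an element that hasn't been seen yet, and remembers everything it pulled.
public sealed class CachedEnumerable<T> : IEnumerable<T> {
    private readonly IEnumerator<T> source;
    private readonly List<T> cache = new List<T>();
    private bool exhausted;

    public CachedEnumerable(IEnumerable<T> source) {
        this.source = source.GetEnumerator();
    }

    public IEnumerator<T> GetEnumerator() {
        for (var i = 0; ; i++) {
            if (i == cache.Count) {
                // Nothing cached at this position yet: advance the source once.
                if (exhausted || !source.MoveNext()) {
                    exhausted = true;
                    yield break;
                }
                cache.Add(source.Current);
            }
            yield return cache[i];
        }
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class CacheDemo {
    public static int Pulled; // counts how many elements the source produced

    public static IEnumerable<int> Source() {
        for (var i = 1; i <= 3; i++) {
            Pulled++;
            yield return i;
        }
    }

    public static void Main() {
        var cached = new CachedEnumerable<int>(Source());
        Console.WriteLine(cached.Take(2).Count()); // 2
        Console.WriteLine(cached.Sum());           // 6
        Console.WriteLine(cached.Sum());           // 6
        Console.WriteLine(Pulled);                 // 3
    }
}
```

The source is traversed only once and only as far as needed, even though the wrapper is enumerated three times.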

Enumerable zip

Another nifty operator that is otherwise only available in .NET 4+. The one in FSharp.Core is slightly different. Enumerable.Zip includes a mapper; its signature is:

IEnumerable<TResult> Zip<TFirst, TSecond, TResult>(
    this IEnumerable<TFirst> first,
    IEnumerable<TSecond> second,
    Func<TFirst, TSecond, TResult> resultSelector)

while the one in F# (also in the static SeqModule class) zips directly to a tuple:

IEnumerable<Tuple<T1, T2>> Zip<T1, T2>(IEnumerable<T1> first, IEnumerable<T2> second)
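
Here's a small sketch of the difference between the two shapes, using only .NET 4's Enumerable.Zip and System.Tuple. ZipToTuples is a hypothetical helper that mimics the tuple-returning signature above; it's not the FSharp.Core implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ZipDemo {
    // Hypothetical helper with the tuple-returning shape, built on Enumerable.Zip.
    public static IEnumerable<Tuple<T1, T2>> ZipToTuples<T1, T2>(
        IEnumerable<T1> first, IEnumerable<T2> second) {
        return first.Zip(second, (x, y) => Tuple.Create(x, y));
    }

    public static void Main() {
        var numbers = new[] { 1, 2, 3 };
        var names = new[] { "one", "two" };

        // With a mapper: you decide what each pair becomes.
        var joined = numbers.Zip(names, (n, s) => n + "=" + s).ToList();
        Console.WriteLine(string.Join(", ", joined)); // 1=one, 2=two

        // Without a mapper: you get the pairs themselves.
        var pairs = ZipToTuples(numbers, names).ToList();
        Console.WriteLine(pairs.Count);    // 2
        Console.WriteLine(pairs[1].Item2); // two
    }
}
```

Enumerable.Zip stops at the end of the shorter sequence, which is why only two pairs come out.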

Conclusion

If you're working with Visual Studio 2010, the F# runtime is a great library you can take advantage of, even in .NET 2.0 projects. And you already have it, so use it!

If you run .NET 3.5 or better, FSharpx makes it more C# friendly. It also makes it easier to interop with F# projects if you ever need it, since they use the same underlying types.

Even in .NET 4, persistent collections, discriminated unions and Option alone are easily worth the dependency.

Also worth mentioning is the F# PowerPack, a separate library implementing additional collections like HashMultiMap and LazyList and math-specific facilities such as rational and complex numbers, matrix, vectors.

And it's all open source, Apache-licensed.

PS: did you know the VB.NET runtime has a CSV parser?

Friday, October 7, 2011

Introducing FSharpx

A couple of months ago I started writing FSharp.Core.CS, a project to bridge F# core constructs to C#, such as the Option type. I think I never mentioned it explicitly but I did blog about it, for example back when I wrote about validating with applicative functors with LINQ.

I realized I was going to implement several monads in F#, and there was already an excellent project doing that: Ryan Riley's FSharp.Monad. It only lacked C# compatibility, which is exactly what I was doing in FSharp.Core.CS, among other things. FSharp.Monad was also already doing some things other than monads, so it seemed like the perfect moment to merge both projects, and so we created FSharpx.

FSharpx aims to create a wider and richer foundation for programming in .NET, building on top of the F# core library and the PowerPack. It targets mainly F# but strives to be usable from all .NET languages wherever possible.

It's similar in spirit to Scalaz, even though Scala can do things like typeclasses which F#/C# cannot.

Here's a brief summary of what FSharpx.Core currently implements:

Monads: Maybe (also available in LINQ), State, Reader, Writer, Either (also in LINQ), Continuation, Distribution, Undo
Monoids: int sum, int product, option wrapper
ZipList
Nullable operators
Iteratee
Validation as applicative functor (also in LINQ), parametrizable by monoid
Collections extensions
Option extensions
Helpers to make functional programming in C# easier
LINQ bindings to use F# async (similar to this)
Helpers to make C# - F# interop easier
General basic functions from the Haskell prelude, such as flip, curry, tuple constructors...

Tomas Petricek's async extensions were also merged, under the name FSharpx.AsyncExtensions, which implements:

Async sequences
Reusable agents
Observable extensions

Finally, Steffen Forkmann has started a branch for F# 3.0 type providers, which includes AppSettings, File system, Regex, CSV. These are implemented using a DSL on top of the base classes provided by Microsoft.

It's still early days for FSharpx: we're frequently breaking things, and there's almost no documentation. Still, I'm already using it in some of my projects (Figment, FsFormlets, CsFormlets) and in real-world production C#/VB.NET code.

I'm very excited about FSharpx as it is truly community-driven with many people involved. Contributors so far include Ryan Riley, Steffen Forkmann, Tomas Petricek, Daniel Mohl, Gustavo Guerra, Hodza Nassredin and yours truly.

So come check it out! If you have any questions, drop by our mailing list.
