Tuesday, January 11, 2011

Monads in C#-3. Creating our first Monad

This is the third part of my Monad series. See the introduction for a list of all the posts.

You can find the code for this post on GitHub here. This post is an almost direct rip-off of Wes Dyer’s excellent post on Monads from 2008.

In the last post I showed how we can do lots of useful things with ‘amplified’ types, like IEnumerable<T>. If we can find a way of composing functions of amplified types we’ll be able to simply write some very powerful programs. We talk a lot about ‘function composition’ when discussing Monads. It can sound very dry and academic, so let’s be clear, function composition is simply programming. When I write some lines of code like this:

var customer = customerData.GetById(customerId);
var order = customer.CreateOrder();

I am doing function composition. In this case composing the ‘GetById’ function with the ‘CreateOrder’ function. Implicitly or explicitly, all programming is function composition.

As programmers we know how to compose functions that consume and return unamplified types, it’s second nature.

Here are two simple functions. They both have the same function type, they both take a single int value argument and return an int. I’m going to use function delegates and lambda expressions rather than writing out static methods, but it’s exactly the same principle:

Func<int, int> add2 = x => x + 2;
Func<int, int> mult2 = x => x*2;

It’s simple to compose these functions:

Func<int, int> add2Mult2 = x => mult2(add2(x));

We can use our new function:

var r2 = add2Mult2(5);
Console.Out.WriteLine("r2 = {0}", r2);

Which prints:
 
r2 = 14

We do this kind of thing all the time. But what if we want to compose functions that return amplified types? First we’ll need an amplified type to compose. For this example, we choose the simplest one possible, it simply wraps a value without adding anything, that will come later. Let’s call it Identity:

public class Identity<T>
{
public T Value { get; private set; }

public Identity(T value)
{
Value = value;
}
}

Now let’s create two functions that return amplified types:

Func<int, Identity<int>> add2 = x => new Identity<int>(x + 2);
Func<int, Identity<int>> mult2 = x => new Identity<int>(x * 2);

If we try to compose these in the same way we composed our simple Func<int, int>s above it won’t compile:

// we can't compose them directly, the types don't match. This won't compile:
Func<int, Identity<int>> add2Mult2 = x => mult2(add2(x));

Because add2 returns an Identity<int> and mult2 expects an int, the types don’t match and we get a compilation error. We could of course, simply access the ‘Value’ parameter like this:

Func<int, Identity<int>> add2Mult2 = x => mult2(add2(x).Value);

This works fine. The problem is that we need to write x.Value wherever we’re composing any function of Identity<T>. Don’t forget Identity<T> is the simplest possible amplified type we could think of, in fact it’s completely useless. As soon as we want to use anything useful like IEnumerable<T> or Task<T> or IMightThrowAnException<T>, we’re back to obfuscating our code with huge amounts of boiler plate. Getting rid of that boiler plate is the reason we are here in the first place. We need to lose x.Value, but how?

What we need is a function that lets us compose add2 and mult2. Let’s call this function ‘Bind’. Looking at the example above that won’t compile, we can see what the signature of Bind should be. It needs to take the result of add2, an Identity<int> and mult2 a Func<int, Identity<int>> and return an Identity<int>. To make bind more useful, we’ll give it generic parameters. It would also be nice to write it as an extension method:

// a function 'Bind', that allows us to compose Identity returning functions
public static Identity<B> Bind<A, B>(this Identity<A> a, Func<A, Identity<B>> func)
{
return func(a.Value);
}

Of course the body of Bind simply calls a.Value and passes the result to func. We’ve wrapped x.Value with Bind. The beauty of it is that while the mechanism of dealing with an amplified type depends on the nature of the amplified type, the signature of Bind is always the same.

Now we can compose add2 and mult2:

Func<int, Identity<int>> add2Mult2 = x => add2(x).Bind(mult2);

We can use our new composed function like this:

var r1 = add2Mult2(5);
Console.Out.WriteLine("r1.Value = {0}", r1.Value);

Which prints:

r1.Value = 14

Let’s create another useful extension method, ToIdentity, to create a new identity:

public static Identity<T> ToIdentity<T>(this T value)
{
return new Identity<T>(value);
}

Now we can write arbitrarily complex expressions with various identity values:

var result = 
"Hello World!".ToIdentity().Bind( a =>
7.ToIdentity().Bind( b =>
(new DateTime(2010, 1, 11)).ToIdentity().Bind( c =>
(a + ", " + b.ToString() + ", " + c.ToShortDateString())
.ToIdentity())));

Console.WriteLine(result.Value);

Which outputs:

Hello World!, 7, 11/01/2010

Here we’re taking a string, ‘Hello World!’, turning it into an Identity<string>. turning an int, 7, into an Identity<int>, and a DateTime into an Identity<DateTime>. We are then using Bind to magically access the wrapped values, concatenate them and turn them back into an Identity<string>.

OK, that code isn’t the prettiest thing you’ve ever seen I’ll admit. But look at it again. We’ve taken various different types of amplified values, and done some operation on them without ever having to access the ‘Value’ parameter of Identity<T> (except of course when we output the result to the console). This is the key. It’s hard to see the benefit here when the lambda soup is far hairier than simply accessing x.Value, but remember these two facts: First we can remove the lambda soup, that will come later in the series, and second, Identity is the simplest possible example of an amplified type. There is no limit to the complexity we can implement.

Identity<T> is a Monad. The properties of a Monad are that it has a type constructor (Identity<T> itself) and two methods: our ‘Bind’ and ‘ToIdentity’. That’s it, that’s all there is to it.

You will find Bind and ToIdentity called various names depending on the programming language you are using, ToIdentity is called ‘return’ in Haskell and Bind goes by the name of SelectMany in C# as we will see later. It doesn’t really matter what they are called, the important thing is that they have the right signatures:

Bind: Func<Identity<A>, Func<A, Identity<B>>, Identity<B>>
ToIdentity: Func<T, Identity<T>>

Bind is where we express the functionality of our Monad. The magic of Monads is that it’s the only place where we have to express that functionality. It turns out to be a very powerful abstraction.

In the next post we will see how renaming Bind to ‘SelectMany’, and combining it with ToIdentity will allow us to use our Identity<T> Monad with Linq syntax. Stay tuned lambda lovers ;)

15 comments:

Anonymous said...

Hi Mike

Thanks for this - part2 was weak but this is very good.

Only a couple of nitpicks:

var r2 = add2Mult2(a); // Seems to be missing a = 5?

loose != lose - if you 'loose' x.Value you are freeing it to run amok in your code, whereas if you 'lose' it then it's gone.

Regards

Adam said...

I am, as ever, boggled. Good work.

Loose. Lose. Potato. Tomato. Queens' College. Blah. :)

Mike Hadlow said...

Thanks Anonymous, Adam, glad you liked it.

And many thanks for the corrections Anonymous, I honestly do appreciate you taking the time to let me know about them. I've updated the post accordingly.

Vijay said...

In this case composing the ‘GetById’ function with the ‘CreateOwner’ function

I guess you meant 'CreateOrder' function

Mike Hadlow said...

Hi Vijay, yes I did :p Thanks for pointing that out.

Vijay said...

:) welcome.

Great articles, appreciate the effort you have put in for explaining the concept.

All the best for the talk!

Anonymous said...

An alternative to Bind is the use of a conversion operator such as:

public static implicit operator T(Identity input)
{
return input.Value;
}

Mike Hadlow said...

Hi Anonymous, although it might seem that you can use an implicit conversion operator in this case, it's not a general alternative to Bind. Have a look at the other posts in this series.

Unknown said...

Thanks for pointing out that there is a higher construct at work here (bigger than a conversion operator). I am defintely reading on. I will be honest in that I don't quite get the full concept but I can tell that there is something to be discovered that is worth the investment. Just ordered a book on Haskell for the fun of it. thanks. (prior anonymous)

Unknown said...

One more thing, I would recommend the video by Brian Beckman on monads. It is very good complimentary material.

http://channel9vip.orcsweb.com/shows/Going+Deep/Brian-Beckman-Dont-fear-the-Monads/

Unknown said...

lol, now i want to learn haskell. Great post - flashbacks of a college course 15 yrs ago where we learned to bootstrap scheme, a dialect of lisp, from a few basic concepts. Thanks for posting this, much appreciated...

Damien said...

Best Monad description I've read :-)
Thanks!

Anonymous said...

Finally I think I get Monads! Thanks!

Anonymous said...

Thanks for the tutorial, I'm learning a lot. Why isn't bind written like this? (Pretend the ? are angle brackets, the comment thing mistook them for html tags)

public static Identity?B? Bind2?A, B?(this Identity?A? a, Func?A, B? func)
{
return (func(a.Value)).ToIdentity?B?();
}


so that each of the functions you compose with it don't need to know they are being used by a particular monad? Then you can write like this:

var bar = new Identity?string?("hey");
bar.Bind2(x => x.ToUpper()).Bind2(x => x.Trim());

Full disclosure I haven't finished reading it all yet so don't rip into me too much : ).

Mike Hadlow said...

Your function is called Map. Or Select(..) in Linq. Equally useful, but for different use cases. it doesn't serve to remove the M's boiler plate which is the point of a Monad.