-
-
Notifications
You must be signed in to change notification settings - Fork 424
Does C# Dream Of Electric Monads?
This is a discussion about what 'monad' means in the C# world. This makes some assumptions about the reader in that this isn't Yet Another Monad Tutorial, it's coming from the point-of-view that you've used Option
, Seq
, Either
, etc. and you used them with LINQ, and you kind of get that this is all to do with monads and the Bind
function (along with Select
and SelectMany
), but you're not quite sure how real this is in C# world. Is using LINQ for this purpose a hack?
So, to start this off, it's worth looking at Haskell, which is a language built around monads. They're used to partition side-effects, as well as simplify common tasks like control flow, carrying state contexts through the stack, etc.
If you look at the definition of the monad type-class, you'll see something like this:
class Monad m where
(>>=) :: m a -> (a -> m b) -> m b
return :: a -> m a
fail :: String -> m a
I've stripped out a lot of the unnecessary stuff to leave us with the core definition.
(>>=)
is the bind operator; we'll call that the bind
function from now on.
If you've never spent any time with Haskell, then this might not make a lot of sense. An imaginary C# version might look like this:
public interface Monad<M>
{
static M<B> Bind<A, B>(M<A> ma, Func<A, M<B>> f);
static M<A> Return<A>(A value);
static M<A> Fail<A>(string error);
}
The reason I say imaginary is because the monad type-class requires higher-kinded types (generic-types which, themselves, take generic parameters), and C# does not have higher-kinded types. M<A>
isn't possible, because it's only possible to parameterise the lower kind: A
, the M
part - the higher kind - can't be parameterised.
Regardless, let's imagine what could happen if C# did support those features:
public class MOption : Monad<Option>
{
public static Option<B> Bind<A, B>(Option<A> ma, Func<A, Option<B>> f) =>
ma.Match(Some: f, None: None);
public static Option<A> Return<A>(A value) =>
Some(value);
public static Option<A> Fail<A>(string _) =>
None;
}
Filling in the implementation, you can see how the M
becomes Option
. And so this would create an implementation of Monad
for Option
.
You could do one for Seq<A>
:
public class MSeq : Monad<Seq>
{
public static Seq<B> Bind<A, B>(Seq<A> ma, Func<A, Seq<B>> f)
{
return Seq(Yield());
IEnumerable<B> Yield()
{
foreach(var a in ma)
{
foreach(var b in f(a))
{
yield return b;
}
}
}
}
public static Seq<A> Return<A>(A value) =>
Seq1(value);
public static Seq<A> Fail<A>(string error) =>
Empty;
}
And so by doing this for any type you retroactively give types like Option<A>
and Seq<A>
(which are just data-types) additional traits. In this case the trait of being a monad. This is known as ad-hoc polymorphism (because we're defining aspects of the type after-the-fact), and works differently to inheritance and parametric polymorphism. By filling in the contract of a type-class the data-type gains features.
Continuing with our imaginary version of C#, we could then write a function:
M<int> Add<M>(M<int> ma, M<int> mb) where M : Monad<M> =>
M.Bind(ma, a =>
M.Bind(mb, b =>
M.Return(a + b)));
This is a polymorphic function that would work with any monad and add the bound values together before putting the result back into a new version of that monad. And so, with higher-kinded types it's possible to get super generic and write functionality once that would work over Option
, Either
, Try
, Task
, Seq
, etc.
But C# doesn't have higher kinds - it seems there's a possible proposal going on in the Roslyn project, but right now it's not there. This is what the special methods Select
and SelectMany
are for. They're a bit of a hack by the C# language team to give C# developers most of the benefits of monads.
The unfortunate missing feature is the ability to define a function that can take any monad (like Add
above), or to build a type that is constrained to have monadic components. Instead we need to know the type of monad we're working on at all times.
In practice this isn't such a big problem, especially for most development - it becomes a bit of a problem if you're writing a library (like language-ext) and you'd rather not write the same function N times for N different monads, or you want other people to be able to use your code with the monads they define.
Select
and SelectMany
allow you to give a data-type the trait of being a monad. Which then allows it to be used in LINQ expressions. What's interesting is how similar LINQ is to do notation (which is the convenient way of working with monads in Haskell).
First the LINQ way:
from a in optionA
from b in optionB
select a + b;
Next the Haskell way:
a <- optionA
b <- optionB
return (a + b)
What may start to clear in your head is that a
is in optionA
and b
is in optionB
. The in
and <-
operators are very clearly saying "get the value out of the thing on the right and put it into the variable on the left".
Another interesting fact is that select
in LINQ is return
in Haskell. Remember the original definition of a Haskell monad:
class Monad m where
(>>=) :: m a -> (a -> m b) -> m b
return :: a -> m a
fail :: String -> m a
See that return
takes an a
and puts it into a monad m a
. And so the from ... in ...
and the ... <- ...
can be seen as: taking the value out of the monad. The return
and select
can be seen as wrapping the value back up into a monad.
So, are the C# implementations really monads? Yes and no. They behave like monads in Haskell, yes. But there's a ton of super generic programming that C# cannot do due to the lack of higher-kinds, so no.
Is using LINQ in this way a hack? Absolutely not. LINQ has been purposefully engineered for this exact functionality. We could call it 'monads light'.
There's a bit more to this though. C# can almost do higher-kinds. This is bending the language, and I certainly wouldn't recommend using this approach too much, but we can almost do type-classes and class-instances (the implementations):
public interface Monad<MA, A>
{
MB Bind<MonadB, MB, B>(MA ma, Func<A, MB> f)
where MonadB : struct, Monad<MB, B>;
MA Return(A value);
MA Fail(string value);
}
The interface above defines a slightly less strict monad (less strict because the return type of Bind
could be any monad). We can now define the implementation:
public struct MOption<A> : Monad<Option<A>, A>
{
public MB Bind<MonadB, MB, B>(Option<A> ma, Func<A, MB> f)
where MonadB : struct, Monad<MB, B> =>
ma.Match(
Some: f,
None: () => default(MonadB).Fail(""));
public Option<A> Return(A value) =>
Some(value);
public Option<A> Fail(string value) =>
None;
}
What does the Add
function now look like?
MI Add<MonadInt, MI>(MI ma, MI mb) where MonadInt : struct, Monad<MI, int> =>
default(MonadInt).Bind(ma, a =>
default(MonadInt).Bind(mb, b =>
default(MonadInt).Return(a + b)));
Now that can be called with any monad that has a bound type of int
:
var ma = Some(10);
var mb = Some(20);
var mc = Add<MOption<int>, Option<int>>(ma, mb);
And so it can be done. And it can be type-safe and without hacks. But it can't be used with LINQ. It's also quite ugly to look at... if you squint at the Add
function implementation you can see the two Bind
lines are the from ... in ...
parts and the Return
line is the select
part.
As an aside, the +
operator here is also a limiting factor, if we could specify something addable, or numeric, then we could make something that worked with a generic value rather than just int
:
// Numeric type-class/trait
public interface Num<A>
{
A Add(A x, A y);
A Subtract(A x, A y);
A Multiply(A x, A y);
A Divide(A x, A y);
}
// Numeric instance for int
public struct NumInt : Num<int>
{
public int Add(int x, int y) => x + y;
public int Subtract(int x, int y) => x - y;
public int Multiply(int x, int y) => x * y;
public int Divide(int x, int y) => x / y;
}
// Numeric instance for double
public struct NumDouble : Num<double>
{
public double Add(double x, double y) => x + y;
public double Subtract(double x, double y) => x - y;
public double Multiply(double x, double y) => x * y;
public double Divide(double x, double y) => x / y;
}
Now we can rewrite the generic Add
function to be even more generic:
MA Add<MNumA, NumA, MA, A>(MA ma, MA mb)
where MNumA : struct, Monad<MA, A>
NumA : struct, Num<A> =>
default(MNumA).Bind(ma, a =>
default(MNumA).Bind(mb, b =>
default(MNumA).Return(default(NumA).Add(a, b))));
This is now a generic add function that takes any monadic-types as long as their bound value is numeric. This is very powerful but, it's really not worth going the super generic route unless it's going to seriously save some typing and result in more robust code. It's not easy to read or follow and will confuse most C# devs. Hopefully we'll see higher-kinds in C# proper one day, but in the mean-time stick with bespoke monads that use Select
and SelectMany
and work with LINQ.
To see the above technique in-use, check out the definition of the Monad type-class interface and some of the Monad class-instances. There's example usage in the unit tests. Again, it must be made clear that this approach isn't necessary the vast majority of the time.
Luckily all the hard work is already done - but what if you want to build your own monadic types?
First, it's important to understand why you'd want to do that. Monads are often called programmable semi-colons, this is because monads are all about chaining operations sequentially (like lines of imperative code), but in between those operations/lines some code can run. The nature of that code is what gives each monad its flavour.
It isn't quite the same as just injecting some code inbetween the lines, because for each monadic computation a context is created for the rest of the monadic computations in the expression. For example, the State<S, A>
monad allows an initial state to be passed in to the expression and then any one computation can set the state for subsequent ones. This, from the outside, looks like state mutation but it is in fact leaving the original state alone and setting the state context going forward.
The common set of features of monads are:
- Control flow -
Option
,Either
,Try
, etc. - Asynchrony/Concurrency -
Task
- State management -
State
,RWS
- Logging -
Writer
,RWS
- Configuration passing -
Reader
,RWS
- Collection enumeration -
IEnumerable
,Seq
,Lst
,Set
,HashSet
, ... - Parsing -
Parsec
- DSL - Free monads
Let's look at one of the more complex ones State<S, A>
. This monad is defined as a delegate
, which immediately makes it lazy. If you look at the definition below you can see it takes an S
state and returns a new S
state, an A
value, and a string for whether it's faulted or not.
public delegate (S State, A Value, string Error) State<S, A>(S state);
If you remember the Haskell definition of a monad:
class Monad m where
(>>=) :: m a -> (a -> m b) -> m b
return :: a -> m a
fail :: String -> m a
We need a return
function. This is the constructor of the monad:
public static class State
{
public static State<S, A> Return<S, A>(A value) =>
state =>
(state, value, null);
}
See how the implementation is delegate
that takes a state
and along with the value
constructs (S State, A Value, string Error)
Let's do fail:
public static class State
{
public static State<S, A> Fail<S, A>(string errorMsg) =>
state =>
(state, default, errorMsg);
}
So now we can construct a State<S, A>
in either a success or fail state. Now we need to implement (>==)
, which is the bind function. C# does this slightly differently than Haskell. There are a few extension methods to create that make the resulting code more optimal. We can, however, derive those extension from a single bind function:
public static class State
{
public static State<S, B> Bind<S, A, B>(this State<S, A> ma, Func<A, State<S, B>> bind) =>
state =>
{
// First we invoke the State delegate to get the tuple result
var (newState, valueA, error) = ma(state);
// We can now put our special magic code in-place that
// does this flavour of monad 'in between the lines'
// We add error handling which auto-shortcuts the expression
if(error != null) return (state, default, error);
// Next we run the bind function, with the result value of `ma`
// to find out what our next step is
var mb = bind(valueA);
// `mb` is a State<S, B> - but we're still inside the delegate,
// which expects a tuple result. So, we need to invoke `mb` with
// an `S` state. Luckily we got `newState` returned from invoking
// `ma` - and this allows us to thread the state through the
// expression without any manual intervention from the programmer
return mb(newState);
};
}
None of this magically makes LINQ work and so now we have to implement the Select
and SelectMany
extension methods to pull all of this together:
public static State<S, B> Select<S, A, B>(this State<S, A> ma, Func<A, B> map) =>
ma.Bind(a => State.Return(map(a)));
public static State<S, B> SelectMany<S, A, B>(this State<S, A> ma, Func<A, State<S, B>> bind) =>
ma.Bind(bind);
public static State<S, B, C> SelectMany<S, A, B, C>(
this State<S, A> ma,
Func<A, State<S, B>> bind,
Func<A, B, C> project) =>
ma.Bind(a => bind(a).Select(b => project(a, b)));
What I find interesting about these LINQ operators is how they can all be implemented in terms of Bind
. This means that building new monadic types can be quite an easy process of just building the Bind
function and then copying the above but replacing State
for your monad.
It should be noted though that you can realise performance benefits by implementing them manually, but that's outside the scope of this discussion
Now we can try out our State
monad to see what we can do. Let's create something to use as state:
[WithLens]
public partial class Person
{
public static Person Empty = new Person("", "", default);
public readonly string Name;
public readonly string Surname;
public readonly Color FavouriteColour;
public Person(string name, string surname, Color favouriteColour)
{
Name = name;
Surname = surname;
FullName = fullName;
FavouriteColour = favouriteColour;
}
}
using static State;
var stateMonad = from name in State.Return<Person, string>("Paul")
from surname in State.Return<Person, string>("Louth")
select $"{name} {surname}";
var (state, result, error) = stateMonad(Person.Empty);
Ok, that doesn't seem that useful. We're just getting a result, but not interacting with the internal state. So, how can we do that?
First we want to be able to save state:
public static class State
{
public static State<S, Unit> put<S>(S newState) =>
_ =>
(newState, default, null);
}
Notice how the input state is ignored (that's why it's using the _
name). Next we're returning a result with newState
as the state, and Unit
as the value.
Now we need to be able to get the state out of the monad to be used like the name
and surname
are in the example above. One thing to note about name
and surname
above is that it's the A
part of State<S, A>
- so we need the S
to go where the A
is to make it available in LINQ:
public static class State
{
public static State<S, S> get<S>() =>
state =>
(state, state, null);
}
See how the state
is copied to the value position, therefore returning a State<S, S>
. This makes the state available.
using static State;
var stateMonad = from person in get<Person>()
from _ in put(person.With(Name: "Paul", Surname: "Louth"))
select unit;
var (state, result, error) = stateMonad(Person.Empty);
One more useful function is modify
:
public static class State
{
public static State<S, Unit> modify<S>(Func<S, S> f) =>
from state in get<S>()
from _ in put(f(state))
select unit;
}
We can then start wrapping up interaction with the state:
public static State<Person, Unit> putPersonName(string name, string surname) =>
modify(person => person.With(Name: name, Surname: surname));
public static State<Person, Unit> putFavouriteColour(Color colour) =>
modify(person => person.With(FavouriteColour: colour));
And those can be composed into other LINQ expressions:
public static State<Person, Unit> InitPerson(string name, string surname, Color favouriteColour) =>
from _1 in putPersonName(name, surname)
from _2 in putFavouriteColour(favouriteColour)
select unit;
Suddenly everything starts looking very declarative and isn't cluttered with explicit state management. There's not a context parameter, or a state variable in sight.
We can then build the state monad computation and invoke it with some state empty state:
var initPerson = InitPerson("Paul", "Louth", Color.Red);
var (person, _, error) = initPerson(Person.Empty);
We could also make it convenient to read back some state:
public static State<Person, string> getName =>
from person in get<Person>()
select person.Name;
public static State<Person, string> getSurname =>
from person in get<Person>()
select person.Surname;
public static State<Person, Color> getFavouriteColour =>
from person in get<Person>()
select person.FavouriteColour;
And then compose it with the previous InitPerson
:
var statementAboutPerson = from _ in initPerson
from name in getName
from surname in getSurname
from colour in getFavouriteColour
select $"{name} {surname}'s favourite colour is {colour}";
var (person, statement, error) = statementAboutPerson(Person.Empty);
The resulting code becomes more an more elegant and declarative the more you wrap up like this.
One interesting aspect of this is that a tenet of functional programming is to avoid side-effects. One way of doing that is by passing in the state of the world to any function (along with the argument it needs to run), and then instead of running a side-effect to modify the world, return an updated state of the world along with the result of the operation. Thereby maintaining the immutability of the past, you've created a new future, and left the past as it should be: immutable. The State
monad is an exact reflection of that approach.
Hopefully the demonstration of how to build a slightly more complex monad in C# (than is normally shown in blogs) will give some food for thought on the power and validity of the approach. My personal feeling is that monadic programming is the next step for mainstream programming. Most of the mainstream languages today are based on the same techniques of 20-30 years ago, and with the super-large programs we have today I believe we're running out of luck with that approach.
We need better mechanisms of abstraction, the OO ways do not compose and therefore don't deal with the complexity problem, functional programming does, and monadic programming takes it one step further.