Skip to content

Does C# Dream Of Electric Monads?

Paul Louth edited this page Aug 18, 2023 · 39 revisions

This is a discussion about what 'monad' means in the C# world. This makes some assumptions about the reader in that this isn't Yet Another Monad Tutorial, it's coming from the point-of-view that you've used Option, Seq, Either, etc. and you used them with LINQ, and you kind of get that this is all to do with monads and the Bind function (along with Select and SelectMany), but you're not quite sure how real this is in C# world. Is using LINQ for this purpose a hack?

So, to start this off, it's worth looking at Haskell, which is a language built around monads. They're used to partition side-effects, as well as simplify common tasks like control flow, carrying state contexts through the stack, etc.

If you look at the definition of the monad type-class, you'll see something like this:

    class Monad m where
        (>>=)       :: m a -> (a -> m b) -> m b
        return      :: a -> m a
        fail        :: String -> m a

I've stripped out a lot of the unnecessary stuff to leave us with the core definition.

(>>=) is the bind operator; we'll call that the bind function from now on.

If you've never spent any time with Haskell, then this might not make a lot of sense. An imaginary C# version might look like this:

    public interface Monad<M>
    {
        static M<B> Bind<A, B>(M<A> ma, Func<A, M<B>> f);
        static M<A> Return<A>(A value);
        static M<A> Fail<A>(string error);
    }

The reason I say imaginary is because the monad type-class requires higher-kinded types (generic-types which, themselves, take generic parameters), and C# does not have higher-kinded types. M<A> isn't possible, because it's only possible to parameterise the lower kind: A, the M part - the higher kind - can't be parameterised.

Regardless, let's imagine what could happen if C# did support those features:

    public class MOption : Monad<Option>
    {
        public static Option<B> Bind<A, B>(Option<A> ma, Func<A, Option<B>> f) =>
            ma.Match(Some: f, None: None);

        public static Option<A> Return<A>(A value) =>
            Some(value);

        public static Option<A> Fail<A>(string _) =>
            None;
    }

Filling in the implementation, you can see how the M becomes Option. And so this would create an implementation of Monad for Option.

You could do one for Seq<A>:

    public class MSeq : Monad<Seq>
    {
        public static Seq<B> Bind<A, B>(Seq<A> ma, Func<A, Seq<B>> f)
        {
            return Seq(Yield());

            IEnumerable<B> Yield()
            {
                foreach(var a in ma)
                {
                    foreach(var b in f(a))
                    {
                        yield return b;
                    }
                }
            }
        }

        public static Seq<A> Return<A>(A value) =>
            Seq1(value);

        public static Seq<A> Fail<A>(string error) =>
            Empty;
    }

And so by doing this for any type you retroactively give types like Option<A> and Seq<A> (which are just data-types) additional traits. In this case the trait of being a monad. This is known as ad-hoc polymorphism (because we're defining aspects of the type after-the-fact), and works differently to inheritance and parametric polymorphism. By filling in the contract of a type-class the data-type gains features.

Continuing with our imaginary version of C#, we could then write a function:

   M<int> Add<M>(M<int> ma, M<int> mb) where M : Monad<M> =>
       M.Bind(ma, a => 
       M.Bind(mb, b => 
       M.Return(a + b)));

This is a polymorphic function that would work with any monad and add the bound values together before putting the result back into a new version of that monad. And so, with higher-kinded types it's possible to get super generic and write functionality once that would work over Option, Either, Try, Task, Seq, etc.

But C# doesn't have higher kinds - it seems there's a possible proposal going on in the Roslyn project, but right now it's not there. This is what the special methods Select and SelectMany are for. They're a bit of a hack by the C# language team to give C# developers most of the benefits of monads.

The unfortunate missing feature is the ability to define a function that can take any monad (like Add above), or to build a type that is constrained to have monadic components. Instead we need to know the type of monad we're working on at all times.

In practice this isn't such a big problem, especially for most development - it becomes a bit of a problem if you're writing a library (like language-ext) and you'd rather not write the same function N times for N different monads, or you want other people to be able to use your code with the monads they define.

LINQ

Select and SelectMany allow you to give a data-type the trait of being a monad. Which then allows it to be used in LINQ expressions. What's interesting is how similar LINQ is to do notation (which is the convenient way of working with monads in Haskell).

First the LINQ way:

   from a in optionA
   from b in optionB
   select a + b;

Next the Haskell way:

   a <- optionA
   b <- optionB
   return (a + b)

What may start to clear in your head is that a is in optionA and b is in optionB. The in and <- operators are very clearly saying "get the value out of the thing on the right and put it into the variable on the left".

Another interesting fact is that select in LINQ is return in Haskell. Remember the original definition of a Haskell monad:

    class Monad m where
        (>>=)       :: m a -> (a -> m b) -> m b
        return      :: a -> m a
        fail        :: String -> m a

See that return takes an a and puts it into a monad m a. And so the from ... in ... and the ... <- ... can be seen as: taking the value out of the monad. The return and select can be seen as wrapping the value back up into a monad.

So, are the C# implementations really monads? Yes and no. They behave like monads in Haskell, yes. But there's a ton of super generic programming that C# cannot do due to the lack of higher-kinds, so no.

Is using LINQ in this way a hack? Absolutely not. LINQ has been purposefully engineered for this exact functionality. We could call it 'monads light'.

But

There's a bit more to this though. C# can almost do higher-kinds. This is bending the language, and I certainly wouldn't recommend using this approach too much, but we can almost do type-classes and class-instances (the implementations):

    public interface Monad<MA, A>
    {
        MB Bind<MonadB, MB, B>(MA ma, Func<A, MB> f) 
            where MonadB : struct, Monad<MB, B>;

        MA Return(A value);

        MA Fail(string value);
    }

The interface above defines a slightly less strict monad (less strict because the return type of Bind could be any monad). We can now define the implementation:

    public struct MOption<A> : Monad<Option<A>, A>
    {
        public MB Bind<MonadB, MB, B>(Option<A> ma, Func<A, MB> f) 
            where MonadB : struct, Monad<MB, B> => 
            ma.Match(
                Some: f, 
                None: () => default(MonadB).Fail(""));

        public Option<A> Return(A value) => 
            Some(value);

        public Option<A> Fail(string value) =>
            None;
    }

What does the Add function now look like?

   MI Add<MonadInt, MI>(MI ma, MI mb) where MonadInt : struct, Monad<MI, int> =>
       default(MonadInt).Bind(ma, a => 
       default(MonadInt).Bind(mb, b => 
       default(MonadInt).Return(a + b)));

Now that can be called with any monad that has a bound type of int:

    var ma = Some(10);
    var mb = Some(20);
    var mc = Add<MOption<int>, Option<int>>(ma, mb);

And so it can be done. And it can be type-safe and without hacks. But it can't be used with LINQ. It's also quite ugly to look at... if you squint at the Add function implementation you can see the two Bind lines are the from ... in ... parts and the Return line is the select part.

As an aside, the + operator here is also a limiting factor, if we could specify something addable, or numeric, then we could make something that worked with a generic value rather than just int:

    // Numeric type-class/trait
    public interface Num<A>
    {
        A Add(A x, A y);
        A Subtract(A x, A y);
        A Multiply(A x, A y);
        A Divide(A x, A y);
    }

    // Numeric instance for int
    public struct NumInt : Num<int>
    {
        public int Add(int x, int y) => x + y;
        public int Subtract(int x, int y) => x - y;
        public int Multiply(int x, int y) => x * y;
        public int Divide(int x, int y) => x / y;
    }

    // Numeric instance for double
    public struct NumDouble : Num<double>
    {
        public double Add(double x, double y) => x + y;
        public double Subtract(double x, double y) => x - y;
        public double Multiply(double x, double y) => x * y;
        public double Divide(double x, double y) => x / y;
    }

Now we can rewrite the generic Add function to be even more generic:

   MA Add<MNumA, NumA, MA, A>(MA ma, MA mb) 
       where MNumA : struct, Monad<MA, A> 
             NumA : struct, Num<A> =>
       default(MNumA).Bind(ma, a => 
       default(MNumA).Bind(mb, b => 
       default(MNumA).Return(default(NumA).Add(a, b))));

This is now a generic add function that takes any monadic-types as long as their bound value is numeric. This is very powerful but, it's really not worth going the super generic route unless it's going to seriously save some typing and result in more robust code. It's not easy to read or follow and will confuse most C# devs. Hopefully we'll see higher-kinds in C# proper one day, but in the mean-time stick with bespoke monads that use Select and SelectMany and work with LINQ.

To see the above technique in-use, check out the definition of the Monad type-class interface and some of the Monad class-instances. There's example usage in the unit tests. Again, it must be made clear that this approach isn't necessary the vast majority of the time.

Roll your own?

Luckily all the hard work is already done - but what if you want to build your own monadic types?

First, it's important to understand why you'd want to do that. Monads are often called programmable semi-colons, this is because monads are all about chaining operations sequentially (like lines of imperative code), but in between those operations/lines some code can run. The nature of that code is what gives each monad its flavour.

It isn't quite the same as just injecting some code inbetween the lines, because for each monadic computation a context is created for the rest of the monadic computations in the expression. For example, the State<S, A> monad allows an initial state to be passed in to the expression and then any one computation can set the state for subsequent ones. This, from the outside, looks like state mutation but it is in fact leaving the original state alone and setting the state context going forward.

The common set of features of monads are:

  • Control flow - Option, Either, Try, etc.
  • Asynchrony/Concurrency - Task
  • State management - State, RWS
  • Logging - Writer, RWS
  • Configuration passing - Reader, RWS
  • Collection enumeration - IEnumerable, Seq, Lst, Set, HashSet, ...
  • Parsing - Parsec
  • DSL - Free monads

Let's look at one of the more complex ones State<S, A>. This monad is defined as a delegate, which immediately makes it lazy. If you look at the definition below you can see it takes an S state and returns a new S state, an A value, and a string for whether it's faulted or not.

    public delegate (S State, A Value, string Error) State<S, A>(S state);

If you remember the Haskell definition of a monad:

    class Monad m where
        (>>=)       :: m a -> (a -> m b) -> m b
        return      :: a -> m a
        fail        :: String -> m a

We need a return function. This is the constructor of the monad:

    public static class State
    {
        public static State<S, A> Return<S, A>(A value) => 
             state => 
                 (state, value, null);
    }

See how the implementation is delegate that takes a state and along with the value constructs (S State, A Value, string Error)

Let's do fail:

    public static class State
    {
        public static State<S, A> Fail<S, A>(string errorMsg) => 
             state => 
                 (state, default, errorMsg);
    }

So now we can construct a State<S, A> in either a success or fail state. Now we need to implement (>==), which is the bind function. C# does this slightly differently than Haskell. There are a few extension methods to create that make the resulting code more optimal. We can, however, derive those extension from a single bind function:

    public static class State
    {
        public static State<S, B> Bind<S, A, B>(this State<S, A> ma, Func<A, State<S, B>> bind) => 
            state =>
        {
             // First we invoke the State delegate to get the tuple result
             var (newState, valueA, error) = ma(state);

             // We can now put our special magic code in-place that 
             // does this flavour of monad 'in between the lines'

             // We add error handling which auto-shortcuts the expression
             if(error != null) return (state, default, error);  

             // Next we run the bind function, with the result value of `ma`
             // to find out what our next step is
             var mb = bind(valueA);
   
             // `mb` is a State<S, B> - but we're still inside the delegate, 
             // which expects a tuple result.  So, we need to invoke `mb` with
             // an `S` state.  Luckily we got `newState` returned from invoking
             // `ma` - and this allows us to thread the state through the 
             // expression without any manual intervention from the programmer
             return mb(newState);
        };
    }

None of this magically makes LINQ work and so now we have to implement the Select and SelectMany extension methods to pull all of this together:

public static State<S, B> Select<S, A, B>(this State<S, A> ma, Func<A, B> map) => 
    ma.Bind(a => State.Return(map(a)));

public static State<S, B> SelectMany<S, A, B>(this State<S, A> ma, Func<A, State<S, B>> bind) => 
   ma.Bind(bind);

public static State<S, B, C> SelectMany<S, A, B, C>(
    this State<S, A> ma, 
    Func<A, State<S, B>> bind,
    Func<A, B, C> project) => 
        ma.Bind(a => bind(a).Select(b => project(a, b)));

What I find interesting about these LINQ operators is how they can all be implemented in terms of Bind. This means that building new monadic types can be quite an easy process of just building the Bind function and then copying the above but replacing State for your monad.

It should be noted though that you can realise performance benefits by implementing them manually, but that's outside the scope of this discussion

Usage

Now we can try out our State monad to see what we can do. Let's create something to use as state:

    [WithLens]
    public partial class Person
    {
         public static Person Empty = new Person("", "", default);

         public readonly string Name;
         public readonly string Surname;
         public readonly Color FavouriteColour;

         public Person(string name, string surname, Color favouriteColour)
         {
             Name = name;
             Surname = surname;
             FullName = fullName;
             FavouriteColour = favouriteColour;
         }
    }
    using static State;

    var stateMonad = from name    in State.Return<Person, string>("Paul")
                     from surname in State.Return<Person, string>("Louth")
                     select $"{name} {surname}";

    var (state, result, error) = stateMonad(Person.Empty);

Ok, that doesn't seem that useful. We're just getting a result, but not interacting with the internal state. So, how can we do that?

First we want to be able to save state:

    public static class State
    {
        public static State<S, Unit> put<S>(S newState) => 
            _ => 
                (newState, default, null);
    }

Notice how the input state is ignored (that's why it's using the _ name). Next we're returning a result with newState as the state, and Unit as the value.

Now we need to be able to get the state out of the monad to be used like the name and surname are in the example above. One thing to note about name and surname above is that it's the A part of State<S, A> - so we need the S to go where the A is to make it available in LINQ:

    public static class State
    {
        public static State<S, S> get<S>() =>
            state =>
                (state, state, null);
    }

See how the state is copied to the value position, therefore returning a State<S, S>. This makes the state available.

    using static State;

    var stateMonad = from person in get<Person>()
                     from _      in put(person.With(Name: "Paul", Surname: "Louth"))
                     select unit;

    var (state, result, error) = stateMonad(Person.Empty);

One more useful function is modify:

    public static class State
    {
        public static State<S, Unit> modify<S>(Func<S, S> f) =>
            from state in get<S>()
            from _     in put(f(state))
            select unit;
    }

We can then start wrapping up interaction with the state:

    public static State<Person, Unit> putPersonName(string name, string surname) =>
        modify(person => person.With(Name: name, Surname: surname));

    public static State<Person, Unit> putFavouriteColour(Color colour) =>
        modify(person => person.With(FavouriteColour: colour));

And those can be composed into other LINQ expressions:

    public static State<Person, Unit> InitPerson(string name, string surname, Color favouriteColour) =>
        from _1 in putPersonName(name, surname)
        from _2 in putFavouriteColour(favouriteColour)
        select unit;

Suddenly everything starts looking very declarative and isn't cluttered with explicit state management. There's not a context parameter, or a state variable in sight.

We can then build the state monad computation and invoke it with some state empty state:

    var initPerson = InitPerson("Paul", "Louth", Color.Red);

    var (person, _, error) = initPerson(Person.Empty);

We could also make it convenient to read back some state:

    public static State<Person, string> getName =>
        from person in get<Person>()
        select person.Name;

    public static State<Person, string> getSurname =>
        from person in get<Person>()
        select person.Surname;

    public static State<Person, Color> getFavouriteColour =>
        from person in get<Person>()
        select person.FavouriteColour;

And then compose it with the previous InitPerson:

    var statementAboutPerson = from _       in initPerson
                               from name    in getName
                               from surname in getSurname
                               from colour  in getFavouriteColour
                               select $"{name} {surname}'s favourite colour is {colour}";

    var (person, statement, error) = statementAboutPerson(Person.Empty);

The resulting code becomes more an more elegant and declarative the more you wrap up like this.

One interesting aspect of this is that a tenet of functional programming is to avoid side-effects. One way of doing that is by passing in the state of the world to any function (along with the argument it needs to run), and then instead of running a side-effect to modify the world, return an updated state of the world along with the result of the operation. Thereby maintaining the immutability of the past, you've created a new future, and left the past as it should be: immutable. The State monad is an exact reflection of that approach.

Conclusion

Hopefully the demonstration of how to build a slightly more complex monad in C# (than is normally shown in blogs) will give some food for thought on the power and validity of the approach. My personal feeling is that monadic programming is the next step for mainstream programming. Most of the mainstream languages today are based on the same techniques of 20-30 years ago, and with the super-large programs we have today I believe we're running out of luck with that approach.

We need better mechanisms of abstraction, the OO ways do not compose and therefore don't deal with the complexity problem, functional programming does, and monadic programming takes it one step further.