This is a series of articles. All parts:
Motivation
Functional Programming abstractions have a bad rap for not being accessible.
Or for being a tool to gatekeep.
Or for just sounding like plain, psychedelic nonsense (bong rip Cotambara cofreely constructs costrength).
I WANT TO FIX THIS
I’ve been using pure FP in production for 10 years. I programmed in Haskell, OCaml, Elm and PureScript. I’ve solved real-world problems in diverse industries, such as healthcare, dating, fintech and blockchain.
In my short (but eventful) career, I benefited a lot from foundational abstractions that took root in abstract algebra and category theory. I experienced first-hand the value of these concepts. This is good stuff, folks!
I want to demystify these concepts.
I want to show they’re not scary.
I want to show they’re actually useful for building real-world stuff.
I hope you’re hyped (I know I am). Let’s start.
This is a series of blog posts and videos. I’m going to focus more on use cases rather than on deep theory but the underlying theory is important too.
All code examples are provided in OCaml for illustration purposes. This is not an OCaml tutorial but basic familiarity with the language should be enough. Refer to the OCaml::Learn section to learn more about OCaml.
All code snippets can be found on GitHub:
What is Semigroup?
We start with one of the simplest yet powerful concepts — semigroup.
Why Semigroup?
This abstraction supports quite a large number of diverse use cases:
- MapReduce
- CS Data Structures such as Segment Tree, Treap and Rope
- Optimal multiplication and string concatenation algorithms
- Blazing fast string builder
- Composable lexicographic comparison API
- Set and dictionary union
- Combining config files, CLI and environment arguments
- Composable validation
- Composable logging
- The Builder pattern from OOP
- Sane JSON merging for structured logging
And we’re going to look into ALL OF THEM in future parts.
You can see that such mathematical abstractions are expressive enough to implement common OOP patterns. However, they have the added benefit of being rooted in math and backed by thousands of years of research. This means you get so much for free!
So let’s not waste any more time.
So what is it actually?
A semigroup is a pair of a type and an operation of appending two values of this type to get a third value of the same type.
👩🔬 For the sake of simplicity, here we’re using the definition of semigroup from abstract algebra. For completeness, a Category Theory definition: a semigroup is a hom-set in a semicategory with a single object. But let’s stick with appending two values for now.
Here’s a graphical explanation:
You may immediately say that this looks too generic! It’s just appending two things, what’s the big deal?!
And indeed, you can call this operation append, add, multiply, combine, compose, merge, melt, fuse, apply, squash, etc. It doesn’t really matter, we don’t gatekeep here on irrelevant details (my favourite one is smoosh). For consistency, I’ll be using append everywhere.
But just to be more concrete, this concept can also be easily expressed in OCaml using module signatures.
module type SEMIGROUP = sig
type t
val append : t -> t -> t
end
In this module signature, we have a type t
and a binary function append
that takes two values of type t
and returns another value of type t
. When implementing a module with this signature, a developer needs to specify a concrete type t
and implement the append
function.
However, there’s one extra restriction. Not every append
creates a Semigroup. This append
operation must satisfy one requirement — associativity.
In other words, the following must be true:
append a (append b c) = append (append a b) c
And that’s all! It’s that simple! Although, for now, this may not look too interesting. Don’t worry. Lots of real-world examples are waiting in future parts!
👩🔬 More formally, a semigroup is a pair of type
t
and a binary associative operation where the operands and the result all have the same typet
.
Examples
We’ll see plenty of Semigroup examples as well as some counterexamples where this associativity requirement doesn’t hold.
Numbers
Let’s start with trivial examples. Integer numbers form a semigroup. How can we append two numbers? Simple, just add them!
Here’s how it looks in OCaml:
module IntAdd = struct
type t = int
let append = ( + )
end
It’s pretty straightforward to show that associativity holds. We all know from school math that a + (b + c) = (a + b) + c
. Not all examples will be that trivial though!
We can easily verify that this implementation does what we want using utop (an OCaml REPL):
2 3 ;;
utop # IntAdd.append int = 5 - :
I already sense doubt in your eyes. A question is rising inside you:
— “Who the hell is going to
IntAdd.append
two numbers??? I can just use+
!”And you will be absolutely right. If you want to add numbers, you’re just going to use the
+
operator. However, trivial examples help with onboarding complex concepts. And we’ll see in future articles that even this simple example is quite useful.
Is this the only way to append two numbers? Surprisingly, not. We can also multiply them! And it’ll be a valid Semigroup as well. The implementation is almost identical to IntAdd
:
module IntMul = struct
type t = int
let append = ( * )
end
Let’s verify that everything works:
2 3 ;;
utop # IntMul.append int = 6 - :
You can see that a single type (in our example, int
) can have multiple ways to append its values. OCaml module system can handle this perfectly. The important takeaway here is that Semigroup is not defined just by a type, but also by the specific implementation of append
.
🧑🔬 It’s tempting to implement such modules for other number types, like
float
. Unfortunately, arithmetic operations forfloat
are not associative under IEEE 754. In such cases, it’s better to avoid providing modules entirely. We’ll see why associativity matters later.
Boolean
Another trivial example. Booleans also can form a Semigroup! The standard operations of logical or and logical and are one of the simplest examples.
Implementation in OCaml is quite similar to the previous ones:
module BoolAnd = struct
type t = bool
let append = ( && )
end
module BoolOr = struct
type t = bool
let append = ( || )
end
As always, we can leverage utop
to see that everything works:
true false ;;
utop # BoolAnd.append bool = false
- :
true false ;;
utop # BoolOr.append bool = true - :
Proving that these operations satisfy associativity is left as an exercise for the dedicated reader.
Strings
Let’s finally look at the first slightly less trivial and the first pragmatic real-world example!
It turns out that string concatenation is also a Semigroup!
The OCaml implementation is straightforward again:
module String = struct
type t = string
let append = ( ^ )
end
Verifying that it works:
String.append "I know " "Semigroup!" ;;
utop # string = "I know Semigroup!" - :
As a developer, you append strings all the time. You don’t think about this simple operation in terms of semigroups. But the structure is there. Once you open this Pandora Box, you start noticing semigroups everywhere.
String concatenation happens to resemble number addition and boolean logical operations. JavaScript devs might actually be onto something.
The string append example is different in one other significant way. All previous operations (+
, *
, &&
and ||
) satisfy a different property — commutativity. In other words, for them, the order of arguments for append
doesn’t matter, and the following holds:
append a b = append b a
String concatenation, in turn, is not commutative. The order of appending strings matters. But the operation is still associative, as can be seen in the following example:
1)
("Hello, " (append "OCaml " "World!")
append "Hello, " "OCaml World!"
= append "Hello, OCaml World!"
=
2)
("Hello, " "OCaml ") "World!"
append (append "Hello, OCaml " "World!"
= append "Hello, OCaml World!" =
Counterexample
At this point, you may start thinking that everything is a semigroup! However, life is cruel. Not everything is a semigroup.
We don’t need to come up with an esoteric example. A simple number subtraction is not a semigroup because the associativity property doesn’t work for it, as can be seen on the following example:
1)
(1 - (2 - 3)
1 - (-1)
= 2
=
2)
(1 - 2) - 3
(-1) - 3
= (-4 =
Still, nothing prevents you from writing the following OCaml code:
module IntSub = struct
type t = int
let append = ( - )
end
👩🔬 This associativity requirement is a semantic law. You can choose not to follow it at your own risk and write code that satisfies the API but not the extra contract, as demonstrated above.
If you decide not to follow the law, you can’t reap the benefits it provides. Moreover, if you lose the semantics, then you can’t take advantage of the nice properties that they will have. And we’ll see in future parts what are those nice properties.
Fortunately, you can use tests to verify that modules satisfy the associativity law (either unit or property-based tests).
Exercise: Can you come up with at least one more counterexample of semigroup?
Conclusion
That’s it for now! I wanted to keep the introduction light. But you’re going to be mind-blown pretty soon. I promise (however, I don’t give any refunds).
Acknowledgement
Many thanks to people who proofread the early draft of this article and shared their invaluable feedback: _____C adworse DazzlingHazlitt egmaleta int_index janiczek jwdunne_dev
If you liked this blog post, consider supporting my work on GitHub Sponsors, or following me on the Internet: