With generics, it has suddenly become much easier to add support for higher-order functions such as `Map`, `Filter` and `Reduce` in Go. However, there is much disagreement around how these functions should be implemented.

I will claim part of the confusion stems from the fact that many Go developers come from Java and C# backgrounds. They have become heavily invested in the object-oriented programming mindset that dominates the programming communities of those languages. It leads to attempts to define `Map`, `Filter` and `Reduce` as methods. Let us look at one such failed attempt, adding `Reduce` as a method to a custom collection type `Array`.
```go
package main

import "fmt"

type Array[V any] struct {
	elements []V
}

type Reducer[T, V any] func(accum T, value V) T

func (xs Array[V]) Reduce[T any](f Reducer[T, V]) T {
	var accum T
	for _, x := range xs.elements {
		accum = f(accum, x)
	}
	return accum
}

func adder(accum, x int) int {
	return accum + x
}

func main() {
	numbers := Array[int]{[]int{1, 2, 3, 4, 5}}
	total := numbers.Reduce(adder)
	fmt.Println(total)
}
```
The program will fail to compile with the error message:
```
reducer.go:11:26: syntax error: method must have no type parameters
```
The reason is that we are introducing a new type parameter `T` on the `Reduce` method. Go does not allow you to introduce new type parameters on a method; you can only use the type parameters already defined on your collection, such as `V`. A possible solution is to add an extra type parameter `T` to the definition of the `Array` parametric type.
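To make the problem concrete, here is a sketch of what that workaround could look like. This two-parameter `Array` is my own hypothetical illustration, not code from a real library:

```go
package main

import "fmt"

// Hypothetical workaround: the accumulator type T is hoisted into
// the collection type itself, even though it has nothing to do
// with how elements are stored.
type Array[T, V any] struct {
	elements []V
}

func (xs Array[T, V]) Reduce(f func(accum T, value V) T) T {
	var accum T
	for _, x := range xs.elements {
		accum = f(accum, x)
	}
	return accum
}

func main() {
	// Callers must now spell out an accumulator type that is
	// irrelevant to the collection itself.
	numbers := Array[int, int]{elements: []int{1, 2, 3, 4, 5}}
	fmt.Println(numbers.Reduce(func(a, x int) int { return a + x }))
}
```

Notice how the accumulator type leaks into every declaration of the collection, even in code that never calls `Reduce`.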
However, I will argue that such a solution would be an ugly hack and go against the philosophy of good Go programming, as well as being a poor software engineering choice in general. Let me quote one of the core Go designers:
> For general purpose data types, prefer a function, rather than writing a constraint that requires a method.
>
> — Ian Lance Taylor
Methods are overused, and it is actually a good thing that Go discourages their use. It is not just a belief among Go designers such as Ian Lance Taylor, but a belief held by prominent voices in the C++ community as well.
> Effective C++ item 23: Prefer Non-member Non-friend Functions to Member Functions
>
> — Scott Meyers
The I/O system in Go gives a good example of why free functions work so much better than methods in most cases. Go has many types of I/O objects such as pipes, files, sockets, and string buffers. All of these I/O objects support these two simple Go interfaces:
```go
type Reader interface {
	Read(buf []byte) (n int, err error)
}

type Writer interface {
	Write(p []byte) (n int, err error)
}
```
Go provides numerous free functions which work with these interfaces. For instance, you can write formatted text to any `Writer` object with the `Fprintf` function. Thus, you can use this function to write to a file you have opened:
```go
file, err := os.Create("message.txt")
if err != nil {
	panic(err)
}
defer file.Close()

fmt.Fprintf(file, "Hello %s!", name)
```
Would it not be better and more object-oriented to add `Fprintf` as a method?

```go
file.Fprintf("Hello %s!", name)
```
No, this approach would be a bad solution. It means that a `Fprintf` method needs to be added to every I/O type. The counter-point from object-oriented programming fans would be that `Fprintf` should be added to a base class, which all I/O objects inherit. That is a bad solution for several reasons:
- Inheritance creates tight coupling. Changes to a base class can break subclasses. Read about the fragile base class problem.
- The designers of the I/O library would have to predict all possible functionality needed by subclasses.
- Go doesn't support inheritance.
By adding functionality as free functions in separate packages, we can extend existing types without having to edit the source code of base classes. Languages such as Swift, Objective-C and Smalltalk tackle this issue with categories: collections of methods which are added to existing classes. Generally, solutions such as categories require dynamic languages where classes are objects that can be modified at runtime, and modifying types at runtime causes its own set of problems.

In short, good software design compels us to keep types with minimal interfaces. Types should only have the minimal number of methods required to implement their most basic functionality. Other functionality should instead be provided as free functions. Such a strategy allows us to organize functional extensions to our types into packages. Each package can have a well-defined purpose, giving us more modular code.
Implementing Map, Filter and Reduce as Free Functions
The right approach is hence to define `Map`, `Filter` and `Reduce` as free functions. Turning them into methods is a poor choice, as that would mean reimplementing the same methods across multiple types. Even if Go did support inheritance, which it doesn't, we would be stuck with the requirement that every collection type inherit the same base interface. We would end up with a tightly coupled and brittle design.
A naive approach would be to implement these functions to work directly on collection types:
```go
// Don't do this
func Reduce[T, V any](xs Array[V], f Reducer[T, V]) T {
	var accum T
	for _, x := range xs.elements {
		accum = f(accum, x)
	}
	return accum
}
```
Why is this solution a bad idea? It has some of the same flaws as the earlier solutions and some new ones:

- We would have to implement the same `Reduce` function for every collection type, leading to code duplication.
- Go does not support function overloading, so this approach wouldn't actually work even if you tried.
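To illustrate the overloading problem, consider this hypothetical sketch; the names `ReduceSlice` and `ReduceList`, and the `List` type, are made up for the example. Without overloading, every collection forces a second, differently named copy of essentially the same loop:

```go
package main

import "fmt"

// A made-up singly linked list type for illustration.
type node[V any] struct {
	value V
	next  *node[V]
}

type List[V any] struct {
	head *node[V]
}

// Each collection needs its own uniquely named reduction function.
func ReduceSlice[T, V any](xs []V, f func(T, V) T) T {
	var accum T
	for _, x := range xs {
		accum = f(accum, x)
	}
	return accum
}

func ReduceList[T, V any](xs *List[V], f func(T, V) T) T {
	var accum T
	for n := xs.head; n != nil; n = n.next {
		accum = f(accum, n.value)
	}
	return accum
}

func main() {
	total := ReduceSlice([]int{1, 2, 3, 4, 5},
		func(a, x int) int { return a + x })
	fmt.Println(total)
}
```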
The key realization is that we want to perform the same basic operation on any collection type. We must abstract away the differences between collection types and talk to an interface instead of a concrete collection type.
The solution is to work with iterators rather than concrete types. With iterators, we can implement algorithms like `Map`, `Filter` and `Reduce` so that they can be applied to any collection type accessible through an iterator.
Defining an Iterator Interface to Collections
We want our algorithms to work with abstract iterators. Algorithms should not have to know whether they are iterating over a binary tree, dictionary, array, or set. I have modeled my iterator interface on the Go `Scanner` type, which has an iterator-like interface.
```go
type Iterator[T any] interface {
	Next() bool
	Value() T
}
```
This interface is designed so that you can easily iterate over a collection with a for-loop:
```go
// Print out every value in the collection iterated over
for iter.Next() {
	fmt.Println(iter.Value())
}
```
We can create some concrete implementations of this iterator interface. It would be very useful to be able to iterate over a regular Go slice:
```go
type SliceIterator[T any] struct {
	Elements []T
	value    T
	index    int
}

// Create an iterator over the slice xs
func NewSliceIterator[T any](xs []T) Iterator[T] {
	return &SliceIterator[T]{
		Elements: xs,
	}
}

// Move to the next value in the collection
func (iter *SliceIterator[T]) Next() bool {
	if iter.index < len(iter.Elements) {
		iter.value = iter.Elements[iter.index]
		iter.index += 1
		return true
	}
	return false
}

// Get the current element
func (iter *SliceIterator[T]) Value() T {
	return iter.value
}
```
We now have enough implementation to allow iteration over a slice of numbers:
```go
var numbers []int = []int{1, 2, 3, 4, 5}
iter := NewSliceIterator(numbers)

for iter.Next() {
	fmt.Println(iter.Value())
}
```
Implementing Algorithms Using Iterators
Now that we have defined iterators, we can define algorithms using these iterators. Instead of collecting values and returning them in a collection, we return a new iterator. By returning an iterator, we are given the flexibility of chaining these higher-order functions and writing output to any data structure we like. Before looking at algorithm implementations, I want to show you how they can be used once implemented:
```go
var numbers []int = []int{1, 2, 3, 4, 5}

// Create iterator over a slice of integers
iter := NewSliceIterator(numbers)

// Pick values larger than 3
filtered := Filter(iter, func(x int) bool {
	return x > 3
})

// Square all values
mapped := Map(filtered, func(x int) int {
	return x * x
})

// Collect the result into a slice
result := Collect(mapped)

for _, x := range result {
	fmt.Println(x)
}
```
`Map` and `Filter` take iterators as input and produce iterators as output. The effect of this design choice is that we mimic lazy evaluation. When calling `Map`, the values of the collection don't actually get mapped there and then. A benefit of this strategy is that if we run many algorithms over the same collection, we avoid numerous memory allocations, which could kill performance. In the example above, we don't actually allocate any memory for the result until `Collect` is called.
In this particular example, calling `Collect` is not necessary, as we are just printing out each value. Instead, we could access the iterator in the for-loop directly:
```go
for mapped.Next() {
	fmt.Println(mapped.Value())
}
```
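The `Collect` function used in the example is never shown in this excerpt; a minimal sketch of how it could drain any iterator into a slice follows. The `Iterator` interface and the small slice iterator are repeated here so the sketch stands on its own:

```go
package main

import "fmt"

// Repeats the article's Iterator interface so this sketch
// compiles on its own.
type Iterator[T any] interface {
	Next() bool
	Value() T
}

// A compact stand-in for the SliceIterator defined earlier.
type sliceIter[T any] struct {
	xs    []T
	val   T
	index int
}

func (it *sliceIter[T]) Next() bool {
	if it.index < len(it.xs) {
		it.val = it.xs[it.index]
		it.index++
		return true
	}
	return false
}

func (it *sliceIter[T]) Value() T { return it.val }

// Collect drains any iterator into a freshly allocated slice.
// This is the only point in the chain where result memory
// is allocated.
func Collect[T any](iter Iterator[T]) []T {
	var result []T
	for iter.Next() {
		result = append(result, iter.Value())
	}
	return result
}

func main() {
	var iter Iterator[int] = &sliceIter[int]{xs: []int{1, 2, 3}}
	fmt.Println(Collect(iter)) // prints [1 2 3]
}
```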
Map function implementation
Let us look at the simplest function to implement, `Map`. Most of the functionality is actually in the special iterator object, `mapIterator`, that gets returned. The actual value mapping happens in the `Value` method. The iterator stores the mapping function `mapper` and an iterator `source` to the underlying collection.

The `Map` function itself is pretty dumb; it just constructs a `mapIterator` and returns it.
```go
type mapIterator[T any] struct {
	source Iterator[T]
	mapper func(T) T
}

// Advance to the next element
func (iter *mapIterator[T]) Next() bool {
	return iter.source.Next()
}

func (iter *mapIterator[T]) Value() T {
	value := iter.source.Value()
	return iter.mapper(value)
}

func Map[T any](iter Iterator[T], f func(T) T) Iterator[T] {
	return &mapIterator[T]{
		iter, f,
	}
}
```
Filter function implementation
The `Filter` function is designed to keep only those values which match a supplied predicate. A predicate is a function which takes an element as input and returns `true` or `false` depending on whether the element matches the filtering criteria. For example, if the predicate is "is odd", then only odd numbers are kept.
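Following the same pattern as `mapIterator`, a filtering iterator could plausibly be sketched as below; the `filterIterator` type and the repeated supporting definitions are my own, not necessarily the article's implementation:

```go
package main

import "fmt"

// Repeats the article's Iterator interface so this sketch
// compiles on its own.
type Iterator[T any] interface {
	Next() bool
	Value() T
}

// A compact stand-in for the SliceIterator defined earlier.
type sliceIter[T any] struct {
	xs    []T
	val   T
	index int
}

func (it *sliceIter[T]) Next() bool {
	if it.index < len(it.xs) {
		it.val = it.xs[it.index]
		it.index++
		return true
	}
	return false
}

func (it *sliceIter[T]) Value() T { return it.val }

// filterIterator skips over source values the predicate rejects.
type filterIterator[T any] struct {
	source Iterator[T]
	pred   func(T) bool
}

func (iter *filterIterator[T]) Next() bool {
	// Advance the source until a value matches or it is exhausted.
	for iter.source.Next() {
		if iter.pred(iter.source.Value()) {
			return true
		}
	}
	return false
}

func (iter *filterIterator[T]) Value() T {
	return iter.source.Value()
}

func Filter[T any](iter Iterator[T], pred func(T) bool) Iterator[T] {
	return &filterIterator[T]{iter, pred}
}

func main() {
	var src Iterator[int] = &sliceIter[int]{xs: []int{1, 2, 3, 4, 5}}
	odd := Filter(src, func(x int) bool { return x%2 == 1 })
	for odd.Next() {
		fmt.Println(odd.Value()) // prints 1, then 3, then 5
	}
}
```

The interesting part is `Next`, which keeps advancing the underlying source until the predicate accepts a value, so rejected elements are skipped lazily.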