Sometimes you want to traverse a list, perform an operation on each element,
then output a result based on the elements. Isn’t that what the function
map
does? Well, yes and no. The function map
outputs a list. You
might not want a list, but a value based on the elements of the input list. In
general, you want to reduce the elements of a list to a value, i.e. fold a list
into a value. The next section clarifies what we mean.
The library function sum
takes a list of numbers and adds all those
numbers together to result in a sum. As an exercise to help us understand how
the function sum
works, we implement our own version as follows:
:include: file=”assets/src/recurse/sum.hs”, line=27:30
Let’s generalize the above code into a pattern. Suppose we have a list $\ell$ of numbers. Let’s get the base case out of the way. If $\ell$ is empty, the sum of $\ell$ is zero. Now suppose $\ell$ is non-empty and let’s get on to the recursive case. Initially, the cumulative sum is zero. By definition of the recursive case, we can also argue that the initial cumulative sum is the head of $\ell$. As we visit the $i$-th element $e_i$ of $\ell$, the cumulative sum is the value of $e_i$ plus all the elements $e_0,\, e_1,\, \dots,\, e_{i-1}$. At any given step in the recursive process, we have three numbers:
By way of example, consider summing the list [1, 2, 3, 4, 5]
. The initial sum
is zero, which is also the starting value of the accumulator. At element
$e_0 = 1$, our accumulator is $e_0 = 1$ as well. At $e_1 = 2$, the accumulator
is $e_0 + e_1 = 3$. At $e_2$, the accumulator is $e_0 + e_1 + e_2 = 6$. At
$e_3$, the accumulator is $e_0 + e_1 + e_2 + e_3 = 10$. At $e_4$, the
accumulator is $e_0 + e_1 + e_2 + e_3 + e_4 = 15$. We have exhausted all
elements of the initial list, i.e. the remaining list is empty. By definition,
the sum of an empty list is zero. Add zero to the accumulator to result in the
accumulator itself. The result of summing the initial list is the current value
of the accumulator, namely 15. We have reduced (or folded) the initial list
[1, 2, 3, 4, 5]
to a value. The function used in the reduction is arithmetic
addition.
The pattern described in the section How to sum is an example of
a fold operation. The Haskell function foldl
takes
three parameters:1
+
.We can use foldl
to rewrite add
as follows:
:include: file=”assets/src/recurse/sum.hs”, line=32:34
The character l
(lowercase L) in the function name foldl
means “left”. This
means that the fold operation goes from the left of the list to the right. For
example, summing the list [1 .. 5]
can be written as:
1
2
ghci> foldl (+) 0 [1 .. 5]
15
In the above GHCi session, the binary operator +
hides the role of the
accumulator. We do not always have a pre-defined library function to pass to
foldl
. Sometimes we must define our own binary function. Let’s rewrite the
above code to clearly show how the accumulator is keeping track of the
cumulative sum. Examine the lambda expression below:
1
2
ghci> foldl (\acc x -> acc + x) 0 [1 .. 5]
15
Prior to starting the summation, the value of the accumulator acc
is 0. Now
start the summation at the first element of the list [1 .. 5]
. Then 1
would
be passed to the lambda expression and assigned to the variable x
, resulting
in the cumulative sum $0 + 1 = 1$. The new value of the accumulator is now 1.
Repeat the process for each of the remaining elements of the list and we obtain
the sum of the list. Using the associative law for addition,
the summation encapsulated in the above Haskell code can be written as:
1
((((0 + 1) + 2) + 3) + 4) + 5
Notice the position of the initial value 0 and how it corresponds to the position of the accumulator in the above lambda expression. Since we are folding from the left, the accumulator should be on the left-hand side in both infix and prefix notations. To evaluate the expression, we start from the innermost pair of parentheses and work our way to the outermost pair of parentheses.
Rumack: Can you fly this plane and land it?
Ted Striker: Surely you can’t be serious.
Rumack: I am serious… and don’t call me Shirley.
— Airplane!, 1980
If there is a fold left, then surely there must be a fold right, yes? Of course
there is a fold right. The Haskell function foldr
allows you to fold
from the right of a list to the left of the list. The function accepts the same
three parameters as foldl
. Here is a comparison:
1
2
3
4
5
6
7
8
ghci> foldl (+) 0 [1 .. 5]
15
ghci> foldl (\acc x -> acc + x) 0 [1 .. 5]
15
ghci> foldr (+) 0 [1 .. 5]
15
ghci> foldr (\x acc -> x + acc) 0 [1 .. 5]
15
The above summation with foldr
can be written as:
1
1 + (2 + (3 + (4 + (5 + 0))))
Again, notice the position of the initial value 0 and the position of the
accumulator in the lambda expression for foldr
. Since we are folding from the
right, the accumulator should be on the right-hand side in both infix and prefix
notations. We start from the innermost pair of parentheses and work our way out.
Why do foldl
and foldr
result in the same answer in the above GHCi session?
The reason is that addition is associative. It does not matter in which order we
perform the addition. We always get the same result no matter the order in which
we add.
Here is a simple case where foldl
and foldr
give different results:
1
2
3
4
5
6
7
8
ghci> foldl (-) 0 [1 .. 5]
-15
ghci> foldl (\acc x -> acc - x) 0 [1 .. 5]
-15
ghci> foldr (-) 0 [1 .. 5]
3
ghci> foldr (\x acc -> x - acc) 0 [1 .. 5]
3
Notice the position of the accumulator in the lambda expression for foldl
and
foldr
. The subtraction problem denoted by the code foldl (-) 0 [1 .. 5]
can
be written as:
1
((((0 - 1) - 2) - 3) - 4) - 5
On the other hand, the code foldr (-) 0 [1 .. 5]
can be written as:
1
1 - (2 - (3 - (4 - (5 - 0))))
In the case of subtraction, foldl
and foldr
give different results because
subtraction is not associative. The order of operation matters in subtraction.
Suppose we have a binary operator $f$ that is associative, i.e. the order of
operation does not matter. Using either foldl
or foldr
with $f$ would result in the same answer. Which fold operation should we choose?
A simple answer is: You should usually choose foldr
if your binary operator is
associative. The reason is that foldr
is more time efficient than foldl
. As
an example, let’s sum all integers between 1 and 1,000,000. Here is foldl
in
action:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
ghci> :set +s
ghci> foldl (+) 0 [1 .. 1000000]
500000500000
(0.34 secs, 161,302,760 bytes)
ghci> foldl (+) 0 [1 .. 1000000]
500000500000
(0.25 secs, 161,299,688 bytes)
ghci> foldl (+) 0 [1 .. 1000000]
500000500000
(0.23 secs, 161,299,688 bytes)
ghci> foldl (+) 0 [1 .. 1000000]
500000500000
(0.23 secs, 161,299,688 bytes)
ghci> foldl (+) 0 [1 .. 1000000]
500000500000
(0.25 secs, 161,299,688 bytes)
ghci> foldl (+) 0 [1 .. 1000000]
500000500000
(0.25 secs, 161,299,688 bytes)
ghci> foldl (+) 0 [1 .. 1000000]
500000500000
(0.25 secs, 161,299,688 bytes)
ghci> foldl (+) 0 [1 .. 1000000]
500000500000
(0.23 secs, 161,299,688 bytes)
ghci> foldl (+) 0 [1 .. 1000000]
500000500000
(0.23 secs, 161,299,688 bytes)
ghci> foldl (+) 0 [1 .. 1000000]
500000500000
(0.23 secs, 161,299,688 bytes)
The GHCi command :set +s
would print some basic statistics about a
function call. For the purposes of this example, we are only interested in the
amount of time elapsed from the start of a function call to its termination. In
the above GHCi session, we call foldl
ten times to sum all integers between 1
and 1,000,000. Each call has an average elapsed time of approximately 0.249
seconds.
Now let’s see foldr
in action:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
ghci> :set +s
ghci> foldr (+) 0 [1 .. 1000000]
500000500000
(0.22 secs, 161,593,088 bytes)
ghci> foldr (+) 0 [1 .. 1000000]
500000500000
(0.17 secs, 161,590,016 bytes)
ghci> foldr (+) 0 [1 .. 1000000]
500000500000
(0.17 secs, 161,590,016 bytes)
ghci> foldr (+) 0 [1 .. 1000000]
500000500000
(0.18 secs, 161,590,016 bytes)
ghci> foldr (+) 0 [1 .. 1000000]
500000500000
(0.17 secs, 161,590,016 bytes)
ghci> foldr (+) 0 [1 .. 1000000]
500000500000
(0.16 secs, 161,590,016 bytes)
ghci> foldr (+) 0 [1 .. 1000000]
500000500000
(0.18 secs, 161,590,016 bytes)
ghci> foldr (+) 0 [1 .. 1000000]
500000500000
(0.16 secs, 161,590,016 bytes)
ghci> foldr (+) 0 [1 .. 1000000]
500000500000
(0.18 secs, 161,590,016 bytes)
ghci> foldr (+) 0 [1 .. 1000000]
500000500000
(0.15 secs, 161,590,016 bytes)
ghci>
The average elapsed time is approximately 0.174 seconds. As we can see from the
above two sets of GHCi sessions, foldr
is more time efficient than foldl
in
the case of addition.
We know that there is a fold from the left (foldl
) and a fold from
the right (foldr
). But did you know there is another version of
foldl
? It is called foldl'
, notice the single quotation mark
'
. The function foldl'
is defined in the package GHC.List
and
is meant to be an efficient implementation of foldl
.
Why another implementation of fold? The reason is due to associativity again.
Suppose we have a binary operator $f$ that is not associative, i.e. the order of
operation matters. Further assume that foldl
(the left fold) produces the
result we want, whereas foldr
(the right fold) would result in an erroneous
answer. Then foldl'
should usually be preferred to foldl
because the former
is more time efficient than the latter.
Consider the example of subtraction. Remember, subtraction is not associative,
unlike addition. The GHCi session below uses foldl
ten times, yielding an
average elapsed time of approximately 0.221 seconds.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
ghci> :set +s
ghci> foldl (-) 0 [1 .. 1000000]
-500000500000
(0.26 secs, 161,303,552 bytes)
ghci> foldl (-) 0 [1 .. 1000000]
-500000500000
(0.21 secs, 161,300,480 bytes)
ghci> foldl (-) 0 [1 .. 1000000]
-500000500000
(0.23 secs, 161,300,480 bytes)
ghci> foldl (-) 0 [1 .. 1000000]
-500000500000
(0.19 secs, 161,300,480 bytes)
ghci> foldl (-) 0 [1 .. 1000000]
-500000500000
(0.23 secs, 161,300,480 bytes)
ghci> foldl (-) 0 [1 .. 1000000]
-500000500000
(0.21 secs, 161,300,480 bytes)
ghci> foldl (-) 0 [1 .. 1000000]
-500000500000
(0.24 secs, 161,300,480 bytes)
ghci> foldl (-) 0 [1 .. 1000000]
-500000500000
(0.21 secs, 161,300,480 bytes)
ghci> foldl (-) 0 [1 .. 1000000]
-500000500000
(0.23 secs, 161,300,480 bytes)
ghci> foldl (-) 0 [1 .. 1000000]
-500000500000
(0.20 secs, 161,300,480 bytes)
Here’s the GHCi session for foldl'
. Note that we must import the package
GHC.List
because foldl'
is defined in that package. The average elapsed time
of ten calls is about 0.062 seconds.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
ghci> import GHC.List
ghci> :set +s
ghci> foldl' (-) 0 [1 .. 1000000]
-500000500000
(0.05 secs, 88,074,640 bytes)
ghci> foldl' (-) 0 [1 .. 1000000]
-500000500000
(0.08 secs, 88,073,720 bytes)
ghci> foldl' (-) 0 [1 .. 1000000]
-500000500000
(0.03 secs, 88,073,728 bytes)
ghci> foldl' (-) 0 [1 .. 1000000]
-500000500000
(0.02 secs, 88,073,728 bytes)
ghci> foldl' (-) 0 [1 .. 1000000]
-500000500000
(0.06 secs, 88,073,728 bytes)
ghci> foldl' (-) 0 [1 .. 1000000]
-500000500000
(0.07 secs, 88,073,736 bytes)
ghci> foldl' (-) 0 [1 .. 1000000]
-500000500000
(0.08 secs, 88,073,736 bytes)
ghci> foldl' (-) 0 [1 .. 1000000]
-500000500000
(0.08 secs, 88,073,736 bytes)
ghci> foldl' (-) 0 [1 .. 1000000]
-500000500000
(0.08 secs, 88,073,736 bytes)
ghci> foldl' (-) 0 [1 .. 1000000]
-500000500000
(0.07 secs, 88,073,736 bytes)
When you must use a left fold, foldl'
would usually be a better choice than
foldl
.
Zapp: Wait. What are you people? Idiots? I’m still going mano a mano with this envelope. And the winner is: Miss Vega 4. There it is, Miss Universe. There it is, looking weird.
— Futurama, season 2, episode 6, 2000
The universal property of fold refers to the fact that many recursive definitions of functions can be equivalently expressed as a fold operation.2 Suppose we have a binary operator $f$ and we define a function $g$ as follows:
1
2
g [] = v
g (x:xs) = f x (g xs)
where v
is the base case for $g$. The universal property of fold says that the
above recursive definition is equivalent to the following fold operation:
1
g = foldr f v
What does “universal property” mean? We won’t delve too deep into the notion. Suffice it to say that, in mathematics, the idea of universal property means structural equivalence. Mathematics gives us a number of abstract rules by which we can determine whether two constructions are fundamentally the same. Suppose we can construct an object $A$ via one technique and we can construct an object $B$ via another technique. If we can show that both $A$ and $B$ satisfy the same universal property, i.e. the same fundamental set of rules, then we may conclude that $A$ and $B$ are equivalent. The implication is immediate. Suppose the construction of $A$ happens to be long or takes a while to complete, but the construction of $B$ is short and efficient. If $A$ and $B$ are structurally equivalent, then we can replace $A$ by $B$. The fold operation happens to be a short and sweet way to define many recursive functions.
Let’s see how the universal property of fold can help us write an expression to calculate the length of a list. Recall from the section Length of list that we have the following recursive definition:
:include: file=”assets/src/recurse/length.hs”, line=27:30
To convert the above recursive definition to a fold operation, we require two ingredients:
(\_ acc -> 1 + acc)
. The wildcard character _
means that we ignore any
list element passed to the lambda expression. We replace each list element by
the value 1
.Combine the above two ingredients to obtain the following expression:
:include: file=”assets/src/recurse/length.hs”, line=32:33
In the section No loop for you, we presented a
homemade implementation of the library function map
. Here’s the code
for reference:
:include: file=”assets/src/recurse/map.hs”, line=28:31
The initial value is the empty list []
. The binary operator is the function
f x acc
, which takes an element x
of the input list, calculates the result
f x
, and prepends the result to the accumulator acc
. Writing a successive
sequence of prepending as
1
2
3
4
ghci> 1 : 2 : 3 : []
[1,2,3]
ghci> 1 : (2 : (3 : []))
[1,2,3]
it is clear that we can use a fold right. Here’s an implementation of map
by
means of foldr
:
:include: file=”assets/src/recurse/map.hs”, line=33:35
:exercise:
Why do foldl
and foldr
yield different results in the GHCi session below?
1
2
3
4
ghci> foldl (^) 2 [1, 2, 3]
64
ghci> foldr (^) 2 [1, 2, 3]
1
What can be concluded about the operator ^
?
:exercise:
In the GHCi session below, why does foldr
result in the list [1,2,3]
but
foldl
results in an error? What can be concluded about the operator :
?
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
ghci> foldr (:) [] [1, 2, 3]
[1,2,3]
ghci> foldl (:) [] [1, 2, 3]
<interactive>:2:7: error:
* Couldn't match type 'a' with '[a]'
Expected: [a] -> [[a]] -> [a]
Actual: [a] -> [[a]] -> [[a]]
'a' is a rigid type variable bound by
the inferred type of it :: [a]
at <interactive>:2:1-22
* In the first argument of 'foldl', namely '(:)'
In the expression: foldl (:) [] [1, 2, 3]
In an equation for 'it': it = foldl (:) [] [1, 2, 3]
* Relevant bindings include it :: [a] (bound at <interactive>:2:1)
:exercise:
Is the division operator /
associative? Why or why not? Provide an example
using foldl
and foldr
.
:exercise: Repeat the factorial exercise, but use a fold operation.
:exercise:
Read more about foldl
, foldl'
, and foldr
.
:exercise: The arithmetic mean (or average) $a_n$ of a sequence $x_1,\, x_2,\, \dots,\, x_n$ of $n$ numbers is defined as:
\[\begin{equation} \label{eqn:arithmeticMean} a_n = \frac{1}{n} \sum_{i=1}^n x_i. \end{equation}\]Use a fold operation to implement expression (\ref{eqn:arithmeticMean}). Assume that each $x_i$ is an integer.
:exercise:
Use a fold operation to implement the function last
.
:exercise:
Write a function join
that takes two parameters: (1) a delimiter string
delim
; and (2) a list $\ell$ of numbers. The function join
outputs the
elements of $\ell$ as a string, where the elements are separated by the given
delimiter delim
. For example, the code join "," [1, 2, 3]
should result in
the string "1,2,3"
. Implement join
with and without a fold operation.
:exercise:
Use a fold operation to implement your own version of the function
reverse
.
:exercise: Repeat the exercise on dot product, but use a fold operation. Assume your lists are finite, but not necessarily of two elements.
Some programming languages use “reduce” to describe the fold operation. Examples include JavaScript, Python, and Ruby. ↩
See the following paper for more details: Graham Hutton. A tutorial on the universality and expressiveness of fold. Journal of Functional Programming, volume 9, issue 4, 1999, pp.355–372, doi: 10.1017/S0956796899003500 ↩