Bouncer: Are you on the list?
The Mask: Nooo… but I believe my friends are. Perhaps you know them. Franklin, Grant, and Jackson.
— The Mask, 1994
In the section The string and I, you learnt that a string can be understood as a list of characters. You will come across lists again and again as you delve deeper into Haskell. This section will discuss lists and various operations to perform on them. Take the time to familiarize yourself with lists. You can hug them and kiss them, but most important of all learn how to use them.
Bob the Builder!
Can we fix it?
Bob the Builder!
Yes, we can!
— Bob the Builder, theme song
How to build a list? Use square brackets like so:
1
2
ghci> []
[]
The list []
is empty. Nothing exciting. What we want are non-empty lists like
so:
1
2
3
4
5
6
7
8
ghci> [1, 2, 3]
[1,2,3]
ghci> [True, False, True]
[True,False,True]
ghci> ["a", "b", "c"]
["a","b","c"]
ghci> ['a', 'b', 'c']
"abc"
Notice that the list ['a', 'b', 'c']
is represented as the string "abc"
.
This is why we say that a string can be considered as a list of characters.
All the lists above have literal values as elements. Their elements are
pre-defined. What we want is a way to construct lists and add elements to them
as required. That is a job for the cons operator :
. The cons operator
constructs a list by prepending one element to a given list. We can build a
list of one element like so [1]
, a list of two elements as [1, 2]
, and a
list of three elements as [1, 2, 3]
. The cons operator constructs the same
lists as follows:
1
2
3
4
5
6
7
8
9
10
ghci> 1 : []
[1]
ghci> 1 : (2 : [])
[1,2]
ghci> 1 : 2 : []
[1,2]
ghci> 1 : (2 : (3 : []))
[1,2,3]
ghci> 1 : 2 : 3 : []
[1,2,3]
Another detail to note about all the above lists is the type of the elements.
The empty list []
has no elements; it does not make sense to talk about the
type of its elements. Other lists shown above are non-empty. Here is the crucial
point: all elements of each list have the same type. The list [1, 2]
is valid
because all elements have the same type. Haskell will throw a tantrum at a list
such as [1, '2']
because you are mixing different types.
Boy: Hey Billy! Hey Joey! Come on in. There’s plenty of room. Sorry, not you, Homer.
Homer: Why not!?
Boy: [Points to the sign, “No Homers Club”.]
Homer: But you let in Homer Glumplich!
Homer Glumplich: Hyuck hyuck!
Boy: It says no Homers. We’re allowed to have one.
— The Simpsons, season 6, episode 12, 1995
Everyone is lining up to enter the hip replacement joint, club Coco Hasko. Burly
the Bouncer has a list of VIPs and checks it once. When a patron attempts to
enter the club, Burly uses the function elem
to query whether the
patron’s name is on his list. He might suspect that a patron is not on the list
and would use the function notElem
to confirm his suspicion. A
typical nights goes something like this:
1
2
3
4
5
6
7
8
9
10
11
ghci> vip = ["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> elem "Anne" vip
False
ghci> elem "Fiona" vip
True
ghci> "Fiona" `elem` vip
True
ghci> notElem "Danielle" vip
True
ghci> "Daniel" `notElem` vip
False
The functions elem
and notElem
are usually applied in prefix notation.
However, you can delimit the functions within backticks to apply the functions
in infix notation.
It’s a slow night for Burly the Bouncer. The queue is long, but not everyone can
enter the Coco Hasko. Burly uses the time to memorize his list of VIPs. Early in
the evening when the club was making preparation to open for the night, Burly’s
list of VIPs was empty because his boss was still trying to confirm the VIPs who
would attend the club. The function null
would output True
when
given his early, empty list. He now knows that the list is non-empty so the
function null
would output False
when given his current list. Since the VIP
list is non-empty, the next logical step is to find out how many people are on
the list. Burly uses the function length
for this purpose. Here is
how Burly checks some basic properties of his list of VIPs.
1
2
3
4
5
6
7
ghci> null []
True
ghci> vip = ["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> null vip
False
ghci> length vip
6
Burly knows that the VIP list is ordered from most important to least important.
Haskell uses non-negative integers to index lists. The first element of a list
has index 0, the second has index 1, and so on. The operator
!!
is usually applied in the infix notation. Given a list
and an index, the operator !!
outputs the element in the list at the specified
index. Burly wants to confirm the most important and least important VIPs. Here
goes:
1
2
3
4
5
6
7
8
9
10
11
ghci> vip = ["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> vip !! 0
"Anna"
ghci> (!!) vip 0
"Anna"
ghci> vip !! (length vip)
"*** Exception: Prelude.!!: index too large
CallStack (from HasCallStack):
error, called at libraries/base/GHC/List.hs:1368:14 in base:GHC.List
tooLarge, called at libraries/base/GHC/List.hs:1378:50 in base:GHC.List
!!, called at <interactive>:4:5 in interactive:Ghci4
The most important VIP is Anna, who occupies the position at index 0. Like other
operators, e.g. +
and -
, that are usually applied in infix notation, the
operator !!
can be used in prefix notation by enclosing it within parentheses.
Burly knows that the code length vip
outputs the number of elements in his
list. But why would Haskell throw a tantrum at the code vip !! (length vip)
when Burly attempts to find out the least important VIP? The reason is that
Haskell indexes a list starting from index 0. If $n$ is the number of elements
in the list, the last element has index $n - 1$. Burly must subtract one from
the length of his list.
1
2
3
ghci> vip = ["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> vip !! ((length vip) - 1)
"Fiona"
Haskell is happy now.
Which witch is the witchiest of all?
Some of the VIP guests have arrived. Burly needs to check his list and update
them accordingly. Anna is the first to arrive. Burly already knows that Anna is
at the top (i.e. the first element) of the list. Burly knows that the element at
index 0 can be accessed as vip !! 0
. The function head
provides an
alternative way to access the first element of the VIP list. If Anna is crossed
off the list of VIPs, Burly can use the function tail
to access all
but the first element of his list, effectively removing Anna from the list.
1
2
3
4
5
6
7
ghci> vip = ["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> vip !! 0
"Anna"
ghci> head vip
"Anna"
ghci> tail vip
["Bianca","Cary","Daniel","Ethan","Fiona"]
Fiona is the next VIP to arrive at club Coco Hasko. As she is at the bottom
(i.e. the last element) of the list, her index in the updated list is the length
of the list minus one. Another way for Burly to access the last element of his
list is to use the function last
. Burly crosses Fiona off his VIP list
and then uses the function init
to obtain all elements of the list
except for the last element, thus removing Fiona from the list.
1
2
3
4
5
6
7
ghci> vip = ["Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> vip !! ((length vip) - 1)
"Fiona"
ghci> last vip
"Fiona"
ghci> init vip
["Bianca","Cary","Daniel","Ethan"]
Rufus: You want to be a public nuisance?
Chicolini: Sure! How much does the job pay?
Rufus: I got a good mind to join a club and beat you over the head with it.
— Duck Soup, 1933
Burly the Bouncer has deleted two guests from his VIP list. He is breathing a
sigh of relief when his boss approaches him. The boss said he is expecting
another three VIP guests tonight: Gwen, Harry, Ira. Burly uses the operator
++
to append the new list to his own list. Another way to update
his list would be for Burly to have a list of two elements: the first element is
his own list, the second element being the list of further guests to welcome.
Burly can then use the function concat
to concatenate all elements
together into a single list. Here is Burly’s code:
1
2
3
4
5
6
ghci> vip = ["Bianca","Cary","Daniel","Ethan"]
ghci> more = ["Gwen", "Harry", "Ira"]
ghci> vip ++ more
["Bianca","Cary","Daniel","Ethan","Gwen","Harry","Ira"]
ghci> concat [vip, more]
["Bianca","Cary","Daniel","Ethan","Gwen","Harry","Ira"]
Juliet: Good night, good night. Parting is such sweet sorrow
That I shall say “Good night” till it be morrow.
— William Shakespeare, Romeo and Juliet, act 2, scene 2
Joining lists together all the time is no fun. Let’s play anti-Cupid and break up lists.
Instead of the function init
, Burly could also have used the function
take
to keep all elements of a list except for the last element. The
function take
accepts two parameters:
To remove the last element of a list, Burly would retain all but the last
element. The integer parameter of take
would be the length of the list minus
one. Here is a comparison between init
and take
:
1
2
3
4
5
ghci> vip = ["Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> init vip
["Bianca","Cary","Daniel","Ethan"]
ghci> take ((length vip) - 1) vip
["Bianca","Cary","Daniel","Ethan"]
Unlike the function init
which removes the last element of a list, the
function take
can remove multiple elements from the right side of a list.
Consider the following scenario. The state of Burly’s VIP list is:
1
["Bianca","Cary","Daniel","Ethan","Gwen","Harry","Ira"]
Harry and Ira have arrived. They are waiting to get their names checked off
Burly’s list and be allowed inside club Coco Hasko. Burly uses the function
elem
to ensure that each of the two patrons are indeed on his list of VIP
guests. Yep, these two check out OK. Burly now wants to use the function take
to remove the above two VIPs from his list. The list currently has seven names.
Removing two names is equivalent to retaining the first five names. Here is
Burly’s implementation and double check.
1
2
3
4
5
ghci> vip = ["Bianca","Cary","Daniel","Ethan","Gwen","Harry","Ira"]
ghci> take ((length vip) - 2) vip
["Bianca","Cary","Daniel","Ethan","Gwen"]
ghci> take 5 vip
["Bianca","Cary","Daniel","Ethan","Gwen"]
Whereas take
removes elements from the right side of a list, the function
drop
removes elements from the left side of a list. The function
drop
takes two parameters:
The function tail
is a special case of drop
because tail
removes the
leftmost element. Here is a repeat of the tail
operation from the section
Which, or components, but compared with drop
.
1
2
3
4
5
ghci> vip = ["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> tail vip
["Bianca","Cary","Daniel","Ethan","Fiona"]
ghci> drop 1 vip
["Bianca","Cary","Daniel","Ethan","Fiona"]
As noted above, drop
is more powerful than tail
because the former can
remove any number of elements from the left side of a list. Consider the next
scenario. The state of Burly’s list is now:
1
["Bianca","Cary","Daniel","Ethan","Gwen"]
Bianca and Cary have arrived and waiting for Burly to allow them admittance into club Coco Hasko. Burly checks off the two patrons’ names on his VIP list. To update his list, Burly needs to drop the first two names from the list. Observe:
1
2
3
ghci> vip = ["Bianca","Cary","Daniel","Ethan","Gwen"]
ghci> drop 2 vip
["Daniel","Ethan","Gwen"]
The function splitAt
is a general-purpose facility for separating a
list into two segments. Whereas take
and drop
are
generalizations of init
and tail
, respectively, splitAt
generalizes all of the above four functions. In fact, splitAt
can also be used
to simulate the functionalities of head
and last
. The
parameters of splitAt
are:
The function splitAt
outputs a tuple containing two elements:
take n ell
. Let’s call the sublist the left
segment of $\ell$.drop n ell
. Refer to the sublist as the right segment of $\ell$.Let’s see splitAt
in action.
1
2
3
4
5
6
7
8
9
ghci> ell = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ghci> take 4 ell
[1,2,3,4]
ghci> drop 4 ell
[5,6,7,8,9]
ghci> (take 4 ell, drop 4 ell)
([1,2,3,4],[5,6,7,8,9])
ghci> splitAt 4 ell
([1,2,3,4],[5,6,7,8,9])
The above GHCi session demonstrates that splitAt
can simulate the
functionalities of take
and drop
. How do you extract the respective
sublists? As noted in the section
Splitting headache, you can use the
functions fst
and snd
to extract the first (or left) and
second (or right) elements, respectively, of the tuple output by splitAt
. On
the other hand, you can destructure the tuple. Refer to the following GHCi
session.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
ghci> ell = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ghci> (left, right) = splitAt 4 ell
ghci> take 4 ell
[1,2,3,4]
ghci> fst $ splitAt 4 ell
[1,2,3,4]
ghci> left
[1,2,3,4]
ghci> drop 4 ell
[5,6,7,8,9]
ghci> snd $ splitAt 4 ell
[5,6,7,8,9]
ghci> right
[5,6,7,8,9]
It should make sense now how splitAt
can be used to simulate the
functionalities of head
and last
. The function head
extracts the first
element of a list. This is equivalent to splitting the list at the first
element, then retain the left element of the resulting tuple. The function
last
extracts the last element of a list. You can do the same thing by
splitting the list at the penultimate element, then retain the right element of
the tuple. The above description bears some resemblance to how you use splitAt
to achieve the same functionalities as take
and drop
. Observe:
1
2
3
4
5
6
7
8
9
ghci> ell = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ghci> head ell
1
ghci> (fst $ splitAt 1 ell) !! 0
1
ghci> last ell
9
ghci> (snd $ splitAt ((length ell) - 1) ell) !! 0
9
Not exactly elegant, but splitAt
gets the job done nonetheless. On second
thought, it is more elegant and faster to use head
and last
than endure the
rigmarole of splitAt
and fst
(or snd
) and indexing.
:exercise: Is each list construction below valid? Why or why not?
'a' : "b"
'a' : 'b'
[1] : [2]
[1]: [[2]]
:exercise:
The function reverse
takes a list and reverses its elements. Use
reverse
to simulate the functionalities of head
, last
, init
, and tail
.
:exercise:
The function and
takes a list of boolean values and outputs whether all
of those values are True
. Similarly, the function or
takes a list of
boolean values and outputs whether one of the values is True
. Use and
and
or
to simulate the operators &&
and ||
, respectively.
:exercise:
Use reverse
to simulate the functionalities of take
and drop
.
:exercise:
Use splitAt
to simulate the functionalities of head
and last
.
:exercise:
Use splitAt
to remove the middle element of the list [1, 2, 3, 4, 5]
. Repeat
the exercise, but without using splitAt
.
:exercise: Access the element in the middle of the list:
1
["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
:exercise: label=”exRangeNotation”
The range notation ..
allows you to create a list of integers. Two useful ways
to create lists of numbers are:
[first..last]
creates a list of integers with the first number
being first
and the last being last
. The numbers in between are integers
greater than first
, but less than last
. Successive integers are increased
by a step size of one.[first, second..last]
is more general than the previous format.
It allows you to create a list integers where the step size is the difference
between first
and second
.Use the first format to create a list of integers between one and ten, inclusive. Repeat the exercise but use the second format. Use the second format to create a list of odd integers between zero and 20, inclusive. Repeat the exercise, but create even integers.
:exercise:
The function zip
takes two lists $A$ and $B$ and outputs a list of
pairs in corresponding positions. The first element of $A$ is paired with the
first element of $B$, the second element of $A$ is paired with the second
element of $B$, and so on. Given the list of VIPs:
1
["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona", "Gwen", "Harry"]
assign an ID to each person. Then use zip
to construct a list of name/ID
pairs.
:exercise:
The function unzip
takes a list of ordered pairs and deconstruct them
into a tuple of two lists: a left list and a right list. The left list consists
of the first elements of each pair. The right list has all the second elements
of each pair. Use unzip
to deconstruct the list
1
[(1,"a"), (2,"b"), (3,"c"), (4,"d")]
into a tuple of left and right lists. Apply concat
to the right list, then
zip
the left list with the result. What do you notice?