You're not on the list

Bouncer: Are you on the list?
The Mask: Nooo… but I believe my friends are. Perhaps you know them. Franklin, Grant, and Jackson.
The Mask, 1994

In the section The string and I, you learnt that a string can be understood as a list of characters. You will come across lists again and again as you delve deeper into Haskell. This section will discuss lists and various operations to perform on them. Take the time to familiarize yourself with lists. You can hug them and kiss them, but most important of all learn how to use them.

How, or construction

Bob the Builder!
Can we fix it?
Bob the Builder!
Yes, we can!
Bob the Builder, theme song

How to build a list? Use square brackets like so:

1
2
ghci> []
[]

The list [] is empty. Nothing exciting. What we want are non-empty lists like so:

1
2
3
4
5
6
7
8
ghci> [1, 2, 3]
[1,2,3]
ghci> [True, False, True]
[True,False,True]
ghci> ["a", "b", "c"]
["a","b","c"]
ghci> ['a', 'b', 'c']
"abc"

Notice that the list ['a', 'b', 'c'] is represented as the string "abc". This is why we say that a string can be considered as a list of characters.

All the lists above have literal values as elements. Their elements are pre-defined. What we want is a way to construct lists and add elements to them as required. That is a job for the cons operator :. The cons operator constructs a list by prepending one element to a given list. We can build a list of one element like so [1], a list of two elements as [1, 2], and a list of three elements as [1, 2, 3]. The cons operator constructs the same lists as follows:

1
2
3
4
5
6
7
8
9
10
ghci> 1 : []
[1]
ghci> 1 : (2 : [])
[1,2]
ghci> 1 : 2 : []
[1,2]
ghci> 1 : (2 : (3 : []))
[1,2,3]
ghci> 1 : 2 : 3 : []
[1,2,3]

Another detail to note about all the above lists is the type of the elements. The empty list [] has no elements; it does not make sense to talk about the type of its elements. Other lists shown above are non-empty. Here is the crucial point: all elements of each list have the same type. The list [1, 2] is valid because all elements have the same type. Haskell will throw a tantrum at a list such as [1, '2'] because you are mixing different types.

Who, or membership

Boy: Hey Billy! Hey Joey! Come on in. There’s plenty of room. Sorry, not you, Homer.
Homer: Why not!?
Boy: [Points to the sign, “No Homers Club”.]
Homer: But you let in Homer Glumplich!
Homer Glumplich: Hyuck hyuck!
Boy: It says no Homers. We’re allowed to have one.
The Simpsons, season 6, episode 12, 1995

Everyone is lining up to enter the hip replacement joint, club Coco Hasko. Burly the Bouncer has a list of VIPs and checks it once. When a patron attempts to enter the club, Burly uses the function elem to query whether the patron’s name is on his list. He might suspect that a patron is not on the list and would use the function notElem to confirm his suspicion. A typical nights goes something like this:

1
2
3
4
5
6
7
8
9
10
11
ghci> vip = ["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> elem "Anne" vip
False
ghci> elem "Fiona" vip
True
ghci> "Fiona" `elem` vip
True
ghci> notElem "Danielle" vip
True
ghci> "Daniel" `notElem` vip
False

The functions elem and notElem are usually applied in prefix notation. However, you can delimit the functions within backticks to apply the functions in infix notation.

What, or basic properties

It’s a slow night for Burly the Bouncer. The queue is long, but not everyone can enter the Coco Hasko. Burly uses the time to memorize his list of VIPs. Early in the evening when the club was making preparation to open for the night, Burly’s list of VIPs was empty because his boss was still trying to confirm the VIPs who would attend the club. The function null would output True when given his early, empty list. He now knows that the list is non-empty so the function null would output False when given his current list. Since the VIP list is non-empty, the next logical step is to find out how many people are on the list. Burly uses the function length for this purpose. Here is how Burly checks some basic properties of his list of VIPs.

1
2
3
4
5
6
7
ghci> null []
True
ghci> vip = ["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> null vip
False
ghci> length vip
6

Burly knows that the VIP list is ordered from most important to least important. Haskell uses non-negative integers to index lists. The first element of a list has index 0, the second has index 1, and so on. The operator !! is usually applied in the infix notation. Given a list and an index, the operator !! outputs the element in the list at the specified index. Burly wants to confirm the most important and least important VIPs. Here goes:

1
2
3
4
5
6
7
8
9
10
11
ghci> vip = ["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> vip !! 0
"Anna"
ghci> (!!) vip 0
"Anna"
ghci> vip !! (length vip)
"*** Exception: Prelude.!!: index too large
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/List.hs:1368:14 in base:GHC.List
  tooLarge, called at libraries/base/GHC/List.hs:1378:50 in base:GHC.List
  !!, called at <interactive>:4:5 in interactive:Ghci4

The most important VIP is Anna, who occupies the position at index 0. Like other operators, e.g. + and -, that are usually applied in infix notation, the operator !! can be used in prefix notation by enclosing it within parentheses.

Burly knows that the code length vip outputs the number of elements in his list. But why would Haskell throw a tantrum at the code vip !! (length vip) when Burly attempts to find out the least important VIP? The reason is that Haskell indexes a list starting from index 0. If $n$ is the number of elements in the list, the last element has index $n - 1$. Burly must subtract one from the length of his list.

1
2
3
ghci> vip = ["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> vip !! ((length vip) - 1)
"Fiona"

Haskell is happy now.

Which, or components

Which witch is the witchiest of all?

Some of the VIP guests have arrived. Burly needs to check his list and update them accordingly. Anna is the first to arrive. Burly already knows that Anna is at the top (i.e. the first element) of the list. Burly knows that the element at index 0 can be accessed as vip !! 0. The function head provides an alternative way to access the first element of the VIP list. If Anna is crossed off the list of VIPs, Burly can use the function tail to access all but the first element of his list, effectively removing Anna from the list.

1
2
3
4
5
6
7
ghci> vip = ["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> vip !! 0
"Anna"
ghci> head vip
"Anna"
ghci> tail vip
["Bianca","Cary","Daniel","Ethan","Fiona"]

Fiona is the next VIP to arrive at club Coco Hasko. As she is at the bottom (i.e. the last element) of the list, her index in the updated list is the length of the list minus one. Another way for Burly to access the last element of his list is to use the function last. Burly crosses Fiona off his VIP list and then uses the function init to obtain all elements of the list except for the last element, thus removing Fiona from the list.

1
2
3
4
5
6
7
ghci> vip = ["Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> vip !! ((length vip) - 1)
"Fiona"
ghci> last vip
"Fiona"
ghci> init vip
["Bianca","Cary","Daniel","Ethan"]

Together, or join

Rufus: You want to be a public nuisance?
Chicolini: Sure! How much does the job pay?
Rufus: I got a good mind to join a club and beat you over the head with it.
Duck Soup, 1933

Burly the Bouncer has deleted two guests from his VIP list. He is breathing a sigh of relief when his boss approaches him. The boss said he is expecting another three VIP guests tonight: Gwen, Harry, Ira. Burly uses the operator ++ to append the new list to his own list. Another way to update his list would be for Burly to have a list of two elements: the first element is his own list, the second element being the list of further guests to welcome. Burly can then use the function concat to concatenate all elements together into a single list. Here is Burly’s code:

1
2
3
4
5
6
ghci> vip = ["Bianca","Cary","Daniel","Ethan"]
ghci> more = ["Gwen", "Harry", "Ira"]
ghci> vip ++ more
["Bianca","Cary","Daniel","Ethan","Gwen","Harry","Ira"]
ghci> concat [vip, more]
["Bianca","Cary","Daniel","Ethan","Gwen","Harry","Ira"]

Apart, or split

Juliet: Good night, good night. Parting is such sweet sorrow
That I shall say “Good night” till it be morrow.
— William Shakespeare, Romeo and Juliet, act 2, scene 2

Joining lists together all the time is no fun. Let’s play anti-Cupid and break up lists.

I’ll take these

Instead of the function init, Burly could also have used the function take to keep all elements of a list except for the last element. The function take accepts two parameters:

  1. The number of elements to keep, going from left to right in a list.
  2. The list to process.

To remove the last element of a list, Burly would retain all but the last element. The integer parameter of take would be the length of the list minus one. Here is a comparison between init and take:

1
2
3
4
5
ghci> vip = ["Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> init vip
["Bianca","Cary","Daniel","Ethan"]
ghci> take ((length vip) - 1) vip
["Bianca","Cary","Daniel","Ethan"]

Unlike the function init which removes the last element of a list, the function take can remove multiple elements from the right side of a list. Consider the following scenario. The state of Burly’s VIP list is:

1
["Bianca","Cary","Daniel","Ethan","Gwen","Harry","Ira"]

Harry and Ira have arrived. They are waiting to get their names checked off Burly’s list and be allowed inside club Coco Hasko. Burly uses the function elem to ensure that each of the two patrons are indeed on his list of VIP guests. Yep, these two check out OK. Burly now wants to use the function take to remove the above two VIPs from his list. The list currently has seven names. Removing two names is equivalent to retaining the first five names. Here is Burly’s implementation and double check.

1
2
3
4
5
ghci> vip = ["Bianca","Cary","Daniel","Ethan","Gwen","Harry","Ira"]
ghci> take ((length vip) - 2) vip
["Bianca","Cary","Daniel","Ethan","Gwen"]
ghci> take 5 vip
["Bianca","Cary","Daniel","Ethan","Gwen"]

Drop it

Whereas take removes elements from the right side of a list, the function drop removes elements from the left side of a list. The function drop takes two parameters:

  1. The number of elements to remove from the left side of a list.
  2. The list to process.

The function tail is a special case of drop because tail removes the leftmost element. Here is a repeat of the tail operation from the section Which, or components, but compared with drop.

1
2
3
4
5
ghci> vip = ["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]
ghci> tail vip
["Bianca","Cary","Daniel","Ethan","Fiona"]
ghci> drop 1 vip
["Bianca","Cary","Daniel","Ethan","Fiona"]

As noted above, drop is more powerful than tail because the former can remove any number of elements from the left side of a list. Consider the next scenario. The state of Burly’s list is now:

1
["Bianca","Cary","Daniel","Ethan","Gwen"]

Bianca and Cary have arrived and waiting for Burly to allow them admittance into club Coco Hasko. Burly checks off the two patrons’ names on his VIP list. To update his list, Burly needs to drop the first two names from the list. Observe:

1
2
3
ghci> vip = ["Bianca","Cary","Daniel","Ethan","Gwen"]
ghci> drop 2 vip
["Daniel","Ethan","Gwen"]

JCVD epic split

The function splitAt is a general-purpose facility for separating a list into two segments. Whereas take and drop are generalizations of init and tail, respectively, splitAt generalizes all of the above four functions. In fact, splitAt can also be used to simulate the functionalities of head and last. The parameters of splitAt are:

  1. The number $n$ of elements that form the first segment, going from left to right. This is the split boundary.
  2. The list $\ell$ to process.

The function splitAt outputs a tuple containing two elements:

  • The first element is a sublist of the first $n$ elements of $\ell$. This sublist can be obtained by take n ell. Let’s call the sublist the left segment of $\ell$.
  • The second element of the tuple is a sublist of the remaining elements of $\ell$, excluding the first $n$ elements. This sublist can obtained by drop n ell. Refer to the sublist as the right segment of $\ell$.

Let’s see splitAt in action.

1
2
3
4
5
6
7
8
9
ghci> ell = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ghci> take 4 ell
[1,2,3,4]
ghci> drop 4 ell
[5,6,7,8,9]
ghci> (take 4 ell, drop 4 ell)
([1,2,3,4],[5,6,7,8,9])
ghci> splitAt 4 ell
([1,2,3,4],[5,6,7,8,9])

The above GHCi session demonstrates that splitAt can simulate the functionalities of take and drop. How do you extract the respective sublists? As noted in the section Splitting headache, you can use the functions fst and snd to extract the first (or left) and second (or right) elements, respectively, of the tuple output by splitAt. On the other hand, you can destructure the tuple. Refer to the following GHCi session.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
ghci> ell = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ghci> (left, right) = splitAt 4 ell
ghci> take 4 ell
[1,2,3,4]
ghci> fst $ splitAt 4 ell
[1,2,3,4]
ghci> left
[1,2,3,4]
ghci> drop 4 ell
[5,6,7,8,9]
ghci> snd $ splitAt 4 ell
[5,6,7,8,9]
ghci> right
[5,6,7,8,9]

It should make sense now how splitAt can be used to simulate the functionalities of head and last. The function head extracts the first element of a list. This is equivalent to splitting the list at the first element, then retain the left element of the resulting tuple. The function last extracts the last element of a list. You can do the same thing by splitting the list at the penultimate element, then retain the right element of the tuple. The above description bears some resemblance to how you use splitAt to achieve the same functionalities as take and drop. Observe:

1
2
3
4
5
6
7
8
9
ghci> ell = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ghci> head ell
1
ghci> (fst $ splitAt 1 ell) !! 0
1
ghci> last ell
9
ghci> (snd $ splitAt ((length ell) - 1) ell) !! 0
9

Not exactly elegant, but splitAt gets the job done nonetheless. On second thought, it is more elegant and faster to use head and last than endure the rigmarole of splitAt and fst (or snd) and indexing.

Exercises

Exercise 1. Is each list construction below valid? Why or why not?

  • 'a' : "b"
  • 'a' : 'b'
  • [1] : [2]
  • [1]: [[2]]

Exercise 2. The function reverse takes a list and reverses its elements. Use reverse to simulate the functionalities of head, last, init, and tail.

Exercise 3. The function and takes a list of boolean values and outputs whether all of those values are True. Similarly, the function or takes a list of boolean values and outputs whether one of the values is True. Use and and or to simulate the operators && and ||, respectively.

Exercise 4. Use reverse to simulate the functionalities of take and drop.

Exercise 5. Use splitAt to simulate the functionalities of head and last.

Exercise 6. Use splitAt to remove the middle element of the list [1, 2, 3, 4, 5]. Repeat the exercise, but without using splitAt.

Exercise 7. Access the element in the middle of the list:

1
["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona"]

Exercise 8. The range notation .. allows you to create a list of integers. Two useful ways to create lists of numbers are:

  1. The format [first..last] creates a list of integers with the first number being first and the last being last. The numbers in between are integers greater than first, but less than last. Successive integers are increased by a step size of one.
  2. The format [first, second..last] is more general than the previous format. It allows you to create a list integers where the step size is the difference between first and second.

Use the first format to create a list of integers between one and ten, inclusive. Repeat the exercise but use the second format. Use the second format to create a list of odd integers between zero and 20, inclusive. Repeat the exercise, but create even integers.

Exercise 9. The function zip takes two lists $A$ and $B$ and outputs a list of pairs in corresponding positions. The first element of $A$ is paired with the first element of $B$, the second element of $A$ is paired with the second element of $B$, and so on. Given the list of VIPs:

1
["Anna", "Bianca", "Cary", "Daniel", "Ethan", "Fiona", "Gwen", "Harry"]

assign an ID to each person. Then use zip to construct a list of name/ID pairs.

Exercise 10. The function unzip takes a list of ordered pairs and deconstruct them into a tuple of two lists: a left list and a right list. The left list consists of the first elements of each pair. The right list has all the second elements of each pair. Use unzip to deconstruct the list

1
[(1,"a"), (2,"b"), (3,"c"), (4,"d")]

into a tuple of left and right lists. Apply concat to the right list, then zip the left list with the result. What do you notice?