Lecture 03

2022-10-03 | Week 2 | by Boyan Ding

Boyan here! This continues from last lecture, and covers the slides 58 through 96 of the Intro to FP deck. As always, please give me feedback on the notes!

Local Bindings
Control Flow in Functions
First class function and Higher order function
- Influencer alert!!
Map, Filter and Reduce
- Map
- Filter

Local Bindings

This time, the majority of topics we cover are so called “syntactic sugar” in Haskell. They can make the programming easier. When we write code, sometime it is helpful to create “temporary variables” to simplify the code. In Haskell, it is achieved with let and where constructs.

The “let” construct

We start with the let construct with the following example.

-- let.hs
-- A function to get your nerdiness level
get_nerd_status gpa study_hrs =
  let
      gpa_part = 1 / (4.01 - gpa)
      study_part = study_hrs * 10
      nerd_score = gpa_part + study_part
  in
      if nerd_score > 100 then
           "You are super nerdy!"
      else "You're a nerd poser."

The let construct consists of two parts:

The first part allows you to define one or more “bindings” which associates a name with an expression.
- For example, the third line of the function binds the name gpa_part to the expression 1 / (4.01 - gpa).
The second part follows the keyword in. It contains an expression where the bindings can be used.

The bindings acts much like temporary variables in other languages albeit being immutable. (But we still refer to them as “variables”)

The “where” construct

The where construct is very similar, except the reversed order. With where, you first write the code then specify the bindings after the where keyword. The equivalent example is:

-- let.hs
-- A function to get your nerdiness level
get_nerd_status gpa study_hrs =
  if nerd_score > 100
     then "You are super nerdy!"
     else "You're a nerd poser."
  where
      gpa_part = 1 / (4.01 - gpa)
      study_part = study_hrs * 10
      nerd_score = gpa_part + study_part

You may ask: when do we use let and when do we use where? The rule of thumb is:

When defining bindings (variables) for a single expression (e.g. the example here), either one is fine
When defining bindings for use across multiple expressions, use where

The second case is mainly for guards and case statements. We haven’t learned them yet but we will see them shortly.

Defining Nested Functions with “let” or “where”

We can also use let and where to define nested functions:

-- nestfunc.hs
-- Function to describe someone's behavior
whats_the_behavior_of name =
 if name == "Carey"
    then behaves_like name "twelve year-old"
    else behaves_like name "grown-up"
 where
   behaves_like n what =
      n ++ " behaves like a " ++ what ++ "!"

This is extremely useful when you want to define some small helper routines, as it can often help deduplicate the code or clairfy the logic with meaningful names.

Note that the nested function has the visibility into all of the enclosing function’s bindings. Thus we remove the argument n from behaves_like and let it directly refer to name in the outer function. The code turns into:

-- nestfunc.hs
-- Function to describe someone's behavior
whats_the_behavior_of name =
 if name == "Carey"
    then behaves_like "twelve year-old"
    else behaves_like "grown-up"
 where
   behaves_like what =
      name ++ " behaves like a " ++ what ++ "!"

Lazy Execution under “let” and “where” Constructs

We have learned in previous lecture about Haskell’s lazy evaluation. It also applies to both let and where we just learned. The rule is:

The binding clauses are not evaluated when they are created
Instead, the expression associated with a given binding is only computed if it is actually used.

-- let.hs
-- A potentially really slow function
potentially_slow_func arg =
  let
      val1 = really_slow_function arg
      val2 = very_fast_function arg
      val3 = pretty_fast_function arg
  in
      if val3 > 100 then val1
                    else val2

Haskell’s behavior when running the code above is:

For the first part of the let clause, Haskell simply associate the three names with their corresponding expressions without calling the functions
When the function body runs, val3 is used for the condition of if. Haskell evaluates the expression associated with val3, calling pretty_fast_function
Only one of the val1 and val2 will be evaluated according to the condition.
- Suppose that pretty_fast_function returns 20. Then only val2 is evaluated, and very_fast_function is called. We will never call really_slow_function in this case.

Well, what if a name is referred more than once? Do Haskell need to evaluate the associated expression every time the name is used?

The answer is no. That’s because the values are immutable, so that they can be stored and reused.

The laziness does poses a challenge in getting input. Because the sequence of input can be affected by the sequence of evaluation. Haskell’s solution is to disallow IO (and other side-effects) to happen in normal code. Try to think about what will happen if the code below is valid in Haskell.

-- Which getLine will run first?!?!
laziness_example arg =
  let
      val1 = getLine
      val2 = getLine
  in
      if arg > 100 then val1 + val2
                   else val2 * val1

Control Flow in Functions

Let’s see the full range of Haskell control flow capabilities!

If-Then-Else

We have seen if before. The general syntax of the if statement (or more precisely, expression) is:

if <expression> then <expression>
                else <expression>

For example:

ageist_greeting age =
  if age > 30 then "Hey boomer!"
              else "Hey fam!"

You may have noticed something… Yes, in Haskell, every if statement must have an else statement. In the function above, if the else clause were omitted, the function would return no value, which is not allowed in Haskell.

Boyan says the precise reason is about type. Every value in Haskell has a determined type, so applies to the result of if statement. Thus, if shall not only yield a value regardless of the condition, but the value of then and else clauses should be of the same type.

Case Construct

Case construct is also straightforward. The general syntax of the case statement is:

case <expression> of
  const1 -> <expression>
  const2 -> <expression>
  ...
  _ -> <expression>

For example:

job_recommender major city =
   case major of
     "CS" -> get_highest_paying_company city
     "EE" -> "Apple"
     _    -> "Domino's"

Here, an underscore in a case means “match everything else”. So this is like “default:” in C++.

Guards

Guard is a unique functional language construct in Haskell. It is like an if-then statement with a more compact syntax by turning

if <condition> then <do-this>

into

  | <condition> = <do-this>

We can define a function as a series of one or more guards, like

somefunc param1 param2
  | <if-x-is-true> = <run-this>
  | <if-y-is-true> = <run-that>
  | <if-z-is-true> = <run-the-other>
  | otherwise = <run-this-otherwise>

A real example in Haskell:

major_guesser salary
  | salary > 150000 = "CS"
  | salary > 120000 = "EE"
  | salary < 30000 = "Any major at USC"
  | otherwise = "Probably Poly-sci"

Note:

When we define a function with guards, we do not use equal sign after the function arguments
- Instead, the equal sign is absorbed into the guards
Each guard begins with a pipe character
Each guard contains two parts, separated by the equal sign
- The first part is a Boolean expression we want to evaluate to see if it is True.
- The second part is the “payload” expression. It will be run when the expression on the left is True.
Haskell evaluates all of the guards from top to bottom, and runs the payload of the first Boolean expression that is true.

In guards, otherwise acts like a wildcard, just like the _ in case construct. In fact, otherwise evaluates to True (you can check it in ghci).

Here we can see some examples that use guards:

-- Factorial using recursion and if/then
fact n =
  if n <= 0 then 1
            else n * fact (n-1)

-- Factorial using guards
fact n
 | n == 0 = 1
 | otherwise = n * fact (n-1)

-- Find sm0l-est value in a list
sm0lest lst
 | lst == [] = error "empty list"
 | length lst == 1 = first
 | otherwise = min first (sm0lest rest)
 where
   first = head lst
   rest = tail lst

Note that in the last example, the values defined in the where clause are used by multple guards.

With what we have learned, we can implement quicksort in eight line of code. The following implementation uses local binding with where, guards and list comprehension we learned last time. The resulting code is clear and expressive, isn’t it?

qsort lst
 | lst == [] = []
 | otherwise = less_eq ++ [pivot] ++ greater
 where
    pivot = head lst
    rest_lst = tail lst
    less_eq = qsort [a | a <- rest_lst, a <= pivot]
    greater = qsort [a | a <- rest_lst, a > pivot]

Pattern Matching

Here we learn another “syntactic sugar” called pattern matching. It is a feature of most functional programming languages which simplifies writing functions that process tuples and lists.

In Haskell, when we use pattern matching:

We define multiple versions of the same function
Each version of the function must have the same number and types of arguments
Each version of the function may specify the required rules for one or more of the arguments
When function is called, Haskell checks all the version from from top to bottom and calls the first version that matches the parameters passed in.

Simple Pattern Matching

Let’s from most basic usage: matching with constant values. We can specify constant values in place of function arguments to match that value.

-- genz.hs
genz_critic :: String -> String -> String
genz_critic "Carey" word =
  "Oh Carey... OK boomer!"
genz_critic name "lit" =
  name ++ ", you're a lit genz-er"
genz_critic name "wrekt" =
  name ++ ", git wrekt cheugy"
genz_critic name word =
  name ++ ", " ++ word ++ " is not gucci!"

In the function above, the first version has "Carey" in the place of the first argument. It only runs when user passes in "Carey" for the name argument. The same rule applies to other versions.

Let’s look at the function in real action:

ghci> :load genz
[1 of 1] Compiling Main             ( genz.hs, interpreted )
Ok, one module loaded.
ghci> genz_critic "Ed" "wrekt"
"Ed, git wrekt cheugy"

This invocation matches the third version and gives us the expected result. It’s easy.

ghci> genz_critic "Carey" "lit"
"Oh Carey... OK boomer!

This gets a little tricky, becuase the two arguments matches both the first and second versions. But Haskell only calls the first version.

ghci> genz_critic "Paul" "unix"
"Paul, unix is not gucci!"

This does not match any of the first three versions with constants. So the last version is executed as the default case.

Let’s see some more examples that uses simple pattern matching!

factorial :: Integer -> Integer
factorial 0 = 1
factorial n = n * factorial (n - 1)

list_len :: [a] -> Int
list_len [] = 0
list_len lst = 1 + list_len (tail lst)

has_this_many_eyes :: Int -> String
has_this_many_eyes 1 = "Cyclops"
has_this_many_eyes 2 = "Pupper"
has_this_many_eyes 3 = "Sphenodon Punctatus"
has_this_many_eyes 4 = "A nerd with glasses"
has_this_many_eyes n = "A freak!"

Complex Pattern Matching With Tuples

In addition to specifying explicit values for pattern matching, you can perform more advanced pattern matching on tuples, lists and algebraic data types (ADTs). That’s where pattern matching becomes extremely powerful.

For parameters that hold tuples, you can replace a parameter name with tuple notation and name individual fields in the tuple.

-- tuples.hs

-- Original way
exp :: (Int,Int) -> Int
exp t = (fst t) ^ (snd t)

-- Version #2: With pattern matching
expv2 :: (Int,Int) -> Int
expv2 (a,b) = a ^ b

-- Version #3: With simple and complex matching
expv3 :: (Int,Int) -> Int
expv3 (a,0) = 1
expv3 (a,b) = a ^ b

The code above implements three version of exponentiation function according to the numbers in a tuple.

Comparing the first two version that have identical logic:

The first version uses fst and snd to get the tuple’s first and second values.
The second version implements the same logic, but replaces the t with an explicit tuple, and name its sub-parts a and b.
- a and b are bound to the two elements of the tuples in function body and can be directly used.

As we can see, pattern matching helps the second version to be more concise and straightforward.

The third version combines previous simple matching with tuple to avoid expensive power computation when exponent is 0.

Remember when we learn tuples, we said fst and snd only works for two-element tuples. Now we can use pattern matching to process tuples with more elements.

-- tuples.hs
-- Extract the first value of a tuple
extract_1st :: (type1,type2,type3) -> type1
extract_1st (a,_,_) = a

-- Extract the second value of a tuple
extract_2nd :: (type1,type2,type3) -> type2
extract_2nd (_,b,_) = b

-- Extract the third value of a tuple
extract_3rd :: (type1,type2,type3) -> type3
extract_3rd (_,_,c) = c

The functions above extracts the first, second and third element from a tuple respectively. For example:

ghci> :load tuples
[1 of 1] Compiling Main             ( tuples.hs, interpreted )
ghci> extract_3rd ("q",5,1.7)
1.7

As we see, we can also use an underscore if we want to ignore a field in the tuple!

Boyan wants to chime in and remind that mighty as pattern matching seems to be, there are things it cannot achieve (e.g. comparing the equality of two variables).

New learners are often tempted to write code like:

tuple_eq (a, a) = True   -- WRONG!!!
tuple_eq _ = False

Haskell will complain about “conflicting definitions for ‘a’” on the first line. That’s because when variable name appears in pattern matching, Haskell can only bind the variable to the matched value that appears at the place. You cannot bind both first and second of a tuple to a same variable.

To compare the two values obtained from pattern matching, they have to be bound to different names and compared afterwards.

-- using guards
tuple_eq (a, b)
  | a == b = True
  | otherwise = False

-- or, use (==) directly
tuple_eq (a, b) = a == b

Some may wonder why _ can appear more than once in a single pattern without causing the problem. Turns out that Haskell has special rules for _. Since the value at _ are simply ignored, it can be used for multiple times in a single pattern, and the values at different _ can be unrelated at all.

Complex Pattern Matching With Lists

Ok, let’s see how we use complex pattern matching with lists. As we did with tuples, we can replace the formal parameter of a function with a list pattern.

There are two types of list patterns:

The first type is enclosed in ()
- The pattern has two or more items separated by colons.
- If the last item is a variable or _, the pattern matches lists with at least the number of items minus one.
The second type is enclosed in []
- The pattern has zero, one or more items separated by commas.
- It is used to match lists with exactly the number of items in the pattern.

The “item” mentioned above can be constant, variable name, or underscore.

For patterns of the first type that has n items:

The variables on the first n-1 places refer to the first n-1 values in the list passed into the function
The last variable refers to the full tail of the list (beyond the first n-1 items)

-- lists.hs
-- Extract the first item from a list
get_first :: [a] -> a
get_first (x:xs) = x

-- Extract all but the first item of a list
get_rest :: [a] -> [a]
get_rest (x:xs) = xs

-- Extract the second item from a list
get_second :: [a] -> a
get_second (x:y:xs) = y

We can take a look at how the functions above works.

ghci> :load lists
[1 of 1] Compiling Main             ( lists.hs, interpreted )
ghci> get_first [5,10,15,20]
5
ghci> get_rest [5,10,15,20]
[10,15,20]
ghci> get_second [5,10,15,20]
10
ghci> get_rest [5,6]
[6]
ghci> get_rest [5]
[]

Another example that is self-explanatory:

-- favs.hs
favorites :: [String] -> String
favorites [] = "You have no favorites."
favorites (x:[]) = "Your favorite is " ++ x
favorites (x:y:[]) = "You have two favorites: "
 ++ x ++ " and " ++ y
favorites ("chocolate":xs) =
  "You have many favs, but chocolate is #1!"
favorites (x:y:_) =
   "You have at least three favorites!"

Note that in the second and third pattern, the “empty list” in the last position means that the “rest” of the list must be empty. So these two cases matches lists with one and two elements respectively.

Boyan thinks that the bracket notation [x] and [x,y] are more straightforward than (x:[]) and (x:y:[]) in matching exact number of elements and should be more preferable in real coding. But you should understand the form here.

ghci> :load favs
[1 of 1] Compiling Main             ( favs.hs, interpreted )
ghci> favorites []
"You have no favorites."
ghci> favorites ["chocolate"]
"Your favorite is chocolate"
ghci> favorites ["mint","earwax"]
"You have two favorites: mint and earwax"
ghci> favorites ["chocolate","egg","dirt"]
"You have many favs, but chocolate is #1!"
ghci> favorites ["tea","coffee","carrot"]
"You have at least three favorites!"

Next, we look at the second type of list pattern, the bracket and comma notation. For a pattern with n items of this kind:

It will only match if the list’s length is exactly n
Each item matches a single value in the list

Here is an example:

-- lists.hs
-- Let's match some lists
whats_in_your_list [10] =
    "Your list has just 1 value which is 10."
whats_in_your_list [a,b] = "Your list has 2 values: "
    ++ (show a) ++ " and " ++ (show b)
whats_in_your_list [9,_,c] =
    "Starts with 9 and its third item is:"  ++ (show c)
whats_in_your_list _ =
    "Your list had something else in it."

ghci> :load lists
[1 of 1] Compiling Main             ( lists.hs, interpreted )
ghci> whats_in_your_list [10]
"Your list has just 1 value which is 10."
ghci> whats_in_your_list [10,20]
"Your list has 2 values: 10 and 20"
ghci> whats_in_your_list [9,10,20]
"Starts with 9 and its third item is 20."
ghci> whats_in_your_list [9,10,20,30]
"Your list had something else in it."

Pattern matching on lists can often greatly reduce the complexity of code. Here we provide two examples by comparing the pattern matching implementation and the old way without.

The first one finds the smallest element from a list. The first version does not use pattern matching:

sm0lest lst
 | lst == [] = error "empty list"
 | length lst == 1 = first
 | otherwise = min first (sm0lest rest)
 where
   first = head lst
   rest = tail lst

With pattern matching, the logic becomes much more compact:

sm0lest [] = error "empty list"
sm0lest [x] = x
sm0lest (x:xs) = min x (sm0lest xs)

The same applies to the following list reverse example:

-- Reverse a list w/o pattern matching
rev lst
 | lst == [] = []
 | otherwise = (rev rest) ++ [first]
 where
  first = head lst
  rest  = tail lst

-- Reverse a list with pattern matching
rev [] = []
rev (x:xs) = (rev xs) ++ [x]

The reverse algorithm is not optimal. They have O(N^2) complexity. For those who are interested in a efficient O(N) algorithm, please refer to the discussion material.

Influencer Alert!!

Pattern Matching is mostly used in functional languages, but it’s notably been adopted by Python and Rust.

In Python, we can match tuples, lists, dictionaries, etc., with pattern matching. Note that Python has dynamic typing, so you can match values of different kinds in a same pattern matching.

def func(val):  # Python pattern matching example
  match val:
    case (0, x):        # match tuple w/first elem zero
        print(f"Tuple of 0 and {x}")
    case ['a', x, 'c']: # match list w/3 elems: 'a', ??, 'c'
        print(f"List of a, {x}, and c")
    case [1, 2, *rest]: # seq of: 1, 2, ... other elements
        print(f'A list with 1, 2, and *{rest}')
    case {'foo': bar}:  # match dict w/key 'foo'
        print(f"foo maps to {bar} ")
    case _:
        print('no match')

Here is an example of Rust’s pattern matching:

// Rust language example of pattern matching

let values = (1, "Carey");
...

match values {
  (0, y) => println!("1st item is `0` and 2nd item `y` is {:?} ", y),
  (1, _) => println!("1st item is `1` and 2nd item is irrelevant"),
    _    => println!("The default case – matches all other tuples"),
}

First class function and Higher order function

One key charcteristic of functional languages is that they enable you to treat a function just like any other variable. We call this first-class function.

Formally speaking, in a language with first-class functions, a function is treated like any other data. Functions can be:

Stored in variables
Passed as arguments to other functions
Returned as values by functions
Stored in data structures

By allowing these features, first-class functions enable more efficient program decomposition and enable functions to generate new functions from scratch!

Another concept that goes hand in hand with first-class function is higher-order function. A higher-order function is:

a function that accepts another function as an argument, or
a function that returns another function as its return value

Let’s see some examples of first-class and higher-order functions!

The following example exercises the property of passing functions as arguments:

-- first.hs
insult :: String -> String
insult name = name ++ " is so cringe!"

praise :: String -> String
praise name = name ++ " is dank!"

talk_to :: String -> (String -> String) -> String
talk_to name talk_func
  | name == "Carey" = "No comment."
  | otherwise = (talk_func name)

The higher-order function talk_to takes a person’s name and one function as arguments. When we look at the type information, we can see its second argument type is (String -> String), a function type enclosed in parentheses. The type matches the two functions defined above. So let’s try them out.

ghci> :load first
[1 of 1] Compiling Main             ( first.hs, interpreted )
ghci> talk_to "Devan" praise
"Devan is dank!"
ghci> talk_to "Brendan" insult
"Brendan is so cringe!"
ghci> talk_to "Carey" insult
"No comment."

As we can see, the function talk_to is able to take other functions as input and invoke the functions accordingly.

The following example exercises the second aspect of first-class functions, i.e. being able to return functions (and store them in variables).

-- first.hs
get_pickup_func :: Int -> (String -> String)
get_pickup_func born
  | born >= 1997 && born <= 2012 = pickup_genz
  | otherwise = pickup_other
 where
   pickup_genz name = name ++ ", you've got steez!"
   pickup_other name = name ++ ", you've got style!"

The higher-order function get_pickup_func returns different functions according to the argument born. The returned function can be stored into variable and invoked later.

ghci> :load first
[1 of 1] Compiling Main             ( first.hs, interpreted )
ghci> pickup_fn = get_pickup_func 2003
ghci> pickup_fn "Jayathi"
"Jayathi, you've got steez!"
ghci> get_pickup_func 1971 "Carey"
"Carey, you've got style!"

Influencer alert!!

Although first-class and higher-order functions originate from functional programming, they are a pillar of virtually all modern programming languages.

Their use case include:

Providing comparison functions for sorting
As callbacks when events trigger
Enabling multi-threading.

Here are some examples (in JS and C++):

// javascript: 1st-class function to compare two
// movies to order them by release date

function sortByDate( a, b ) {
  if (a.date < b.date) return -1;
  if (a.date > b.date ) return 1;
  return 0;
}

movies.sort(sortByDate);

// javascript: 1st-class "callback" function is called
// when the user clicks on the "big button"

var handleClick = function (event) {
   alert("A button was clicked!");
};

// Select big button from HTML web page
var button = document.querySelector('#big-button');
// Specify what func should be called when it's clicked
button.addEventListener('click', handleClick);

void foo()      { /* does some work & returns */ }
void bar(int x) { /* does some work & returns */ }

int main() {
  std::thread thread1(foo), thread2(bar,42);

  // Run main, foo and bar until they all finish
  thread1.join(); // pause until foo() finishes
  thread2.join(); // pause until bar() finishes
  std::cout << "main, foo and bar completed.\n";
}

Map, Filter and Reduce

Here we introduce a key set of higher-order utility functions to process lists that are included in virtually all functional programming languages.

These functions fall into three categories: mappers, filters and reducers.

A mapper function performs a one-to-one transformation from one list of values to another list of values using a transform function
A filter function filters out items from one list of values using a predicate function to produce another list of values
A reducer function operates on a list of values and collapses them into a single output value

Map

A mapper function maps a list of values to another list of values. Haskell provides a mapper function called map that accepts two parameters:

A function to apply to every element of a list
A list to operate on

The type signature of map is: map :: (a -> b) -> [a] -> [b]

The first argument is a function that maps an individual item from a value of type a to a value of type b (a and b can be the same)
The second argument is a list of type a
The return value is a list of type b

We can define some functions to use with map:

-- mappy.hs
cube :: Double -> Double
cube x = x^3

one_over  :: Double -> Double
one_over x = 1/x

is_even :: Int -> Bool
is_even x = x `

ghci> :load mappy
[1 of 1] Compiling Main             ( mappy.hs, interpreted )
ghci> map cube [2,4,10]
[8,64,1000]
ghci> map one_over [2,4,8]
[.5,.25,.125]
ghci> map is_even [1,2,3,4,6]
[False,True,False,True,True]
ghci> map reverse ["mouse","cat","fly"]
["esuom","tac","ylf"]

In the last invocation, we used Haskell’s library function reverse.

So how does the map function actuall work? It’s simple. We can use pattern matching we have just learned to implement it.

-- map.hs
map :: (a -> b) -> [a] -> [b]
map func [] = []
map func (x:xs) =
  (func x) : map func xs

Filter

A filter is a function that filters items from an input list to produce a new output list. Haskell provides a function called filter that accepts two parameters:

A function that determines if an item in the input list should be included in the output list
A list to operate on

The type signature of filter is: filter :: (a -> Bool) -> [a] -> [a]

The first argument is a predicate function that determines if a value from the input list should be included in the output
The second argument is a list of type a
The returned list include all the items that passed the filter

The filter function can also be easily implemented as follows:

filter :: (a -> Bool) -> [a] -> [a]
filter predicate [] = []
filter predicate (x:xs)
 | (predicate x) = x : (filter predicate xs)
 | otherwise = filter predicate xs

That’s it for this time. Next time we will continue with the last one of the three: reducer (foldl/foldr in Haskell)