Lecture 17

2023-05-31 | Week 9 | edited by Ashwin Ranade

(originally written 2022-11-28 by Carey Nachenberg)

Ashwin here! This lecture covers Control Palooza slides 1-59.

Table of Contents

Controlla-palooza

Everything you never learned about how code executes!

Expressions

An expression is a combination of values, function calls and operations that, when evaluated, produces an output value. An expression may or may not have side effects.

Which of the following are expressions?

  • f(x)*5 – Yes!
  • if (foo) cout « “bar”; – No
  • a = 5 – In some languages, yes. Like in C++ we can do b = a = 5, so a = 5 is an expression that has a side effect and produces a result
  • a && (b || c) – Yes!
  • if a > 5 then a3 else a4 – Yes - in functional languages (or languages with functional features) this if code is an expression, which produces a result of either a3 or a4

Expression Evaluation Order

Consider the following C++ code:

int c = 1, d = 2;

int f() {
  ++c;
  return 0;
}

int main() {
  cout << c - d + f() ;
}

Or this code:

int c = 0; 

int f() {
  ++c;
  return c;
}

void h(int a, int b)
  { cout << a << " "<< b; }

int main() {
  h(c, f());
}

Challenge: What will these C++ programs print?
The first program produces either -1 or 0 The second program produces either 0 1 or 1 1

Why? C++ doesn’t specify what order the terms of an arithmetic expression must be evaluated! Nor does it specify what order parameters must be evaluated! Some languages mandate a left-to-right eval order (C#, Java, JavaScript, Swift), while others don’t (C++, Rust, OCaml).

So in general, if your expressions are calling functions, those functions shouldn’t have side effects that could impact the execution of the other functions called in the expression. Otherwise you’ll get unpredictable behavior depending on the compiler and potentially each time you compile the code it could do a different thing!

Why avoid specifiying ane explicit evaluation order? The reason that an ordering is not specified is to give the compiler latitude to optimize the code.

Associativity

Regardless what order each of the terms are evaluated… Most languages use left-to-right associativity when evaluating mathematical expressions. Which is the way we all learned to do it in school.

So an expression like:

cout << c - d + f() ;

Would be evaluated as:

cout << (c - d) + f() ;

Instead of:

cout << c - (d + f());

There are some notable exceptions, like evaluating exponentiation (e.g., 2^3^2 == 512, since we often evaluate exponentiation from right-to-left). Assignment is also right-associative, so we can do something like this:

  a = b = c = 5;  
  // equivalent to:
  a  = (b = (c = 5));   // so c = 5 runs first, then b = c runs, then a = b runs

Short Circuiting

In any boolean expression with AND / OR, most languages will evaluate its sub-expressions from left-to-right. The moment the language finds a condition that satisfies or invalidates the boolean expression, it will skip evaluating the rest of the sub-expressions. This can dramatically speed up your code!

For example, this code here:

if(is_cute() || is_smart() || is_rich()) {
  date_person();
}

Really executes like this:

if (is_cute()) 
  date_person();
else if (is_smart())
  date_person();
else if (is_rich())
  date_person();

And this code here:

if (good_salary() && fun_work() && good_culture()) {
  take_job();
}

Really executes like this:

if (good_salary()) {
  if (fun_work()) {
    if (good_culture()) {
      take_job();
    }
  }
}

Some languages, like Kotlin, have the ability to use boolean expressions with short circuiting and without! e.g.

fun rich(salary: Int): Boolean {
  if (salary > 200000) return true;
  print("Not in silicon valley!\n")
  return false
}

fun main() {
  val cute = true; val smart = false;
  val salary = 200000;
    
  if (cute == true || smart == true || rich(salary) == true)  // uses short circuiting
    print("You're my type!\n")

  if (cute == true or smart == true or rich(salary) == true)  // no short-circuiting
    print("I still want to date you!\n")
}

Iteration

There are three different types of iteration that we’ll study:

Counter-controlled: We use a counter, like we would in a for-loop in C++ or using a range in Python:

    for i in range(1,10):
        do_something()

Condition-controlled: We use a boolean condition to decide how to long to loop - this is almost always a while() or do-while() loop.

    while (condition)
       do_something()

Collection-controlled: We’re enumerating the items in a collection

   for (x in container)
        do_something(x)

Iteration in loops

In languages like C++ with simple counter-controlled loops (e.g., for(int i=0;i<10;i++) { … }) it’’s easy to see how looping works. But how does looping work when we use a collection-controlled loop (e.g., iterating thru a vector of objects) or use a range()? There’’s more going on under the hood than you think! There are two primary approaches: using an iterator (of which there are two types of iterators), or using first-class functions. Let’’s learn both!

These kinds of loops are implemented using “iterators”:

for n in student_names:
 print(n)

for i in range(0,10):
  print(i)

These kinds of loops are implemented using first-class functions:

fruits.forEach
	{ f -> println("$f") }

Iterator-based loops

An iterator is an object that enables enumeration of a sequence of values. An iterator provides a way to access the values of a sequence without exposing the sequence’s underlying implementation.

  • Iterators can be used to enumerate the values held in a container (e.g., a list, set)
  • Iterators can be used to enumerate the contents of external sources like data files
  • Iterators can be used to enumerate the values of an abstract sequence. An abstract sequence is one where the values in the sequence aren’t stored anywhere, but generated as they’re retrieved via the iterator

Iterators used with containers and external sources are typically distinct objects from the objects that contain the sequences they refer to.

Iterable Objects vs Iterators

  • An Iterable Object is an object like a container (e.g., vector, list, set) which can have its items iterated over.

  • An Iterator is a thing that can refer to an Iterable Object and be used to iterate over the vaulues it holds or represents.

To get an iterator, you request one from an iterable object. The Iterable Object then returns the iterator to you. You may then use the iterator to refer to the items in the iterable object.

Examples of getting an iterator from an iterable object:

// Kotlin: iterator into container
val numbers = listOf(10,20,30,42)

val it = numbers.iterator()  // get an iterator into the numbers list
// Swift: abstract sequence
var range = 1...1000000

var it = range.makeIterator()  // get an iterator to the first # of the abstract seq.
// C++
vector<string> fruits = {"apple", "pear", ... };

auto it = fruits.begin();  // get an iterator into a C++ vector

While iterators differ by language, most* tend to have an interface that look like one of these (* C++ has a notably different approach):

An interface with hasNext() and next() methods

These types of iterators are used with containers and external sources like data files. They have two methods:

iter.hasNext(): We ask the iterator (NOT the iterable object) whether it refers to a valid value that can be retrieved?

iter.next(): We ask the iterator (NOT the iterable object) to get the value that it points to, and advance to the next item

// Kotlin: iterator into container
val numbers = listOf(10,20,30,42)

val it = numbers.iterator()
while (it.hasNext()) {
  val v = it.next();
  println(v)
}
An interface with just next() methods

These types of iterators are used with Abstract Sequences, like ranges: range(1,10)

iter.next(): The iterator (NOT the iterable object) generates and returns the next value in the sequence, or indicates the sequence is over via a return code or by throwing an exception

// Swift: abstract sequence
var range = 1...1000000

var it = range.makeIterator()
while true {
  var v = it.next()  // in Swift, the next() method returns either nil or a valid value
  if (v == nil) { break }
  print(v)
}

C++ has unique iterators

C++ models its iterators after C++ pointers, which can be incremented, decremented, and dereferenced. They’re a notable exception to iterators in most languages!

// C++ has unusual iterator syntax!
vector<string> fruits = {"apple", "pear", ... };

for (auto it = fruits.begin(); it != fruits.end(); ++it)
  cout << *it;

So how does looping work under the hood?

When you loop over the items in an iterable object , the language secretly uses an iterator to move through the items! This is a fantastic example of Syntactic Sugar!

When you do this:

// Kotlin: iterate over list of #s
val numbers = listOf(10,20,30,42)

for (v in numbers) {
  println(v)
}

Here’s what’s really happening (the language generates this code secretly for you):

val numbers = listOf(10,20,30,42)

val it = numbers.iterator()
while (it.hasNext()) {
  val v = it.next()
  println(v)
}

Or when you do this:

// Swift: iterate over range

for v in 1...1000000 {
  print(v)
}

Here’s what’s really happening:

var range = 1...1000000

var it = range.makeIterator()
while true {
  var v = it.next()
  if (v == nil) { break }
  print(v)
}

How are iterators implemented?

There are two primary approaches for creating iterators:

  • With Traditional Classes: An iterator is an object, implemented using regular OOP classes.
  • “True Iterators” aka Generators: An iterator is implemented by using a special type of function called a generator.

Iterators built using traditional classes

Let’s see an iterator that is built using a traditional class - Python makes it easy.

First we must define an iterable class (e.g., like a vector class) which we can iterate over. To make a class iterable, we must define an __iter__() method that creates and returns an iterator object. The ListsOfStrings class is an iterable class - NOT an iterator. If its __iter__ method is called, it will return an iterator that can be used to iterate over its items.

class ListOfStrings:
  def __init__(self):
    self.array = []

  def add(self,val):
    self.array.append(val)

  def __iter__(self):
    it = OurIterator(self.array)   // calling __iter__ returns a new OurIterator object
    return it

Second we can define our OurIterator class. Notice that this iterator class is uniqely tied to our ListOfStrings class above. The two work hand-in-hand.

class OurIterator:
  def __init__(self, arr):
    self.arr = arr
    self.pos = 0

  def __next__(self):
    if self.pos < len(self.arr):
      val = self.arr[self.pos]
      self.pos += 1
      return val
    else:
      raise StopIteration

Our iterator class must have a method called __next__() that gets the value of the item pointed to by the iterator, advances the iterator, and returns the value. As you can see, if the iterator has run out of items to iterate over (ther are no more items in the iterable object), the iterator (in Python) will throw an exception to tell the user of the iterator that there are no more items left.

If we define our iterable class and iterator classes this way, Python will give us first-class looping support over our iterable class!

nerds = ListOfStrings()
nerds.add("Carey")
nerds.add("David")
nerds.add("Paul")

for n in nerds: 
  print(n)

And here’’s an example of an iterator in Java:

import java.util.Iterator;  

class ListOfStrings implements Iterable<String> {
  private String[] array;
  private int numItems = 0, maxSize = 100;

  public ListOfStrings() 
    { array = new String[maxSize]; }
  public void add(String val) 
    { array[numItems++] = val; }
  @Override public Iterator<String> iterator() {      
    Iterator<String> it = new OurIterator();
    return it;
  }

  class OurIterator implements Iterator<String> {
    private int iteratorIndex = 0;
    @Override public boolean hasNext() 
      { return iteratorIndex < numItems; }
    @Override public String next() 
      { return array[iteratorIndex++]; }
  }
}

Things to note:

  • The ListOfStrings class implements Java’s Iterable interface, which makes this clas an Iterable class (it can have its items iterated over). The class must then implement the iterator() method which can be used to obtain an iterator into the iterable object.
  • The Iterator class is a nested class within ListOfStrings, which lets it access the memer variables of ListOfStrings. Notice how it implements the Iterator interface. To do so, it must provide a hasNext() method and a next() method.

And here’s how we’d use our class.

public void testOurList() {         
  ListOfStrings nerds = 
    new ListOfStrings();

  nerds.add("Carey");
  nerds.add("David");
  nerds.add("Paul");

  Iterator it = nerds.iterator();  
  while (it.hasNext()) { 
    String value = (String)it.next();    
    System.out.println(value);
  } 

  for (String s: nerds)
    System.out.println(s);
}

Notice how above we can manually get the iterator then use hasNext() and next() with it. Or we can use Java’s looping syntax, and Java will use syntactic sugar, hiding the calls to get the iterator and use hasNext() and next().

Ok, next let’s see how to create a range() iterable using a class like we have in python (and many other languages):

class OurIter:  # iterator class
  def __init__(self, start, end):
   self.cur = start
   self.end = end

  def __next__(self):
    if self.cur < self.end:
      val = self.cur
      self.cur += 1
      return val
    else:
      raise StopIteration()

class OurRange:  # iterable class
  def __init__(self, end):
    self.end = end

  def __iter__(self):
    iter = OurIter(0, self.end)
    return iter

Notice the OurRange class, which is an iterable. We construct it with the maximum value to iterate to, so it will iterate from [0,N). The class implements an __iter__() method which is called to obtain an iterator into the iterable. You can see that this method creates a new OurIter iterator, specifying the starting and ending range values.

The OurIter class simply implements a __next__ method, just like any other iterator in python. This method obtains the next value if one is available, then advances so the next call will return the next value in the range, and finally returns the value. If no more values are available in the range, the method raises a StopIteration exception, which is the way that Python indicates the iterator has reached the end of its iteration.

“True Iterators” aka Generators

A “true iterator” is an iterator that is implemented by using a special type of function called a generator.

A generator is essentially a closure that can be paused and resumed over time. Many languages have generators, but we’ll use python to illustrate the concept since the syntax is pretty simple. But first let’s start with pseudocode:

def foo(n):
  for i in range (1, n):
    print(f'i is {i}')
    pause
    print('woot!')

def main():
  p = create_pausable(func=foo, n=4)  # Line A
  print('CS')
  start(p)    # Line B
  print('is')
  resume(p)  # Line C
  print('cool!')
  resume(p)

Here’s the general idea - remember the above is pseudocode:

  • Line A: You create a special closure (called a generator), which has one extra piece of data - an instruction pointer. The instruction pointer starts off on the first line of the function, and the function starts in a suspended state. It’s not running at this point. We’ve just created the closure.
  • Line B: We wake up the generator/closure begin running it. It will run until it hits the pause command. At that point, the instruction pointer is saved in the closure so we know where to continue running, and all changes made to the local variables in the closure are also retained. The generator returns to the next line of the main function and keeps running there. Later when we resume the generator, it continues running where it left off, with the same variables/values.
  • Line C: We resume the generator, and it continues running where it left off last time

So the above pseudocode would print:

CS
i is 1
is
woot!
i is 2
cool!
woot!
i is 3

Now here’s the real python syntax:

def foo(n):
  for i in range (1, n):
    print(f'i is {i}')
    yield
    print('woot!')

def main():
  p = foo(4)
  print('CS')
  next(p)
  print('is')
  next(p)
  print('cool!')
  next(p)

Here’s what changed:

  • Instead of “pause” we use the keyword “yield” to pause the function
  • Insetad of calling create_pausable to create our closure, we just call the function and pass in the arguments, then store the return value in a variable. This does NOT run the function. It just creates the closure, and then returns an object reference to the closure. We then store the object reference to the closure inside our variable p. How does Python know not to run the function but just to create a closure? Because it has a “yield” command inside of it. This implicitly tells Python this is a generator function and not a traditional function, and that the first call should not run the function, but just create a closure
  • Instead of calling “start” to start the generator running, and “resume” to resume its execution, we just use next() on the closure that was returned during the initial call.

By the way, to use our earler iterable/iterator nomenclature, the generator function is an “iterable”, and the closure that is returned when it’s first called behaves like an iterator.

Now let’s see a more complex generator - it can not only yield its execution, but every time it yields it can return a value to the caller!

def bar(a, b):
  while a > b:
    yield a  # Line A
    a -= 1

f = bar(3, 0)
print('Eat prunes!')
t = next(f)  # Line B
print(t)
t = next(f)
print(t)
t = next(f)
print(t)
print(f'Explosive diarrhea!')

Notice that on Line A, the generator “yields a value”. So not only does it pause itself, but it also returns a value at this time. Notice that on Line B, we wake up the generator and let it run until it yields a value. The yielded value is returned by next() and that value is stored in the t variable.

This is a more idiomatic use of a generator – they’re used to “generate” a sequence of values that can be retrieved one at a time – the sequence might be finite, or infinite!

Since a generator is an iterable object, and produces an iterator… You can use it like any other iterable object in loops! Another name for a generator function is a “true iterator.”

def our_range(a, b):
  while a > b:
    yield a
    a -= 1

print('Eat prunes!')
for t in our_range(3,0):
  print(t)
print(f'Explosive diarrhea!')

Here’s a Javascript generator:

// JavaScript defines generators with a * character after the function name
gen = function*(n) {
  yield 'I';
  yield 'want';
  yield n;
  yield 'cookies';
};

g = gen(10);
while (true) {
  r = g.next();
  if (r.done) break;
  console.log(r.value);
}
 
for (word of gen(3)) {
  console.log(word);
}

And here’s one in Kotlin:

// Kotlin generators care called "sequences"
fun vals(start: Int) = sequence {
  var i = start
  while (true) {
    yield(i)
    ++i
  }
}

fun main(args: Array<String>) {
  val iter = vals(42).iterator()
  var count = 3
  while (count-- > 0) {
    var n = iter.next()
    println(n)
  } 
  
  for (n in vals(7).take(5))  
    println(n)     // 7, 8, 9, 10, 11
}

Analysis of Class-based Iterators and Generators

Some things are easier to do with class-based iterators, and some things are easier to do with generators. But the two are isomorphic and functionally equivalent - anything we can do with a class we can do with a generator, and visa-versa.

First-class function-based iteration

Alright, we’ll end this lecture by looking at how some languages use first-class functions to do iteration. The syntax often looks something like this:

var fruits = ... // some list of fruits

fruits.forEach { f -> println("$f") }

Basically, the container (e.g., vector, list or set) contains a forEach() method, which accepts one parameter. We pass in a { lambda } function to the forEeach method as this parameter. The forEach method iterates over each item in the iterable, and passes each item as the parameter to the lambda function (e.g., to the parameter f shown in the lambda above).

Here are some examples:

// Rust
(0..10).for_each(|elem| println!("{}", elem));

let items = vec!["fee","fi","fo","fum"];
items.iter().for_each(|elem| println!("{}", elem));
# Ruby
(0..9).each do |elem| 
  print elem, " " 
end

items = ["fee","fi","fo","fum"]

items.each do |elem|
    print(elem,"\n")
end
// Kotlin
(0..9).forEach({ elem -> println("$elem") })

var items = arrayOf("fee","fi","fo","fum")
items.forEach({ elem -> println("$elem") })

When we use this syntax, we’re NOT using an iterator! Instead, we’re passing a function as an argument to a forEach()/each() method that loops over the iterable’s items!

Let’s build a forEach() method for our earlier Python container:

class ListOfStrings:
  def __init__(self):
    self.array = []

  def add(self,val):
    self.array.append(val)

  def forEach(self, f):   # define our own for-each method
    for x in self.array:
      f(x)

yummy = ListOfStrings()
... # add items to yummy

def like(x):
 print(f'I like {x}'))

yummy.forEach(like)

Notice how the iterable object (a ListOfStrings) iterates over its own items, and calls the provided f(). This is essentially a map() operation like we saw in functional programming!