Lectures 17 and 18

2022-11-21 and 2022-11-23 | Week 9 | by Carey Nachenberg

Multiple Inheritance and Its Issues
Abstract Methods and Classes
- When should you use abstract methods/classes
Inheritance and Typing
- Inheritance and Typing Challenges
Subtype Polymorphism
- Subtype Polymorphism in Dynamically-typed Languages?
- Classify That Language: Subtype Polymorphism
Dynamic Dispatch
Deep Dive: OOP Design Best Practices
Controlla-palooza

Multiple Inheritance and Its Issues

Multiple inheritance is when a derived class inherits implementations from two or more superclasses, e.g.:

class Person { ... };
class Robot { ... };

class Cyborg: public Person, public Robot
{ ... }

Multiple inheritance is considered an anti-pattern, and forbidden in most languages! In cases where you want to use multiple inheritance, have your class implement multiple interfaces instead.

Here’s an example where there’s a problem:

class CollegePerson {
public:
  CollegePerson(int income) { income_ = income; }
  int income() const { return income_; }
private:
    int income_;
};

class Student: public CollegePerson {
public:
  Student(int part_time_income) :
    CollegePerson(part_time_income)
  {/* Student-specific logic */ }
};

class Teacher: public CollegePerson {
public:
  Teacher(int teaching_income) : 
    CollegePerson(teaching_income)
  { /* Teacher-specific logic */ }
};

class TA: public Student, public Teacher  {
public:
    TA(int part_time_income, int ta_income) : 
      Student(part_time_income),
      Teacher(ta_income) 
    { /* TA-specific logic */ }
};

int main() {
 TA amy(1000, 8000);
 cout << "Amy's pay is $"
      << amy.income();
}

Notice that in this example, both Student and Teacher inherit from the same base class, CollegePerson. Also notice that CollegePerson has an income() method that returns the CollegePerson’s income_ field. But here’s the problem, since a TA class inherits from both Student and Teacher, and each of these has a CollegePerson base part (each with their own copy if the income_ field), the TA class inherits TWO copies of the income_ field, and two copies of the income() method! So when you ask for a TA’s income(), does this call Student’s version of income() or Teacher’s version of income()? Given the example above, if we called Student’s version, we’d get a result of $1000, but if we called Teacher’s version we’d get a result of $8000. This is called the “diamond pattern” - it occurs when a derived class D inherits from a base class B twice by inheriting from two distinct classes X, Y that both on their own are derived from B. It actually inherits the base class B twice, and thus has two copies of every member field, etc.

Needless to say, this is very confusing for the programmer. Some languages, like C++ will generate an error for this: “error: non-static member income’ found in multiple base-class subobjects of type ‘CollegePerson’”. Other languages have a mechanism to pick one of the two functions to always call, or perhaps let the programmer specify which base version to call (e.g., TA.income()). Regardless, you want to avoid the diamond pattern, and multiple inheritance more generally.

We can even have issues with multiple inheritance without a diamond pattern. Consider what would happen if two base classes define a method with the same prototype, and then a third class inherits from both base classes. When calling that method, which version should be used? It gets complicated.

TL;DR - don’t use multiple subclass inheritance; inherit from multiple interfaces instead

Abstract Methods and Classes

Abstract methods are methods that define an interface, but don’t have an implementation. For instance, area() and perimeter() are abstract methods in the example below:

class Shape { 
public:
  double area() const = 0;
  double perimeter() const = 0;
};

An abstract method defines “what” a method is supposed to do along with its inputs and outputs – but not “how.” An abstract class is a class with at least one abstract method - it defines the public interface and partial implementation for subclasses.

When defining a base class, most languages allow the programmer to specify one or more “abstract methods”, which are just function prototypes without implementations.

class Shape { // C++ class
public:
  // area and rotate are "abstract" methods
  virtual double area() const = 0;
  virtual void rotate(double angle) = 0;
  virtual string myName() 
    { return "abstract shape"; }
};

abstract class Shape {  // Java class
 // area and rotate are "abstract" methods
 public abstract double area();
 public abstract void rotate(double angle);
 public String myName()
   { return "abstract shape"; }
};

Question: Can we instantiate an object of an A.B.C. like Shape?

int main() {
  Shape s;  // ?????
  cout << s.area(); // ????
}

Answer: You can’t instantiate an object of an ABC because it has an incomplete implementation. What if the user tried to call s.area()? There’s no method implementation to call!

So why define abstract methods at all? Why not just provide dummy implementations in the base class?

class Shape { // C++ class
public:
  // area is an "abstract" method
  virtual double area() const { return 0; }   // dummy version returns 0
  virtual void rotate(double angle) { }        // dummy version does nothing
  virtual string myName() 
    { return "abstract shape"; }
};

Answer: An abstract method forces the programmer to redefine the method in a derived class, ensuring they don’t accidentally use a dummy base implementation in a derived class!

OK what about abstract classes and types? Just as with concrete classes, when you define a new abstract class, it automatically creates a new type of the same name! More specifically, it defines a reference type, which can ONLY be used to define pointers, references, and object references… not concrete objects.

When should you use abstract methods/classes

Use abstract methods/classes under the following circumstances.

Prefer an abstract method when there’s no valid default implementation for the method in your base class.
Prefer an abstract class over an interface when all subclasses share a common implementation (one or more concrete methods or fields), and you want the derived classes to inherit this implementation

In contrast, you would inherit/implement an interface when you have a “can-do” relationship, e.g., a Car can do Driving, so a car might implement a Drivable interface.

Inheritance and Typing

When we use inheritance to define classes, we automatically create new supertypes and subtypes! The base class (e.g., Person) defines a supertype. The derived class (e.g., Student) defines a subtype. And given the rules of typing, we can then use a subtype object anywhere a supertype is expected.

As we learned earlier, any time you define a class or an interface, it implicitly defines a new type.

The type associated with a concrete class is specifically called a value type. Value types can be used to define references, object references, pointers, and instantiate objects.

The type associated with an interface OR abstract class is called a reference type. Reference types can ONLY be used to define references, object references, and pointers.

Ok, so how does typing work with inheritance? Consider these two classes:

class Person {
public:
  virtual void talk()
    { cout << "I'm a person"; }
  virtual void listen()
    { cout << "What'd you say again?"; }
};

class Student: public Person {
public:
 virtual void talk() 
   { cout << "I hate finals."; }
 virtual void study() 
   { cout << "Learn, learn, learn"; }
}; 

Well… the Person base class definition implicitly defines a new type called Person. And the Student subclass definition implicitly defines a new type called Student. And since Student is a subclass of Person the Student type is a subtype of the Person type!

Ok, and how does typing work with interface inheritance? Consider this example:

// Java interface inheritance
public interface Washable {
  public void wash();
  public void dry();
}

public class Car implements Washable {
  Car(String model) { ... }
  void accelerate(double acc) { ... }
  void brake(double decel) { ... }

  public void wash() {
   System.out.println("Soap & bucket");
  }
  public void dry() {
    System.out.println("Dry with rags");
  }
}

Well… the Washable interface definition implicitly defines a new type called Washable. And the Car class definition implicitly defines a new type called Car. And since Car is an implementer of Washable the Car type is a subtype of the Washable type!

Inheritance and Typing Challenges

Question: When would a derived class not be a subtype of its base class? Answer: When we privately inherit from a base class, the derived class is NOT a subtype of the superclass’s type, nor does it share the interface of the base class!

Question: When could a class have a supertype, but not a superclass? Answer: A class could have a supertype of an interface type, without having a superclass, since an interface is not a class (at least in some languages).

Subtype Polymorphism

Subtype Polymorphism is the ability to substitute an object of a subtype (e.g., Circle) anywhere a supertype (e.g., Shape) is expected. More technically, S.P. is where code designed to operate on an object of type T operates on an object that is a subtype of T.

class Shape { ... };
class Circle: public Shape { ... };

void display(Shape& s) {    // display accepts a Shape (supertype) object
  ...
}

int main() {
  Circle circ(0,0,5);
  display(circ);            // but we're passing in an object that's asubtype of Shape
}

As we learned in datapalooza, typing rules state that if a function can operate on a supertype, it can also operate on a subtype! The same holds true for supertypes and subtypes associated with inherited classes! And for classes that implement interfaces!

This also works because the class associated with the subtype supports the same public interface as the class associated with the supertype:

class Washable { // Interface
  virtual void wash() = 0;
  virtual void dry() = 0;
};
class Car: public Washable { ... }

void washAThing(Washable& w) {  // this function  accepts a supertype
  w.wash();
}

void cleaningCrew() {
  Car Subaru;
  washAThing(subaru);  // we're passing a subtype object to the function above
}

So we’ve seen that we can substitute a subtype object anywhere a supertype is expected. This works because the subclass inherits/implements the same interface as its superclass. But that’s not enough! We also expect the methods of the subclass to have the same semantics as those in superclass.

Consider these base and derived classes:

class Shape {
public:
  virtual double getRotationAngle() const {
      // returns angle in radians
  } 
  ...
};

class Square: public Shape {
public:
  virtual double getRotationAngle() const {
    // returns angle in degrees
  }
  ...
};

Notice that our derived class returns the result in degrees, but the base class returns the result in radians? This would change the semantics of the interface defined by the base class, and is a bad idea. Imagine if you passed a Square object to a function that operated on Shapes? Any time it operated on a Square and called getRotationAngle() it would get a result in degrees, when it expected results in radians. This would cause tons of bugs. To ensure this doesn’t happen, a derived class should always implement the same semantics as the base class. If a subclass adheres to the interface and semantics of the base class, we can truly substitute any subtype object for a supertype, and our code will still work. This is called the Liskov Substitution Principle, named after computer scientist Barbara Liskov.

Subtype Polymorphism in Dynamically-typed Languages?

So far, all of our examples have been with statically-typed languages!

Question: Can we have subtype polymorphism in dynamically-typed languages like Python? Consider this example:

class Car:
  def speedUp(self, accel):
    ...
  def slowDown(self, decel):
    ...
  def steer(self, angle):
    ...

class HybridCar(Car):
  def speedUp(self, accel):
    ...
  def slowDown(self, decel):
    ...

def drive_car(c):
  c.speedUp(5)

def main():
  prius = HybridCar()
  drive_car(prius)  # is this using subtype polymorphism?

Answer: Nope. Subtype polymorphism occurs when we process a subtype object as if it were an object of the supertype. But in dynamically-typed languages, variables have NO type at all! For instance, the parameter c in the drive_car(c) class above has no type. So how can we be treating the prius (a subtype) as if it’s a supertype (a Car)? We can’t, and we’re not. So we’re NOT using subtype polymorphism! This code will work, however (using duck typing)! It’s just not subtype polymorphism!

Classify That Language: Subtype Polymorphism

Consider the following code. In this language, a “trait” is the name for an interface. Does this program use subtype polymorphism? Why or why not?

trait Shape {
  def getArea(): Double
  def getPerim(): Double  
}

class Square(side: Double) extends Shape {
  override def getArea(): Double = side * side
  override def getPerim(): Double = 4 * side
}
class Circle(r: Double) extends Shape {
  override def getArea(): Double = Math.PI * r * r
  override def getPerim(): Double = 2 * Math.PI * r
}

def printStats[T <: Shape](s: T): Unit = { 
  println("The area is: " + s.getArea()) 
  println("The perimeter is: " + s.getPerim()) 
}

val square = Square(10.0)
val circle = Circle(1.0)
printStats(square)
printStats(circle)

Answer: This program does use subtype polymorphism. We pass a subtype object like square/circle, and treat it like a supertype, using type T, which is designated to be some subtype of a Shape.

Dynamic Dispatch

Imagine we have a variable var that refers to some object:

void foo(Shape& var) { // c++
  cout << var.area();
}

or:

def foo(var):
  print(var.area()) 

With Dynamic Dispatch, when you call a method on var, e.g.: var.area(), the actual method that gets called is determined at run-time based on the target object that var refers to. The determination is made in one of two ways:

Based on the target object’s class, or
By seeing if the target object has a matching method (regardless of the object’s type or class), and calling it if one was found - in languages without classes and just objects

So dynamic dispatch is the process of figuring out which method to call when a call is made to a virtual method at runtime. Do we call Circle’s area() method, Square’s area() method, or what?

Dynamic Dispatch in Statically-typed Languages

In statically typed languages, the language examines the class of an object at the time of a method call and uses this to “dynamically dispatch” to the class’s proper method. Typically this is implemented by adding a hidden pointer into each object which points at a vtable for the class. A vtable is an array of function pointers which points to the proper method implementation to use for the current class. Every class has its own vtable. Only virtual methods are included in the vtable. So a Circle object would have a pointer to a vtable pointing to Circle’s area() and perimter() methods. And a Square object would have a pointer to a vtable pointing to Square’s area() and perimeter() methods. When the call happens, e.g. var.area(), the vtable associated with the object is consulted and the pointer to the area() function is used to call the proper function.

Dynamic dispatch is NOT used to call regular instance or class methods. Instead a standard function call is used in these cases. For regular instance functions (that are non-virtual) or class/static functions, a function call to them is called “static dispatch,” since the proper function to call can be determined at compile time (statically), not at runtime (dynamically).

Dynamic Dispatch in Dynamically-typed Languages

In dynamically typed languages, things get more interesting! While a class may define a default set of methods for its objects the programmer can add/remove methods at runtime to classes or sometimes even individual objects! So when a method is called, the language can’t necessarily rely upon an object’s class or type to figure out what method to use! Here’s an example from javascript:

// Javascript dynamic dispatch example
function makeThemTalk(p) {
  console.log(p.name + " says: " + p.talk());
}

var person = {
  name: "Alan Kay",
  talk: function() { return "Java is meh"; },
  change: function()
    { this["talk"] = function() { return "I <3 Smalltalk"; } }

};

var bird = {
  name: "bluebird",
  talk: function() { return "Chirp chirp"; }
};

makeThemTalk(person)
person.change()
makeThemTalk(person)
makeThemTalk(bird)

So how might a language that allows you to customize methods for individual objects determine which method to call? Answer: The language stores a unique vtable in every object!

Dynamic Dispatch Challenge Questions

Question: What is the difference between subtype polymorphism and dynamic dispatch?

Answer: Subtype polymorphism is a statically-typed language concept that allows you to substitute/use a subtype object (e.g., HighPitchedGeek) anywhere code expects a supertype object (e.g., a Geek). Since the subtype class shares the same public interfaces as the supertype, the compiler knows that the types are compatible and substitution is allowed.

In contrast, dynamic dispatch is all about determining where to find the right method to call at run-time for virtual/overridable methods. Dynamic dispatch occurs whether or not we have subtype polymorphism, as we’ve seen in dynamically-typed languages like Ruby and Python.

Question: Is dynamic dispatch slower, faster, or the same speed as static dispatch? Why?

Dynamic dispatch is slower than static dispatch, since we need to look up function pointers in the vtable at runtime before performing the call. Static dispatch calls are hard-wired at compile time into the generated machine code, and thus much faster (a direct machine language call).

Classify That Language: Dynamic Dispatch

Consider the following classes…

class Geek {
public:
  void tickleMe() { 
    laugh(); 
  }
  virtual void laugh()
    { cout << “ha ha!”; }
};

class HighPitchGeek: public Geek {
public:
  virtual void laugh()
    { cout << “tee hee hee”; }
};

int main() {
  Geek *p = new HighPitchGeek;
  
  p->tickleMe(); // ?

  delete p;
} 

Question: Will dynamic dispatch be used in this program? Also, what will print, and why? Answer: Yes, it’s using dynamic dispatch. This will print “tee hee hee”. Why? Because dynamic dispatch always selects the most derived-version of a function to call. So even though the laugh() method is called by the tickleMe() method which is defined in Geek, the laugh() method has been redefined in HighPitchedGeek, and our object is of type HighPitchedGeek, and so this version is called..

Deep Dive: OOP Design Best Practices

The SOLID principles are a set of design principles for OOP class/interface design which are considered best-practices in industry. They’re critical to know well.

Single Responsibility Principle (S in SOLID)

Each class should have a single, limited responsibility – if it has more, it should be split into two or more classes.

Don’t define a “god” class that does everything:

class ProcessOrder {
public:
  void submitOrder(...);
  void cancelOrder(...);
  void sendConfirmationEmail(...);
  void printSalesReport(...);
};

Instead, define classes for specific purposes and limit the scope to one thing.

class ProcessOrder {
public:
  void submitOrder(...);
  void cancelOrder(...);
};

class ConfirmationEmailer {
public:
  void sendEmail(...);
};

class ReportPrinter {
public:
  void printSalesReport(...);
};

SOLID: Open/Closed Principle (O in SOLID)

When a class A operates on other classes B, C, D, you should design A such that it can operate on new related classes E, F, G without modification.

Don’t do this:

class Shape { ... };
class Circle: public Shape { ... };
class Rectangle: public Shape { ... };

class ShapeDrawer {
public:
  void drawAll(vector<Shape *>& v) {
    for (Shape *s: v) {
      if (s.type() == kCircleType) 
        drawCircle((Circle*)s);
      else if (s.type() == kRectangleType)
        drawRectangle((Rectangle*)s);
    }
  }
};

Notice that any time we want to support a new shape, like a Triangle, we’d need to update our ShapeDrawer class. That’s a violation of the Open/Closed principle.

Instead do this:

class Shape { ... };
class Circle: public Shape { ... };
class Rectangle: public Shape { ... };

class ShapeDrawer {
public:
  void drawAll(vector<Shape *>& v) {
    for (Shape *s: v) {
      s->draw();
    }
  }
};

Notice that since we just ask each shape to draw itself, now if we want to support a new shape in the ShapeDrawer class, we don’t need to modify the ShapeDrawer class at all.

Open: The classes like Shape that you’re operating on are “open” for subclassing (we can create new shapes) Closed: The code that operates on those objects is closed for modification, and doesn’t need to be updated in order to support related subtypes.

SOLID: Liskov Substitution Principle (L in SOLID)

A properly written subclass should be substitutable for its superclass and your code should still work. Basically, subclasses should honor the semantics of the superclass.

Consider this base class and its derived class:

class Shape {
public:
  virtual double getRotationAngle() const {
      // returns angle in radians
  } 
  ...
};

class Square: public Shape {
public:
  virtual double getRotationAngle() const {
    // returns angle in radians
  }
  ...
};

See the problem? The Shape class’s getRotationAngle() returns results in radians, but the Square class’s getRotationAngle() returns results in degrees. So any code built for the base class wouldn’t work with the derived class.

LSP: A properly written subclass should be substitutable for its superclass and your code should still work.

“Subtype Requirement: Let φ(x) be a property provable about objects x of type T. Then φ(y) should be true for objects y of type S where S is a subtype of T.” Liskov and Wing 1994

L.S.P. enables code re-usability – it’s also a litmus test for whether subclassing is warranted.

SOLID: Interface Segregation Principle (I in SOLID)

No code should be forced to depend on an interface (or base class) that contains methods it does not use. For example, here’s an interface that violates the ISP:

public interface Phone {
  void makeCall(string number);
  void hangUpCall();
  void takePhoto();
  void surfInternet(string url);
};

Why? Well what if we had a flip-phone implement this interface? Flip phones can’t take photos or surf the internet.

class FlipPhone implements Phone {
  void makeCall(string number) { ... }
  void hangUpCall() { ... }
  void takePhoto() 
   { throw NotImplementedException(); }
  void surfInternet(string url)
   { throw NotImplementedException(); }
};

So this interface is too broad. We’d have to have methods that are unimplemented and throw exceptions/panics. If you find yourself doing this, your interface is too broad. Instead, do this:

public interface Phone {
  void makeCall(string number);
  void hangUpCall();
};

public interface Camera {
  void takePhoto();
};

public interface Browser {
  void surfInternet(string url);
};

Now you can selectively choose which functions to support in your flip phone or smart phone.

SOLID: Dependency Inversion Principle (D in SOLID)

If class A uses class B, then design A to operate on an interface I, and have B implement I. Don’t have A directly use class B.

First Aspect: Higher-level classes should not directly refer to/use lower-level classes. – Instead, lower-level classes should implement an interface… and higher-level classes should be built on that interface.
Second Aspect: Abstractions (interfaces) should not be designed based on specific details of an implementation. – Instead, implementations should be designed to meet the requirements of the interface.

Don’t do this:

class CheapCoffeeMachine {
  public void addBeans() { ... }
  public void brewCoffee() { ... }
};

class CoffeeShop {
public CoffeeShop(CheapCoffeeMachine cm) 
    { this.cm = cm; }
  public void serveCoffeeToCustomer() {
    cm.addBeans();
    cm.brewCoffee();
    System.out.println("Here's your coffee!\n");
  }
  private CheapCoffeeMachine cm;
};

See the problem? Our CoffeeShop is now hard-coded to using cheap coffee machines. What if we wanted to open up a high-end coffee shop.

Instead define a coffee brewing interface:

public interface BrewCoffee {
  public void addBeans();
  public void brewCoffee();
};

class CheapCoffeeMachine implements BrewCoffee {
  public void addBeans() { ... }
  public void brewCoffee() { ... }
};

class FancyCoffeeMachine implements BrewCoffee {
  public void addBeans() { ... }
  public void brewCoffee() { ... }
};

And then build your CoffeeShop class around that:

 class CoffeeShop {
  public CoffeeShop(BrewCoffee cm) 
    { this.cm = cm; }
  public void serveCoffeeToCustomer() {
    cm.addBeans();
    cm.brewCoffee();
    System.out.println("Here's your coffee!\n");
  }
  private BrewCoffee cm;
};

public void launchChainOfShops() {
  // Low-end coffee shop
  CheapCoffeeMachine m1 =
    new CheapCoffeeMachine();
  CoffeeShop store1 = new CoffeeShop(m1);

  // High-end coffee shop
  FancyCoffeeMachine m2 =
    new FancyCoffeeMachine();
  CoffeeShop store2 = new CoffeeShop(m2);
}

public void launchChainOfShops() {
  // Low-end coffee shop
  CheapCoffeeMachine m1 =
    new CheapCoffeeMachine();
  CoffeeShop store1 = new CoffeeShop(m1);

  // High-end coffee shop
  FancyCoffeeMachine m2 =
    new FancyCoffeeMachine();
  CoffeeShop store2 = new CoffeeShop(m2);
}

See - now our CoffeeShop can use any coffee maker that implements the BrewCoffee interface! It’s not hard-coded to one type of coffee maker.

D.I.P. reduces coupling between classes providing flexibility for mixing/matching classes.

Controlla-palooza

Everything you never learned about how code executes!

Expressions

An expression is a combination of values, function calls and operations that, when evaluated, produces an output value. An expression may or may not have side effects.

Which of the following are expressions?

f(x)*5 – Yes!
if (foo) cout « “bar”; – No
a = 5 – In some languages, yes. Like in C++ we can do b = a = 5, so a = 5 is an expression that has a side effect and produces a result
a && (b || c) – Yes!
if a > 5 then a3 else a4 – Yes - in functional languages (or languages with functional features) this if code is an expression, which produces a result of either a3 or a4

Expression Evaluation Order

Consider the following C++ code:

int c = 1, d = 2;

int f() {
  ++c;
  return 0;
}

int main() {
  cout << c - d + f() ;
}

Or this code:

int c = 0; 

int f() {
  ++c;
  return c;
}

void h(int a, int b)
  { cout << a << " "<< b; }

int main() {
  h(c, f());
}

Challenge: What will these C++ programs print?
The first program produces either -1 or 0 The second program produces either 0 1 or 1 1

Why? C++ doesn’t specify what order the terms of an arithmetic expression must be evaluated! Nor does it specify what order parameters must be evaluated! Some languages mandate a left-to-right eval order (C#, Java, JavaScript, Swift), while others don’t (C++, Rust, OCaml).

So in general, if your expressions are calling functions, those functions shouldn’t have side effects that could impact the execution of the other functions called in the expression. Otherwise you’ll get unpredictable behavior depending on the compiler and potentially each time you compile the code it could do a different thing!

Why avoid specifiying ane explicit evaluation order? The reason that an ordering is not specified is to give the compiler latitude to optimize the code.

Associativity

Regardless what order each of the terms are evaluated… Most languages use left-to-right associativity when evaluating mathematical expressions. Which is the way we all learned to do it in school.

So an expression like:

cout << c - d + f() ;

Would be evaluated as:

cout << (c - d) + f() ;

Instead of:

cout << c - (d + f());

There are some notable exceptions, like evaluating exponentiation (e.g., 2^3^2 == 512, since we often evaluate exponentiation from right-to-left). Assignment is also right-associative, so we can do something like this:

  a = b = c = 5;  
  // equivalent to:
  a  = (b = (c = 5));   // so c = 5 runs first, then b = c runs, then a = b runs

Short Circuiting

In any boolean expression with AND / OR, most languages will evaluate its sub-expressions from left-to-right. The moment the language finds a condition that satisfies or invalidates the boolean expression, it will skip evaluating the rest of the sub-expressions. This can dramatically speed up your code!

For example, this code here:

if(is_cute() || is_smart() || is_rich()) {
  date_person();
}

Really executes like this:

if (is_cute()) 
  date_person();
else if (is_smart())
  date_person();
else if (is_rich())
  date_person();

And this code here:

if (good_salary() && fun_work() && good_culture()) {
  take_job();
}

Really executes like this:

if (good_salary()) {
  if (fun_work()) {
    if (good_culture()) {
      take_job();
    }
  }
}

Some languages, like Kotlin, have the ability to use boolean expressions with short circuiting and without! e.g.

fun rich(salary: Int): Boolean {
  if (salary > 200000) return true;
  print("Not in silicon valley!\n")
  return false
}

fun main() {
  val cute = true; val smart = false;
  val salary = 200000;
    
  if (cute == true || smart == true || rich(salary) == true)  // uses short circuiting
    print("You're my type!\n")

  if (cute == true or smart == true or rich(salary) == true)  // no short-circuiting
    print("I still want to date you!\n")
}

Iteration

There are three different types of iteration that we’ll study:

Counter-controlled: We use a counter, like we would in a for-loop in C++ or using a range in Python:

    for i in range(1,10):
        do_something()

Condition-controlled: We use a boolean condition to decide how to long to loop - this is almost always a while() or do-while() loop.

    while (condition)
       do_something()

Collection-controlled: We’re enumerating the items in a collection

   for (x in container)
        do_something(x)

Iteration in loops

In languages like C++ with simple counter-controlled loops (e.g., for(int i=0;i<10;i++) { … }) it’’s easy to see how looping works. But how does looping work when we use a collection-controlled loop (e.g., iterating thru a vector of objects) or use a range()? There’’s more going on under the hood than you think! There are two primary approaches: using an iterator (of which there are two types of iterators), or using first-class functions. Let’’s learn both!

These kinds of loops are implemented using “iterators”:

for n in student_names:
 print(n)

for i in range(0,10):
  print(i)

These kinds of loops are implemented using first-class functions:

fruits.forEach
	{ f -> println("$f") }

Iterator-based loops

An iterator is an object that enables enumeration of a sequence of values. An iterator provides a way to access the values of a sequence without exposing the sequence’s underlying implementation.

Iterators can be used to enumerate the values held in a container (e.g., a list, set)
Iterators can be used to enumerate the contents of external sources like data files
Iterators can be used to enumerate the values of an abstract sequence. An abstract sequence is one where the values in the sequence aren’t stored anywhere, but generated as they’re retrieved via the iterator

Iterators used with containers and external sources are typically distinct objects from the objects that contain the sequences they refer to.

Iterable Objects vs Iterators

An Iterable Object is an object like a container (e.g., vector, list, set) which can have its items iterated over.
An Iterator is a thing that can refer to an Iterable Object and be used to iterate over the vaulues it holds or represents.

To get an iterator, you request one from an iterable object. The Iterable Object then returns the iterator to you. You may then use the iterator to refer to the items in the iterable object.

Examples of getting an iterator from an iterable object:

// Kotlin: iterator into container
val numbers = listOf(10,20,30,42)

val it = numbers.iterator()  // get an iterator into the numbers list

// Swift: abstract sequence
var range = 1...1000000

var it = range.makeIterator()  // get an iterator to the first # of the abstract seq.

// C++
vector<string> fruits = {"apple", "pear", ... };

auto it = fruits.begin();  // get an iterator into a C++ vector

While iterators differ by language, most* tend to have an interface that look like one of these (* C++ has a notably different approach):

An interface with hasNext() and next() methods

These types of iterators are used with containers and external sources like data files. They have two methods:

iter.hasNext(): We ask the iterator (NOT the iterable object) whether it refers to a valid value that can be retrieved?

iter.next(): We ask the iterator (NOT the iterable object) to get the value that it points to, and advance to the next item

// Kotlin: iterator into container
val numbers = listOf(10,20,30,42)

val it = numbers.iterator()
while (it.hasNext()) {
  val v = it.next();
  println(v)
}

An interface with just next() methods

These types of iterators are used with Abstract Sequences, like ranges: range(1,10)

iter.next(): The iterator (NOT the iterable object) generates and returns the next value in the sequence, or indicates the sequence is over via a return code or by throwing an exception

// Swift: abstract sequence
var range = 1...1000000

var it = range.makeIterator()
while true {
  var v = it.next()  // in Swift, the next() method returns either nil or a valid value
  if (v == nil) { break }
  print(v)
}

C++ has unique iterators

C++ models its iterators after C++ pointers, which can be incremented, decremented, and dereferenced. They’re a notable exception to iterators in most languages!

// C++ has unusual iterator syntax!
vector<string> fruits = {"apple", "pear", ... };

for (auto it = fruits.begin(); it != fruits.end(); ++it)
  cout << *it;

So how does looping work under the hood?

When you loop over the items in an iterable object , the language secretly uses an iterator to move through the items! This is a fantastic example of Syntactic Sugar!

When you do this:

// Kotlin: iterate over list of #s
val numbers = listOf(10,20,30,42)

for (v in numbers) {
  println(v)
}

Here’s what’s really happening (the language generates this code secretly for you):

val numbers = listOf(10,20,30,42)

val it = numbers.iterator()
while (it.hasNext()) {
  val v = it.next()
  println(v)
}

Or when you do this:

// Swift: iterate over range

for v in 1...1000000 {
  print(v)
}

Here’s what’s really happening:

var range = 1...1000000

var it = range.makeIterator()
while true {
  var v = it.next()
  if (v == nil) { break }
  print(v)
}

How are iterators implemented?

There are two primary approaches for creating iterators:

With Traditional Classes: An iterator is an object, implemented using regular OOP classes.
“True Iterators” aka Generators: An iterator is implemented by using a special type of function called a generator.

Iterators built using traditional classes

Let’s see an iterator that is built using a traditional class - Python makes it easy.

First we must define an iterable class (e.g., like a vector class) which we can iterate over. To make a class iterable, we must define an __iter__() method that creates and returns an iterator object. The ListsOfStrings class is an iterable class - NOT an iterator. If its __iter__ method is called, it will return an iterator that can be used to iterate over its items.

class ListOfStrings:
  def __init__(self):
    self.array = []

  def add(self,val):
    self.array.append(val)

  def __iter__(self):
    it = OurIterator(self.array)   // calling __iter__ returns a new OurIterator object
    return it

Second we can define our OurIterator class. Notice that this iterator class is uniqely tied to our ListOfStrings class above. The two work hand-in-hand.

class OurIterator:
  def __init__(self, arr):
    self.arr = arr
    self.pos = 0

  def __next__(self):
    if self.pos < len(self.arr):
      val = self.arr[self.pos]
      self.pos += 1
      return val
    else:
      raise StopIteration

Our iterator class must have a method called __next__() that gets the value of the item pointed to by the iterator, advances the iterator, and returns the value. As you can see, if the iterator has run out of items to iterate over (ther are no more items in the iterable object), the iterator (in Python) will throw an exception to tell the user of the iterator that there are no more items left.

If we define our iterable class and iterator classes this way, Python will give us first-class looping support over our iterable class!

nerds = ListOfStrings()
nerds.add("Carey")
nerds.add("David")
nerds.add("Paul")

for n in nerds: 
  print(n)

And here’’s an example of an iterator in Java:

import java.util.Iterator;  

class ListOfStrings implements Iterable<String> {
  private String[] array;
  private int numItems = 0, maxSize = 100;

  public ListOfStrings() 
    { array = new String[maxSize]; }
  public void add(String val) 
    { array[numItems++] = val; }
  @Override public Iterator<String> iterator() {      
    Iterator<String> it = new OurIterator();
    return it;
  }

  class OurIterator implements Iterator<String> {
    private int iteratorIndex = 0;
    @Override public boolean hasNext() 
      { return iteratorIndex < numItems; }
    @Override public String next() 
      { return array[iteratorIndex++]; }
  }
}

Things to note:

The ListOfStrings class implements Java’s Iterable interface, which makes this clas an Iterable class (it can have its items iterated over). The class must then implement the iterator() method which can be used to obtain an iterator into the iterable object.
The Iterator class is a nested class within ListOfStrings, which lets it access the memer variables of ListOfStrings. Notice how it implements the Iterator interface. To do so, it must provide a hasNext() method and a next() method.

And here’s how we’d use our class.

public void testOurList() {         
  ListOfStrings nerds = 
    new ListOfStrings();

  nerds.add("Carey");
  nerds.add("David");
  nerds.add("Paul");

  Iterator it = nerds.iterator();  
  while (it.hasNext()) { 
    String value = (String)it.next();    
    System.out.println(value);
  } 

  for (String s: nerds)
    System.out.println(s);
}

Notice how above we can manually get the iterator then use hasNext() and next() with it. Or we can use Java’s looping syntax, and Java will use syntactic sugar, hiding the calls to get the iterator and use hasNext() and next().

Ok, next let’s see how to create a range() iterable using a class like we have in python (and many other languages):

class OurIter:  # iterator class
  def __init__(self, start, end):
   self.cur = start
   self.end = end

  def __next__(self):
    if self.cur < self.end:
      val = self.cur
      self.cur += 1
      return val
    else:
      raise StopIteration()

class OurRange:  # iterable class
  def __init__(self, end):
    self.end = end

  def __iter__(self):
    iter = OurIter(0, self.end)
    return iter

Notice the OurRange class, which is an iterable. We construct it with the maximum value to iterate to, so it will iterate from [0,N). The class implements an __iter__() method which is called to obtain an iterator into the iterable. You can see that this method creates a new OurIter iterator, specifying the starting and ending range values.

The OurIter class simply implements a __next__ method, just like any other iterator in python. This method obtains the next value if one is available, then advances so the next call will return the next value in the range, and finally returns the value. If no more values are available in the range, the method raises a StopIteration exception, which is the way that Python indicates the iterator has reached the end of its iteration.

“True Iterators” aka Generators

A “true iterator” is an iterator that is implemented by using a special type of function called a generator.

A generator is essentially a closure that can be paused and resumed over time. Many languages have generators, but we’ll use python to illustrate the concept since the syntax is pretty simple. But first let’s start with pseudocode:

def foo(n):
  for i in range (1, n):
    print(f'i is {i}')
    pause
    print('woot!')

def main():
  p = create_pausable(func=foo, n=4)  # Line A
  print('CS')
  start(p)    # Line B
  print('is')
  resume(p)  # Line C
  print('cool!')
  resume(p)

Here’s the general idea - remember the above is pseudocode:

Line A: You create a special closure (called a generator), which has one extra piece of data - an instruction pointer. The instruction pointer starts off on the first line of the function, and the function starts in a suspended state. It’s not running at this point. We’ve just created the closure.
Line B: We wake up the generator/closure begin running it. It will run until it hits the pause command. At that point, the instruction pointer is saved in the closure so we know where to continue running, and all changes made to the local variables in the closure are also retained. The generator returns to the next line of the main function and keeps running there. Later when we resume the generator, it continues running where it left off, with the same variables/values.
Line C: We resume the generator, and it continues running where it left off last time

So the above pseudocode would print:

CS
i is 1
is
woot!
i is 2
cool!
woot!
i is 3

Now here’s the real python syntax:

def foo(n):
  for i in range (1, n):
    print(f'i is {i}')
    yield
    print('woot!')

def main():
  p = foo(4)
  print('CS')
  next(p)
  print('is')
  next(p)
  print('cool!')
  next(p)

Here’s what changed:

Instead of “pause” we use the keyword “yield” to pause the function
Insetad of calling create_pausable to create our closure, we just call the function and pass in the arguments, then store the return value in a variable. This does NOT run the function. It just creates the closure, and then returns an object reference to the closure. We then store the object reference to the closure inside our variable p. How does Python know not to run the function but just to create a closure? Because it has a “yield” command inside of it. This implicitly tells Python this is a generator function and not a traditional function, and that the first call should not run the function, but just create a closure
Instead of calling “start” to start the generator running, and “resume” to resume its execution, we just use next() on the closure that was returned during the initial call.

By the way, to use our earler iterable/iterator nomenclature, the generator function is an “iterable”, and the closure that is returned when it’s first called behaves like an iterator.

Now let’s see a more complex generator - it can not only yield its execution, but every time it yields it can return a value to the caller!

def bar(a, b):
  while a > b:
    yield a  # Line A
    a -= 1

f = bar(3, 0)
print('Eat prunes!')
t = next(f)  # Line B
print(t)
t = next(f)
print(t)
t = next(f)
print(t)
print(f'Explosive diarrhea!')

Notice that on Line A, the generator “yields a value”. So not only does it pause itself, but it also returns a value at this time. Notice that on Line B, we wake up the generator and let it run until it yields a value. The yielded value is returned by next() and that value is stored in the t variable.

This is a more idiomatic use of a generator – they’re used to “generate” a sequence of values that can be retrieved one at a time – the sequence might be finite, or infinite!

Since a generator is an iterable object, and produces an iterator… You can use it like any other iterable object in loops! Another name for a generator function is a “true iterator.”

def our_range(a, b):
  while a > b:
    yield a
    a -= 1

print('Eat prunes!')
for t in our_range(3,0):
  print(t)
print(f'Explosive diarrhea!')

Here’s a Javascript generator:

// JavaScript defines generators with a * character after the function name
gen = function*(n) {
  yield 'I';
  yield 'want';
  yield n;
  yield 'cookies';
};

g = gen(10);
while (true) {
  r = g.next();
  if (r.done) break;
  console.log(r.value);
}
 
for (word of gen(3)) {
  console.log(word);
}

And here’s one in Kotlin:

// Kotlin generators care called "sequences"
fun vals(start: Int) = sequence {
  var i = start
  while (true) {
    yield(i)
    ++i
  }
}

fun main(args: Array<String>) {
  val iter = vals(42).iterator()
  var count = 3
  while (count-- > 0) {
    var n = iter.next()
    println(n)
  } 
  
  for (n in vals(7).take(5))  
    println(n)     // 7, 8, 9, 10, 11
}

Analysis of Class-based Iterators and Generators

Some things are easier to do with class-based iterators, and some things are easier to do with generators. But the two are isomorphic and functionally equivalent - anything we can do with a class we can do with a generator, and visa-versa.

First-class function-based iteration

Alright, we’ll end this lecture by looking at how some languages use first-class functions to do iteration. The syntax often looks something like this:

var fruits = ... // some list of fruits

fruits.forEach { f -> println("$f") }

Basically, the container (e.g., vector, list or set) contains a forEach() method, which accepts one parameter. We pass in a { lambda } function to the forEeach method as this parameter. The forEach method iterates over each item in the iterable, and passes each item as the parameter to the lambda function (e.g., to the parameter f shown in the lambda above).

Here are some examples:

// Rust
(0..10).for_each(|elem| println!("{}", elem));

let items = vec!["fee","fi","fo","fum"];
items.iter().for_each(|elem| println!("{}", elem));

# Ruby
(0..9).each do |elem| 
  print elem, " " 
end

items = ["fee","fi","fo","fum"]

items.each do |elem|
    print(elem,"\n")
end

// Kotlin
(0..9).forEach({ elem -> println("$elem") })

var items = arrayOf("fee","fi","fo","fum")
items.forEach({ elem -> println("$elem") })

When we use this syntax, we’re NOT using an iterator! Instead, we’re passing a function as an argument to a forEach()/each() method that loops over the iterable’s items!

Let’s build a forEach() method for our earlier Python container:

class ListOfStrings:
  def __init__(self):
    self.array = []

  def add(self,val):
    self.array.append(val)

  def forEach(self, f):   # define our own for-each method
    for x in self.array:
      f(x)

yummy = ListOfStrings()
... # add items to yummy

def like(x):
 print(f'I like {x}'))

yummy.forEach(like)

Notice how the iterable object (a ListOfStrings) iterates over its own items, and calls the provided f(). This is essentially a map() operation like we saw in functional programming!