Lecture 11

2022-10-31 | Week 6 | by Boyan Ding

Hi! Boyan here. This lecture note covers Function Palooza starting from slide 21.

Parameter Passing Semantics
Returning Results and Handling Errors

Parameter Passing Semantics

Parameter Passing Semantics is the term we use to describe the underlying mechanisms that languages use to pass arguments (e.g., x+y) to functions (e.g., f(x+y)).

The four most common parameter passing semantics are:

Pass by value: The formal parameter gets a copy of the argument’s value/object
Pass by reference: The formal parameter is an alias for the argument’s value/object
Pass by object reference: The formal parameter is a pointer to the argument’s value/object
Pass by name: The parameter points to an expression graph that represents the argument

As it turns out, these are closely related to binding semantics we learned!

Pass by Value

Approach: Each argument is first evaluated to get a value, and a copy of the value is then passed to the function for local use.

Take a look at the following C++ program, the parameter n in stinkify is passed by value:

void stinkify(string n) {
  string t = n + " stinks!";
  n = t;
} 

int main() {
  string s = "Devan";

  stinkify(s);
  cout << s; // Prints "Devan"
}

When stinkify(s) is executed, the n in stinkify’s activation record is initially copied from s from main. When n is modified at the end of the function, it only affects the local copy without affecting the original argument s in main.

Pass by Reference

Approach: Secretly pass the address of each argument to the function. In the called function, all reads/writes to the parameter are directed to the data at the original address.

Let’s looked at a slightly altered version of above program, where the parameter n is passed by reference instead (notice the “&” before n):

void stinkify(string &n) {
  string t = n + " stinks!";
  n = t;
} 

int main() {
  string s = "Devan";

  stinkify(s);
  cout << s; // Prints "Devan stinks!"
}

This time when stinkify(s) is executed, the n in stinkify’s activation record points to the s in main’s activation record. This enables the formal parameter to act as an alias for the original value/object. Thus, each read/write of the formal parameter is directed to the original variable’s storage. So the modification of n changes the s in main.

One byproduct that can be caused by pass by reference is aliasing. It occurs when two parameters unknowingly refer to the same value/object and unintentionally modify it. It can cause subtle and difficult bugs.

void filter(set<int> &in,
            set<int> &out) {
  out.clear();
  for (auto x: in)
    if (is_prime(x)) out.insert(x);
}

int main() {
  set<int> a;
  // ... fill up a with #s
  filter(a, a);
}

The function filter is supposed copy all prime numbers from in into out. Although it seems to do the job perfectly, problem will occur when in and out refer to the same variable, as demonstrated with the call filter(a, a) above.

In the example, when in and out refer to a, out.clear() will clear the input a before it gets processed. This causes the wrong result to be generated.

Pass by Object Reference

Approach: All values/objects are passed by pointer to the called function. The called function can use the pointer to read/mutate the pointed-to argument.

class Nerd:
  def __init__(self, name, iq):
    self.name = name
    self.iq = iq

  def study(self):
    self.iq = self.iq + 50

# ... 

def be_a_nerd(n):
  n.study();

def main():
  a = Nerd("Alwyn", 150)
  be_a_nerd(a);
  print(a)

In the python code, the call be_a_nerd(a) copies the object reference a into the formal parameter n of function be_a_nerd. So within be_a_nerd, the local variable n points to our original object, which can be mutated through the object reference.

But there is one thing to remember: when we pass by object reference, we can’t use assignment to change the value of the original value/object. Let’s see the following example:

def stinkify(n):
  t = n + " stinks!";
  n = t

def main():
  s = "Devan";

  stinkify(s);
  print(s); # Prints "Devan"

We might expect the assignment n = t to change the original variable s in the calling function. But in fact, the assignment only changes the local object reference n to point to the storage of t in Heap Memory. It has no imact on the original object reference s or the object s points to.

The takeaway is that assignments of object references never change the passed-in value/object. They just change where the local object reference points to. It contrasts with passing by reference, where assignment can change the original value.

So, in a language that uses Object References like Java or Python, your function can either return a new object with relevant changes (e.g., x = stinkify(x)), or use mutator functions.

Pass by Name/Need

Approach: Each parameter is bound to a pointer that points to an expression graph (a “thunk”) which can be used to compute the passed-in argument’s value.

Here, a trunk is typically implemented as a lambda function, which can be called to evaluate the full expression graph and produce a concrete result when it is needed.

Haskell is a notable example of pass by need.

func2 y =
  y^2+7

func1 x =
  func2 (3 + x)

main = do
  let z = func1 5
  print z

In pass-by-need, once an expression graph is evaluated, the computed result is memoized (cached) to prevent repeat computation. For example, in the code above, if we do another print z after the existing print, the value of z will be cached without the need for another evaluation.

Parameter Passing by Language

Let’s look at some common language and the parameter passing scheme they use:

C++: Pass by value/reference/object reference (pointer)/macro expansion
Go: Pass by value (primitives)/object reference
Haskell: Pass by need
Java: Pass by value (primitives)/object reference
JavaScript: Pass by value (primitives)/object reference
Python: Pass by object reference

Practice: Classify That Language: Parameter Passing

Consider the following program, which prints:

q is 110 x is 20, y is 60

What parameter passing strategy is this language using?

struct Record {
 x:i32,
 y:i32
}

fn change_value(v: &mut i32) {
  *v += 100;
}

fn change_struct(r: &mut Record) {
  r.x *= 2;
  r.y *= 3;
}

fn main() {
  let mut q:i32 = 10;
  change_value(&mut q);
  println!("q is {}", q);

  let mut rec = Record{x:10,y:20};
  change_struct(&mut rec);
  println!("x is {}, y is {}", rec.x, rec.y); 
}

Answer: This language is using a hybrid of pass-by-reference and pass-by-pointer. (This is rust)

def addIfFirstEven(a: => Int, b: => Int): Int = 
  var sum = a 
  if (a % 2 == 0)
    sum += b
  return sum

def triple(x: Int): Int =
   var trip: Int = x*3
   println("3*"+x+" is: "+trip)
   return trip

object Main {
 def main(args: Array[String]): Unit =
   var v1 = 1
   var v2 = 2
   var result = addIfFirstEven(v1,triple(v2))
   println("The result is: "+result)
}

Returning Results and Handling Errors

By the end of this section, you should be able to: Take a new language and figure out its approach to returning results and handling errors in functions.

The Big Picture

In addition to returning values (e.g., the square root of a number) functions often need to communicate error conditions.

Early on, languages had no real support for this, so folks would just roll their own solutions. For instance, functions used enums (success, failure) or sentinel values (-1, null, or false) to report results/errors.

This created a hodge-podge of different error-handling approaches, and introduced its own set of bugs and issues. So as languages have evolved, they’ve started providing explicit mechanisms for dealing with results and errors.

Bugs, Errors, Results

Before we discuss how languages handle bugs/errors/results, let’s define these terms.

A bug is a flaw in a program’s logic – the only solution is aborting execution and fixing the bug. The examples of a bug can include:

Out of bound array access
Deference of a nullptr
Divide by zero
etc.

Other than bugs, there also exist unrecoverable errors. These are non-bug errors where recovery is impossible, and the program must shut down. Examples include:

Out of memory error
Network host not found
Disk full
Invalid app configuration

Other less severe errors are recoverable errors. Under this kind of error, the program may continue the execution. Some examples of this type include:

File not found
Network service temporarily overloaded
Malformed email address

Finally, when there is no bug or error, the program will produce a result, which is an indication of the outcome/status of an operation.

Handling Techniques: Overview

Here are the major “handling” paradigms provided by various languages:

Roll your own: The programmer must “roll their own” handling, like defining enumerated types (success,error) to communicate results.
Error Objects: Error objects are used to return an explicit error result from a function to its caller, independent of any valid return value.
Optional Objects: An “Optional” object can be used by a function to return a single result that can represent either a valid value or a generic failure condition.
Result Objects: A “Result” object can be used by a function to return a single result that can represent either a valid value or a specific Error Object.
Assertions/Conditions: An assertion clause checks whether a required condition is true, and immediately exits the program if it is not.
Exceptions and Panics: f() may “throw an exception” which exits f() and all calling functions until the exception is explicitly “caught” and handled by a calling function or the program terminates.

We will detail a part of them.

Error Objects

Error objects are language-specific objects used to return an explicit error result from a function to its caller, independent of any valid return value.

Languages with error objects provide a built-in error class to handle common errors.
Error objects are returned along with a function’s result as a separate return value.

Let’s look at a real example written in Go:

func circArea(rad float32) (float32, error) {
  if rad >= 0 {
    return math.Pi*rad*rad, nil
  } else {
    return 0, errors.New("Negative radius")
  }
}

func cost(rad float32, cost_per_sqft float32) 
         (float32, error) {
  area, err := circArea(rad)
  if err != nil { return 0, err }
  return cost_per_sqft * area, nil
}

The function circArea returns both a double and an error result, error is a built-in type in the second place of result tuple.

If the radius is valid, then we return the circle’s area and nil for the error result, meaning no error occurs.
Otherwise, we return 0 for th area and an error object. We specify what went wrong with the message passed into the constructor error.New

The function cost gets bot the area and error result from circArea. When an error occurs, it propages the error up, and does normal computation otherwise.

Besides using the built-in error type, you can define custom error classes with fields that are specific to your error condition, and even wrap (nested) errors to provide more context.

Optionals

An “Optional” object can be used by a function to return a single result that can represent either a valid value or a generic failure.

You can think of an Optional as a struct that holds two items: a value and a Boolean indicating whether the value is valid.

Since you only have a simple Boolean to indicate success or failure (not a detailed error description), you only want to use Optional if there’s an obvious, single failure mode.

Let’s first see an example in C++ (C++17)

std::optional<float> divide(float a, float b) {
  if (b == 0)  
    return std::nullopt;  // error result!  
  else
    return std::optional(a/b); 
}

int main() {
 auto result = divide(10.0,0);
 if (result)
    cout << "a/b is equal to " << *result;
 else 
    cout << "Error during division!";
}

The return type std::optional<float> of function divide indicates that this function returns a float value if it’s successful.

If the function can’t compute a valid result, it can return nullopt which explicitly indicates an error.
Otherwise, we construct and return an optional object containing our valid floating-point value.

C++ allows us to treat the optional object like a Boolean when checking it. If it contains a valid result, it’ll evaluate to true. Then we can then use the overloaded * operator to get the actual float value embedded in the optional object.

You can only use the * operator on C++ optional object when the result is valid. Otherwise, it will result in unspecified behavior (because there is no result when error occurs).

In some languages, optionals are a built-in part of the language with dedicated syntax! Let’s look at an equivalent example in Swift.

func divide(a: Float, b: Float) -> Float? {
    if b == 0 {
      return nil;
    }
    return a/b;
}

var opt: Float?;
opt = divide(a:10, b:20);
 
if opt != nil {
  let a_div_b: Float = opt!;
  print("Result was: ", a_div_b);
} else {
  print ("Error result!")
}

As we can see, the ? as in Float? is used for optional return value.

When returning:
nil is used for error result
Any other value directly creates the optional that represents a valid value.
When using: ! is used to extract the value from the optional if it is valid.

Result Object

A “Result” object can be used by a function to return a single result that can represent either a valid value or a distinct error.

You can think of a Result as a struct that holds two items: a value and an Error object with details about the nature of the error.

enum ArithmeticError: Error {
  case divisionByZero
  // add other error types here as necessary
}

func my_div(x: Double, y: Double) -> 
       Result<Double, ArithmeticError> {
  if y == 0  { return .failure(.divisionByZero) }
  else       { return .success(x / y) }
}

let result = my_div(x:10, y: 0)
switch result {
  case .success(let number):
    print("Successful division: ", number)
  case .failure(let error):
    dealWithTheError(error)
}

Different from optional we discussed earlier, the error result of result type carries detailed information using error object. You can use it when there are multiple distinct failure modes that need to be distinguished and handled differently.

Assertions

An assertion is a statement/clause inserted into a program that verifies assumptions about your program’s state (its variables) that must be true for correct execution.

We typically use assertions to verify:

Preconditions: something that must be true at the start of a function for it to work correctly.
Postconditions: something that the function guarantees is true once it finishes.
Invariants: An invariant is a condition that is expected to be true across a function or class’s execution.

Consider the states to verify in selection sort: void selection_sort(int *arr, int n);.

Preconditions: arr must not be nullptr, n must be >= 0
Postconditions: For all i, 1 <= i < n, arr[i] > arr[i-1]
Invariants: At the end of iteration j, the first j items of arr are in ascending order

An assertion tests a particular condition and terminates the program if it’s not met. An assertion states what you expect to be true. Your program aborts if it’s not true. Here are a few examples:

// C++ assertions
void selection_sort(int *arr, int n) {
  assert(arr != nullptr);
  assert(n >= 0);
  ...
}

Some languages enable you to provide a message to explain what went wrong.

// Java assertions
public class SelectionSort {
 public void sort(int arr[]) {
   assert arr != null : "Invalid arr";
   ...
}

A few languages (Eiffel, Ada) let you explicitly specify pre- and post-conditions for each function.

-- Eiffel pre and post conditions
selection_sort (arr: ARRAY [G]): ARRAY [G]
  require
    arr_not_void: arr /= Void
  local
    -- locals go here ...
  do
    -- sorting code here sorts the numbers
    -- and stores results in out_arr variable
  ensure
    result_is_set: out_array /= Void
    result_sorted: is_ascend(out_array) = True
  end

Exception Handling

With other error handling approaches, error checking is woven directly into the code, making it harder to understand the core business logic.

With exception handling, we separate the handling of exceptional situations/unexpected errors from the core problem-solving logic of our program. Thus, errors are communicated and handled independently of the mainline logic. This allows us to create more readable code that focuses on the problem we’re trying to solve and isn’t littered with extraneous error checks that complicate the code.

Participants of Exception Handling

There are two participants with exception handling: a catcher and a thrower

A catcher has two parts:

A block of code that “tries” to complete one or more operations that might result in an unexpected error.
An “exception handler” that runs iff an error occurs during the tried operations, and deals with the error.

void f() {
  try {
    g();
    h();   // Might have an unexpected error
    i();
  }
  catch (Exception &e) {
    deal_with_issue(e);   // Deals with the error if one occurs
  }
}

In the C++ code above. The block of code within try belongs to the first part of the catcher, which contains the call to function h, which might result in an error. On the other hand, the following catch block is the second part. It contains the error handler, and it will only be executed in case an error happens.

The thrower is a function that performs an operation that might result in an error. If an error occurs, the thrower creates an exception object containing details about the error, and “throws” it to the exception handler to deal with.

A C++ example might look like the following:

void h() {
  // Next command might fail!
  if (some_operation() == failure)
    throw runtime_error("details..");

  op_succeeded_so_do_other_stuff();
  do_even_more_stuff();
  finish_with_more_stuff();
}

If any operation performed in the try block results in a thrown exception, the exception is immediately passed to the exception handler (in catch block) for processing.

Let’s say in the function f, an failure occurs inside function h as shown in the code, it throws an exception at the throw in the code above.

When the exception is thrown within h, it is immediately exited, and the remaining statements are skipped. It acts as if the function h immediately returns.
All remaining statements in the try block within f are also skipped.
The execution flow goes into the exception handler within the catch block instead. The exception handler processes the exception and figures out how to proceed.
Finally, when the exception handler completes, execution continues normally.

Execution Flow

Let’s use another exemple to illustrate the execution flow with exception handling.

void f() {
  do_thing0();
  try {
    do_thing1();
    do_thing2();
    do_thing3();
    do_thing4();
    do_thing5();
  }
  catch (Exception &e) {
    deal_with_issue1(e);
    deal_with_issue2(e);
  }
  do_post_thing1();
  do_post_thing2();
}

If we assume there is an exception generated in do_thing3(), the execution flow of function f is:

do_thing0
do_thing1
do_thing2
do_thing3
deal_with_issue1
deal_with_issue2
do_post_thing1
do_post_thing2

On the other hand, when no exception is generated, the execution flow will be:

do_thing0
do_thing1
do_thing2
do_thing3
do_thing4
do_thing5
do_post_thing1
do_post_thing2

What is An Exception?

An exception is an object with one or more fields which describe an exceptional situation.

At a minimum, every exception has a way to get a description of the problem that occurred, e.g., “division by zero.”

But programmers can also use subclassing to create their own exception classes and encode other relevant info. The following C++ code demonstrates the creation of a custom exception.

class WebsiteException: 
           public std::exception {
public:
 WebsiteException(const string& website,
                  int http_error) {
    website_ = website;
    http_error_code_ = http_error;
 }
        
 string get_website_url() const 
    { return website_; }
 string get_http_error_code () const 
    { return http_error_code_; }
 const char* what() const noexcept 
    { return "Unable to connect to URL\n";}
private:
  string website_;
  int http_error_code_; // eg, 404 access denied
};

// ...
HttpStatus s = connect(url);
if (s != OK)
  throw WebsiteException(url, s.getHTTPStatus()); 

More on Exception Handling

In fact, an exception may be thrown an arbitrary number of levels down from the catcher.

void h() {
  if (some_op() == failure)
    throw runtime_error("deets");

  do_other_stuff();
}

void g() {
  some_intermediate_code();
  h();
  some_more_code();
}

void f() {
  try {
    cout << "Before!\n";
    g();
    cout << "After!\n";
  }
  catch (Exception &e) {
    deal_with_issue(e);
  }
}

In the C++ example above, when function h throws an exception, it will automatically terminate every intervening function (g in the example) on its way back to the handler (f).

Additionally, an exception handler can specify exactly what type of exception(s) it handles. A thrown exception will be directed to the closest handler (in the call stack) that covers its exception type.

void h() {
  throw out_of_range("negative index");
}

void g() {
  try {
    h();
  }
  catch (invalid_argument &e) {
    deal_with_invalid_argument_err(e);
  }
}

void f() {
  try {
    g();
  }
  catch (overflow_error &e) {
    deal_with_arithmetic_overflow(e);
  }
  catch (out_of_range &e) {
    deal_with_out_of_range_error(e);
  }
}

In the code above, the out_of_range exception thrown in h is catched by the second catch block in f.

If an exception is thrown for which there is no compatible handler, the program will just terminate. This is the equivalent of a “panic” which basically terminates the program when an unhandle-able error occurs.

It turns out that different types of exceptions are derived from more basic types of exceptions. For example, C++ has an exception hierarchy, where std::exception is the base class of all exceptions.

So if you want to create a catch-all handler, you can have it specify a (more) basic exception class. That handler will deal with the base exception type and all of its subclassed exceptions.

That’s it for this time. We still have more to discuss about exception, and we’ll continue in the next lecture.