Lecture 08

2023-04-26 | Week 4 | edited by Ruining

(originally written 2022-10-19 by Siddarth Krishnamoorthy)

Hi! Ruining here. This lecture note covers slides 53 to 88 of data palooza. This is also the last of the material that will be covered on the midterm.

Strong Typing continued
Weak typing
- Undefined behaviour
Type casting, conversion and coercion

Strong Typing continued

The definition of strong typing is disputed. Many academics argue for a stronger definition (e.g. all conversions between types should be explicit, the language should have explicit type annotations for all variables, etc.). But ultimately while these definitions may make a languages type system stricter, they don’t impact the languages type or memory safety.

Weak typing

Weak typing is essentially the opposite of strong typing, in that it does not guarantee that all operations are invoked on objects/values of the appropriate type. Weakly typed languages are generally neither type safe nor memory safe.

Undefined behaviour

In a strongly typed language, we know that all operations on variables will either succeed or generate an explicit type exception at runtime (in dynamically-typed languages). But in weakly-typed languages, we can have undefined behavior at runtime!

Undefined behaviour is the result of executing a program whose behaviour is prescribed as unpredictable in the language spec. Consider the following example in C++.

// C++ example w/undefined behavior!
class Nerd {
public:
  Nerd(string name, int IQ) { ...}
  int get_iq() { return iq_; }
  ...
};

int main() {
  int a = 10;
  Nerd *n = reinterpret_cast<Nerd *>(&a); // reinterpret the integer as a Nerd object
  cout << n->get_iq(); // ?? What happens?!?!?
}

When we call the get_iq method on n (which is actually an integer), the program is going to crash!

It’s tough to say whether or not a language is weakly or strongly typed just from looking at its behaviour in one situation. Consider the following example from a mystery language.

# Defines a function called ComputeSum
# In this language, @_ is an array that holds
# all arguments passed to the function

sub ComputeSum {
   $sum = 0;

   foreach $item (@_) {    # loop thru args
      $sum += $item;
   }

   print("Sum of inputs: $sum\n")
}

# Function call
ComputeSum(10, "90", "cat");

This outputs

Sum of inputs: 100

It may seem like the mystery language is weakly typed, but in fact the mystery language is Perl, which is strongly typed!

Weak typing is the situation where an undefined result occurs because we apply some operation to a type that does not support it.

In Perl, the + operator does support the string type. It will convert the string to an integer, if possible (e.g., if it holds all digits). If not, it’ll convert the string to a value of zero. Then it’ll perform the add.

So while Perl does perform implicit conversions which might not do what you want, its behavior is never undefined.

Type casting, conversion and coercion

Type casting, conversion and coercion are different ways of explicitly or implicitly changing a value of one data type into another.

For example, we can change the type of a variable using static_cast in C++.

void convert() {
  float pi = 3.141;
  cout << static_cast<int>(pi); //type conversion
}

Before diving into the details of casting and conversion, it helps to learn the concepts of supertypes and subtypes.

Subtypes and Supertypes

Often the types you are converting will be related to each other, for instance like float and double. Both types semantically represent real-valued numbers, but:

double can hold every value a float can hold and more, due to having a relatively higher precision.
any operation (e.g., +, -, *, /) you can perform on doubles you can also perform on floats.

So in language theory, we say that float is a subtype of double, or alternatively that double is a supertype of float.

More formally, given two types $T_{sub}$ and $T_{super}$, we say that $T_{sub}$ is a subtype of $T_{super}$ if and only if

every element belonging to the set of values of type $T_{sub}$ is also a member of the set of values of $T_{super}$.
All operations (eg +, -, *, /) that you can use on a value of type $T_{super}$ must also work properly on a value of type $T_{sub}$.
- i.e., If I have code designed to operate on a value of type $T_{super}$, it must also work if I pass in a value of type $T_{sub}$.

Ruining here. Another way of thinking about this could be like two toddlers bragging on a playground: Imagine the supertype saying:”Anything you can be, I can be” and the subtype saying:”Anything you can do, I can do”.

Does this mean float is a subtype of double in C++? Yes! Since all operations that can be performed on a double can also be performed on an float (+, -, *, /, etc.) and every float value is also in the set of double values.

To clear up the discussion from class about the type relations of int with either float or double: int is NOT a subtype of float but int IS a subtype of double since doubles have enough precision to represent all values that int can hold.

What about this example?

class Person {
public: 
  virtual void eat();
  virtual void sleep();
};

class Nerd: public Person {
  virtual void eat();
  virtual void sleep();
  virtual void study();
};

Is Nerd a subtype of Person? Yes, since all operations that can be performed on a Person can also be performed on a Nerd and all Nerds belong to the set of Persons.

For a more tricky example, is int a subtype of const int? Yes, because all operations that can be performed on a const int can also be performed on an int and int and const int have the same set of values.

Switching between types

Type conversion and type casting are used when we want to perform an operation on a value of type A, but the operation requires a value of type B. There are two ways of converting between types - conversions and casts

Type conversions

A conversion takes a value of type A and generates a whole new value (occupying new storage, with a different bit encoding) of type B. Type conversions are typically used to convert between primitives (e.g. float and int).

void convert() {
  float pi = 3.141;
  cout << (int)pi; // 3
}

This is a conversion since the compiler generates a new temporary value of a different type.

Type casts

A cast takes a value of type A and treats it as if it were value of type B – no conversion takes place! No new value is created!

Type casts are typically used with objects.

// Another casting example; treat an
// int as if it's an unsigned int!


int main() {
  int val = -42;

  cout << (unsigned int)val;
      // prints 4294967254
}

In the above example, val refers to the original int object, but interprets the bits as if it were an unsigned int.

Conversions and casts can be widening or narrowing.

Widening means the target type is a super type, and can fully represent the source type’s values.
Narrowing means the target type may lose precision or otherwise fail to represent the source type’s values. The target type could be a subtype (like long and short), or the two types could also have a non-overlapping set of values (like unsigned int and int).

Conversions and casts can also be expliict or implicit.

An explicit conversion requires the programmer to use explicit syntax to force the conversion/cast
An implicit conversion (also called coercion) is one which happens without explicit conversion syntax.

An implicit conversion or cast that is widening is called a type promotion.

Explicit conversions and casts

When you’re using an explicit conversion or cast, you’re telling the compiler to change what would be a compile time error to a run time check.

class Example
{
  public void askToCookFavoriteMeal(Machine m) {
    if (m.canCook() && m.canTalk()) {
      RobotChef r = m; // Line 5
      r.requestMeal("seared ahi tuna");
    }
  }
}

For example, at line 5, the programmer may know that converting from Machine to RobotChef is always safe, but a statically typed compiler can’t know that. Therefore, the compiler will output an error. With an explicit cast (like in the example below), the compiler knows that the programmer wants to do this, and allows the code to compile.

class Example
{
  public void askToCookFavoriteMeal(Machine m) {
    if (m.canCook() && m.canTalk()) {
      RobotChef r = (RobotChef)m; // Line 5
      r.requestMeal("seared ahi tuna");
    }
  }
}

If the language is strongly typed, then it will perform a runtime check to ensure the type conversion is valid. In weakly typed languages, however, improper casts/conversions are often not checked at run time, leading to nasty bugs!

Here are a few examples of explicit type conversions in different languages.

C++

// Explicit C++ conversions
float fpi = 3.14;
int ipi = (int)fpi;               // old way
int ipi2 = static_cast<int>(fpi); // new way
RobotCook *r = dynamic_cast<RobotCook *>(m);

Confusingly, even though static_cast<int>(fpi) is creating a new value (so it’s performing a conversion), C++ calls it a cast

Python

# Explicit Python conversions
fpi = 3.14
ipi = int(pi)

Haskell

-- Explicit Haskell conversion
fpi = 3.14 :: Double
ipi = floor(foo)     -- converts to 'Integral' super type

JavaScript

-- Explicit JavaScript conversion
fpi = 3.14
ipi = parseInt(pi) -- converts to int

Implicit conversions and casts

Most languages have a set of rules that govern implicit conversions that may occur without warnings/errors. For instance, the lectures slides have some associated with C/C++. When learning a new language, it helps to understand its implicit conversion policy!

Let’s look at some examples of implicit conversions and casts in different languages.

// Implicit C++ conversions
void foo(double x) { ... }

int main() {
  bool b = true;
  int i = b;  // b promoted to int
  double d =
    3.14 + i; // i promoted to double
  i = 2.718;  // 2.718 coerced/narrowed to int
  foo(i);     // i promoted to double
}

When we promote i to double in the statement double d = 3.14 + i;, it brings the all of the operands to a common super type so that addition can be performed. In such an expression with mixed types, the types of all variables must have “type compatibility” with each other.

In the above example, i is still an int, and it’s value is 2, not 2.718. Compilers (especially recent ones) will often generate warnings for narrowing coercions which may not preserve the full source value.

Python

# Implicit Python conversions
i = 123 + True  # True promoted to int  (124)
f = 1.23
sum = i + f     # i promoted to float

JavaScript

// Implicit JavaScript conversion
str = "sk" + 8   // coercion of 8 to string
n = "10" * "15"  // coercion to numbers

Java

// Implicit (un)boxing in Java
int i = 10;
// implicit boxing
Integer boxed_i = i;
// implicit unboxing
int unboxed_i = boxed_i;

Haskell doesn’t do implicit type conversions.

Type casting in depth

Casting is when we reinterpret the bits of the original value in a manner consistent with a different type. No new value is created by a cast, we just interpret the bits differently. The most common type of casting is from a super type to a subtype, for example, when upcasting from a subclass to a super class to implement polymorphism. Downcasting is another example, where we go from a super class to a subclass.

// C++
class Mammal { ... }
class Person : public Mammal { ... };
class Nerd: public Person { ... };

void ask_hobbies(Person &p) { ... }

void date_with_nerd(Nerd &n) {
  ask_hobbies(n); // upcast Nerd to Person
}

void date_with_mammal(Mammal &m) {
 // downcast Mammal to Person
 Person& p = dynamic_cast<Person &>(m);
 ask_hobbies(p);
}

Here is an example of implicit casting.

// C++
void print(unsigned int &val)
 { cout << val << endl; }

int main() {
   int j = -42;

   print(j);  // prints 4294967254
}

This cast simply reinterprets the value stored by j as if it were an unsigned int.

Many compilers will not warn about implicit casts when they are between unsigned and signed types - even though they are narrowing.

// C++
void print(const string &s)
 { cout << s << endl; }

int main() {
   string s = "UCLA";

   print(s);
}

This cast implicitly converts from type string to const string.

We can also use casting to change the interpretation of pointers and read/write memory using a different type. For example, consider the following snippet of code in C++.

void cast() {
  int a = 1078530000;
  int *ptr = &a;
  // Treat ptr as a float *
  cout << *reinterpret_cast<float *>(ptr); // prints 3.14159
}

The code treats ptr as if it is a pointer to a float (using reinterpret_cast), and when we dereference ptr, the bits are interpreted as if they were a float.

We can also reinterpret the value of an integer as a pointer (though this isn’t recommended).

void cast() {
  int a = 12340;
  // Treat a as a double pointer
  // Store value of pi at address 12340!
  *reinterpret_cast<double *>(a) = 3.14159;
}

So what’s better, explicit or implicit type conversions? It depends! Explicit type conversions lead to more verbose code and is less convenient, but reduces the likelihood of bugs. Implicit type conversions makes code less verbose and is more convenient, but greatly increases the likelihood of bugs.

Here are a few final examples of casting and conversions.

$a = 5;
if ($a) {
    print("5 is true!");
}

From this snippet of code, we can infer that the language supports type coercion, since it does a narrowing coercion from an integer value 5 to a boolean value false.

The language is actually Perl.

In the following example, we are told that the program doesn’t work if we remove the float32 conversion.

func main() {
  var x int = 5
  var y float32 = 10.0
  var result float32

  result = float32(x) * y // doesn't work if we remove float32()
  fmt.Printf("5 * 10.0 = %f\n", result)
}

From this, we can infer that the language requires explicit conversions between even comparable types (like int and float).

The language is actually Golang.