Lecture 05

2023-04-17 | Week 3 | edited by Matt Wang

(originally written 2022-10-10 by Ashwin Ranade)

Hey everyone, Matt here! This lecture is disjoint from the last one, and focuses entirely on Python (instead of functional programming). It covers slides 1-36 of Python Palooza. Let me know if you’ve got any corrections or questions!

Table of Contents

Just a Bit on Python

What do people use Python for?

  • quick and dirty scripting
  • industrial scripts
  • web backends (ex Flask and Django)
  • data science (pandas, numpy) and machine learning (pytorch, tensorflow)

Generally, people say to avoid python for writing efficient programs. Among other things, this is because Python is “interpreted”.

There are alternative implemenations of Python that aren’t interpreted, but are rather compiled!

But, people often use it in machine learning, which is a compute-intensive application. What gives? Well, Python can use something called a “C extension” - calling code written in other higher-performance languages like C and C++. This is what powers most data science and machine learning libraries like pytorch and tensorflow do under-the-hood!

“Pythonic”

Compared to other languages, Python is relatively opinionated on the right way to do things. Doing things the right way is often called “Pythonic”; we’ll use this word throughout these notes.

Learning Python through Challenges - Basics

Much of this lecture is learning program characteristics through examples. Let’s get started!

(implicitly, we’re assuming some existing Python knowledge from CS35L and other prerequisites)

Program Execution

Consider this following code:

# deltaco.py
def lets_go(person):
  print('Hey, ' + person)
  print("Let's go to Del Taco!")

lets_go('Carey')

def main():
  print('Oh crap, now I have gas!')

print('Nom nom')

When we run it, we get:

$ python3 deltaco.py
Hey, Carey
Let's go to Del Taco!
Nom nom

Pause. What does this tell us about Python?

  1. the Python interpreter runs from top-down
  2. defining a function with def does not run it
  3. even if we have a function called main(), it won’t always be run!

So, how would we define a main function? We will use a “magic” piece of code like this:

if __name__ == "__main__":
  main()

When we run a python script from the command line, the interpreter will set __name__ to "__main__". It’s a “special variable”!

Why not just do a top-level call to main(), like this?

main()

While it would fix our immediate problem, it would also run the code when we imported the module - which is not what we want!

More Variables and Types

Consider the following code: will it generate the error:

  • before the program runs
  • on line #1
  • on line #2
def add_n_print(x,y):
  print(x + y) # Line 1

def main():
  add_n_print('foo', 5) # Line 2

Answer: it generates an error when the code reaches line #1 at runtime. We’d get an error of the form:

TypeError: can only concatenate str (not "int") to str

Unlike C++, Python’s type-checker is at runtime. The type-checker detects that the types of values that a and b refer to are incompatible, and generates an error.

  • this is called dynamic typing - which we’ll talk about in-depth
  • however, note that Python still has types!
  • (and, this makes things harder to debug :/)

Fun fact: Python has recently supported “type hints”, which add Haskell-like type annotations to functions. These aren’t enforced by the interpreter!

Variables and Types

Will this trick_or_treat() function generate an error, or will it work?

def trick_or_treat(name, age):
  if age < 10:
    goodie = 'candy corn'
    print('Boooooo!')
  else:
    goodie = 'Snickers'
    print('Bwahahahahaha!')

  print(f"{name}, here's your {goodie}!")

trick_or_treat('Felix', 5)

Answer: it does work! Python is function-scoped: when an identifier is declared, it persists until the end of the function.

Contrast this to C++, where the equivalent code would generate an error:

void trick_or_treat(string name, int age) {
  if (age < 10) {
    string goodie = "candy corn";
    cout << "Boooooo!";
  } else {
    string goodie = "Snickers";
    cout << "Bwahahahahaha!";
  }
  cout << name << ", here's your " << goodie << "!\n";
}

Looping with Counters

What does this program print?

for i in range(2,5):
 print(f'{i} witches!')

for i in range(7,1,-2):
 print(f'{i} zombies!')

goblins = 100
while goblins > 0:
 if goblins % 97 == 0 or goblins < 62:
   break
 print('Boo!')
 goblins -= 1
print('That was scary!')

Answer: the following:

2 witches!
3 witches!
4 witches!
7 zombies!
5 zombies!
3 zombies!
Boo!
Boo!
Boo!
That was scary!

Why bring this up? Python’s range() behaves slightly differently from Haskell’s ranges:

  • Python’s range() is semi-open: it includes the first item, but not the last. So, range(2,5) = [2,5) = [2,3,4]
  • Haskell’s .. is completely closed: 2..5 = [2,5] = [2,3,4,5]

Learning Python through Challenges - Object-Oriented Programming

Classes

Consider the following Python class:

class Car:
  MILES_PER_GAL = 30
  def __init__(self, gallons):
    self.gas_gallons = gallons
    self.odometer = 0

  def drive(self, miles):
    gals_needed = miles / Car.MILES_PER_GAL
    if self.gas_gallons > gals_needed:
      self.gas_gallons -= gals_needed
      self.odometer += miles

  def get_odometer(self):
    return self.odometer

We can insantiate and use the object like so:

def use_car():
  c1 = Car(16)  # 16 gallons of gas
  c1.drive(10)
  print(f'I drove {c1.get_odometer()} mi')

Looking at this code, can you identify:

  1. how do you define a constructor in a Python class?
  2. how do you define member variables in a Python class?
  3. how is a method different from a normal Python function?

Answer:

  1. We define a constructor with the __init__ function. Notice the “explicit” self - this behaves like this in C++, but is always the first argument!
  2. We can define new member variables by just calling self.[variable_name]! Note that you can’t declare variables without assigning them!
  3. We define it “in” the function; it must have a self argument; and, it can access member fields/variables.

Now, consider this updated code:

class Car:
  MILES_PER_GAL = 30
  def __init__(self, gallons):
    self.gas_gallons = gallons
    self.odometer = 0

  def drive(self, miles):
    gals_needed = miles / Car.MILES_PER_GAL
    if self.gas_gallons > gals_needed:
      self.gas_gallons -= gals_needed

  def __update_odometer(self, miles):
    self.odometer += miles

  def get_odometer(self):
    return self.odometer

and

def use_car():
  c1 = Car(16)  # 16 gallons of gas
  c1.drive(10)
  print(f'I drove {c1.get_odometer()} mi')

Three more questions:

  1. Changing get_odometer()’s return statement to return odometer doesn’t work — why?
  2. How do you call a method from another method?
  3. How do we indicate whether a method is public or private?

Answers:

  1. Unlike C++, Python requires an explicit self. to identify member variables.
  2. Again, unlike C++, Python requires an explicit self. to identify member functions.
  3. Python’s access modifiers are built into variable names: two underscores creates a private method (compare __update_odometer to get_odometer). Other variables are public by default!

Syntactically, the definition of a Python class is a sequence of statements that gets run anytime the class is instantiated. So, you can include any legal Python code in your class definition!

Object Allocation

This question has a bit of setup. Compare this C++ code:

#include <cmath>

class Circle {
public:
  Circle(double rad)
    { this->rad = rad; }

  double area() const
    { return M_PI * rad * rad; }

  void setRadius(double rad)
    { this->rad = rad; }

private:
  double rad;
};

with the Python equivalent:

import math

class Circle:
  def __init__(self, rad):
    self.rad = rad

  def area(self):
    return math.pi * self.rad**2

  def set_radius(self, rad):
    self.rad = rad

In C++, there are two ways we can define an object:

// on the stack - "no pointer"
Circle c(10);
cout << c.area();
// on the heap - "with a pointer"
Circle *p = new Circle(20);
cout << p->area();

In Python, we can only do

c = Circle(10)
print(c.area())

(Carey’s slides then have a long animation that explains how heap versus stack allocation works; this is a refresher from CS32)

How does Python allocate objects - is it directly/on-the-stack, or with pointer indirection?

Hint: this is valid code:

c = Circle(10)
print(c.area())
c = None

Answer: Python uses something like pointers, called an object reference. Every Python object is allocated on the heap. Unlike C++, we don’t use any special syntax to “dereference” a pointer.

In Python, all variables are object references. This applies to primitives in other languages, like numbers – 42. On reassignment, we don’t “change” the allocated 42; instead, we allocate a new number, and point to it. Though, this is mostly abstracted away from you as a programmer!

This affects how we call methods too:

# Circle class in Python
import math

class Circle:

  def __init__(self, rad):
    self.rad = rad

  def area(self):
    return math.pi * self.rad**2

  def set_radius(self, rad):
    self.rad = rad

c = Circle(10)
print(c.area()) # here, "self" points to c!

Class Variables

Consider this module thing.py:

# thing.py
class Thing:
 thing_count = 0

 def __init__(self, v):
  self.value = v
  Thing.thing_count += 1
  self.thing_num = Thing.thing_count

What will the following print? Why?

import thing

t1 = Thing("a")
print(f"{t1.thing_num} {Thing.thing_count}")


t2 = Thing("b")
print(f"{t1.thing_num} {Thing.thing_count}")
print(f"{t2.thing_num} {Thing.thing_count}")

Answer:

1 1
1 2
2 2

Why?

  • thing_count is a class variable
  • when a class is defined, Python creates a special “class” object that’s shared by all instances of the class; it represents the entire class
  • so, we can think of thing_count as being “shared” across different instances of Thing
  • we can access class (member) variables with the class name - like Thing.thing_count += 1
  • in contrast, self.value is an instance (member) variable

Class Method

Consider the following Python code with a class method, which has no self parameter:

class Thing:
 thing_count = 0

 def __init__(self, v):
  self.value = v
  Thing.thing_count += 1
  self.thing_num = Thing.thing_count


 def change_val(self, new_val):
  self.value = new_val

 def a_class_method(foo, bar):
  return Thing.thing_count * foo + bar

 def another_class_method(bletch):
  Thing.a_class_method(bletch,20)

a_class_method and another_class_method don’t have a self parameter - so, they can’t access instance (member) variables or functions!

You can call them like this:

import thing

t1 = Thing("a")
Thing.another_class_method(42)

You generally want to use class methods and variables that only operate on state shared across the entire class (or have no state at all), and don’t need any information about the instances.

Copying of Objects

# Circle class in Python
import math

class Circle:

  def __init__(self, rad):
    self.rad = rad

  def area(self):
    return math.pi * self.rad**2

  def set_radius(self, rad):
    self.rad = rad

What happens when we run this code?

c = Circle(10)
print(c.area())
c2 = c
c.set_radius(0)
print(c2.area())  # What happens?!?!?

Answer: prints 0! This is a consequence of Python’s ojbect reference behaviour. Assignment does not copy by default!

But, we can implement our own copy implementation. One way to do this is with the copy.deepcopy() function:

import copy

c = Circle(10)
print(c.area())

c2 = copy.deepcopy(c)
c.set_radius(0)
print(c2.area) # not 0!

“Deep copying” recurisvely makes a copy of the top-level object, and every object it refers to.

In contrast, shallow copying - done with copy.copy() - copies just the top level object (but not recursively).

Automatic Garbage Collection

In Python, you don’t have to worry about freeing objects when you’re done with them, like you have to in C++. Instead, Python keeps track of who’s pointing to each object; when nothing points to it, Python cleans it up.

More broadly, this is called “automatic memory management”.

We’ll cover this much more in-depth in the coming weeks!

In other words: we don’t need the new and delete keywords!

Classes and Destructors

Python has “destructors”, but they’re rarely used, and not guaranteed to run. More generally, they’re really a type of “finalizer”.

It looks like this:

class TextBook:
  ...

class Student:
  def __init__(self):
    self.book = TextBook()

  def study(self):
    self.book.read()

  def __del__(self):
    if self.book.finished_reading():
      print("You graduated!")

Destructors are only called when GC’d - but, it might not happen at all! Again, we’ll explore this in a few lectures.

(in Python, instead of having your destructor deal with non-memory resources - like temporary files, network connections, etc. - you’ll want to explicitly free these)

Inheritance

One key OOP feature is inheritance. Consider:

class Person:
  def __init__(self, name):
    self.name = name

  def talk(self):
    print('Hi!\n')

class Student(Person):
  def __init__(self, name):
    super().__init__(name)
    self.units = 0

  def talk(self):
    print(f"Heya, I'm {self.name}.")
    print("Let's party! Oh... and ")
    super().talk()
def chat(p):
 p.talk()

def cs131_lecture():
 s = Student('Angelina')
 chat(s)

Some questions:

  1. How does a derived class inherit from a base class?
  2. How does a derived class method override a base class method?
  3. How does a derived method call a base-class method?

Answers:

  1. By placing the base class in the parens (e.g. Student(Person))
  2. “Virtual by default” - as long as they have the same name!
  3. Using the super() prefix!

You need to explicitly call the base class constructor!

Duck Typing

Consider another classical pedagogical example:

class PersonInDuckSuit:
  ... 			  # code omitted for clarity
  def quack(self):
    print('Hi! Err... oops, I mean quack quack.')

class Duck:
  ... 			  # code omitted for clarity
  def quack(self):
    print('Quack quack quack!')

class Vehicle:
  ... 			  # code omitted for clarity
  def drive(self):
    print('Vrooooom!')

What does this print?

def quack_please(x):
  x.quack()

p = PersonInDuckSuit()
d = Duck()
v = Vehicle()
quack_please(p)
quack_please(d)
quack_please(v)

Answer:

Hi! Err... oops, I mean quack quack.
Quack quack quack!
AttributeError: 'Vehicle' object has no attribute 'quack'

This is a feature called duck typing - at runtime (and only when the line runs), Python checks if the input to quack_please() has a quack method. If it does, it runs that method; if not, an error is run.

This is different from C++, which requires features like inheritance (or interfaces) to make this code works.

Object Equality

What does the following print?

# Different types of equality in Python
fav = 'pizza'
a = f'I <3 {fav}!'
b = f'I <3 {fav}!'
c = a

if a == b:
  print('Both objects have same value!')
if c is a:
  print('c and a refer to the same obj')
if a is not b:
  print('a and b refer to diff. objs')

Answer:

Both objects have same value!
c and a refer to the same obj
a and b refer to diff. objs

Core ideas:

  • == looks at values, not references
  • is looks at references, not values

Object Identity

Each distinct object has an ID number that we can access with the id function. For example,

# Different types of equality in Python
fav = 'pizza'
a = f'I <3 {fav}!'
b = f'I <3 {fav}!'
c = a

print(id(a)) # 140426129935136
print(id(b)) # 140426129649376
print(id(c)) # 140426129935136 -- the same!

In CPython (the reference Python implementation), id() returns the (virtual) memory address.

Here’s a challenge: are these object IDs going to be the same?

booger = 10
print(id(booger))
booger = booger + 1
print(id(booger))

Answer: no! Since, 10 and 11 have different addresses in memory! This is in contrast to C++, where the reference stays the same - the value at that memory just changes.

None

Python has a keyword called None. Based on some code examples, what do you think it does?

def play_with_dog(lst):
  obj = None
  for x in lst:
    if x.is_canine():
      obj = x
      break
  if obj is not None:
    obj.give_bone()

def declare_happiness(what = None):
  if what is None:
    print("I'm happy for no reason!")
  else:
    print(f"{what} makes me happy!")

We use None to indicate the absence of a value - it’s like nullptr (in a way). The Pythonic way to validate if an item is/is not None is with the is and is not keywords.

What do you think the following will do?

q = None

if q is False:
  print('Is None the same as False?')

if not q:
  print('Does not work with None?')

if q is None:
  print('Ahhh q is None!')

if q == None:
  print('Ahh q == None!')

Answer:


Does not work with None?
Ahh q is None!
Ahh q == None!

Importantly, None and False aren’t exactly the same thing! But, it’s “falsy” - which is why we get if not q working with None.