Lecture 06

2023-04-19 | Week 3 | edited by Ruining Ding

(originally written 2022-10-12 by Ashwin Ranade, Siddarth Krishnamoorthy)

Hey everyone, Ruining here! This lecture covers slides 37-59 of Python Palooza and slides 1-15 of Data Palooza. Let me know if you’ve got any corrections or questions!

Strings
- Substrings
Lists
- List Implementation + Complexity
- Lists of Lists
Tuples
Sets
- more Python set operations
Dictionaries
Parameter Passing
- Parameter Passing Practice
Handling Errors in Python
Modules
Functional Influences in Python
- Comprehensions
- Lambdas
Data Palooza
Variables and Values

Strings

In Python, each string is an object
Strings are immutable, meaning they can’t be modified once created.

fact1 = 'Del Taco rules! '
fact2 = 'CS131 is lit! '

truth = fact1 + fact2 #1

truth += 'I have spoken.' #2

The code above appears to mutate the string referred to by truth, but what’s really happening? …

Answer: Line 1 and 2 both create a NEW string object, change the object reference to point at the new object, and, at some later time, the original strings are garbage collected.

In general, Python automatically generates/trashes lots of objects as code runs without needing to notify the programmer.

Substrings

# String slicing in Python
truth = 'UCLA students are awesome!'
print(truth[6]) #t
print(truth[1:3]) #CL
print(truth[:4]) #UCLA
print(truth[22:]) # some!
print(truth[-4:-1]) # ome

substring of length 1 at a: str[a]
substring from indices [a-b): str[a:b]
- when first param is empty, 0 is assumed: str[:b] is the same as str[0:b]
- when the second param is empty, End is assumed: str[a:] <-> str[a:End]
to slice characters relative to the end of the string, use negative indexes, e.g.:
- str[-1] is the last character in str
- str[:-1] == str[:len(str)-1]

Keep in mind, the right value of the substring range is always exclusive.

Lists

List are also objects and support the same operations as strings, but, lists are mutable!

stuff = [42, False, 'walnuts']  

stuff[2] = 'USC students'
stuff[0:2] = ["It's", 4, 'real']
stuff.append('are')    # or stuff += ['are']

stuff = stuff + ['subpar']
print(stuff[3:])
if 'walnuts' not in stuff:
  for s in stuff:
    print(s)
------------------------------
['USC students', 'are', 'subpar']
It's
4
real
USC students
are
subpar

Unlike Haskell Lists, Python lists can hold elements of different types.

We can replace elements/sublists with other elements/sublists. Element replacement must be one-to-one (‘walnuts’ and ‘USC students’) but sublists don’t need to be the same size to be reassigned ([“It’s”, 4, ‘real’] and [42, False]).

stuff.append mutates the list directly while stuff = stuff + ['subpar'] creates a new list since assignment still creates a whole new object.

Sidenote: We can also use stuff.pop to remove return the last item of a list.

The for loop s will point to successive items in the listas the loop runs.

List Implementation + Complexity

Accessing the jth element of a list, e.g., x[j] is super fast!
Lists are implemented with dynamically-allocated arrays of object references.

big-O

in and not in: O(n) for a list of size n
list[j] where j is large: O(1)
appending one list to another: O(m+n)

For more details on this last bulletpoint, check out this CampusWire post.

Lists of Lists

# Lists of lists (lol) in Python

primes = [1,2,3]
odds = ['carey', 'todd']

lol = [primes, odds]

primes[2] = 7
odds.append('paul')
odds = [1,3,5]

print(lol)
-------
[[1, 2, 7], ['carey', 'todd', 'paul']]

lol builds a list of lists and essentially contains two object references: one to primes and one to odds.

When we run primes[2] = 7 and odds.append('paul'), lol is updated accordingly But, when we run odds = [1,3,5], the odds reference is reassigned but the first object reference in lol is not!

Q: What would print if we moved the line declaring lol to right above the print statement?

…

A: [[1, 2, 7], [1, 3, 5]]

Tuples

Python tuples are immutable, ordered groups of items, e.g., ([1,2,3],False), (1,True,'foo'), etc.

Tuples can contain various types of items and are commonly used for returning multiple values from a function.

def get_school_and_scholarship(gpa):
  if gpa > 4.2:
    return ('UCLA', 0)
  else:
    return 'USC', 100000


tup = get_school_and_scholarship(4.5)
print(f'You got into {tup[0]} with ${tup[1]}')

skool, dough = get_school_and_scholarship(1.3)
print(f'You got into {skool} with ${dough}')
-------
You got into UCLA with $0
You got into USC with $100000

Tuples can be defined explicitly with parentheses (('UCLA', 0)) or implicitly ('USC', 100000).

We can access the items of the tuple by indexing or assign names with pattern matching.

Sets

Python has built-in support for the set abstract data type (ADT).
Python sets hold a single unique copy of each value
operations: .add, .remove

draining = set()

draining.add('CS131')
draining.add('dating')
draining.add('studying')
draining.remove('CS131')
print(draining)

if 'CS131' not in draining:
  print('Studying for CS131 is NOT draining!')

# Let's create a set from a list...
dinner = ['salad','soup','steak','soup','pie']
dinner_set = set(dinner)
print(f'Unique foods: {dinner_set}')
---------------------------------
{'dating', 'studying'}
Studying for CS131 is NOT draining!
Unique foods: {'soup', 'steak', 'pie', 'salad'}

sets are unordered
Python uses hash table for sets

more Python set operations

- is difference
| is union
& is intersection

Be careful differentiating is/and/or from ==/&/ | in Python. and and or will give you the wrong result if you try to perform set operations.

Dictionaries

Python has first-class support for dictionaries (maps) – they’re super fast!
Note: Python dictionaries are insertion-ordered as of Python 3.7, but sets are still unordered. (Unlike C++ maps, Python dictionaries are NOT ordered by key.)
each key is unique in a dictionary and maps to only 1 value
different keys can map to values of different types

attrs = {'cs32':'a weeder', 'cs143':'practical'}

attrs['cs181'] = 'theoretical'
attrs['cs132'] = ['challenging','applied']
attrs['cs181'] = 'fascinating'

if 'cs181' in attrs:
  print(f"CS181 is known to be {attrs['cs181']}")

del attrs['cs181']    # remove key/value

for key, val in attrs.items():
  print(f'Key {key} maps to {val}')
------------------------------------------------
CS181 is known to be fascinating
Key cs32 maps to a weeder
Key cs143 maps to practical
Key cs132 maps to ['challenging', 'applied']

Parameter Passing

It has just one approach – and it’s called “pass by object reference.”
(And it’s identical to pass by pointer in C++)

Every variable in Python is an object reference – it just holds the address of a value!

When we call a function, Python just passes the object reference (pointer) to the function!
The parameter is also an object reference (since it’s just a variable).

Parameter Passing Practice

Examples of creating a new object (so original object doesn’t change):

def nerdify(s):
 s = 'coding ' + s

i_like = 'parties'
nerdify(i_like)
print(i_like)

def peachify(f):
 f = f + ['peach']

fruits = ['apple', 'cherry']
peachify(fruits)
print(fruits)

def largeify(c):
 c = Circle(10)

unit = Circle(1)
largeify(unit)
print(unit.radius())
-------------------------
parties
['apple', 'cherry']
1

Examples of mutating the original object, so that the original object DOES change:

def peachify2(f):
 f.append('peach')

fruits = ['apple', 'cherry']
peachify2(fruits)
print(fruits)

def largeify2(c):
 c.set_radius(10)

unit = Circle(1)
largeify2(unit)
print(unit.radius())
-------------------------
['apple', 'cherry', 'peach']
10

Understanding pass-by-object reference and its implications is critical for writing correct code in Python … and most other languages - except C++!

Handling Errors in Python

When Python encounters an error that it doesn’t know how to handle, it generates a special error called an “exception.”
If you don’t add code to “handle” an exception, it will cause the program to terminate.
Why? The function that “generated the exception” will immediately return, then the function that called it will immediately return, and so on, until your program exits!

We use try and except to handle errors in Python.

#we can have the except in this function
def div(a, b):
  try:
    temp = a/b
  except:
    return None #invalid result
  return temp

#we could also catch the error here
def main():
  try:
   result = div(10, 0)
   print(f'The result was {result}')
  except:
   print('You divided by zero!')

main() # call main function

Finally, we can even have multiple except blocks, each dealing with a different type of issue!

Modules

A script is a .py file that implements a main() function and is meant to run a stand-alone program.
Scripts are run from the command line, like this: python3 script_name.py
A module is a .py file that implements a set of related classes or functions for use as a library (e.g., for machine learning).
Modules are intended to be imported into a python script or other modules to provide needed functionality.

Importing Modules

You can “import” a module and use its function/classes in your programs.

Different ways to import modules:

import math

def hypot(a,b):
  return math.sqrt(a**2+b**2)

from math import sqrt, cos, sin

def hypot(a,b):
  return sqrt(a**2+b**2)

from math import *

def hypot(a,b):
  return sqrt(a**2+b**2)

import math as m

def hypot(a,b):
  return m.sqrt(a**2+b**2)

Each module implicitly defines its own namespace (e.g., math).
This prevents collisions between similarly-named functions/classes in different modules.

Creating a Module

our_shape.py

class Circle:
 def __init__(self, radius):
  self.radius = radius

 def area(self):
  return 3.14 * self.radius**2

We import the module with import our_shape, since the file is called our_shape.py. Then, we can create objects using the module’s name as a prefix. c = our_shape.Circle(1)

We create larger Python programs using multiple modules

Module versus Script

A Python script is a .py file that’s run from the command line.
A Python module is a .py file that’s imported by another .py file.

How can we tell?

If you run a .py file from the command line, Python sets __name__ to __main__ indicating it’s a script. Otherwise, Python sets __name__ to the module’s name.

Functional Influences in Python

Comprehensions

List Comprehension:

input = [10,11,12,15,17,22,23,5]
doubled_odds = [x*2 for x in input if x % 2 == 1]

s = "David's dirty dog drank dirty water down by the dam"

Set Comprehension:

wordz3 = {w for w in s.split() if w[0] == 'd'} # hint: set

Dict Comprehension: wordz4 = {w:len(w) for w in s.split()} # hint: dict

Output:

[9, 16, 25]
{'drank', 'dam', 'dirty', 'down', 'dog'}
{"David's": 7, 'dirty': 5, 'dog': 3, 'drank': 5, 'water': 5, 'down': 4, 'by': 2, 'the': 3, 'dam': 3}

Lambdas

def foo(f):
  print(f("Carey"))

def main():
  foo(lambda x : x + "has earwax") #returns "Carey has earwax"
  y = "a lot"
  foo(lambda x : x + "has earwax" + y) #returns "Carey has earwax a lot"

The last lambda captures the y variable from the enclosing scope!

This is the end of the Python section of the class. Here are some Python Cheat Sheets and we’re moving on to the next topic: Data palooza, where we study a feature across different languages.

Data Palooza

In the next few lectures, we will be covering the internals of how many languages manage data (including types, variables and values). We won’t be covering specific languages, but will instead look at patterns that occur across many languages. Before that, let’s do a brief introduction of some of the terms we will be discussing.

Variables and Values

A variable is the symbolic name associated with a location that contains a value or a pointer. A value is a piece of data with a type (usually) that is either referred to by a variable, or computed by a program. For a concrete example, consider the statement

a = 42

Here a is a variable, and 42 is a value.

What are the facets that make up a variable?

names: variables almost always have a name
types: a variable may (or may not) have a type associated with it
values: a variable stores a value (and its type)
binding: how a variable is connected to its current value
storage: the slot of memory that stores the value associated with the variable
lifetime: the timeframe over which a variable exists
scope: when/where a variable is accessible by code
mutability: can a variable’s value be changed?

What are the facets that make up a value?

~~names: variables almost always have a name~~
types: a value will always have a type associated with it
~~values: a variable stores a value (and its type)~~
~~binding: how a variable is connected to its current value~~
storage: the slot of memory that stores the value
lifetime: the timeframe over which a value exists
~~scope: when/where a variable is accessible by code~~
mutability: can a value be changed?

Lifetime and scope seem similar, but they are not the same. Lifetime refers to the existence of the variable, whereas scope refers to the accessibility of a variable. It is entirely possible for a variable to be out-of-scope but still be alive.

Variable names

What must a language designer consider when deciding variable naming rules for a language? There are multiple possible choices here, and there isn’t one single correct answer.

Almost all languages stipulate that names should contain valid characters
Almost all languages stipulate that names should not be the same as keywords or constants
Most languages have a rule that disallows spaces in variable names
Some languages have rules about special characters in names, some enforce length restrictions, and some even enforce some sort of case sensitivity rule.

Practically, variable naming conventions are more important than the naming rules of languages. These conventions are usually designed to enforce some degree of standardisation across a codebase. A good example of a standard would be Google’s standard.

Why do most loops idiomatically use i or j for loop variables? It all goes back to Fortran, the first standardised programming langauge. Fortran had a quirky rule where if you didn’t explicitly declare a variable type, it would the first letter of the name to determine the type. Variables starting from a-h and o-z defaulted to floats, while variables starting from i-n defaulted to integers.

Variable storage

Variables and values are stored in one of three different places: the stack, the heap or the static data area.


Layout of program memory. Function parameters and local variables are stored on the stack. Variables dynamically allocated during runtime lie in the heap. Static members and globals are stored in the static data area.

Usually local variables and function parameters are stored on the stack. Dynamically allocated objects and values are usually stored on the heap. Most languages store global and static variables in the static data area. Of course, you can also have combinations of these; for example, a pointer that is stored on the stack, but whose value lies on the heap.

Variable types

What can you infer about a value, given its type?

The set of legal values it can hold
The operations we can perform on it
How much memory you need
How to interpret the bytes stored in RAM
How values are converted between types

Variable lifetime and scope

Every variable and value has a lifetime, over which they are valid and may be accessed. Note that it is possible for a variable to be “alive” during execution, and yet not be accessible. Some languages give the programmer explicit control over the lifetime of a value, while others completely abstract this away.

void bar() {
  ...
}

int main() {
  int x = 5;
  bar();
  ...
}

A variable is in scope if it can be explicitly accessed by name in that region.

void bar(int *ptr) {
  cout << x;    // ERROR! x isn’t in-scope here!
  cout << *ptr; // Even though its value can be accessed
}

void foo() {
  int x;
  cout << x;    // x is in-scope here
  bar(&x);
}

There are two primary approaches to scoping: lexical (or static) scoping and dynamic scoping. We will cover these topics in the next lecture.

Variable binding

As we have seen in Lecture 1, different languages have different ways of binding variable names with values. For example a C++ variable name directly refers to the storage location holding it’s value, while in Python, a variable holds an “object reference”, which then in turn points to the actual value. We will cover the major binding approaches in detail later on.

Mutability

If a variable is immutable, then it’s value cannot be changed once assigned, and vice versa if a variable is mutable. Most languages have some form of immutability. In C++, variables can be made immutable using the const keyword. In Haskell, all variables are by default immutable. Immutability might seem like it makes the language less flexible (and it does to some extent), it also allows for the program to be less buggy, and can also allow for some compiler optimisations to take place.