Lecture 06
2023-04-19 | Week 3 | edited by Ruining Ding
(originally written 2022-10-12 by Ashwin Ranade, Siddarth Krishnamoorthy)
Hey everyone, Ruining here! This lecture covers slides 37-59 of Python Palooza and slides 1-15 of Data Palooza. Let me know if you’ve got any corrections or questions!
Table of Contents
- Strings
- Lists
- Tuples
- Sets
- Dictionaries
- Parameter Passing
- Handling Errors in Python
- Modules
- Functional Influences in Python
- Data Palooza
- Variables and Values
Strings
- In Python, each string is an object
- Strings are immutable, meaning they can’t be modified once created.
fact1 = 'Del Taco rules! '
fact2 = 'CS131 is lit! '
truth = fact1 + fact2 #1
truth += 'I have spoken.' #2
The code above appears to mutate the string referred to by truth, but what’s really happening?
…
Answer: Lines #1 and #2 each create a NEW string object and change the object reference truth to point at the new object. At some later time, any old string that nothing refers to anymore (such as the intermediate result of line #1) is garbage collected.
In general, Python automatically generates/trashes lots of objects as code runs without needing to notify the programmer.
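One way to check this for ourselves is to keep a second reference to the string before the += and compare afterwards. This is just a minimal sketch; the name before is ours and doesn't come from the slides.

```python
fact1 = 'Del Taco rules! '
fact2 = 'CS131 is lit! '
truth = fact1 + fact2
before = truth              # second reference to the same string object
truth += 'I have spoken.'   # builds a new string and rebinds truth

print(before)               # the old string is untouched
print(truth is before)      # False: truth now refers to a different object
```

If strings were mutable, before would have changed along with truth.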
Substrings
# String slicing in Python
truth = 'UCLA students are awesome!'
print(truth[6]) #t
print(truth[1:3]) #CL
print(truth[:4]) #UCLA
print(truth[22:]) # ome!
print(truth[-4:-1]) # ome
- substring of length 1 at a: str[a]
- substring from indices [a, b): str[a:b]
- when the first param is empty, 0 is assumed: str[:b] is the same as str[0:b]
- when the second param is empty, End is assumed: str[a:] <-> str[a:End]
- to slice characters relative to the end of the string, use negative indexes, e.g.: str[-1] is the last character in str, and str[:-1] == str[:len(str)-1]
Keep in mind, the right value of the substring range is always exclusive.
Lists
Lists are also objects and support the same operations as strings, but lists are mutable!
stuff = [42, False, 'walnuts']
stuff[2] = 'USC students'
stuff[0:2] = ["It's", 4, 'real']
stuff.append('are') # or stuff += ['are']
stuff = stuff + ['subpar']
print(stuff[3:])
if 'walnuts' not in stuff:
    for s in stuff:
        print(s)
------------------------------
['USC students', 'are', 'subpar']
It's
4
real
USC students
are
subpar
Unlike Haskell Lists, Python lists can hold elements of different types.
We can replace elements/sublists with other elements/sublists. Element replacement must be one-to-one (‘walnuts’ and ‘USC students’) but sublists don’t need to be the same size to be reassigned ([“It’s”, 4, ‘real’] and [42, False]).
stuff.append mutates the list in place, while stuff = stuff + ['subpar'] creates a whole new list with + and then rebinds stuff to it.
Sidenote: We can also use stuff.pop to remove and return the last item of a list.
In the for loop, s will point to successive items in the list as the loop runs.
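The difference between mutating and rebinding becomes visible once a second name refers to the same list. Here is a small sketch; alias and last are names we made up for illustration.

```python
stuff = ['a', 'b']
alias = stuff               # alias and stuff refer to the same list object

stuff.append('c')           # mutates that one list in place
print(alias)                # ['a', 'b', 'c']

stuff = stuff + ['d']       # builds a new list and rebinds stuff only
print(alias)                # ['a', 'b', 'c']
print(stuff)                # ['a', 'b', 'c', 'd']

last = stuff.pop()          # removes and returns the last item
print(last, stuff)          # d ['a', 'b', 'c']
```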
List Implementation + Complexity
- Accessing the jth element of a list, e.g., x[j] is super fast!
- Lists are implemented with dynamically-allocated arrays of object references.
- big-O
  - in and not in: O(n) for a list of size n
  - list[j] where j is large: O(1)
  - appending one list to another: O(m+n)
For more details on this last bullet point, check out this CampusWire post.
Lists of Lists
# Lists of lists (lol) in Python
primes = [1,2,3]
odds = ['carey', 'todd']
lol = [primes, odds]
primes[2] = 7
odds.append('paul')
odds = [1,3,5]
print(lol)
-------
[[1, 2, 7], ['carey', 'todd', 'paul']]
lol builds a list of lists and essentially contains two object references: one to primes and one to odds.
When we run primes[2] = 7 and odds.append('paul'), lol is updated accordingly.
But when we run odds = [1,3,5], the odds reference is reassigned, but the object reference stored in lol is not! lol[1] still points at the original list.
Q: What would print if we moved the line declaring lol to right above the print statement?
…
A: [[1, 2, 7], [1, 3, 5]]
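For reference, the reordered program from the question would look like this (same code as above, with only the lol line moved):

```python
primes = [1,2,3]
odds = ['carey', 'todd']
primes[2] = 7
odds.append('paul')
odds = [1,3,5]          # odds now refers to a brand new list
lol = [primes, odds]    # built AFTER the reassignment, so it sees the new list
print(lol)              # [[1, 2, 7], [1, 3, 5]]
```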
Tuples
Python tuples are immutable, ordered groups of items, e.g., ([1,2,3],False), (1,True,'foo'), etc.
Tuples can contain various types of items and are commonly used for returning multiple values from a function.
def get_school_and_scholarship(gpa):
    if gpa > 4.2:
        return ('UCLA', 0)
    else:
        return 'USC', 100000
tup = get_school_and_scholarship(4.5)
print(f'You got into {tup[0]} with ${tup[1]}')
skool, dough = get_school_and_scholarship(1.3)
print(f'You got into {skool} with ${dough}')
-------
You got into UCLA with $0
You got into USC with $100000
Tuples can be defined explicitly with parentheses, as in ('UCLA', 0), or implicitly without them, as in 'USC', 100000.
We can access the items of a tuple by indexing, or assign names to them with unpacking (a simple form of pattern matching).
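To make "immutable" concrete, here is a short sketch of what happens when we try to modify a tuple. The names school, money, and mixed are ours. Note that immutability applies to the tuple itself: a mutable item stored inside a tuple can still be changed.

```python
tup = ('UCLA', 0)
school, money = tup         # unpacking binds one name per item

try:
    tup[1] = 50000          # tuples don't support item assignment
except TypeError as err:
    print(f'Cannot modify a tuple: {err}')

# The tuple's slots can't be reassigned, but the list in slot 0 is still mutable.
mixed = ([1, 2, 3], False)
mixed[0].append(4)
print(mixed)                # ([1, 2, 3, 4], False)
```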
Sets
- Python has built-in support for the set abstract data type (ADT).
- Python sets hold a single unique copy of each value
- operations: .add, .remove
draining = set()
draining.add('CS131')
draining.add('dating')
draining.add('studying')
draining.remove('CS131')
print(draining)
if 'CS131' not in draining:
    print('Studying for CS131 is NOT draining!')
# Let's create a set from a list...
dinner = ['salad','soup','steak','soup','pie']
dinner_set = set(dinner)
print(f'Unique foods: {dinner_set}')
---------------------------------
{'dating', 'studying'}
Studying for CS131 is NOT draining!
Unique foods: {'soup', 'steak', 'pie', 'salad'}
- sets are unordered
- Python uses hash tables for sets
More Python set operations:
- - is difference
- | is union
- & is intersection
Be careful to distinguish is/and/or from ==/&/| in Python. and and or will give you the wrong result if you try to use them to perform set operations.
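A quick sketch of the three operators, plus what goes wrong with and/or. The sets a and b are made-up examples.

```python
a = {1, 2, 3}
b = {3, 4}

print(a - b)    # {1, 2}        difference
print(a | b)    # {1, 2, 3, 4}  union
print(a & b)    # {3}           intersection

# and/or are boolean operators: they simply return one of their operands.
print(a and b)  # {3, 4}     a is non-empty (truthy), so the result is b
print(a or b)   # {1, 2, 3}  a is non-empty (truthy), so the result is a
```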
Dictionaries
- Python has first-class support for dictionaries (maps) – they’re super fast!
- Note: Python dictionaries are insertion-ordered as of Python 3.7, but sets are still unordered. (Unlike C++ maps, Python dictionaries are NOT ordered by key.)
- each key is unique in a dictionary and maps to only 1 value
- different keys can map to values of different types
attrs = {'cs32':'a weeder', 'cs143':'practical'}
attrs['cs181'] = 'theoretical'
attrs['cs132'] = ['challenging','applied']
attrs['cs181'] = 'fascinating'
if 'cs181' in attrs:
    print(f"CS181 is known to be {attrs['cs181']}")
del attrs['cs181'] # remove key/value
for key, val in attrs.items():
    print(f'Key {key} maps to {val}')
------------------------------------------------
CS181 is known to be fascinating
Key cs32 maps to a weeder
Key cs143 maps to practical
Key cs132 maps to ['challenging', 'applied']
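The note above about ordering can be checked with a small sketch (the grades dictionary is a made-up example). Iteration follows insertion order, so if we want key order we have to sort explicitly.

```python
grades = {}
grades['zoe'] = 90
grades['adam'] = 85
grades['mia'] = 77

print(list(grades))     # ['zoe', 'adam', 'mia']  insertion order
print(sorted(grades))   # ['adam', 'mia', 'zoe']  sorted by key on request
```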
Parameter Passing
- Python has just one approach to parameter passing, and it’s called “pass by object reference.”
- (It’s essentially identical to pass by pointer in C++.)
Every variable in Python is an object reference – it just holds the address of a value!
- When we call a function, Python just passes the object reference (pointer) to the function!
- The parameter is also an object reference (since it’s just a variable).
Parameter Passing Practice
Examples of creating a new object (so original object doesn’t change):
def nerdify(s):
    s = 'coding ' + s
i_like = 'parties'
nerdify(i_like)
print(i_like)
def peachify(f):
    f = f + ['peach']
fruits = ['apple', 'cherry']
peachify(fruits)
print(fruits)
# Circle is assumed to be a class with radius() and set_radius() methods (not shown here)
def largeify(c):
    c = Circle(10)
unit = Circle(1)
largeify(unit)
print(unit.radius())
-------------------------
parties
['apple', 'cherry']
1
Examples of mutating the original object, so that the original object DOES change:
def peachify2(f):
    f.append('peach')
fruits = ['apple', 'cherry']
peachify2(fruits)
print(fruits)
def largeify2(c):
    c.set_radius(10)
unit = Circle(1)
largeify2(unit)
print(unit.radius())
-------------------------
['apple', 'cherry', 'peach']
10
Understanding pass-by-object reference and its implications is critical for writing correct code in Python … and most other languages - except C++!
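The Circle class used in the examples above isn't defined in these notes. To run them, something like the following would work. This is a hypothetical minimal version (the _r attribute and the after_rebind variable are our own inventions), not necessarily the class from the slides.

```python
class Circle:
    # Hypothetical stand-in for the Circle class the examples assume.
    def __init__(self, r):
        self._r = r
    def radius(self):
        return self._r
    def set_radius(self, r):
        self._r = r

def largeify(c):
    c = Circle(10)      # rebinds the local name c; the caller's object is untouched

def largeify2(c):
    c.set_radius(10)    # mutates the object that the caller also refers to

unit = Circle(1)
largeify(unit)
after_rebind = unit.radius()
print(after_rebind)     # 1
largeify2(unit)
print(unit.radius())    # 10
```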
Handling Errors in Python
- When Python encounters an error that it doesn’t know how to handle, it generates a special error called an “exception.”
- If you don’t add code to “handle” an exception, it will cause the program to terminate.
- Why? The function that “generated the exception” will immediately return, then the function that called it will immediately return, and so on, until your program exits!
We use try and except to handle errors in Python.
#we can have the except in this function
def div(a, b):
    try:
        temp = a/b
    except ZeroDivisionError:
        return None #invalid result
    return temp
#we could also catch the error here
#(only reached if div doesn't handle the exception itself)
def main():
    try:
        result = div(10, 0)
        print(f'The result was {result}')
    except ZeroDivisionError:
        print('You divided by zero!')
main() # call main function
Finally, we can even have multiple except blocks, each dealing with a different type of issue!
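As a rough sketch of what multiple except blocks look like, here is a variation on div. The function name safe_div and the error messages are ours; ZeroDivisionError and TypeError are standard built-in exception types.

```python
def safe_div(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        # runs only when b is zero
        print('You divided by zero!')
    except TypeError:
        # runs only when a and b can't be divided (e.g., a number and a string)
        print('Both arguments must be numbers!')
    return None  # reached only after one of the handlers above ran

print(safe_div(10, 2))      # 5.0
print(safe_div(10, 0))      # prints the zero message, then None
print(safe_div(10, 'two'))  # prints the type message, then None
```

Python checks the except blocks from top to bottom and runs the first one whose exception type matches.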
Modules
- A script is a .py file that implements a main() function and is meant to run as a stand-alone program.
- Scripts are run from the command line, like this: python3 script_name.py
- A module is a .py file that implements a set of related classes or functions for use as a library (e.g., for machine learning).
- Modules are intended to be imported into a python script or other modules to provide needed functionality.
Importing Modules
You can “import” a module and use its functions/classes in your programs.
Different ways to import modules:
import math
def hypot(a,b):
    return math.sqrt(a**2+b**2)
from math import sqrt, cos, sin
def hypot(a,b):
    return sqrt(a**2+b**2)
from math import *
def hypot(a,b):
    return sqrt(a**2+b**2)
import math as m
def hypot(a,b):
    return m.sqrt(a**2+b**2)
- Each module implicitly defines its own namespace (e.g., math).
- This prevents collisions between similarly-named functions/classes in different modules.
Creating a Module
our_shape.py
class Circle:
    def __init__(self, radius):
        self.radius = radius
    def area(self):
        return 3.14 * self.radius**2
We import the module with import our_shape, since the file is called our_shape.py.
Then, we can create objects using the module’s name as a prefix.
c = our_shape.Circle(1)
- We create larger Python programs using multiple modules
Module versus Script
- A Python script is a .py file that’s run from the command line.
- A Python module is a .py file that’s imported by another .py file.
How can we tell?
If you run a .py file from the command line, Python sets __name__ to '__main__', indicating it’s a script. Otherwise, Python sets __name__ to the module’s name.
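This is what makes the common if __name__ == '__main__' idiom work: one file can act as both a module and a script. Below is a sketch that reuses the our_shape.py Circle class from above; the main() function is a hypothetical self-test we added for illustration.

```python
# our_shape.py
class Circle:
    def __init__(self, radius):
        self.radius = radius
    def area(self):
        return 3.14 * self.radius**2

def main():
    # Hypothetical demo code, meant to run only when this file is the script.
    print(Circle(1).area())

if __name__ == '__main__':
    # True for "python3 our_shape.py", False for "import our_shape"
    main()
```

With this guard in place, import our_shape gives other files the Circle class without running main().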
Functional Influences in Python
Comprehensions
List Comprehension:
nums = [10,11,12,15,17,22,23,5]
doubled_odds = [x*2 for x in nums if x % 2 == 1]
Set Comprehension:
s = "David's dirty dog drank dirty water down by the dam"
wordz3 = {w for w in s.split() if w[0] == 'd'}
Dict Comprehension:
wordz4 = {w:len(w) for w in s.split()}
Output (the values of doubled_odds, wordz3, and wordz4):
[22, 30, 34, 46, 10]
{'drank', 'dam', 'dirty', 'down', 'dog'}
{"David's": 7, 'dirty': 5, 'dog': 3, 'drank': 5, 'water': 5, 'down': 4, 'by': 2, 'the': 3, 'dam': 3}
Lambdas
def foo(f):
    print(f("Carey"))
def main():
    foo(lambda x : x + " has earwax") #prints "Carey has earwax"
    y = "a lot"
    foo(lambda x : x + " has earwax " + y) #prints "Carey has earwax a lot"
main()
- The last lambda captures the y variable from the enclosing scope!
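One consequence of capturing the variable (rather than a copy of its value) is that the lambda sees later changes to it. The sketch below illustrates this; make_greeter and greet are names we made up.

```python
def make_greeter():
    y = "a lot"
    f = lambda x: x + " has earwax " + y
    y = "a little"      # changed AFTER the lambda was created
    return f

greet = make_greeter()
print(greet("Carey"))   # Carey has earwax a little
```

The lambda looks up y when it is called, not when it is defined, so it picks up the most recent value.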
This is the end of the Python section of the class. Here are some Python Cheat Sheets. We’re now moving on to the next topic: Data Palooza, where we study a feature across different languages.
Data Palooza
In the next few lectures, we will be covering the internals of how many languages manage data (including types, variables and values). We won’t be covering specific languages, but will instead look at patterns that occur across many languages. Before that, let’s do a brief introduction of some of the terms we will be discussing.
Variables and Values
A variable is the symbolic name associated with a location that contains a value or a pointer. A value is a piece of data with a type (usually) that is either referred to by a variable, or computed by a program. For a concrete example, consider the statement
a = 42
Here a is a variable, and 42 is a value.
What are the facets that make up a variable?
- names: variables almost always have a name
- types: a variable may (or may not) have a type associated with it
- values: a variable stores a value (and its type)
- binding: how a variable is connected to its current value
- storage: the slot of memory that stores the value associated with the variable
- lifetime: the timeframe over which a variable exists
- scope: when/where a variable is accessible by code
- mutability: can a variable’s value be changed?
What are the facets that make up a value?
- types: a value will always have a type associated with it
- storage: the slot of memory that stores the value
- lifetime: the timeframe over which a value exists
- mutability: can a value be changed?
The other variable facets (names, values, binding, and scope) don’t apply to a value on its own.
Lifetime and scope seem similar, but they are not the same. Lifetime refers to the existence of the variable, whereas scope refers to the accessibility of a variable. It is entirely possible for a variable to be out-of-scope but still be alive.
Variable names
What must a language designer consider when deciding variable naming rules for a language? There are multiple possible choices here, and there isn’t one single correct answer.
- Almost all languages stipulate that names should contain valid characters
- Almost all languages stipulate that names should not be the same as keywords or constants
- Most languages have a rule that disallows spaces in variable names
- Some languages have rules about special characters in names, some enforce length restrictions, and some even enforce some sort of case sensitivity rule.
Practically, variable naming conventions are more important than the naming rules of languages. These conventions are usually designed to enforce some degree of standardisation across a codebase. A good example of a standard would be Google’s standard.
Why do most loops idiomatically use i or j for loop variables? It all goes back to Fortran, the first standardised programming language. Fortran had a quirky rule where if you didn’t explicitly declare a variable type, it would use the first letter of the name to determine the type. Variables starting with a-h and o-z defaulted to floats, while variables starting with i-n defaulted to integers.
Variable storage
Variables and values are stored in one of three different places: the stack, the heap or the static data area.
Figure: Layout of program memory. Function parameters and local variables are stored on the stack. Variables dynamically allocated during runtime lie in the heap. Static members and globals are stored in the static data area.
Usually local variables and function parameters are stored on the stack. Dynamically allocated objects and values are usually stored on the heap. Most languages store global and static variables in the static data area. Of course, you can also have combinations of these; for example, a pointer that is stored on the stack, but whose value lies on the heap.
Variable types
What can you infer about a value, given its type?
- The set of legal values it can hold
- The operations we can perform on it
- How much memory you need
- How to interpret the bytes stored in RAM
- How values are converted between types
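The points about memory size and interpreting bytes can be illustrated with Python's standard struct module. In this sketch, the same four bytes are read once as a 32-bit float and once as a 32-bit integer. The variable names are ours; the '<f' and '<i' format codes mean little-endian float and little-endian int.

```python
import struct

raw = struct.pack('<f', 1.0)            # the 4 bytes that encode the float 1.0
as_float = struct.unpack('<f', raw)[0]  # interpret those bytes as a float
as_int = struct.unpack('<i', raw)[0]    # interpret the SAME bytes as an int

print(len(raw))     # 4
print(as_float)     # 1.0
print(as_int)       # 1065353216
```

The bytes never change; only the type we use to interpret them does.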
Variable lifetime and scope
Every variable and value has a lifetime, over which they are valid and may be accessed. Note that it is possible for a variable to be “alive” during execution, and yet not be accessible. Some languages give the programmer explicit control over the lifetime of a value, while others completely abstract this away.
void bar() {
    ...
}
int main() {
    int x = 5;
    bar();
    ...
}
A variable is in scope if it can be explicitly accessed by name in that region.
void bar(int *ptr) {
    cout << x; // ERROR! x isn’t in-scope here!
    cout << *ptr; // Even though its value can be accessed
}
void foo() {
    int x;
    cout << x; // x is in-scope here
    bar(&x);
}
There are two primary approaches to scoping: lexical (or static) scoping and dynamic scoping. We will cover these topics in the next lecture.
Variable binding
As we saw in Lecture 1, different languages have different ways of binding variable names to values. For example, a C++ variable name directly refers to the storage location holding its value, while in Python, a variable holds an “object reference”, which in turn points to the actual value. We will cover the major binding approaches in detail later on.
Mutability
If a variable is immutable, then its value cannot be changed once assigned; if it is mutable, its value can be changed. Most languages have some form of immutability. In C++, variables can be made immutable using the const keyword. In Haskell, all variables are immutable by default. Although immutability might seem like it makes a language less flexible (and it does to some extent), it also helps make programs less buggy and can enable some compiler optimisations.