In [1]:

from IPython.display import display
%pylab inline

Populating the interactive namespace from numpy and matplotlib

Python Basics

The highlights

Python is an interpreted language
- Somewhat more flexibility w/r/t data types
- Prototyping is faster
- No compiler error-checking
- Programs run slower, but this can be minimized for most practical purposes.
Whitespace (indentation, in the form of tabs or spaces) delineates code blocks
- No braces!
Modular philosophy
- You can write programs using a single file, but Python also makes it reasonably easy to import code from other sources
- A large amount of code for doing common things has been open-sourced, which you can use in your code
  - Either already included in the "standard library" or available online
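
To illustrate how indentation delineates code blocks, here is a minimal sketch. The function name "describe" and its threshold of 10 are made up for illustration:

```python
# Indentation alone marks where each block begins and ends.
def describe(n):
    if n > 10:
        # This line belongs to the "if" block
        size = "big"
    else:
        # This line belongs to the "else" block
        size = "small"
    # Back at the function's indentation level: runs in either case
    return "%d is %s" % (n, size)

print(describe(3))
print(describe(30))
```

The print calls take a single argument each, so the example should run the same way under Python 2 and Python 3.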

Syntax

NB: It is entirely possible to change much of Python's behavior on the fly, and there are generally a number of ways you can do things. One way of doing things is generally easier than the others, though.

Data Types

Everything in Python is an object.
- This can be confusing
There are a few built-in data types, but the main ones to be aware of are
- int
- float
- str
- bool
- list
- dict
Variables are names that refer to objects. A name is a sequence of characters containing no whitespace, with some restrictions on what it can comprise
- Basically, don't include control characters or punctuation (other than underscores) in the name, and begin the name with a letter

In [2]:

# Examples of types
print 1,type(1)
print 1.0,type(1.0)
print "hello", type("hello")
print True, type(True)
print [1,1.0,"hello",True], type([1,1.0,"hello",True])

# Example variable assignments
a = 1
b = 1.0
c = "hello"
d = True
e = [1,1.0,"hello",True]

print("\n")
print a, b, c, d, e

1 <type 'int'>
1.0 <type 'float'>
hello <type 'str'>
True <type 'bool'>
[1, 1.0, 'hello', True] <type 'list'>


1 1.0 hello True [1, 1.0, 'hello', True]
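
The cell above covers every type in the list except dict. As a small sketch (the keys and values here are invented for illustration), a dict maps keys to values:

```python
# A dict maps keys to values; here, strings to numbers
ages = {"alice": 30, "bob": 25}

print(ages["alice"])       # Look up a value by its key
ages["carol"] = 41         # Add a new key/value pair
print(len(ages))           # Number of pairs stored
print("bob" in ages)       # Membership tests check the keys
print(type(ages) is dict)  # The built-in type is called dict
```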

We define functions like the following:

In [3]:

# A function that takes an argument and prints something, but doesn't return anything
def foo(argument):
    print("No Python tutorial is complete without " + str(argument))

# A function that takes two arguments and checks if their "type" is equal
def bar(arg1,arg2):
    return type(arg1) == type(arg2)
    
# A function that takes TWO arguments and returns TWO values
def fib(a,b):
    return b, b + a
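
The functions above are only defined, not called. A short usage sketch follows; "bar" and "fib" are repeated from the cell above so that the example runs on its own, and the names "first" and "second" are illustrative:

```python
# Definitions repeated from the cell above
def bar(arg1, arg2):
    return type(arg1) == type(arg2)

def fib(a, b):
    return b, b + a

# Comparing types: 1 and 2 are both ints, but 1.0 is a float
print(bar(1, 2))
print(bar(1, 1.0))

# fib returns two values, which can be unpacked into two names
first, second = fib(1, 1)
print(first)
print(second)
```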

Importing modules can (unsurprisingly) be done using an "import" statement. Python automatically looks for them in a few places; if you have problems importing a module, check that the name is correct and that the directory that the module's located in appears in your module search path.

In [4]:

import math
import random

Using Imported Modules

Using imported modules is somewhat different from using the built-in functions or the ones you might define yourself; once you've imported a module, you have to prefix the function you want with the module's name:

In [5]:

print random.random()

0.634189223155

If you try to call "random" itself, you'll get an exception, because that name refers to the module rather than to the function inside it (unless you've imported the function using the "from <module> import ..." syntax).

In [6]:

random()

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-6-1f8a42ab0e97> in <module>()
----> 1 random()

TypeError: 'module' object is not callable
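
For comparison, here is a sketch of the "from ... import ..." form mentioned above. The alias "rnd" is just an illustrative name:

```python
# Import the module; its functions need the "random." prefix
import random
value = random.random()

# Import one function directly, under an alias; no prefix is needed
from random import random as rnd
other = rnd()

# Both calls return a float in the interval [0, 1)
print(0 <= value < 1)
print(0 <= other < 1)
```

Without the alias, "from random import random" would rebind the name "random" to the function and hide the module, which is why the sketch picks a different name.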

Lists

Lists are probably the main thing you'll use to store your data. They're ordered, mutable sequences of objects, and can be accessed on an element-by-element basis, or as a "slice" containing multiple such elements. They use "0-based indexing": the first element has index 0, and a slice includes its start index but excludes its end index.

In [7]:

lst = [1,2,3,"a","b","c"] # Construct the list
print(lst)    # Print the list
print(lst[0]) # Print an element of the list
print(lst[5])
print(lst[1:4])

# We can modify individual parts of the list:
lst[2] += 1
print(lst[2])
print(lst)

# Or whole swaths:
lst[:3] = [3,2,1]
print(lst)

[1, 2, 3, 'a', 'b', 'c']
1
c
[2, 3, 'a']
4
[1, 2, 4, 'a', 'b', 'c']
[3, 2, 1, 'a', 'b', 'c']
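
A few more list operations come up often. As a brief sketch (the variable name "letters" is illustrative): negative indices count from the end, slice bounds can be omitted, and lists can grow.

```python
letters = ["a", "b", "c", "d"]

print(letters[-1])    # Negative indices count from the end
print(letters[:2])    # Omitting the start means "from the beginning"
print(letters[2:])    # Omitting the end means "through the last element"
print(len(letters))   # Number of elements

letters.append("e")   # Add one element to the end
print(letters)
```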

For Loops

For most of your repeatable data manipulation you'll probably use "for" loops, which execute the same code once for each element of some sort of "iterable" object. For example, you might print the numbers from 0 to 3, or square each of them. A loop takes the following form:

In [8]:

print("Here, using a range, which is an iterable object:")
for i in range(4):
    print(i)

print("\nYou can also perform operations on things in ranges:")
for i in range(4):
    print(i**2)
    
print("\nAnd now with an iterable object (here it's a list):")
for i in [0,1,2,3]:
    print(i)
    
for i in "hello":
    print i

Here, using a range, which is an iterable object:
0
1
2
3

You can also perform operations on things in ranges:
0
1
4
9

And now with an iterable object (here it's a list):
0
1
2
3
h
e
l
l
o

In [9]:

def exampleFunction(x):
    tmp = x + 1
    return tmp**2

In [10]:

lst = [i for i in range(10)]
print(lst)

lst = [exampleFunction(i) for i in range(10)]
print(lst)

lst = [exampleFunction(i) for i in lst]
print lst

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
[4, 25, 100, 289, 676, 1369, 2500, 4225, 6724, 10201]
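
The list comprehensions in the cell above are a compact way of writing a for loop that appends to a list. The sketch below shows the two forms side by side; "exampleFunction" is repeated so that the example is self-contained, and the other names are illustrative:

```python
def exampleFunction(x):
    tmp = x + 1
    return tmp**2

# The loop form: start with an empty list and append to it
loop_result = []
for i in range(10):
    loop_result.append(exampleFunction(i))

# The equivalent list comprehension
comp_result = [exampleFunction(i) for i in range(10)]

print(loop_result == comp_result)

# A comprehension can also filter with a trailing "if"
evens = [i for i in range(10) if i % 2 == 0]
print(evens)
```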

Manipulating Data

Plan: implement something that calculates the Fibonacci sequence using all the techniques mentioned above.

Global variables are variables defined in the outermost scope, which can be referenced from anywhere in the program. That is, globally.

Python makes it somewhat difficult for you to modify global variables from inside a function. There are some good reasons for this, but if you're writing small programs and just need to be able to manipulate data from anywhere, problems generally don't occur.

In [11]:

# If the variable's defined in the outer scope, you can access it from inside
gvar = 10
def gvarExample(arg):
    arg  += gvar
    return arg
    
print gvarExample(2)

In [12]:

# You cannot, however, assign to it inside the function without declaring it global
gvar = 10
def gvarExample(arg):
    arg  += gvar
    gvar = arg
    return arg
    
print gvarExample(2)

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
<ipython-input-12-d5b4e0f230d6> in <module>()
      6     return arg
      7 
----> 8 print gvarExample(2)

<ipython-input-12-d5b4e0f230d6> in gvarExample(arg)
      2 gvar = 10
      3 def gvarExample(arg):
----> 4     arg  += gvar
      5     gvar = arg
      6     return arg

UnboundLocalError: local variable 'gvar' referenced before assignment

In [13]:

# So to accomplish the above, we declare the variable with the "global" keyword
# inside the function before assigning to it
gvar = 10
def gvarExample(arg):
    global gvar
    arg  += gvar
    gvar = arg
    return arg
    
print gvarExample(2)
print gvar

12
12

You can also use the "global" keyword to create functions that don't take arguments, but still do something based on the global variables

In [14]:

a = 1
b = 1
def summation():
    global a, b
    b, a = a+b,b

In [15]:

for i in range(10):
    summation()
    print b

Tuple Assignment

You can perform multiple assignment operations at the same time. This is mainly useful for when assigning things in series would be inconvenient. For example, without tuple assignment, computing a recurrence relation might require a temporary variable to store intermediate numbers while the others are being computed.

In [16]:

ta, tb  = "Bond, James","Bond"
# The above is the same thing as this:

tEx1 = "Bond, James"
tEx2 = "Bond"

print ta, tb
print tEx1, tEx2

# You can even switch the variables conveniently
ta, tb = tb, ta
print ta, tb

Bond, James Bond
Bond, James Bond
Bond Bond, James
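
As a sketch of the recurrence-relation case described above, here is one Fibonacci step written first with a temporary variable and then with tuple assignment. The variable names are illustrative:

```python
# Without tuple assignment: a temporary variable holds the old value
a = 1
b = 1
tmp = a
a = b
b = tmp + b

# With tuple assignment: the right-hand side is evaluated first,
# then both names are assigned at once
c = 1
d = 1
c, d = d, c + d

print(a == c and b == d)
```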

Returning multiple things at the same time

You can also return multiple things from functions, separated by a comma

In [17]:

def fibWrapper(n):
    # Fibonacci sequence done differently
    a,b = 1,1
    print "Printing the Fibonacci sequence up to the %dth term"%n
    for i in range(n):
        a, b = fib(a,b)
        print b
    print "\n"
        
fibWrapper(5)
print b # This is the global b set by summation() above; fibWrapper only changed its own local b

Printing the Fibonacci sequence up to the 5th term
2
3
5
8
13


144
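
Values returned together like this are packed into a single tuple, which the caller can either unpack or keep whole. A minimal sketch, using a made-up function "minAndMax":

```python
# Hypothetical helper: returns the smallest and largest element of a list
def minAndMax(values):
    return min(values), max(values)

# Unpack the two returned values into two names
lo, hi = minAndMax([3, 1, 4, 1, 5])
print(lo)
print(hi)

# Or keep them together; the result is a tuple
both = minAndMax([3, 1, 4, 1, 5])
print(type(both) is tuple)
print(both[0])
```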

NumPy

NumPy is an absurdly useful module for Python that addresses two weaknesses: Python's lack of an efficient array data container, and its speed. It also contains a number of useful functions for linear algebra and data analysis.

In [18]:

import numpy as np

Quick Intro to Useful Things

In [19]:

array = np.array([i for i in range(10)]) # Make an array from a list
display(array)

array = np.ones((3,3)) # Make a 3-by-3 array of ones
display(array)

array = np.zeros((2,3,4)) # Two layers, each with three rows and four columns
display(array)

array = np.random.random((3,2)) # Three rows, two columns, random array
display(array)

display(array[:,1]) # Show ALL values from the SECOND column

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

array([[[ 0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.]],

       [[ 0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.],
        [ 0.,  0.,  0.,  0.]]])

array([[ 0.99806244,  0.90363532],
       [ 0.99535123,  0.24491663],
       [ 0.09041789,  0.81109934]])

array([ 0.90363532,  0.24491663,  0.81109934])

Operations on Arrays

You can operate on whole arrays, or on slices of them, without writing explicit loops:

In [20]:

array = np.arange(1,10)
array = array.reshape((3,3)) # Reshape it to be 3-by-3

display(array)

# Summing over an array
print(np.sum(array))

# Summing over a single column of an array
print(np.sum(array[:,1]))

# Adding one to all elements of an array
array += 1
display(array)

# Multiplying a single row of the array by 10
array[1] *= 10
display(array)

# Performing the dot product (i.e., matrix multiplication) between two arrays.
product = np.dot(array,array)
display(product)

# Change a single element of an array
product[1,1] = -1
display(product)

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

45
15

array([[ 2,  3,  4],
       [ 5,  6,  7],
       [ 8,  9, 10]])

array([[ 2,  3,  4],
       [50, 60, 70],
       [ 8,  9, 10]])

array([[ 186,  222,  258],
       [3660, 4380, 5100],
       [ 546,  654,  762]])

array([[ 186,  222,  258],
       [3660,   -1, 5100],
       [ 546,  654,  762]])
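
One point worth noting alongside the cell above: the "*" operator on arrays multiplies element by element, which is different from "np.dot". A small sketch with illustrative 2-by-2 arrays:

```python
import numpy as np

m = np.array([[1, 2],
              [3, 4]])
identity = np.eye(2, dtype=int)  # 2-by-2 identity matrix

# Element-wise product: each entry is multiplied by the matching entry
print(m * identity)

# Matrix product: multiplying by the identity returns m unchanged
print(np.dot(m, identity))
```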

Example: Numerical Integration

In the 1660s or so, a physicist and part-time mercury-crazed alchemist named Isaac Newton developed the theory of calculus so that he could derive solutions to... something or other. Thus, he became famous, and went on to become "Master of the Mint", thereby becoming the first quant in history. Anywho, I'm sure that was nice at the time, but here I'm going to use NumPy to accomplish all that in like, twenty seconds.

In [21]:

# Integrate a (python) function between a and b using the trapezoidal rule and numpy
def nptrapezoid(func,a,b,n=100):
    npfunc = np.frompyfunc(func,1,1) # Wrap func (1 input, 1 output) so it applies to every element of an array
    
    # Set up the points at which to evaluate the function, the interval lengths
    xp = np.linspace(a,b,n)
    xd = xp[1:] - xp[:-1]
    
    # Evaluate at points, take average
    yp = (npfunc(xp[1:]) + npfunc(xp[:-1]))/2
    
    # Return averages times distances
    return np.dot(yp,xd)

Now we can prove how much smarter we collectively are than that jerk, Isaac Newton.

In [22]:

nptrapezoid(math.exp,0,5,n=10000)

Out[22]:

147.41316217429818

In [23]:

math.exp(5) - math.exp(0)

Out[23]:

147.4131591025766

The level of agreement isn't terrible! And it was super painless to write. I hope Newton has a law of cooling or something, because he just got burned. Oh wait, he does. Still, the takeaway is that NumPy is very useful.
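
As a variation on the function above: when the integrand is already a NumPy function (such as "np.exp"), it can be applied to the whole array directly, without "np.frompyfunc". This is a sketch under that assumption, and the name "vectrapezoid" is made up here:

```python
import math
import numpy as np

# Trapezoidal rule for a function that accepts NumPy arrays directly
def vectrapezoid(func, a, b, n=100):
    xp = np.linspace(a, b, n)            # n evenly spaced points
    yp = func(xp)                        # Evaluate at all points at once
    widths = xp[1:] - xp[:-1]            # Interval lengths
    heights = (yp[1:] + yp[:-1]) / 2.0   # Average of neighbouring values
    return np.dot(heights, widths)       # Sum of averages times widths

approx = vectrapezoid(np.exp, 0, 5, n=10000)
exact = math.exp(5) - math.exp(0)
print(abs(approx - exact) < 1e-4)
```

This version will not work with "math.exp", which only accepts single numbers; that is the case the "np.frompyfunc" wrapper handles.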

Miscellaneous Useful Things

You might want to remember the following:

* argmax
* argmin
* argsort

The "argmax" function finds the maximum element of an array, but returns the index at which it is located rather than the value itself. Similarly for "argmin". Guess what "argsort" does?
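
A small sketch of all three functions on a made-up array:

```python
import numpy as np

values = np.array([30, 10, 40, 20])

print(np.argmax(values))   # Index of the largest element (40)
print(np.argmin(values))   # Index of the smallest element (10)

# argsort returns the indices that would sort the array
order = np.argsort(values)
print(order)
print(values[order])       # Indexing with them gives the sorted array
```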

In [24]:

import matplotlib
import matplotlib.pyplot as plt

Basic Things

Matplotlib is a graphing utility developed for Python. It's fairly flexible and even widely used, but it's not

How not to Graph like a Jerk

These days, people just don't know how to graph properly. Which is a shame, because graphing lends meaning to our lives.

Here are a number of guidelines for formatting graphs. They aren't comprehensive, but should be a decent start.

Graph using the proper style
- E.g., if the data points aren't smooth, don't connect them with a line plot.
Title each graph appropriately
- It should tell the viewer what they're looking at, more or less.
- It can be more than a couple words, if it helps the viewer understand
Label each axis
- What does the axis represent?
- To make sense of the graph, the viewer needs to know scale, units, etc.
Adjust limits appropriately
- The point of a graph is to organize and visually convey information because the visual system has more bandwidth
- As such, aim to have the graph fill around two-thirds of the space
- If there is a particularly interesting area, perhaps another graph could focus on that.
If multiple data series are used, consider adding a legend or other means of indicating what each line represents.
If necessary to the experiment, include errors/uncertainties.
Consider explaining your graph in words, as well.
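
The guidelines above can be sketched in a few lines of Matplotlib. Everything about the data here is made up for illustration (the cart, the numbers, the uncertainties), and the "Agg" backend is an assumption so that the example draws off-screen without needing a display:

```python
import matplotlib
matplotlib.use("Agg")  # Assumption: draw off-screen so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Made-up data: one "measured" series with invented uncertainties, one model
t = np.linspace(0, 10, 11)
measured = 2.0 * t + 1.0
errors = np.full(t.shape, 0.5)
model = 2.0 * t

fig, ax = plt.subplots(1)
# Discrete measurements: markers with error bars, not a connected line
ax.errorbar(t, measured, yerr=errors, fmt="o", label="Measured (made up)")
# A smooth model: a line plot is appropriate
ax.plot(t, model, label="Model: 2t")

ax.set_title("Position of a cart versus time")  # Says what is shown
ax.set_xlabel("Time (s)")                       # Axis labels with units
ax.set_ylabel("Position (m)")
ax.set_xlim([-0.5, 10.5])                       # Limits chosen to fit the data
ax.legend()                                     # Identify each series
```

Inside the notebook, where plots are shown inline, the "matplotlib.use" line can be dropped.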

Example: Numerical Integration, Performed In The Style Of Internet Spam

In [25]:

sine = math.sin
# Alternatively, sine = lambda x: math.sin(x)

# Create the x-data and the y-data, in arrays
x_data  = np.linspace(0,6*math.pi,100)

# You can use list comprehensions if it makes things easier, or even do both
y_data  = np.array([nptrapezoid(sine,0,x) for x in x_data])


# Create the plots, as a figure object and an axis
fig, ax = plt.subplots(1)
ax.scatter(x_data,y_data)

# Label the plots
ax.set_title("One Weird Numerical Trick") # Calculus teachers HATE him!
ax.set_ylabel(r"$\int_0^t \sin{(x)}\,dx$") # You can even embed LaTeX in these labels! (The integrand is sine, matching the data above)
ax.set_xlabel("$t$")

# Make some modifications to the x- and y-axis
#ax.set_xlim([0,4*math.pi])
#ax.set_ylim([-0.5,2.5])
ax.grid(True) # Turn on gridlines

# Adjusting size for improved resolution in saved file
figDefaultSize = fig.get_size_inches()  
fig.set_size_inches((figDefaultSize[0]*2, figDefaultSize[1]*2))

# Saving File (in current working directory)
#fig.savefig("MyGraphFilename.png") # or .pdf, .svg, .jpeg

# Close the graph so it doesn't appear in the IPython Notebook
#plt.close()

Miscellaneous

Here are some things I'd like to write about, but which aren't necessary for the assignment.

Functional Programming

Running Code From Command Line

How to Set Up Python

IPython

IPython is an extremely useful tool. Essentially, it's a version of Python with a number of magic commands and other conveniences built in. When run in a terminal, it's quite a bit more flexible than the standard Python shell, but you can do other things with it, too. For instance, this document was produced using IPython Notebook.

IPython Notebook

Integrated Graphing

Parallel Processing

Resources

Anaconda's installation instructions (i.e., how to get Python set up, quickly and easily)
- http://docs.continuum.io/anaconda/install.html
Python's Cheeseshop (central location of a number of useful modules not in the Standard Library)
- https://pypi.python.org/pypi
Python's Documentation
- http://docs.python.org/2/index.html
NumPy's Documentation
- http://docs.scipy.org/doc/numpy/user/index.html
Some NumPy Tutorials
- http://wiki.scipy.org/Tentative_NumPy_Tutorial
- http://nbviewer.ipython.org/urls/raw.github.com/iguananaut/notebooks/master/numpy.ipynb
Some Matplotlib Tutorials
- http://www.loria.fr/~rougier/teaching/matplotlib/
- http://nbviewer.ipython.org/urls/raw.github.com/jrjohansson/scientific-python-lectures/master/Lecture-4-Matplotlib.ipynb