COE 111L - Spring 2017 - W 9-10 AM - WRW 209Jekyll2018-04-17T12:23:50-05:00http:/ECL2017S/Amir Shahmoradihttp:/ECL2017S/amir@ices.utexas.edu<![CDATA[Announcement 3: Final Semester Grades]]>http:/ECL2017S/announcement/3-final-grades2017-05-16T00:00:00-05:002017-05-16T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>The following figure is the histogram of the grand total grade for this course.<br />
<!-- Anyone above $3.5/4.0$, received a letter grade of **A**. --></p>
<figure>
<img src="http:/ECL2017S/announcement/3/gradeDistECL2017S.png" width="700" />
</figure>
<p><br /></p>
<p><a href="http:/ECL2017S/announcement/3-final-grades">Announcement 3: Final Semester Grades</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on May 16, 2017.</p><![CDATA[Homework 10: Solutions - Python advanced, Monte Carlo methods]]>http:/ECL2017S/homework/10-solutions-python-advanced-monte-carlo2017-05-04T00:00:00-05:002017-05-04T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This is the solution to <a href="10-problems-python-advanced-monte-carlo.html" target="_blank">Homework 10: Problems - Python advanced Monte Carlo</a>.</p>
<p>The following figure illustrates the grade distribution for this homework.</p>
<figure>
<img src="http:/ECL2017S/homework/gradeDist/gradeHistHomework10.png" width="700" />
<figcaption style="text-align:center">
Maximum possible points is 100.<br />
</figcaption>
</figure>
<hr />
<hr />
<p><br /></p>
<p>This homework further explores Monte Carlo methods in Python.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>1. </strong> <strong>Monte Carlo approximation of the number $\pi$</strong>. Suppose we did not know the value of $\pi$ and we wanted to estimate its value using Monte Carlo methods. One practical approach is to draw a square of unit side, with its diagonal opposite corners extending from the coordinates origin $(0,0)$ to $(1,1)$. Now we try to simulate uniform random points from inside of this square by generating uniform random points along the $X$ and $Y$ axes, i.e., by generating two random uniform numbers (x,y) from the range $[0,1]$.</p>
<p>Now the generated random point $P$ has the coordinate $(x,y)$, so we can calculate its distance from the coordinate origin. Now suppose we also draw a quarter-circle inside of this square whose radius is unit and is centered at the origin $(0,0)$. The ratio of the area of this quarter-circle, $S_C$ to the area of the area of the square enclosing it, $S_S$ is,</p>
<script type="math/tex; mode=display">\frac{S_C}{S_S} = \frac{\frac{1}{4}\pi r^2}{r^2} = \frac{1}{4}\pi</script>
<p>This is because the area of the square of unit sides, is just 1. Therefore, if we can somehow measure the area of the quarter $S_C$, then we can use the following equation, to get an estimate of $\pi$,</p>
<script type="math/tex; mode=display">\pi = 4S_C</script>
<p>In order to obtain, $S_C$, we are going to throw random points in the square, just as described above, and then find the fraction of points, $f=n_C/n_{\rm total}$, that fall inside this quarter-circle. This fraction is related to the area of the circle and square by the following equation,</p>
<script type="math/tex; mode=display">f=\frac{n_C}{n_{\rm total}} = \frac{S_C}{S_S}</script>
<p>Therefore, one can obtain an estimate of $\pi$ using this fraction,</p>
<script type="math/tex; mode=display">\pi \approx \frac{1}{4}\frac{n_C}{n_{\rm total}}</script>
<p>Now, write a Python function, that takes in the number of points to be simulated, and then calculates an approximate value for $\pi$ based on the Monte Carlo algorithm described above. Write a second function that plot the estimate of $\pi$ versus the number of points simulated, like the following,</p>
<figure>
<img src="http:/ECL2017S/homework/10/approximatePi_10000.png" width="900" />
</figure>
<p><br /></p>
<p><br />
<strong>Answer:</strong><br />
Here is an example Python code that estimates $\pi$.</p>
<pre><code class="language-python">import numpy.random as rnd
import numpy.linalg as nlg
import numpy as np
import matplotlib.pyplot as plt
def estimatePi(n=100000):
x = np.zeros((n,4),dtype=np.dtype)
counter = 0
for i in range(n):
x[i,0:2] = rnd.uniform(0.0,1.0,2) # generate two uniform random numbers
x[i,2] = nlg.norm(x[i,:], ord=2)
if (x[i,2]<=1.0): counter += 1
x[i,3] = 4.0*np.double(counter)/np.double(i+1)
return x # the running approximate of pi is returned as a vector in the fourth column of x
def plotPi(n=10000):
x = estimatePi(n)
plt.semilogx( list(range(1,n+1)) \
, x[:,3] \
) # plot with color red, as line
plt.xlabel('Number of simulated points')
plt.ylabel('Approximate value of Pi')
#plt.axis([1, n , 0.1, 4.0]) # [xmin, xmax, ymin, ymax]
plt.title('Estimating Pi by Monte Carlo simulation')
plt.savefig('approximatePi_{}.png'.format(n))
plt.show()
plotPi()
</code></pre>
<p><br /></p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>2. </strong> <strong>Monte Carlo sampling of distribution functions</strong> Suppose that you wanted to generate points whose distribution follows the blue curve in the following curve, whose mathematical formulation is known (in the case here, the function is just the sum of two Gaussian distributions).</p>
<figure>
<img src="http:/ECL2017S/homework/10/normSum.gif" width="700" />
</figure>
<p><br /></p>
<p>Now, one oway of doing this, is to draw a box around this curve, such that the box encompasses the entire curve.</p>
<figure>
<img src="http:/ECL2017S/homework/10/normSumWithRec_2.gif" width="700" />
</figure>
<p><br /></p>
<p>Then, just as we did in the previous problem above, we draw random points from this square, and keep only those points that fall beneath this blue curve, like the red points in the following animation.</p>
<figure>
<img src="http:/ECL2017S/homework/10/RejSamForever.gif" width="700" />
</figure>
<p><br /></p>
<p>Now, if you plot the histogram of these points, you will see that the distribution of the red points follows closely the blue curve that we wanted to sample.</p>
<figure>
<img src="http:/ECL2017S/homework/10/rejSamHistForever.gif" width="700" />
</figure>
<p><br /></p>
<p>Now, given the above example, consider the following distribution function which we want to sample,</p>
<script type="math/tex; mode=display">f(x) = \frac{(x+1)}{12} \exp\bigg(-\frac{(x-1)^2}{2x}\bigg) ~~,~~ x > 0.</script>
<p>Suppose we know already that the highest point (maximum value) of this function is $f<0.2$, so that the value of this function always remains below $0.2$ everywhere along the positive x-axis, as seen in the following figure,</p>
<figure>
<img src="http:/ECL2017S/homework/10/prob2Func.png" width="900" />
</figure>
<p><br /></p>
<p><strong>(A)</strong> Now, first write a function that generates a plot of this function, similar to the above plot.</p>
<p><br />
<strong>Answer:</strong><br />
Here is an example Python script that plots the requested curve.</p>
<pre><code class="language-python">import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0.001,15,100)
fig,ax=plt.subplots()
f= lambda x: np.exp(-(x-1)**2/2./x)*(x+1)/12.
fx = f(x)
ax.plot(x,fx,label='$f(x)$')
ax.legend(loc=0,fontsize=16);
plt.savefig('prob2Func.png')
</code></pre>
<p><br /></p>
<p><strong>(B)</strong> Then write another Python script, that samples from this function by first drawing a rectangle of base size $[0,15]$ and height $[0,h]$ with $h=0.2$. Then, draw uniform random points from this rectangle, and keep those that fall beneath the the value of $f(x)$ given above as points that are sampled from this function. Finally make a histogram of these points like the following.</p>
<figure>
<img src="http:/ECL2017S/homework/10/prob2FuncHist.png" width="900" />
</figure>
<p><br /></p>
<p><br />
<strong>Answer:</strong><br />
Here is an example Python script that makes the requested plot.</p>
<pre><code class="language-python">h=.2
u1 = np.random.rand(10000)*15 # uniform random samples scaled out
u2 = np.random.rand(10000) # uniform random samples
idx = np.where(u2<=f(u1)/h)[0] # rejection criterion
v = u1[idx]
fig,ax=plt.subplots()
plt.hold(True)
ax.hist(v,normed=1,bins=40,alpha=.3)
ax.plot(x,fx,'r',lw=3.,label='$f(x)$')
ax.legend(fontsize=18)
plt.savefig('prob2FuncHist.png')
plt.show()
</code></pre>
<p><br /></p>
<p><strong>(C)</strong> Now make a plot of all generated points, both those that were accepted as samples, and those that were rejected, similar to the following plot, with accepted points in red color, and rejected points in black,</p>
<figure>
<img src="http:/ECL2017S/homework/10/prob2FuncScatter.png" width="700" />
</figure>
<p><br /></p>
<p><br />
<strong>Answer:</strong><br />
Here is an example Python script that makes the requested plot.</p>
<pre><code class="language-python">fig,ax=plt.subplots()
ax.plot(u1,u2,'k.',label='rejected',alpha=.3)
ax.plot(u1[idx],u2[idx],'r.',label='accepted',alpha=.3)
ax.legend(fontsize=16)
plt.savefig('prob2FuncScatter.png')
plt.show()
</code></pre>
<p><br /></p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/homework/10-solutions-python-advanced-monte-carlo">Homework 10: Solutions - Python advanced, Monte Carlo methods</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on May 04, 2017.</p><![CDATA[Lecture 11: Python advanced topics - decorators and classes]]>http:/ECL2017S/lecture/11-python-advanced-decorator-class2017-04-26T00:00:00-05:002017-04-26T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This lecture discusses some further important topics in Python: Decorators and Classes.</p>
<div class="post_toc"></div>
<h2 id="pyton-decorators">Pyton Decorators</h2>
<p>In simple words, Python decorators are functions that can modify (e.g., add to) the functionalities of other functions. As will be described below, decorators are particularly useful in making your code shorter. To understand the workings of decorators, we will have to recall a few properties of functions in Python.</p>
<p>Firstly, since every entity in Python is an object, including functions, almost everything, including functions can be assigned to a variable. For example, the simple function,</p>
<pre><code class="language-python">def hello(name='Amir'):
return 'Hello ' + name
</code></pre>
<p><br />
can be assigned to a new variable,</p>
<pre><code class="language-python">greet = hello
</code></pre>
<p><br />
which is also a function,</p>
<pre><code class="language-python">greet
</code></pre>
<pre><code><function __main__.hello>
</code></pre>
<p>and more importantly, it is not attached to the original function <code>hello()</code>,</p>
<pre><code class="language-python">del hello
hello
</code></pre>
<pre><code>---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-6-b1946ac92492> in <module>()
----> 1 hello
NameError: name 'hello' is not defined
</code></pre>
<pre><code class="language-python">greet
</code></pre>
<pre><code><function __main__.hello>
</code></pre>
<pre><code class="language-python">greet()
</code></pre>
<pre><code>'Hello Amir'
</code></pre>
<h3 id="functions-inside-other-functions">Functions inside other functions</h3>
<p>Now one thing to keep in mind, is that you can define functions inside functions in Python, just as you can do in almost any other capable language.</p>
<pre><code class="language-python">def hello(name='Amir'):
print('This is from inside function hello()')
def greet():
return '\t This is from inside function greet() inside function hello()'
def welcome():
return '\t This is from inside function welcome() inside function hello()'
print(greet())
print(welcome())
print("This is from inside function hello()")
</code></pre>
<p><br />
So now, if you type,</p>
<pre><code class="language-python">hello()
</code></pre>
<pre><code>This is from inside function hello()
This is from inside function greet() inside function hello()
This is from inside function welcome() inside function hello()
This is from inside function hello()
</code></pre>
<p>Another cool feature to know about, is that you can have functions both as input and return values to and from another function. We have seen this already in previous lectures, where we discussed functions for the first time.</p>
<p>With this in mind, let’s create a function like the following,</p>
<pre><code class="language-python">def decorateThisFunction(func):
def wrapInputFunction():
print("Some decorating code can be executed here, before calling the input function")
func()
print("Some decorating code can be executed here, after calling the input function")
return wrapInputFunction
def needsDecorator():
print("\t This function needs a Decorator")
</code></pre>
<p><br />
So now,</p>
<pre><code class="language-python">needsDecorator()
</code></pre>
<pre><code>This function needs a Decorator
</code></pre>
<p>Now, it can happen in your programming that you need to do a specific set of tasks for a function repeatedly, so you may prefer to redefine/reassign your function to your decorated function, such that whenever you call your function by its own name, it is always returned in the modified (decorated) state. For example, see what happens with,</p>
<pre><code class="language-python">needsDecorator = decorateThisFunction(needsDecorator) # Reassign needsDecorator to the new decorated state
</code></pre>
<p><br />
which upon calling outputs,</p>
<pre><code class="language-python">needsDecorator()
</code></pre>
<pre><code>Some decorating code can be executed here, before calling the input function
This function needs a Decorator
Some decorating code can be executed here, after calling the input function
</code></pre>
<p>What happened above is that we wrapped the function and modified its behavior using a simple <strong>decorator</strong>. We could have also assigned this new modified (decorated) state of the function to a variable with other name than the function name itself. For example,</p>
<pre><code class="language-python">test = decorateThisFunction(needsDecorator)
test()
</code></pre>
<pre><code>Some decorating code can be executed here, before calling the input function
This function needs a Decorator
Some decorating code can be executed here, after calling the input function
</code></pre>
<p>But, you should have done this <strong>before</strong> reassigning the function name <code>needsDecorator</code> to its new, decorated, state. If you do this after the reassignment, then this is what you get,</p>
<pre><code class="language-python">test = decorateThisFunction(needsDecorator)
test()
</code></pre>
<pre><code>Some decorating code can be executed here, before calling the input function
Some decorating code can be executed here, before calling the input function
This function needs a Decorator
Some decorating code can be executed here, after calling the input function
Some decorating code can be executed here, after calling the input function
</code></pre>
<p>In other word, you decorate your already-decorated function <code>needsDecorator()</code>, one more time by passing it to <code>decorateThisFunction()</code>. Now, since this functionality is needed frequently in Python, Python has a special syntax for it, the <strong>Decorator syntax</strong>,</p>
<pre><code class="language-python">@decorateThisFunction
def needsDecorator():
print "This function needs a Decorator"
</code></pre>
<p><br />
The above statement is an exact equivalent to,</p>
<pre><code class="language-python">needsDecorator = decorateThisFunction(needsDecorator) # Reassign needsDecorator to the new decorated state
</code></pre>
<p><br />
which we used before to decorate our function.</p>
<p>You may wonder what the use of decorators could be. Decorators can be a handy tool for functionalities that have to be repeated for many functions. For example, <a href="https://en.wikipedia.org/wiki/Profiling_(computer_programming)" target="_blank">Porofiling</a> and timing the performance of functions require the idea of decorators. Most often, decorators are useful and needed in web development with Python.</p>
<h2 id="pyton-classes">Pyton classes</h2>
<p>The concept of Python class, as in almost any other programming language, relates to packing <strong>a set of variables</strong> together <strong>with a set of functions</strong> operating on the data. The goal of writing classes is to achieve more modular code by grouping data and functions into manageable units. One thing to keep in mind for scientific computing is that, classes, and more generally, <a href="https://en.wikipedia.org/wiki/Object-oriented_programming" target="_blank">Object Oriented Programming (OOP)</a>, are not necessary, and could be a hinderance to efficient computing if used naively. Nevertheless, classes lead to either more elegant solutions to the programming problem, or a code that is easier to extend and maintain in large scale projects. In the non-mathematical programming world where there are no mathematical concepts and associated algorithms to help structure the programming problem, software development can be very challenging. In those cases, Classes greatly improve the understanding of the problem and simplify the modeling of data. As a consequence, almost all large-scale software systems being developed in the world today are heavily based on classes (but certainly not all scientific projects!).</p>
<p>Programming with classes is offered by most modern programming languages, including Python. Python uses the concept of classes in almost every bit of it. However, most Python users don’t even notice the heavy dependence of Python on classes under the hood, until the actually learn what a class is, just as we have made progress in this class so far, without knowing about classes.</p>
<p>Classes can be used for many purposes in scientific programming and computation. One of the most frequently encountered tasks is to represent mathematical functions that have a set of parameters in addition to one or more independent variables. To expand on this, consider the problem described in the following section.</p>
<h3 id="a-common-programming-challenge-in-numerical-computing">A common programming challenge in numerical computing</h3>
<p>To motivate for the class concept, let’s look at functions with parameters. One example is $y(t) = v_0t-\frac{1}{2}gt^2$. Conceptually, in physics, $y$ is viewed as a function of $t$, but mathematically $y$ also depends on two other parameters, $v_0$ and $g$, although it is not natural to view $y$ as a function of these parameters. One can therefore write $f(t;v_0g)$ to emphasize that $t$ is the independent variable, while $v_0$ and $g$ are parameters. Strictly speaking, $g$ is a fixed parameter (as long as the experiment is run on the surface of the earth), so only $v_0$ and $t$ can be arbitrarily chosen in the formula.
It would then be better to write $y(t;v_0). Here is an implementation of this function,</p>
<pre><code class="language-python">def y(t, v0):
g = 9.81
return v0*t - 0.5*g*t**2
</code></pre>
<p><br />
This function gives the height of the projectile as a function of time. Now suppose you wanted to differentiate $y$ with respect to $t$ in order to obtain the velocity. You could write the following code to do so,</p>
<pre><code class="language-python">def diff(f, x, h=1E-5):
return (f(x+h) - f(x))/h
</code></pre>
<p><br />
But, here is the catch with this problem of differentiation. The <code>diff</code> function works with any function <code>f</code> that takes <strong>only</strong> one argument. In other words, if we want to input <code>y</code> to <code>diff</code>, then we will have to redefine <code>y</code> to take only one argument. You may wonder why not change <code>diff</code>. For this simple problem, this could be a solution. But, with larger problems, you are more likely to use sophisticated routines and modules that have been already developed and many of these routines take a function as input that only has one input variable. This is quite often the case with high-performance integration routines.</p>
<p>One, perhaps bad, solution to the above problem is to use <strong>global variables</strong>. The requirement is thus to define Python implementations of mathematical functions of one variable with one argument, the independent variable,</p>
<pre><code class="language-python">def y(t):
g = 9.81
return v0*t - 0.5*g*t**2
</code></pre>
<p><br />
This function will work only if <code>v0</code> is a global variable, initialized before one attempts to call the function. Here is an example call where <code>diff</code> differentiates y,</p>
<pre><code class="language-python">v0 = 3
dy = diff(y, 1)
</code></pre>
<p><br />
The use of global variables is in general considered bad programming. Why global variables are problematic in the present case can be illustrated when there is need to work with several versions of a function. Suppose we want to work with two versions of $y(t;v_0)$, one with $v_0=1$ and one with $v_0=5$. Every time we call <code>y</code>, we must remember which version of the function we work with, and set <code>v0</code> accordingly prior to the call,</p>
<pre><code class="language-python">v0 = 1; r1 = y(t)
v0 = 5; r2 = y(t)
</code></pre>
<p><br />
<em>Another problem</em> is that variables with simple names like <code>v0</code>, may easily be used as global variables in other parts of the program. These parts may change our <code>v0</code> in a context different from the <code>y</code> function, but the change affects the correctness of the <code>y</code> function. In such a case, we say that changing <code>v0</code> has <strong>side effects</strong>, i.e., <strong>the change affects other parts of the program in an unintentional way</strong>. This is one reason why a golden rule of programming tells us to <strong>limit the use of global variables as much as possible</strong>.</p>
<p>An alternative solution to the problem of needing two <code>v0</code> parameters could be to introduce two <code>y</code> functions, each with a distinct <code>v0</code> parameter,</p>
<pre><code class="language-python">def y1(t):
g = 9.81
return v0_1*t - 0.5*g*t**2
def y2(t):
g = 9.81
return v0_2*t - 0.5*g*t**2
</code></pre>
<p><br /></p>
<p>Now we need to initialize <code>v0_1</code> and <code>v0_2</code> once, and then we can work with <code>y1</code> and <code>y2</code>. However, if we need $100$ <code>v0</code> parameters, we need $100$ functions. This is tedious to code, error prone, difficult to administer, and simply a really bad solution to a programming problem.</p>
<p><strong>So, is there a good remedy?</strong> The answer is yes: the class concept solves all the problems described above.</p>
<h4 id="class-representation-of-a-function">Class representation of a function</h4>
<p>A class as contains a set of variables (data) and a set of functions, held together as one unit. The variables are visible in all the functions in the class. That is, we can view the variables as “global” in these functions. These characteristics also apply to modules, and modules can be used to obtain many of the same advantages as classes offer (see comments in Sect. 7.1.6). However, classes are technically very different from modules. You can also make many copies of a class, while there can be only one copy of a module. When you master both modules and classes, you will clearly see the similarities and differences. Now we continue with a specific example of a class.</p>
<p>Consider the function $y(t;v_0) = v_0t - \frac{1}{2}gt^2$. We may say that $v_0$ and $g$, represented by the variables <code>v0</code> and <code>g</code>, constitute the data. A Python function, say <code>value(t)</code>, is then needed to compute the value of $y(t;v_0)$ and this function must have access to the data <code>v0</code> and <code>g</code>, while <code>t</code> is an argument. A programmer experienced with classes will then suggest to collect the data <code>v0</code> and <code>g</code>, and the function <code>value(t)</code>, together as a <strong>class</strong>. In addition, a class usually has another function, called <strong>constructor</strong> for <strong>initializing the data</strong>. The constructor is always named <code>__init__</code>. Every <strong>class must have a name</strong>, often <strong>starting with a capital</strong>, so we choose <code>Y</code> as the name since the class represents a mathematical function with name <code>y</code>. The next step is to implement this class in Python. A complete class code <code>Y</code> for our problem here would look as follows in Python:</p>
<pre><code class="language-python">class Y:
def __init__(self, v0):
self.v0 = v0
self.g = 9.81
def value(self, t):
return self.v0*t - 0.5*self.g*t**2
</code></pre>
<p><br />
<strong>A class creates a new data type</strong>, here of name <code>Y</code>, so when we use the class to make objects, those objects are of type Y. <strong>All the standard Python objects, such as lists, tuples, strings, floating-point numbers, integers, …, are built-in Python classes</strong>, and each time the user creates on these variable types, one instance os these classes is created by the Python interpreter. A user-defined object class (like Y) is usually called an <strong>instance</strong>. We need such an instance in order to use the data in the class and call the value function. The following statement constructs an instance bound to the variable name y:</p>
<pre><code class="language-python">y = Y(3)
</code></pre>
<p><br />
Seemingly, we <em>call the class <code>Y</code> as if it were a function</em>. Indeed, <code>Y(3)</code> is automatically translated by Python to a call to the constructor <code>__init__</code> in class Y. The arguments in the call, here only the number <code>3</code>, are always passed on as arguments to <code>__init__</code> after the <code>self</code> argument. That is, <code>v0</code> gets the value <code>3</code> and self is just dropped in the call. This may be confusing, but it is a rule that the self argument is never used in calls to functions in classes.
With the instance <code>y</code>, we can compute the value of y(t=0.1;v_0=3) by the statement,</p>
<pre><code class="language-python">v = y.value(0.1)
</code></pre>
<p><br />
Note that the <code>self</code> input argument is dropped in the call to <code>value()</code>. To access functions and variables in a class, one must prefix the function and variable names by the name of the instance and a dot: the value function is reached as <code>y.value</code>, and the variables are reached as <code>y.v0</code> and <code>y.g</code>. One could, for example, print the value of <code>v0</code> in the instance <code>y</code> by writing,</p>
<pre><code class="language-python">print y.v0
</code></pre>
<p><br />
We have already introduced the term <strong>instance</strong> for the object of a class. <strong>Functions</strong> in classes are commonly called <strong>methods</strong>, and <strong>variables (data)</strong> in classes are called <strong>data attributes</strong>. Methods are also known as <strong>method attributes</strong>. For example, in our sample class <code>Y</code> we have two methods or method attributes, <code>__init__</code> and <code>value</code>, two data attributes, <code>v0</code> and <code>g</code>, and four attributes in total (<code>__init__</code>, <code>value</code>, <code>v0</code>, <code>g</code>). Note that the names of attributes can be chosen freely, just as names of ordinary Python functions and variables. However, <strong>the constructor
must have the name <code>__init__</code>, otherwise it is not automatically called when new instances are created</strong>. You can do whatever you want in whatever method, but it is a common convention to <strong>use the constructor for initializing the variables in the class</strong>.</p>
<p>So far, we have explained a method of writing our function of interest in a class style, which resolves the need to pass a auxiliary variable to a function explicitly. But if you look at the original problem that we had, you will notice that we still cannot use our class <code>Y</code> instance <code>y</code> as an argument to other functions similar to <code>diff()</code>. The final resolution to this problem is to add a <code>__call__</code> method to our originally defined <code>Y</code> class.</p>
<h3 id="callable-objects">Callable objects</h3>
<p>If you recall, computing the value of the mathematical function represented by class <code>Y</code>, with <code>y</code> as the name of the instance, is performed by writing <code>y.value(t)</code>. If we could write just <code>y(t)</code>, the <code>y</code> instance would look as an ordinary function. Such a syntax is indeed possible and offered by the special method named <code>__call__</code>.</p>
<pre><code class="language-python">class Y:
def __init__(self, v0):
self.v0 = v0
self.g = 9.81
def value(self, t):
return self.v0*t - 0.5*self.g*t**2
def __call__(self, t):
return self.v0*t - 0.5*self.g*t**2
</code></pre>
<p><br />
then, writing <code>y(t)</code> implies a call like <code>y.__call__(t)</code>, which is equivalent to <code>y.value(t)</code>. The previous value method is now redundant. A good programming convention is to <strong>include a <code>__call__</code> method in all classes that represent a mathematical function</strong>. Instances with <code>__call__</code> methods are said to be <strong>callable objects</strong>, just as plain functions are callable objects as well. The call syntax for callable objects is the same, regardless of whether the object is a function or a class instance.</p>
<p>You can always test if an instance is callable or not by <code>callable()</code>,</p>
<pre><code class="language-python">callable(y)
</code></pre>
<pre><code>True
</code></pre>
<p><br /><br />
<strong>Reference</strong><br />
<br />
The book <a href="http://link.springer.com/book/10.1007%2F978-3-662-49887-3" target="_blank">A Primer on Scientific Programming with Python</a> by Hans Petter Langtangen, provides a good starting point on the use of Classes and OOP in Python from a scientific programming perspective. The examples provided in this lecture heavily rely on Langtangen’s notes on Python classes in his textbook in chapter 7. You can download a complete electronic copy of this book for free from Springer website, if you redirect to Springer page from <a href="http://www.lib.utexas.edu/" target="_blank">UT Austin library page</a>.</p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/lecture/11-python-advanced-decorator-class">Lecture 11: Python advanced topics - decorators and classes</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 26, 2017.</p><![CDATA[Homework 9: Solutions - Python advanced IO, Monte Carlo]]>http:/ECL2017S/homework/9-solutions-python-advanced-io-monte-carlo-interoperability2017-04-26T00:00:00-05:002017-04-26T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This is the solution to <a href="9-problems-python-advanced-io-monte-carlo-interoperability.html" target="_blank">Homework 9: Problems - Python advanced IO, Monte Carlo</a>.</p>
<p>The following figure illustrates the grade distribution for this homework.</p>
<figure>
<img src="http:/ECL2017S/homework/gradeDist/gradeHistHomework9.png" width="700" />
<figcaption style="text-align:center">
Maximum possible points is 100.<br />
</figcaption>
</figure>
<hr />
<hr />
<p><br /><br />
This homework aims at giving you some experience with Python’s tools for interacting with the World Wide Web and writing Monte Carlo simulations.
<br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>Update:</strong> As I discussed in class, in order to avoid creating potential traffic on Professor Butler’s webpage, I have now uploaded all the necessary files on <a href="http:/ECL2017S/homework/9/swift/bat_time_table.html" target="_blank">this address</a> (don’t click on the links in this table, because it will take you to Professor Butler’s repository for this data. I have all the data already saved in our domain locally). So now, your goal is to first read the event-ID HTML table from</p>
<p><a href="http:/ECL2017S/homework/9/swift/bat_time_table.html" target="_blank">https://www.shahmoradi.orghttp:/ECL2017S/homework/9/swift/bat_time_table.html</a>.</p>
<p>Then, use the event-IDs in this table to generate web addresses like:</p>
<p><a href="http:/ECL2017S/homework/9/swift/GRB00100433_ep_flu.txt" target="_blank">https://www.shahmoradi.orghttp:/ECL2017S/homework/9/swift/GRB00100433_ep_flu.txt</a></p>
<p>in order to download these <code>.txt</code> files from the web. The rest of the homework is just as you would have done this problem as descibed below.</p>
<p><strong>1. </strong> <strong>Reading scientific data from web</strong>. Consider the webpage of Professor <a href="http://butler.lab.asu.edu/" target="_blank">Nat Butler</a> at Arizona State University. He has successfully written Python piplines for automated analysis of data from <a href="https://www.nasa.gov/mission_pages/swift/main" target="_blank">NASA’s Swift satellite</a>. For each <a href="https://en.wikipedia.org/wiki/Gamma-ray_burst" target="_blank">Gamma-Ray Burst (GRB)</a> detection that Swift makes, his pipline analyzes and reduces data for the burst and summarizes the results on his personal webpage, for example in <a href="http://butler.lab.asu.edu/swift/bat_spec_table.html" target="_blank">this table</a>.</p>
<p><strong>(A)</strong> Write a Python function named <code>fetchHtmlTable(link,outputPath)</code> that takes two arguments:</p>
<ol>
<li>a web address (which will be this: <a href="http://butler.lab.asu.edu/swift/bat_time_table.html" target="_blank">http://butler.lab.asu.edu/swift/bat_time_table.html</a>), and</li>
<li>an output path to where you want the code save the resulting files. One file is exact HTML contained in the input webpage address, and a second file, which is the Table contained in this HTML address. To parse the HTML table in this address, you will need the Python code <a href="http:/ECL2017S/homework/9/parseTable.py" target="_blank">parseTable.py</a> also available and explained on <a href="https://www.summet.com/dmsi/html/readingTheWeb.html" target="_blank">this page</a>. This parsed HTML table, will be in the form of a list, whose elements correspond to each row in the HTML table, and each row of element of this parsed table is itself another list, that contains the columns of the HTML table in that row. Output this table as well, in a separate file, in a formatted style, meaning that each element of table in a row has a space of 30 characters for itself (or something appropriate as you wish, e.g., <code>'{:>30}'.format(item)</code>). You can see an example output of the code <a href="http:/ECL2017S/homework/9/bat_time_table.html" target="_blank">here for the HTML output file</a>, and <a href="http:/ECL2017S/homework/9/bat_time_table.html.tab" target="_blank">here for parse HTML table</a>.</li>
</ol>
<p><strong>(B)</strong> Now, if you look at the content of the file that your function has generated (once you run it), you will see something like the following,</p>
<pre><code class="language-text"> GRB (Trig#) Trig_Time (SOD) Time Region [s] T_90 T_50 rT_0.90 rT_0.50 rT_0.45 T_av T_max T_rise T_fall Cts Rate_pk Band
GRB170406x (00745966) 44943.130 -40.63->887.37 881.000 +/-7.697 549.000 +/-25.558 280.000 +/-13.432 124.000 +/-5.751 109.000 +/-4.997 433.667 +/-15.557 877.870 +/-367.494 890.500 +/-366.900 0.000 +/-366.630 6.082 +/-0.344 0.018 +/-0.0065 15-350keV
GRB170402x (00745090) 38023.150 54.35->66.35 9.000 +/-2.096 5.000 +/-1.490 7.000 +/-1.535 4.000 +/-0.640 3.000 +/-0.559 60.964 +/-1.316 58.850 +/-2.417 1.500 +/-2.894 7.500 +/-2.720 0.162 +/-0.045 0.022 +/-0.0106 15-350keV
GRB170401x (00745022) 68455.150 -19.63->71.49 78.880 +/-5.224 39.440 +/-4.168 41.480 +/-3.823 18.360 +/-1.585 16.320 +/-1.341 29.541 +/-2.857 24.910 +/-24.386 37.740 +/-24.251 41.140 +/-23.977 1.181 +/-0.122 0.024 +/-0.0130 15-350keV
GRB170331x (00744791) 6048.440 9.835->35.875 20.160 +/-1.285 10.290 +/-0.914 14.070 +/-1.041 5.880 +/-0.415 5.040 +/-0.359 21.461 +/-0.598 12.460 +/-5.633 0.525 +/-5.718 19.635 +/-5.840 1.875 +/-0.154 0.134 +/-0.0408 15-350keV
</code></pre>
<p><br />
Now write another function that reads the event unique numbers that appear in this table in parentheses (e.g., 00745966 is the first in table), and puts this number in place of <code>event_id</code> in this web address template: <code>http://butler.lab.asu.edu/swift/event_id/bat/ep_flu.txt</code>.</p>
<p>Now note that, for some events, this address exists, for example,</p>
<p><a href="http://butler.lab.asu.edu/swift/00745966/bat/ep_flu.txt" target="_blank">http://butler.lab.asu.edu/swift/00745966/bat/ep_flu.txt</a>,</p>
<p>which is a text file named <code>ep_flu.txt</code>. For some other events, this address might not exist, for example,</p>
<p><a href="http://butler.lab.asu.edu/swift/00680331/bat/ep_flu.txt" target="_blank">http://butler.lab.asu.edu/swift/00680331/bat/ep_flu.txt</a>,</p>
<p>in which case your code will have to raise a <code>urllib.request.HTTPError</code> exception. Write your code such that it can smoothly skip these exceptions. Write your code such that it saves all those text files on your local computer, in a file-name format like <a href="http:/ECL2017S/homework/9/GRB00100433_ep_flu.txt" target="_blank">this example: <code>GRB00100433_ep_flu.txt</code></a> (A total of 938 files exist).</p>
<p><strong>(C)</strong> Now write a third function, that reads all of these files in your directory, one by one, as numpy arrays, and plots the content of all of them together, on a single scatter plot like the following,</p>
<figure>
<img src="http:/ECL2017S/homework/9/ep_flu.png" width="900" />
</figure>
<p><br /></p>
<p>To achieve this goal, your function should start like the following,</p>
<pre><code class="language-python">def plotBatFiles(inPath,figFile):
import os
import numpy as np, os
import matplotlib.pyplot as plt
ax = plt.gca() # generate a plot handle
ax.set_xlabel('Fluence [ ergs/cm^2 ]') # set X axis title
ax.set_ylabel('Epeak [ keV ]') # set Y axis title
ax.axis([1.0e-8, 1.0e-1, 1.0, 1.0e4]) # set axix limits [xmin, xmax, ymin, ymax]
plt.hold('on') # add all data files to the same plot
counter = 0 # counts the number of events
</code></pre>
<p><br />
where <code>inPath</code> and <code>figFile</code> are the path to the directory containing the files, and the name and path to the output figure file. You will have to use <code>os.listdir(inPath)</code> to get a list of all files in your input directory. Then loop over this list of files, and use only those that end with <code>ep_flu.txt</code> because that’s how you saved those files, e.g.,</p>
<pre><code class="language-python">for file in os.listdir(inPath):
if file.endswith("ep_flu.txt"):
# rest of your code ...
</code></pre>
<p><br />
But now, you have to also make sure that your input data does indeed contain some numerical data, because some files do contain anything, although they exist, like <a href="http:/ECL2017S/homework/9/GRB00559075_ep_flu.txt" target="_blank">this file: ``</a>. To do so, you will have to perform a test on the content of file, once you read it as numpy array, like the following,</p>
<pre><code class="language-python"> data = np.loadtxt(os.path.join(inPath, file), skiprows=1)
if data.size!=0 and all(data[:,1]<0.0):
# then plot data
</code></pre>
<p><br />
the condition <code>all(data[:,1]<0.0)</code> is rather technical. It makes sure that all values are positive on the second column. Once you have done all these checks, you have to do one final manipulation of data, that is, the data in these files on the second column is actually the log of data, so have to get the <code>exp()</code> value to plot it (because plot is log-log). To do so you can use,</p>
<pre><code class="language-python"> data[:,1] = np.exp(data[:,1])
</code></pre>
<p><br />
and then finally,</p>
<pre><code class="language-python"> ax.scatter(data[:,1],data[:,0],s=1,alpha=0.05,c='r',edgecolors='none')
</code></pre>
<p><br />
which will add the data for the current file to the plot. At the end, you will have to set a title for your plot as well, and save your plot,</p>
<pre><code class="language-python"> ax.set_title('Plot of Epeak vs. Fluence for {} Swift GRB events'.format(counter))
plt.savefig(figFile)
</code></pre>
<p><br />
Note that the variable <code>counter</code> contains the total number of events for which the text files exists on the website, <strong>and</strong> the file contained some data (i.e., was not empty).</p>
<p><strong>Question:</strong> What does <code>alpha=0.05</code> and <code>s=1</code> do in the following scatter plot command? (Vary their values to see what happens)</p>
<p><br />
<strong>Answer:</strong><br />
An example code can be downloaded from <a href="http:/ECL2017S/homework/9/solutions/readTable.py" target="_blank">here</a>.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>2. </strong> <strong>Simulating a fun Monte Carlo game.</strong> Suppose you’re on a game show, and you’re given the choice of three doors:</p>
<figure>
<img src="http:/ECL2017S/homework/9/Monty_1.png" width="600" />
</figure>
<p><br /></p>
<p>Behind one door is a car; behind the two others, goats. You pick a door, say No. 1, and the host of the show opens another door, say No. 3, which has a goat.</p>
<figure>
<img src="http:/ECL2017S/homework/9/Monty_open_door.png" width="600" />
</figure>
<p><br /></p>
<p>He then says to you, “Do you want to pick door No. 2?”.</p>
<p><strong>Question: What would you do?</strong><br />
Is it to your advantage to switch your choice from door 1 to door 2? Is it to your advantage, <strong>in the long run, for a large number of game tries</strong>, to switch to the other door?</p>
<p>Now whatever your answer is, I want you to check/prove your answer by a Monte Carlo simulation of this problem. Make a plot of your simulation for $ngames=100000$ repeat of this game, that shows, in the long run, on average, what is the probability of winning this game if you switch your choice, and what is the probability of winning, if you do not switch to the other door.</p>
<p><br />
<strong>Answer:</strong><br />
An example code can be downloaded from <a href="http:/ECL2017S/homework/9/solutions/monteGame.py" target="_blank">here</a>. Here is the code figure output,</p>
<figure>
<img src="http:/ECL2017S/homework/9/solutions/MontyGameResult.png" width="800" />
</figure>
<p><br />
As you see in the figure, although you may initially win by not switching your choice, but in the long run, on average, you will lose, if you don’t switch your choice.</p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/homework/9-solutions-python-advanced-io-monte-carlo-interoperability">Homework 9: Solutions - Python advanced IO, Monte Carlo</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 26, 2017.</p><![CDATA[Homework 10: Problems - Python advanced, Monte Carlo methods]]>http:/ECL2017S/homework/10-problems-python-advanced-monte-carlo2017-04-26T00:00:00-05:002017-04-26T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This homework further explores Monte Carlo methods in Python.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>1. </strong> <strong>Monte Carlo approximation of the number $\pi$</strong>. Suppose we did not know the value of $\pi$ and we wanted to estimate its value using Monte Carlo methods. One practical approach is to draw a square of unit side, with its diagonal opposite corners extending from the coordinates origin $(0,0)$ to $(1,1)$. Now we try to simulate uniform random points from inside of this square by generating uniform random points along the $X$ and $Y$ axes, i.e., by generating two random uniform numbers (x,y) from the range $[0,1]$.</p>
<p>Now the generated random point $P$ has ther coordinate $(x,y)$, so we can calculate its distance from the coordiante origin. Now suppose we also draw a quarter-circle inside of this square whose radius is unit and is centered at the origin $(0,0)$. The ratio of the area of this quarter-circle, $S_C$ to the area of the area of the square enclosing it, $S_S$ is,</p>
<script type="math/tex; mode=display">\frac{S_C}{S_S} = \frac{\frac{1}{4}\pi r^2}{r^2} = \frac{1}{4}\pi</script>
<p>This is because the area of the square of unit sides, is just 1. Therefore, if we can somehow measure the area of the quarter $S_C$, then we can use the following equation, to get an estimate of $\pi$,</p>
<script type="math/tex; mode=display">\pi = 4S_C</script>
<p>In order to obtain, $S_C$, we are going to throw random points in the square, just as described above, and then find the fraction of points, $f=n_C/n_{\rm total}$, that fall inside this quarter-circle. This fracton is related to the area of the circle and square by the following equation,</p>
<script type="math/tex; mode=display">f=\frac{n_C}{n_{\rm total}} = \frac{S_C}{S_S}</script>
<p>Therefore, one can obtain an estimate of $\pi$ using this fraction,</p>
<script type="math/tex; mode=display">\pi \approx \frac{1}{4}\frac{n_C}{n_{\rm total}}</script>
<p>Now, write a Python function, that takes in the number of points to be simulated, and the calculate an approximate value for $\pi$ based on the Monte Carlo algorithm described above. Write a second function that plot the estimate of $\pi$ versus the number of points simulated, like the following,</p>
<figure>
<img src="http:/ECL2017S/homework/10/approximatePi_10000.png" width="900" />
</figure>
<p><br /></p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>2. </strong> <strong>Monte Carlo sampling of distribution functions</strong> Suppose that you wanted to generate points whose distribution follows the blue curve in the following figure, whose mathematical formulation is known (in the case here, the function is just the sum of two Gaussian distributions).</p>
<figure>
<img src="http:/ECL2017S/homework/10/normSum.gif" width="700" />
</figure>
<p><br /></p>
<p>Now, one oway of doing this, is to draw a box around this curve, such that the box encompasses the entire curve.</p>
<figure>
<img src="http:/ECL2017S/homework/10/normSumWithRec_2.gif" width="700" />
</figure>
<p><br /></p>
<p>Then, just as we did in the previous problem above, we draw random points from this square, and keep only those points that fall beneath this blue curve, like the red points in the following animation.</p>
<figure>
<img src="http:/ECL2017S/homework/10/RejSamForever.gif" width="700" />
</figure>
<p><br /></p>
<p>Now, if you plot the histogram of these points, you will see that the distribution of the red points follows closely the blue curve that we wanted to sample.</p>
<figure>
<img src="http:/ECL2017S/homework/10/rejSamHistForever.gif" width="700" />
</figure>
<p><br /></p>
<p>Now, given the above example, consider the following distribution function which we want to sample,</p>
<script type="math/tex; mode=display">f(x) = \frac{(x+1)}{12} \exp\bigg(-\frac{(x-1)^2}{2x}\bigg) ~~,~~ x > 0.</script>
<p>Suppose we know already that the highest point (maximum value) of this function is $f_{\rm max}<0.2$, so that the value of this function always remains below $0.2$ everywhere along the positive x-axis, as seen in the following figure,</p>
<figure>
<img src="http:/ECL2017S/homework/10/prob2Func.png" width="900" />
</figure>
<p><br /></p>
<p><strong>(A)</strong> Now, first write a function that generates a plot of this function, similar to the above plot.</p>
<p><strong>(B)</strong> Then write another Python script, that samples from this function by first drawing a rectangle of base size $[0,15]$ and height $[0,h]$ with $h=0.2$. Then, draw uniform random points from this rectangle, and keep those that fall beneath the the value of $f(x)$ given above as points that are sampled from this function. Finally make a histogram of these points like the following.</p>
<figure>
<img src="http:/ECL2017S/homework/10/prob2FuncHist.png" width="900" />
</figure>
<p><br /></p>
<p><strong>(C)</strong> Now make a plot of all generated points, both those that were accepted as samples, and those that were rejected, similar to the following plot, with accepted points in red color, and rejected points in black,</p>
<figure>
<img src="http:/ECL2017S/homework/10/prob2FuncScatter.png" width="700" />
</figure>
<p><br /></p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/homework/10-problems-python-advanced-monte-carlo">Homework 10: Problems - Python advanced, Monte Carlo methods</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 26, 2017.</p><![CDATA[Quiz 7: Solutions - Python - I/O, error handling, and tesing frameworks]]>http:/ECL2017S/quiz/7-solutions-python-io-error-handling-unit-testing2017-04-19T00:00:00-05:002017-04-19T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This is the solution to <a href="7-problems-python-io-error-handling-unit-testing" target="_blank">Quiz 7: Problems - Python - I/O, error handling, and tesing frameworks</a>.</p>
<p>The following figure illustrates the grade distribution for this quiz.</p>
<figure>
<img src="http:/ECL2017S/quiz/gradeDist/gradeHistQuiz7.png" width="700" />
<figcaption style="text-align:center">
Maximum possible points is 100.
</figcaption>
</figure>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p>This quiz aims at testing your basic knowledge of Python’s I/O.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>1. </strong> Consider this <a href="http:/ECL2017S/quiz/7/testInput.in" target="_blank">comma-separated data file</a>. Write a simple Python code <code>outputter.py</code> that takes two command line arguments like the following,</p>
<pre><code class="language-bash">python outputter.py outputter.in outputter.out
</code></pre>
<p><br />
and then writes the same float data in the output file <code>outputter.out</code> (whose name and path was taken from the command-line), and writes out data in a formatted style, like this <a href="http:/ECL2017S/quiz/7/outputter.out" target="_blank">example output file</a>, with only three digits after decimal point.</p>
<p><br />
<strong>Answer:</strong><br />
<a href="http:/ECL2017S/quiz/7/solutions/outputter.py" target="_blank">Here</a> is an example attempt.<br />
<br /></p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/quiz/7-solutions-python-io-error-handling-unit-testing">Quiz 7: Solutions - Python - I/O, error handling, and tesing frameworks</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 19, 2017.</p><![CDATA[Quiz 7: Problems - Python - I/O, error handling, and tesing frameworks]]>http:/ECL2017S/quiz/7-problems-python-io-error-handling-unit-testing2017-04-19T00:00:00-05:002017-04-19T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This quiz aims at testing your basic knowledge of Python’s I/O.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>1. </strong> Consider this <a href="http:/ECL2017S/quiz/7/outputter.in" target="_blank">comma-separated data file</a>. Write a simple Python code <code>outputter.py</code> that takes two command line arguments like the following,</p>
<pre><code class="language-bash">python outputter.py outputter.in outputter.out
</code></pre>
<p><br />
and then writes the same float data in the output file <code>outputter.out</code> (whose name and path was taken from the command-line), and writes out data in a formatted style, like this <a href="http:/ECL2017S/quiz/7/outputter.out" target="_blank">example output file</a>, with only three digits after decimal point.</p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/quiz/7-problems-python-io-error-handling-unit-testing">Quiz 7: Problems - Python - I/O, error handling, and tesing frameworks</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 19, 2017.</p><![CDATA[Lecture 10: Python advanced topics - IO, Monte Carlo, wrappers and interoperability]]>http:/ECL2017S/lecture/10-python-advanced-io-monte-carlo-interoperability2017-04-19T00:00:00-05:002017-04-19T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This lecture discusses some further important topics in Python IO, the use of random numbers and Monte Carlo simulations, as well as methods of integrating Python codes with codes from other programming languages, in particular, the use of Python as a wrapper for highly efficient, fast, low-level codes written in Fortran and C.</p>
<div class="post_toc"></div>
<h2 id="more-on-io-in-python">More on IO in Python</h2>
<p>There are a few topics and methods of <a href="https://en.wikipedia.org/wiki/Input/output" target="_blank">input/output (IO)</a> in Python that we have not discussed yet, such as reading data from special data files, or web pages. Such problems, happen almost daily in a scientific research career, even in High Performance Computing, and Python’s capability to easily handle such IO problems is indeed one of the main reasons for Python’s popularity.</p>
<h3 id="reading-data-from-special-data-files">Reading data from special data files</h3>
<p>It will happen quite often in your research that you will need to read data from a spreadsheet data file, most importantly <code>*.csv</code> and Microsoft Excel files (e.g., <code>*.xls</code> data files), or also frequently, from an <code>*.xml</code> data file. There are many ways and Python libraries to read such files. For Excel files, the task can be a bit complex, since Excel files can contain multiple sheets. A good starting point might be <a href="http://www.python-excel.org/" target="_blank">this webpage</a>, also <a href="http://pbpython.com/excel-pandas-comp.html" target="_blank">Pandas module</a>.</p>
<p>For CSV files, Python standard library has a solution. Suppose you want to read <a href="http:/ECL2017S/lecture/10/jec_pdb_r4s.csv" target="_blank">this CSV file</a>. A Python solution would be the following,</p>
<pre><code class="language-python">import csv
with open('jec_pdb_r4s.csv','r') as myfile:
for counter, row in enumerate(csv.reader(myfile)):
print(row)
if counter>10: break
</code></pre>
<pre><code>['pdb', 'pdb_id', 'chain', 'site', 'zr4s_JTT', 'r4s_JTT', 'zr4s_JC', 'r4s_JC']
['132L_A', '132L', 'A', '2', '-0.3133', '1.02', '0.04475', '1.188']
['132L_A', '132L', 'A', '3', '0.8385', '1.955', '0.2036', '1.311']
['132L_A', '132L', 'A', '4', '2.093', '2.973', '1.451', '2.272']
['132L_A', '132L', 'A', '5', '-0.8878', '0.5537', '-0.7985', '0.5382']
['132L_A', '132L', 'A', '6', '-1.443', '0.1028', '-1.426', '0.05416']
['132L_A', '132L', 'A', '7', '-0.1195', '1.177', '-0.07917', '1.093']
['132L_A', '132L', 'A', '8', '-0.7236', '0.6869', '-0.8997', '0.4602']
['132L_A', '132L', 'A', '9', '-1.107', '0.3755', '-0.8971', '0.4622']
['132L_A', '132L', 'A', '10', '0.7076', '1.848', '0.7369', '1.722']
['132L_A', '132L', 'A', '11', '0.9573', '2.051', '0.8809', '1.833']
['132L_A', '132L', 'A', '12', '-0.8315', '0.5993', '-0.9243', '0.4413']
</code></pre>
<p>Note how I have used Python <code>enumerate()</code> function to control the number of lines that is read from the file (The file contains more than 70000 lines of data!).</p>
<p>Similarly, if you wanted to write a CSV file, you can use <code>csv.writer()</code> method,</p>
<pre><code class="language-python">with open('jec_pdb_r4s.csv','r') as infile, open('jec_out.csv', 'w') as outfile:
for counter, row in enumerate(csv.reader(infile)):
csv.writer(outfile).writerow(row)
if counter>10: break
</code></pre>
<p><br />
The output of the code is <a href="http:/ECL2017S/lecture/10/jec_out.csv" target="_blank">this file</a> (If you run this code on Windows machines, you will probably get an extra empty line between each row in the csv file).</p>
<p>For <code>*.xml</code> files, Python standard library has a package <a href="https://docs.python.org/3/library/xml.etree.elementtree.html" target="_blank">ElementTree</a>, which you can use for both parsing and writing xml data files.</p>
<h3 id="reading-data-from-web">Reading data from web</h3>
<p>Nowadays, a lot of data repositories are available online publicly, and you may encounter problems that need to parse data from an online repository. For many of the most famous repositories, such as the <a href="http://www.rcsb.org/pdb/home/home.do" target="_blank">Protein databank</a>, excellent python packages have been written that automate the process of fetching data from online pages or repositories (e.g., <a href="http://biopython.org/wiki/Biopython" target="_blank">Biopython</a>).</p>
<p>Nevertheless you may need at some point in your research or career to read data from a web address. Most often, the online data is contained in a <code>html</code> file, like the content of the <a href="" target="_blank">about page</a> for this course, for example, which has the address: <a href="http:/ECL2017S/about" target="_blank">https://www.shahmoradi.orghttp:/ECL2017S/about</a>. Suppose you wanted to extract the content of this page. A simple solution would be the following via Python’s standard <a href="https://docs.python.org/3/library/urllib.html" target="_blank">urllib</a> module,</p>
<pre><code class="language-python">import urllib.request as ur
myurl = 'https://www.shahmoradi.orghttp:/ECL2017S/about'
with ur.urlopen(myurl) as webfile:
webcontent = [line.decode("utf-8") for line in webfile.readlines()]
</code></pre>
<p><br />
Now the variable <code>webcontent</code> is a list, whose elements are each row in the html file for this page.</p>
<pre><code class="language-python">webcontent[0:10]
</code></pre>
<pre><code>['<!DOCTYPE html>\n',
'<html>\n',
'<head>\n',
'<meta charset="utf-8">\n',
'<title>COE 111L - SPRING 2017</title>\n',
'<meta name="description" content="Engineering Computation Lab">\n',
'<meta name="keywords" content="Amir, Shahmoradi, Instructor">\n',
'\n',
'<!-- Twitter Cards -->\n',
'<meta name="twitter:card" content="summary_large_image">\n']
</code></pre>
<p>Note that the content of the file is read in <code>byte</code> format. Therefore, to convert it to string, one has to apply <code>.decode("utf-8")</code> on each line. Similar to opening a file on harddisk, one can also use <code>.read()</code> and <code>.readline()</code> methods to read the contant of the web address. Alternatively, one could also save the entire content of the web address, in a single file locally,</p>
<pre><code class="language-python">import urllib.request as ur
myurl = 'https://www.shahmoradi.orghttp:/ECL2017S/about'
ur.urlretrieve(myurl, filename='about.html')
</code></pre>
<p><br />
This will output <a href="http:/ECL2017S/lecture/10/about.html" target="_blank">this file</a> in your current working diretory of Python.</p>
<p>Now, the file that we imported from the web does not contains any scientific data. But, in the homework you will see a real-world scientific example and value of Python’s ability to parse the content of web pages.<br />
<br /></p>
<h3 id="writing-data-in-html-web-format">Writing data in HTML (web) format</h3>
<p>Doing research at a professional level requires reporting the results professionally as well. That is, the results of the project, including the final report itself have to be <strong>auto-generated</strong> and <strong>reproducile</strong> as much as possible, and reachable to the widest audience (which nowadays means, availibility on the world-wide web).</p>
<p>Suppose you have worked on your final project for this course, which has resulted in several figures, that you wanted to put them all together on a single webpage in your repository on Github, together with some information about each figure. Let’s say the figures are
<a href="http:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_1_t10.0.png" target="_blank">figure 1</a>
, <a href="http:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_2_t12.0.png" target="_blank">figure 2</a>
, <a href="http:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_3_t14.0.png" target="_blank">figure 3</a>
, <a href="http:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_4_t15.0.png" target="_blank">figure 4</a>
, <a href="http:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_5_t16.0.png" target="_blank">figure 5</a>
, <a href="http:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_6_t18.0.png" target="_blank">figure 6</a>
, <a href="http:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_7_t20.0.png" target="_blank">figure 7</a>. Now, since these figures represent the time evolution of the growth of the tumor, you would wnat to write a code that automatically generates an HTML (or Markdown) files, which contains the correct HTML code for adding these figures in your page for the project. You could for example write the following Python code to achieve this goal,</p>
<pre><code class="language-python">with open('SampleProjectReport.html', 'w') as html:
html.write('<HTML><BODY BGCOLOR="white">\n')
html.write('<H1>Sample Semester Project: Tumor growth modeling</H1><br> \n')
html.write('<H2>Each of following subplots figure represents the stage of the growth of tumor at the specified date in the figure title.</H2><br><br> \n')
time = [10.0,12.0,14.0,15.0,16.0,18.0,20.0]
nfig = 7
figReposPrefix = 'https://www.shahmoradi.orghttp:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_'
for ifig in range(1,nfig+1):
html.write( '<img src="{}{:d}_t{:.1f}.png" width="900px"><br><br>\n'.format(figReposPrefix,ifig,time[ifig-1]) )
html.write('<H2>Conclusions:</H2>\n')
html.write('<p>Chances of survival for this rat are virtually zero.</p><br>\n')
html.write('</BODY></HTML>\n')
</code></pre>
<p><br />
This code will generate an HTML file, which you can view in browser <a href="http:/ECL2017S/lecture/10/SampleProjectReport.html" target="_blank">here</a>.</p>
<h2 id="random-numbers-in-python">Random numbers in Python</h2>
<p>One of the most important topics in todays’s science and computer simulation is <a href="https://en.wikipedia.org/wiki/Random_number_generation" target="_blank">random number generation</a> and <a href="https://en.wikipedia.org/wiki/Monte_Carlo_method" target="_blank">Monte Carlo simulation</a> methods. In the simplest scenario for your research, you may need to generate a sequence of uniformly distributed random numbers in Python. There are several approaches to handle such random number generation problems in Python. Here is one, via Python’s standard <code>random</code> module:</p>
<pre><code class="language-python">In [43]: import random as rnd
In [44]: rnd.random() # generates a random number in the half open interval [0,1)
Out[44]: 0.012519922307372311
In [45]: rnd.
rnd.BPF rnd.Random rnd.betavariate rnd.gauss rnd.normalvariate rnd.randrange rnd.shuffle rnd.weibullvariate
rnd.LOG4 rnd.SG_MAGICCONST rnd.choice rnd.getrandbits rnd.paretovariate rnd.sample rnd.triangular
rnd.NV_MAGICCONST rnd.SystemRandom rnd.expovariate rnd.getstate rnd.randint rnd.seed rnd.uniform
rnd.RECIP_BPF rnd.TWOPI rnd.gammavariate rnd.lognormvariate rnd.random rnd.setstate rnd.vonmisesvariate
</code></pre>
<p><br />
As you see in the list of available methods in <code>random</code>, you can generate random numbers from a wide variaty of univariate probability distributions, e.g.,</p>
<pre><code class="language-python">In [46]: rnd.betavariate(0.5,0.5) # Beta variate with the input parameters
Out[46]: 0.9281984408820623
In [54]: rnd.expovariate(1) # random variable from exponential distribution with mean 1.
Out[54]: 2.546912414260747
In [55]: rnd.gammavariate(1,1) # random variable from gamma distribution with parameters 1,1.
Out[55]: 0.5364897808236537
</code></pre>
<p><br />
Recall that if you needed help on a method or function in Python, you could use <code>help()</code>,</p>
<pre><code class="language-python">In [61]: help(rnd.weibullvariate)
Help on method weibullvariate in module random:
weibullvariate(alpha, beta) method of random.Random instance
Weibull distribution.
alpha is the scale parameter and beta is the shape parameter.
</code></pre>
<p><br />
To generate <code>float</code> random numbers between the given input bounds,</p>
<pre><code class="language-python">In [64]: rnd.uniform(50,100) # generate a random float between 50 and 100
Out[64]: 65.59688328558263
</code></pre>
<p><br /></p>
<blockquote>
<b>ATTENTION: </b><br /><br />
Alwasy make sure you import modules with unique names, as different modules with similar component names may overwrite each other. For example <code>import random</code> followed by <code>from numpy import *</code> wil cause the <code>random</code> module to be overwritten by <code>numpy.random</code> module.
</blockquote>
<p><br />
Also pay attention to sublte differences between similar functions, with the same names, but in different modules. For example,</p>
<pre><code class="language-python">import numpy as np
np.random.randint(1,6,1)
</code></pre>
<p><br />
will draw a random integer from the interval $[1,6)$ excluding the value $6$ (the third input, $1$, indicates how many numbers has to be drawn randomly by the function). However,</p>
<pre><code class="language-python">import random as rnd
rnd.randint(1,6)
</code></pre>
<p><br />
will draw a random integer form the interval $[1,6]$. Also note that <code>randint()</code> from module <code>random</code> is a scalar function, whereas the numpy’s version is vectorized.</p>
<h3 id="the-deterministic-aspect-of-randomness-in-python">The deterministic aspect of randomness in Python</h3>
<p>There is a truth about random numbers and random number generators and algorithms, not only in Python, but in all programming languages, and that is, <strong>true random numbers do not exist in the world of programming</strong>. What we call a seuqence of random numbers, is simply a sequence of numbers that we, the user, to the best of our knowledge, don’t know how it was generated, and therefore, <strong>the sequence looks random to us, bu not the to the developer of the algorithm!</strong>. To prove this, type the following code in a Python session,</p>
<pre><code class="language-python">In [13]: import numpy as np
In [14]: np.random.seed(42)
In [15]: np.random.randint(1,6,6)
Out[15]: array([4, 5, 3, 5, 5, 2])
In [16]: np.random.randint(1,6,6)
Out[16]: array([3, 3, 3, 5, 4, 3])
In [17]: np.random.seed(42)
In [18]: np.random.randint(1,6,6)
Out[18]: array([4, 5, 3, 5, 5, 2])
</code></pre>
<p><br />
You notice that everytime the random function is called, it generates a new sequence of random numbers, apparently completely random. But as soon as the function <code>np.random.seed(42)</code> is called, it appears that the random number generator also restarts from the beginning, generating the same sequence of random numbers as it did before.</p>
<p>You can even test the same code on a different computer, and as long as you set the seed of the random number generator to a specific value (here 42), <code>np.random.seed(42)</code>, you will the same sequence of random numbers. So afterall, random numbers are not random at all, as they can be generated detrerministically, however, they mimic the behavior of true random numbers. The ability to set the seed for a random number generator is actually very useful, since it enables us to replicate the work of a code, exactly it has been done in the past. In particular, this is very useful for code debugging. However, beware of cases were you need to get a different result, everytime you run the code. If you set the random seed of the random generator to to a fixed value, right at the beginning of the code, you will never get a random behavior.</p>
<h3 id="drawing-a-random-element-from-a-list">Drawing a random element from a list</h3>
<p>Suppose you have the following list,</p>
<pre><code class="language-python">import numpy as np
mylist = np.linspace(0,100,51)
</code></pre>
<pre><code class="language-python">mylist
</code></pre>
<pre><code>array([ 0., 2., 4., 6., 8., 10., 12., 14., 16.,
18., 20., 22., 24., 26., 28., 30., 32., 34.,
36., 38., 40., 42., 44., 46., 48., 50., 52.,
54., 56., 58., 60., 62., 64., 66., 68., 70.,
72., 74., 76., 78., 80., 82., 84., 86., 88.,
90., 92., 94., 96., 98., 100.])
</code></pre>
<p>and now you wanted to draw a random element from the above list. You could do,</p>
<pre><code class="language-python">import random as rnd
rnd.choice(mylist)
</code></pre>
<pre><code>80.0
</code></pre>
<p>This will give a random element from the list. You could also generate a random shuffling of the list by,</p>
<pre><code class="language-python">import random as rnd
rnd.shuffle(mylist)
mylist
</code></pre>
<pre><code>array([ 98., 12., 76., 60., 46., 22., 24., 92., 66.,
16., 6., 34., 14., 8., 18., 50., 30., 74.,
4., 2., 38., 90., 70., 56., 94., 80., 32.,
20., 10., 44., 72., 84., 0., 78., 100., 88.,
86., 96., 48., 52., 62., 64., 26., 36., 40.,
54., 68., 58., 82., 42., 28.])
</code></pre>
<p><br /></p>
<h3 id="summary-of-some-important-random-functions-in-python">Summary of some important random functions in Python</h3>
<p>As you may have noticed, since none of the random functions are builtin, things can get really confusing very easily, by simply mixing numpy’s random mdule with Python’s random module. The following helps to clarify some of the most important differences between these two modules.</p>
<table class="center">
<caption class="title" style="padding-bottom:10px">
Table 1: Some useful functions and their functionalities in <code>random</code> and <code>numpy</code> modules
</caption>
<thead>
<tr>
<th>Purpose</th>
<th>random module</th>
<th>numpy.random module</th>
</tr>
</thead>
<tbody>
<tr>
<td>random uniform numbers in $[0,1)$</td>
<td><code>random()</code></td>
<td><code>random(N)</code> (vectorized)</td>
</tr>
<tr>
<td>random uniform numbers in $[a,b)$</td>
<td><code>uniform(a,b)</code></td>
<td><code>uniform(a,b,N)</code> (vectorized)</td>
</tr>
<tr>
<td>random integers in $[a,b]$</td>
<td><code>randint(a,b)</code></td>
<td><code>randint(a,b+1,N)</code> (vectorized) <br /> <code>random_integers(a,b+1,N)</code> (vectorized) </td>
</tr>
<tr>
<td>random Gaussian deviate with parameters $[\mu, \sigma]=[m,s]$</td>
<td><code>gauss(m,s)</code></td>
<td><code>normal(m,s,N)</code> (vectorized)</td>
</tr>
<tr>
<td>setting random number generator seed $i$</td>
<td><code>seed(i)</code></td>
<td><code>seed(i)</code></td>
</tr>
<tr>
<td>shuffling list mylist</td>
<td><code>shuffle(mylist)</code></td>
<td><code>shuffle(mylist)</code></td>
</tr>
<tr>
<td>choose a random element from mylist</td>
<td><code>choice(mylist)</code></td>
<td> -- </td>
</tr>
</tbody>
</table>
<p><br /></p>
<h3 id="monte-carlo-simulations">Monte Carlo simulations</h3>
<p>A Monte Carlo simulation is basically any simulation problem that somehow involves random numbers. Let’s start with an example of throwing a die repeatedly for N times. We can simulate the process of throwing a die by the following python code,</p>
<pre><code class="language-python">def throwFairDie():
import random as rnd
return rnd.randint(1, 6)
</code></pre>
<p><br />
Now, each time the function is called, it returns a random value for one throw of a virtual die,</p>
<pre><code class="language-python">In [7]: throwFairDie()
Out[7]: 6
In [8]: throwFairDie()
Out[8]: 1
In [9]: throwFairDie()
Out[9]: 4
In [10]: throwFairDie()
Out[10]: 1
</code></pre>
<p><br />
This is likely one of the simplest examples of <a href="https://en.wikipedia.org/wiki/Monte_Carlo_method" target="_blank">Monte Carlo simulations</a>. Now suppose we wanted to make sure that the die is fair, meaning that each number (out of 6 possibilities) only appears with a frequency of $1/6$ over many throws of the die. To test this hypothesis, we could write the following code,</p>
<pre><code class="language-python">import numpy as np
def throwFairDie():
import random as rnd
return rnd.randint(1, 6)
def getMeanDieValue(n=10000):
meanDieValue = np.zeros((n,6),dtype=np.double)
randomThrow = throwFairDie() - 1 # assign the first value to the above array
meanDieValue[0,randomThrow] = 1.0 / 1.0 # one try so far, one success for the die value that is obtained.
for i in range(1,n):
randomThrow = throwFairDie() - 1
meanDieValue[i,randomThrow] = 1.0 # add one success for the value obtained
meanDieValue[i,:] += meanDieValue[i-1,:] # combine the recent success with the total number of successes from previous tries.
meanDieValue[i-1,:] /= np.double(i) # Now normalize the values form the last try to the total number of tries.
meanDieValue[-1:,:] /= np.double(n) # Now normalize the very last try to the total number of tries.
return meanDieValue
</code></pre>
<p><br />
What this function does, is that it throws a die for given input number of times (n=10000 by default if not given as input), and then calculates for each new try, how many times each of the die values have occurred so far, and then finally outputs all the result as numpy <code>double</code> array, each row of which contains the number of successes for each of the 6 die values. Normally, if the die is fair, you would expect that with more tries, the average number of successes for each try would converge more and more to the canonical value $1/6\sim0.1667$. We can test this, by calling the function with a large number of tries, and checking the values in the last row of the output array,</p>
<pre><code class="language-python">print( getMeanDieValue()[-1:,:] )
</code></pre>
<pre><code>[[ 0.1645 0.1668 0.1683 0.1664 0.169 0.165 ]]
</code></pre>
<pre><code class="language-python">print( getMeanDieValue(n=100000)[-1:,:] )
</code></pre>
<pre><code>[[ 0.16488 0.1665 0.16635 0.16841 0.1661 0.16776]]
</code></pre>
<p>A better approach would be plot the output as a function of the number of tries, and see if the results for each of possible die outcomes do indeed converge to the canonical value or not.</p>
<pre><code class="language-python">import numpy as np
import matplotlib.pyplot as plt
nDieValues = 6 # 6 possible values for a die throw
nTrial = 100000 # total number of die throws
meanDieValues = getMeanDieValue(n=nTrial)
fig1 = plt.figure()
trial = np.linspace( 1 , nTrial+1 , nTrial )
lineTypes = ['r-','b-','g-','y-','b-','g-']
for i in range( nDieValues ) :
plt.semilogx( trial[:] \
, meanDieValues[:,i] \
, lineTypes[i] \
) # plot with color red, as line
plt.hold('on')
plt.xlabel('trial number')
plt.ylabel('fraction of occurrence for each die number')
plt.legend(['die value: '+str(i) for i in range(1,7) ])
plt.axis([1, nTrial , 0.0, 1.0]) # [xmin, xmax, ymin, ymax]
plt.title('N={} throws of a virtual die in Python'.format(nTrial))
plt.savefig('diceThrowsN{}.png'.format(nTrial))
plt.show()
</code></pre>
<p><br />
You can see the output of the above code in the following figure,</p>
<figure>
<img src="http:/ECL2017S/lecture/10/diceThrowsN100000.png" width="900" />
</figure>
<p><br /></p>
<h2 id="python-wrappers-and-interfaces">Python wrappers and interfaces</h2>
<p>Python is a very convenient language for implementing scientific computations as the code can be made very close to the mathematical algorithms. However, the execution speed of the code is significantly lower than what can be obtained by
programming in languages such as Fortran, C, or C++. For example see the following performance comparisons and tests in <a href="https://modelingguru.nasa.gov/docs/DOC-1762" target="_blank">NASA modeling guru webpage</a>. As you can see there, the <strong>performance of Python code can be significantly lower, up to 500 times and more, compared to compiled languages such as Fortran and C</strong>. These languages compile the program to machine language, which enables the computing resources to be utilized with very high efficiency. Knowing the performance hit in Python, the scientific programming paradigm in Python is to write compute-intensive parts of the code in lower level languages such as Fortran or C, and use Python as wrapper and glue between lower level codes and as a handy tool for high-level tasks.</p>
<p>Python was initially designed for being integrated with C. This feature has spawned the development of several techniques and tools for calling compiled languages from Python, allowing us to relatively easily reuse fast and
well-tested scientific libraries in Fortran, C, or C++ from Python, or migrate slow Python code to compiled languages. It often turns out that only smaller parts of the code, usually for loops doing heavy numerical computations, suffer from low speed and can benefit from being implemented in Fortran, C, or C++.</p>
<p>There are already several Python wrappers developed for integrating Python with other programming language codes. Most prominent examples include <a href="https://docs.scipy.org/doc/numpy-dev/f2py/" target="_blank">F2PY</a> for Fortran and C codes, <a href="http://www.swig.org/" target="_blank">SWIG</a> for C, C++, Perl, Java, and many others, <a href="http://cython.org/" target="_blank">Cython</a> for C, <a href="http://www.jython.org/" target="_blank">Jython</a> for Java, and several others.</p>
<p>The usage of some of these wrappers can be tricky and requires some work and good familiarity with the wrapper. This is in particular true about SWIG, which involves a significant amount of manual modifications to the interfaces, compared to F2PY, for example. At the moment, F2PY only works with Python 2.x standard.</p>
<p>There is also a Python module <a href="http://pymat.sourceforge.net/" target="_blank">pymat</a> developed for direct interaction of Python code with MATLAB.</p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/lecture/10-python-advanced-io-monte-carlo-interoperability">Lecture 10: Python advanced topics - IO, Monte Carlo, wrappers and interoperability</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 19, 2017.</p><![CDATA[Homework 9: Problems - Python advanced IO, Monte Carlo]]>http:/ECL2017S/homework/9-problems-python-advanced-io-monte-carlo-interoperability2017-04-19T00:00:00-05:002017-04-19T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This homework aims at giving you some experience with Python’s tools for interacting with the World Wide Web and writing Monte Carlo simulations.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>Update:</strong> As I discussed in class, in order to avoid creating potential traffic on Professor Butler’s webpage, I have now uploaded all the necessary files on <a href="http:/ECL2017S/homework/9/swift/bat_time_table.html" target="_blank">this address</a> (don’t click on the links in this table, because it will take you to Professor Butler’s repository for this data. I have all the data already saved in our domain locally). So now, your goal is to first read the event-ID HTML table from</p>
<p><a href="http:/ECL2017S/homework/9/swift/bat_time_table.html" target="_blank">https://www.shahmoradi.orghttp:/ECL2017S/homework/9/swift/bat_time_table.html</a>.</p>
<p>Then, use the event-IDs in this table to generate web addresses like:</p>
<p><a href="http:/ECL2017S/homework/9/swift/GRB00100433_ep_flu.txt" target="_blank">https://www.shahmoradi.orghttp:/ECL2017S/homework/9/swift/GRB00100433_ep_flu.txt</a></p>
<p>in order to download these <code>.txt</code> files from the web. The rest of the homework is just as you would have done this problem as descibed below.</p>
<p><strong>1. </strong> <strong>Reading scientific data from web</strong>. Consider the webpage of Professor <a href="http://butler.lab.asu.edu/" target="_blank">Nat Butler</a> at Arizona State University. He has successfully written Python piplines for automated analysis of data from <a href="https://www.nasa.gov/mission_pages/swift/main" target="_blank">NASA’s Swift satellite</a>. For each <a href="https://en.wikipedia.org/wiki/Gamma-ray_burst" target="_blank">Gamma-Ray Burst (GRB)</a> detection that Swift makes, his pipline analyzes and reduces data for the burst and summarizes the results on his personal webpage, for example in <a href="http://butler.lab.asu.edu/swift/bat_spec_table.html" target="_blank">this table</a>.</p>
<p><strong>(A)</strong> Write a Python function named <code>fetchHtmlTable(link,outputPath)</code> that takes two arguments:</p>
<ol>
<li>a web address (which will be this: <a href="http://butler.lab.asu.edu/swift/bat_time_table.html" target="_blank">http://butler.lab.asu.edu/swift/bat_time_table.html</a>), and,</li>
<li>an output path to where you want the code save the resulting files.</li>
</ol>
<p>One file is exact HTML contained in the input webpage address, and a second file, which is the Table contained in this HTML address. To parse the HTML table in this address, you will need the Python code <a href="http:/ECL2017S/homework/9/parseTable.py" target="_blank">parseTable.py</a> also available and explained on <a href="https://www.summet.com/dmsi/html/readingTheWeb.html" target="_blank">this page</a>. This parsed HTML table, will be in the form of a list, whose elements correspond to each row in the HTML table, and each row of element of this parsed table is itself another list, that contains the columns of the HTML table in that row. Output this table as well, in a separate file, in a formatted style, meaning that each element of table in a row has a space of 30 characters for itself (or something appropriate as you wish, e.g., <code>'{:>30}'.format(item)</code>). You can see an example output of the code <a href="http:/ECL2017S/homework/9/bat_time_table.html" target="_blank">here for the HTML output file</a>, and <a href="http:/ECL2017S/homework/9/bat_time_table.html.tab" target="_blank">here for parse HTML table</a>.</p>
<p><strong>(B)</strong> Now, if you look at the content of the file that your function has generated (once you run it), you will see something like the following,</p>
<pre><code class="language-text"> GRB (Trig#) Trig_Time (SOD) Time Region [s] T_90 T_50 rT_0.90 rT_0.50 rT_0.45 T_av T_max T_rise T_fall Cts Rate_pk Band
GRB170406x (00745966) 44943.130 -40.63->887.37 881.000 +/-7.697 549.000 +/-25.558 280.000 +/-13.432 124.000 +/-5.751 109.000 +/-4.997 433.667 +/-15.557 877.870 +/-367.494 890.500 +/-366.900 0.000 +/-366.630 6.082 +/-0.344 0.018 +/-0.0065 15-350keV
GRB170402x (00745090) 38023.150 54.35->66.35 9.000 +/-2.096 5.000 +/-1.490 7.000 +/-1.535 4.000 +/-0.640 3.000 +/-0.559 60.964 +/-1.316 58.850 +/-2.417 1.500 +/-2.894 7.500 +/-2.720 0.162 +/-0.045 0.022 +/-0.0106 15-350keV
GRB170401x (00745022) 68455.150 -19.63->71.49 78.880 +/-5.224 39.440 +/-4.168 41.480 +/-3.823 18.360 +/-1.585 16.320 +/-1.341 29.541 +/-2.857 24.910 +/-24.386 37.740 +/-24.251 41.140 +/-23.977 1.181 +/-0.122 0.024 +/-0.0130 15-350keV
GRB170331x (00744791) 6048.440 9.835->35.875 20.160 +/-1.285 10.290 +/-0.914 14.070 +/-1.041 5.880 +/-0.415 5.040 +/-0.359 21.461 +/-0.598 12.460 +/-5.633 0.525 +/-5.718 19.635 +/-5.840 1.875 +/-0.154 0.134 +/-0.0408 15-350keV
</code></pre>
<p><br />
Now write another function that reads the events’ unique numbers that appear in this table in parentheses (e.g., 00745966 is the first in table), and puts this number in place of <code>event_id</code> in this web address template: <code>http://butler.lab.asu.edu/swift/event_id/bat/ep_flu.txt</code>.</p>
<p>Now note that, for some events, this address exists, for example,</p>
<p><a href="http://butler.lab.asu.edu/swift/00745966/bat/ep_flu.txt" target="_blank">http://butler.lab.asu.edu/swift/00745966/bat/ep_flu.txt</a>,</p>
<p>which is a text file named <code>ep_flu.txt</code>. For some other events, this address might not exist, for example,</p>
<p><a href="http://butler.lab.asu.edu/swift/00680331/bat/ep_flu.txt" target="_blank">http://butler.lab.asu.edu/swift/00680331/bat/ep_flu.txt</a>,</p>
<p>in which case your code will have to raise a <code>urllib.request.HTTPError</code> exception. Write your code such that it can smoothly skip these exceptions. Write your code such that it saves all those existing text files on your local computer, in a file-name format like <a href="http:/ECL2017S/homework/9/GRB00100433_ep_flu.txt" target="_blank">this example: <code>GRB00100433_ep_flu.txt</code></a> (A total of 938 files exist).</p>
<p><strong>(C)</strong> Now write a third function, that reads all of these files in your directory, one by one, as numpy arrays, and plots the content of all of them together, on a single scatter plot like the following,</p>
<figure>
<img src="http:/ECL2017S/homework/9/ep_flu.png" width="900" />
</figure>
<p><br /></p>
<p>To achieve this goal, your function should start like the following,</p>
<pre><code class="language-python">def plotBatFiles(inPath,figFile):
import os
import numpy as np, os
import matplotlib.pyplot as plt
ax = plt.gca() # generate a plot handle
ax.set_xlabel('Fluence [ ergs/cm^2 ]') # set X axis title
ax.set_ylabel('Epeak [ keV ]') # set Y axis title
ax.axis([1.0e-8, 1.0e-1, 1.0, 1.0e4]) # set axix limits [xmin, xmax, ymin, ymax]
plt.hold('on') # add all data files to the same plot
counter = 0 # counts the number of events
</code></pre>
<p><br />
where <code>inPath</code> and <code>figFile</code> are the path to the directory containing the files, and the name and path to the output figure file. You will have to use <code>os.listdir(inPath)</code> to get a list of all files in your input directory. Then loop over this list of files, and use only those that end with <code>ep_flu.txt</code> because that’s how you saved those files, e.g.,</p>
<pre><code class="language-python">for file in os.listdir(inPath):
if file.endswith("ep_flu.txt"):
# rest of your code ...
</code></pre>
<p><br />
But now, you have to also make sure that your input data does indeed contain some numerical data, because some files do contain anything, although they exist, like <a href="http:/ECL2017S/homework/9/GRB00559075_ep_flu.txt" target="_blank">this file: ``</a>. To do so, you will have to perform a test on the content of file, once you read it as numpy array, like the following,</p>
<pre><code class="language-python"> data = np.loadtxt(os.path.join(inPath, file), skiprows=1)
if data.size!=0 and all(data[:,1]<0.0):
# then plot data
</code></pre>
<p><br />
the condition <code>all(data[:,1]<0.0)</code> is rather technical. It makes sure that all values are positive on the second column. Once you have done all these checks, you have to do one final manipulation of data, that is, the data in these files on the second column is actually the log of data, so have to get the <code>exp()</code> value to plot it (because plot is log-log). To do so you can use,</p>
<pre><code class="language-python"> data[:,1] = np.exp(data[:,1])
</code></pre>
<p><br />
and then finally,</p>
<pre><code class="language-python"> ax.scatter(data[:,1],data[:,0],s=1,alpha=0.05,c='r',edgecolors='none')
</code></pre>
<p><br />
which will add the data for the current file to the plot. At the end, you will have to set a title for your plot as well, and save your plot,</p>
<pre><code class="language-python"> ax.set_title('Plot of Epeak vs. Fluence for {} Swift GRB events'.format(counter))
plt.savefig(figFile)
</code></pre>
<p><br />
Note that the variable <code>counter</code> contains the total number of events for which the text files exists on the website, <strong>and</strong> the file contained some data (i.e., was not empty).</p>
<p><strong>Question:</strong> What does <code>alpha=0.05</code> and <code>s=1</code> do in the following scatter plot command? (Vary their values to see what happens)</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>2. </strong> <strong>Simulating a fun Monte Carlo game.</strong> Suppose you’re on a game show, and you’re given the choice of three doors:</p>
<figure>
<img src="http:/ECL2017S/homework/9/Monty_1.png" width="600" />
</figure>
<p><br /></p>
<p>Behind one door is a car; behind the two others, goats. You pick a door, say No. 1, and the host of the show opens another door, say No. 3, which has a goat.</p>
<figure>
<img src="http:/ECL2017S/homework/9/Monty_open_door.png" width="600" />
</figure>
<p><br /></p>
<p>He then says to you, “Do you want to pick door No. 2?”.</p>
<p><strong>Question: What would you do?</strong><br />
Is it to your advantage to switch your choice from door 1 to door 2? Is it to your advantage, <strong>in the long run, for a large number of game tries</strong>, to switch to the other door?</p>
<p>Now whatever your answer is, I want you to check/prove your answer by a Monte Carlo simulation of this problem. Make a plot of your simulation for $ngames=100000$ repeat of this game, that shows, in the long run, on average, what is the probability of winning this game if you switch your choice, and what is the probability of winning, if you do not switch to the other door.</p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/homework/9-problems-python-advanced-io-monte-carlo-interoperability">Homework 9: Problems - Python advanced IO, Monte Carlo</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 19, 2017.</p><![CDATA[Homework 8: Solutions - Python array computing and plotting]]>http:/ECL2017S/homework/8-solutions-python-array-computing-plotting2017-04-19T00:00:00-05:002017-04-19T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This is the solution to <a href="8-problems-python-array-computing-plotting.html" target="_blank">Homework 8: Problems - Python array computing and plotting</a>.</p>
<p>The following figure illustrates the grade distribution for this homework.</p>
<figure>
<img src="http:/ECL2017S/homework/gradeDist/gradeHistHomework8.png" width="700" />
<figcaption style="text-align:center">
Maximum possible points is 100.<br />
</figcaption>
</figure>
<hr />
<hr />
<p>This homework aims at giving you some experience with Python’s array computing and plotting features.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>1. </strong> <strong>The while-loop implementation of a for-loop</strong>. Consider the following mathematical function resembling a Hat function,</p>
<script type="math/tex; mode=display">% <![CDATA[
f(x) =
\begin{cases}
0 ~, & \text{if}~~ x<0 \\
x ~, & \text{if}~~ 0\leq x <1 \\
2-x ~, & \text{if}~~ 1\leq x <2 \\
0 ~, & \text{if}~~ x \geq 2 \\
\end{cases} %]]></script>
<p>A scalar implementation of this function would be,</p>
<pre><code class="language-python">def hatFunc(x):
if x < 0:
return 0.0
elif 0 <= x < 1:
return x
elif 1 <= x < 2:
return 2 - x
elif x >= 2:
return 0.0
</code></pre>
<p><br />
Write a vectorized version of this function. (Hint: you may need numpy’s <code>logical_and</code> method for building the vectorized version of this function.)</p>
<p><br />
<strong>Answer:</strong></p>
<pre><code class="language-python">def hatFunc(x):
condition1 = x < 0
condition2 = np.logical_and(0 <= x, x < 1)
condition3 = np.logical_and(1 <= x, x < 2)
condition4 = x >= 2
r = np.zeros(len(x))
r[condition1] = 0.0
r[condition2] = x[condition2]
r[condition3] = 2-x[condition3]
r[condition4] = 0.0
return r
</code></pre>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>2. </strong> The vertical position $y(t)$ of a ball thrown upward is given by $y(t)=v_0t-\frac{1}{2}gt^2$, where $g$ is the acceleration of gravity and $v_0$ is the initial vertical velocity at $t=0$. Two important physical quantities in this context are the potential energy, obtained by doing work against gravity, and the kinetic energy, arising from motion. The potential energy is defined as $P=mgy$, where $m$ is the mass of the ball. The kinetic energy is defined as $K=\frac{1}{2}mv^2$, where $v$ is the velocity of the ball, related to $y$ by $v(t)=y’(t)$.</p>
<p>Write a program that can plot $P(t)$ and $K(t)$ in the same plot, along with their sum $E = P + K$. Let $t\in[0,2v_0/g]$. Write your program such that $m$ and $v_0$ are read from the command line. Run the program with various choices of $m$ and $v_0$ and observe that $P+K$ always remains constant in this motion, regardless of initial conditions. This is in fact, the fundamental principle of conservation of energy in Physics.</p>
<p><br />
<strong>Answer:</strong><br />
A sample code can be downloaded from <a href="http:/ECL2017S/homework/8/ball_energy.py" target="_blank">here</a>. Here is an example output figure of the code:</p>
<figure>
<img src="http:/ECL2017S/homework/8/ball_energy.png" width="900" />
</figure>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>3. </strong> <strong>Integration by midpoint rule</strong>: The idea of the Midpoint rule for integration is to divide the area under a curve $f(x)$ into $n$ equal-sized rectangles. The height of the rectangle is determined by the value of $f$ at the midpoint of the rectangle. The figure below illustrates the idea,</p>
<figure>
<img src="http:/ECL2017S/homework/8/midpnt.gif" width="700" />
</figure>
<p>To implement the midpointrule, one has to compute the area of each rectangle, sum them up, just as in the formula for the Midpoint rule,</p>
<script type="math/tex; mode=display">\int^b_a f(x) dx \approx h\sum^{n-1}_{i=0} f(a+ih+0.5h) ~,</script>
<p>where $h=(b-a)/n$ is the width of each rectangle. Implement this formula as a Python function midpoint(f, a, b, n) and test the integrator with the following example input mathematical functions.</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
f_1(x) &= exp(x) ~,~ \text{for integration range } ~ [0, \log(3)] \\\\
f_2(x) &= cos(x) ~,~ \text{for integration range } ~ [0, \pi] \\\\
f_3(x) &= sin(x) ~,~ \text{for integration range } ~ [0, \pi] \\\\
f_4(x) &= sin(x) ~,~ \text{for integration range } ~ [0, \pi / 2] \\\\
\end{align*} %]]></script>
<p><br />
<strong>Answer:</strong><br />
An example code can be downloaded from <a href="http:/ECL2017S/homework/8/midpoint.py" target="_blank">here</a>. Here is the output of the code,</p>
<pre><code class="language-python">In [38]: run midpoint.py
The exact integral of exp(x) between 0.00000 and 1.09861 is 2.00000. The approximate answer is 1.99899 giving an error of 0.00101
The exact integral of cos(x) between 0.00000 and 3.14159 is 0.00000. The approximate answer is 0.00000 giving an error of 0.00000
The exact integral of sin(x) between 0.00000 and 3.14159 is 2.00000. The approximate answer is 2.00825 giving an error of 0.00825
The exact integral of sin(x) between 0.00000 and 1.57080 is 1.00000. The approximate answer is 1.00103 giving an error of 0.00103
</code></pre>
<p><br /></p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>4. </strong> <strong>Visualize approximations in the Midpoint integration rule</strong> Now consider the following function,</p>
<script type="math/tex; mode=display">f(x) = x(12-x)+\sin(\pi x) ~~,~~ x\in[0,10] ~,</script>
<p>which we wish to integrate using the midpoint integrator that you wrote in the previous example. Now write a new code that visualizes the midpoint rule, similar to in the following figure. (Hint: you will need to use the Matplotlib function <code>fill_between</code> and use this function to create the filled areas between f(x) and the approximating rectangles)</p>
<figure>
<img src="http:/ECL2017S/homework/8/midpoint_visualization.png" width="700" />
</figure>
<p><br />
<strong>Answer:</strong><br />
An example code can be downloaded from <a href="http:/ECL2017S/homework/8/visualize_midpoint.py" target="_blank">here</a>.</p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/homework/8-solutions-python-array-computing-plotting">Homework 8: Solutions - Python array computing and plotting</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 19, 2017.</p><![CDATA[Lecture 9: Python - array computing and plotting]]>http:/ECL2017S/lecture/9-python-array-computing-plotting2017-04-12T00:00:00-05:002017-04-12T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This lecture focuses on array computing and code vectorization, as well as methods of plotting data in Python.</p>
<div class="post_toc"></div>
<h2 id="vectorization-and-array-computing">Vectorization and array computing</h2>
<p>With regards to capabilities of Python for scientific calculations, there are conflicting opinions. On the scientific side of the opinion spectrum, some people think that Python is not good enough for number crunching (as a result of which, new programming languages such as <a href="https://en.wikipedia.org/wiki/Julia_(programming_language)" target="_blank">Julia</a> have been developed). However, there are people at the other extreme who believe that Python is too much oriented towards scientific computation (as a result of which, new programming languages have emerged, such as Google’s <a href="https://en.wikipedia.org/wiki/Go_(programming_language)" target="_blank">Go language</a>.</p>
<p>So far in this course, you may have noticed that all numerical vector calculations were either performed with lists, tuples, or dictionaries. Sadly, Python standard does not have an intrinsic special way of defining and manipulating numerical vectors and arrays, unlike most High Performance Computing (HPC) languages for scientific computations (such as Fortran, Ada, or C). However, there are powerful Python modules that enable a Python programmer to use Python efficiently for numerical analysis as well.</p>
<blockquote>
If you expect to use Python heavily and mostly for scientific computation in future, you should keep in mind that Python's builtin list, tuple and dictionary types can be very slow for number crunching.
</blockquote>
<p><br /></p>
<h3 id="vectors-arrays-and-the-numerical-python-numpy-package">Vectors, arrays and the Numerical Python (numpy) package</h3>
<p>In Python, a list can be <strong>heterogeneous</strong> meaning that not all its elements are of the same type. An <strong>array object</strong> in Python can be viewed as a variant of a list, but with the following assumptions:</p>
<ul>
<li>All elements must be of the same type, preferably integer, real, or complex numbers, for efficient numerical computing and storage.</li>
<li>The number of elements must be known when the array is created.</li>
<li>Arrays are not part of standard Python – one needs an additional package called <strong>Numerical Python</strong>, often abbreviated as <strong>NumPy</strong>. The Python name of the package, to be used in import statements, is <code>numpy</code>.</li>
<li>With numpy, a wide range of mathematical operations can be done directly on complete arrays, thereby removing the need for loops over array elements. This is commonly called <strong>vectorization</strong>.</li>
<li>Arrays with one index are often called <strong>vectors</strong>. Arrays with two indices are used as an efficient data structure for tables, instead of lists of lists. Arrays can also have three or more indices.</li>
</ul>
<p>The number of elements of an array can be changed, but keep in mind that this can cause significant computational cost. Creating an array of a given length is frequently referred to as <strong>allocating the
array</strong>. It means that a part of the computer’s memory is marked for being occupied by this array.</p>
<p>To create a numpy array, you will have to first import it,</p>
<pre><code class="language-python">import numpy as np
</code></pre>
<p><br />
The tradition is to import <code>numpy</code> as <code>np</code>. To convert a list to a numpy array,</p>
<pre><code class="language-python">In [3]: import numpy as np
In [4]: a = [1,2,3,4,5]
In [5]: a = np.array(a)
In [6]: type(a)
Out[6]: numpy.ndarray
In [7]: a
Out[7]: array([1, 2, 3, 4, 5])
</code></pre>
<p><br />
To create a new <strong>array of length n, filled with zeros</strong>,</p>
<pre><code class="language-python">a = np.zeros(n)
</code></pre>
<p><br />
Note that numpy automatically identifies the appropriate type for all array elements, whether <code>int</code>, <code>float</code>, or etc.</p>
<pre><code class="language-python">In [10]: a[1]
Out[10]: 2
In [11]: type(a[1])
Out[11]: numpy.int32
</code></pre>
<p><br />
Even if there is a single <code>float</code> element in the list, then all elements in the list will be converted to float in the numpy array by default,</p>
<pre><code class="language-python">In [11]: type(a[1])
Out[11]: numpy.int32
In [12]: a = [1,2,3,4,5.0]
In [13]: a = np.array(a)
In [14]: type(a[1])
Out[14]: numpy.float64
</code></pre>
<p><br />
If you want to get the desired element type, then you will have to ask numpy for it explicitly,</p>
<pre><code class="language-python">In [17]: a = [1,2,3.5,4.9,5.0]
In [18]: a = np.array(a, int) # convert all elements in the list to integer
In [19]: a
Out[19]: array([1, 2, 3, 4, 5])
</code></pre>
<p><br />
You can see the full list of input arguments to np.array function <a href="https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html" target="_blank">here</a>.</p>
<p>A similar function <code>np.zeros_like(c)</code> generates an array of zeros where the length of the generated array is that of the input array c and the element type is the same as those in c.</p>
<pre><code class="language-python">In [33]: b = [1,2,3,4,5,6,7]
In [34]: a = np.zeros_like(b)
In [35]: a
Out[35]: array([0, 0, 0, 0, 0, 0, 0])
</code></pre>
<p><br />
Often one wants an array to have $n$ elements with uniformly distributed values in an interval $[p,q]$. The numpy function <code>linspace</code> creates such arrays,</p>
<pre><code class="language-python">In [36]: a = np.linspace(1, 100, 53)
In [37]: a
Out[37]:
array([ 1. , 2.90384615, 4.80769231, 6.71153846,
8.61538462, 10.51923077, 12.42307692, 14.32692308,
16.23076923, 18.13461538, 20.03846154, 21.94230769,
23.84615385, 25.75 , 27.65384615, 29.55769231,
31.46153846, 33.36538462, 35.26923077, 37.17307692,
39.07692308, 40.98076923, 42.88461538, 44.78846154,
46.69230769, 48.59615385, 50.5 , 52.40384615,
54.30769231, 56.21153846, 58.11538462, 60.01923077,
61.92307692, 63.82692308, 65.73076923, 67.63461538,
69.53846154, 71.44230769, 73.34615385, 75.25 ,
77.15384615, 79.05769231, 80.96153846, 82.86538462,
84.76923077, 86.67307692, 88.57692308, 90.48076923,
92.38461538, 94.28846154, 96.19230769, 98.09615385, 100. ])
</code></pre>
<p><br /></p>
<h3 id="vectorization">Vectorization</h3>
<p>Loops over very long arrays may run slowly. An advantage of arrays is that, with arrays, loops can be avoided the whole array be manipulated directly and simultaneously. If you are a Fortran programmer, you are likely already quite familiar with the powerful idea of array computing and vectorization. If not, then consider the following example,</p>
<pre><code class="language-python">x = np.linspace(0, 2, 201)
In [39]: x
Out[39]:
array([ 0. , 0.02, 0.04, 0.06, 0.08, 0.1 , 0.12, 0.14, 0.16,
0.18, 0.2 , 0.22, 0.24, 0.26, 0.28, 0.3 , 0.32, 0.34,
0.36, 0.38, 0.4 , 0.42, 0.44, 0.46, 0.48, 0.5 , 0.52,
0.54, 0.56, 0.58, 0.6 , 0.62, 0.64, 0.66, 0.68, 0.7 ,
0.72, 0.74, 0.76, 0.78, 0.8 , 0.82, 0.84, 0.86, 0.88,
0.9 , 0.92, 0.94, 0.96, 0.98, 1. , 1.02, 1.04, 1.06,
1.08, 1.1 , 1.12, 1.14, 1.16, 1.18, 1.2 , 1.22, 1.24,
1.26, 1.28, 1.3 , 1.32, 1.34, 1.36, 1.38, 1.4 , 1.42,
1.44, 1.46, 1.48, 1.5 , 1.52, 1.54, 1.56, 1.58, 1.6 ,
1.62, 1.64, 1.66, 1.68, 1.7 , 1.72, 1.74, 1.76, 1.78,
1.8 , 1.82, 1.84, 1.86, 1.88, 1.9 , 1.92, 1.94, 1.96,
1.98, 2. ])
</code></pre>
<p><br />
Now, if you wanted to calculate the <code>sin</code> of the elements of <code>x</code> in the traditional way, you would do,</p>
<pre><code class="language-python">In [41]: from math import sin
In [42]: sinX = [sin(i) for i in x]
</code></pre>
<p><br />
This approach however, can be quite time consuming and computationally costly, because <strong>for-loops are very slow in Python</strong>, up to a few hundred times than what you get in Fortran or C.</p>
<p>A more appropriate solution to the above problem is use the <code>sin</code> function from numpy module, which enables vectorization,</p>
<pre><code class="language-python">sinX = np.sin(x)
</code></pre>
<p><br />
You see, with the above numpy call, there is no need for a for-loop. The above Python code is an example of a <strong>vectorized code</strong> and the previous code which contained for-loop is an example <strong>scalar code</strong>. The numpy functions are capable of handling arrays as input. Compare the performance of the two codes in the above example,</p>
<pre><code class="language-python">In [45]: %timeit np.sin(x)
The slowest run took 11.73 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.21 µs per loop
</code></pre>
<p><br /></p>
<pre><code class="language-python">In [46]: %timeit [sin(i) for i in x]
10000 loops, best of 3: 23.1 µs per loop
</code></pre>
<p><br />
The vectorized code in this example appears to be more than one order of magnitude (more than 10 times) faster than faster than the scalar version of the code.</p>
<p><strong>Why is the vectorized code faster in Python?</strong><br />
The reason is that numpy uses precompiled Fortran and C loops to loop over the elements of the input array. loops in Fortran and C have far less overhead than loops in Python. Similar to the above example, you can define your own functions that are also vectorized, for example,</p>
<pre><code class="language-python">def f(x):
return x**2*np.exp(-x**2)
x = np.linspace(-3, 3, 101)
y = f(x)
</code></pre>
<p><br />
The numpy package also has a method for <strong>Automatic vectorization</strong> of scalar functions (function that only take scalar arguments), for example,</p>
<pre><code class="language-python">func_vec = np.vectorize(func_scalar)
</code></pre>
<p><br />
However, for serious programming, I do not recommend you to use this numpy functionality as it can be slow and inefficient.</p>
<h4 id="vectorization-of-if-blocks">vectorization of if-blocks</h4>
<p>For vectorization of calculations involving booleans and if conditions, the solution can be problem dependent, but one common easy way of addressing simple boolean problems could be <code>where</code> method in numpy package. For example, suppose you have an list of numbers and you would like to perform a task on all negative numbers in the array, say set them all to zero, and leave the positive numbers intact. One solution would be the following,</p>
<pre><code class="language-python">In [57]: x = np.array([1,-1,3,-5,-6,8,7,4,10])
In [58]: np.where(x<0,0,x)
Out[58]: array([ 1, 0, 3, 0, 0, 8, 7, 4, 10])
</code></pre>
<p><br /></p>
<h4 id="aliasing-vs-copying-arrays">Aliasing vs. copying arrays</h4>
<p>If you recall from <a href="http:/ECL2017S/lecture/5-python-variables-assignments#aliasing-vs-copying" target="_blank">lecture </a>, there is a difference between aliasing and copying sequence objects in Python. The same rules also hold for numpy arrays, meaning that if you need an independent copy of an existing array, then you have to use <code>copy</code> method to generate it,</p>
<pre><code class="language-python">In [63]: a = np.array([1,2,3,4,5])
In [64]: b = a.copy()
In [65]: b[0] = -1
In [66]: a
Out[66]: array([1, 2, 3, 4, 5])
In [67]: b
Out[67]: array([-1, 2, 3, 4, 5])
</code></pre>
<p><br />
otherwise a simple equality assignment like <code>b = a</code> will only create an alias for numpy array <code>a</code>.</p>
<pre><code class="language-python">In [68]: a = np.array([1,2,3,4,5])
In [69]: b = a
In [70]: b[0] = -1
In [71]: a
Out[71]: array([-1, 2, 3, 4, 5])
</code></pre>
<p><br /></p>
<h4 id="in-place-arithmetic-in-python">In-place arithmetic in Python</h4>
<p>Consider two arrays <code>a</code> and <code>b</code> of the same shape. The expression <code>a += b</code> means <code>a = a + b</code>. There are however hidden differences between the two. In the statement <code>a = a + b</code>, the sum <code>a + b</code> is first computed, yielding a new array, and then the name <code>a</code> is bound to this new array. The old array a is lost unless there are other names assigned to this array. In the statement <code>a += b</code>, elements of <code>b</code> are added directly into the elements of <code>a</code> (in memory). There is no hidden intermediate array as in <code>a = a + b</code>. This implies that <strong><code>a += b</code> is more efficient than <code>a = a + b</code> since Python avoids making an extra array</strong>. In other words, the operators +=, *=, and similar operators, perform <strong>in-place arithmetic</strong> in arrays.</p>
<h4 id="allocating-arrays-in-python">Allocating arrays in Python</h4>
<p>We have already seen in the above that the <code>np.zeros</code> function is useful for making a new array of a given size. Very often the size and the type of array elements are known a priori or has to match another existing array’s shape and type <code>b</code>. There are two ways of achieving this goal,</p>
<pre><code class="language-python">In [66]: a
Out[66]: array([1, 2, 3, 4, 5])
In [67]: b
Out[67]: array([-1, 2, 3, 4, 5])
In [68]: a
Out[68]: array([1, 2, 3, 4, 5])
In [69]: b = a.copy()
In [70]: c = np.zeros(a.shape, a.dtype)
In [71]: a.shape
Out[71]: (5,)
In [72]: a.
a.T a.argsort a.compress a.cumsum a.dumps a.imag a.min a.prod a.reshape a.shape a.sum a.tostring
a.all a.astype a.conj a.data a.fill a.item a.nbytes a.ptp a.resize a.size a.swapaxes a.trace
a.any a.base a.conjugate a.diagonal a.flags a.itemset a.ndim a.put a.round a.sort a.take a.transpose
a.argmax a.byteswap a.copy a.dot a.flat a.itemsize a.newbyteorder a.ravel a.searchsorted a.squeeze a.tobytes a.var
a.argmin a.choose a.ctypes a.dtype a.flatten a.max a.nonzero a.real a.setfield a.std a.tofile a.view
a.argpartition a.clip a.cumprod a.dump a.getfield a.mean a.partition a.repeat a.setflags a.strides a.tolist
</code></pre>
<p><br />
Notice how the attribute <code>a.dtype</code> (dtype standing for data type), and <code>x.shape</code> (a tuple) were used in the above example. The shape attribute in array objects holds the shape, i.e., the size of each dimension. The method <code>size</code> returns the total number of elements in the array.</p>
<p>Sometimes one may also want to ensure that an object is an array, and if not, turn it into an array. The <code>np.asarray</code> function is useful in such cases,</p>
<pre><code class="language-python">a = np.asarray(a)
</code></pre>
<p><br />
Note that one could have also use,</p>
<pre><code class="language-python">a = np.array(a)
</code></pre>
<p><br />
There is no difference in the output, but note that the second approach does one redundant step, because in the first approach, if the input object is already an array, then there is no need in converting it to an array.</p>
<h4 id="multidimensional-numpy-arrays">Multidimensional NumPy arrays</h4>
<p>Creating multidimensional arrays is very much the same as vectors in numpy. The only thing to keep in mind is that the shape of the array is given as a tuple to <code>np.array()</code>. For example, to initialize a 3D array of size (0:3,0:5,0:2), you would do,</p>
<pre><code class="language-python">In [86]: a = np.zeros((3,5,2))
In [87]: a
Out[87]:
array([[[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.]],
[[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.]],
[[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.],
[ 0., 0.]]])
</code></pre>
<p><br />
The arrays created so far have been of type <code>ndarray</code>. NumPy also has a matrix type called <code>matrix</code> or <code>mat</code> for one- and two-dimensional arrays. One-dimensional arrays are then extended with one extra dimension such that they become matrices, i.e., either a row vector or a column vector,</p>
<pre><code class="language-python">In [99]: x1 = np.array([1, 2, 3], float)
In [100]: x2 = np.matrix(x1) # or np.mat(x1)
In [102]: x3 = np.mat(x1).T # transpose = column vector
In [103]: x3
Out[103]:
matrix([[ 1.],
[ 2.],
[ 3.]])
In [104]: type(x3)
Out[104]: numpy.matrixlib.defmatrix.matrix
</code></pre>
<p><br />
A special feature of matrix objects in NumPy is that the multiplication operator represents the matrix-matrix, vector-matrix, or matrix-vector product as we know from linear algebra. However, keep in mind that <strong>the multiplication operator between standard ndarray objects is different from multiplication between numpy matrices</strong>. The <code>ndarray</code> multiplication is simply a vectorized version of scalar multiplication,</p>
<pre><code class="language-python">In [105]: a = np.array([1,2,3])
In [106]: b = np.array([1,2,3])
In [107]: a*b
Out[107]: array([1, 4, 9])
</code></pre>
<p><br />
whereas, the matrix multiplication would yield,</p>
<pre><code class="language-python">In [108]: aMat = np.mat(a)
In [109]: bMat = np.mat(b)
In [110]: aMat*bMat.T
Out[110]: matrix([[14]])
In [111]: aMat.T*bMat
Out[111]:
matrix([[1, 2, 3],
[2, 4, 6],
[3, 6, 9]])
</code></pre>
<p><br />
If you intend to use Python and MATLAB together for your projects, then I recommend you to consider programming with matrices in Python instead of <code>ndarray</code> objects, because the matrix type in Python behaves quite similar to matrices in MATLAB.
Numpy has a lot more to offer for linear algebra operation, that far beyond the scope of this lecture. More information about algebraic operations in NumPy can be found <a href="https://docs.scipy.org/doc/numpy/reference/routines.linalg.html" target="_blank">here</a>.</p>
<h4 id="symbolic-linear-algebra">Symbolic linear algebra</h4>
<p>There also a package <a href="http://www.sympy.org/en/index.html" target="_blank">SymPy</a> that supports symbolic computations for linear algebra operations as well,</p>
<pre><code class="language-python">In [116]: import sympy as sym
In [117]: a = sym.Matrix([[2, 0], [0, 5]])
In [118]: a**-1 # inverse of matrix a
Out[118]:
Matrix([
[1/2, 0],
[ 0, 1/5]])
In [119]: a.inv() # same as above, inverse of a
Out[119]:
Matrix([
[1/2, 0],
[ 0, 1/5]])
In [120]: a.det() # determinant of a
Out[120]: 10
In [121]: a.eigenvals() # eigenvalues of a
Out[121]: {2: 1, 5: 1}
In [122]: a.eigenvects() # eigenvectors of a
Out[122]:
[(2, 1, [Matrix([
[1],
[0]])]), (5, 1, [Matrix([
[0],
[1]])])]
</code></pre>
<p><br />
A tutorial on <code>sympy</code> can be found <a href="http://docs.sympy.org/dev/tutorial/matrices.html" target="_blank">here</a>.</p>
<h2 id="curve-plotting-in-python">Curve plotting in Python</h2>
<p>The workhorse of plotting in Python is <a href="https://matplotlib.org/" target="_blank">Matplotlib</a> which is a Python 2D plotting library capable of producing publication quality figures. The usage of matplotlib is very similar to MATLAB.</p>
<h3 id="matplotlib-the-workhorse-of-plotting-in-python">Matplotlib, the workhorse of plotting in Python</h3>
<p>To see how plotting with Matplotlib works, let’s start with a simple example of 2D curve plotting,</p>
<pre><code class="language-python">import numpy as np
import matplotlib.pyplot as plt
def f(x):
return x**2*np.exp(-x**2)
x = np.linspace(0, 3, 51) # 51 points between 0 and 3
y = np.zeros(len(x)) # allocate y with float elements
for i in range(len(x)):
y[i] = f(x[i])
plt.plot(x, y)
plt.show()
</code></pre>
<p><br />
If you try the above code in IPython, the out on screen would be something like the following,</p>
<figure>
<img src="http:/ECL2017S/lecture/9/simple_curve_screen.png" width="900" />
</figure>
<p>You can also save the figure output as a file by,</p>
<pre><code class="language-python">In [8]: plt.plot(x, y)
Out[8]: [<matplotlib.lines.Line2D at 0x1bff2e479e8>]
In [9]: plt.savefig('simple_curve.pdf') # produces PDF file.
In [10]: plt.savefig('simple_curve.png') # produces PNG file.
In [11]: pwd
Out[11]: 'C:\\Users\\Amir' # files are saved here
</code></pre>
<p><br />
Just like MATLAB, the figures could be also decorated with axis labels, plot title, legend and a lot more, in a syntax very much like MATLAB,</p>
<pre><code class="language-python">plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.legend(['x^2*exp(-x^2)'])
plt.axis([0, 3, -0.05, 0.6]) # [xmin, xmax, ymin, ymax]
plt.title('A simple Matplotlib decorated plot')
plt.savefig('simple_curve_decorated.png')
plt.show()
</code></pre>
<p><br />
which outputs <a href="http:/ECL2017S/lecture/9/simple_curve_decorated.png" target="_blank">this file</a> in your current directory,</p>
<figure>
<img src="http:/ECL2017S/lecture/9/simple_curve_decorated.png" width="900" />
</figure>
<h4 id="plotting-multiple-curves-in-one-figure">Plotting multiple curves in one figure</h4>
<p>Again, similar to MATLAB, this can be achieved by the statement <code>hold('on')</code> like the following,</p>
<pre><code class="language-python">def f(x):
return x**2*np.exp(-x**2)
def g(x):
return x*np.exp(-x)
x = np.linspace(0, 3, 51) # 51 points between 0 and 3
yf = np.zeros(len(x)) # allocate y with float elements
yg = np.zeros(len(x)) # allocate y with float elements
for i in range(len(x)):
yf[i] = f(x[i])
yg[i] = g(x[i])
plt.plot(x, yf, 'r-') # plot with color red, as line
plt.hold('on')
plt.plot(x, yg, 'bo') # # plot with color blue, as points
plt.xlabel('x')
plt.ylabel('y')
plt.legend(['x^2*exp(-x^2)' , 'x*exp(-x)'])
plt.axis([0, 3, -0.05, 0.6]) # [xmin, xmax, ymin, ymax]
plt.title('multiple Matplotlib curves in a decorated plot')
plt.savefig('multiple_curves_decorated.png')
plt.show()
</code></pre>
<p><br />
The output of the code is a PNG figure <a href="http:/ECL2017S/lecture/9/multiple_curves_decorated.png" target="_blank">available here</a>.</p>
<figure>
<img src="http:/ECL2017S/lecture/9/multiple_curves_decorated.png" width="900" />
</figure>
<p>If you need to discontinue multiple plots on the same figure, again, as in MATLAB, you use <code>hold('off')</code>.</p>
<h4 id="subplots-in-matplotlib">Subplots in Matplotlib</h4>
<p>Suppose you wanted to generate the same curves as in the above example, but each in a different plot, but in the same figure. One way to do this would be like the following,</p>
<pre><code class="language-python">plt.figure() # generates a new figure as in MATLAB
plt.subplot(2,1,1) # create a 2-row, 1-column subplot, and this is the 1st subplot.
plt.plot(x, yf, 'r-') # plot with color red, as line
plt.subplot(2,1,2) # this is the 2nd subplot.
plt.plot(x, yg, 'bo') # plot with color blue, as points
plt.xlabel('x')
plt.ylabel('y')
plt.legend(['x*exp(-x)'])
plt.axis([0, 3, -0.05, 0.6]) # [xmin, xmax, ymin, ymax]
plt.title('an example Matplotlib subplot')
plt.savefig('two_by_one_subplot.png')
plt.show()
</code></pre>
<p><br />
The output of the code is a PNG figure <a href="http:/ECL2017S/lecture/9/two_by_one_subplot.png" target="_blank">available here</a>.</p>
<figure>
<img src="http:/ECL2017S/lecture/9/two_by_one_subplot.png" width="900" />
</figure>
<p>Note that since the decorations appeared only for the second subplot, only the second one in the figure above is decorated with plot title, legend, etc. Also, note that the <code>figure()</code> method creates a new plot window on the screen.</p>
<h3 id="other-plotting-packages">Other plotting packages</h3>
<p>For more complicated 2D/3D or vector field plotting, you may find Matplotlib inadequate. To address these inadequacies, other packages have been developed which provide interface to more advanced plotting software such as, MATLAB, Gnuplot, Grace, OpenDX, VTK, and others.</p>
<h4 id="easyviz-from-scitools">Easyviz from SciTools</h4>
<p>Because each of the above mentioned visualization software has its own plotting syntax, a Python module <code>easyviz</code> has been developed which provides a universal interface for any of the above mentioned back-end plotting software. In other words, the user can request eazyvis to use one of the above-mentioned software as the plotting engine in Python, while the syntax of the Python code is universal and the same for all of them, and this is achieved by using <code>eazyvis</code>. Just like Matplotlib, the syntax of <code>eazyvis</code> has been also purposefully made very similar to MATLAB.</p>
<p>The Easyviz module is part of the <a href="https://github.com/hplgit/scitools" target="_blank">SciTools package</a>, which consists of a set of Python tools building on Numerical Python, ScientificPython, the comprehensive SciPy environment, and other packages for scientific computing with Python. However, keep in mind that <strong>SciTools strictly requires <a href="http://python.org" target="_blank">Python v2.7</a> and <a href="http://numpy.org" target="_blank">Numerical Python</a></strong>.</p>
<h4 id="mayavi-visualization-package">Mayavi visualization package</h4>
<p><a href="http://docs.enthought.com/mayavi/mayavi/" target="_blank">Mayavi</a> is another advanced, free, scientific data visualizer for Python, with emphasis on <strong>three-dimensional visualization techniques</strong>. The package is written in Python, and uses the <a href="http://www.vtk.org/" target="_blank">Visualization Toolkit (VTK)</a> in C++ for rendering graphics. Since VTK can be configured with different backends, so can Mayavi. Mayavi is cross
platform and runs on most platforms like Mac OS X, Windows, and Linux.</p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/lecture/9-python-array-computing-plotting">Lecture 9: Python - array computing and plotting</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 12, 2017.</p><![CDATA[Homework 8: Problems - Python array computing and plotting]]>http:/ECL2017S/homework/8-problems-python-array-computing-plotting2017-04-12T00:00:00-05:002017-04-12T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This homework aims at giving you some experience with Python’s array computing and plotting features.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>1. </strong> <strong>The while-loop implementation of a for-loop</strong>. Consider the following mathematical function resembling a Hat function,</p>
<script type="math/tex; mode=display">% <![CDATA[
f(x) =
\begin{cases}
0 ~, & \text{if}~~ x<0 \\
x ~, & \text{if}~~ 0\leq x <1 \\
2-x ~, & \text{if}~~ 1\leq x <2 \\
0 ~, & \text{if}~~ x \geq 2 \\
\end{cases} %]]></script>
<p>A scalar implementation of this function would be,</p>
<pre><code class="language-python">def hatFunc(x):
if x < 0:
return 0.0
elif 0 <= x < 1:
return x
elif 1 <= x < 2:
return 2 - x
elif x >= 2:
return 0.0
</code></pre>
<p><br />
Write a vectorized version of this function. (Hint: you may need numpy’s <code>logical_and</code> method for building the vectorized version of this function.)</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>2. </strong> The vertical position $y(t)$ of a ball thrown upward is given by $y(t)=v_0t-\frac{1}{2}gt^2$, where $g$ is the acceleration of gravity and $v_0$ is the initial vertical velocity at $t=0$. Two important physical quantities in this context are the potential energy, obtained by doing work against gravity, and the kinetic energy, arising from motion. The potential energy is defined as $P=mgy$, where $m$ is the mass of the ball. The kinetic energy is defined as $K=\frac{1}{2}mv^2$, where $v$ is the velocity of the ball, related to $y$ by $v(t)=y’(t)$.</p>
<p>Write a program that can plot $P(t)$ and $K(t)$ in the same plot, along with their sum $E = P + K$. Let $t\in[0,2v_0/g]$. Write your program such that $m$ and $v_0$ are read from the command line. Run the program with various choices of $m$ and $v_0$ and observe that $P+K$ always remains constant in this motion, regardless of initial conditions. This is in fact, the fundamental principle of conservation of energy in Physics.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>3. </strong> <strong>Integration by midpoint rule</strong>: The idea of the Midpoint rule for integration is to divide the area under a curve $f(x)$ into $n$ equal-sized rectangles. The height of the rectangle is determined by the value of $f$ at the midpoint of the rectangle. The figure below illustrates the idea,</p>
<figure>
<img src="http:/ECL2017S/homework/8/midpnt.gif" width="700" />
</figure>
<p>To implement the midpointrule, one has to compute the area of each rectangle, sum them up, just as in the formula for the Midpoint rule,</p>
<script type="math/tex; mode=display">\int^b_a f(x) dx \approx h\sum^{n-1}_{i=0} f(a+ih+0.5h) ~,</script>
<p>where $h=(b-a)/n$ is the width of each rectangle. Implement this formula as a Python function midpoint(f, a, b, n) and test the integrator with the following example input mathematical functions.</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
f_1(x) &= exp(x) ~,~ \text{for integration range } ~ [0, \log(3)] \\\\
f_2(x) &= cos(x) ~,~ \text{for integration range } ~ [0, \pi] \\\\
f_3(x) &= sin(x) ~,~ \text{for integration range } ~ [0, \pi] \\\\
f_4(x) &= sin(x) ~,~ \text{for integration range } ~ [0, \pi / 2] \\\\
\end{align*} %]]></script>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>4. </strong> <strong>Visualize approximations in the Midpoint integration rule</strong> Now consider the following function,</p>
<script type="math/tex; mode=display">f(x) = x(12-x)+\sin(\pi x) ~~,~~ x\in[0,10] ~,</script>
<p>which we wish to integrate using the midpoint integrator that you wrote in the previous example. Now write a new code that visualizes the midpoint rule, similar to in the following figure. (Hint: you will need to use the Matplotlib function <code>fill_between</code> and use this function to create the filled areas between f(x) and the approximating rectangles)</p>
<figure>
<img src="http:/ECL2017S/homework/8/midpoint_visualization.png" width="700" />
</figure>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/homework/8-problems-python-array-computing-plotting">Homework 8: Problems - Python array computing and plotting</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 12, 2017.</p><![CDATA[Homework 7: Solutions - Python I/O, error handling, and unit testing]]>http:/ECL2017S/homework/7-solutions-python-IO-error-handling-unit-testing2017-04-12T00:00:00-05:002017-04-12T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This is the solution to <a href="7-problems-python-IO-error-handling-unit-testing" target="_blank">Homework 7: Problems - Python I/O, error handling, and unit testing</a>.</p>
<p>The following figure illustrates the grade distribution for this homework.</p>
<figure>
<img src="http:/ECL2017S/homework/gradeDist/gradeHistHomework7.png" width="700" />
<figcaption style="text-align:center">
Maximum possible points is 100.<br />
</figcaption>
</figure>
<hr />
<hr />
<p><br /></p>
<p>This homework aims at giving you some experience with Python I/O, error handling in your code, and testing you code for accuracy and robustness.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>1. </strong> Write a simple program named <code>sum.py</code>, that takes in an arbitrary-size list of input floats from the command-line, and prints out the sum of them on the terminal with the following message,</p>
<pre><code class="language-bash">$ python sum.py 1 2 1 23
The sum of 1 2 1 23 is 27.0
</code></pre>
<p><br />
Note that you will need to use the Python’s builtin function <code>sum()</code>.</p>
<p><br />
<strong>Answer:</strong></p>
<pre><code class="language-python">import sys
print( 'The sum of {} is {}'.format( ' '.join(sys.argv[1:]) , sum([float(x) for x in sys.argv[1:]]) )
</code></pre>
<p><br />
Here is the Bash output,</p>
<pre><code class="language-bash">$ python sum.py 1 2 1 23
The sum of 1 2 1 23 is 27.0
</code></pre>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>2. </strong> Similar to the previous probelm, write a simple program named <code>sum_via_eval.py</code>, that takes in an arbitrary-size list of input numbers from the command-line, and prints out the sum of them on the terminal, this time using Python’s <code>eval</code> function. The program output should look like the following,</p>
<pre><code class="language-bash">$ python sum.py 1 2 1 23
The sum of 1 2 1 23 is 27
</code></pre>
<p><br /></p>
<p><strong>Answer:</strong></p>
<pre><code class="language-python">import sys
print( 'The sum of {} is {}'.format( ' '.join(sys.argv[1:]) , eval('+'.join(sys.argv[1:]) ) ) )
</code></pre>
<p><br />
Here is the Bash output,</p>
<pre><code class="language-bash">$ python sum.py 1 2 1 23
The sum of 1 2 1 23 is 27
</code></pre>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>3. </strong> Consider <a href="http:/ECL2017S/homework/7/1A2T_A.dssp" target="_blank">this data file</a>. It contains information about the amino acids in <a href="http://www.rcsb.org/pdb/explore.do?structureId=1a2t" target="_blank">a protein</a> called <code>1A2T</code>. Each amino acid in protein is labeled by a single letter. There are 20 amin acid molecules in nature, and each has a total surface area (in units of Angstroms squared) that is given by the following table,</p>
<pre><code>'A': 129.0
'R': 274.0
'N': 195.0
'D': 193.0
'C': 167.0
'Q': 225.0
'E': 223.0
'G': 104.0
'H': 224.0
'I': 197.0
'L': 201.0
'K': 236.0
'M': 224.0
'F': 240.0
'P': 159.0
'S': 155.0
'T': 172.0
'W': 285.0
'Y': 263.0
'V': 174.0
</code></pre>
<p>However, when these amino acids sit next to each other to form a chain protein, they cover parts of each other, such that only parts of their surfaces is exposed, while the rest is hidden from the outside world by other neighboring amino acids. Therefore, one would expect an amino acid that is at the core of a spherical protein would have almost zero exposed surface area.</p>
<p>Now given the above information, write a Python program that takes in two command-line input arguments, one of which is a string containing the path to the above <a href="http:/ECL2017S/homework/7/1A2T_A.dssp" target="_blank">input file</a> <code>1A2T_A.dssp</code> which contains the partially exposed surface areas of amino acids in protein <code>1A2T</code> for each of its amino acids, and a second command-line argument which is the path to the file containing output of the code (e.g., it could be <code>./readDSSP.out</code>). Then,</p>
<ol>
<li>the code reads the content of this file, and<br />
<br /></li>
<li>extracts the names of the amino acids in this protein from the data column inside the file which has the header <code>AA</code> (look at the line number 25 inside the input data file, below <code>AA</code> is the column containing the one-letter names of amino acids in this protein), and<br />
<br /></li>
<li>also extracts the partially exposed surface area information for each of these amino acids which appear in the column with header <code>ACC</code>, and<br />
<br /></li>
<li>then uses the above table of maximum surface area values to calculate the fractional exposed surface area of each amino acid in this protein (i.e., for each amino acid, fraction_of_exposed_surface = ACC / maximum_surface_area_from_table), and<br />
<br /></li>
<li>finally for each amino acid in this protein, it prints the one-letter name of the amino acid, its corresponding partially exposed surface area (ACC from the input file), and its corresponding fractional exposed surface area (name it RSA) to the output file given by the user on the command line.<br />
<br /></li>
<li>On the first column of the output file, the code should also write the name of the protein (which is basically the name of the input file <code>1A2T_A</code>) on each line of the output file. <strong>Note that your code should extract the protein name from the input filename</strong> (by removing the file extension and other unnecessary information from the input command line string). <a href="http:/ECL2017S/homework/7/readDSSP.out" target="_blank">Here</a> is an example output of the code.<br />
<br /></li>
<li>Your code should also be able to handle an error resulting from less or more than 2 input command line arguments. That is, if the number of input arguments is 3 or 1, then it should input the following message on screen and stop.</li>
</ol>
<pre><code class="language-bash">$ ./readDSSP.py ./1A2T_A.dssp
Usage:
./readDSSP.py <input dssp file> <output summary file>
Program aborted.
</code></pre>
<p><br />
or,</p>
<pre><code class="language-bash">$ ./readDSSP.py ./1A2T_A.dssp ./readDSSP.out amir
Usage:
./readDSSP.py <input dssp file> <output summary file>
Program aborted.
</code></pre>
<p><br />
To achieve the above goal, you will have to create a dictionary from the above table, with amino acid names as the keys, and the maximum surface areas as the corresponding values. Name your code <code>readDSSP.py</code> and submit it to your repository.</p>
<p><strong>Write your code in such a way that it checks for the existence of the output file</strong>. If it already exists, then it does not remove the content of the file, whereas, it appends new data to the existing file. therwise, if the file does not exist, then it creates a new output file as requested by the user. To do so, you will need to use <code>os.path.isfile</code> function from module <code>os</code>.</p>
<p><strong>ATTENTION</strong>: Note that in some rows instead of a one-letter amino acid name, there is <code>!</code>. In such cases, your code should be able to detect the abnormality and skip that row, because that row does not contain amino acid information.</p>
<p><br />
<strong>Answer:</strong><br />
An example implementation can be downloaded from <a href="http:/ECL2017S/homework/7/readDSSP.py" target="_blank">here</a>.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>4. </strong> Consider the simplest program for evaluating the formula $y(t) = v_0t-\frac{1}{2}gt^2$,</p>
<pre><code class="language-python">v0 = 3; g = 9.81; t = 0.6
y = v0*t - 0.5*g*t**2
print(y)
</code></pre>
<p><br />
(A) Write a program that takes in the above necessary input data ($t$,$v_0$) as command line arguments.<br />
<br />
(B) Extend your program from part (A) with exception handling such that missing command-line arguments are detected. For example, if the user has entered enough input arguments, then the code should raise <code>IndexError</code> exception. In the <code>except IndexError</code> block, the code should use the <code>input</code> function to ask the user for the missing input data.<br />
<br />
(C) Add another exception handling block that tests if the $t$ value read from the command line, lies between $0$ and $2v_0/g$. If not, then it raises a <code>ValueError</code> exception in the if block on the legal values of $t$, and notifes the user about the legal interval for $t$ in the exception message.</p>
<p>Here are some example runs of the code,</p>
<pre><code class="language-bash">$ ./projectile.py
Both v0 and t must be supplied on the command line
v0 = ?
5
t = ?
4
Traceback (most recent call last):
File "./projectile.py", line 17, in <module>
'must be between 0 and 2v0/g = {}'.format(t,2.0*v0/g))
ValueError: t = 4.0 is a non-physical value.
must be between 0 and 2v0/g = 1.019367991845056
</code></pre>
<p><br /></p>
<pre><code class="language-bash">$ ./projectile.py
Both v0 and t must be supplied on the command line
v0 = ?
5
t = ?
0.5
y = 1.27375
</code></pre>
<p><br /></p>
<pre><code class="language-bash">$ ./projectile.py 5 0.4
y = 1.2151999999999998
</code></pre>
<p><br /></p>
<pre><code class="language-bash">$ ./projectile.py 5 0.4 3
y = 1.2151999999999998
</code></pre>
<p><br /></p>
<p><br />
<strong>Answer:</strong><br />
<a href="http:/ECL2017S/homework/7/projectile.py" target="_blank">Here</a> is an example implementation.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>5. </strong> Consider the function <code>Newton</code> that we discussed in <a href="http:/ECL2017S/lecture/8-python-io-error-handling-unit-testing" target="_blank">lecture 8</a>,</p>
<pre><code class="language-python">def Newton(f, dfdx, x, eps=1E-7, maxit=100):
if not callable(f): raise TypeError( 'f is %s, should be function or class with __call__' % type(f) )
if not callable(dfdx): raise TypeError( 'dfdx is %s, should be function or class with __call__' % type(dfdx) )
if not isinstance(maxit, int): raise TypeError( 'maxit is %s, must be int' % type(maxit) )
if maxit <= 0: raise ValueError( 'maxit=%d <= 0, must be > 0' % maxit )
n = 0 # iteration counter
while abs(f(x)) > eps and n < maxit:
try:
x = x - f(x)/float(dfdx(x))
except ZeroDivisionError:
raise ZeroDivisionError( 'dfdx(%g)=%g - cannot divide by zero' % (x, dfdx(x)) )
n += 1
return x, f(x), n
</code></pre>
<p><br />
This function is supposed to be able to handle exceptions such as divergent iterations (which we discussed in the lecture), and division-by-zero. The latter error happens when <code>dfdx(x)=0</code> in the above code. Write a test code that ensures the above code is able to correctly identify a division-by-zero exception and raise the correct assertionError.<br />
(<em>Hint: To do so, you need to consider a test mathematical function as input to <code>Newton</code>. One example could be $f(x)=\cos(x)$ with a starting search value $x=0$. This would result in derivative value $f’(x=0)=-\sin(x=0)=0$, which should lead to a <code>ZeroDivisionError</code> exception. Now, write a test function <code>test_Newton_div_by_zero</code> that can explicitly handle this exception by introducing a boolean variable <code>success</code> that is <code>True</code> if the exception is raised and otherwise <code>False</code></em>.)</p>
<p><br /></p>
<p><strong>Answer:</strong></p>
<pre><code class="language-python">def test_Newton_div_by_zero():
from math import sin, cos
f = cos
dfdx = lambda x: -sin(x)
success = False
try:
x, f_x, n = Newton(f, dfdx, 0, eps=1E-4, maxit=1)
except ZeroDivisionError:
success = True
assert success , "Test for division-by-zero failed"
</code></pre>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/homework/7-solutions-python-IO-error-handling-unit-testing">Homework 7: Solutions - Python I/O, error handling, and unit testing</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 12, 2017.</p><![CDATA[Quiz 6: Solutions - Python modules, loops, and IO]]>http:/ECL2017S/quiz/6-solutions-python-modules-loops-io2017-04-05T00:00:00-05:002017-04-05T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This is the solution to <a href="6-problems-python-modules-loops-io" target="_blank">Quiz 6: Problems - Python modules, loops, and IO</a>.</p>
<p>The following figure illustrates the grade distribution for this quiz.</p>
<figure>
<img src="http:/ECL2017S/quiz/gradeDist/gradeHistQuiz6.png" width="700" />
<figcaption style="text-align:center">
Maximum possible points is 100.
</figcaption>
</figure>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p>This quiz aims at testing your basic knowledge of Python’s modules, loops and simple I/O. Don’t forget to push your answers to your remote repository by the end of quiz time. Push your quiz-6 <em>readme.md</em> file to quiz/6/ folder in your Github project. If you write your answers in Python scripts, put the script files in the same folder as well. If you feel uncertain about your answer, you can test your final codes on Jupyter or IPython command lines.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>1. </strong> Suppose you write a Python module, which you would also like to run it as a standalone Python code. If you wanted to make sure that some specific Python statements are executed only when the code is run a Python code (and not a module), you may recall from the lecture, that we had to use and if block like the following,</p>
<pre><code class="language-python">if __name__ == "__main__":
<Python statements>
</code></pre>
<p><br />
Briefly explain what this if block does and mean.</p>
<p><br />
<strong>Answer:</strong><br />
Each Python module has an attribute <code>__name__</code>. When the code is used as a Python module, the <code>__name__</code> is set to the name of the module, otherwise it is set to <code>__main__</code>. Therefore, this if block makes sure that the code is running as a standalone code or not.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>2. </strong> Suppose you write a module named <code>myModule</code>, which contains the function <code>myfunc</code>. Now you import this module to another code.</p>
<p>(A) Write down the import statement that would enable you to use <code>myfunc</code> with name <code>f</code> instead.</p>
<p>(B) What would be the output of the following Python print statement,</p>
<pre><code class="language-python">import myModule as mm
print(mm.__name__)
</code></pre>
<p><br /></p>
<p><br />
<strong>Answer:</strong><br />
(A)</p>
<pre><code class="language-python">In [27]: from math import sqrt as f
In [28]: f(4.0)
Out[28]: 2.0
</code></pre>
<p><br />
(B)</p>
<pre><code class="language-python">In [25]: import math as m
In [26]: print(m.__name__)
math
</code></pre>
<p><br /></p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>3. </strong> Suppose there are two lists of numbers,</p>
<pre><code class="language-python">even = [0,2,4,6,8]
odd = [1,3,5,7,9]
</code></pre>
<p><br />
Write a <strong>one-line</strong> Python statement (list comprehension) that gives a list <code>summ</code> whose elements are the sum of the respective elements in the above two lists <code>odd</code> and <code>even</code>, that is,</p>
<pre><code class="language-python">In [37]: summ
Out[37]: [1, 5, 9, 13, 17]
</code></pre>
<p><br />
(Hint: You can use <code>zip</code> function inside the list comprehension.)</p>
<p><br />
<strong>Answer:</strong></p>
<pre><code class="language-python">In [39]: even = [0,2,4,6,8]
In [40]: odd = [1,3,5,7,9]
In [41]: summ = [i+j for i,j in zip(odd,even)]
In [42]: summ
Out[42]: [1, 5, 9, 13, 17]
</code></pre>
<p><br /></p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>4. </strong> Consider the following for-loop,</p>
<pre><code class="language-python">mylist = list(range(0,10,2))
for item in mylist:
mylist.append(item+1)
</code></pre>
<p><br />
How many iterations does this for-loop perform before ending? Explain briefly why.</p>
<p><br />
<strong>Answer:</strong><br />
This for-loop never ends! Because at each iteration, a new element is added to the end of the list. You can check if this is indeed the case by adding a print statement inside the loop,</p>
<pre><code class="language-python">mylist = list(range(0,10,2))
for item in mylist:
mylist.append(item+1)
print(item)
</code></pre>
<pre><code>0
2
4
6
8
1
3
5
7
9
2
4
6
8
10
3
5
7
9
11
4
6
8
10
12
5
7
9
11
13
6
8
10
12
14
7
9
11
</code></pre>
<p>and the loop keeps printing forever!</p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/quiz/6-solutions-python-modules-loops-io">Quiz 6: Solutions - Python modules, loops, and IO</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 05, 2017.</p><![CDATA[Quiz 6: Problems - Python modules, loops, and IO]]>http:/ECL2017S/quiz/6-problems-python-modules-loops-io2017-04-05T00:00:00-05:002017-04-05T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This quiz aims at testing your basic knowledge of Python’s modules, loops and simple I/O. Don’t forget to push your answers to your remote repository by the end of quiz time. Push your quiz-6 <em>readme.md</em> file to quiz/6/ folder in your Github project. If you write your answers in Python scripts, put the script files in the same folder as well. If you feel uncertain about your answer, you can test your final codes on Jupyter or IPython command lines.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>1. </strong> Suppose you write a Python module, which you would also like to run it as a standalone Python code. If you wanted to make sure that some specific Python statements are executed only when the code is run a Python code (and not a module), you may recall from the lecture, that we had to use and if block like the following,</p>
<pre><code class="language-python">if __name__ == "__main__":
<Python statements>
</code></pre>
<p><br />
Briefly explain what this if block does and mean.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>2. </strong> Suppose you write a module named <code>myModule</code>, which contains the function <code>myfunc</code>. Now you import this module to another code.</p>
<p>(A) Write down the import statement that would enable you to use <code>myfunc</code> with name <code>f</code> instead.</p>
<p>(B) What would be the output of the following Python print statement,</p>
<pre><code class="language-python">import myModule as mm
print(mm.__name__)
</code></pre>
<p><br /></p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>3. </strong> Suppose there are two lists of numbers,</p>
<pre><code class="language-python">even = [0,2,4,6,8]
odd = [1,3,5,7,9]
</code></pre>
<p><br />
Write a <strong>one-line</strong> Python statement (list comprehension) that gives a list <code>summ</code> whose elements are the sum of the respective elements in the above two lists <code>odd</code> and <code>even</code>, that is,</p>
<pre><code class="language-python">In [37]: summ
Out[37]: [1, 5, 9, 13, 17]
</code></pre>
<p><br />
(Hint: You can use <code>zip</code> function inside the list comprehension.)</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>4. </strong> Consider the following for-loop,</p>
<pre><code class="language-python">mylist = list(range(0,10,2))
for item in mylist:
mylist.append(item+1)
</code></pre>
<p><br />
How many iterations does this for-loop perform before ending? Explain briefly why.</p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/quiz/6-problems-python-modules-loops-io">Quiz 6: Problems - Python modules, loops, and IO</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 05, 2017.</p><![CDATA[Lecture 8: Python - I/O, error handling, and testing frameworks]]>http:/ECL2017S/lecture/8-python-io-error-handling-unit-testing2017-04-05T00:00:00-05:002017-04-05T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This lecture further explains topics on Input/Output processes and error handling in Python, as well as methods of testing the accuracy and robustness of your code.</p>
<div class="post_toc"></div>
<h2 id="io-continued">I/o (continued)</h2>
<p>So far in this course, we have indirectly discussed several methods of getting input information from the user, and several methods of outputting the result in a Python program. This lecture, attempts at formalizing all the previous discussions and introduce more general efficient methods of the code interaction with users.</p>
<h3 id="methods-of-inputting-data">Methods of inputting data</h3>
<p>Let’s begin with an example code, explaining the meaning of input/output (I/O) in Python,</p>
<pre><code class="language-python">from math import exp
a = 0.1
b = 1
x = 0.6
y = a*exp(b*x)
print(y)
</code></pre>
<pre><code>0.1822118800390509
</code></pre>
<p>In the above code, $a,b,x$ are examples of input data to a code, and $y$ is an example of code output. In such case as in the above, the input data is said to be <strong>hardcoded</strong> in the program.</p>
<blockquote>
<b>In general, in any programming language, including Python you should always avoid hardcoding input information into your program.</b><br /><br />
If data is hardcoded, then every time it has to change, the user has to change the content of the code, and this is not considered good programming style for software development.
</blockquote>
<p><br />
In general, input data can be fed to a program in four different ways:</p>
<ol>
<li>let the user answer questions in a dialog in the <strong>terminal window</strong>,</li>
<li>let the user provide input on the <strong>command line</strong>,</li>
<li>let the user provide input data in a <strong>file</strong>,</li>
<li>let the user write input data in a <strong>graphical interface</strong>.</li>
</ol>
<h4 id="input-data-from-terminal-window">Input data from terminal window</h4>
<p>We have already introduced and used this method frequently in previous lectures, via the Python’s builtin function <code>input()</code>. If we were to get the input data for the above example code via the terminal window, an example would be the following,</p>
<pre><code class="language-python">from math import exp
a,b,x = input('input the values for a,b,x (comma separated): ').split(",")
y = float(a)*exp(float(b)*float(x))
print(y)
</code></pre>
<pre><code>input the values for a,b,x (comma separated): 0.1,1,0.6
0.1822118800390509
</code></pre>
<h4 id="input-data-from-command-line">Input data from command line</h4>
<p>This approach, which we discussed in previous lecture, is most popular in Unix-like environments, where most users are accustomed to using Bash command line. However, it can be readily used in Windows environment as well. For this approach, there is a Python module <code>sys</code> that can accomplish what we desire,</p>
<pre><code class="language-python">from math import exp
import sys
a,b,x = sys.argv[1],sys.argv[2],sys.argv[3]
y = float(a)*exp(float(b)*float(x))
print(y)
</code></pre>
<p><br />
Now if you save this code in a <a href="http:/ECL2017S/lecture/8/input_via_sys.py" target="_blank">file</a>, and run it on the Bash command line, the program expects you the enter 3 float numbers following the name of the program,</p>
<pre><code class="language-bash">$ python input_via_sys.py 0.1 1 0.6
0.1822118800390509
</code></pre>
<p><br /></p>
<blockquote>
<b>ATTENTION: Notice the convention for command-line arguments</b><br /><br />
<b>1.</b> As you see in the above example, the name of the program is considered as the first command line argument (<code>sys.argv[0]</code>). Also the arguments must be separated by a white space, and should appear in the proper order.<br /><br />
<b>2.</b> If one value has a white space (e.g., a string value with white space character in it), then it has to be contained in quotation marks <code>''</code> or <code>""</code>.<br /><br />
<b>3.</b> Also note that all input command-line arguments are taken as string values. Therefore, you will have to convert them to the proper type (e.g., float, int, ...) once they are read from the command line.
</blockquote>
<p><br /></p>
<h5 id="variable-number-of-command-line-arguments">Variable number of command line arguments</h5>
<p>If the number of input arguments on the command line is not known a priori, then you can get a list of all input arguments using <code>sys.argv[1:]</code> and then use a for-loop to loop over individual elements of it, or use <code>len()</code> function to find the total number of input arguments.<br />
<br /></p>
<h5 id="option-value-pairs-as-command-line-input">Option-value pairs as command-line input</h5>
<p>Once the number of input arguments to your code increases, the process of inputting data as command line arguments can get complicated. Ideally, the user should be able to enter data in any arbitrary order. This can be done by indicating the meaning of each input by a flag before the input value. For example, suppose you were to find the location $y(t)$ of an object thrown up in the air vertically, given that the object started at $y=y_0$, at $t=0$ with an initial velocity $v_0$, and thereafter was subject to a constant acceleration $a$,
<script type="math/tex">y(t) = y_0 + v_0t + \frac{1}{2}at^2 ~.</script>
Obviously, this formula requires four input variables: $y_0$, $v_0$, $a$, and $t$, and we don’t the program user to memorize their order of entry on the command line. The solution is to identify the type of each input using a flag preceding the input value. This can be done using <code>argparse</code> Python module. Details of the usage of this module goes beyond the limited time of our class. However, I recommend you to have a look at the <a href="https://docs.python.org/3/library/argparse.html" target="_blank">syntax and usage of <em>argparse</em> module</a>, as you will find it very handy in your Python codes, projects, and software development.</p>
<h4 id="input-data-from-file">Input data from file</h4>
<p>In cases where the input data is large, the command-line arguments and input from terminal window are not efficient anymore. In such cases, the most common approach is to let the code read input data from a file, the path to which is most often given to the code from the command line or the terminal window.</p>
<h5 id="reading-a-file-line-by-line">Reading a file line by line</h5>
<p>To read a file, say <a href="http:/ECL2017S/lecture/8/data.in" target="_blank">this file</a>, one first needs to open it,</p>
<pre><code class="language-python">In [1]: myfile = open('data.in', 'r')
In [2]: type(myfile)
Out[2]: _io.TextIOWrapper
In [5]: myfile.
myfile.buffer myfile.detach myfile.fileno myfile.line_buffering myfile.newlines myfile.readline myfile.seekable myfile.writable
myfile.close myfile.encoding myfile.flush myfile.mode myfile.read myfile.readlines myfile.tell myfile.write
myfile.closed myfile.errors myfile.isatty myfile.name myfile.readable myfile.seek myfile.truncate myfile.writelines
</code></pre>
<p><br />
The function <code>open</code> creates a file object, stored in the variable <code>myfile</code>. The second input argument to <code>open</code>, <code>'r'</code> tells the function that the purpose of this file opening is to read data (as opposed to, for example, writing data, or both reading and writing).</p>
<p>Now you can use a for loop to read the data in this file line by line:</p>
<pre><code class="language-python">for line in myfile:
print(line)
</code></pre>
<pre><code>1
3
4
5
6
7
88
65
</code></pre>
<p>What is printed here, is actually the content of <code>data.in</code> file, line by line.</p>
<h5 id="alternative-method-of-reading-file-data">Alternative method of reading file data</h5>
<p>Instead of reading one line at a time, as in the above, we can load all lines into a single list of strings,</p>
<pre><code class="language-python">In [9]: myfile = open('data.in', 'r')
In [10]: lines = myfile.readlines()
In [11]: type(lines)
Out[11]: list
</code></pre>
<p><br />
Note that each element of <code>line</code> contains one line of the file.</p>
<pre><code class="language-python">In [15]: lines
Out[15]: ['1\n', '3\n', '4\n', '5\n', '6\n', '7\n', '88\n', '65\n']
</code></pre>
<p><br />
The action of the method <code>readlines()</code> is equivalent to a for-loop like the following,</p>
<pre><code class="language-python">In [16]: myfile = open('data.in', 'r')
...: lines = []
...: for line in myfile:
...: lines.append(line)
...: lines
...:
Out[16]: ['1\n', '3\n', '4\n', '5\n', '6\n', '7\n', '88\n', '65\n']
</code></pre>
<p><br />
or this <em>list comprehension</em> format,</p>
<pre><code class="language-python">In [19]: myfile = open('data.in', 'r')
...: lines = [line for line in myfile]
...: lines
...:
Out[19]: ['1\n', '3\n', '4\n', '5\n', '6\n', '7\n', '88\n', '65\n']
</code></pre>
<p><br />
Now suppose you were to calculate the mean of the numbers in <a href="http:/ECL2017S/lecture/8/data.in" target="_blank">this file</a>. You could simply use the following list comprehension code to do so,</p>
<pre><code class="language-python">In [22]: mean = sum([float(line) for line in lines])/len(lines)
...: print(mean)
22.375
</code></pre>
<p><br />
Note that once you read the file, you can close it using,</p>
<pre><code class="language-python">myfile.close()
</code></pre>
<p><br /></p>
<h5 id="the-with-statement">The <em>with</em> statement</h5>
<p>More often in modern Python code you may see the <code>with</code> statement for reading a file, like the following</p>
<pre><code class="language-python">In [34]: with open('data.in', 'r') as myfile:
...: for line in myfile:
...: print(line)
...:
1
3
4
5
6
7
88
65
</code></pre>
<p><br />
This is technically equivalent to,</p>
<pre><code class="language-python">In [35]: myfile = open('data.in', 'r')
...: for line in myfile:
...: print(line)
...: myfile.close()
...:
1
3
4
5
6
7
88
65
</code></pre>
<p><br />
The difference here is that with the modern <code>with</code> statement, there is no need to close the file in the end.</p>
<h5 id="the-old-while-true-construction">The old <em>while True</em> construction</h5>
<p>The call <code>myfile.readline()</code> returns a string containing the text at the current line. A new <code>myfile.readline()</code> statement will read the next line. If the file reaches the end, then <code>myfile.readline()</code> returns an empty string, the end of the file has
reached and the code must stop further reading of the file. The traditional way of telling the code to stop at the end of the file is a <code>while</code> loop like the following,</p>
<pre><code class="language-python">In [36]: myfile = open('data.in', 'r')
...: while True:
...: line = myfile.readline()
...: if not line:
...: break
...: print(line)
1
3
4
5
6
7
88
65
</code></pre>
<p><br /></p>
<h5 id="reading-an-entire-file-as-a-single-string">Reading an entire file as a single string</h5>
<p>While the <code>readlines()</code> method returns a list of lines in the file, the <code>read()</code> method returns a string containing the entire content of the file.</p>
<pre><code class="language-python">
In [37]: myfile = open('data.in', 'r')
...: s = myfile.read()
In [38]: s
Out[38]: '1\n3\n4\n5\n6\n7\n88\n65\n'
In [39]: print(s)
1
3
4
5
6
7
88
65
</code></pre>
<p><br />
The major advantage of this method of reading file content is that you can then immediately apply string methods directly on the file content.</p>
<pre><code class="language-python">In [48]: myfile = open('data.in', 'r')
...: numbers = [float(x) for x in myfile.read().split()]
...: mean = sum(numbers)/len(numbers)
...:
In [49]: print(mean)
22.375
</code></pre>
<p><br /></p>
<h3 id="converting-user-input-to-live-python-objects">Converting user input to live Python objects</h3>
<p>One of the cool features in Python I/O is that you can provide text containing valid Python code as input to a program and then
turn that text into <em>live Python objects</em> as if the text were lines of code written directly into the program beforehand. This is a very powerful tool for letting users specify function formulas, for instance, as input to a program. The program code itself has no knowledge about the kind of function the user wants to work with, yet at run time the user’s desired
formula enters the computations. To achieve the goal, one can use Python’s <strong>magic functions</strong>, a.k.a. <strong>special methods</strong>.</p>
<h4 id="the-magic-eval-function">The magic <em>eval</em> function</h4>
<p>The <code>eval</code> function takes a <strong>string as argument</strong> and <strong>evaluates</strong> this string as a <strong>Python expression</strong>. The result of an expression is an <strong>object</strong>. For example,</p>
<pre><code class="language-python">In [10]: eval('1+2')
Out[10]: 3
</code></pre>
<p><br />
This is equivalent to typing,</p>
<pre><code class="language-python">In [11]: 1+2
Out[11]: 3
</code></pre>
<p><br />
or another example,</p>
<pre><code class="language-python">In [12]: a = 1
In [13]: b = 2
In [14]: c = eval('a+b')
In [15]: c
Out[15]: 3
</code></pre>
<p><br />
or,</p>
<pre><code class="language-python">In [19]: from math import sqrt
In [20]: eval('sqrt(4)')
Out[20]: 2.0
</code></pre>
<p><br />
But, note that in all of the above examples, the <code>eval</code> function <strong>evaluates</strong> a Python expression, that is, this function <strong>cannot execute</strong> a Python statement.</p>
<p>Now the cool thing about this function is that, you can directly apply it to the user input. For example, suppose the user is asked to input a Python expression and then the code is supposed to evaluate the input just like a simple calculator,</p>
<pre><code class="language-python">eval(input('Input an arithmetic expression to evaluate: '))
</code></pre>
<pre><code>Input an arithmetic expression to evaluate: 2 + 3.0/5 + exp(7)
1099.2331584284584
</code></pre>
<h4 id="the-magic-exec-function">The magic <em>exec</em> function</h4>
<p>Similar to the <code>eval</code> function, there is also an <code>exec</code> magic function that executes a string containing an arbitrary
Python code, not just a Python expression. This is a powerful idea since it now enables the user to write a formula as input to the program, available to the program in the form of a string object. The program can then convert this formula to a callable Python code, or function, using the magic <code>exec</code> function.</p>
<pre><code class="language-python">In [21]: exec('import math')
In [22]: exec('a=1; b=2; c=a+b')
In [23]: a,b,c
Out[23]: (1, 2, 3)
</code></pre>
<p><br />
One could even input a full function definition to the exec function,</p>
<pre><code class="language-python">myFuncString = input('Input a Python function definition of interest: ')
f = exec(myFuncString)
</code></pre>
<pre><code>Input a Python function definition of interest, named func: def func(x): return 2*x + 1
</code></pre>
<pre><code class="language-python">func(x=1)
</code></pre>
<pre><code>3
</code></pre>
<p>Now, since this is such a useful functionality in Python, there is already a package written <code>scitools</code>, that converts an input expression to a Python function,</p>
<pre><code class="language-python">from scitools.StringFunction import StringFunction
myfuncString = input('Input a Python expression to build your requested Python function: ')
myfunc = StringFunction(myfuncString)
</code></pre>
<p><br />
The only major caveat with this module is that, at the moment, it only works with Python 2.x, and not Python 3. So, the above code will not work on your Python 3 platform.<br />
<br /></p>
<h3 id="methods-of-outputting-data">Methods of outputting data</h3>
<p>Two major methods of data output are,</p>
<ol>
<li>writing to the terminal window, as previously done using <code>print()</code> function, or,</li>
<li>writing to an output file.</li>
</ol>
<p>We have already extensively discussed printing output to the terminal window. Writing data to file is also easy.</p>
<h4 id="writing-to-a-file">Writing to a file</h4>
<p>Similar to reading from a file, in order to write to a file, one has to first open the file, this time for the purpose of writing, which is indicated by <code>'w'</code> or <code>'a'</code>,</p>
<pre><code class="language-python">outfile = open(filename, 'w') # write to a new file, or overwrite file
</code></pre>
<p><br />
One could also <strong>append</strong> some output to an <strong>existing file</strong> using the <code>'a'</code> indicator as input to <code>open()</code>,</p>
<pre><code class="language-python">outfile = open(filename, 'a') # append to the end of an existing file
</code></pre>
<p><br />
In both cases, the string valued variable <code>filename</code> contains the path to the file that should be created or manipulated. Suppose we want to write the output of the above code in previous section to a new file. All you would need to do is the following,</p>
<pre><code class="language-python">myfile = open('data.in', 'r')
numbers = [float(x) for x in myfile.read().split()]
mean = sum(numbers)/len(numbers)
outfile = open('data.out','w')
outfile.write(str(mean) + '\n')
myfile.close()
outfile.close()
</code></pre>
<p><br />
This will result in the creation of <a href="http:/ECL2017S/lecture/8/data.out" target="_blank">a new file</a> named <code>data.out</code> which contains the value of <code>mean</code> variable. Note that the addition of the character <code>'\n'</code> at the end of the <code>write</code> statement is necessary, otherwise the next write to the file will not appear on a new line.</p>
<h5 id="writing-a-table-of-data-to-a-file">Writing a table of data to a file</h5>
<p>Now suppose you were to write the following list to an output file,</p>
<pre><code class="language-python">data = [[ 0.75, 0.29619813, -0.29619813, -0.75 ],
[ 0.29619813, 0.11697778, -0.11697778, -0.29619813],
[-0.29619813, -0.11697778, 0.11697778, 0.29619813],
[-0.75, -0.29619813, 0.29619813, 0.75 ]]
</code></pre>
<p><br />
One solution would be the following,</p>
<pre><code class="language-python">outfile = open('table.out', 'w')
for row in data:
for column in row:
outfile.write( '{:14.8f}'.format(column) )
outfile.write('\n')
outfile.close()
</code></pre>
<p><br />
This code would result in the creation of an <a href="http:/ECL2017S/lecture/8/table.out" target="_blank">output file</a> named <code>table.out</code> which contain the content of <code>data</code> variable, in a nice formatted style as the following,</p>
<pre><code class="language-text"> 0.75000000 0.29619813 -0.29619813 -0.75000000
0.29619813 0.11697778 -0.11697778 -0.29619813
-0.29619813 -0.11697778 0.11697778 0.29619813
-0.75000000 -0.29619813 0.29619813 0.75000000
</code></pre>
<p><br /></p>
<h2 id="error-handling-in-python">Error handling in Python</h2>
<p>A good code has to be able to handle exceptional situations that may occur during the code execution. These exceptions may occur during data input from either command line, terminal window, or an input file. They may also occur as a result of repeated operations on the input data, inside the code. For example, in <a href="http:/ECL2017S/lecture/7-python-modules-loops-io#command-line-arguments" target="_blank">lecture 7</a>, we explained a way of handling the wrong number of input command line arguments. This and similar measures to handle nicely the unexpected runtime errors is what’s called <strong>error and exception handling</strong>.</p>
<p>A simple way of error handling is to write multiple if-blocks each of which handle a special exceptional situation. That is, to let the code execute some statements, and if something goes wrong, write the program in such a way that can detect this and jump to a set of statements that handle the erroneous situation as desired.</p>
<p>A more modern and flexible way of handling such potential errors in Python is through the following Python construction,</p>
<pre><code class="language-python">try:
<Python statements>
except <error type>:
<Python statements>
</code></pre>
<p><br />
For example, if we were to rewrite the command line argument section in <a href="http:/ECL2017S/lecture/7/cmd_find_primes.py" target="_blank">this code</a> in <a href="http:/ECL2017S/lecture/7-python-modules-loops-io#command-line-arguments" target="_blank">lecture 7</a>, to handle exceptions that arise due to <code>ValueError</code> (e.g., not an integer input), it would look something like the following,</p>
<pre><code class="language-python">if __name__ == "__main__":
import sys
if len( sys.argv ) != 2: # check the number of arguments to be exactly 2.
print('''
Error: Exactly two arguments must be given on the command line.
Usage:''')
print(" ", sys.argv[0], "<a positive integer number>", '\n')
sys.exit(' Program stopped.\n')
else:
try:
n = int(sys.argv[1])
print('Here is a list of all prime numbers smaller than {}:'.format(n))
get_primes(n)
except ValueError:
print('The input you entered is not an integer!\n Try again...')
sys.exit(1)
</code></pre>
<p><br />
The statement <code>sys.exit(1)</code> aborts the program. The whole code can be found <a href="http:/ECL2017S/lecture/8/cmd_find_primes_modern.py" target="_blank">here</a>. Now if we run the <a href="http:/ECL2017S/lecture/7/cmd_find_primes.py" target="_blank">original code</a> with a non-integer input, we would get the following Python error,</p>
<pre><code class="language-bash">$ ../7/cmd_find_primes.py amir
Traceback (most recent call last):
File "../7/cmd_find_primes.py", line 34, in <module>
n = int(sys.argv[1])
ValueError: invalid literal for int() with base 10: 'amir'
</code></pre>
<p><br />
whereas, if we run the <a href="http:/ECL2017S/lecture/8/cmd_find_primes_modern.py" target="_blank">newly written code</a>, the non-integer error is noicely handled by outputting a gentle error message to the user and exiting the program gracefully.</p>
<pre><code class="language-bash">$ ./cmd_find_primes_modern.py amir
The input you entered is not an integer!
Try again...
</code></pre>
<p><br />
The type of error occurring in the above example was <code>ValueError</code>. There can be however, many other types of errors and exceptions. For this reason, Python has a <a href="https://docs.python.org/2/library/exceptions.html" target="_blank">builtin list of exceptions</a> that frequently occur in programming.</p>
<h3 id="the-raise-statement">The <em>raise</em> statement</h3>
<p>Instead of the print statement in the above <code>except</code> block, Python has a builtin function to handle the error together with an input message from the programmer. For example, the previous code, could be modified to the following code,</p>
<pre><code class="language-python">if __name__ == "__main__":
import sys
if len( sys.argv ) != 2: # check the number of arguments to be exactly 2.
print('''
Error: Exactly two arguments must be given on the command line.
Usage:''')
print(" ", sys.argv[0], "<a positive integer number>", '\n')
sys.exit(' Program stopped.\n')
else:
try:
n = int(sys.argv[1])
print('Here is a list of all prime numbers smaller than {}:'.format(n))
get_primes(n)
except ValueError:
raise ValueError('The input you entered is not an integer!\n Try again...')
sys.exit(1)
</code></pre>
<p><br />
Executing the <a href="http:/ECL2017S/lecture/8/cmd_find_primes_raise.py" target="_blank">code</a> with wrong input would give,</p>
<pre><code class="language-bash">$ ./cmd_find_primes_raise.py amir
Traceback (most recent call last):
File "./cmd_find_primes_raise.py", line 35, in <module>
n = int(sys.argv[1])
ValueError: invalid literal for int() with base 10: 'amir'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "./cmd_find_primes_raise.py", line 39, in <module>
raise ValueError('The input you entered is not an integer!\n Try again...')
ValueError: The input you entered is not an integer!
Try again...
</code></pre>
<p><br />
A more elegant and cleaner way of handling and outputting the error would be use the following syntax, in <a href="http:/ECL2017S/lecture/8/cmd_find_primes_raise_as_err.py" target="_blank">this modified code</a>,</p>
<pre><code class="language-python">if __name__ == "__main__":
import sys
if len( sys.argv ) != 2: # check the number of arguments to be exactly 2.
print('''
Error: Exactly two arguments must be given on the command line.
Usage:''')
print(" ", sys.argv[0], "<a positive integer number>", '\n')
sys.exit(' Program stopped.\n')
else:
try:
n = int(sys.argv[1])
print('Here is a list of all prime numbers smaller than {}:'.format(n))
get_primes(n)
except ValueError as err:
print(err)
sys.exit(1)
</code></pre>
<p><br />
With the following output,</p>
<pre><code class="language-bash">$ ./cmd_find_primes_raise_as_err.py amir
invalid literal for int() with base 10: 'amir'
</code></pre>
<p><br />
In the statement <code>except ValueError as err:</code> one could use <code>Exception</code> for all types of errors instead of only <code>ValueError</code> exceptions, or use a tuple syntax such as <code>except (ValueError, IndexError) as err:</code> to cover these two exceptions.</p>
<h2 id="code-verification-and-unit-testing">Code verification and unit testing</h2>
<p>In the previous lecture we discussed the process of creating modules and collecting functions in one file as a personal module to be used later. As soon as the list of your codes and functions grow, you will need to have a unified way of ensuring all your functions work appropriately, regardless of the potential future internal changes that are made to the functions. This is what the <strong>unit testing</strong> exists for. Unit testing is a software development process in which the smallest testable parts of an application, called <strong>units</strong>, are individually and independently scrutinized for proper operation. Unit testing can be done manually, but if you have a long list of functions (which you most often have), you’d want to automate the testing process.</p>
<p>The grand goal in unit testing is to reduce the risk of encountering potential problems when running the code in the smallest possible units of the code. This means,</p>
<ol>
<li>ensuring the code has the <strong>correct behavior</strong> when given the proper input data.</li>
<li>ensuring the <strong>code robustness</strong> to exceptions and invalid input data, meaning that it does not crash when it reaches unexpected situations during the code execution and gracefully handles the error, without interruption.</li>
</ol>
<p>Because of the goals for which the unit tests are designed, they are mostly written and used by the developers of the code.</p>
<h3 id="unit-test-frameworks">Unit test frameworks</h3>
<p>There are many ways to write tests for codes. Now, if you asked each software developer to write a unit test for a specific software, each would likely come up with their own set of rules and tests of the software. You will end up with many tests, that will generally only be usable by the developer that wrote the tests. That is why you should select a unit test framework. A unit test framework provides consistency for how the unit tests for your project are written. There are many test frameworks to choose from for just about any language you want to program with, including Python. Just like programming language, almost every programmer has a strong opinion which test framework is the best. Research what’s out there and use the one that meets the needs of your organization (For example, there is one experienced Python programmer in our ECL class who does not like any of the existing unit tests for Python, and wants to write his own unit test as the project of this course!).</p>
<p>The framework will provide a consistent testing structure to create maintainable tests with reproducible results. From a product quality and business point of view, those are the most valuable reasons to use a unit test framework. When you write a code, you should also think of a quick and simple way to develop and verify your logic in isolation. Once you make sure you have it working solidly by itself, then you can proceed to integrate it into the larger project solutions with great confidence.</p>
<p>Python offers three unit testing frameworks,</p>
<ol>
<li><a href="https://docs.python.org/2/library/unittest.html" target="_blank"><strong>unittest</strong></a> (Python’s standard unit testing framework)</li>
<li><a href="http://nose.readthedocs.io/en/latest/index.html" target="_blank"><strong>nose</strong></a></li>
<li><a href="https://docs.pytest.org/en/latest/" target="_blank"><strong>pytest</strong></a></li>
</ol>
<p>which automate as much as possible the process of testing all of your codes, whenever required. The last, <code>pytest</code> appears to be the most popular unit testing framework as of today.</p>
<h4 id="conventions-for-test-functions">Conventions for test functions</h4>
<p>The simplest way of using the testing frameworks (e.g., pytest or nose) is to write a set of test functions, scattered around in files, such that pytest or nose can automatically find and run all of these test functions. To achieve the goal, the test functions need to follow certain conventions:</p>
<ol>
<li>The name of a test function starts with <code>test_</code>.</li>
<li>A test function cannot take any arguments.</li>
<li>Any test must be formulated as a boolean condition.</li>
<li>An <code>AssertionError</code> exception is raised if the boolean condition is <code>False</code> (i.e., when the test fails).<br />
<br /></li>
</ol>
<h4 id="testing-function-accuracy">Testing function accuracy</h4>
<p>Suppose we have written the following function which runs the Newton’s method for solving an algebraic equation of the form $f(x)=0$, and we would like to write a test function that ensures its correct behavior.</p>
<pre><code class="language-python">def newton(f, dfdx, x, eps=1E-7):
n = 0 # iteration counter
while abs(f(x)) > eps:
x = x - f(x)/dfdx(x)
n += 1
return x, f(x), n
</code></pre>
<p><br />
Our goal is to write a function that tests the validity of the output of the function for a special case for which we know the results a priori. In the case of the above code, the function output is a not a fixed result, but an approximate float number $x_0$ which satisfies the condition $f(x_0)<\epsilon$ where $\epsilon$ is a prescribed number close to zero. Therefore, we have to first come up with a mathematical test input function to the function <code>newton</code>, for which we have calculated the correct answer a priori, and we want to make sure if the above code gives the same answer. Since the output of the function <code>newton</code> is a float that depends on the machine precision, we cannot expect the function to output the exact same result every time the code is run on any computer. Therefore, we have to define our test such that the function passes the test even if the result is not exactly what we expect, but still close enough to the correct answer. Here is an example test function for the above code using the <code>sin(x)</code> function as the test input function to <code>newton()</code>,</p>
<pre><code class="language-python">def test_newton_sin():
from math import sin, cos, pi
def f(x):
return sin(x)
def dfdx(x):
return cos(x)
x_ref = 0.000769691024206
f_x_ref = 0.000769690948209
n_ref = 3
x, f_x, n = newton(f, dfdx, x=-pi/3, eps=1E-2)
tol = 1E-15 # tolerance for comparing real numbers
assert abs(x_ref - x) < tol , "The test for the value of x_0 failed" # is x correct?
assert abs(f_x_ref - f_x) < tol , "The test for the function value failed" # is f_x correct?
assert n == 3 , "The test for the number of iterations failed" # is f_x correct? # is n correct?
</code></pre>
<p><br />
Note that in the above test function, the function name begins with <code>test_</code>, takes no arguments, and raises an <code>assertionError</code> at the end. Now if you run the test,</p>
<pre><code class="language-python">test_newton_sin()
</code></pre>
<p><br />
you will notice that the function passed the test. However, if in the above test, we set <code>eps=1E-10</code>, and run the test again, you will get an assertion error like the following,</p>
<pre><code class="language-python">---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-20-8be9faac8d8e> in <module>()
----> 1 test_newton_sin()
<ipython-input-18-263651ba410f> in test_newton_sin()
14 x, f_x, n = newton(f, dfdx, x=-pi/3, eps=1E-10)
15 tol = 1E-15 # tolerance for comparing real numbers
---> 16 assert abs(x_ref - x) < tol , "The test for the value of x_0 failed" # is x correct?
17 assert abs(f_x_ref - f_x) < tol , "The test for the function value failed" # is f_x correct?
18 assert n == 3 , "The test for the number of iterations failed" # is f_x correct? # is n correct?
AssertionError: The test for the value of x_0 failed
</code></pre>
<p><br />
One could also write exact tests for the function <code>newton</code> which test for an exact result which is known a priori, for example a mathematical linear input function to <code>newton</code>.<br />
<br /></p>
<h4 id="testing-function-robustness">Testing function robustness</h4>
<p>The above <code>newton</code> function is very basic and suffers from several problems:</p>
<ul>
<li>for divergent iterations it will iterate forever,</li>
<li>it can divide by zero in f(x)/dfdx(x),</li>
<li>it can perform integer division in f(x)/dfdx(x),</li>
<li>it does not test whether the arguments have acceptable types and values.</li>
</ul>
<p>A more robust implementation dealing with these potential problems would look like the following:</p>
<pre><code class="language-python">def Newton(f, dfdx, x, eps=1E-7, maxit=100):
if not callable(f): raise TypeError( 'f is %s, should be function or class with __call__' % type(f) )
if not callable(dfdx): raise TypeError( 'dfdx is %s, should be function or class with __call__' % type(dfdx) )
if not isinstance(maxit, int): raise TypeError( 'maxit is %s, must be int' % type(maxit) )
if maxit <= 0: raise ValueError( 'maxit=%d <= 0, must be > 0' % maxit )
n = 0 # iteration counter
while abs(f(x)) > eps and n < maxit:
try:
x = x - f(x)/float(dfdx(x))
except ZeroDivisionError:
raise ZeroDivisionError( 'dfdx(%g)=%g - cannot divide by zero' % (x, dfdx(x)) )
n += 1
return x, f(x), n
</code></pre>
<p><br />
Now, for this more robust code (than the earlier version: <code>newton</code>), we have to also write a set of tests, examining the robustness of the code, subject to potential exceptions. For example, one can write a test function that examines the behavior of <code>Newton</code> subject to an input mathematical function that is known to lead to divergent (infinite) iterations, if the initial starting point $x$ is not sufficiently close to the root of the function. One such example is $f(x)=tanh(x)$, for which a starting search value of $x=20$ would lead to infinite iterations in the Newton’s method. So we can set <code>maxit=12</code> in our robust <code>Newton</code> code, and test that the actual number of iterations reaches this limit. Given our prior knowledge for this function, that the value of $x$ will also diverge after 12 iterations, we could also add a test for the value of $x$, like the following,</p>
<pre><code class="language-python">def test_Newton_divergence():
from math import tanh
f = tanh
dfdx = lambda x: 10./(1 + x**2)
x, f_x, n = Newton(f, dfdx, 20, eps=1E-4, maxit=12)
assert n == 12
assert x > 1E+50
</code></pre>
<pre><code class="language-python">test_Newton_divergence()
</code></pre>
<p><br />
The example given here, only tests for the robustness of <code>Newton()</code> in handling divergent situations. For other potential problems, one has to write other test functions, some which will be given as exercise.</p>
<h3 id="summary-unit-testing">Summary: unit testing</h3>
<p>Unit testing is a component of <a href="https://en.wikipedia.org/wiki/Test-driven_development" target="_blank">test-driven development (TDD)</a>, a pragmatic methodology that takes a meticulous approach to building a product by means of <em>continual testing and revision</em>.</p>
<p>Unit testing has a steep learning curve. The development team needs to learn what unit testing is, how to unit test, what to unit test and how to use automated software tools to facilitate the process on an on-going basis. The great benefit to unit testing is that the earlier a problem is identified, the fewer compound errors occur. A compound error is one that doesn’t seem to break anything at first, but eventually conflicts with something down the line and results in a problem.</p>
<p>There is a lot more to unit testing and the existing Python frameworks for it than we discussed here. However, covering all those topics would require a dedicated course for unit testing, which is certainly beyond the capacity of this course. But if you are interested to know more, I recommend you to refer to one of the three unit testing frameworks mentioned <a href="#unit-test-frameworks">above</a>. There are also books already written on this topic an example of which is available <a href="http://chimera.labs.oreilly.com/books/1234000000754/pr01.html" target="_blank">here</a>.</p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/lecture/8-python-io-error-handling-unit-testing">Lecture 8: Python - I/O, error handling, and testing frameworks</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 05, 2017.</p><![CDATA[Homework 7: Problems - Python I/O, error handling, and unit testing]]>http:/ECL2017S/homework/7-problems-python-IO-error-handling-unit-testing2017-04-05T00:00:00-05:002017-04-05T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This homework aims at giving you some experience with Python I/O, error handling in your code, and testing you code for accuracy and robustness.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>1. </strong> Write a simple program named <code>sum.py</code>, that takes in an arbitrary-size list of input floats from the command-line, and prints out the sum of them on the terminal with the following message,</p>
<pre><code class="language-bash">$ python sum.py 1 2 1 23
The sum of 1 2 1 23 is 27.0
</code></pre>
<p><br />
Note that you will need to use the Python’s builtin function <code>sum()</code>.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>2. </strong> Similar to the previous probelm, write a simple program named <code>sum_via_eval.py</code>, that takes in an arbitrary-size list of input numbers from the command-line, and prints out the sum of them on the terminal, this time using Python’s <code>eval</code> function. The program output should look like the following,</p>
<pre><code class="language-bash">$ python sum.py 1 2 1 23
The sum of 1 2 1 23 is 27
</code></pre>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>3. </strong> Consider <a href="http:/ECL2017S/homework/7/1A2T_A.dssp" target="_blank">this data file</a>. It contains information about the amino acids in <a href="http://www.rcsb.org/pdb/explore.do?structureId=1a2t" target="_blank">a protein</a> called <code>1A2T</code>. Each amino acid in protein is labeled by a single letter. There are 20 amin acid molecules in nature, and each has a total surface area (in units of Angstroms squared) that is given by the following table,</p>
<pre><code>'A': 129.0
'R': 274.0
'N': 195.0
'D': 193.0
'C': 167.0
'Q': 225.0
'E': 223.0
'G': 104.0
'H': 224.0
'I': 197.0
'L': 201.0
'K': 236.0
'M': 224.0
'F': 240.0
'P': 159.0
'S': 155.0
'T': 172.0
'W': 285.0
'Y': 263.0
'V': 174.0
</code></pre>
<p>However, when these amino acids sit next to each other to form a chain protein, they cover parts of each other, such that only parts of their surfaces is exposed, while the rest is hidden from the outside world by other neighboring amino acids. Therefore, one would expect an amino acid that is at the core of a spherical protein would have almost zero exposed surface area.</p>
<p>Now given the above information, write a Python program that takes in two command-line input arguments, one of which is a string containing the path to the above <a href="http:/ECL2017S/homework/7/1A2T_A.dssp" target="_blank">input file</a> <code>1A2T_A.dssp</code> which contains the partially exposed surface areas of amino acids in protein <code>1A2T</code> for each of its amino acids, and a second command-line argument which is the path to the file containing output of the code (e.g., it could be <code>./readDSSP.out</code>). Then,</p>
<ol>
<li>the code reads the content of this file, and<br />
<br /></li>
<li>extracts the names of the amino acids in this protein from the data column inside the file which has the header <code>AA</code> (look at the line number 25 inside the input data file, below <code>AA</code> is the column containing the one-letter names of amino acids in this protein), and<br />
<br /></li>
<li>also extracts the partially exposed surface area information for each of these amino acids which appear in the column with header <code>ACC</code>, and<br />
<br /></li>
<li>then uses the above table of maximum surface area values to calculate the fractional exposed surface area of each amino acid in this protein (i.e., for each amino acid, fraction_of_exposed_surface = ACC / maximum_surface_area_from_table), and<br />
<br /></li>
<li>finally for each amino acid in this protein, it prints the one-letter name of the amino acid, its corresponding partially exposed surface area (ACC from the input file), and its corresponding fractional exposed surface area (name it RSA) to the output file given by the user on the command line.<br />
<br /></li>
<li>On the first column of the output file, the code should also write the name of the protein (which is basically the name of the input file <code>1A2T_A</code>) on each line of the output file. <strong>Note that your code should extract the protein name from the input filename</strong> (by removing the file extension and other unnecessary information from the input command line string). <a href="http:/ECL2017S/homework/7/readDSSP.out" target="_blank">Here</a> is an example output of the code.<br />
<br /></li>
<li>Your code should also be able to handle an error resulting from less or more than 2 input command line arguments. That is, if the number of input arguments is 3 or 1, then it should input the following message on screen and stop.</li>
</ol>
<pre><code class="language-bash">$ ./readDSSP.py ./1A2T_A.dssp
Usage:
./readDSSP.py <input dssp file> <output summary file>
Program aborted.
</code></pre>
<p><br />
or,</p>
<pre><code class="language-bash">$ ./readDSSP.py ./1A2T_A.dssp ./readDSSP.out amir
Usage:
./readDSSP.py <input dssp file> <output summary file>
Program aborted.
</code></pre>
<p><br />
To achieve the above goal, you will have to create a dictionary from the above table, with amino acid names as the keys, and the maximum surface areas as the corresponding values. Name your code <code>readDSSP.py</code> and submit it to your repository.</p>
<p><strong>Write your code in such a way that it checks for the existence of the output file</strong>. If it already exists, then it does not remove the content of the file, whereas, it appends new data to the existing file. therwise, if the file does not exist, then it creates a new output file as requested by the user. To do so, you will need to use <code>os.path.isfile</code> function from module <code>os</code>.</p>
<p><strong>ATTENTION</strong>: Note that in some rows instead of a one-letter amino acid name, there is <code>!</code>. In such cases, your code should be able to detect the abnormality and skip that row, because that row does not contain amino acid information.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>4. </strong> Consider the simplest program for evaluating the formula $y(t) = v_0t-\frac{1}{2}gt^2$,</p>
<pre><code class="language-python">v0 = 3; g = 9.81; t = 0.6
y = v0*t - 0.5*g*t**2
print(y)
</code></pre>
<p><br />
(A) Write a program that takes in the above necessary input data ($t$,$v_0$) as command line arguments.<br />
<br />
(B) Extend your program from part (A) with exception handling such that missing command-line arguments are detected. For example, if the user has entered enough input arguments, then the code should raise <code>IndexError</code> exception. In the <code>except IndexError</code> block, the code should use the <code>input</code> function to ask the user for the missing input data.<br />
<br />
(C) Add another exception handling block that tests if the $t$ value read from the command line, lies between $0$ and $2v_0/g$. If not, then it raises a <code>ValueError</code> exception in the if block on the legal values of $t$, and notifes the user about the legal interval for $t$ in the exception message.</p>
<p>Here are some example runs of the code,</p>
<pre><code class="language-bash">$ ./projectile.py
Both v0 and t must be supplied on the command line
v0 = ?
5
t = ?
4
Traceback (most recent call last):
File "./projectile.py", line 17, in <module>
'must be between 0 and 2v0/g = {}'.format(t,2.0*v0/g))
ValueError: t = 4.0 is a non-physical value.
must be between 0 and 2v0/g = 1.019367991845056
</code></pre>
<p><br /></p>
<pre><code class="language-bash">$ ./projectile.py
Both v0 and t must be supplied on the command line
v0 = ?
5
t = ?
0.5
y = 1.27375
</code></pre>
<p><br /></p>
<pre><code class="language-bash">$ ./projectile.py 5 0.4
y = 1.2151999999999998
</code></pre>
<p><br /></p>
<pre><code class="language-bash">$ ./projectile.py 5 0.4 3
y = 1.2151999999999998
</code></pre>
<p><br /></p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>5. </strong> Consider the function <code>Newton</code> that we discussed in <a href="http:/ECL2017S/lecture/8-python-io-error-handling-unit-testing" target="_blank">lecture 8</a>,</p>
<pre><code class="language-python">def Newton(f, dfdx, x, eps=1E-7, maxit=100):
if not callable(f): raise TypeError( 'f is %s, should be function or class with __call__' % type(f) )
if not callable(dfdx): raise TypeError( 'dfdx is %s, should be function or class with __call__' % type(dfdx) )
if not isinstance(maxit, int): raise TypeError( 'maxit is %s, must be int' % type(maxit) )
if maxit <= 0: raise ValueError( 'maxit=%d <= 0, must be > 0' % maxit )
n = 0 # iteration counter
while abs(f(x)) > eps and n < maxit:
try:
x = x - f(x)/float(dfdx(x))
except ZeroDivisionError:
raise ZeroDivisionError( 'dfdx(%g)=%g - cannot divide by zero' % (x, dfdx(x)) )
n += 1
return x, f(x), n
</code></pre>
<p><br />
This function is supposed to be able to handle exceptions such as divergent iterations (which we discussed in the lecture), and division-by-zero. The latter error happens when <code>dfdx(x)=0</code> in the above code. Write a test code that ensures the above code is able to correctly identify a division-by-zero exception and raise the correct assertionError.<br />
(<em>Hint: To do so, you need to consider a test mathematical function as input to <code>Newton</code>. One example could be $f(x)=\cos(x)$ with a starting search value $x=0$. This would result in derivative value $f’(x=0)=-\sin(x=0)=0$, which should lead to a <code>ZeroDivisionError</code> exception. Now, write a test function <code>test_Newton_div_by_zero</code> that can explicitly handle this exception by introducing a boolean variable <code>success</code> that is <code>True</code> if the exception is raised and otherwise <code>False</code></em>.)</p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/homework/7-problems-python-IO-error-handling-unit-testing">Homework 7: Problems - Python I/O, error handling, and unit testing</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 05, 2017.</p><![CDATA[Exam final: semester project]]>http:/ECL2017S/exam/2-semester-project2017-04-05T00:00:00-05:002017-04-05T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This is page describes the course project that will serve as the final exam for this course. Please submit all your efforts for this project (all files and data and results) in <code>ECL2017S/exams/final/</code> directory in your private repository for this course. Don’t forget to push your answers to your remote Github repository by the end of the semester.</p>
<p>Inside the directory for the project (<code>ECL2017S/exams/final/</code>) create three other folders: <code>data</code>, <code>src</code>, <code>results</code>. The <code>data</code> folder contains the <a href="http:/ECL2017S/exam/2/cells.mat" target="_blank">input data</a> for this project. The <code>src</code> folder should contain all your codes that you write for this project, and the <code>results</code> folder should contain all the results generated by your code.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p>Our goal in this project is to fit a mathemtical model of the growth to living cells to data for the growth of a tumor mass in the brain of a rat. You can download the MATLAB data file for this project from <a href="http:/ECL2017S/exam/2/cells.mat" target="_blank">here</a>. Write a Python code, set of separate codes that performs the following tasks one after the other, and output all the results to the <code>results</code> folder described above. If you have multiple Python codes each in a separate file, then write a <code>main.py</code> Python code, such that if the user runs</p>
<pre><code class="language-bash">./main.py ../data/cells.mat ../results/
</code></pre>
<p><br />
then all the necessary Python codes to generate all the results will be called by this <code>main.py</code> code. The first command line argument to this code is the path to the input <a href="http:/ECL2017S/exam/2/cells.mat" target="_blank">MATLAB data file</a> containing data for this project, and the second command line tells the code where to write all the output and results of the project.</p>
<h3 id="data-structure-of-the-input-matlab-file">Data structure of the input MATLAB file</h3>
<p>The input file, is a 4-dimensional double-precision MATLAB matrix <code>cells(:,:,:,:)</code>, corresponding to dimensions <code>cells(y,x,z,time)</code>. This data is collected from MRI imaging of the rat’s brain almost every other day for a period of two weeks. For example, <code>cells(:,:,:,1)</code> contains the number of cells at each point in space (y,x,z) at the first time point, or, <code>cells(:,:,10,1)</code> represents a (XY) slice of MRI at $z=1$ and $t=1$.</p>
<p>Now write a set of Python codes that perform the following tasks.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<h3 id="data-reduction-and-visualization">Data reduction and visualization</h3>
<p><br /></p>
<p><strong>1. </strong> First write a code that reads the input MATLAB file and converts the data to a 4-D NumPy array.</p>
<p><strong>2. </strong> Write Python codes that generate figures as similar as possible to the following figures (specific color-codes of the curves and figures do not matter, focus more on the format of the plots and its parts).</p>
<figure>
<img src="http:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_1_t10.0.png" width="900" />
</figure>
<p><br /></p>
<figure>
<img src="http:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_2_t12.0.png" width="900" />
</figure>
<p><br /></p>
<figure>
<img src="http:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_3_t14.0.png" width="900" />
</figure>
<p><br /></p>
<figure>
<img src="http:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_5_t16.0.png" width="900" />
</figure>
<p><br /></p>
<figure>
<img src="http:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_6_t18.0.png" width="900" />
</figure>
<p><br /></p>
<figure>
<img src="http:/ECL2017S/exam/2/figures/tvccZSliceSubplotWithXYlab_rad_00gy_7_t20.0.png" width="900" />
</figure>
<p><br /></p>
<p>and finally, a plot that shows the time evolution of the total number of tumor cells at all time points available in the input data. The time points are $T=[10, 12, 14, 15, 16, 18, 20]$ in units of days.</p>
<figure>
<img src="http:/ECL2017S/exam/2/figures/growthCurve_CellCount_rad_00gy.png" width="900" />
</figure>
<p><br /></p>
<p><strong>Note</strong>. There is no need to regenerate the error bars in the plot above for this project.</p>
<h3 id="the-mathematical-model-of-tumor-growth">The mathematical model of tumor growth</h3>
<p><br /></p>
<p><strong>3. </strong> Now our goal is to fit the time evolution of the growth of this tumor, using a mathematical model, and use the maximum likelihood approach and Markov Chain Monte Carlo Technique to find the best-fit parameters of the model. The model we use is called the <a href="https://en.wikipedia.org/wiki/Gompertz_function" target="_blank">Gompertizan growth model</a>,</p>
<script type="math/tex; mode=display">N(t,a,b,c) = a\exp\big( -b\exp(-ct) \big) ~,</script>
<p>where $N(t)$ is the number of tumor cells at time $t$, and $a$, $b$, and $c$ are the parameters that we would like to find their best values given the input tumor cell data. This Gompertzian growth model is our <strong>physical model</strong> for this problem.</p>
<h4 id="combining-the-mathematical-model-with-a-regression-model">Combining the mathematical model with a regression model</h4>
<p>Now, if the model was ideally perfect in describing the data, the curve of the model predicion would pass through all the points in the growth curve plot in the above, providing a prefect description of data. This is however, never the case, as it is famously said <strong>all models are wrong, but some are useful</strong>. In other words, the model prediction never matches observation perfectly. Therefore, we have to seek for the parameter values of the model that can get us as close as possible to data. To do so, we define a <strong>statistical model</strong> in addition to the <strong>physical model</strong> described above. In other words, we have to define a statistical regression model (the renowned least-squares method) that gives us the probability $\pi(N_{obs}|N(t))$ of observing individual data points at each of the given times,</p>
<script type="math/tex; mode=display">\pi(N_{obs}|N(t|a,b,c),\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\bigg( - \frac{ \big[ N_{obs}(t)-N(t|a,b,c) \big]^2}{2\sigma^2} \bigg) ~,</script>
<p>Note that our statistical model given above is a Gaussian probability density function, with its mean parameter represented by the output of our physical model, $N(t|a,b,c)$, and its standard deviation represented by $\sigma$, which is unknown, and we seek to extremize it.</p>
<p>We have seven data points, so the overall probability of observing all of data $\mathcal{D}$ together given the parameters of the model, $\mathcal{L}(\mathcal{D}|a,b,c,\sigma)$, is the product of their invidiual probabilities of observations given by the above equation,</p>
<script type="math/tex; mode=display">\mathcal{L}(\mathcal{D}|a,b,c,\sigma) = \prod_{i=1}^{n=7} \pi(N_{obs}(t_i)|N(t_i|a,b,c),\sigma) = \prod_{i=1}^{n=7} \frac{1}{\sigma\sqrt{2\pi}} \exp\bigg( - \frac{ \big[ N_{obs}(t_i)-N(t_i|a,b,c) \big]^2}{2\sigma^2} \bigg) ~,</script>
<p>More often, you would want to work with $\log\mathcal{L}$ instead of $\mathcal{L}$, so the above equation becomes,</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
\log\mathcal{L}(\mathcal{D}|a,b,c,\sigma)
&= \sum_{i=1}^{n=7} \log\pi(N_{obs}(t_i)|N(t_i|a,b,c),\sigma) \\\\
&= -\frac{n}{2}\bigg( \ln(2\pi) + \ln\sigma^2 \bigg) - \frac{1}{2\sigma^2} \sum_{i=1}^{n=7} \bigg[ N_{obs}(t_i)-N(t_i) \bigg]^2 ~,
\end{align*} %]]></script>
<p><strong>4. </strong> Now the goal is to use an optimization algorithm, such as Markov Chain Monte Carlo available in Python via <a href="https://pymc-devs.github.io/pymc/README.html" target="_blank">PyMc package</a>, to find the most likely set of parameters of the model $a,b,c,\sigma$ that give the best prediction of the available data. Use the pymc package, or any other method you wish to obtein the best parameters, then redraw the above tumor evolution curve and show the result from the model as well. You can also alternatively use my own package for MCMC sampling, if which case, please inform me and I will instruct you how to use it.</p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/exam/2-semester-project">Exam final: semester project</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on April 05, 2017.</p><![CDATA[Lecture 7: Python - modules, loops, and I/O]]>http:/ECL2017S/lecture/7-python-modules-loops-io2017-03-29T00:00:00-05:002017-03-29T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This lecture explains modules, loops, with a brief introduction to Input/Output processes in Python. Ideally, modules should have been part of the previous lecture (with Python functions). The split was however necessary to reduce the size of lecture 6 to a manageable size.</p>
<div class="post_toc"></div>
<h2 id="python-modules">Python modules</h2>
<p>We have already used Python modules extensively in the past lectures, homework, and quizzes. although we never discussed them. To put it simply, Python modules are a collection of Python definitions, variables, functions, … that can be reused as a library in future.</p>
<p>Sometimes you want to reuse a function from an old program in a new program. The simplest way to do this is to copy and paste the old source code into the new program. However, this is not good programming practice, because you then over
time end up with multiple identical versions of the same function. When you want to improve the function or correct a bug, you need to remember to do the same update in all files with a copy of the function, and in real life most programmers fail to do so. You easily end up with a mess of different versions with different quality of basically the same code. Therefore, a golden rule of programming is to have one and only one version of a piece of code. All programs that want to use this piece of code must access one and only one place where the source code is kept. This principle is easy to implement if we create a module containing the code we want to reuse later in different programs.</p>
<h3 id="the-import-statement">The import statement</h3>
<p>We have already used the <code>math</code> module on multiple occasions, using the <code>import</code> statement. Here is an example:</p>
<pre><code class="language-python">In [11]: import math
In [12]: value = math.factorial(5)
In [13]: print(value)
120
In [14]: math.pi
Out[14]: 3.141592653589793
In [15]: math.e
Out[15]: 2.718281828459045
</code></pre>
<p><br />
In its simplest form, the import has the following syntax:</p>
<pre><code class="language-python">import module1[, module2[,... moduleN]
</code></pre>
<p><br />
like,</p>
<pre><code class="language-python">import math, cmath, numpy
</code></pre>
<p><br />
The standard approach for calling the names and definitions (variables, functions, …) inside the module is using the module-name prefix, like the above examples. To call the module names without the prefix, use the following module import statement,</p>
<pre><code class="language-python">In [16]: from math import *
In [17]: factorial(5)
Out[17]: 120
</code></pre>
<p><br />
To import only specific names, use the format like the following,</p>
<pre><code class="language-python">from math import pi,e,factorial,erf
</code></pre>
<p><br /></p>
<p>This will import the four math modules names <code>pi,e,factorial,erf</code>. You could also change the name of the input module, or specific names from it, upon importing the module into your code, using <code>import as</code> statement,</p>
<pre><code class="language-python">In [16]: import numpy as np
In [17]: np.double(5)
Out[17]: 5.0
In [20]: from numpy import double as dble
In [21]: dble(13)
Out[21]: 13.0
</code></pre>
<p><br /></p>
<blockquote>
A module can contain executable statements as well as function definitions. These statements are intended to initialize the module. They are executed <b>only the first time the module name is encountered in an import statement</b>. <br /><br />
Also, note that in general the practice of <code>from mod_name import *</code> from a module is discouraged, since it often causes poorly readable code. It is however very useful for saving time and exra typing in interactive sessions like IPython, or Jupyter.
</blockquote>
<p><br /></p>
<h3 id="listing-all-names-in-an-imported-module">Listing all names in an imported module</h3>
<p>To get a list of all available names in an imported module, use <code>dir()</code> function.</p>
<pre><code class="language-python">In [11]: import math
In [13]: dir(math)
Out[13]:
['__doc__',
'__loader__',
'__name__',
'__package__',
'__spec__',
'acos',
'acosh',
'asin',
'asinh',
'atan',
'atan2',
'atanh',
'ceil',
'copysign',
'cos',
'cosh',
'degrees',
'e',
'erf',
'erfc',
'exp',
'expm1',
'fabs',
'factorial',
'floor',
'fmod',
'frexp',
'fsum',
'gamma',
'gcd',
'hypot',
'inf',
'isclose',
'isfinite',
'isinf',
'isnan',
'ldexp',
'lgamma',
'log',
'log10',
'log1p',
'log2',
'modf',
'nan',
'pi',
'pow',
'radians',
'sin',
'sinh',
'sqrt',
'tan',
'tanh',
'trunc']
</code></pre>
<p><br /></p>
<h3 id="python-standard-modules">Python standard Modules</h3>
<p>Python comes with a set of standard modules as its library, the so-called <a href="https://docs.python.org/3/library/" target="_blank"><strong>Python Standard Library</strong></a>. Some of these modules are built into the Python interpreter; these provide access to operations that are not part of the core of the language but are nevertheless built in, for efficiency and other reasons.</p>
<h3 id="creating-modules">Creating modules</h3>
<p>To make a Python module, simply collect all the functions that constitute the module in one single file with a given filename, for example, <code>mymodule.py</code>. This file will be automatically a module, with name <code>mymodule</code>, from which you can import functions and definitions in the standard way described above.</p>
<blockquote>
<b>Why and when do you need to create a module?</b><br /><br />
Sometimes you want to reuse a function from an old program in a new program. The simplest way to do this is to copy and paste the old source code into the new program. However, this is not good programming practice, because you then over time end up with multiple identical versions of the same function. When you want to improve the function or correct a bug, you need to remember to do the same update in all files with a copy of the function, and in real life most programmers fail to do so. You easily end up with a mess of different versions with different quality of basically the same code. Therefore, a golden rule of programming is to have one and only one version of a piece of code. All programs that want to use this piece of code must access one and only one place where the source code is kept. This principle is easy to implement if we create a module containing the code we want to reuse later in different programs.
</blockquote>
<p>Note that modules can import other modules. It is customary but not required to place all import statements at the beginning of a module (or script, for that matter). The imported module names are placed in the importing module’s global <a href="https://en.wikipedia.org/wiki/Symbol_table" target="_blank">symbol table</a>.</p>
<h4 id="executing-modules-as-scripts">Executing modules as scripts</h4>
<p>When a Python module is called from the Bash command prompt like,</p>
<pre><code class="language-bash">python mycode.py
</code></pre>
<p><br />
the code in the module will be executed, just as if you imported it inside another code. This is good, but can sometimnes become problematic. Let’s explain this with an example from the midterm exam, a <a href="http:/ECL2017S/lecture/7/find_primes.py" target="_blank">script</a> that finds and reports all prime numbers smaller than a given input number $n$.</p>
<p>When you execute this code as astandalone Python script, it will ask you for an integer, to give you all integers that are smaller than the input number. Now suppose you wanted to import this script as a Python module into your code. If you do so, the Python interpreter would run all statements in this script and asks you to input an integer, before importing the rest of the functions in this script.</p>
<pre><code class="language-python">In [5]: import find_primes
Enter an integer number:
n = 13
Here is a list of all prime numbers smaller than 13:
13
11
7
5
3
2
</code></pre>
<p><br />
This may not be necessarily what we want to do. For example, we may only want to use the functions <code>get_primes</code> and <code>is_prime</code> in this script, without asking the user to input an integer and finding all smaller primes. The solution is to put the part of the code in the script that we don’t want to be executed as module, that is,</p>
<pre><code class="language-python">print('Enter an integer number: ')
n = int(input('n = '))
print('\n Here is a list of all prime numbers smaller than {}:'.format(n))
get_primes(n)
</code></pre>
<p><br />
inside the following if-block,</p>
<pre><code class="language-python">if __name__ == "__main__":
print('Enter an integer number: ')
n = int(input('n = '))
print('Here is a list of all prime numbers smaller than {}:'.format(n))
get_primes(n)
</code></pre>
<p><br />
When the code is run as a standalone script, the <code>__name__</code> property of the code is set to <code>__main__</code>. However, when the script is imported as a module inside another code, the <code>__name__</code> property is automatically set to the name of the module <code>find_primes</code>. Thus as a module, the above if-block will not be executed, but the rest of the code (the two functions) will be properly imported. The corrected script is named <code>mod_find_primes.py</code> and can be downloaded from <a href="http:/ECL2017S/lecture/7/mod_find_primes.py" target="_blank">here</a>.</p>
<pre><code class="language-Python">In [6]: import mod_find_primes
In [7]: mod_find_primes.__name__
Out[7]: 'mod_find_primes'
</code></pre>
<p><br />
You could also import specific names or funcitons from your own module, for example</p>
<pre><code class="language-Python">In [11]: from mod_find_primes import is_prime
</code></pre>
<p><br />
In summary,</p>
<blockquote>
<b>Add test blocks in your modules</b><br /><br />
It is recommended to only have functions and not any statements outside functions in a module. The reason is that the module file is executed from top to bottom during the import. With function definitions only in the module file, and no main program, there will be no calculations or output from the import, just definitions of functions. But in case you need to write a module that can be run standalone, then put all script statements for the standalone part of the module inside a <b>test block</b> (the if-block described above).
</blockquote>
<p><br /></p>
<h4 id="command-line-arguments">Command line arguments</h4>
<p>Test blocks are especially useful when your module can be also run as a standalone Python script that takes in <strong>command-line arguments</strong>. <a href="http:/ECL2017S/lecture/7/cmd_find_primes.py" target="_blank">Here</a> is a modified version of the <code>mod_find_primes</code> module now named <code>cmd_find_primes</code> that instead of using <code>input()</code> function, reads the integer number from the Bash command line. To do so, you need to modify the last part of the original module to the following, using Python’s standard <code>sys</code> module,</p>
<pre><code class="language-python">if __name__ == "__main__":
import sys
if len( sys.argv ) != 2: # check the number of arguments to be exactly 2.
print('''
Error: Exactly two arguments must be given on the command line.
Usage:''')
print(" ", sys.argv[0], "<a positive integer number>", '\n')
sys.exit(' Program stopped.\n')
else:
n = int(sys.argv[1])
print('Here is a list of all prime numbers smaller than {}:'.format(n))
get_primes(n)
</code></pre>
<p><br />
Now if you run this code, from the Bash command line, or inside IPython, like the following,</p>
<pre><code class="language-python">In [14]: run cmd_find_primes.py
Error: Exactly two arguments must be given on the command line.
Usage:
cmd_find_primes.py <a positive integer number>
An exception has occurred, use %tb to see the full traceback.
SystemExit: Program stopped.
</code></pre>
<p><br />
The code will expect you to enter an integer right after the nbame of the script,</p>
<pre><code class="language-python">In [15]: run cmd_find_primes.py 13
Here is a list of all prime numbers smaller than 13:
13
11
7
5
3
2
</code></pre>
<p><br />
In general, I recommend you to use the <code>sys</code> module for input arguments instead of Python’s <code>input()</code> function.</p>
<blockquote>
<b>Modules and main functions</b><br /><br />
If you have some functions and a main program in some program file, just move the main program to the test block. Then the file can act as a module, giving access to all the functions in other files, or the file can be executed from the command line, in the same way as the original program.
</blockquote>
<p><br /></p>
<h4 id="test-blocks-for-module-code-verification">Test blocks for module code verification</h4>
<p>It is a good programming habit to let the test block do one or more of three things:</p>
<ol>
<li>provide information on how the module or program is used,</li>
<li>test if the module functions work properly,</li>
<li>offer interaction with users such that the module file can be applied as a useful program.</li>
</ol>
<p>To achieve the second task, we have to write functions that verify the implementation in a module. The general advice is to write test functions that,</p>
<ol>
<li>have names starting with <code>test_</code>,</li>
<li>express the success or failure of a test through a boolean variable, say <code>success</code>,</li>
<li>run <code>assert success, msg</code> to raise an <code>AssertionError</code> with an optional message <code>msg</code> in case the test fails.</li>
</ol>
<p>We talk about this later on in this course.</p>
<blockquote>
<b>Doc-strings in modules</b><br /><br />
It is a good habit to include a doc-string in the beginning of your module file. This doc string should explain the purpose and use of the module.
</blockquote>
<p><br /></p>
<h4 id="scope-of-definitions-in-your-module">Scope of definitions in your module</h4>
<p>Once you have created your module, you can import it just like any other module into our program, for example,</p>
<pre><code class="language-python">In [22]: import cmd_find_primes
In [23]: dir(cmd_find_primes)
Out[23]:
['__builtins__',
'__cached__',
'__doc__',
'__file__',
'__loader__',
'__name__',
'__package__',
'__spec__',
'get_primes',
'is_prime']
</code></pre>
<p><br />
However, more often than not, you may want to have variables in your module, that are only to be used inside the module and not be accessed by the user. The convention is to start the names of these variables by an underscore. For example,</p>
<pre><code class="language-python">_course = "Python programming"
</code></pre>
<p><br />
This however, does not prevent the import of the variable <code>_course</code> into your code from your the <a href="http:/ECL2017S/lecture/7/mod_cmd_find_primes_del.py" target="_blank">module</a> containing it. One solution is to delete the variables that we are not interested the user to have access to, at the end of the module,</p>
<pre><code class="language-python">del _course
</code></pre>
<p><br />
such that the <a href="" target="_blank">module</a> containing the above statement will give,</p>
<pre><code class="language-python">In [28]: import mod_cmd_find_primes_del
In [29]: dir( mod_cmd_find_primes_del )
Out[29]:
['__builtins__',
'__cached__',
'__doc__',
'__file__',
'__loader__',
'__name__',
'__package__',
'__spec__',
'get_primes',
'is_prime']
</code></pre>
<p><br /></p>
<p>However, note that if you import all definitions in <a href="http:/ECL2017S/lecture/7/mod_cmd_find_primes_all.py" target="_blank">your module</a> as standalone definitions like the following,</p>
<pre><code class="language-python">In [4]: from mod_cmd_find_primes_all import *
In [5]: dir()
Out[5]:
['In',
'Out',
'_',
'_3',
'__',
'___',
'__builtin__',
'__builtins__',
'__doc__',
'__loader__',
'__name__',
'__package__',
'__spec__',
'_dh',
'_i',
'_i1',
'_i2',
'_i3',
'_i4',
'_i5',
'_ih',
'_ii',
'_iii',
'_oh',
'_sh',
'exit',
'get_ipython',
'get_primes',
'is_prime',
'quit']
</code></pre>
<p><br />
you see that the variable <code>_course</code> is not imported. In general, to avoid confusion, it is best to define an <code>__all__</code> variable in your module, which contains a list of all variable and function names that are to be imported as standalone definitions using <code>from mymodule import *</code>. For example, add the following to the above module,</p>
<pre><code class="language-python">__all__ = ['get_primes']
</code></pre>
<p><br />
Upong importing this module, now only the function <code>get_prime</code> will be imported and not <code>_course</code> or <code>is_prime</code>.</p>
<h4 id="the-path-to-your-modules">The path to your modules</h4>
<p>When you create a module, if it is in the current directory of your code, then it will be automatcally found by the Python interpreter. This is however, not generally the case if your module lives in another directory than the current working directory of Python interpreter. To add the module’s directory to the path of your Python interpreter, use the following,</p>
<pre><code class="language-python">In [5]: myModuleFolder = ’the path to your module’
In [6]: import sys
In [7]: sys.path
Out[7]:
['',
'C:\\Program Files\\Anaconda3\\Scripts',
'C:\\Program Files\\Anaconda3\\lib\\site-packages\\lmfit-0.9.5_44_gb2041c3-py3.5.egg',
'C:\\Program Files\\Anaconda3\\python35.zip',
'C:\\Program Files\\Anaconda3\\DLLs',
'C:\\Program Files\\Anaconda3\\lib',
'C:\\Program Files\\Anaconda3',
'c:\\program files\\anaconda3\\lib\\site-packages\\setuptools-20.3-py3.5.egg',
'C:\\Program Files\\Anaconda3\\lib\\site-packages',
'C:\\Program Files\\Anaconda3\\lib\\site-packages\\Sphinx-1.3.5-py3.5.egg',
'C:\\Program Files\\Anaconda3\\lib\\site-packages\\win32',
'C:\\Program Files\\Anaconda3\\lib\\site-packages\\win32\\lib',
'C:\\Program Files\\Anaconda3\\lib\\site-packages\\Pythonwin',
'C:\\Program Files\\Anaconda3\\lib\\site-packages\\IPython\\extensions',
'C:\\Users\\Amir\\.ipython']
In [8]: sys.path.insert(0,myModuleFolder)
In [9]: sys.path
Out[9]:
[’the path to your module’,
'',
'C:\\Program Files\\Anaconda3\\Scripts',
'C:\\Program Files\\Anaconda3\\lib\\site-packages\\lmfit-0.9.5_44_gb2041c3-py3.5.egg',
'C:\\Program Files\\Anaconda3\\python35.zip',
'C:\\Program Files\\Anaconda3\\DLLs',
'C:\\Program Files\\Anaconda3\\lib',
'C:\\Program Files\\Anaconda3',
'c:\\program files\\anaconda3\\lib\\site-packages\\setuptools-20.3-py3.5.egg',
'C:\\Program Files\\Anaconda3\\lib\\site-packages',
'C:\\Program Files\\Anaconda3\\lib\\site-packages\\Sphinx-1.3.5-py3.5.egg',
'C:\\Program Files\\Anaconda3\\lib\\site-packages\\win32',
'C:\\Program Files\\Anaconda3\\lib\\site-packages\\win32\\lib',
'C:\\Program Files\\Anaconda3\\lib\\site-packages\\Pythonwin',
'C:\\Program Files\\Anaconda3\\lib\\site-packages\\IPython\\extensions',
'C:\\Users\\Amir\\.ipython']
</code></pre>
<p><br />
In the above, we added the path to our module to the list of all paths the Python interpreter will search, in order to find the module requested to be imported (Note that <code>’the path to your module’</code> is not a real system path, this was just an example).</p>
<h3 id="the-collections-module">The <strong>collections</strong> module</h3>
<p>One of the greatest strengths of Python as a scientific programming language is that, for almost everything that you could imagine and want to write a code, someone has already written a code, and so there is <em>no reason to reinvent the wheel if someone has already done it for you</em>. Throughout your career you will get to know many of the most important modules for your own domain of science. Here I will introduce only a general module, that has some interesting and rather useful functions in it. Specifically, this module contains some new non-standard Python data types that can be very handy at times.</p>
<h4 id="the-counter-data-type">The <strong>Counter</strong> data type</h4>
<p>The <code>Counter</code> function from module <code>collections</code> takes in a list and creates a dictionary, whose keys are unique elements in the input list and the values of the keys, are the number of times each key appears in the list. For example,</p>
<pre><code class="language-python">from collections import Counter
mylist = [1,1,1,2,3,34,45,34,34,7,8,34,3,3,6,4,4,4,0,34,9,0]
c = Counter(mylist)
c
</code></pre>
<pre><code>Counter({0: 2, 1: 3, 2: 1, 3: 3, 4: 3, 6: 1, 7: 1, 8: 1, 9: 1, 34: 5, 45: 1})
</code></pre>
<p>There are basically three methods for generating a Counter dictionary,</p>
<pre><code class="language-python">c1 = Counter(['a', 'b', 'c', 'a', 'b', 'b']) # input a list directly into Counter
c2 = Counter({'a':2, 'b':3, 'c':1}) # Give it the Counter dictionary
c3 = Counter(a=2, b=3, c=1) # or simply give it the counts
c1 == c2 == c3
</code></pre>
<pre><code>True
</code></pre>
<h5 id="what-is-counter-useful-for">What is Counter useful for?</h5>
<p>Suppose you have a long list of letters, and for some reason you need to count the number of times each letter appears in your string. You can achieve your goal as in the following example,</p>
<pre><code class="language-python">s = 'amirshahmoradijakelucerotravismike'
c = Counter(s)
for key in c.keys():
print('The letter {} appears only {} times in the string'.format(key,c[key]))
</code></pre>
<pre><code>The letter v appears only 1 times in the string
The letter a appears only 5 times in the string
The letter u appears only 1 times in the string
The letter l appears only 1 times in the string
The letter j appears only 1 times in the string
The letter d appears only 1 times in the string
The letter h appears only 2 times in the string
The letter o appears only 2 times in the string
The letter i appears only 4 times in the string
The letter k appears only 2 times in the string
The letter c appears only 1 times in the string
The letter t appears only 1 times in the string
The letter s appears only 2 times in the string
The letter m appears only 3 times in the string
The letter r appears only 4 times in the string
The letter e appears only 3 times in the string
</code></pre>
<p>Now suppose you wanted to cound the number of times different words appear in a given text,</p>
<pre><code class="language-python">text = "Engineering Computation Lab (COE111L) is a new course that is offered by the department of Aerospace Engineering and Engineering Mechanics at the University of Texas at Austin, starting Spring 2017. "
c = Counter(text.split())
for word in c.keys():
print('The word "{}" appears only {} times in the text'.format(word,c[word]))
</code></pre>
<pre><code>The word "Computation" appears only 1 times in the text
The word "a" appears only 1 times in the text
The word "Engineering" appears only 3 times in the text
The word "the" appears only 2 times in the text
The word "(COE111L)" appears only 1 times in the text
The word "offered" appears only 1 times in the text
The word "is" appears only 2 times in the text
The word "at" appears only 2 times in the text
The word "of" appears only 2 times in the text
The word "Lab" appears only 1 times in the text
The word "course" appears only 1 times in the text
The word "department" appears only 1 times in the text
The word "by" appears only 1 times in the text
The word "and" appears only 1 times in the text
The word "Texas" appears only 1 times in the text
The word "Mechanics" appears only 1 times in the text
The word "2017." appears only 1 times in the text
The word "new" appears only 1 times in the text
The word "University" appears only 1 times in the text
The word "starting" appears only 1 times in the text
The word "Austin," appears only 1 times in the text
The word "that" appears only 1 times in the text
The word "Spring" appears only 1 times in the text
The word "Aerospace" appears only 1 times in the text
</code></pre>
<p>Now, you can also apply all different methods that exists for Counter data types on the variable <code>c</code> in the above case. For example, you could ask for the 3 most common words in in the text,</p>
<pre><code class="language-python">c.most_common(3)
</code></pre>
<pre><code>[('Engineering', 3), ('the', 2), ('is', 2)]
</code></pre>
<h4 id="the-ordereddict-data-type">The <strong>OrderedDict</strong> data type</h4>
<p>This is also a subclass of dictionary data type, which provides all the methods provided by <code>dict</code>, but which also retains the order by which elements are added to the dictionary,</p>
<pre><code class="language-python">
</code></pre>
<p><br />
However, you can define a <code>defaultdict</code> dictionary which will assign a default value to all keys that do not exist, and automatically adds them to the dictionary. A normal dictionary does not conserve the order by which elements were added to the dictionary,</p>
<pre><code class="language-python">d = {5:5,3:3,6:6,1:1}
for i,j in d.items():
print(i,j)
</code></pre>
<p><br />
1 1
3 3
5 5
6 6</p>
<p>To get save order of the elements, you can use <code>OrderedDict</code>,</p>
<pre><code class="language-python">from collections import OrderedDict as od
d = od([(5,5),(3,3),(6,6),(1,1)])
for i,j in d.items():
print(i,j)
</code></pre>
<pre><code>5 5
3 3
6 6
1 1
</code></pre>
<blockquote>
Keep in mind that, two order dictionary with the same content may not be necessarily equal, since the order of their content also matters.
</blockquote>
<p><br /></p>
<h3 id="the-timeit-module">The timeit module</h3>
<p>This is a module that provides some useful functions for timing the performance and speed of peices of your Python code.</p>
<pre><code class="language-python">import timeit as tt
tt.timeit( "-".join(str(n) for n in range(100)) , number=10000 )
</code></pre>
<pre><code>0.03779717059364884
</code></pre>
<p>The first input to <code>timeit</code> function above is the operation which we would like to time, and the second input, tell the function, how many times repeat the task (If the operation takes a tiny amount, you would want to repeat it many many times, in order to get a sensible timing output). Here is the same operation as above, but now using the <a href="http://book.pythontips.com/en/latest/map_filter.html#map" target="_blank">map</a> function,</p>
<pre><code class="language-python">tt.timeit( "-".join( map(str,range(1000))) , number=10000 )
</code></pre>
<pre><code>0.384857713242468
</code></pre>
<p>In IPython or Jupyter, you can do the timing operation in a smarter way using IPython magic function <a href="https://ipython.org/ipython-doc/dev/interactive/magics.html#magic-timeit" target="_blank">%timeit</a>,</p>
<pre><code class="language-python">%timeit "-".join(str(n) for n in range(100))
</code></pre>
<pre><code>10000 loops, best of 3: 36.6 µs per loop
</code></pre>
<p>The IPython’s magic function automatically figures how many times it should run the operation to get a sensible timing of the operation.</p>
<pre><code class="language-python">%timeit "-".join( map(str,range(100)))
</code></pre>
<pre><code>10000 loops, best of 3: 21 µs per loop
</code></pre>
<p>In general, as you noticed in the above example, the function <code>map</code> performs much better and faster than Python’s for-loop.</p>
<p><br /></p>
<h3 id="the-time-module">The time module</h3>
<p>More generally, if you want to measure the CPU time spent on a specific part of your code, you can use the <code>clock()</code> method from <code>time</code> module,</p>
<pre><code class="language-python">import time
# do some work
t0 = time.clock() # get the initial CPU time
# do some further work wqhich you want to time
t1 = time.clock() # get the final CPU time
cpu_time = t1 - t0 # This is the time spent on the task being timed.
</code></pre>
<p><br />
The <code>time.clock()</code> function returns the CPU time spent in the program since its start. If the interest is in the total time, also including reading and writing files, <code>time.time()</code> is the appropriate function to call. Now suppose you had a list of functions that performed the same task, but using different methods, and you wanted to time their performance. Since in Python, functions are ordinary objects, making a list of functions is no more special than making a list of strings or numbers. You can therefore, create a list of function names and call them one by one, inside a loop, and time each one repectively.</p>
<pre><code class="language-python">import time
functions = [func1, func2, func3, func4,func5, func6, func7, func8,func9, func10]
timings = [] # timings[i] holds CPU time for functions[i]
for function in functions:
t0 = time.clock()
function(<input variables>)
t1 = time.clock()
cpu_time = t1 - t0
timings.append(cpu_time)
</code></pre>
<p><br /></p>
<h2 id="loops-in-python">Loops in Python</h2>
<p>We have already seen, both in homework and midterm, what a pain it can be if you wanted to repeat a certain number of tasks using recursive functions and if-blocks. Fortunately, Python has loop statements that can highly simplify the task of repeating certain statements for a certain number of times.</p>
<h3 id="while-loop">While loop</h3>
<p>One such statement is the while-loop:</p>
<pre><code class="language-python">while this_logical_statement_holds_true :
perform_statements
</code></pre>
<p><br />
For example, here is a code that prints all positive integers smaller than a given input integer,</p>
<pre><code class="language-python">n = int(input('input a positive integer: '))
print( 'Here are all positive integers smaller than {}'.format(n) )
while n > 1:
n -= 1
print(n)
</code></pre>
<pre><code>input a positive integer: 7
Here are all positive integers smaller than 7
6
5
4
3
2
1
</code></pre>
<p>Another useful way of writing while-loops is the following (using the example above),</p>
<pre><code class="language-python">n = int(input('input a positive integer: '))
print( 'Here are all positive integers smaller than {}'.format(n) )
while True:
n -= 1
print(n)
if n == 1: break
</code></pre>
<pre><code>input a positive integer: 7
Here are all positive integers smaller than 7
6
5
4
3
2
1
</code></pre>
<p>In this case, the loop will continue forever, unless the condition <code>n==1</code> is met at some point during the iteration.</p>
<h3 id="for-loop">For loop</h3>
<p>If you are from a Fortran, C, C++ background you maybe already accustomed to counting loops than while loops. Pyhon does not have a direct method for counting loops, however, there is a for-loop syntax that loops over the elements of a list or tuple. For example, if we wanted to rewrite the above code using for-loop, one solution would be like the following,</p>
<pre><code class="language-python">n = int(input('input a positive integer: '))
print( 'Here are all positive integers smaller than {}'.format(n) )
my_range = range(n-1,0,-1)
for n in my_range:
print(n)
</code></pre>
<pre><code>input a positive integer: 7
Here are all positive integers smaller than 7
7
6
5
4
3
2
1
</code></pre>
<p>Here the Python’s builtin function <code>range([start,] stop [, step])</code> creates a list of integer that starts from <code>start</code> to <code>end</code> <em>but not including <code>end</code></em>, with a distance of size <code>step</code> between the elements. Here is another way of doing the same thing as in the above example,</p>
<pre><code class="language-python">n = int(input('input a positive integer: '))
print( 'Here are all positive integers smaller than {}'.format(n) )
mylist = list(range(n-1,0,-1))
for n in mylist:
print(n)
</code></pre>
<pre><code>input a positive integer: 7
Here are all positive integers smaller than 7
6
5
4
3
2
1
</code></pre>
<p>Note how I have used the <code>range</code> function in order to get the same output as in the previous example.</p>
<pre><code class="language-python">n = int(input('input a positive integer: '))
mylist = list(range(n-1,0,-1))
print(mylist)
</code></pre>
<pre><code>input a positive integer: 7
[6, 5, 4, 3, 2, 1]
</code></pre>
<h4 id="for-loop-with-list-indices">For-loop with list indices</h4>
<p>Instead of iterating over over a list directly, as illustrated above, one could iterate over the indices of a list,</p>
<pre><code class="language-python">mylist = ['amir','jake','lecero','mike','travis']
for i in range(len(mylist)):
print(mylist[i])
</code></pre>
<pre><code>amir
jake
lecero
mike
travis
</code></pre>
<blockquote>
Iterating over list indices, instead of list elements, is particularly udseful, when you have to work with multiple lists in a for-loop.
</blockquote>
<p><br /></p>
<h4 id="manipulating-lists-using-for-loop">Manipulating lists using for-loop</h4>
<p>Note that when you want to change the elements of a list in a for-loop, you have to change the list itself, and not simply the for-loop variable.</p>
<pre><code class="language-python">mydigits = [1,3,5,7,9]
for i in mydigits:
i -= 1
mydigits
</code></pre>
<pre><code>[1, 3, 5, 7, 9]
</code></pre>
<p>The above code won’t change the values in the list, instead only the for-loop variable. If you want to change the list itself, you have to operate on the list elements directly,</p>
<pre><code class="language-python">mydigits = [1,3,5,7,9]
for i in rnage(len(mydigits)):
mydigits[i] -= 1
mydigits
</code></pre>
<pre><code>[0, 2, 4, 6, 8]
</code></pre>
<h4 id="list-comprehension">List comprehension</h4>
<p>Frequently in Python programming you may need to create long lists of regurlarly ordered item. As a result, Python has a special concise syntax for such tasks, called <strong>list comprehension</strong> which uses for-loop. For example, supopse you have a list of odd digits as in the example above, and you want to create a list of even digits from it. You could achieve this using the following simple syntax,</p>
<pre><code class="language-python">odd_digits = [1,3,5,7,9]
even_digits = [i-1 for i in odd_digits]
even_digits
</code></pre>
<pre><code>[0, 2, 4, 6, 8]
</code></pre>
<h4 id="simultaneous-looping-over-multiple-lists">Simultaneous looping over multiple lists</h4>
<p>Suppose you have two or more lists of the same length over the elements of which you want to perform a specific set of tasks simultaneously. To do so, it suffices to create a <strong>list of tuples</strong> using Python’s builtin function <code>zip</code> and loop over the tuple elements of this list. For example, let’s assume that you wanted to create a list of the addition of individual elements in the above two lists: <code>odd_digits</code> and <code>even_digits</code>. One way to do it would be the following,</p>
<pre><code class="language-python">sum_even_odd = []
for i,j in zip(odd_digits,even_digits):
sum_even_odd.append(i+j)
sum_even_odd
</code></pre>
<pre><code>[1, 5, 9, 13, 17]
</code></pre>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/lecture/7-python-modules-loops-io">Lecture 7: Python - modules, loops, and I/O</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on March 29, 2017.</p><![CDATA[Homework 6: Solutions - Python modules, loops, and I/O]]>http:/ECL2017S/homework/6-solutions-python-modules-loops-IO2017-03-29T00:00:00-05:002017-03-29T00:00:00-05:00Amir Shahmoradihttp:/ECL2017Samir@ices.utexas.edu
<p>This is the solution to <a href="6-problems-python-modules-loops-IO.html" target="_blank">Homework 6: Problems - Python modules, loops, and I/O</a>.</p>
<p>The following figure illustrates the grade distribution for this homework.</p>
<figure>
<img src="http:/ECL2017S/homework/gradeDist/gradeHistHomework6.png" width="700" />
<figcaption style="text-align:center">
Maximum possible points is 100.<br />
</figcaption>
</figure>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p>This homework aims at giving you some experience with Python for-loops and while-loops as well as reading user input from the Bash command line.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>1. </strong> <strong>The while-loop implementation of a for-loop</strong>. Consider the following example code, which converts a list of temperature values from Celsius to Fahrenheit, using a for-loop and then prints them on screen.</p>
<pre><code class="language-python">Cdegrees = [-20, -15, -10, -5, 0, 5, 10, 15, 20, 25, 30, 35, 40]
print (' C F')
for C in Cdegrees:
F = (9.0/5)*C + 32
print ('%5d %5.1f' % (C, F))
</code></pre>
<pre><code> C F
-20 -4.0
-15 5.0
-10 14.0
-5 23.0
0 32.0
5 41.0
10 50.0
15 59.0
20 68.0
25 77.0
30 86.0
35 95.0
40 104.0
</code></pre>
<p>Write a while-loop implementation of the above code.</p>
<p><br /></p>
<p><strong>Answer:</strong></p>
<pre><code class="language-python">Cdegrees = [-20, -15, -10, -5, 0, 5, 10, 15, 20, 25, 30, 35, 40]
index = 0
print (' C F')
while index < len(Cdegrees):
C = Cdegrees[index]
F = (9.0/5)*C + 32
print('%5d %5.1f' % (C, F))
index += 1
</code></pre>
<pre><code> C F
-20 -4.0
-15 5.0
-10 14.0
-5 23.0
0 32.0
5 41.0
10 50.0
15 59.0
20 68.0
25 77.0
30 86.0
35 95.0
40 104.0
</code></pre>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>2. </strong> Consider the following nested list,</p>
<pre><code class="language-python">the following nested list:
q = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h']]
</code></pre>
<p><br />
Write a for-loop that extracts all the letters in the list and finally prints them all as a single string,</p>
<pre><code class="language-python">abcdefgh
</code></pre>
<p><strong>Answer:</strong></p>
<pre><code class="language-python">q = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h']]
s = ''
for i in q:
for j in range(len(i)):
s = s + i[j]
print(s)
</code></pre>
<pre><code>abcdefgh
</code></pre>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>3. </strong> Consider the following program,</p>
<pre><code class="language-python">from math import sqrt
for n in range(1, 60):
r_org = 2.0
r = r_org
for i in range(n):
r = sqrt(r)
for i in range(n):
r = r ** 2
print ('With {} times sqrt and then {} times **2, the number {} becomes: {:.16f}'.format(n,n,r_org,r))
</code></pre>
<p><br />
Explain what this code does. Then run the code, and explain why do you the behavior observed. In particular, why do you not recover the original value $2$ after many repetitions of the same forward and reverse task?</p>
<p><strong>Answer:</strong><br />
This code will yield the following output:</p>
<pre><code class="language-python">from math import sqrt
for n in range(1, 60):
r_org = 2.0
r = r_org
for i in range(n):
r = sqrt(r)
for i in range(n):
r = r ** 2
print ('With {} times sqrt and then {} times **2, the number {} becomes: {:.16f}'.format(n,n,r_org,r))
</code></pre>
<pre><code class="language-text">With 1 times sqrt and then 1 times **2, the number 2.0 becomes: 2.0000000000000004
With 2 times sqrt and then 2 times **2, the number 2.0 becomes: 1.9999999999999996
With 3 times sqrt and then 3 times **2, the number 2.0 becomes: 1.9999999999999996
With 4 times sqrt and then 4 times **2, the number 2.0 becomes: 1.9999999999999964
With 5 times sqrt and then 5 times **2, the number 2.0 becomes: 1.9999999999999964
With 6 times sqrt and then 6 times **2, the number 2.0 becomes: 1.9999999999999964
With 7 times sqrt and then 7 times **2, the number 2.0 becomes: 1.9999999999999714
With 8 times sqrt and then 8 times **2, the number 2.0 becomes: 2.0000000000000235
With 9 times sqrt and then 9 times **2, the number 2.0 becomes: 2.0000000000000235
With 10 times sqrt and then 10 times **2, the number 2.0 becomes: 2.0000000000000235
With 11 times sqrt and then 11 times **2, the number 2.0 becomes: 2.0000000000000235
With 12 times sqrt and then 12 times **2, the number 2.0 becomes: 1.9999999999991336
With 13 times sqrt and then 13 times **2, the number 2.0 becomes: 1.9999999999973292
With 14 times sqrt and then 14 times **2, the number 2.0 becomes: 1.9999999999973292
With 15 times sqrt and then 15 times **2, the number 2.0 becomes: 1.9999999999973292
With 16 times sqrt and then 16 times **2, the number 2.0 becomes: 2.0000000000117746
With 17 times sqrt and then 17 times **2, the number 2.0 becomes: 2.0000000000408580
With 18 times sqrt and then 18 times **2, the number 2.0 becomes: 2.0000000000408580
With 19 times sqrt and then 19 times **2, the number 2.0 becomes: 2.0000000001573586
With 20 times sqrt and then 20 times **2, the number 2.0 becomes: 2.0000000001573586
With 21 times sqrt and then 21 times **2, the number 2.0 becomes: 2.0000000001573586
With 22 times sqrt and then 22 times **2, the number 2.0 becomes: 2.0000000010885857
With 23 times sqrt and then 23 times **2, the number 2.0 becomes: 2.0000000029511749
With 24 times sqrt and then 24 times **2, the number 2.0 becomes: 2.0000000066771721
With 25 times sqrt and then 25 times **2, the number 2.0 becomes: 2.0000000066771721
With 26 times sqrt and then 26 times **2, the number 2.0 becomes: 1.9999999917775542
With 27 times sqrt and then 27 times **2, the number 2.0 becomes: 1.9999999917775542
With 28 times sqrt and then 28 times **2, the number 2.0 becomes: 1.9999999917775542
With 29 times sqrt and then 29 times **2, the number 2.0 becomes: 1.9999999917775542
With 30 times sqrt and then 30 times **2, the number 2.0 becomes: 1.9999999917775542
With 31 times sqrt and then 31 times **2, the number 2.0 becomes: 1.9999999917775542
With 32 times sqrt and then 32 times **2, the number 2.0 becomes: 1.9999990380770896
With 33 times sqrt and then 33 times **2, the number 2.0 becomes: 1.9999971307544144
With 34 times sqrt and then 34 times **2, the number 2.0 becomes: 1.9999971307544144
With 35 times sqrt and then 35 times **2, the number 2.0 becomes: 1.9999971307544144
With 36 times sqrt and then 36 times **2, the number 2.0 becomes: 1.9999971307544144
With 37 times sqrt and then 37 times **2, the number 2.0 becomes: 1.9999971307544144
With 38 times sqrt and then 38 times **2, the number 2.0 becomes: 1.9999360966436217
With 39 times sqrt and then 39 times **2, the number 2.0 becomes: 1.9999360966436217
With 40 times sqrt and then 40 times **2, the number 2.0 becomes: 1.9999360966436217
With 41 times sqrt and then 41 times **2, the number 2.0 becomes: 1.9994478907329654
With 42 times sqrt and then 42 times **2, the number 2.0 becomes: 1.9984718365144798
With 43 times sqrt and then 43 times **2, the number 2.0 becomes: 1.9965211562778555
With 44 times sqrt and then 44 times **2, the number 2.0 becomes: 1.9965211562778555
With 45 times sqrt and then 45 times **2, the number 2.0 becomes: 1.9887374575497223
With 46 times sqrt and then 46 times **2, the number 2.0 becomes: 1.9887374575497223
With 47 times sqrt and then 47 times **2, the number 2.0 becomes: 1.9887374575497223
With 48 times sqrt and then 48 times **2, the number 2.0 becomes: 1.9887374575497223
With 49 times sqrt and then 49 times **2, the number 2.0 becomes: 1.8682459487159784
With 50 times sqrt and then 50 times **2, the number 2.0 becomes: 1.6487212645509468
With 51 times sqrt and then 51 times **2, the number 2.0 becomes: 1.6487212645509468
With 52 times sqrt and then 52 times **2, the number 2.0 becomes: 1.0000000000000000
With 53 times sqrt and then 53 times **2, the number 2.0 becomes: 1.0000000000000000
With 54 times sqrt and then 54 times **2, the number 2.0 becomes: 1.0000000000000000
With 55 times sqrt and then 55 times **2, the number 2.0 becomes: 1.0000000000000000
With 56 times sqrt and then 56 times **2, the number 2.0 becomes: 1.0000000000000000
With 57 times sqrt and then 57 times **2, the number 2.0 becomes: 1.0000000000000000
With 58 times sqrt and then 58 times **2, the number 2.0 becomes: 1.0000000000000000
With 59 times sqrt and then 59 times **2, the number 2.0 becomes: 1.0000000000000000
</code></pre>
<p><br />
What is happening is that, 1 is returned for n >= 52 as square root of 2, that is, after 52 times square-root operation, the degree of accuracy required for representing the result goes beyond the degree of accuracy available in a Python float. Consequently, the later squaring operation on 1.00000000000000 will leave the number unchanged and therefore, 2 is not recovered.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>4. </strong> Consider the following code,</p>
<pre><code class="language-python">eps = 1.0
while 1.0 != 1.0 + eps:
print ('...............', eps)
eps /= 2.0
print ('final eps:', eps)
</code></pre>
<p><br /><br />
Explain what the code is doing. Run the code and observe the output. How could <code>1.0 != 1.0 + eps</code> be False?</p>
<p><strong>Answer:</strong><br />
Here is the output of the code,</p>
<pre><code class="language-text">............... 1.0
............... 0.5
............... 0.25
............... 0.125
............... 0.0625
............... 0.03125
............... 0.015625
............... 0.0078125
............... 0.00390625
............... 0.001953125
............... 0.0009765625
............... 0.00048828125
............... 0.000244140625
............... 0.0001220703125
............... 6.103515625e-05
............... 3.0517578125e-05
............... 1.52587890625e-05
............... 7.62939453125e-06
............... 3.814697265625e-06
............... 1.9073486328125e-06
............... 9.5367431640625e-07
............... 4.76837158203125e-07
............... 2.384185791015625e-07
............... 1.1920928955078125e-07
............... 5.960464477539063e-08
............... 2.9802322387695312e-08
............... 1.4901161193847656e-08
............... 7.450580596923828e-09
............... 3.725290298461914e-09
............... 1.862645149230957e-09
............... 9.313225746154785e-10
............... 4.656612873077393e-10
............... 2.3283064365386963e-10
............... 1.1641532182693481e-10
............... 5.820766091346741e-11
............... 2.9103830456733704e-11
............... 1.4551915228366852e-11
............... 7.275957614183426e-12
............... 3.637978807091713e-12
............... 1.8189894035458565e-12
............... 9.094947017729282e-13
............... 4.547473508864641e-13
............... 2.2737367544323206e-13
............... 1.1368683772161603e-13
............... 5.684341886080802e-14
............... 2.842170943040401e-14
............... 1.4210854715202004e-14
............... 7.105427357601002e-15
............... 3.552713678800501e-15
............... 1.7763568394002505e-15
............... 8.881784197001252e-16
............... 4.440892098500626e-16
............... 2.220446049250313e-16
final eps: 1.1102230246251565e-16
</code></pre>
<p>What is happening is that after a certain number of divisions performed on the value of <code>eps</code>, the value goes beyond the highest float precision representatble by Python standard ($0.0000000000000001$), and therefore the value of <code>eps</code> is eventually rounded to exact zero. The nonzero eps value computed above is called <strong>machine epsilon</strong> or <strong>machine zero</strong> and is an important parameter to know, since it can lead to disasters in your very important complex calculations.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>5. </strong> Consider the following list,</p>
<pre><code class="language-python">numbers = list(range(10))
print(numbers)
</code></pre>
<pre><code>[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
</code></pre>
<p>Now run the following code, given the above list. Explain the weird behavior that you observe.</p>
<pre><code class="language-python">for n in numbers:
i = len(numbers)//2
del numbers[i]
print ('n={}, del {}'.format(n,i), numbers)
</code></pre>
<p><br /></p>
<p><strong>Answer:</strong></p>
<pre><code class="language-python">numbers = list(range(10))
for n in numbers:
i = len(numbers)//2
del numbers[i]
print ('n={}, del {}'.format(n,i), numbers)
</code></pre>
<pre><code>n=0, del 5 [0, 1, 2, 3, 4, 6, 7, 8, 9]
n=1, del 4 [0, 1, 2, 3, 6, 7, 8, 9]
n=2, del 4 [0, 1, 2, 3, 7, 8, 9]
n=3, del 3 [0, 1, 2, 7, 8, 9]
n=8, del 3 [0, 1, 2, 8, 9]
</code></pre>
<p>What is really happening is that the list over which we are looping changes its content because of the modifications during on the list in the for-loop. The message in this exercise is to <strong>never modify a list that you are looping over</strong>. Modification is indeed technically possible, as shown above, but you really need to know what you are doing. Otherwise you will experience very strange program behavior.</p>
<p><br /></p>
<hr />
<hr />
<p><br /></p>
<p><strong>6. </strong> Consider a problem similar to what we had in the midterm exam: Write a Python function that when executed, asks the user to enter an integer number, then the function gives out the number of prime numbers that are smaller than the input integer number. Here is the answer to this question using only the knowledge of recursive functions and if-blocks,</p>
<pre><code class="language-python">def is_prime(n):
is_prime = True
def is_divisible(n,divisor):
if n<(divisor-1)*divisor: return False
if n%divisor==0: return True
else:
divisor += 1
return is_divisible(n,divisor)
if is_divisible(n,divisor=2): is_prime=False
return is_prime
def get_primes(n):
count = 0
if n == 1:
return count
else:
if is_prime(n):
count = 1
n -= 1
return count + get_primes(n)
</code></pre>
<p><br /></p>
<pre><code class="language-python">get_primes(13)
</code></pre>
<pre><code>5
</code></pre>
<p>(A) Now rewrite <code>get_primes(n)</code> and the other functions in the above code using for-loop this time. Name the new functions <code>get_prime_for(n)</code> and <code>is_prime_for(n)</code>, with <em>for</em> in the names indicating that the functions now use for-loops.</p>
<p><strong>Answer:</strong></p>
<pre><code class="language-python">def is_prime_for(x):
if x > 1:
n = x // 2
for i in range(2, n + 1):
if x % i == 0:
return False
return True
else:
return False
def get_primes_for(n):
count = 0
for i in range(2,n):
if is_prime(i):
count += 1
return count
</code></pre>
<p><br />
Here is a test,</p>
<pre><code class="language-python">get_primes_for(13)
</code></pre>
<pre><code>5
</code></pre>
<p>(B) Now compare the performance of the two functions <code>get_primes(n=500)</code> and <code>get_primes_for(n500)</code> using Jupyter’s or IPython’s <code>%timeit</code> magic function. Which one is faster?</p>
<p><strong>Answer:</strong></p>
<pre><code class="language-python">%timeit get_primes(500)
</code></pre>
<pre><code>1000 loops, best of 3: 1.32 ms per loop
</code></pre>
<pre><code class="language-python">%timeit get_primes_for(500)
</code></pre>
<pre><code>1000 loops, best of 3: 1.69 ms per loop
</code></pre>
<p>Interesting, recursive functions seem to be faster than Python for-loops!</p>
<p><br /><br /></p>
<p><a href="http:/ECL2017S/homework/6-solutions-python-modules-loops-IO">Homework 6: Solutions - Python modules, loops, and I/O</a> was originally published by Amir Shahmoradi at <a href="http:/ECL2017S">COE 111L - Spring 2017 - W 9-10 AM - WRW 209</a> on March 29, 2017.</p>