Matt Shirley
This course requires no prior knowledge of Python, and this first lecture requires no Unix experience either!
A: You are looking at the IPython notebook interface. There are many different ways to interact with Python:
python at a command prompt (Unix/Mac)
Not only easy, though - it's powerful:
print 'hello world'
hello world
what = 'hello'
# This is a comment
print what
hello
who = 'world'
print what + who
print what + ' ' + who
print what + ' ' + who + '!'
helloworld
hello world
hello world!
print 'this is correct indentation'
    print 'this is NOT correct indentation'
  File "<ipython-input-4-2045dd26b583>", line 2
    print 'this is NOT correct indentation'
    ^
IndentationError: unexpected indent
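Indentation is only expected after a line that opens a block, such as a line ending in a colon. A minimal sketch, using a for loop (covered later in this lecture) and a hypothetical list named messages that is not part of the original notebook:

```python
# Lines inside a block share one indentation level (four spaces by convention).
messages = []
for word in ('hello', 'world'):
    # Indented: runs once for each word in the tuple.
    messages.append(word.upper())
# Not indented: runs once, after the loop has finished.
messages.append('done')
```

The example avoids print so that it behaves the same under Python 2 and Python 3.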
x = 1
x
1
y = 2
y
2
x = y
x
2
x, y = 1, 2
print x
print y
1
2
y = y + 1
y
3
x += 1
x
2
x += 1 is shorthand for x = x + 1
+= is the in-place addition (augmented assignment) operator
1 + 1
2
Addition
4 - 1
3
Subtraction
5 * 2
10
Multiplication
1 / 5
0
Divis... WHAT? 1 / 5 = 0.2, right?
Python (like any other programming language) will only follow your instructions literally. When you type 1 / 5 into the interpreter, Python 2 treats both 1 and 5 as integers and assumes that you want the result of the division to be an integer as well, so the fractional part is discarded. (In Python 3, 1 / 5 returns 0.2.)
int(1) / int(5)
0
We actually expect that the answer will be a floating point number and not an integer, so we have to give Python better instructions.
float(1) / 5
0.2
We can also use a shortcut.
1. / 5
0.2
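The two kinds of division can also be requested explicitly, so the result does not depend on the types Python infers. This sketch assumes the // (floor division) and % (remainder) operators, which the lecture does not introduce; the variable names are illustrative only:

```python
# Floor division always discards the fractional part.
whole = 7 // 2             # 3
# Converting one operand to float forces floating point division.
fraction = float(7) / 2    # 3.5
# The remainder operator recovers what floor division discarded.
remainder = 7 % 2          # 1
```

Both operators behave the same way in Python 2 and Python 3, which makes them a safer choice when the integer result is what you actually want.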
Long floating point numbers can be rounded using round:
round(1. / 3, 2)
0.33
Python has many object types built-in, including:
int()
float()
str()
bool()
tuple()
list()
dict()
We can determine the type of an object by using the type function.
type(1)
int
type(1.)
float
type('hello')
str
type(True)
bool
4 + 'hello world'
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-1b7e5cde89f1> in <module>()
----> 1 4 + 'hello world'

TypeError: unsupported operand type(s) for +: 'int' and 'str'
TypeError is telling us that the + operator cannot combine an int and a str; the operands must be of compatible types
Functions are called using the form f(): the function name f precedes parentheses ()
sum
<function sum>
sum((1,5))
6
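The type names listed earlier (str, int, float) are called in the same way as sum, and converting explicitly is one way to avoid a TypeError when mixing numbers and text. A small sketch with illustrative variable names:

```python
count = 4
label = 'hello world'
# str() converts the integer to text so that + can concatenate.
combined = str(count) + ' ' + label    # '4 hello world'
# int() converts text made of digits back into a number.
total = int('4') + count               # 8
```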
1 < 2
True
1 > 2
False
1 == 2
False
1 != 2
True
2 >= 2
True
x = 1
y = "a"
x > y
False
x < y
True
primer1 = "AGGGTCA"
primer2 = "AGGTTAC"
primer1 == primer2
False
print primer1[0]
print primer2[0]
A
A
primer1[0] == primer2[0]
True
  0   1   2   3   4   5   6
+---+---+---+---+---+---+---+
| A | G | G | G | T | C | A |
+---+---+---+---+---+---+---+
 -7  -6  -5  -4  -3  -2  -1
print primer1
print primer1[0]
print primer1[1]
print primer1[2]
print primer1[-1]
print primer1[-2]
AGGGTCA
A
G
G
A
C
0   1   2   3   4   5   6   7
+---+---+---+---+---+---+---+
| A | G | G | G | T | C | A |
+---+---+---+---+---+---+---+
-7  -6  -5  -4  -3  -2  -1
print primer1[:]
print primer1[0:]
print primer1[:-1]
print primer1[0:-1]
print primer1[0:5]
print primer1[3:-1]
AGGGTCA
AGGGTCA
AGGGTC
AGGGTC
AGGGT
GTC
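Slices are written [start:end]
The interval is half-open, [start:end): the element at start is included and the element at end is not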
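One consequence of the half-open interval is that a sequence split at any position can be rejoined without overlap or gaps. A brief sketch, where k is a hypothetical split position chosen for illustration:

```python
primer1 = "AGGGTCA"
k = 3                    # hypothetical split position
head = primer1[:k]       # 'AGG'  (indices 0, 1, 2)
tail = primer1[k:]       # 'GTCA' (indices 3 to the end)
# Because end is excluded, the two pieces never overlap or leave a gap.
rejoined = head + tail
# The length of a slice is end - start.
middle = primer1[2:5]    # 'GGT'
```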
print primer1[::]
print primer1[0:8:2]
print primer1[::-1] ## stride = -1
AGGGTCA
AGTA
ACTGGGA
The full slice syntax is [start:end:stride], where stride is the step between selected elements (a stride of 2 takes every second element)
primer1[::-1] is a simple way to reverse a string
len(primer1)
7
'A' in primer1
True
primer1.find('A')
0
primer1.count('A')
2
primer3 = primer1.replace('A', 'T')
primer3
'TGGGTCT'
primer1.lower()
'agggtca'
Q: How would you calculate the GC content (percent) for the sequence 'ATGCATGATACATAGATACC'?
dna = 'ATGCATGATACATAGATACC'
c_count = dna.count('C')
g_count = dna.count('G')
dna_len = len(dna)
gc_cont = float(c_count + g_count) / dna_len * 100
print gc_cont
35.0
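If the same calculation is needed for many sequences, it could be wrapped in a function. This is only a sketch: def is not covered in this lecture, and gc_content is a hypothetical name rather than anything from the original notebook.

```python
def gc_content(sequence):
    """Return the GC content of a DNA sequence as a percentage."""
    sequence = sequence.upper()    # accept lower-case input too
    gc_count = sequence.count('G') + sequence.count('C')
    # float() keeps Python 2 from doing integer division.
    # Note: an empty sequence would raise ZeroDivisionError here.
    return float(gc_count) / len(sequence) * 100

gc_percent = gc_content('ATGCATGATACATAGATACC')    # 35.0
```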
Q: How would you test whether the sequence 'ATGCATGATTAGTACGTA' is palindromic (reads the same forward and backward)?
seq = 'ATGCATGATTAGTACGTA'
seq == seq[::-1]
True
names = ('Fred', 'Ted', 'Ned')
names
('Fred', 'Ted', 'Ned')
Tuples are written using () and ,
You can access the first name in names by specifying the 0-based index position of that element.
names[0]
'Fred'
To access the last name in names, you can use either the 0-based index of that element [2], or use a negative index [-1].
print names
('Fred', 'Ted', 'Ned')
names[2]
'Ned'
names[-1]
'Ned'
Tuples can be "sliced" using [:].
names[0:3]
('Fred', 'Ted', 'Ned')
names[1:3]
('Ted', 'Ned')
names[:2]
('Fred', 'Ted')
names[0] = 'Zed' ## This will result in a TypeError
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-57-c7309bd65052> in <module>()
----> 1 names[0] = 'Zed' ## This will result in a TypeError

TypeError: 'tuple' object does not support item assignment
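A tuple cannot be changed in place, but a new tuple can be built from pieces of the old one. A minimal sketch, where renamed is an illustrative variable name:

```python
names = ('Fred', 'Ted', 'Ned')
# Concatenate a one-element tuple (note the trailing comma) with a slice.
renamed = ('Zed',) + names[1:]    # ('Zed', 'Ted', 'Ned')
# The original tuple is left untouched.
original_first = names[0]         # 'Fred'
```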
names = ['Fred', 'Ted', 'Ned']
names
['Fred', 'Ted', 'Ned']
Lists are written using [] and not ()
names[2] = 'Zed'
names
['Fred', 'Ted', 'Zed']
names.append('Mike')
names
['Fred', 'Ted', 'Zed', 'Mike']
names += ['Obama']
names
['Fred', 'Ted', 'Zed', 'Mike', 'Obama']
names.extend(['Craig'])
names
['Fred', 'Ted', 'Zed', 'Mike', 'Obama', 'Craig']
There are several ways to add objects to a list.
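The methods differ in how they treat a list argument: append adds its argument as a single element, while extend (like +=) adds each element of the argument. A short sketch with illustrative names:

```python
names = ['Fred', 'Ted']
names.append(['Ned', 'Zed'])    # adds ONE element, which is itself a list
nested = list(names)            # ['Fred', 'Ted', ['Ned', 'Zed']]

names = ['Fred', 'Ted']
names.extend(['Ned', 'Zed'])    # adds each element of the argument
flat = list(names)              # ['Fred', 'Ted', 'Ned', 'Zed']
```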
names.pop()
'Craig'
names.insert(3, 'Ned')
names
['Fred', 'Ted', 'Zed', 'Ned', 'Mike', 'Obama']
names.index('Obama')
5
We can remove and insert elements, as well as find the numeric index by element name.
sorted(names)
['Fred', 'Mike', 'Ned', 'Obama', 'Ted', 'Zed']
names.sort()
names
['Fred', 'Mike', 'Ned', 'Obama', 'Ted', 'Zed']
names.reverse()
names
['Zed', 'Ted', 'Obama', 'Ned', 'Mike', 'Fred']
Here are a few methods to manipulate the order of lists.
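The difference between the sorted function and the sort method is easy to miss. In this sketch, sorted leaves its argument alone and returns a new list, whereas sort rearranges the list itself and returns None:

```python
names = ['Ted', 'Fred', 'Ned']
alphabetical = sorted(names)    # new list: ['Fred', 'Ned', 'Ted']
unchanged = list(names)         # names is still ['Ted', 'Fred', 'Ned']
result = names.sort()           # sorts in place; result is None
```

Writing names = names.sort() is therefore a common mistake, because it replaces the list with None.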
names.sort() and names.reverse() modify the list in place instead of returning a new list
gtca = list("GATACA")
gtca
['G', 'A', 'T', 'A', 'C', 'A']
"".join(gtca)
'GATACA'
"-".join(gtca)
'G-A-T-A-C-A'
Lists of strings can be joined to form a string
'G-A-T-A-C-A'.split('-')
['G', 'A', 'T', 'A', 'C', 'A']
range(0,10)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
range(0,10,2)
[0, 2, 4, 6, 8]
range takes the arguments start, end, step
r = range(0,10)
r
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[i + 1 for i in r]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
List comprehensions apply an expression (i + 1) to an iterable (r = range(0,10)). We'll discuss iterables later.
[i + 1 for i in r if i > 5]
[7, 8, 9, 10]
List comprehensions can also filter the resulting list on a condition (i > 5)
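A filtered list comprehension can be read as a compact for loop with an if inside it. The sketch below writes both forms side by side (for loops and if statements are covered later in this lecture); short_form and long_form are illustrative names:

```python
r = range(0, 10)
# Comprehension form.
short_form = [i + 1 for i in r if i > 5]
# The same steps written out as a for loop.
long_form = []
for i in r:
    if i > 5:
        long_form.append(i + 1)
```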
names['Fred']
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-78-43f399010876> in <module>()
----> 1 names['Fred']

TypeError: list indices must be integers, not str
person = {'name':'matt', 'height':71, 'weight':170}
person
{'height': 71, 'name': 'matt', 'weight': 170}
Dictionaries store key:value pairs
person['name']
'matt'
person.keys()
['name', 'weight', 'height']
person.values()
['matt', 170, 71]
person.items()
[('name', 'matt'), ('weight', 170), ('height', 71)]
person['location'] = 'Baltimore'
Assign values to keys like this.
person['pounds'] = person['weight']
del person['weight']
person['name'] = 'james'
person
{'height': 71, 'location': 'Baltimore', 'name': 'james', 'pounds': 170}
Delete and update keys and values like this.
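Looking up a deleted or missing key with square brackets raises a KeyError. Two ways to guard against that are the in operator and the get method (get appears in the dir listing for dictionaries); the fallback value 'unknown' here is just an illustrative choice:

```python
person = {'name': 'james', 'height': 71, 'pounds': 170}
has_weight = 'weight' in person             # False: the key was deleted
# get returns a fallback value instead of raising KeyError.
weight = person.get('weight', 'unknown')    # 'unknown'
pounds = person.get('pounds', 'unknown')    # 170
```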
print dir(person)
['__class__', '__cmp__', '__contains__', '__delattr__', '__delitem__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__init__', '__iter__', '__le__', '__len__', '__lt__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'clear', 'copy', 'fromkeys', 'get', 'has_key', 'items', 'iteritems', 'iterkeys', 'itervalues', 'keys', 'pop', 'popitem', 'setdefault', 'update', 'values', 'viewitems', 'viewkeys', 'viewvalues']
The dir function lists all of the methods and attributes of an object.
help(person.pop)
Help on built-in function pop:

pop(...)
    D.pop(k[,d]) -> v, remove specified key and return the corresponding value.
    If key is not found, d is returned if given, otherwise KeyError is raised
The help function displays the documentation stored in the __doc__ attribute of the object it is applied to.
print person.pop.__doc__
D.pop(k[,d]) -> v, remove specified key and return the corresponding value.
If key is not found, d is returned if given, otherwise KeyError is raised
x = 1
if x > 1:
    print "x is greater than 1"
else:
    print "x is less than or equal to 1"
x is less than or equal to 1
The expression following if is evaluated, and the following code block is executed if the expression is True
If the expression is False, the code block following the optional else statement is executed
x = False
if x:
    print "x is True"
else:
    print "x is False"
x is False
x = False
if not x:
    print "x is False"
else:
    print "x is True"
x is False
x = 200
if x / 2 == 1:
    print "x is 2"
elif x / 20 == 1:
    print "x is 20"
elif x / 200 == 1:
    print "x is 200"
else:
    print "x is something else"
x is 200
elif tests an expression just like if
Use else to catch unexpected conditions
for i in range(0,10):
    print i
0 1 2 3 4 5 6 7 8 9
for loops repeat a block of code for each element in an iterable
Short answer:
iter('abc')
<iterator at 0x243ea90>
iter(1)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-95-eb9b6f09d0b6> in <module>()
----> 1 iter(1)

TypeError: 'int' object is not iterable
Long answer: anything with an __iter__ or __getitem__ method.
'__iter__' in dir([1, 2, 3])
True
'__getitem__' in dir('123')
True
'__iter__' in dir({'name':'matt', 'fingers':9})
True
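In practice, being iterable means the object can be used directly in a for loop. A short sketch looping over the three kinds of object tested above; the variable names are illustrative:

```python
bases = []
for base in 'GAT':                # a string yields its characters
    bases.append(base)

total = 0
for number in [1, 2, 3]:          # a list yields its elements
    total += number

keys = []
for key in {'name': 'matt', 'fingers': 9}:    # a dict yields its keys
    keys.append(key)
keys.sort()    # dict order is not guaranteed in Python 2, so sort for a stable result
```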
x = 10
while x > 0:
    x = x - 1
    print x
9 8 7 6 5 4 3 2 1 0
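A while loop only ends if something in its block eventually makes the condition false, or if the block exits explicitly. The sketch below assumes the break statement, which is not introduced in the lecture, to leave a loop whose condition is always True; steps is an illustrative counter:

```python
x = 10
steps = 0
while True:         # this condition alone would never end the loop
    x = x - 3
    steps += 1
    if x <= 0:      # exit condition checked inside the block
        break
# After the loop: x is -2 and steps is 4 (10 -> 7 -> 4 -> 1 -> -2).
```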
while loops repeat a block of code while the condition is True
while 1 and while True will result in an infinite loop unless the block exits with break
Let's work on some examples that are relevant to your research.