The set type is built in to Python 2.4 and higher.
Create an empty set with set():
vowels = set()
for char in 'aieoeiaoaaeieou':
    vowels.add(char)
print vowels
set(['a', 'i', 'e', 'u', 'o'])
ten = set(range(10))
lows = set([0, 1, 2, 3, 4])
odds = set([1, 3, 5, 7, 9])
Each example below starts from the values of ten, lows, and odds defined above.
| Method | Purpose | Example | Result | Operator |
|---|---|---|---|---|
| add | Add an element to a set. | lows.add(9) | Returns None, but lows is now set([0, 1, 2, 3, 4, 9]). | |
| clear | Remove all elements from the set. | lows.clear() | Returns None, but lows is now set(). | |
| difference | Create a set with elements that are in one set, but not the other. | lows.difference(odds) | set([0, 2, 4]) | lows - odds |
| intersection | Create a set with elements that are in both arguments. | lows.intersection(odds) | set([1, 3]) | lows & odds |
| issubset | Are all of one set's elements contained in another? | lows.issubset(ten) | True | lows <= ten |
| issuperset | Does one set contain all of another's elements? | lows.issuperset(odds) | False | lows >= odds |
| remove | Remove an element from a set. | lows.remove(0) | Returns None, but lows is now set([1, 2, 3, 4]). | |
| symmetric_difference | Create a set with elements that are in exactly one set. | lows.symmetric_difference(odds) | set([0, 2, 4, 5, 7, 9]) | lows ^ odds |
| union | Create a set with elements that are in either argument. | lows.union(odds) | set([0, 1, 2, 3, 4, 5, 7, 9]) | lows \| odds |
Table 7.1: Set Methods and Operators
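The pairing of methods and operators in Table 7.1 can be checked directly. This is a minimal sketch, assuming Python 3 (the notes themselves use Python 2 print statements); the name changed is introduced here only for illustration.

```python
# Same starting sets as in the notes.
ten = set(range(10))
lows = set([0, 1, 2, 3, 4])
odds = set([1, 3, 5, 7, 9])

# Each named method has an operator that gives the same result.
print(sorted(lows - odds))   # difference: [0, 2, 4]
print(sorted(lows & odds))   # intersection: [1, 3]
print(sorted(lows ^ odds))   # symmetric difference: [0, 2, 4, 5, 7, 9]
print(sorted(lows | odds))   # union: [0, 1, 2, 3, 4, 5, 7, 9]
print(lows <= ten)           # issubset: True
print(lows >= odds)          # issuperset: False

# The operations above build new sets; add() and remove() change a set in place.
changed = set(lows)          # work on a copy so lows is untouched
changed.add(9)
changed.remove(0)
print(sorted(changed))       # [1, 2, 3, 4, 9]
```

One difference worth knowing: the methods accept any iterable (for example lows.union([5, 6])), while the operators require both sides to be sets.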
lines = [
    'canada goose', 'canada goose', 'long-tailed jaeger', 'canada goose',
    'snow goose', 'canada goose', 'canada goose', 'northern fulmar'
]
seen = set()
for line in lines:
    seen.add(line.strip())
for bird in seen:
    print bird
northern fulmar
snow goose
long-tailed jaeger
canada goose
A for loop visits each value in the set exactly once, in an arbitrary order.
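Because the order of a set is arbitrary, a report that needs a repeatable order has to sort the values first. The sketch below assumes Python 3 and reuses the bird list from the example above; it is one possible way to do this, not the only one.

```python
lines = [
    'canada goose', 'canada goose', 'long-tailed jaeger', 'canada goose',
    'snow goose', 'canada goose', 'canada goose', 'northern fulmar'
]

# Building the set from a generator removes duplicates in one step.
seen = set(line.strip() for line in lines)

# sorted() returns a list, so the output order no longer depends on hashing.
for bird in sorted(seen):
    print(bird)
```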
Figure 7.1: Hashing
Figure 7.2: Misplaced Values
values = set()
values.add('birds')
print values
values.add(('Canada', 'goose'))
print values
values.add(['snow', 'goose'])
print values
Traceback (most recent call last):
  File "mutable_in_set.py", line 8, in ?
    values.add(['snow', 'goose'])
  File "/usr/lib/python2.3/sets.py", line 521, in add
    self._data[element] = True
TypeError: list objects are unhashable
Lists are mutable, so they cannot be stored in a set. Use a tuple such as ("snow", "goose") instead, or a frozenset when the value is itself a set:
>>> birds = set()
>>> arctic = frozenset(['goose', 'tern'])
>>> birds.add(arctic)
>>> print birds
set([frozenset(['goose', 'tern'])])
>>> arctic.add('eider')
AttributeError: 'frozenset' object has no attribute 'add'
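The rule about what may go in a set can be probed by catching the TypeError instead of letting the program stop. This is a small sketch, assuming Python 3; try_add is a hypothetical helper written for this illustration, not part of Python.

```python
values = set()

def try_add(item):
    """Try to add item to values; return True if the set accepted it."""
    try:
        values.add(item)
        return True
    except TypeError:
        # Raised for unhashable (mutable) objects such as lists.
        return False

print(try_add('birds'))                        # True: strings are immutable
print(try_add(('Canada', 'goose')))            # True: tuples are immutable
print(try_add(['snow', 'goose']))              # False: lists are mutable
print(try_add(tuple(['snow', 'goose'])))       # True: converted to a tuple first
print(try_add(frozenset(['goose', 'tern'])))   # True: frozensets are immutable
```

Note that a tuple is only accepted when everything inside it is hashable too; a tuple that contains a list is rejected.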
If seen is a list of $N$ names, the check "if name in seen" requires $N/2$ comparisons on average when the name is present, and $N$ when it is not.
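The $N/2$ figure can be confirmed by counting comparisons in a hand-written linear scan, which is the strategy a list membership test uses. This is a sketch assuming Python 3; comparisons_to_find and the list size of 1000 are choices made here for illustration.

```python
def comparisons_to_find(items, target):
    """Return how many elements a left-to-right scan examines for target."""
    count = 0
    for item in items:
        count += 1
        if item == target:
            break
    return count

N = 1000
items = list(range(N))

# Average the cost over every value that is actually in the list.
total = sum(comparisons_to_find(items, t) for t in items)
average = total / float(N)
print(average)                          # 500.5, that is (N + 1) / 2

# A value that is absent forces the scan to look at all N elements.
print(comparisons_to_find(items, -1))   # 1000
```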
Figure 7.3: List vs. Set Performance
Figure 7.4: Binary Search
Figure 7.5: List vs. Set Performance Revisited
(name, count) in set...
Figure 7.6: Dictionaries as Tables
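Create an empty dictionary with {}.
Create a dictionary with initial contents by listing key:value pairs, as in {'Newton':1642, 'Darwin':1809}.
Look up the value associated with a key using [].
birthday = {
    'Newton' : 1642,
    'Darwin' : 1809
}
print "Darwin's birthday:", birthday['Darwin']
print "Newton's birthday:", birthday['Newton']
Darwin's birthday: 1809
Newton's birthday: 1642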
birthday = {
    'Newton' : 1642,
    'Darwin' : 1809
}
print birthday['Turing']
Traceback (most recent call last):
  File "key_error.py", line 5, in ?
    print birthday['Turing']
KeyError: 'Turing'
birthday = {}
birthday['Darwin'] = 1809
birthday['Newton'] = 1942 # oops
birthday['Newton'] = 1642
print birthday
{'Darwin': 1809, 'Newton': 1642}
Remove the entry for key k from dictionary d using del d[k]; this raises KeyError if k is not present.
birthday = {
    'Newton' : 1642,
    'Darwin' : 1809,
    'Turing' : 1912
}
print 'Before deleting Turing:', birthday
del birthday['Turing']
print 'After deleting Turing:', birthday
del birthday['Faraday']
print 'After deleting Faraday:', birthday
Before deleting Turing: {'Turing': 1912, 'Newton': 1642, 'Darwin': 1809}
After deleting Turing: {'Newton': 1642, 'Darwin': 1809}
Traceback (most recent call last):
  File "dict_del.py", line 10, in ?
    del birthday['Faraday']
KeyError: 'Faraday'
Test whether key k is in a dictionary d using k in d.
birthday = {
    'Newton' : 1642,
    'Darwin' : 1809
}
for name in ['Newton', 'Turing']:
    if name in birthday:
        print name, birthday[name]
    else:
        print 'Who is', name, '?'
Newton 1642
Who is Turing ?
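The same membership test gives one way to avoid the KeyError that del raises for a missing key. The sketch below assumes Python 3 and also shows dict.pop with a default, which is a standard dictionary method but is not covered in these notes.

```python
birthday = {'Newton': 1642, 'Darwin': 1809, 'Turing': 1912}

# Option 1: test for the key before deleting it.
if 'Faraday' in birthday:
    del birthday['Faraday']

# Option 2: pop() with a default returns the default instead of raising KeyError.
removed = birthday.pop('Turing', None)    # key present: returns its value
missing = birthday.pop('Faraday', None)   # key absent: returns the default

print(removed, missing)
print(sorted(birthday))
```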
for k in d loops over the dictionary's keys (rather than its values).
This differs from lists, where for loops over the values, rather than indices.
birthday = {
    'Newton' : 1642,
    'Darwin' : 1809,
    'Turing' : 1912
}
for name in birthday:
    print name, birthday[name]
Turing 1912
Newton 1642
Darwin 1809
| Method | Purpose | Example | Result |
|---|---|---|---|
| clear | Empty the dictionary. | d.clear() | Returns None, but d is now empty. |
| get | Return the value associated with a key, or a default value if the key is not present. | d.get('x', 99) | Returns d['x'] if 'x' is in d, or 99 if it is not. |
| keys | Return the dictionary's keys as a list. Entries are guaranteed to be unique. | birthday.keys() | ['Turing', 'Newton', 'Darwin'] |
| items | Return a list of (key, value) pairs. | birthday.items() | [('Turing', 1912), ('Newton', 1642), ('Darwin', 1809)] |
| values | Return the dictionary's values as a list. Entries may or may not be unique. | birthday.values() | [1912, 1642, 1809] |
| update | Copy keys and values from one dictionary into another. | See the example below. | |
Table 7.2: Dictionary Methods in Python
birthday = {
    'Newton' : 1642,
    'Darwin' : 1809,
    'Turing' : 1912
}
print 'keys:', birthday.keys()
print 'values:', birthday.values()
print 'items:', birthday.items()
print 'get:', birthday.get('Curie', 1867)
temp = {
    'Curie' : 1867,
    'Hopper' : 1906,
    'Franklin' : 1920
}
birthday.update(temp)
print 'after update:', birthday
birthday.clear()
print 'after clear:', birthday
keys: ['Turing', 'Newton', 'Darwin']
values: [1912, 1642, 1809]
items: [('Turing', 1912), ('Newton', 1642), ('Darwin', 1809)]
get: 1867
after update: {'Curie': 1867, 'Darwin': 1809, 'Franklin': 1920, 'Turing': 1912, 'Newton': 1642, 'Hopper': 1906}
after clear: {}
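Readers running Python 3 will see slightly different behavior: keys(), values(), and items() return view objects rather than lists. The sketch below, which assumes Python 3, wraps each in sorted() to get a list with a repeatable order; the names key_list, value_list, and item_list are introduced here for illustration.

```python
birthday = {'Newton': 1642, 'Darwin': 1809, 'Turing': 1912}

# In Python 3 these methods return views; sorted() turns each into a list.
key_list = sorted(birthday.keys())
value_list = sorted(birthday.values())
item_list = sorted(birthday.items())
print(key_list)     # ['Darwin', 'Newton', 'Turing']
print(value_list)   # [1642, 1809, 1912]
print(item_list)    # [('Darwin', 1809), ('Newton', 1642), ('Turing', 1912)]

# get() returns the default without adding the key to the dictionary.
print(birthday.get('Curie', 1867))   # 1867
print('Curie' in birthday)           # False

# update() copies every entry of its argument into the dictionary.
birthday.update({'Curie': 1867, 'Hopper': 1906})
print(len(birthday))                 # 5
```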
# Data to count.
names = ['tern','goose','goose','hawk','tern','goose', 'tern']
# Build a dictionary of frequencies.
freq = {}
for name in names:
    # Already seen, so increment count by one.
    if name in freq:
        freq[name] = freq[name] + 1
    # Never seen before, so add to dictionary.
    else:
        freq[name] = 1
# Display.
print freq
{'goose': 3, 'tern': 3, 'hawk': 1}
The counting loop can be simplified using dict.get:
freq = {}
for name in names:
    freq[name] = freq.get(name, 0) + 1
print freq
{'goose': 3, 'tern': 3, 'hawk': 1}
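For comparison, the standard library's collections.Counter (Python 2.7 and later) performs the same counting. This is an alternative to the dict.get loop, not the approach these notes teach; the sketch assumes Python 3 and checks that both give the same counts.

```python
from collections import Counter

names = ['tern', 'goose', 'goose', 'hawk', 'tern', 'goose', 'tern']

# The dict.get idiom from the notes.
freq = {}
for name in names:
    freq[name] = freq.get(name, 0) + 1

# Counter builds the same mapping from any iterable.
counted = Counter(names)

print(dict(counted) == freq)   # True
print(counted['eider'])        # 0: a missing key counts as zero, no KeyError
```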
keys = freq.keys()
keys.sort()
for k in keys:
    print k, freq[k]
goose 3
hawk 1
tern 3
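In Python 3, keys() returns a view that has no sort() method, so the two-step idiom above does not run there. The sketch below assumes Python 3 and uses sorted() instead; the by_count ordering is an extra illustration that is not in the original notes.

```python
freq = {'goose': 3, 'tern': 3, 'hawk': 1}

# sorted() accepts any iterable, so no separate list or sort() call is needed.
for k in sorted(freq):
    print(k, freq[k])

# Most frequent first; ties are broken alphabetically by name.
by_count = sorted(freq.items(), key=lambda pair: (-pair[1], pair[0]))
print(by_count)   # [('goose', 3), ('tern', 3), ('hawk', 1)]
```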
What is the inverse of {'a':1, 'b':1, 'c':1}? Several keys can share one value, so each value in the inverse must map to a list of keys.
Use dict.get(key, []) instead of dict.get(key, 0):
inverse = {}
for (key, value) in freq.items():
    seen = inverse.get(value, [])
    seen.append(key)
    inverse[value] = seen
keys = inverse.keys()
keys.sort()
for k in keys:
    print k, inverse[k]
1 ['hawk']
3 ['goose', 'tern']
Figure 7.7: Inverting a Dictionary
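Another way to write the inversion uses dict.setdefault, which returns the existing list for a value or stores and returns a new one. This is a sketch assuming Python 3; it is a variation on the notes' code rather than the version the notes present.

```python
freq = {'goose': 3, 'tern': 3, 'hawk': 1}

inverse = {}
for key, value in freq.items():
    # setdefault returns the list already stored for value,
    # or stores the empty list and returns it.
    inverse.setdefault(value, []).append(key)

# Sort each list so the result does not depend on dictionary ordering.
for value in inverse:
    inverse[value].sort()

print(sorted(inverse.items()))   # [(1, ['hawk']), (3, ['goose', 'tern'])]
```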
inverse = {}
for (key, value) in freq.items():
    if value not in inverse:
        inverse[value] = []
    inverse[value].append(key)
The string formatting operator "%" can take a dictionary as its right argument.
Use "%(varname)s" inside the format string to identify what's to be substituted.
birthday = {
    'Newton' : 1642,
    'Darwin' : 1809,
    'Turing' : 1912
}
entry = '%(name)s: %(year)s'
for (name, year) in birthday.items():
    temp = {'name' : name, 'year' : year}
    print entry % temp
Turing: 1912
Newton: 1642
Darwin: 1809
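Dictionary-based formatting works the same way in Python 3. The sketch below assumes Python 3, sorts the names so the output order is repeatable, and shows two properties of the named form: a name can be used more than once, and extra keys in the dictionary are ignored. The sentence template and the 'extra' key are made up for illustration.

```python
birthday = {'Newton': 1642, 'Darwin': 1809, 'Turing': 1912}

entry = '%(name)s: %(year)d'
lines = []
for name in sorted(birthday):
    # Each placeholder is filled from the dictionary entry with the same key.
    lines.append(entry % {'name': name, 'year': birthday[name]})
print('\n'.join(lines))

# A name may appear twice, and keys the format never mentions are ignored.
sentence = '%(name)s was born in %(year)d. Happy birthday, %(name)s!'
greeting = sentence % {'name': 'Darwin', 'year': 1809, 'extra': 0}
print(greeting)
```

A placeholder whose key is missing from the dictionary raises KeyError, just like an ordinary lookup.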
def settings(title, **kwargs):
    print 'title:', title
    for key in kwargs:
        print ' %s: %s' % (key, kwargs[key])
settings('nothing extra')
settings('colors', red=0.0, green=0.5, blue=1.0)
title: nothing extra
title: colors
 blue: 1.0
 green: 0.5
 red: 0.0
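The ** marker also works at the call site, where it unpacks an existing dictionary into keyword arguments. This is a sketch assuming Python 3; this version of settings returns its report lines instead of printing them so that the result is easy to inspect, which is a change from the function above.

```python
def settings(title, **kwargs):
    """Return report lines for title and any extra keyword arguments."""
    lines = ['title: ' + title]
    for key in sorted(kwargs):
        lines.append(' %s: %s' % (key, kwargs[key]))
    return lines

# With no extra keyword arguments, kwargs is an empty dictionary.
print(settings('nothing extra'))

# ** in a call spreads the dictionary's entries out as keyword arguments,
# so this is equivalent to settings('colors', red=0.0, green=0.5, blue=1.0).
colors = {'red': 0.0, 'green': 0.5, 'blue': 1.0}
report = settings('colors', **colors)
print('\n'.join(report))
```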
** in front of kwargs means "Put any extra keyword arguments in a dictionary, and assign it to kwargs"
# Note: this definition hides Python's built-in sum function.
def sum(*values):
    result = 0.0
    for v in values:
        result += v
    return result
print "no values:", sum()
print "single value:", sum(3)
print "five values:", sum(3, 4, 5, 6, 7)
no values: 0.0
single value: 3.0
five values: 25.0
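As with **, a single * can be used when calling a function to spread a list or tuple out into separate positional arguments. The sketch below assumes Python 3; the function is named total here (a choice made for this illustration) so that it does not hide the built-in sum, and describe is a hypothetical helper that shows what the function receives.

```python
def total(*values):
    """Add up any number of positional arguments."""
    result = 0.0
    for v in values:
        result += v
    return result

def describe(*values):
    """Report the type and length of the collected arguments."""
    return type(values).__name__, len(values)

readings = [3, 4, 5, 6, 7]
print(total())              # 0.0
print(total(3))             # 3.0
print(total(*readings))     # 25.0, same as total(3, 4, 5, 6, 7)
print(describe(1, 2, 3))    # ('tuple', 3)
```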
* in front of values means "Put any extra unnamed arguments in a tuple, and assign it to values"
A function can have at most one * argument.
What does ** mean? How and why would you use it?
Copyright © 2005-09 Python Software Foundation.
Created Thu Aug 6 21:56:06 2009 UTC