Software Carpentry Day 3: Data Structures and Image Processing Exercises

Data Structures (Morning)

A nanotechnology company called Molecules 'R' Us has hired you to write a simple inventory management program. The program reads two files: one describing how many atoms of each kind are required to make different molecules, and another describing how many atoms the company actually has in its teeny tiny warehouse. It then prints a list of the molecules the company could make. The first file (describing molecules) is formatted like this:

# Comments start with '#' and go to the end of the line.
# Blank lines (like the one below) are allowed.

helium : He 1               # molecule name, colon, atom type, number of atoms
ammonia : N 1 H 3           # molecules may contain many types of atoms
salt : Na 1 Cl 1            # atom names may be one or two characters long
lithium hydride : Li 1 H 1  # molecule names may be several words long
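
One way to handle a single line of this format is sketched below. The helper name parse_formula_line is hypothetical (it is not part of the exercise), and the sketch assumes every line is well formed: one colon, followed by alternating atom symbols and whole-number counts.

```python
def parse_formula_line(line):
    """Return (name, atoms) for one line, or None if the line holds no data.

    Hypothetical helper, not part of the exercise statement.
    Assumes well-formed input: 'name : Symbol count Symbol count ...'.
    """
    # Drop everything from the first '#' onward, then trim whitespace.
    text = line.split('#', 1)[0].strip()
    if not text:
        return None  # blank line or comment-only line

    # Everything before the colon is the (possibly multi-word) molecule name.
    name, _, rest = text.partition(':')
    tokens = rest.split()

    # Tokens alternate: symbol, count, symbol, count, ...
    atoms = {}
    for symbol, count in zip(tokens[0::2], tokens[1::2]):
        atoms[symbol] = int(count)
    return name.strip(), atoms


print(parse_formula_line("lithium hydride : Li 1 H 1  # several words"))
# prints ('lithium hydride', {'Li': 1, 'H': 1})
```

Returning None for lines without data lets the caller skip comments and blank lines with a single check.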

The file that specifies how many atoms the company has on hand is formatted like this:

# Once again, comments and blank lines are allowed.

He 3                        # atom name, number available
H 2
N 5
Li 1
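
The inventory format can be handled in much the same way. The sketch below uses two hypothetical helpers, parse_inventory_line and inventory_from_lines, and assumes each data line holds exactly one atom symbol followed by one whole number. It works on a list of strings so that it can be tried without creating a file.

```python
def parse_inventory_line(line):
    """Return (symbol, count) for one line, or None if the line holds no data.

    Hypothetical helper; assumes well-formed input: 'Symbol count'.
    """
    # Strip the comment (if any) and surrounding whitespace.
    text = line.split('#', 1)[0].strip()
    if not text:
        return None
    symbol, count = text.split()
    return symbol, int(count)


def inventory_from_lines(lines):
    """Build an inventory dictionary from an iterable of lines."""
    inventory = {}
    for line in lines:
        entry = parse_inventory_line(line)
        if entry is not None:
            symbol, count = entry
            inventory[symbol] = count
    return inventory


example = [
    "# Once again, comments and blank lines are allowed.",
    "",
    "He 3                        # atom name, number available",
    "H 2",
    "N 5",
    "Li 1",
]
print(inventory_from_lines(example))
# prints {'He': 3, 'H': 2, 'N': 5, 'Li': 1}
```

Because an open file can be iterated line by line, the same loop should also work when it is handed a file object instead of a list.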

For these two input files, the output would be:

helium
lithium hydride

Note that the molecules' names appear in alphabetical order, and that the program doesn't have to say how many molecules of each type could be made.

To solve this problem, break it up into four pieces:

  1. read_formulas(filename) reads a file of the first kind, returning a dictionary whose keys are the names of molecules, and whose values are also dictionaries showing how many atoms of each kind are required by that molecule. Using the first input file above as an example, read_formulas would return:
    {
        'helium' : {'He' : 1},
        'ammonia' : {'N' : 1, 'H' : 3},
        'salt' : {'Na' : 1, 'Cl' : 1},
        'lithium hydride' : {'Li' : 1, 'H' : 1}
    }
    
  2. read_inventory(filename) reads a file of the second kind, returning a dictionary whose keys are atomic symbols and whose values are the number of atoms of that kind available. Using the second input file above as an example, read_inventory would return:
    {
        'He' : 3,
        'H' : 2,
        'N' : 5,
        'Li' : 1
    }
    
  3. makeable(formulas, inventory) takes a dictionary of formulas as its first argument, and an inventory dictionary as its second, and returns a set of makeable molecules. In this example, its output would be:
    {'helium', 'lithium hydride'}
  4. show_result(molecules) takes a set of molecule names as input, and prints their names in alphabetical order.
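
The last two pieces can be sketched as follows, using the example dictionaries shown above as input. This is one possible approach, not the required solution. It assumes that an atom missing from the inventory counts as zero available, and that each molecule is checked on its own, since the program does not have to track how many molecules could be made.

```python
def makeable(formulas, inventory):
    """Return the set of molecule names whose atom requirements can be met."""
    result = set()
    for name, atoms in formulas.items():
        # Assumption: an atom that is not listed in the inventory has count 0.
        if all(inventory.get(symbol, 0) >= needed
               for symbol, needed in atoms.items()):
            result.add(name)
    return result


def show_result(molecules):
    """Print molecule names, one per line, in alphabetical order."""
    for name in sorted(molecules):
        print(name)


formulas = {
    'helium': {'He': 1},
    'ammonia': {'N': 1, 'H': 3},
    'salt': {'Na': 1, 'Cl': 1},
    'lithium hydride': {'Li': 1, 'H': 1},
}
inventory = {'He': 3, 'H': 2, 'N': 5, 'Li': 1}

show_result(makeable(formulas, inventory))
# prints:
# helium
# lithium hydride
```

Ammonia is rejected because it needs three hydrogen atoms and only two are available. Salt is rejected because neither Na nor Cl appears in the inventory.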

Note: you're working with a partner on this exercise, so you could either split up the work and coordinate through a version control repository, or pair program as before. In either case, spend a few minutes thinking of very simple test cases before you start to write your functions. What are the simplest examples of each kind of file? The next-to-simplest? What output should your program produce for them?

Image Processing (Afternoon)

One of the Reasons Programming Is Hard

Where would you shelve books by the following authors?

The correct answers are:

My thanks to Margaret Menzin for this example.