More Unix Shell
April 24, 2010: We are pleased to announce that Version 4 of this course is now under development. For updates and an early peek at the content, please check out the Software Carpentry blog at http://www.software-carpentry.org/blog/.
1) Introduction
- In the first lecture we talked about three components of an OS for which the shell can act as an interface: files, processes, communication
- We will first fill in a few details about files that the first lecture did not have time to cover, then focus on process management and communication

2) You Can Skip This Lecture If...
- You know what stdin and stdout are
- You know what a process is
- You know what a pipe is
- You know what $PATH is
- You know what -rwxr-xr-x means

3) Wildcards
Some characters (called wildcards) mean special things to the shell. (You will see this again when you get to regular expressions.)
* matches zero or more characters
- So ls bio/*.txt lists all the text files in the bio directory
$ ls bio/*.txt
bio/albus.txt bio/ginny.txt bio/harry.txt bio/hermione.txt bio/ron.txt
? matches any single character
- So ls jan-??.txt lists text files whose names start with "jan-" followed by two characters
- You can probably guess what ls jan-??.* does
- Note that:
- The shell expands wildcards, not individual applications
ls can't tell whether it was invoked as ls *.txt or as ls earth.txt venus.txt
- Wildcards only work with filenames, not with command names
ta* does not find the tabulate command
~ is not strictly a wildcard, but it is a character with a special meaning
- By itself it refers to the current user's home directory
cd ~ is the same as cd $HOME
- As the prefix to a user name (e.g., ~reid), it refers to that user's home directory
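
The expansion rules above can be tried safely in a throwaway directory. This is a minimal sketch, not part of the original lecture: the file names are made up for illustration, and echo is used only because it prints whatever arguments the shell hands it.

```shell
# Work in a scratch directory so no real files are touched.
scratch=$(mktemp -d)
cd "$scratch"
touch jan-01.txt jan-02.txt jan-02.csv jan-100.txt feb-01.txt

# The shell replaces each pattern with the matching names before echo runs.
all_txt=$(echo *.txt)        # every name ending in .txt
two_chars=$(echo jan-??.txt) # "jan-", exactly two characters, then .txt
any_ext=$(echo jan-??.*)     # same prefix, any extension

echo "$all_txt"
echo "$two_chars"
echo "$any_ext"

# Return to the previous directory and clean up.
cd - > /dev/null
rm -rf "$scratch"
```

Note that jan-100.txt is not matched by jan-??.txt, because ?? stands for exactly two characters.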

4) File Ownership and Permissions
- On Unix, every user belongs to one or more groups
- The groups command will show you which ones you are in
- Every file is owned by a particular user and a particular group
- Can assign read (r), write (w), and execute (x) permissions independently to user, group, and others
- Read: can look at contents, but not modify them
- Write: can modify contents
- Execute: can run the file (e.g., it's a program)
ls -l shows this information
- Along with the file's size and a few other things
- Permissions displayed as three rwx triples
- "Missing" permissions shown by "-"
- So rw-rw-r-- means:
- User and group can read and write
- Everyone else can read, but not write
- No one can execute

5) Directory Permissions
- Execute permission means something different for directories
- Allows you to "go into" a directory, but does not mean you can read its contents
- If tools has permission rwx--x--x, then:
- If someone other than the owner does ls tools, permission is denied
- But anyone who wants to can run tools/pfold
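
As a rough illustration of the tools example, the sketch below creates a directory with exactly those permissions and prints its mode string. The scratch directory is an assumption for safety; the symbolic form u=rwx,go=x is one standard way to ask chmod for rwx--x--x.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/tools"

# Owner gets read, write, execute; group and others get execute only.
chmod u=rwx,go=x "$scratch/tools"

# -d lists the directory itself rather than its contents;
# the first 10 characters are the file type plus the three rwx triples.
dir_mode=$(ls -ld "$scratch/tools" | cut -c1-10)
echo "$dir_mode"

rm -rf "$scratch"
```

The leading d in the output marks the entry as a directory.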

6) Changing Permissions
- Change permissions using chmod
chmod u+x broom allows broom's owner to run it
chmod o-r notes.txt takes away the world's read permission for notes.txt
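
The two chmod commands above can be checked by looking at the mode string before and after. This is a small sketch under assumptions: the files are empty stand-ins created in a scratch directory, and mode_of is a hypothetical helper defined only for this example.

```shell
scratch=$(mktemp -d)
touch "$scratch/broom" "$scratch/notes.txt"

# Start both files from a known state: rw-r--r--
chmod u=rw,go=r "$scratch/broom" "$scratch/notes.txt"

# Hypothetical helper: print just the nine permission characters of a file.
mode_of() { ls -l "$1" | cut -c2-10; }

chmod u+x "$scratch/broom"       # add execute for the owner
chmod o-r "$scratch/notes.txt"   # remove read for everyone else

broom_mode=$(mode_of "$scratch/broom")
notes_mode=$(mode_of "$scratch/notes.txt")
echo "broom:     $broom_mode"
echo "notes.txt: $notes_mode"

rm -rf "$scratch"
```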

7) Changing Permissions Continued
- To make a script such as nojunk runnable, change its permissions to rwxr-xr-x
- Run it with ./nojunk
- Or if $HOME/bin is in your search path, move it there
- Don't call your temporary test programs test
- There's already /usr/bin/test, and most shells also have a built-in test command
- Your PATH may cause that program to run instead of yours
- Confusion results, so use something else, e.g., ./try
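
The confusion around the name test is easy to reproduce. The sketch below is illustrative only: it writes a tiny script named test into a scratch directory (the script body and its message are made up), then compares running the bare name with running the explicit path.

```shell
scratch=$(mktemp -d)
cd "$scratch"

# A stand-in "temporary test program" that announces itself.
printf '#!/bin/sh\necho "my experiment ran"\n' > test
chmod u+x test

# Ask the shell what the bare name refers to.
type test

# The bare name runs the existing test command, which prints nothing
# and reports failure when given no arguments.
bare=$(test) || true

# An explicit path runs the file in this directory.
explicit=$(./test)

echo "bare name printed: '$bare'"
echo "explicit path printed: '$explicit'"

cd - > /dev/null
rm -rf "$scratch"
```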

8) Ownership and Permission: Windows
- Of course, it all works differently on Windows
- Not better or worse, just differently
- Windows XP uses access control lists
- Every file and directory has a list of (who, what) pairs
- "Who" can be a group
- Some versions of Unix provide ACLs as well, but many tools don't understand them
- Older versions of Windows (such as Windows 95 and Windows 98, which have no per-file permissions) are fundamentally insecure, and shouldn't be used
- Cygwin does its best to make the Windows model look like Unix's
- When you trip over the differences, please consult a system administrator

9) Configuration
- To set a variable's value automatically when you log in, edit ~/.bashrc
"~" is a shortcut meaning "your home directory"
# Add personal tools directory to PATH.
PATH=$HOME/bin:$PATH
# Personal settings.
export EDITOR=/local/bin/emacs
export PRINTER=gryffindor-laserwriter
# Change default behavior of commands.
alias ls="ls -F"
- Note: .bashrc files can become very complex...
- Many applications look for personal configuration files in the user's home directory
- By convention, their names begin with "." so that a normal ls won't show them
- Once upon a time, the "rc" at the end meant "run commands"

10) The Shell as a programming environment
- The real power of the shell is when you look at it as a component-based programming environment
- Small tools that each do one job
- ...can be connected together to create ad hoc solutions to larger problems
- A good model, even when you're building large GUI or web applications

11) Redirecting Input and Output
- A running program is called a process
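- Every process automatically has three connections to the outside world:
- Standard input (stdin): by default, the keyboard
- Standard output (stdout): by default, the screen
- Standard error (stderr): also the screen by default, used for error messages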
- You can tell the shell to connect standard input and standard output to files instead
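
A minimal sketch of connecting those streams to files, assuming a scratch directory and made-up file names. The < and > operators redirect standard input and standard output; 2> does the same for the third connection, standard error.

```shell
scratch=$(mktemp -d)
printf 'ron\nalbus\nharry\n' > "$scratch/names.txt"

# sort reads standard input from names.txt and writes standard output to sorted.txt.
sort < "$scratch/names.txt" > "$scratch/sorted.txt"

# ls writes the name it finds to standard output and its complaint about
# the missing file to standard error; each goes to its own file.
ls "$scratch/names.txt" "$scratch/no-such-file" \
    > "$scratch/out.txt" 2> "$scratch/err.txt" || true

sorted=$(cat "$scratch/sorted.txt")
listing=$(cat "$scratch/out.txt")
complaint=$(cat "$scratch/err.txt")

echo "$sorted"
echo "stdout got: $listing"
echo "stderr got: $complaint"

rm -rf "$scratch"
```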

12) Redirection Examples
- Save the line, word, and character counts of all text files to words.len:
$ cd bio
$ wc *.txt > words.len
- Nothing appears on the screen because output is being sent to the file words.len
- Check contents using cat
$ cat words.len
7 66 468 albus.txt
5 46 311 ginny.txt
5 49 342 harry.txt
5 49 331 hermione.txt
6 54 364 ron.txt
28 264 1816 total
- Try typing cat > junk.txt
- No input file specified, so cat reads from the keyboard
- Output sent to a file
- The world's dumbest text editor
- When you're done, use rm junk.txt to get rid of the file
- Don't type rm * unless you're really, really sure that's what you want to do...
- Don't redirect out to the same file, e.g., sort words > words
- The shell sets up redirection before running the command
- Redirecting out to an existing file truncates it (makes it empty)
- sort then goes and reads the empty file
- Contents of words are lost
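
The sketch below reproduces the mistake on a disposable copy and then shows one common way around it: write to a temporary file and replace the original only if the command succeeded. The file contents and the name words.tmp are assumptions made for the example.

```shell
scratch=$(mktemp -d)
cd "$scratch"
printf 'ron\nalbus\nharry\n' > words
cp words lost

# The mistake: the shell empties "lost" before sort ever opens it.
sort lost > lost
lost_bytes=$(wc -c < lost | tr -d ' ')
echo "bytes left after 'sort lost > lost': $lost_bytes"

# Safer: send output to a different file, then rename it over the original.
sort words > words.tmp && mv words.tmp words
sorted=$(cat words)
echo "$sorted"

cd - > /dev/null
rm -rf "$scratch"
```

For sort specifically, the -o option (sort -o words words) is another way to write the result back to an input file.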

13) Pipes
- Suppose you want to use the output of one program as the input of another
- E.g., use wc -w *.txt to count the words in some files, then sort -n to sort numerically
- The obvious solution is to send output of first command to a temporary file, then read from that file
$ wc -w *.txt > words.tmp
$ sort -n words.tmp
46 ginny.txt
49 harry.txt
49 hermione.txt
54 ron.txt
66 albus.txt
264 total
$ rm words.tmp
- The right answer is to use a pipe
- Written as "|"
- Tells the operating system to connect the standard output of the first program to the standard input of the second
$ wc -w *.txt | sort -n
46 ginny.txt
49 harry.txt
49 hermione.txt
54 ron.txt
66 albus.txt
264 total
- More convenient and less error prone than temporary files

14) Combining Pipes
- Can chain any number of commands together
- And combine with input and output redirection
$ grep 'Title' spells.txt | sort | uniq -c | sort -n -r | head -10 > popular_spells.txt
- Any program that reads from standard input and writes to standard output can use redirection and pipes
- Such programs are often called filters
- If your programs work like filters, you (and other people) can combine them with standard tools
- A combinatorial explosion of goodness
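
To make the idea of a filter concrete, here is a small sketch in which a home-made filter is dropped into a pipeline of standard tools. The shout function is hypothetical and stands in for any program of your own; the spell names are made-up sample data.

```shell
# A filter only reads standard input and writes standard output,
# so it can sit anywhere in a pipeline.
# Hypothetical filter: upper-case whatever arrives on standard input.
shout() { tr '[:lower:]' '[:upper:]'; }

# Count how often each line occurs, keep the most frequent one,
# and pass it through the home-made filter.
top_spell=$(printf 'accio\nlumos\naccio\nnox\naccio\nlumos\n' \
    | sort | uniq -c | sort -n -r | head -1 | shout)

echo "$top_spell"
```

Because shout makes no assumptions about where its input comes from or where its output goes, it combines with sort, uniq, and head without any extra code.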

15) Cygwin on Windows
- Cygwin does things a little differently
- Uses the notation /cygdrive/c/somewhere instead of Windows' C:/somewhere
- Because the colon in C:/somewhere would clash with the colons in the PATH variable
- By default, Cygwin treats C:/cygwin as the root of its file system
- So /home/rweasley is a synonym for C:/cygwin/home/rweasley
- Yes, it can be confusing
- But then, it is trying to make one operating system look like another

16) Job control
- A job is a program whose execution has been initiated by the user
- At any moment a process can be running or suspended
- Foreground job:
- a process that has control of the terminal
- Background job:
- runs concurrently with the parent shell and does not take control of the keyboard
- output may still appear in the shell
- Start a job in the background by appending & to the command
- Commands:
^C | Send the SIGINT (interrupt) signal to the foreground process
^Z | Send the SIGTSTP signal to the foreground process, suspending it
jobs | Display the status of active jobs controlled by the shell
fg | Bring the current job (by default, the most recently suspended or backgrounded one) to the foreground
bg | Continue the suspended job in the background
Table 10.1: Job Commands
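
The keyboard shortcuts need an interactive terminal, but the same ideas can be sketched in a script. The example below is an illustration under assumptions: sleep 30 stands in for a long-running program, and kill is used to send SIGTERM explicitly.

```shell
# & starts the command in the background; the shell carries on immediately.
sleep 30 &

# $! holds the process ID of the most recently started background job.
pid=$!

# List the jobs this shell is managing.
jobs

# Ask the background process to terminate.
kill -TERM "$pid"

# wait collects the job's exit status; a process ended by a signal
# is reported as 128 plus the signal number.
status=0
wait "$pid" || status=$?
echo "exit status: $status"
```

SIGTERM is signal 15, so the reported status is 143.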

17) grep
- One of the most useful shell commands
- Search for a word or pattern in a file or set of files
- E.g., grep reid /etc/passwd
- Lots of useful options:
-i case-insensitive search
-r recurse through subdirectories
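
A short sketch of the two options on made-up files in a scratch directory. The -l option used in the last command is an addition not mentioned above; it makes grep print only the names of matching files.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/bio/notes"
printf 'Harry Potter\nRon Weasley\n' > "$scratch/bio/names.txt"
printf 'harry owes Ron a broom\n' > "$scratch/bio/notes/todo.txt"

# Case-sensitive by default: "harry" does not match "Harry".
exact=$(grep harry "$scratch/bio/names.txt" || true)

# -i ignores case.
ignore_case=$(grep -i harry "$scratch/bio/names.txt")

# -r searches every file under bio; -l prints only the matching file names.
files=$(cd "$scratch" && grep -r -i -l harry bio | sort)

echo "exact: '$exact'"
echo "ignore case: '$ignore_case'"
echo "$files"

rm -rf "$scratch"
```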

18) Shell Programs
- Any set of shell commands can be turned into a program!
- If it's worth doing again, it's worth automating
- Create a file called nojunk
#!/usr/bin/bash
rm -f *.junk
- Use man rm to find out what the "-f" flag does
- #!/usr/bin/bash means "run this using the Bash shell"
- Any program name can follow the #!
- We'll see some possibilities later
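
Putting the pieces together, the sketch below creates nojunk, makes it executable, and runs it in a scratch directory with made-up files. One assumption to flag: the first line uses /bin/sh rather than /usr/bin/bash, purely so the sketch also runs on systems where Bash is installed somewhere else.

```shell
scratch=$(mktemp -d)
cd "$scratch"
touch a.junk b.junk keep.txt

# Write the two-line program into a file called nojunk.
cat > nojunk <<'EOF'
#!/bin/sh
rm -f *.junk
EOF

# Give the owner execute permission, then run it from the current directory.
chmod u+x nojunk
./nojunk

remaining=$(ls)
echo "$remaining"

cd - > /dev/null
rm -rf "$scratch"
```

Only keep.txt and nojunk itself are left once the script has run.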

19) More Advanced Tools
chmod | Change file and directory permissions.
du | Print the disk space used by files and directories.
find | Find files with names that match patterns, that are of a certain age or size, etc.
grep | Print lines matching a pattern.
gunzip | Uncompress a file.
gzip | Compress a file.
lpr | Send a file to a printer.
lprm | Remove a print job from a printer's queue.
lpq | Check the status of a printer's queue.
ps | Display running processes.
tar | Archive files.
which | Find the path to a program.
who | See who is logged in.
xargs | Execute a command for each line of input.
Table 10.2: Advanced Command-Line Tools
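
Two of the tools in the table, find and xargs, are often used together. A minimal sketch, assuming a scratch directory and made-up files whose names contain no spaces (plain xargs splits its input on whitespace):

```shell
scratch=$(mktemp -d)
cd "$scratch"
mkdir -p bio/old
printf 'one two three\n' > bio/albus.txt
printf 'four five\n' > bio/old/ron.txt
touch bio/readme.md

# find walks the directory tree and prints every name matching the pattern.
found=$(find bio -name '*.txt' | sort)
echo "$found"

# xargs turns those names into arguments for cat; wc -w counts the words.
total_words=$(find bio -name '*.txt' | xargs cat | wc -w | tr -d ' ')
echo "total words: $total_words"

cd - > /dev/null
rm -rf "$scratch"
```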

20) Summary
- The shell is as powerful as most programming languages
- Actually has features that most programming languages don't
- But its limits are as important as its capabilities
- As soon as you need functions or data structures, you should switch to Python (see Python Basics)
