- Hello, and welcome to the second episode of the Software Carpentry lecture on version control. This episode introduces the basic workflow you’ll use when working with a version control system. To keep things simple, we’ll assume that someone has already set up a repository for you. A later episode will show you how to do this yourself.
- Dracula [mad laughter] and Wolfman [spooky howl] have just been assigned to the Universal Monsters project, and need to figure out where they should hide their secret lair. The Mummy has already put some notes in a version control repository on the universal.software-carpentry.org server. Its full URL is https://universal.software-carpentry.org/monsters. Every repository has an address like this that uniquely identifies the location of the master copy.
- It’s Monday night. Dracula sits down at his computer and runs SmartSVN. This is a Subversion client, i.e., a program that runs on your machine, and knows how to move files back and forth to a repository located on a server. There are lots of other graphical clients out there, and many power users run Subversion commands from the shell, but we’ll use SmartSVN in this lecture.
- In order to create a working copy on his computer, Dracula has to check out the repository. He only has to do this once per project; once he has a working copy, he can update it in place to get other people’s work.
- Using SmartSVN, Dracula goes to the Project menu and selects Checkout...
- The dialog that appears on his screen has two required fields. The first is the URL of the repository, which tells Subversion where to look for the master copy. The second specifies where Dracula wants the working copy put on his computer.
- After filling in both fields, he confirms the dialog. SmartSVN then opens a connection to the server, checks that Dracula is allowed access to the repository, creates a new directory on his computer, and copies files into it.
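For readers who prefer the shell route mentioned earlier, the two checkout fields map onto the two arguments of `svn checkout`. The sketch below only builds and prints the command rather than running it. The destination directory name `monsters` is an assumption made for illustration, and `run_if_available` is a hypothetical helper, not part of SmartSVN or Subversion.

```python
import shutil
import subprocess


def checkout_command(repo_url, working_copy_dir):
    """Build the argument list for `svn checkout URL PATH`."""
    return ["svn", "checkout", repo_url, working_copy_dir]


def run_if_available(command):
    """Run the command only if its executable is on the PATH (hypothetical helper)."""
    if shutil.which(command[0]) is None:
        print("svn not found; would have run:", " ".join(command))
        return None
    return subprocess.run(command, check=True)


# The URL comes from the lecture; the destination directory is a made-up example.
command = checkout_command(
    "https://universal.software-carpentry.org/monsters",
    "monsters",
)
print(" ".join(command))
```

Calling `run_if_available(command)` would attempt the actual checkout, which needs network access and permission on the server, so it is left out of the sketch.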
- Once the checkout is complete, SmartSVN makes a bookmark for the project. As in a standard file browser, double-clicking on this, and on the directories inside it, opens things up and displays their contents.
- Dracula can find out more about the history of the project by using Subversion’s log command. When he clicks on the Log button, SmartSVN displays a summary of all the changes made to the project so far. This list includes the revision number, the name of the person who made the change, the date the change was made, and whatever comment the user provided when the change was submitted. As you can see, the monsters project is currently at revision 6, and all changes so far have been made by the Mummy.
- While we have this dialog open, notice how detailed the comments on the updates are. Good comments are as important in version control as they are in coding, because without them, it can be very difficult to figure out who did what, when, and why. You can use comments like, “Changed things,” and, “Fixed it,” if you want, or even nothing at all, but you’ll only be creating trouble for your future self.
- A couple of cubicles away, Wolfman also runs SmartSVN to check out a working copy of the repository. He also gets Version 6, so the files on his machine are the same as the files on Dracula’s. Unfortunately, he then has a bad hair episode, and has to take a short break.
- While Wolfman is calming down, Dracula decides to add some information to the repository about Jupiter’s moons. Using his favorite editor, he creates a file in the jupiter directory called moons.txt, and fills it with information about Io, Europa, Ganymede, and Callisto:
    Name      Orbital Radius  Orbital Period  Mass    Radius
    Io        421.6           1.769138        893.2   1821.6
    Europa    670.9           3.551181        480.0   1560.8
    Ganymede  1070.4          7.154553        1481.9  2631.2
    Calisto   1882.7          16.689018       1075.9  2410.3
- After double-checking his data, he wants to commit the file to the repository so that everyone else on the project can see it.
- The first step is to add the file to his working copy. This isn’t the same as creating it—Dracula has already done that. Instead, adding the file tells Subversion to start keeping track of changes to that file. It’s quite common, particularly in programming projects, to have backup files or artefacts of compilation in a directory that aren’t worth storing in the repository. This is why version control requires you to explicitly tell it which files are to be managed.
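The difference between creating a file and adding it can be sketched with a toy status report. This is an illustration, not how Subversion is implemented. The one-letter flags are loosely modeled on `svn status` (`?` for a file that is not tracked, `A` for one scheduled for addition). The file names `jupiter/notes.txt` and `jupiter/moons.txt~` are assumptions made up for the example.

```python
def status_lines(files_on_disk, versioned, scheduled_for_add):
    """Return one status line per file that is not already under version control.

    'A' marks a file scheduled for addition; '?' marks a file that is not tracked.
    """
    lines = []
    for name in sorted(files_on_disk):
        if name in versioned:
            continue  # already tracked and, in this toy model, unmodified
        flag = "A" if name in scheduled_for_add else "?"
        lines.append(f"{flag}       {name}")
    return lines


# Hypothetical working copy: one tracked file, Dracula's new file, an editor backup.
files_on_disk = {"jupiter/notes.txt", "jupiter/moons.txt", "jupiter/moons.txt~"}
versioned = {"jupiter/notes.txt"}

before_add = status_lines(files_on_disk, versioned, scheduled_for_add=set())
after_add = status_lines(files_on_disk, versioned, scheduled_for_add={"jupiter/moons.txt"})

print("\n".join(before_add))
print("\n".join(after_add))
```

Only the file Dracula explicitly added changes from `?` to `A`; the backup file stays untracked and would not be sent to the repository.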
- Once he has told Subversion to add the file, Dracula can go ahead and commit his changes to the repository. He clicks “Commit”, adds a meaningful comment, and then clicks “Continue”. SmartSVN establishes a connection and copies his changes over to the master.
- The version number has now changed from 6 to 7. Notice that this version number applies to the whole repository, not just to files that have changed. Version numbers always refer to snapshots of the entire repository, so “Version 119” of a file belongs to the same snapshot as Version 119 of any other file or directory in the repository.
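One way to picture repository-wide version numbers is a toy model in which every commit stores a snapshot of the whole tree under a single counter. This is a simplified sketch, not Subversion's actual storage format. `ToyRepository`, the file `jupiter/notes.txt`, and the contents and commit message are all assumptions invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepository:
    """A toy stand-in for a repository: one counter, whole-tree snapshots."""

    revision: int = 0
    snapshots: dict = field(default_factory=dict)  # revision -> {path: content}
    log: list = field(default_factory=list)        # (revision, author, message)

    def __post_init__(self):
        # Make sure the starting revision has a (possibly empty) snapshot.
        self.snapshots.setdefault(self.revision, {})

    def commit(self, author, message, changes):
        """Record `changes` (path -> new content) as one new repository-wide revision."""
        tree = dict(self.snapshots[self.revision])  # copy the current snapshot
        tree.update(changes)
        self.revision += 1
        self.snapshots[self.revision] = tree
        self.log.append((self.revision, author, message))
        return self.revision


# Start at revision 6, as in the lecture, with one made-up file already present.
repo = ToyRepository(revision=6, snapshots={6: {"jupiter/notes.txt": "placeholder notes"}})
new_rev = repo.commit(
    "dracula",
    "Add data on Jupiter's four largest moons",
    {"jupiter/moons.txt": "placeholder moon data"},
)

print(new_rev)                      # the whole repository moved from 6 to 7
print(sorted(repo.snapshots[7]))    # revision 7 contains the untouched file too
```

The untouched file has no revision number of its own: asking for it at revision 7 returns the same content it had at revision 6.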
- The next morning, when he’s back in human form, Wolfman starts work once again. He runs SmartSVN and does an update.
- SmartSVN tells him that a new file has been added to the repository, and Wolfman’s working copy is now up to date with Version 7, which is the current head, or most recent, revision.
- Looking in the new file, jupiter/moons.txt, Wolfman notices that Dracula has misspelled “Callisto”; it’s supposed to have two L’s. Wolfman goes ahead and edits that line of the file:
    Name      Orbital Radius  Orbital Period  Mass    Radius
    Io        421.6           1.769138        893.2   1821.6
    Europa    670.9           3.551181        480.0   1560.8
    Ganymede  1070.4          7.154553        1481.9  2631.2
    Callisto  1882.7          16.689018       1075.9  2410.3
- He also adds a line about Amalthea, which he thinks might be a good site for a secret lair despite its small size:
    Name      Orbital Radius  Orbital Period  Mass    Radius
    Amalthea  181.4           0.498179        0.075   125.0
    Io        421.6           1.769138        893.2   1821.6
    Europa    670.9           3.551181        480.0   1560.8
    Ganymede  1070.4          7.154553        1481.9  2631.2
    Callisto  1882.7          16.689018       1075.9  2410.3
- He then commits his changes to create Version 8 of the repository [spooky howl].
- Later that night, when Dracula wakes up and starts working again, the first thing he wants to do is get Wolfman’s changes. Before clicking “Update”, though, he clicks on the “Log” button to see who has done what. He’s curious what Wolfman changed, so he selects moons.txt and asks SmartSVN to show him the changes.
- SmartSVN brings up a double-panelled display that uses color to show insertions, changes, and deletions on a line-by-line basis.
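A text-only approximation of this line-by-line comparison can be produced with Python's standard `difflib` module. The sketch below compares the two versions of moons.txt shown earlier. It illustrates the idea of a line-oriented diff and is not what SmartSVN runs internally.

```python
import difflib

revision_7 = """\
Name Orbital Radius Orbital Period Mass Radius
Io 421.6 1.769138 893.2 1821.6
Europa 670.9 3.551181 480.0 1560.8
Ganymede 1070.4 7.154553 1481.9 2631.2
Calisto 1882.7 16.689018 1075.9 2410.3
""".splitlines()

revision_8 = """\
Name Orbital Radius Orbital Period Mass Radius
Amalthea 181.4 0.498179 0.075 125.0
Io 421.6 1.769138 893.2 1821.6
Europa 670.9 3.551181 480.0 1560.8
Ganymede 1070.4 7.154553 1481.9 2631.2
Callisto 1882.7 16.689018 1075.9 2410.3
""".splitlines()

# '- ' lines appear only in revision 7, '+ ' lines only in revision 8,
# and '? ' lines are hints pointing at the characters that differ.
diff = [line.rstrip("\n") for line in difflib.ndiff(revision_7, revision_8)]
for line in diff:
    print(line)
```

The `? ` hint line plays the role of the color highlighting: it points at the single extra letter, which is easy to miss when reading the two spellings side by side.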
- After checking them over, Dracula is satisfied, so he dismisses this view, and does the update.
- This is a very common workflow: check to see what has changed in the repository, check to see if it’s going to get in your way, and if it’s not, pull those changes down to your machine. It’s worth noticing here how important Wolfman’s comments about his changes were. It’s hard to see the difference between ‘Calisto’ with one L and ‘Callisto’ with two, even if the line containing the difference has been highlighted. Without Wolfman’s comments, Dracula might have wasted time wondering if there actually was a difference or not.
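On the command line, the same check-then-update workflow corresponds roughly to three Subversion commands. The sketch below only assembles them as argument lists and does not run anything. `BASE` and `HEAD` are Subversion's revision keywords for the working copy's revision and the newest revision in the repository, and `review_then_update_commands` is a hypothetical helper name.

```python
def review_then_update_commands(path):
    """Return the commands for: read the log, inspect the changes, then update."""
    return [
        ["svn", "log", "-r", "BASE:HEAD"],         # who changed what, and why
        ["svn", "diff", "-r", "BASE:HEAD", path],  # what changed in one file
        ["svn", "update"],                         # pull the changes down
    ]


commands = review_then_update_commands("jupiter/moons.txt")
for command in commands:
    print(" ".join(command))
```

Keeping `svn update` last mirrors Dracula's habit of reading the log and the diff before letting anything touch his working copy.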
- In fact, Wolfman should probably have made two separate commits, since there’s no logical connection between fixing a typo in Callisto’s name and adding information about Amalthea to the same file. Just as a function or program should do one job, and one job only, a single commit to version control should have a single logical purpose so that it’s easier to find, understand, and if necessary undo later on.
- In our next episode, we’ll take a look at the workflow when changes conflict [mad laughter].
Wow, that was a great video. I can see that using a version control program would be useful even if a project is being done by only yourself. I will have to look into this.
Love the videos by the way. What is the “Proper” order one should be viewing the lectures?
Cheers,
JR
Nice, I did not know about SVN before. I will try it.
At this point (after two episodes), I’m wondering if version control works well for .doc or LaTeX files.
@LS : Yes for LaTeX (they’re plain text files), and sort of for .doc (they’re not). The catch (which we discuss later) is what happens when two or more people have edited the file simultaneously, and we need to merge their changes. Like all other version control tools, Subversion compares them line by line, which makes sense for hand-edited text files, but not for images (no lines) or for word processor files (which embed a lot of formatting commands in the files — what you see on the screen is based on what’s in the file, but it’s not a one-byte-to-one-character-on-screen match). Hand-written HTML can (sometimes) play nicely with version control, but machine-generated HTML (e.g., what comes out of Microsoft Word or OpenOffice) doesn’t have the line breaks that human beings would add, so again, a line-oriented ‘diff’ produces gibberish.
The sad thing is that there’s no good reason for this: building sensible diff-and-merge tools for all those formats would be a lot of work, and no one has done it. My guess is that there are just too many formats to support (what about spreadsheets? MP3 files? AutoCAD?).
That said, version control *can* do a lot for you with so-called “binary” files (anything that isn’t handwritten ASCII text). You may have to do the differencing and merging the hard way, but you have to do that anyway if you’re mailing files around or FTP’ing them or whatever. With version control, there’s one channel, there’s a record of who did what when, old copies aren’t lost in the ether… It’s still the right answer, it’s just not as nice as it could be.
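The point above about line-oriented diffs and machine-generated HTML can be seen in a small experiment with Python's `difflib`. The HTML fragment is made up for illustration, and `changed_lines` is a hypothetical helper. The same one-letter fix is applied to a hand-formatted version and to a version with all line breaks removed.

```python
import difflib


def changed_lines(old_text, new_text):
    """Return the lines of old_text that a line-oriented comparison flags as changed."""
    old, new = old_text.splitlines(), new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    flagged = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag != "equal":
            flagged.extend(old[i1:i2])
    return flagged


hand_written = "<ul>\n<li>Io</li>\n<li>Europa</li>\n<li>Calisto</li>\n</ul>"
machine_generated = hand_written.replace("\n", "")  # same markup on one long line

hand_flagged = changed_lines(hand_written, hand_written.replace("Calisto", "Callisto"))
machine_flagged = changed_lines(
    machine_generated, machine_generated.replace("Calisto", "Callisto")
)

print(hand_flagged)     # only the list item that was edited
print(machine_flagged)  # the entire document, because it is a single line
```

With line breaks, the comparison isolates the one edited item. Without them, the whole document counts as one changed line, which is why merging such files by hand becomes necessary.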