Matthew Gentzkow and Jesse Shapiro have written an excellent guide,
"Code and Data for the Social Sciences: A Practitioner's Guide".
It's short (only 38 pages),
and full of practical advice for scientists of all stripes:
Automate everything that can be automated.
Write a single shell script that executes all code from beginning to end.
Store code and data under version control.
Run the whole directory before checking it back in.
Separate directories by function.
Separate files into inputs and outputs.
Make directories portable.
Store cleaned data in tables with unique, non-missing keys.
Keep data normalized as far into your code pipeline as you can.
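
The last two items are the easiest to show in code. What follows is only a minimal sketch in plain Python, not anything taken from the guide: the helper names `check_keys` and `join_counties_to_states`, the table layouts, and all the values are made up for illustration. It checks that each cleaned table has a unique, non-missing key, keeps state-level and county-level facts in separate tables, and joins them only at the point where an analysis needs both.

```python
from collections import Counter


def check_keys(rows, key):
    """Raise ValueError unless every row has a non-missing, unique `key`."""
    values = [row.get(key) for row in rows]
    missing = [i for i, v in enumerate(values) if v is None or v == ""]
    if missing:
        raise ValueError(f"missing {key!r} in rows {missing}")
    dupes = [v for v, n in Counter(values).items() if n > 1]
    if dupes:
        raise ValueError(f"duplicate {key!r} values: {dupes}")


# Hypothetical normalized tables: each fact is stored exactly once.
# `states` is keyed by state code, `counties` by county_id.
states = [
    {"state": "AA", "state_name": "State A"},
    {"state": "BB", "state_name": "State B"},
]
counties = [
    {"county_id": 1, "state": "AA", "population": 1000},
    {"county_id": 2, "state": "AA", "population": 2500},
    {"county_id": 3, "state": "BB", "population": 400},
]

# Validate keys right after cleaning, before anything downstream runs.
check_keys(states, "state")
check_keys(counties, "county_id")


def join_counties_to_states(counties, states):
    """Denormalize as late as possible: attach state_name to each county."""
    state_by_code = {s["state"]: s for s in states}
    return [
        {**c, "state_name": state_by_code[c["state"]]["state_name"]}
        for c in counties
    ]


merged = join_counties_to_states(counties, states)
```

Because the state name lives in one place, correcting it means editing one row in `states` rather than hunting through every county record, and a duplicated or missing key is caught before any merge can silently multiply or drop rows.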