# Load data ---------------------------
# Plot data ---------------------------
1 Files
1.1 Names
File names should be machine readable: avoid spaces, symbols, and special characters. Prefer file names that are all lower case, and never have names that differ only in their capitalization. Delimit words with
-
or_
. Use.R
as the extension of R files.# Good fit_models.R utility_functions.R exploratory-data-analysis.R # Bad fit models.R foo.r ExploratoryDataAnaylsis.r
File names should be human readable: use file names to describe what’s in the file.
# good report-draft-notes.txt # bad temp.r
Use the same structure for closely related files:
# good fig-eda.png fig-model-3.png # bad figure eda.PNG fig model three.png
File names should play well with default ordering. If your file names contain dates, use yyyy-mm-dd (ISO8601) format so they sort in chronological order. If your file names include numbers, make sure to pad them with the appropriate number of zeros so that (e.g.) 11 doesn’t get sorted before 2. If files should be used in a specific order, put the number at the start, not the end.
# good 01-load-data.R 02-exploratory-analysis.R 03-model-approach-1.R 04-model-approach-2.R 2025-01-01-report.Rmd 2025-02-01.report.Rmd # bad alternative model.R code for exploratory analysis.r feb 01 report.Rmd jan 01 report.Rmd model_first_try.R run-first.r
If you later realise that you’ve missed some steps, it’s tempting to use
02a
,02b
, etc. However, I think it’s generally better to bite the bullet and rename all files.Don’t tempt fate by using “final” or similar words in file names. Instead either rely on Git to track changes over time, or failing that, put the date in the file name.
# good report-2022-03-20.qmd report-2022-04-02.qmd # bad finalreport.qmd FinalReport-2.qmd
1.2 Organisation
It’s hard to describe exactly how you should organise your code across multiple files. I think the best rule of thumb is that if you can give a file a concise name that still evokes its contents, you’ve arrived at a good organisation. But getting to that point is hard.
1.3 Internal structure
Use commented lines of -
and =
to break up your file into easily readable chunks.
If your script uses add-on packages, load them all at once at the very beginning of the file. This is more transparent than sprinkling library()
calls throughout your code or having hidden dependencies that are loaded in a startup file, such as .Rprofile
.