January 18, 2017

Overview

  • Part A
    • R-packages – an introduction (10 minutes)
    • Structure and contents of R-packages (30 minutes)
    • From scripts to functions (20 minutes)
  • Part B
    • Building a template package the prospective way (10 minutes)
    • Maintaining a package (20 minutes)

1.1 R-packages – What?

  • If you are looking for an introduction to R… bad luck!
  • This course does cover
    • a brief idea of the concept of R-packages
    • a discussion/justification of package contents
    • hands-on work to build a package
    • the work that remains after heaving a package onto the shelf (instead of vanishing)
    • additional material and efforts related to packages

R-packages – What?

What you should already be familiar with

  • R and RStudio
  • Installing and using R-packages
  • Writing scripts (and functions)
  • Structuring code and following good practice rules

R-packages

Further resources

R-packages

Prerequisites and computational minions

  • RStudio: The skin for R, a perfect developer (and user) environment
    • Support for creating scripts, packages, reports, books, slides, …
    • Plots, data environment, history, file browser, help viewer
    • Autocompletion, syntax highlighting, context help
    • Version control and project management
  • The R-package devtools by Hadley Wickham
  • The R-package roxygen2 by Hadley Wickham

R-packages

Concepts

  • R thrives on shared code, and packages are the vehicles for this idea
  • Packages are the pillars of open, transparent, reproducible work
  • Packages comprise the full set of self-contained components

  • Libraries are not packages! A library hosts packages. You pull a package from the library.
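
The distinction can be checked in an R session. This is a minimal sketch using only base R; the package stats is chosen merely because it ships with every R installation.

```r
# A library is a directory on disk; packages are installed into it.
lib_dirs <- .libPaths()            # all library directories R searches
print(lib_dirs)

# Where does an installed package live? The result is a folder inside a library.
pkg_dir <- find.package("stats")
print(pkg_dir)

# library() pulls a package from a library and attaches it to the search path.
library(stats)
is_attached <- "package:stats" %in% search()
print(is_attached)
```

So library(stats) does not load "a library"; it loads the package stats from one of the libraries listed by .libPaths().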

R-packages

Some illusions

  • R-packages have to go to the Comprehensive R Archive Network (CRAN)
  • Writing R-packages is a lot of boring effort
  • You can start easy and add further details later
  • People will use your package in the way you designed it

R-packages

Why write packages

  • You want to share your code with others
  • The code is organised in a coherent way
  • You want to handle/distribute only a single file
  • Bundling code makes keeping track much easier than collecting scripts
  • The code is automatically tested (by your examples and other routines)

  • Working with packages simply saves time and brain cells

1.2 R-packages – Contents!

R-package contents

An overview

  • What do you think? What belongs in an R-package?

R-package contents

An overview of obvious material

  • A name (yes, a package name)
  • A description (the meta information)
  • A function (yes, there are packages with only one function)
  • Documentation (you want to understand what the function does)
  • A working example (you had better not trust the documentation alone)

  • A set of further stuff that will be covered later
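
As a rough illustration of how function, documentation and example sit together, here is a sketch of one documented function in roxygen2 comment style. The function rescale01() is hypothetical and only serves as a placeholder; the tags @param, @return, @examples and @export are standard roxygen2 tags.

```r
#' Rescale a numeric vector to the range 0 to 1
#'
#' @param x Numeric vector with at least two distinct values.
#' @return Numeric vector of the same length as x, with values from 0 to 1.
#' @examples
#' rescale01(c(2, 4, 6))
#' @export
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)        # minimum and maximum of the input
  (x - rng[1]) / (rng[2] - rng[1])     # shift to 0, then scale to 1
}

# The working example from the documentation block
rescale01(c(2, 4, 6))
```

The example in the @examples block is the part that gets run during package checks, which is why it counts as a (minimal) test.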

R-package contents

The package name

  • This is (or should be) the hardest part of building a package
  • Requirements
    • Only letters, numbers and periods are allowed
    • Start with a letter, do not end with a period
  • Advice
    • Pick a unique name that is easy to google and reflects what the package does
    • Do not mix upper and lower case letters
    • Check if the name already exists beforehand
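
The formal requirements can be tested with a regular expression. This is a sketch only: the helper names is_valid_pkg_name() and has_mixed_case() are made up for this slide, and the check covers syntax, not whether the name is already taken.

```r
# Syntax check: letters, numbers and periods only; starts with a letter;
# does not end with a period (which also implies at least two characters).
is_valid_pkg_name <- function(name) {
  grepl("^[a-zA-Z][a-zA-Z0-9.]*[a-zA-Z0-9]$", name, perl = TRUE)
}

# Advice check: warn about names that mix upper and lower case letters.
has_mixed_case <- function(name) {
  grepl("[a-z]", name, perl = TRUE) && grepl("[A-Z]", name, perl = TRUE)
}

candidates <- c("seistools", "2fast", "seis.", "seis_tools", "seisTools")
data.frame(
  name       = candidates,
  valid      = vapply(candidates, is_valid_pkg_name, logical(1)),
  mixed_case = vapply(candidates, has_mixed_case, logical(1)),
  row.names  = NULL
)
```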

R-package contents

The DESCRIPTION file

  • The mandatory file that defines all the metadata of the package: name & short title, version & date, author & contact, license & dependencies.
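
A minimal sketch of such a file, written to a temporary folder and read back with base R. All field values (package name, title, author) are placeholders, not taken from a real package.

```r
# Lines of a minimal DESCRIPTION file; continuation lines are indented.
desc_lines <- c(
  'Package: mypackage',
  'Title: What the Package Does in One Line',
  'Version: 0.0.1',
  'Date: 2017-01-18',
  'Authors@R: person("First", "Last", email = "first.last@example.com",',
  '    role = c("aut", "cre"))',
  'Description: One paragraph that describes what the package does.',
  'License: GPL-3',
  'Imports: stats'
)

# Write the file and parse it again; DESCRIPTION uses the DCF format.
desc_file <- file.path(tempdir(), "DESCRIPTION")
writeLines(desc_lines, desc_file)
desc <- read.dcf(desc_file)

colnames(desc)        # the field names found in the file
desc[1, "Version"]    # access a single field
```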

R-package contents

The DESCRIPTION file - Title and description

  • Title must be in title case, on a single line, and must not end with a period
  • Description can be several sentences, but only one paragraph. Lines should not exceed 80 characters, and continuation lines must be indented (by convention, 4 spaces).
Title: Environmental Seismology Toolbox
Description: A collection of functions to handle seismic data for the
    purpose of investigating the seismic signals emitted by Earth surface
    processes. The package supports importing standard formats, data
    preparation and analysis techniques and data visualisation.
  • Both elements are important. They are indexed by search engines, and Google has become good at spotting R packages.
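
The formatting rules can be applied with base R helpers instead of by hand. A sketch, assuming the title and description text from the example above; tools::toTitleCase() and strwrap() are existing functions, the variable names are arbitrary.

```r
title <- "Environmental Seismology Toolbox"
description <- paste(
  "A collection of functions to handle seismic data for the purpose of",
  "investigating the seismic signals emitted by Earth surface processes.",
  "The package supports importing standard formats, data preparation and",
  "analysis techniques and data visualisation."
)

# Title: a single line that does not end with a period
title_ok <- !grepl("\\.$", title) && !grepl("\n", title, fixed = TRUE)

# Suggested title-case spelling, to compare against what was typed
title_suggestion <- tools::toTitleCase(title)

# Description: wrap below 80 characters, indent continuation lines by 4 spaces
desc_field <- strwrap(paste("Description:", description),
                      width = 80, exdent = 4)
writeLines(desc_field)
```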

R-package contents

The DESCRIPTION file – Dependencies

  • Dependency options in short:
    • Depends: packages that are attached together with yours (and the minimum R version)
    • Imports: packages whose functions yours uses through the namespace (covered in the namespace section)
    • Suggests: packages used to make, e.g., vignettes, examples or tests
    • Enhances: packages that your package enhances (rarely needed)
    • LinkingTo: packages whose C/C++ headers your package compiles against
  • CRAN has become strict about the number of Depends entries. List packages under Imports and use importFrom() in NAMESPACE instead.
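
How the fields translate into code, as a hedged sketch: the functions robust_centre() and use_if_available() are hypothetical; stats::median() and requireNamespace() are base R.

```r
# Imports-style usage: call the function through its namespace, nothing is
# attached to the user's search path. With roxygen2, the tag
#   @importFrom stats median
# would generate the matching importFrom(stats, median) line in NAMESPACE.
robust_centre <- function(x) {
  stats::median(x, na.rm = TRUE)
}

# Suggests-style usage: the package is optional, so test for it at run time
# and degrade gracefully if it is missing.
use_if_available <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed; skipping optional step.")
    return(FALSE)
  }
  TRUE
}

robust_centre(c(1, 2, 100))
use_if_available("stats")
```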

R-package contents

The DESCRIPTION file – Author information and roles

  • Author information can also be defined more comprehensively:
Authors@R: person("First", "Last", email = "first.last@example.com",
                  role = c("aut", "cre"))
  • Essential for the correct citation of packages! Nota bene: always cite the package and the R version used for an analysis, e.g. via citation("PACKAGENAME", auto = TRUE)
  • Do not use fake e-mail addresses. Otherwise neither CRAN nor users can get in touch with you.
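
The person() entries can be tried out in a plain R session before they go into DESCRIPTION. A sketch with placeholder names; the roles "aut" (author), "cre" (maintainer) and "ctb" (contributor) are standard role codes.

```r
# Several people are combined with c(); exactly one should carry role "cre".
authors <- c(
  person("First", "Last", email = "first.last@example.com",
         role = c("aut", "cre")),
  person("Second", "Contributor", role = "ctb")
)
print(authors)

# The maintainer is the person CRAN and users will write to.
maintainer <- authors[1]
maintainer$email

# Citation of R itself, plus the version string to report in a paper
cit <- citation()
print(cit)
R.version.string
```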

R-package contents

The DESCRIPTION file – License issues

  • The key element to inform who can use the package for which purpose!

  • Either a link to a license file (License: file LICENSE) or a keyword of a standard license:
    • GPL-2 or GPL-3: copyleft licenses; anyone who distributes derived code must license it GPL-compatibly. A standard open-source license such as these is needed for CRAN submission.
    • CC0: give away all rights, anything can be done with the code
    • BSD or MIT: permissive licenses, require additional file LICENSE.

R-package contents

The DESCRIPTION file – Version patterns

  • Version numbers must consist of integers separated by periods.
  • They are more than just counters: they determine whether a dependency requirement is satisfied

  • Format: MAJOR.MINOR.PATCH (start a released package with 0.0.1)
    • MAJOR releases should be rare
    • MINOR releases should keep the package up to date
    • PATCH releases may be frequent (but think of the CRAN team's time budget)
  • Make use of a NEWS file to announce version history and changes.
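
Why the format matters for dependencies can be seen by comparing versions in R. A small sketch using base R's package_version(); the helper meets_requirement() is hypothetical.

```r
v_old <- package_version("0.9.1")
v_new <- package_version("0.10.0")

# Versions are compared component by component, as integers
v_new > v_old

# Plain strings are compared character by character, which gets this wrong
"0.10.0" > "0.9.1"

# The kind of check behind an entry such as "Imports: somepackage (>= 1.1.5)"
meets_requirement <- function(installed, required) {
  package_version(installed) >= package_version(required)
}

meets_requirement("1.2.0", "1.1.5")
meets_requirement("0.0.1", "0.1.0")
```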

R-package contents

Further contents