R Packages and Seeking Help

Overview

Teaching: 10 min
Exercises: 5 min
Questions
  • How do I use packages in R?

  • How can I get help in R?

Objectives
  • To be able to install packages, and load them into your R session

  • To be able to read R help files for functions and special operators.

  • To be able to seek help from your peers.

R packages

R packages extend the functionality of R. Over 13,000 packages have been written by others. It’s also possible to write your own packages; this can be a great way of disseminating your research and making it useful to others. A number of useful packages are installed by default with R (they are part of the R core distribution). The teaching machines at the University have a number of additional packages installed by default.

We can see the packages installed on an R installation by selecting the “Packages” tab in RStudio, or by typing installed.packages() at the prompt.
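The result of installed.packages() can also be queried from code. The following is a minimal sketch; the variable name pkgs is only illustrative, and stats is used as the example because it ships with R.

```r
# installed.packages() returns a matrix with one row per installed package
pkgs <- rownames(installed.packages())

# Show the first few package names
head(pkgs)

# Check whether a particular package is installed
"stats" %in% pkgs
```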

In this course we will be using packages in the tidyverse to perform the bulk of our plotting and data analysis. Although we could do most of the tasks without using extra packages, the tidyverse makes it quicker and easier to perform common data analysis tasks. The tidyverse packages are already installed on the university teaching machines.

Finding and installing new packages

There are several sources of packages in R; the ones you are most likely to encounter are:

CRAN

CRAN is the main repository of packages for R. All the packages underwent basic quality assurance when they were submitted. There are over 13,000 packages in the archive, and some of them overlap considerably in what they do. Working out which package is the most appropriate to use isn’t always straightforward.

Bioconductor

Bioconductor is a more specialised repository, which contains packages for bioinformatics. Common workflows are provided, and the packages are more thoroughly quality assured. Because of its more specialised nature we don’t focus on Bioconductor in this course.

GitHub / personal websites

Some authors distribute packages via GitHub or their own personal web pages. These packages may not have undergone any form of quality assurance. Note that many packages have their own website, but the package itself is distributed via CRAN.

Finding packages to help with your research

There are various ways of finding packages that might be useful in your research:

Installing packages

If a package is available on CRAN, you can install it by typing:

install.packages("packagename")

This will automatically install any packages that the package you are installing depends on.
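If you run a script repeatedly, you may want to avoid reinstalling a package that is already present. One possible sketch is shown here; install_if_missing() is a hypothetical helper written for this example, not part of R, and utils is used in the demonstration only because it is always installed.

```r
# Hypothetical helper: install a CRAN package only if it is not already available
install_if_missing <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    message(pkg, " is already installed")
    return(invisible(FALSE))
  }
  install.packages(pkg)
  invisible(TRUE)
}

# utils ships with R, so nothing is downloaded here
installed_now <- install_if_missing("utils")
```

The same call with "tidyverse" would download the package from CRAN only on machines where it is missing.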

Installing a package doesn’t make the functions included in it available to you; to do this you must use the library() function. As we will be using the tidyverse later in the course, let’s load that now:

library("tidyverse")
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
✔ ggplot2 3.3.5     ✔ purrr   0.3.4
✔ tibble  3.1.4     ✔ dplyr   1.0.7
✔ tidyr   1.1.3     ✔ stringr 1.4.0
✔ readr   1.4.0     ✔ forcats 0.5.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

The tidyverse is a collection of other packages that work well together. The tidyverse package’s main function is to load some of these other packages. We will be using some of these later in the course.

Conflicting names

You may get a warning message when loading a package that a function is “masked”. This happens when a function name has already been “claimed” by a package that’s already loaded. The most recently loaded function wins.

If you want to use the function from the other package, use packagename::function().
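As an illustration, the sketch below calls the masked stats::filter() explicitly. The vector x and the three-point moving average are made up for this example.

```r
x <- c(1, 2, 3, 4, 5)

# Explicitly call filter() from the stats package: a 3-point moving average
smoothed <- stats::filter(x, rep(1 / 3, 3))
smoothed

# With the tidyverse loaded, a bare filter() call would use dplyr::filter()
# instead, which selects rows of a data frame.
```

Writing the package name in front of the function also makes it clear to readers of your code which version you intended.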

Reading Help files

R, and every package, provide help files for functions. The general syntax to search for help on any function, say function_name, is:

?function_name
# OR 
help(function_name)

This will load a help page in RStudio. If you are using R without RStudio, the page will open in a web browser or be shown as plain text.
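For a concrete case, the following sketch looks up help for mean() and sd(), two functions that ship with R. The choice of functions is arbitrary.

```r
# Two equivalent ways of opening the help page for mean()
?mean
help("mean")

# args() prints a function's arguments without opening the help page
args(sd)

# example() runs the code from the Examples section of a help page
example(mean)
```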

Each help page is broken down into sections:

Different functions might have different sections, but these are the main ones you should be aware of.

Tip: Reading help files

One of the most daunting aspects of R is the large number of functions available. It would be prohibitive, if not impossible, to remember the correct usage for every function you use. Luckily, the help files mean you don’t have to!

Special Operators

To seek help on special operators, use quotes:

?"<-"

Getting help on packages

Many packages come with “vignettes”: tutorials and extended example documentation. Without any arguments, vignette() will list all vignettes for all installed packages; vignette(package="package-name") will list all available vignettes for package-name, and vignette("vignette-name") will open the specified vignette.

If a package doesn’t have any vignettes, you can usually find help by typing help("package-name"), or package?package-name.

Challenge: Vignettes

Vignettes are often useful tutorials. We will be using the dplyr package later in this course, to manipulate tables of data. List the vignettes available in the package. You might want to take a look at these now, or later when we cover dplyr.

Solution

vignette(package="dplyr")

This shows that there are several vignettes included in the package. The dplyr vignette looks like it might be useful later. We can view this with:

vignette(package="dplyr", "dplyr")

When you kind of remember the function

If you’re not sure which package a function is in, or exactly how it’s spelled, you can do a fuzzy search:

??function_name
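The ?? operator is shorthand for help.search(), which searches the documentation of all installed packages. A related base R function, apropos(), searches only the names of objects in packages that are currently loaded. A small sketch follows; the search patterns are just examples.

```r
# apropos() lists objects on the search path whose names match a pattern
apropos("^read\\.")

# Find every loaded function with "mean" somewhere in its name
matches <- apropos("mean")
matches
```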

Citing R and R packages

If you use R in your work you should cite it, and the packages you use. The citation() command will return the appropriate citation for R itself. citation("packagename") will provide the citation for the package called packagename; note that the package name must be quoted.
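A short sketch of these calls is given below. The variable name r_cite is illustrative, and stats is used only because it is always installed.

```r
# Citation for R itself
r_cite <- citation()
r_cite

# The same citation as a BibTeX entry, ready for a reference manager
toBibtex(r_cite)

# Citation for a package; the package name is given as a quoted string
citation("stats")
```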

When your code doesn’t work: seeking help from your peers

If you’re having trouble using a function, 9 times out of 10 the question you want to ask has already been answered on Stack Overflow. You can search using the [r] tag.

If you can’t find the answer, there are a few useful functions to help you ask your peers a question:

?dput

The dput() function will dump the data you’re working with into a format that anyone else can copy and paste into their R session.
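As a minimal sketch, here is dput() applied to a small made-up data frame. The data and the variable names are invented for illustration only.

```r
# A small example data frame (made-up data)
df <- data.frame(id = 1:3, group = c("a", "b", "a"))

# Print R code that recreates df; this is the text you would paste into a question
dput(df)

# Check that the printed text really does rebuild the same object
df_text <- capture.output(dput(df))
df_copy <- eval(parse(text = df_text))
identical(df, df_copy)
```

When your real data is large, consider passing only a small subset, for example head(df), to dput().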

Package versions

Many of the packages in R are frequently updated. This means that code written for one version of a package may not work with another version of the package (or, potentially even worse, may run but give a different result). The sessionInfo() command prints information about the system, and the names and versions of packages that have been loaded. You should include the output of sessionInfo() somewhere in your research. The packrat package (or its successor, renv) provides a way of keeping specific versions of packages associated with each of your projects.

sessionInfo()
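One way to keep this information with your results is to write it to a file. The sketch below assumes a hypothetical file name, session_info.txt, and writes to a temporary directory; in practice you would choose a path inside your project.

```r
# Capture the session details so they can be saved alongside your results
info <- sessionInfo()
info$R.version$version.string

# Version of a single package
packageVersion("stats")

# Hypothetical output file; replace tempdir() with a folder in your project
out_file <- file.path(tempdir(), "session_info.txt")
writeLines(capture.output(print(info)), out_file)
```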

Other ports of call

Note that some of these resources use base R, rather than the tidyverse approach taught in this course.

Key Points

  • Use install.packages() to install a package from CRAN

  • Use help() to get online help in R.