Encountering your IRB: What political scientists need to know

Editor’s note: This is a condensed version of an essay appearing in Qualitative & Multi-Method Research [Newsletter of the APSA Organized Section for Qualitative and Multi-Method Research] Vol. 12, No. 2 (Fall 2014). The shorter version is also due to appear in the newsletters of APSA’s Immigration & Citizenship and Law & Courts sections. The original, which is more than twice the length and develops many of these ideas more fully, is available from the authors (Dvora.Yanow@wur.nl, psshea@poli-sci.utah.edu).

This post is contributed by Dvora Yanow (Wageningen University) and Peregrine Schwartz-Shea (University of Utah).

Pre-script. After we finished preparing this essay, a field experiment concerning voting for judges in California, Montana, and New Hampshire made it even more relevant. Three political scientists (one at Dartmouth, two from Stanford) mailed potential voters about 300,000 flyers marked with the states’ seals, containing information about the judges’ ideologies. Aside from questions of research design, whether the research passed IRB review is not entirely clear (reports say it did not at Stanford but was at least submitted to the Dartmouth IRB; for those who missed the coverage, see this link and political scientist Melissa Michelson’s blog, both accessed November 3, 2014). Two bits of information offer plausible explanations for what have been key points in the public discussion:

  1. Stanford may have had a reliance agreement with Dartmouth, meaning that it would accept Dartmouth’s IRB’s review in lieu of its own separate review;
  2. Stanford and Dartmouth may have “unchecked the box” (see below), relevant here because the experiments were not federally funded, meaning that IRB review is not mandated and that universities may devise their own review criteria.

Still, neither explains what appear to be lapses in ethical judgment in designing the research (among others, using the state seals without permission and thereby creating the appearance of an official document). We find this a stellar example of a point we raise in the essay: the discipline’s lack of attention to research ethics, possibly due to reliance on IRBs and the compliance ethics that IRB practices have inculcated.

 * * *

Continuing our research on US Institutional Review Board (IRB) policies and practices (Schwartz-Shea and Yanow 2014, Yanow and Schwartz-Shea 2008) shows us that many political scientists lack crucial information about these matters. To facilitate political scientists’ more effective interactions with IRB staff and Boards, we would like to share some insights gained from this research.

University IRBs implement federal policy, monitored by the Department of Health and Human Services’ Office for Human Research Protections (OHRP). The Boards themselves are composed of faculty colleagues (sometimes social scientists) plus a community member. IRB administrators are often not scientists (of any sort), and their training is oriented toward the language and evaluative criteria of the federal code. Indeed, administering an IRB has become a professional occupation with its own training and certification. IRBs review proposals to conduct research involving “human subjects” and examine whether potential risks to them have been minimized, assessing those risks against the research’s expected benefits to participants and to society. They also assess researchers’ plans to obtain informed consent, protect participants’ privacy, and keep the collected data confidential.

The federal policy was created to rest on local Board decision-making and implementation, leading to significant variations across campuses in its interpretation. Differences in practices often hinge on whether a university has a single IRB evaluating all forms of research or different ones for, e.g., medical and social science research. Therefore, researchers need to know their own institutions’ IRBs. In addition, familiarity with key IRB policy provisions and terminologies will help. We explain some of this “IRB-speak” and then turn to some procedural matters, including those relevant to field researchers conducting interviews, participant-observation/ethnography, surveys, and/or field experiments, whether domestically or overseas.

IRB-speak: A primer

Part of what makes IRB review processes potentially challenging is their specialized language. Regulatory and discipline-based understandings of various terms do not always match. Key vocabulary includes the following.

  • “Research.” IRB regulations tie this term’s meaning to the philosophically-contested idea of “generalizable knowledge” (CFR §46.102(d)). This excludes information-gathering for other purposes and, on some campuses, other scholarly endeavors (e.g., oral history) and course-related exercises.
  • “Human subject.” This is a living individual with whom the researcher interacts to obtain data. “Interaction” is defined as “communication or interpersonal contact between investigator and subject” (CFR §46.102(f)). But “identifiable private information” obtained without interaction, such as through the use of existing records, also counts.
  • “Minimal risk.” Research poses minimal risk when “the probability and magnitude of harm or discomfort anticipated in the research are not greater in and of themselves than those ordinarily encountered in daily life or during the performance of routine physical or psychological examinations or tests” (CFR §46.102(i)). But everyday risks vary across subgroups in American society, not to mention worldwide, and IRB reviewers have been criticized for their lack of expertise in risk assessment, leading them to misconstrue the risks associated with, e.g., comparative research (Schrag 2010, Stark 2012).
  • “Vulnerable populations.” Six categories of research participants “vulnerable to coercion or undue influence” are subject to additional safeguards: “children, prisoners, pregnant women, mentally disabled persons, or economically or educationally disadvantaged persons” (CFR §46.111(b)). Federal policy enables universities also to designate other populations as “vulnerable,” e.g., Native Americans.
  • Levels of review. Usually, IRB staff decide a proposed project’s level of required review: “exempt,” “expedited,” or “convened” full Board review. “Exempt” does not mean that research proposals are not reviewed. Rather, it means exemption from full Board review, a status that can be determined only via some IRB assessment. Only research entailing no greater than minimal risk is eligible for “exempt” or “expedited” review. The latter means assessment by either the IRB chairperson or his/her designee from among Board members. This reviewer may not disapprove the proposal, but may require changes to its design. Projects that entail greater than minimal risk require “convened” (i.e., full) Board review.
  • Exempt category: Methods. Survey and interview research and observation of public behavior are exempt from full review if the data so obtained do not identify individuals and would not place them at risk of “criminal or civil liability or be damaging to the subjects’ financial standing, employability, or reputation” if their responses were to be revealed “outside of the research” (CFR 46.101(b)(2)(ii)). Observing public behaviors as political events take place (think: “Occupy”) is central to political science research. Because normal IRB review may delay the start of such research, some IRBs have an “Agreement for Public Ethnographic Studies” that allows observation to begin almost immediately, possibly subject to certain stipulations.
  • Exempt category: Public officials. IRB policy explicitly exempts surveys, interviews, and public observation involving “elected or appointed public officials or candidates for public office” (45 CFR §46.101(b)(3))—although who, precisely, is an “appointed public official” is not clear. This exemption means that researchers studying public officials using any of these three methods might—in complete compliance with the federal code—put them at risk for “criminal or civil liability” or damage their “financial standing, employability, or reputation” (CFR §46.101(b)(2)). The policy is consistent with legal understandings that public figures bear different burdens than private citizens.
  • Exempt category: Existing data. Federal policy exempts from full review “[r]esearch involving the collection or study of existing data, documents, [or] records, … if these sources are publicly available or if the information is recorded by the investigator in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects” (§46.101(b)(4)). However, university IRBs vary considerably in how they treat existing quantitative datasets, such as the Inter-University Consortium for Political and Social Research collection (see icpsr.umich.edu/icpsrweb/ICPSR/irb/). Some universities require researchers to obtain IRB approval to use any datasets not on a preapproved list even if those datasets incorporate a responsible use statement.
  • “Unchecking the box.” The “box” in question appears in the Federal-wide Assurance form that universities file with OHRP, registering their intention to apply IRB regulations to all human subjects research conducted by employees and students, regardless of funding source. When “unchecked,” it indicates that the IRB will omit from review any research funded by sources other than the HHS (thereby limiting OHRP jurisdiction over such studies). IRB administrators may still, however, require proposals for unfunded research to be reviewed.

Procedural matters: Non-experimental field research

Because the experimental research design model informed IRB policy creation and remains the design most familiar to policy-makers, Board members, and staff, field researchers face particular challenges in IRB review.

As the forms and online application sites developed for campus IRB uses reflect this policy history, some of their language is irrelevant for non-experimental field research designs (e.g., the number of participants to be “enrolled” in a study or “inclusion” and “exclusion” criteria, features of laboratory experiments or medical randomized controlled clinical trials). Those templates can be frustrating for researchers trying to fit them to field designs. Although simply adopting the templates’ terms might seem expeditious, conforming to language that does not fit the methodology of the proposed research can lead field researchers to distort the character of their research.

IRB policy generally requires researchers to inform potential participants—to “consent” them—about the scope of both the research and its potential harms, whether physical, mental, financial or reputational. Potential subjects also need to be consented about possible identity revelations that could render them subject to criminal or civil prosecution (e.g., the unintentional public revelation of undocumented workers’ identities). Central to the consent process is the concern that potential participants not be coerced into participating and understand that they may stop their involvement at any time. Not always well known is that federal code allows more flexibility than some local Boards consider. For minimal risk research, it allows: (a) removal of some of the standard consent elements; (b) oral consent without signed forms; (c) waiver of the consent process altogether if the “research could not practicably be carried out without the waiver or alteration” (CFR §46.116(c)(2)).

Procedural matters: General

IRB review process backlogs can pose significant time delays to the start of a research project. Adding to potential delay is many universities’ requirement that researchers complete some form of training before they submit their study for review. Such delay has implications for field researchers negotiating site “access” to begin research and for all empirical researchers receiving grants, which are usually not released until IRB approval is granted. Researchers should find out their campus IRB’s turnaround time as soon as they begin to prepare their proposals.

Collaborating with colleagues at other universities can also delay the start of a research project. Federal code explicitly allows a university to “rely upon the review of another qualified IRB…[to avoid] duplication of effort” (CFR §46.114), and some IRBs are content to have only the lead researcher proceed through her own campus review. Other Boards insist that all participating investigators clear their own campus IRBs. With respect to overseas research, solo or with foreign collaborators, although federal policy recognizes and makes allowances for international variability in ethics regulation (CFR §46.101(h)), some US IRBs require review by a foreign government or research setting or by the foreign colleague’s university’s IRB, not considering that not all universities or states, worldwide, have IRBs. Multiple review processes can make coordinated review for a jointly written proposal difficult. Add to that different Boards’ interpretations of what the code requires, and one has a classic instance of organizational coordination gone awry.

In sum

On many campuses political (and other social) scientists doing field research are faced with educating IRB members and administrative staff about the ways in which their methods differ from the experimental studies performed in hospitals and laboratories. Understanding the federal regulations can put researchers on more solid footing in pointing to permitted research practices that their local Boards may not recognize. And knowing IRB-speak can enable clearer communications between researchers and Board members and staff. Though challenging, educating staff as well as Board members potentially benefits all field researchers, graduate students in particular, some of whom have given up on field research due to IRB delays, often greater for research that does not fit the experimental model (van den Hoonaard 2011).

IRB review is no guarantee that the ethical issues relevant to a particular research project will be raised. Indeed, one of our concerns is the extent to which IRB administrative processes are replacing research ethics conversations that might otherwise (and, in our view, should) be part of departmental curricula, research colloquia, and discussions with supervisors and colleagues. Moreover, significant ethical matters of particular concern to political science research are simply beyond the bounds of US IRB policy, including recognition of the ways in which current policy makes “studying up” (i.e., studying societal elites and other power holders) more difficult.

Change may still be possible. In July 2011, OHRP issued an Advance Notice of Proposed Rulemaking, calling for comments on its proposed regulatory revisions. As of this writing, the Office has not yet announced an actual policy change (which would require its own comment period). OHRP has proposed revising several of the requirements discussed in this essay, including allowing researchers themselves to determine whether their research is “excused” (their suggested replacement for “exempt”). Because of IRB policies’ impact, we call on political scientists to monitor this matter. Although much attention has, rightly, been focused on Congressional efforts to curtail National Science Foundation funding, as IRB policy affects all research engaging human participants, it deserves as much disciplinary attention.

References

Schrag, Zachary M. 2010. Ethical imperialism: Institutional review boards and the social sciences, 1965–2009. Baltimore, MD: Johns Hopkins University Press.

Schwartz-Shea, Peregrine and Yanow, Dvora. 2014. Field research and US institutional review board policy. Betty Glad Memorial Symposium, University of Utah (March 20-21). http://poli-sci.utah.edu/2014-research-symposium.php

Stark, Laura. 2012. Behind closed doors: IRBs and the making of ethical research. Chicago: University of Chicago Press.

US Code of Federal Regulations. 2009. Title 45, Public Welfare, Department of Health and Human Services, Part 46, Protection of human subjects. www.hhs.gov/ohrp/humansubjects/guidance/45cfr46.html.

van den Hoonaard, Will C. 2011. The seduction of ethics. Toronto: University of Toronto Press.

Yanow, Dvora and Schwartz-Shea, Peregrine. 2008. Reforming institutional review board policy. PS: Political Science & Politics 41/3, 484-94.

 


Reminder: 12/1 Due Date for Special Issue on Replication

A quick reminder: if you’re planning to contribute a piece to our special issue on replication, please make sure to submit it to thepoliticalmethodologist@gmail.com before December 1, 2014!


Reminder: 10/24 Due Date for Int’l Methods Colloquium Presentations

A reminder: if you are interested in presenting a talk in the International Methods Colloquium this year, please fill out an application form by October 24th, 2014:

http://goo.gl/jGLsnh

The application requires a presentation title, abstract, and link to an associated manuscript. Applications are considered on a first-come, first-served basis, but scheduling begins on the 24th.


Advisor Corner: interview tips for ABDs on the job market

‘Tis the season for job interviews in Political Science, and departments and candidates are meeting each other for the first time all over the world.  I thought it’d be useful to talk about some job interview tips that I’ve found useful over the years, then open up the comment thread for other faculty members (and industry pros, for those considering alt-ac careers!) to contribute their own ideas.

Here are some of my own tips (most I gleaned from others, a few I learned in the school of hard knocks):

  • Give your research job talk and/or teaching demonstration many times in front of many different audiences, and try to optimize it as best you can in the time you have available.
  • Practice questions and answers for your research job talk obsessively, with many audiences. For better or worse, Q&A is how some people will decide how smart and qualified you are.
  • Have a list of questions that you want the answers to from the people working at the place you’re visiting. Ask some of the same questions of different people, but don’t run through exactly the same suite of questions with everyone.
  • Have a good understanding of what you want to accomplish, especially research-wise, in the next few years. For example, have a one minute sketch of the questions you’re hoping to answer over the next few years and what you’re planning to do to answer them. A new faculty member is an investment in the future, and the people at the institution will want to know what you think that future looks like.
  • Whenever possible, talk about science and ideas, not about people or the discipline. The more you’re talking about work, the better things are going.
  • Don’t promise anything you can’t deliver, but also don’t be eager to rule things out. You can get yourself into trouble saying, e.g., “I can teach an environmental politics course!” if you don’t immediately know what books you’d assign. But you can also get into trouble saying you’re incapable of adapting to the department’s needs.
  • Pick a few (maybe 2 or 3) people at the interviewing institution and get to know their research well enough that you can talk to them about what they’re working on. Ideally, it’s something that you share a common interest in and can exchange ideas about. Communicating that you’re an engaged and helpful colleague is important.
  • If asked anything about your personal status (e.g., marriage or children), I would respond that “I have no commitments that would bar me from accepting this or any other faculty position.” I wish I didn’t have to bring this topic up, but I do think it’s something one needs to prepare for in advance. My own take is that, for junior candidates, it’s safest to present a one-dimensional and work-focused profile of yourself when you’re meeting a department for the first time.
  • Pack a small but thorough kit of supplies to take with you and have it with you during the entire visit. The kit should include some sort of stain-removing pen, energy bars, various medicines (headache, gastrointestinal, and nasal/throat), and multiple pens/refills.
  • Most importantly, walk into the interview confident in the knowledge that you’re a reasonable, thoughtful, friendly, and fully qualified candidate for the position. But don’t measure the offices for your drapes.

Good luck, and make sure to check out the comment thread for other tips!


Propose to present in the International Methods Colloquium series!

Today, I’m pleased to announce that the International Methods Colloquium (IMC) is seeking presenters to fill its inaugural AY 2014/2015 schedule! I believe that the IMC provides a great opportunity for presenters to get their work out to a very large, very interested audience.[1]

Supported by the National Science Foundation and Rice University, the IMC is a weekly on-line interactive seminar about the application of quantitative statistical methodology to the social sciences. The IMC makes it possible for scholars scattered all over the world to participate free of charge in an interactive, real-time audiovisual presentation of statistical research from the comfort of their computer, tablet, or smart phone. Audience members can ask questions of the presenter via voice call or e-mail and have them answered immediately, with the possibility for live follow-up. It’s a great opportunity for methodologists to see a new, interesting research talk every single week.

The IMC is scheduled to run every Friday from 11:00 AM until 12:00 PM Central Time during the academic year. You can learn more about the IMC at our website, www.methods-colloquium.com.

If you are interested in presenting, please fill out a short application form here: http://goo.gl/jGLsnh

The application requires a presentation title, abstract, and link to an associated manuscript. For the coming year, we will consider applications to present on a first-come, first-served basis. However, we will begin scheduling presentations for the coming year starting on October 24th, 2014. Thus, those who apply before this date will get priority with respect to their preferred presentation dates.

If you are accepted to present in the IMC, you will be contacted to set up an appropriate date and time for your presentation. Acceptance and scheduling decisions are made in consultation with members of the IMC Advisory Board. I’ll announce a schedule of speakers once one has been established, and you can sign up to receive announcements for each week’s talk here: http://goo.gl/VfJPUk

If you have questions about the application process or how IMC talks are conducted, feel free to contact me directly at justin@justinesarey.com.

[1] Disclosure: I am the Principal Investigator on the grant that funds this project, so I guess it’s not surprising that I think it’s pretty great.

 


Call for Papers: TPM Special Issue on Replication

The editors of The Political Methodologist are calling for papers for a special issue of TPM addressing the replication of empirical research in political and social science!

Replication has recently become a frequent and somewhat controversial topic in the social sciences generally and political science specifically. Many issues remain unresolved, including how and why replications are to be conducted, interpreted, and published–and whether any of these efforts should be undertaken at all.

To further this conversation, the editors of The Political Methodologist are devoting a special issue to replication in political and social science. Topics addressed may include, but are not limited to, the following:

  • the different ways that studies can be replicated
  • what each form of replication can accomplish
  • what successful and failed replications mean
  • how replications should be handled as a part of the publication process
  • an author’s responsibility to enable replication of his/her research
  • software and instructional resources to facilitate replication
  • the role of replication in graduate or undergraduate education

Submissions should be between 2,000 and 4,000 words, and should be sent to thepoliticalmethodologist@gmail.com by December 1, 2014. Accepted articles will be featured on our blog, and also in the print edition of TPM.

If you’re interested in contributing to the special issue and would like to talk about prospective contributions before writing/submitting, please feel free to contact Justin Esarey (justin@justinesarey.com) or any of the associate editors of TPM.


Building and Maintaining R Packages with devtools and roxygen2

This post is co-authored by Jacob Montgomery of Washington University in St. Louis and Ryan T. Moore of American University

This post summarizes our full TPM article, available at this link.

Political methodologists increasingly develop complex computer code for data processing, statistical analysis, and data visualization — code that is intended for eventual distribution to collaborators and readers, and for storage in replication archives. This code can involve multiple functions stored in many files, which can be difficult for others to read, use, or modify.

For researchers working in R, creating a package is an attractive option for organizing and distributing complex code. A basic R package consists of a set of functions, documentation, and some metadata. Other components, such as datasets, demos, or compiled code may also be included. Turning all of this into a formal R package makes it easy to distribute it to other scholars either via the Comprehensive R Archive Network (CRAN) or simply as a compressed folder.

However, transforming R code into a package can be a tedious process requiring the generation and organization of files, metadata, and other information in a manner that conforms to R package standards. It can be particularly difficult for users less experienced with R’s technical underpinnings.

Here, we discuss two packages designed to streamline the package development process — devtools and roxygen2.


Building an example package: squaresPack

Readers unfamiliar with the basic structure of an R package may wish to consult our full article. Here, we build a toy package called squaresPack using the code stored here.

R package development requires building a directory of files that include the R code, documentation, and two specific files containing required metadata. (The canonical source for information on package development for R is the extensive and sometimes daunting document, Writing R Extensions.)

As an example, imagine that we wish to create a simple package containing only the following two functions.

## Function 1: Sum of squares
addSquares <- function(x, y){
  return(list(square=(x^2 + y^2), x = x, y = y))
}

## Function 2: Difference of squares
subtractSquares <- function(x, y){
  return(list(square=(x^2 - y^2), x = x, y = y))
}
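As a quick check of what these two functions return, the following sketch calls both on a pair of numeric vectors. The input values are the ones used in the help-file example; the object names sumOut and diffOut are hypothetical and used only for illustration. The function definitions are repeated so that the snippet runs on its own.

```r
## Definitions repeated so the snippet is self-contained
addSquares <- function(x, y){
  return(list(square=(x^2 + y^2), x = x, y = y))
}
subtractSquares <- function(x, y){
  return(list(square=(x^2 - y^2), x = x, y = y))
}

myX <- c(20, 3)
myY <- c(-2, 4.1)

sumOut  <- addSquares(myX, myY)       # hypothetical object name
diffOut <- subtractSquares(myX, myY)  # hypothetical object name

sumOut$square    # element-wise x^2 + y^2
diffOut$square   # element-wise x^2 - y^2
names(sumOut)    # the list carries the result plus both inputs
```

Both functions are vectorized, so each element of the result corresponds to one pair of inputs.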

Here is an example of how the directory for a simple package should be structured.

[Figure: directory tree for the squaresPack package]

First, we store all R source code in the subdirectory R.  Second, corresponding documentation should accompany all functions that users can call. This documentation is stored in the subdirectory labeled man.  As an example, the file addSquares.Rd would be laid out as follows.

\name{addSquares}
\alias{addSquares}
\title{Adding squared values}
\usage{
  addSquares(x, y)
}
\arguments{
 \item{x}{A numeric object.}
 \item{y}{A numeric object with the same dimensionality as \code{x}.}
}
\value{
  A list with the elements
 \item{square}{The sum of the squared values.}
 \item{x}{The first object input.}
 \item{y}{The second object input.}
}
\description{
  Finds the sum of squared numbers.
}
}
\note{
  This is a very simple function.
}
\examples{
myX <- c(20, 3);  myY <- c(-2, 4.1)
addSquares(myX, myY)
}
\author{
  Jacob M. Montgomery
}
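One way to see how R reads such a file is to parse it with the tools package that ships with R. The sketch below writes a trimmed-down version of the help file to a temporary location (an assumption made only so the example is self-contained), parses it, and renders it as plain text. The object names rdLines, rdFile, rd, and rdTags are hypothetical.

```r
## A shortened addSquares.Rd written to a temporary file (illustration only)
rdLines <- c(
  "\\name{addSquares}",
  "\\alias{addSquares}",
  "\\title{Adding squared values}",
  "\\usage{",
  "  addSquares(x, y)",
  "}",
  "\\arguments{",
  "  \\item{x}{A numeric object.}",
  "  \\item{y}{A numeric object with the same dimensionality as \\code{x}.}",
  "}",
  "\\description{",
  "  Finds the sum of squared numbers.",
  "}"
)
rdFile <- file.path(tempdir(), "addSquares.Rd")
writeLines(rdLines, rdFile)

## parse_Rd() reads the file into a structured object
rd <- tools::parse_Rd(rdFile)

## Each top-level section carries an "Rd_tag" attribute such as "\\name"
rdTags <- unlist(lapply(rd, attr, "Rd_tag"))
unique(rdTags)

## Rd2txt() renders the parsed file as plain-text help
tools::Rd2txt(rd)
```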

Third, the directory must contain a file named DESCRIPTION that documents the directory in a specific way. The DESCRIPTION file contains basic information including the package name, the formal title, the current version number, the date for the version release, and the name of the author and maintainer. Here we also specify any dependencies on other R packages and list the files in the R subdirectory.

Package: squaresPack
Title: Adding and subtracting squared values
Version: 0.1
Author: Jacob M. Montgomery and Ryan T. Moore
Maintainer: Ryan T. Moore <rtm@american.edu>
Description: Find sum and difference of squared values
Depends: R (>= 3.1.0)
License: GPL (>= 2)
Collate:
    'addSquares.R'
    'subtractSquares.R'
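The DESCRIPTION file uses the Debian control file ("DCF") layout of "Field: value" lines, which base R can read with read.dcf(). As a minimal sketch, assuming a shortened copy of the file written to a temporary directory (the names descLines, descFile, and desc are hypothetical), the fields can be inspected like this:

```r
## A shortened DESCRIPTION written to a temporary file (illustration only)
descLines <- c(
  "Package: squaresPack",
  "Title: Adding and subtracting squared values",
  "Version: 0.1",
  "Depends: R (>= 3.1.0)",
  "License: GPL (>= 2)"
)
descFile <- file.path(tempdir(), "DESCRIPTION")
writeLines(descLines, descFile)

## read.dcf() returns a character matrix with one column per field
desc <- read.dcf(descFile)
colnames(desc)
desc[1, "Package"]
desc[1, "Version"]
```

Reading the file back in this way is a cheap check that the fields are spelled and formatted as intended before building the package.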

Finally, the NAMESPACE file is a list of commands that are run by R when the package is loaded to make the R functions, classes, and methods defined in the package visible to R and the user. This is a much more cumbersome process when class structures and methods must be declared, as we discuss briefly below. For the present example, the NAMESPACE file is quite simple, telling R to allow the user to call our two functions.

export(addSquares)
export(subtractSquares)
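To see how R interprets these directives, base R's parseNamespaceFile() can be pointed at a directory containing the file. The sketch below recreates the two-line NAMESPACE in a scratch directory (the path and the names pkgLib, pkgDir, and ns are hypothetical) and lists the exported names.

```r
## Scratch "library" holding a bare squaresPack directory (illustration only)
pkgLib <- file.path(tempdir(), "nsDemo")
pkgDir <- file.path(pkgLib, "squaresPack")
dir.create(pkgDir, recursive = TRUE, showWarnings = FALSE)

writeLines(c("export(addSquares)", "export(subtractSquares)"),
           file.path(pkgDir, "NAMESPACE"))

## parseNamespaceFile() reads <package.lib>/<package>/NAMESPACE
ns <- parseNamespaceFile("squaresPack", package.lib = pkgLib)
ns$exports   # names the package would make visible to users
```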

Once all of that is set up, however, several steps remain.  A minimal checklist for updating a package and submitting it to CRAN might look like the following:

  1. Edit DESCRIPTION file
  2. Change R code and/or data files.
  3. Edit NAMESPACE file
  4. Update man files
  5. R CMD build --resave-data=no pkg
  6. R CMD check pkg
  7. R CMD INSTALL pkg
  8. Build Windows version to ensure compliance by submitting to: http://win-builder.r-project.org/
  9. Upload to CRAN (Terminal below, or use other FTP client):
    > ftp cran.r-project.org
    > cd incoming
    > put pkg_0.1-1.tar.gz
  10. Email R-core team: cran@r-project.org

We have been part of writing four R packages over the course of the last six years. In order to keep track of all the manual updating steps, one of us created a 17-point checklist outlining the steps required each time a package is edited, and we expect that most authors will welcome some automation. The packages devtools and roxygen2 promise to improve upon this hands-on maintenance and allow authors to focus more on improving the functionality and documentation of their package rather than on bookkeeping.


Building with devtools and roxygen2

The devtools approach streamlines several steps: it creates and updates appropriate documentation files; it eliminates the need to leave R to build and check the package from the terminal prompt; and it submits the package to win-builder and CRAN and emails the R-core team from within R itself.   After the initial directory structure is created, the only files that are edited directly by the author are contained in the R directory (with one exception — the DESCRIPTION file should be reviewed before the package is released). This is possible because devtools automates the writing of the help files, the NAMESPACE file, and updating of the DESCRIPTION file relying on information placed directly in *.R files.

We will provide some examples below, but here is a helpful video we recently discovered that covers some of the same ground for users of RStudio:

There are several advantages to developing code with devtools, but the main benefit is improved workflow. For instance, adding a new function to the package using more manual methods means creating the code in a *.R file stored in the R subdirectory, specifying the attendant documentation as a *.Rd file in the man subdirectory, and then updating the DESCRIPTION and NAMESPACE files. In contrast, developing new functions with devtools requires only editing a single *.R file, wherein the function and its documentation are written simultaneously. devtools then updates the documentation and package metadata with no further attention.
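As a rough illustration of this workflow, the sketch below wraps three devtools calls in a single helper. The helper name refreshPackage and its pkgPath argument are hypothetical and not part of devtools; document(), check(), and install() are devtools functions. Defining the helper does nothing by itself, and calling it requires devtools and a package directory.

```r
## Hypothetical convenience wrapper; not part of devtools itself
refreshPackage <- function(pkgPath = ".") {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("devtools is not installed; try install.packages('devtools')")
  }
  devtools::document(pkgPath)  # rebuild man/*.Rd and NAMESPACE from roxygen comments
  devtools::check(pkgPath)     # counterpart of R CMD check
  devtools::install(pkgPath)   # counterpart of R CMD INSTALL
  invisible(pkgPath)
}

## Usage (not run here):
## refreshPackage("~/Desktop/MyPackage/squaresPack")
```

Compared with the manual checklist, the steps that edit the man files and NAMESPACE collapse into the single document() call.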

Thus, one key advantage of using devtools to develop a package is that the R files will themselves contain the information for generating help files and updating metadata files. Each function is accompanied by detailed comments that are parsed and used to update the other files. As an example, here we show the addSquares.R file as it should be written to create the same help files and NAMESPACE files shown above.

#' Adding squared values
#'
#' Finds the sum of squared numbers.
#'
#' @param x A numeric object.
#' @param y A numeric object with the same dimensionality as \code{x}.
#'
#' @return A list with the elements
#' \item{squares}{The sum of the squared values.}
#' \item{x}{The first object input.}
#' \item{y}{The second object input.}
#' @author Jacob M. Montgomery
#' @note This is a very simple function.
#' @examples
#'
#' myX <- c(20, 3)
#' myY <- c(-2, 4.1)
#' addSquares(myX, myY)
#' @rdname addSquares
#' @export
addSquares <- function(x, y) {
  return(list(squares = (x^2 + y^2), x = x, y = y))
}

The text following the #' symbols is processed by roxygen2 during package creation to make the *.Rd and NAMESPACE files. The @param, @return, @author, @note, @examples, and @seealso commands specify the corresponding block in the help file. The @rdname block overrides the default setting to specify the name of the associated help file, and @export instructs roxygen2 to add the necessary commands to the NAMESPACE file. We now walk through the steps required to initialize and maintain a package with devtools.
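To make the idea of parsed comments concrete, here is a toy sketch in base R that pulls the tag names out of a few roxygen-style lines. It is only an illustration under simplifying assumptions: the src vector is a shortened stand-in for addSquares.R, and the real roxygen2 parser does far more than this.

```r
# Toy illustration only: roxygen2's real parser is far more thorough.
# The lines below are a shortened stand-in for addSquares.R.
src <- c(
  "#' Adding squared values",
  "#'",
  "#' @param x A numeric object.",
  "#' @param y A numeric object with the same dimensionality as x.",
  "#' @export",
  "addSquares <- function(x, y) {",
  "  list(squares = x^2 + y^2, x = x, y = y)",
  "}"
)

# Keep only the roxygen comment lines (those starting with #').
roxy <- grep("^#'", src, value = TRUE)

# Strip the #' prefix and any whitespace that follows it.
body <- sub("^#'\\s*", "", roxy)

# Pull out the tag name from lines that begin with @.
tag_lines <- grep("^@", body, value = TRUE)
tags <- sub("^@(\\w+).*$", "\\1", tag_lines)

# Tabulate how often each tag appears.
print(table(tags))
```

The point of the sketch is simply that everything needed for the help file and the NAMESPACE entry sits next to the function itself, where a program can find it.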


Setting up the package

Creating an R package from these augmented *.R files is straightforward. First, we must create the basic directory structure using

library(devtools)                ## Load devtools, which provides create()
setwd("~/Desktop/MyPackage/")    ## Set the working directory
create("squaresPack")            ## Create the package skeleton

Second, we edit the DESCRIPTION file to make sure it contains the correct version, package name, dependencies, licensing, and authorship of the package. The create() call will produce a template for you to fill in. The author will need to add something like

Author: Me
Maintainer: Me <Me@myemail.edu>

to this template DESCRIPTION file. You need not keep track of the various R files to be collated; devtools will automatically collate all R files contained in the various subdirectories. Third, place the relevant R scripts in the R directory. Finally, making sure that the working directory is correctly set, we can create and document the package using three commands:

current.code <- as.package("squaresPack")
load_all(current.code)
document(current.code)

The as.package() command will load the package and create an object representation (current.code) of the entire package in the user's workspace. The load_all() command will load all of the R files from the package into the user's workspace as if the package were already installed. The document() command will create the required documentation files for each function and the package, as well as update the NAMESPACE and DESCRIPTION files.
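Once load_all() has run, the package's functions can be called directly, which makes a quick sanity check easy. The sketch below defines addSquares inline as a stand-in for what load_all() would make available, so that it runs on its own. In practice the definition would come from the R directory of the package.

```r
# Stand-in for what load_all() would make available; in practice the
# definition lives in R/addSquares.R inside the package directory.
addSquares <- function(x, y) {
  list(squares = x^2 + y^2, x = x, y = y)
}

# Inputs taken from the @examples block of the help file.
res <- addSquares(c(20, 3), c(-2, 4.1))

# Confirm the result has the documented elements and the expected values.
stopifnot(
  identical(names(res), c("squares", "x", "y")),
  isTRUE(all.equal(res$squares, c(404, 25.81)))
)
```

Checks of this kind are a cheap way to catch a mismatch between what the help file promises and what the function returns.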


Sharing the package

Once all of this is in place, the author prepares the package for wider release from within R itself. To build the package as a compressed file in your working directory, run build(current.code, path=getwd()). The analogous build_win() command will upload your package to the win-builder website. Your package will be built in a Windows environment and an email will be sent to the address of the maintainer in the DESCRIPTION file with results in about thirty minutes. Both of these compressed files can be uploaded onto websites, sent by email, or stored in replication archives. Other users can simply download the package and install it locally.
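For a user who has downloaded the compressed source file, local installation might look like the sketch below. The file name squaresPack_0.1.tar.gz is an assumption made for illustration; the actual name depends on the package name and the Version field in DESCRIPTION.

```r
# Hypothetical file name; the real one depends on the package name
# and the Version field in DESCRIPTION.
tarball <- "squaresPack_0.1.tar.gz"

if (file.exists(tarball)) {
  # repos = NULL tells R to install from the local file rather than
  # from a repository such as CRAN.
  install.packages(tarball, repos = NULL, type = "source")
} else {
  message("File not found: ", tarball)
}
```

After a successful install, the package can be attached with library() like any other.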

The list below provides a minimal checklist for editing and submitting an existing R package using devtools.

  1. Edit R code and/or data files
  2. Run as.package(), load_all(), and document()
  3. Check the code: check(current.code)
  4. Make a Windows build: build_win(current.code)
  5. Double-check the DESCRIPTION file
  6. Submit the package to CRAN: release(current.code, check=FALSE)

The check() command is analogous to running R CMD check from the terminal, but it also (re)builds the package. Assuming that the package passes all of the required checks, it is now ready for submission to CRAN. As a final precaution, we recommend taking a moment to visually inspect the DESCRIPTION file one last time to ensure that it contains the correct email address for the maintainer and the correct release version. Finally, the release() command will submit the package via FTP and open up the required email using the computer's default email client.
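The final look at the DESCRIPTION file can be partly scripted with read.dcf() from base R. The sketch below is one possible approach, not part of the devtools workflow itself. It writes a small DESCRIPTION with placeholder values to a temporary file so that it is self-contained; in practice desc_path would point at the DESCRIPTION file in the package directory.

```r
# Write a small DESCRIPTION with placeholder values to a temporary file
# so the example is self-contained; in practice, point desc_path at the
# DESCRIPTION file inside the package directory.
desc_path <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: squaresPack",
  "Version: 0.1",
  "Author: Me",
  "Maintainer: Me <Me@myemail.edu>"
), desc_path)

# read.dcf() returns a character matrix with one column per field.
desc <- read.dcf(desc_path)

# Stop early if a field needed for release is absent.
required <- c("Package", "Version", "Maintainer")
missing_fields <- setdiff(required, colnames(desc))
if (length(missing_fields) > 0) {
  stop("DESCRIPTION is missing: ", paste(missing_fields, collapse = ", "))
}

# Print the two fields worth a final look before release.
cat("Version:   ", desc[1, "Version"], "\n")
cat("Maintainer:", desc[1, "Maintainer"], "\n")
```

A script like this does not replace reading the file, but it does make a missing field hard to overlook.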


Conclusion

We have outlined the components of a simple R package and two approaches for developing and maintaining it. In particular, we illustrated how the devtools package can aid package authors in package maintenance by automating several steps of the process. The package allows authors to focus on editing only *.R files, since both documentation and metadata files are updated automatically. The package also automates several steps such as submission to CRAN via FTP.

While we believe that the devtools approach to creating and managing R packages offers several advantages, there are potential drawbacks. We routinely use other excellent packages by Hadley Wickham, such as reshape, plyr, lubridate, and ggplot2. On one hand, each of them offers automation that greatly speeds up complex processes such as attractively displaying high-dimensional data. On the other hand, it can take time to learn a new syntax for old tricks (like specifying x and y limits for a plot). Such frustrations may make package writers hesitant to give up the full control that a more manual maintenance system provides. By making one's R code conform to the requirements of the devtools workflow, one loses some degree of flexibility.

Yet, devtools makes it simpler to execute the required steps efficiently. It promises to smoothly integrate package development and checks, cut out the need to switch between R and the command line, and greatly reduce the number of files and directories that must be manually edited. Moreover, the latest release of the package contains many further refinements. It is possible, for instance, to build packages directly from GitHub repositories, create vignettes, and create clean environments for code development. Thus, while developing R packages and code in a manner consistent with devtools does require re-learning some basic techniques, we believe that it comes with significant advantages for speeding up development while reducing the degree of frustration commonly associated with transforming a batch of code into a package.
