Call for Papers: TPM Special Issue on Replication

The editors of The Political Methodologist are calling for papers for a special issue of TPM addressing the replication of empirical research in political and social science!

Replication has recently become a frequent and somewhat controversial topic in the social sciences generally and political science specifically. Many issues remain unresolved, including how and why replications are to be conducted, interpreted, and published–and whether any of these efforts should be undertaken at all.

To further this conversation, the editors of The Political Methodologist are devoting a special issue to replication in political and social science. Topics addressed may include, but are not limited to, the following:

  • the different ways that studies can be replicated
  • what each form of replication can accomplish
  • what successful and failed replications mean
  • how replications should be handled as a part of the publication process
  • an author’s responsibility to enable replication of his/her research
  • software and instructional resources to facilitate replication
  • the role of replication in graduate or undergraduate education

Submissions should be between 2,000 and 4,000 words and should be sent to thepoliticalmethodologist@gmail.com by December 1, 2014. Accepted articles will be featured on our blog and in the print edition of TPM.

If you’re interested in contributing to the special issue and would like to talk about prospective contributions before writing/submitting, please feel free to contact Justin Esarey (justin@justinesarey.com) or any of the associate editors of TPM.

Posted in Call for Papers / Conference, Editorial Message, Replication

Building and Maintaining R Packages with devtools and roxygen2

This post is co-authored by Jacob Montgomery of Washington University in St. Louis and Ryan T. Moore of American University

This post summarizes our full TPM article, available at this link.

Political methodologists increasingly develop complex computer code for data processing, statistical analysis, and data visualization — code that is intended for eventual distribution to collaborators and readers, and for storage in replication archives. This code can involve multiple functions stored in many files, which can be difficult for others to read, use, or modify.

For researchers working in R, creating a package is an attractive option for organizing and distributing complex code. A basic R package consists of a set of functions, documentation, and some metadata. Other components, such as datasets, demos, or compiled code, may also be included. Turning all of this into a formal R package makes it easy to distribute to other scholars, either via the Comprehensive R Archive Network (CRAN) or simply as a compressed folder.

However, transforming R code into a package can be a tedious process requiring the generation and organization of files, metadata, and other information in a manner that conforms to R package standards. It can be particularly difficult for users less experienced with R’s technical underpinnings.

Here, we discuss two packages designed to streamline the package development process — devtools and roxygen2.


Building an example package: squaresPack

Readers unfamiliar with the basic structure of an R package may wish to consult our full article. Here, we build a toy package called squaresPack using the code stored here.

R package development requires building a directory of files that include the R code, documentation, and two specific files containing required metadata. (The canonical source for information on package development for R is the extensive and sometimes daunting document, Writing R Extensions.)

As an example, imagine that we wish to create a simple package containing only the following two functions.

## Function 1: Sum of squares
addSquares <- function(x, y){
  return(list(square=(x^2 + y^2), x = x, y = y))
}

## Function 2: Difference of squares
subtractSquares <- function(x, y){
  return(list(square=(x^2 - y^2), x = x, y = y))
}

Here is an example of how the directory for a simple package should be structured:

squaresPack/
  DESCRIPTION
  NAMESPACE
  R/
    addSquares.R
    subtractSquares.R
  man/
    addSquares.Rd
    subtractSquares.Rd

First, we store all R source code in the subdirectory R.  Second, corresponding documentation should accompany all functions that users can call. This documentation is stored in the subdirectory labeled man.  As an example, the file addSquares.Rd would be laid out as follows.

\name{addSquares}
\alias{addSquares}
\title{Adding squared values}
\usage{
  addSquares(x, y)
}
\arguments{
 \item{x}{A numeric object.}
 \item{y}{A numeric object with the same dimensionality as \code{x}.}
}
\value{
  A list with the elements
 \item{squares}{The sum of the  squared values.}
 \item{x}{The first object input.}
 \item{y}{The second object input.}
}
\description{
  Finds the squared sum of numbers.
}
\note{
  This is a very simple function.
}
\examples{
myX <- c(20, 3);  myY <- c(-2, 4.1)
addSquares(myX, myY)
}
\author{
  Jacob M. Montgomery
}

Third, the directory must contain a file named DESCRIPTION that documents the directory in a specific way. The DESCRIPTION file contains basic information including the package name, the formal title, the current version number, the date for the version release, and the name of the author and maintainer. Here we also specify any dependencies on other R packages and list the files in the R subdirectory.

Package: squaresPack
Title: Adding and subtracting squared values
Version: 0.1
Author: Jacob M. Montgomery and Ryan T. Moore
Maintainer: Ryan T. Moore <rtm@american.edu>
Description: Find sum and difference of squared values
Depends: R (>= 3.1.0)
License: GPL (>= 2)
Collate:
    'addSquares.R'
    'subtractSquares.R'

Finally, the NAMESPACE file is a list of commands that are run by R when the package is loaded to make the R functions, classes, and methods defined in the package visible to R and the user. This is a much more cumbersome process when class structures and methods must be declared, as we discuss briefly below. For the present example, the NAMESPACE file is quite simple, telling R to allow the user to call our two functions.

export(addSquares)
export(subtractSquares)

Once all of that is set up, however, several steps remain.  A minimal checklist for updating a package and submitting it to CRAN might look like the following:

  1. Edit DESCRIPTION file
  2. Change R code and/or data files.
  3. Edit NAMESPACE file
  4. Update man files
  5. R CMD build --resave-data=no pkg
  6. R CMD check pkg
  7. R CMD INSTALL pkg
  8. Build Windows version to ensure compliance by submitting to: http://win-builder.r-project.org/
  9. Upload to CRAN (Terminal below, or use other FTP client):
    > ftp cran.r-project.org
    > cd incoming
    > put pkg_0.1-1.tar.gz
  10. Email R-core team: cran@r-project.org

We have been part of writing four R packages over the course of the last six years. In order to keep track of all the manual updating steps, one of us created a 17-point checklist outlining the steps required each time a package is edited, and we expect that most authors will welcome some automation. The packages devtools and roxygen2 promise to improve upon this hands-on maintenance and allow authors to focus more on improving the functionality and documentation of their package rather than on bookkeeping.


 Building with devtools and roxygen2

The devtools approach streamlines several steps: it creates and updates appropriate documentation files; it eliminates the need to leave R to build and check the package from the terminal prompt; and it submits the package to win-builder and CRAN and emails the R-core team from within R itself. After the initial directory structure is created, the only files edited directly by the author are contained in the R directory (with one exception — the DESCRIPTION file should be reviewed before the package is released). This is possible because devtools automates the writing of the help files and the NAMESPACE file, and the updating of the DESCRIPTION file, relying on information placed directly in the *.R files.

We will provide some examples below, but here is a helpful video we recently discovered that covers some of the same ground for users of RStudio:

There are several advantages to developing code with devtools, but the main benefit is improved workflow. For instance, adding a new function to the package using more manual methods means creating the code in a *.R file stored in the R subdirectory, specifying the attendant documentation as a *.Rd file in the man subdirectory, and then updating the DESCRIPTION and NAMESPACE files. In contrast, developing new functions with devtools requires only editing a single *.R file, wherein the function and its documentation are written simultaneously. devtools then updates the documentation and package metadata with no further attention from the author.

Thus, one key advantage of using devtools to develop a package is that the R files will themselves contain the information for generating help files and updating metadata files. Each function is accompanied by detailed comments that are parsed and used to update the other files. As an example, here we show the addSquares.R file as it should be written to create the same help files and NAMESPACE files shown above.

#' Adding squared values
#'
#' Finds the sum of squared numbers.
#'
#' @param x A numeric object.
#' @param y A numeric object with the same dimensionality as \code{x}.
#'
#' @return A list with the elements
#' \item{squares}{The sum of the squared values.}
#' \item{x}{The first object input.}
#' \item{y}{The second object input.}
#' @author Jacob M. Montgomery
#' @note This is a very simple function.
#' @examples
#'
#' myX <- c(20, 3)
#' myY <- c(-2, 4.1)
#' addSquares(myX, myY)
#' @rdname addSquares
#' @export
addSquares <- function(x, y){
   return(list(square=(x^2 + y^2), x = x, y = y))
}

The text following the #' symbols is processed by R during package creation to make the *.Rd and NAMESPACE files. The @param, @return, @author, @note, @examples, and @seealso commands specify the corresponding blocks in the help file. The @rdname block overrides the default setting to specify the name of the associated help file, and @export instructs R to add the necessary commands to the NAMESPACE file.
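For completeness, here is a parallel sketch of how subtractSquares.R might be documented in the same style (the descriptive text is our own illustration, not taken verbatim from the package):

#' Subtracting squared values
#'
#' Finds the difference of squared numbers.
#'
#' @param x A numeric object.
#' @param y A numeric object with the same dimensionality as \code{x}.
#'
#' @return A list with the elements
#' \item{squares}{The difference of the squared values.}
#' \item{x}{The first object input.}
#' \item{y}{The second object input.}
#' @author Jacob M. Montgomery
#' @examples
#'
#' myX <- c(20, 3)
#' myY <- c(-2, 4.1)
#' subtractSquares(myX, myY)
#' @rdname subtractSquares
#' @export
subtractSquares <- function(x, y){
  return(list(square=(x^2 - y^2), x = x, y = y))
}

We now walk through the steps required to initialize and maintain a package with devtools.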


Setting up the package

Creating an R package from these augmented *.R files is straightforward. First, we must create the basic directory structure using

library(devtools)             ## load devtools (roxygen2 must also be installed)
setwd("~/Desktop/MyPackage/") ## Set the working directory
create("squaresPack")         ## create the package skeleton

Second, we edit the DESCRIPTION file to make sure it contains the correct version, package name, dependencies, licensing, and authorship of the package. The create() call will produce a template for you to fill in. The author will need to add something like

Author: Me
Maintainer: Me <Me@myemail.edu>

to this template DESCRIPTION file. You need not keep track of the various R files to be collated; devtools will automatically collate all R files contained in the various subdirectories. Third, place the relevant R scripts in the R directory. Finally, making sure that the working directory is correctly set, we can create and document the package using three commands:

current.code <- as.package("squaresPack")
load_all(current.code)
document(current.code)

The as.package() command will load the package and create an object representation (current.code) of the entire package in the user’s workspace. The load_all() command will load all of the R files from the package into the user’s workspace as if the package were already installed. The document() command will create the required documentation files for each function and the package, as well as update the NAMESPACE and DESCRIPTION files.
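Because load_all() makes the package’s functions available in the current session, a quick sanity check is easy (the expected output is shown as comments):

addSquares(2, 3)
## $square
## [1] 13
##
## $x
## [1] 2
##
## $y
## [1] 3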


Sharing the package

Once all of this is in place, the author prepares the package for wider release from within R itself. To build the package as a compressed file in your working directory, run build(current.code, path=getwd()). The analogous build_win() command will upload your package to the win-builder website; the package will be built in a Windows environment, and an email with the results will be sent to the maintainer’s address from the DESCRIPTION file in about thirty minutes. Both of these compressed files can be uploaded to websites, sent by email, or stored in replication archives. Other users can simply download the package and install it locally.
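For example, if build() produced a file named something like squaresPack_0.1.tar.gz (the name follows the Version field in DESCRIPTION), another user could install and use it locally with base R:

install.packages("squaresPack_0.1.tar.gz", repos = NULL, type = "source")
library(squaresPack)
addSquares(c(20, 3), c(-2, 4.1))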

The list below provides a minimal checklist for editing and submitting an existing R package using devtools.

  1. Edit R code and/or data files
  2. Run as.package(), load_all(), and document()
  3. Check the code: check(current.code)
  4. Make a Windows build: build_win(current.code)
  5. Double-check the DESCRIPTION file
  6. Submit the package to CRAN: release(current.code, check=FALSE)

The check() command is analogous to the R CMD check from the terminal, but it also (re)builds the package. Assuming that the package passes all of the required checks, it is now ready for submission to CRAN. As a final precaution, we recommend taking a moment to visually inspect the DESCRIPTION file one last time to ensure that it contains the correct email address for the maintainer and the correct release version. Finally, the release() command will submit the package via FTP and open up the required email using the computer’s default email client.
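Pulling these steps together, a typical maintenance session might look like the following sketch (release() will ask a series of confirmation questions before actually submitting anything):

library(devtools)
setwd("~/Desktop/MyPackage/")             ## directory containing squaresPack/
current.code <- as.package("squaresPack")
load_all(current.code)                    ## reload the edited R code
document(current.code)                    ## regenerate man/ files, NAMESPACE, DESCRIPTION
check(current.code)                       ## rebuild and run the package checks
build_win(current.code)                   ## optional: test a Windows build via win-builder
## after one last look at the DESCRIPTION file:
release(current.code, check = FALSE)      ## submit the package to CRAN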


Conclusion

We have outlined the components of a simple R package and two approaches for developing and maintaining one. In particular, we illustrated how the devtools package can aid package authors by automating several steps of the maintenance process. It allows authors to focus on editing only the *.R files, since both documentation and metadata files are updated automatically, and it automates steps such as submission to CRAN via FTP.

While we believe that the devtools approach to creating and managing R packages offers several advantages, there are potential drawbacks. We routinely use others of Hadley Wickham’s excellent packages, such as reshape, plyr, lubridate, and ggplot2. On one hand, each of them offers automation that greatly speeds up complex processes such as attractively displaying high-dimensional data. However, it can also take time to learn a new syntax for old tricks (like specifying x and y limits for a plot). Such frustrations may make package writers hesitant to give up the full control of a more manual maintenance system. By making one’s R code conform to the requirements of the devtools workflow, one loses some degree of flexibility.

Yet, devtools makes it simpler to execute the required steps efficiently. It promises to smoothly integrate package development and checks, cut out the need to switch between R and the command line, and greatly reduce the number of files and directories that must be manually edited. Moreover, the latest release of the package contains many further refinements. It is possible, for instance, to build packages directly from GitHub repositories, create vignettes, and create clean environments for code development. Thus, while developing R packages and code in a manner consistent with devtools does require re-learning some basic techniques, we believe that it comes with significant advantages for speeding up development while reducing the degree of frustration commonly associated with transforming a batch of code into a package.
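For example, if the package lived in a (hypothetical) GitHub repository, collaborators could install the development version directly from within R:

library(devtools)
## "username/squaresPack" is a placeholder; substitute the actual repository
install_github("username/squaresPack")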

Posted in Uncategorized

What does a failed replication really mean? (or, One cheer for Jason Mitchell)

A few weeks ago, Jason Mitchell wrote a piece entitled “On the emptiness of failed replications.” Mitchell is a professor in Harvard University’s department of Psychology studying “the cognitive processes that support inferences about the psychological states of other people and introspective awareness of the self.” In practice, this means his lab spends a lot of time doing experiments with fMRI machines.

It is worth saying at the outset that I don’t agree with Mitchell’s core claim: unlike him, I believe that failed replications can have a great deal of scientific value. However, I believe that there is a grain of truth in his argument that we should consider. Namely, I think that failed replications should be thought about in the same way that we think about the initial successful experiment: skeptically. A positive result is not proof of success, and a failed replication is not proof of failure; the interpretation of both must be uncertain and ambiguous at first. Unfortunately, I have observed that most of us (even highly trained scientists) find it hard to make that kind of uncertainty a part of our everyday thinking and communicating.

The thesis of Mitchell’s argument is summarized in his opening paragraph:

Recent hand-wringing over failed replications in social psychology is largely pointless, because unsuccessful experiments have no meaningful scientific value. Because experiments can be undermined by a vast number of practical mistakes, the likeliest explanation for any failed replication will always be that the replicator bungled something along the way.  Unless direct replications are conducted by flawless experimenters, nothing interesting can be learned from them.

Why do I believe that Mitchell’s claim about the scientific value of failed replication is wrong? Lots of other statisticians and researchers have explained why very clearly, so I will just link to their work and present a brief sketch of two points:

  1. Failed replications have proved extremely informative in resolving past scientific controversies.
  2. Sampling variation and statistical noise cannot be definitively excluded as explanations for a “successful experiment” without a very large sample and/or replication.

First, the consistent failure to replicate an initial experiment has often proven informative about what we could learn from that initial experiment (often, very little). Consider as one example the Fleischmann–Pons experiment apparently demonstrating cold fusion. Taken at face value, this experiment would seem to change our theoretical understanding about how nuclear fusion works. It would also seem to necessitate the undertaking of intense scientific and engineering study of the technique to improve it for commercial and scientific use. But if we add the fact that no other scientists could ever make this experiment work, despite sustained effort by multiple teams, then the conclusion is much different and much simpler: Fleischmann and Pons’ experiment was flawed.

Second, and relatedly, Mitchell seems to admit multiple explanations for a failed replication (error, bias, imperfect procedure) but only one explanation for the initial affirmative result (the experiment produced the observed relationship):

To put a fine point on this: if a replication effort were to be capable of identifying empirically questionable results, it would have to employ flawless experimenters. Otherwise, how do we identify replications that fail simply because of undetected experimenter error? When an experiment succeeds, we can celebrate that the phenomenon survived these all-too-frequent shortcomings. But when an experiment fails, we can only wallow in uncertainty about whether a phenomenon simply does not exist or, rather, whether we were just a bit too human that time around.

The point of conducting statistical significance testing is to exclude another explanation for a successful experiment: random noise and/or sampling variation produced an apparent result where none in fact exists. There is also some evidence to suggest that researchers consciously or unconsciously make choices in their research that increase the probability of passing a statistical significance test even when the null is true (so-called “p-hacking”).

Perhaps Mitchell believes that passing a statistical significance test and peer review definitively rules out alternative explanations for an affirmative result. Unfortunately, that isn’t necessarily the case. In joint work with Ahra Wu I find evidence that, even under ideal conditions (with no p-hacking, misspecification bias, etc.), statistical significance testing cannot prevent excess false positives from permeating the published literature. The reason is that, if most research projects are ultimately chasing blind alleys, the filter imposed by significance testing is not discriminating enough to prevent many false positives from being published. The result is one of the forms of “publication bias.”
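As a rough illustration of this filtering argument (using made-up inputs, not the estimates from our paper), consider how many statistically significant, and hence publishable, findings are false positives when most tested hypotheses are null:

## Hypothetical inputs, chosen for illustration only
pr.true <- 0.10   ## share of tested hypotheses that reflect a real relationship
power   <- 0.80   ## Pr(significant result | real relationship)
alpha   <- 0.05   ## Pr(significant result | no relationship), test size

## probability that any given study clears the significance filter
pr.sig <- pr.true * power + (1 - pr.true) * alpha
## share of "significant" (publishable) results that are false positives
(1 - pr.true) * alpha / pr.sig    ## = 0.36 with these inputs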

Yet despite all this, I think there is a kernel of insight in Mitchell’s argument. I think the recent interest in replication is part of a justifiably greater skepticism that we are applying to new discoveries. But we should also apply that greater skepticism to isolated reports of failed replication–and for many of the same reasons. Allow me to give one example.

One source of publication bias is the so-called “file drawer problem,” whereby studies of some phenomenon that produce null results never get published (or even submitted); thus, false positive results (that do get published) are never placed into their proper context. But this phenomenon is driven by the fact that evidence in favor of new theories is considered more scientifically important than evidence against theories without a wide following. But if concern about false positives in the literature becomes widespread, then replications that contradict a published result may become more scientifically noteworthy than replications that confirm that result. Thus, we may become primed to see (and publish) falsifying results and to ignore confirmatory results. The problem is the same as the file drawer problem, but in reverse.

Even if we do our best to publish and take note of all results, we can reasonably expect many replications to be false negatives. To demonstrate this, I’ve created a simulation of the publication/replication process. First, a true relationship (b) is drawn from the distribution of underlying relationships in a population of potential research studies; this population has pr.null proportion of relationships where b = 0. My initial simulation sets pr.null = 0 for demonstrative purposes; thus, b comes from the uniform density over [-2, -1] and [1, 2]. (I excluded the values between (-1, 1) to remove the possibility of small, noise-dominated relationships; the reason why will become clear once I present the results.) Then, I simulate an estimate produced by a study of this relationship with noise and/or sampling variation (= b.est) by adding b and an error term drawn from the normal distribution with mean = 0 and standard error = se.b, which is set to 0.5 in my initial run. If the resulting coefficient is statistically significant, then I replicate this study by drawing another estimate (b.rep) using the same process above.

However, I also allow for the possibility of “biased” replications that favor the null; this is represented by moving the b.rep coefficient a certain number of standard deviations closer to zero. The initial setting for bias is 0.5*se.b, meaning that I presume that a motivated researcher can move a replicated result closer to the null by 1/2 of a standard deviation by making advantageous choices in the data collection and analysis. In short, I allow for “p-hacking” in the process, but p-hacking that is designed to handicap the result rather than advantage it. The idea is that motivated researchers trying to debunk a published claim may (consciously or unconsciously) pursue this result.

The code to execute this simulation in R is shown here:

set.seed(123456)
rm(list=ls())

se.b <- 0.5       # std. error of est. beta
reps <- 1000      # number of MC runs
bias <- 0.5*se.b  # degree of replicator null bias
pr.null <- 0      # prior Pr(null hypothesis)

# where to store true, est., and replicated results
b.store <- matrix(data=NA, nrow=reps, ncol=3)
# where to store significance of est. and replicated betas
sig.store <- matrix(data=NA, nrow=reps, ncol=2)

pb <- txtProgressBar(init=0, min=1, max=reps, style=3)
for(i in 1:reps){
 
  setTxtProgressBar(pb, value=i)

  # draw the true value of beta
  if(runif(1) < pr.null){
    b <- 0
  }else{
    b <- sign(runif(1, min=-1, max=1))*runif(1, min=1, max=2)
  }
 
  # simulate an estimated beta
  b.est <- b + rnorm(1, mean=0, sd=se.b)
 
  # calculate if est. beta is statistically significant
  if( abs(b.est / se.b) >= 1.96){sig.init <- 1}else{sig.init <- 0}
 
  # if the est. beta is stat. sig., replicate
  if( sig.init == 1 ){
    
    # draw another beta, with replicator bias
    b.rep <- b + rnorm(1, mean=0, sd=se.b) - sign(b)*bias
    # check if replicated beta is stat. sig.
    if( abs(b.rep / se.b) >= 1.96){sig.rep <- 1}else{sig.rep <- 0}
 
  }else{b.rep <- NA; sig.rep <- NA}
 
  # store the results
  b.store[i, ] <- c(b, b.est, b.rep)
  sig.store[i, ] <- c(sig.init, sig.rep)
 
}
close(pb)

# plot estimated vs. replicated results
plot(b.store[,2], b.store[,3], xlab = "initial estimated beta", ylab = "replicated beta")
abline(0,1)
abline(h = 1.96*se.b, lty=2)
abline(h = -1.96*se.b, lty=2)

dev.copy2pdf(file="replication-result.pdf")

# false replication failure rate
1 - sum(sig.store[,2], na.rm=T)/sum(is.na(sig.store[,2])==F)

What do we find? In this scenario, about 30% of replicated results are false negatives; that is, the replication study finds no effect where an effect actually exists. Furthermore, these excess false negatives cannot be attributed to small relationships that cannot be reliably detected in an underpowered study; this is why I excluded the values of b between (-1, 1) from the prior distribution of the relationship b.
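A quick analytic check is consistent with this figure. For a single mid-range effect (b = 1.5, an illustrative choice) and the same se.b and bias values, the probability that the biased replication falls short of significance is about 0.3:

se.b <- 0.5; bias <- 0.25   ## same values as in the simulation
b    <- 1.5                 ## illustrative mid-range true effect
crit <- 1.96 * se.b         ## significance threshold on the coefficient scale
## replication estimate ~ Normal(b - bias, se.b); a false negative is a draw inside (-crit, crit)
pnorm(crit, mean = b - bias, sd = se.b) - pnorm(-crit, mean = b - bias, sd = se.b)
## [1] 0.29, roughly in line with the simulated rate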

So: I believe that it is important not to replace one suboptimal regime (privileging statistically significant and surprising findings) with another (privileging replications that appear to refute a prominent theory). This is why many of the people advocating replication are in favor of something like a results-blind publication regime, wherein no filter is imposed on the publication process. As Ahra and I point out, that idea has its own problems (e.g., it might create an enormous unpaid burden on reviewers, and might also force scientists to process an enormous amount of low-value information on null results).

In summary: I think the lesson to draw from the publication bias literature, Mitchell’s essay, and the simulation result above is: the prudent course is to be skeptical of any isolated result until it has been vetted multiple times and in multiple contexts. Unexpected and statistically significant relationships discovered in new research should be treated as promising leads, not settled conclusions. Statistical evidence against a conclusion should be treated as reason for doubt, but not a debunking.

Posted in Replication, Statistics, The Discipline

Recruiting and Placing Political Science Majors in Successful Careers

Prof. Scott McClurg of Southern Illinois University Carbondale recently posted some thoughts about the political science major, and his suggestions for reform bear directly on methodologists. Specifically, Scott suggests substantially beefing up the applied statistics portion of our training:

But the idea we teach critical thinking skills — which we do when we are at our best — is a losing argument with the public. We can preach until we are blue in the face that employers care about good writing, creative thinking, problem solving, etc., but liberal arts majors still look like a risk to students who now have to pay more for their education. And all this despite employers saying they WANT those traits in their employees.

…This doesn’t mean we only teach quant skills, but it does mean to expose them to how to do some of the things we do and explain why it’s relevant.

Here’s my preliminary thinking on the topic; it isn’t fully formed, but it might get a few interesting conversations started:

  1. I think we need to differentiate between knowledge/skills that every college student needs and should be part of the general curriculum, and knowledge/skills that are peculiar to some application and should be part of a major. Critical thinking skills are (IMO) the former. Research design and applied statistics are the latter. Claiming that your major curriculum “imparts critical thinking skills” is questionable to me for this reason.
  2. I’m a methodologist, so it’s no surprise that I think our undergraduate major should include far more applied statistics. But I also think that Political Science can’t hang its hat on an applied statistics training alone; our students will always lose that race to majors in Applied Statistics and allied fields. Quantitative data analysis can’t be our (only) comparative advantage. I think we need to do more than we’re doing now, but I also think we can’t become a subfield of Applied Stats.
  3. Training in research design, inference, and epistemology might be one way we can form a comparative advantage over pure Applied Statistics majors, as they are generally more focused on programming and modeling than social science. “Big data” is definitely in demand in the modern economy, but presumably expertise in designing, collecting, and interpreting this data is at least equally important as expertise in managing and analyzing it (the Applied Stats piece).
  4. I presume we can also help our students to compete with other social science majors by helping them attain superior knowledge on topics associated with strong economic demand. Presumably this would include knowledge of: electoral politics and campaigning; mass opinion; policy analysis and impact assessment; area-specific knowledge of countries and regions including language mastery.
  5. Some aspects of education are investments in long-term success, and I don’t think we should forget about these as we think about helping our students position themselves for short-term jobs. A knowledge of political philosophy, American political development, gender politics, political psychology, and lots of other things might not help a student get a job tomorrow, but they might well help a student become the kind of person who excels in their field and rises to the highest levels. (Some things, like international conflict theory or comparative political economy, probably fit between “immediate applicability” and “long term value” depending on what a student’s interested in.)
  6. Nearly everyone gets their job through connections; the curriculum may help qualify students for more and better summer jobs/internships, but we can’t skimp on helping them to secure these vital entry points.

With all of the above factored in, I guess my dream curriculum would involve (i) the creation of a sequence of core courses in philosophy, research design, and applied statistics; (ii) a serious language fluency requirement; (iii) greater specialization internal to the major that corresponds to career tracks (e.g., replace “American Politics” with “Elections and Campaigning” or “Judicial Politics”, and then cut out the course requirements that don’t correspond to that track).

Many departments aren’t far from this already! But getting the rest of the way could be a painful process on many fronts.

Posted in Uncategorized

Scientific Updating as a Social Process

Justin Esarey:

TPM doesn’t do a great deal of “reblogging,” but I wanted to bring some attention to what I think is a brief and thought-provoking post about the process of science by Jay Ulfelder. This post speaks to what I think is a debate between those who believe that science is a set of procedures–that is to say, a way of thinking and doing research that produces sound results–and those who think of science as a social process that produces knowledge from a dialogue among people who fall far short of scientific ideals. I think you can see one instance of this ongoing debate in how people think about the ongoing “credibility revolution” in quantitative research, particularly with respect to results arguing that much published research is false.

Originally posted on Dart-Throwing Chimp:

Cognitive theories predict that even experts cope with the complexities and ambiguities of world politics by resorting to theory-driven heuristics that allow them: (a) to make confident counterfactual inferences about what would have happened had history gone down a different path (plausible pasts); (b) to generate predictions about what might yet happen (probable futures); (c) to defend both counterfactual beliefs and conditional forecasts from potentially disconfirming data. An interrelated series of studies test these predictions by assessing correlations between ideological world view and beliefs about counterfactual histories (Studies 1 and 2), experimentally manipulating the results of hypothetical archival discoveries bearing on those counterfactual beliefs (Studies 3-5), and by exploring experts’ reactions to the confirmation or disconfirmation of conditional forecasts (Studies 6-12). The results revealed that experts neutralize dissonant data and preserve confidence in their prior assessments by resorting to a complex battery of belief-system defenses that, epistemologically defensible or not…


Posted in Uncategorized

Spring 2014 Edition of The Political Methodologist

We are pleased to announce the Spring 2014 edition of The Political Methodologist! This special issue pertaining to gender diversity in political methodology is guest edited by Meg Shannon of the University of Colorado, Boulder. Click this link to access the PDF file.

This edition of TPM contains contributions from Meg Shannon, Michelle Dion, Tiffany Barnes, Emily Beaulieu, Yanna Krupnikov, Hazel Morrow-Jones, Jan Box-Steffensmeier, Fred Boehmke, Chris Achen, and Rumman Chowdhury.

Posted in Uncategorized

Diversity and Political Methodology: A Graduate Student’s Perspective

[Correction: a previous version of this post stated that Danica McKellar has a PhD in Mathematics. Ms. McKellar has a BS in Mathematics from UCLA.]

The following post is written by Rumman Chowdhury, a graduate student in political science at the University of California at San Diego. Chowdhury received an undergraduate degree in political science and management science from MIT, and a master’s degree in quantitative methods in the social sciences from Columbia University.

Graduate students, regardless of field of study, gender, race, or any other distinguishing characteristic, generally feel isolated from their peers, petrified of passing comprehensive exams and dissertation defenses, and intimidated by the academic job market looming ahead. The insecurity of being in a PhD program is compounded by being a minority. Women, as the previous posts have established, are a minority within political science, and especially within methods.

It is daunting to discuss why there are not enough women in quantitative fields. It requires identifying macro-level societal issues and homing in on micro-level individual behaviors, while couching both in the framework of a larger discussion of what it means to be a woman and have a successful career. In an attempt to simplify, I focus on two selection mechanisms that may pull women away from successful and rewarding careers in quantitative fields: selection due to early-development socialization and selection due to gender-differential social pressures.

Early development socialization

A significant amount of work has explored the gender and math question. In short, the literature illustrates a weeding-out process, whereby girls grow up in a society that explicitly and implicitly deters them from quantitative fields. We inadvertently groom our young girls to give up when faced with difficult problems, and compound that with messages that math isn’t for them.

Psychologist Carol Dweck’s Mindset establishes two types of learners: fixed and growth. Individuals of a fixed mindset believe that qualities like intelligence or ability are innate traits that can be refined, but not significantly improved on. In contrast, those with a growth mindset believe that most skills are a function of hard work and dedication, not simply talent. These mindsets are an important and distinguishing characteristic when the individual is faced with difficulties. Those with a fixed mindset are more averse to challenges, as they fear failure will define them as “not smart,” whereas those with a growth mindset are more capable of dealing with failure, as it is perceived as part of the learning process. While this mindset is seen in both genders, women are particularly susceptible to adopting a fixed mindset in math. [1] This causal mechanism has three parts.

At an early age, young girls are more developmentally advanced than young boys. Their brains develop language faster, which results in young girls who are more expressive than their male counterparts [2]. Adults often attribute this linguistic ability to intelligence. Accordingly, we give results-based compliments to our precocious little girls, saying things like “You’re so clever!” for actions that come naturally to them. Parents also offer words of encouragement to children who are not as capable, so little boys may hear action-based compliments, like “You’re working really hard on that!” Dweck establishes that this form of reinforcement creates a fixed mindset. Paradoxically, more capable children can internalize the idea that intelligence is intrinsic, while their less able counterparts associate ability with hard work.

The second part of the mechanism that leads young girls to believe that they are not good at math is the tired, but still extant, stereotype that math is for boys. [3] [4] Cvencek et al. observe this math–gender stereotype expressed in Implicit Association Test results for children ages 6 to 10. Children associated math with male, and little boys identified more strongly with math than little girls did.

The literature on mindsets is particularly salient with regard to math and science. A two-year panel study of 7th graders found that children with growth mindsets significantly outperformed children with fixed mindsets, even though they both entered the analysis period with equal prior math achievement. Growth mindsets predicted success in college-level organic chemistry, when controlling for prior math ability.  Similarly, women primed with a fixed mindset treatment performed significantly worse than women provided with a growth mindset treatment [5].

Society’s attempts to remedy this situation run the spectrum from offensive to brilliant. The poster child of well-intended, but rather ridiculous, “math for girls” is Danica McKellar’s series “Math Doesn’t Suck.” Ms. McKellar (most famously known as Winnie Cooper from The Wonder Years), who has a BS in math and her name on a theorem, writes a series of books in pastel-colored, curly fonts that sport covers that look like Cosmo. I’m surprised the I’s aren’t dotted with hearts. Her hair is always perfect, her shirt unbuttoned just so, and her head is tilted at a come-hither (to math, of course) 45-degree angle. At the other end of the spectrum, we have the promising line of girl engineering toys, GoldieBlox, created via Kickstarter by Debbie Sterling, a Stanford engineer. GoldieBlox has received much press for its innovative line of fun, creative toys that encourage little girls to use simple math and engineering skills to solve problems – and for its clever use of a Beastie Boys song.

As anyone who has ever shopped for young girls will know, GoldieBlox is the exception, rather than the rule. Girls who like math and grow up to be women in math-oriented fields are considered to be anomalies. What does this mean for women who do enter these fields, whether in academia or in the professional world? It means that we sit in classrooms where we are the only female. It means we sit in client meetings where we are mistaken for the secretary. It means our male advisors and supervisors occasionally get the wink-nudge “I see why you’ve got her working for you.” At this year’s MPSA, I was a panelist in a nearly-packed room. As I got up to speak, it occurred to me that I was one of only four women in the room, and the only person of color. We have few role models and little encouragement. Promising women are deterred from productive careers in quantitative fields, based on socialization rather than ability.

Gender-differential social pressures

Graduate school is best described as a monastic experience. We lead a sparse life, consumed by esoteric information-gathering that is appreciated by a small group. While most people respect a PhD, they generally have little understanding of what it is we do, exactly. Most of us manage to eke out a personal life, but it is generally limited by the all-encompassing nature of research and by the lack of disposable income. We continue to live in the pizza and free beer world while our non-academic counterparts move on to fancier affairs.

What does this mean for a woman in graduate school? In short, we are alienated. Most graduate students are in their twenties. The average completion time for a Political Science PhD is 6.5 years, according to the National Research Council. The average age that Americans get married is 27 for women and 29 for men – skewing slightly downward for women and slightly upward for men. The pressure for women in their twenties to get married and have children is intense.  The social expectations for women in their twenties can be at odds with what it means to be a successful graduate student.

How are we alienated by society? First, we are constantly bombarded with messages and images of what it means to be a successful woman (hint: it has little to do with R skills). A glance at any 20-something’s Facebook feed of female friends is a menagerie of engagements, first dances, and tastefully planned flower arrangements. This progresses to baby bumps, knitting projects, and first steps. Most societies applaud these accomplishments. Recently, a cousin of mine was married. In true South Asian style, the celebration was a week of elaborate parties, delicious food, and an obscene amount of gold. At the same time, her sister ranked first in her class at a competitive pharmacy graduate program. Even in my often-ostentatious society, there’s no comparable reward for scholastic achievement.

How are we alienated by our graduate programs? The monastery analogy extends further to encompass family. Maternity leave is a recent allowance at many graduate programs. While a promising boost, what many schools offer is simply an extension of in-absentia status. This can mean no pay, along with the potential of being removed from student health insurance, losing visa status, or having suspended student loan payments reinstated. Expanded maternity allowances can help with the practical challenge of juggling academics and family, but they do little to address the culture and perception of women who are pregnant or have children during their graduate career. Women who get married or become pregnant may be perceived as less serious, and an advisor with a limited amount of time and resources may choose to focus on more “promising” advisees. Similarly, the lack of emphasis on paternity leave reinforces the idea that the woman should be at home caring for children.

To better understand the culture, I suggest reading Anne-Marie Slaughter’s 2012 article Why Women Still Can’t Have it All. Her assessment of the work environment for mothers parallels the female academic experience. I often hear male colleagues complimented for being great dads because they shoulder child care while their wives work. In contrast, a female professor once confided to me that at her first post-maternity-leave faculty meeting, one of the male professors jokingly asked how her vacation was. Similarly, my female RA was hesitant to reveal her pregnancy to potential recommendation-writers, because she felt it would be counted against her. This article in Inside Higher Ed specifically addresses this issue in political science.

It is unfair to target only graduate programs for this stigma. As undergraduate seniors at MIT, many of us had the opportunity to interview with some of the top firms in any field. I remember being given a word of advice by an alumna who had launched a successful investment banking career. “When you’re asked the five-year question,” she said, “never say you plan on being married. They’ll see you as a liability.” She was referring to the generic interview question of “Where do you expect to be in five years, professionally and personally?” For men in their early twenties, the advice was the opposite. Young men who have a goal of being married are viewed as reliable and stable, while women who expect to get married are viewed as a waste of resources.

What can we do?

Graduate women in quantitative fields have consciously chosen an alternative path that is explicitly and implicitly discouraged by our environment. The issues I point out in this article – mindset development and social pressures – are daunting but not impossible to overcome. As a woman who has woven her way through the male-dominated environments of MIT, The Conference Board, analytic forecasting, and now quantitative methods within Political Science, I offer the following advice to women pursuing or considering graduate education in the male-dominated quantitative fields:

1. Observe your mindset. The best way to do this is to pay attention to your language. Women often attribute our success to luck and our failures to a lack of ability (interestingly, the pattern is reversed for men) [6]. This is magnified in quantitative fields, where our negative thoughts are validated by the actions and words of others. If you are a woman in a PhD program, you did not get there by chance. You are not “lucky” you got into a top-tier program. You are there because you belong there. To think otherwise is an insult to you and your hard work.

2. Hold your own. It is inevitable that you will be in situations that are uncomfortable for women, even if the males in the room don’t see it that way. A colleague of mine related that she is the TA for a graduate class composed of mid-career military professionals. While she emphasized that her students are very respectful to her, it is still a fragile situation. In neither this case nor my MPSA experience is the environment overtly disparaging or negative. But that is irrelevant; being the one who doesn’t belong only amplifies any insecurities and self-doubts. Notice the situation, acknowledge it, and own it.

One of my favorite stories to tell is from a conference where I was invited to dinner with senior academics in the field. It was me, eight tenured white male academics in their 60s and 70s, and one girlfriend of one of the academics in a rather small booth. It was an informal dinner where the food and wine flowed freely. I was the only sober individual while the others ranged from raucous to falling asleep at the table. By the end of the night, the waiter was deferring to me as the authority at the table, since I was the only person able to answer his questions clearly.

 3. Be friends with other women. One of the most self-defeating things that women do is alienate other women. By doing so we reinforce negative stereotypes of groups of women as catty, gossipy, and unproductive. We also cut off a resource for ourselves by internalizing our problems or airing them to individuals who cannot relate, and we put ourselves in a situation where we have to go it alone by shutting out those who have already done it.

 4. Be a mentor. Graduate students are often barely out of college. It is hard for us to view ourselves as mentors to anyone, yet we are in a unique position: we have authority but can still relate to our students. Use that to help your promising female students. Make it a point to ask them how they are doing, or to provide action-based compliments that develop a growth mindset, such as, “I can see you’re working really hard on this problem.”

Quantitative methods can be a free-form field. While there is a hurdle of learning a programming language and the basics of statistics, the rest of our learning is often project-specific. If methodologists have a problem to solve, we google packages, read vignettes, find GitHub accounts, and snag some code. We then hack away at our problem until the code works. We screw up quite a bit, and, at some point, screw up a little bit less. There is a degree of self-confidence required to tackle a problem in that manner.

Even that last sentence is a loaded statement. Women who have advanced in these fields usually do so in spite of their socialization and their environment. The rebuttal to the “confidence gap” literature is that these conceptions of the qualities of a good leader (or university professor) are predicated upon a path that has been forged by men. [7] While women may not be as aggressive about selling ourselves, that does not make us less qualified as methodologists. What departments can do to improve their environment for women could (and should) be the subject of another blog post.

Due to the broad accessibility of information, advances in technology and statistical methods, and the growth of applied data science, the walls of the ivory tower are crumbling. For political methodology to remain relevant and interesting, and to advance the field in a meaningful way, we must embrace diversity. It is a detriment to the field that qualified and capable women are being turned away, and there is much we can do to draw the best and brightest, regardless of gender.

*I would like to thank Elaine Denny and Kathryn Dove for their advice in crafting this article.

[1]   Henderson, V., & Dweck, C. S. (1991). Adolescence and achievement. In S. Feldman & G. Elliott (Eds.), At the threshold: Adolescent development (pp.197-216). Cambridge, MA: Harvard University Press.

[2]   Burman, Douglas, Bitan, Tali, & Booth, J. B. (2008). Sex differences in neural processing of language among children. Neuropsychologia, 46(5), 1349-1362.

[3]   Cvencek, Dario, Meltzoff, Andrew N., & Greenwald, Anthony G. (2011). Math–Gender Stereotypes in Elementary School Children. Child Development, 82(3).

[4]   Blackwell, L., Trzesniewski, K., & Dweck, C.S. (2007). Implicit Theories of Intelligence Predict Achievement Across an Adolescent Transition: A Longitudinal Study and an Intervention. Child Development, 78, 246-263.

[5]   Dar-Nimrod, I., & Heine, S.J. (2006). Exposure to scientific theories affects women’s math performance. Science, 314, 435.

[6]   http://www.nber.org/digest/aug08/w13922.html

[7]   http://www.theatlantic.com/features/archive/2014/04/the-confidence-gap/359815/

Posted in Uncategorized