Replication data for: The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation
hdl:1902.1/11047 UNF:3:jeUN9XODtYUp2iUbe8gWZQ==
Version: 4 – Released: Tue Apr 19 10:01:49 EDT 2011
If you use these data, please add the following citation to your scholarly references.
Original Publication
Results found in this publication can be replicated using these data.
"The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation," under review at Statistical Science
Data Citation Details
Study Global ID: hdl:1902.1/11047
Authors: Kosuke Imai (Department of Politics, Princeton University); Gary King (Institute for Quantitative Social Science, Harvard University); Clayton Nall (Institute for Quantitative Social Science, Harvard University)
Production Date: 2009
Distributor: IQSS Dataverse Network
Distributor Contact: king@harvard.edu
Deposit Date: January 25, 2008
Provenance
Abstract and Scope
Abstract

A basic feature of many field experiments is that investigators are only able to randomize clusters of individuals — such as households, communities, firms, medical practices, schools, or classrooms — even when the individual is the unit of interest. To recoup the resulting efficiency loss, some studies pair similar clusters and randomize treatment within pairs. However, many other studies avoid pairing, in part because of claims in the literature, echoed by clinical trials standards organizations, that this matched-pair, cluster-randomization design has serious problems. We argue that all such claims are unfounded. We also prove that the estimator recommended for this design in the literature is unbiased only in situations when matching is unnecessary; and its standard error is also invalid. To overcome this problem without modeling assumptions, we develop a simple design-based estimator with much improved statistical properties. We also propose a model-based approach that includes some of the benefits of our design-based estimator as well as the estimator in the literature. Our methods also address individual-level noncompliance, which is common in applications but not allowed for in most existing methods. We show that from the perspective of bias, efficiency, power, robustness, or research costs, and in large or small samples, pairing should be used in cluster-randomized experiments whenever feasible; failing to do so is equivalent to discarding a considerable fraction of one’s data. We develop these techniques in the context of a randomized evaluation we are conducting of the Mexican Universal Health Insurance Program.

Abstract Date: January 2008
Keywords: causal inference, community intervention trials, field experiments, group-randomized trials, place-randomized trials, health policy, matched-pair design, noncompliance, power
Country/Nation: Mexico
Geographic Coverage: 100 local health clusters
Data Availability
Number of Files: 11
Terms of Use
Dataverse Network Terms of Use
IQSS Dataverse Network Terms and Conditions

By downloading these Materials, I agree to the following:

  1. I will not use the Materials to
    1. obtain information that could directly or indirectly identify subjects.
    2. produce links among the Distributor's datasets or among the Distributor's data and other datasets that could identify individuals or organizations.
    3. obtain information about, or further contact with, subjects known to me except where the use and/or release of such identifying information has no potential for constituting an unwarranted invasion of privacy and/or breach of confidentiality.
  2. I agree not to download any Materials where prohibited by applicable law.
  3. I agree not to use the Materials in any way prohibited by applicable law.
  4. I agree that any books, articles, conference papers, theses, dissertations, reports, or other publications that I create which employ the data will reference the bibliographic citation accompanying this data. These citations include the data authors, data identifier, and other information in accordance with the Recommended Standard (http://thedata.org/citation/standard) for social science data.
  5. THE DISTRIBUTOR MAKES NO WARRANTIES, EXPRESS OR IMPLIED, BY OPERATION OF LAW OR OTHERWISE, REGARDING OR RELATING TO THE DATASET