The American Community Survey

Rev. March 25, 2017

General Information

The American Community Survey (ACS) is the ambitious national survey from the U.S. Census Bureau that is replacing the long-form portion of the decennial census for the new millennium. While some version of this survey has been in the field since 1999, it was not fully implemented in terms of coverage until 2006. In January 2005 it was expanded to cover all counties in the country and the full household sampling strategy began: the survey is mailed to about 250,000 households nationwide every month of every year, a rate of about 1 in 40 households per year. Persons living in group quarters (such as nursing homes, dormitories and prisons) were not added to the survey until January 2006. (The original plan was to begin GQ coverage in 2005, but last-minute budget reductions delayed it for a year.) That addition completed the sample as planned, albeit several years later than originally intended.

In any given year about 2.5% (1 in 40) of U.S. households will receive the survey. Over any 5-year period about 1 in 8 households should receive the survey (compared with about 1 in 6 that received the long form in the 2000 census). Unfortunately, receiving the survey is not the same as responding to it, because the Bureau follows up with only a sample of the households that do not respond. This has resulted in something closer to 1 in 11 households actually participating in the survey over any 5-year period.

Data Release Plan Based on Population Size of Geographic Area

Data based on the ACS surveys for any calendar year will be published in the late summer of the following year for geographic areas with a minimum of 65,000 population. For smaller areas the Bureau will only publish data based on surveys for multiple consecutive years as follows:

  • For geographic areas with a population of 20,000 or more, data will be published based on 3 consecutive years of survey data. Thus, for example, the first time we saw data tabulated for Jefferson City, MO (population around 40,000) was in December 2008, and it was based upon the surveys done in 2005 through 2007. In late October 2009 data were released based upon the surveys taken in 2006-2008, and so on.

    NOTE: Starting with vintage year 2014 (published in calendar year 2015) these 3-year period estimates will NOT be created or published.

  • For all geographic areas regardless of population size (down to the block group level, but NOT to the block level) data will be published based on 5 consecutive years of survey data. Thus, for example, data for the majority (60 out of 115) of counties in Missouri, for 46 out of 53 counties in North Dakota, and for all census tracts and block groups everywhere will first appear some time late in 2010 and will be based upon the combined survey data of 2005 through 2009. New data for these areas should then be published each year, based upon the most recent 5 years of surveys. Even if a census tract should happen to meet the population threshold of 20,000 (which is rare), no data will be published for it other than the 5-year "period estimates". Similarly, data at the ZIP code/ZCTA and state legislative district levels will only be published as 5-year period estimates, even though there are many ZIP codes and LDs that meet the 20,000 threshold.
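The thresholds above can be summarized in a few lines of code. What follows is only a sketch of the rules as described in this section: the function name, the geography labels and the vintage cutoffs (first 3-year product in vintage 2007, first 5-year product in vintage 2009, last 3-year product in vintage 2013) are our reading of the text, not an official Bureau specification.

```python
# Geographic levels that, per the text above, only ever get 5-year estimates.
# The label strings are hypothetical; they are not codes used by the Bureau.
FIVE_YEAR_ONLY_LEVELS = {
    "tract",
    "block group",
    "zcta",
    "state legislative district",
}


def acs_products(population, geo_level, vintage):
    """List the ACS period estimates expected for one area and vintage year."""
    level = geo_level.lower()
    if level == "block":
        return []  # nothing is published at the block level
    restricted = level in FIVE_YEAR_ONLY_LEVELS
    products = []
    if not restricted and population >= 65_000 and vintage >= 2005:
        products.append("1-year")
    # 3-year estimates ran from 2005-2007 (vintage 2007) through vintage 2013.
    if not restricted and population >= 20_000 and 2007 <= vintage <= 2013:
        products.append("3-year")
    # The first 5-year estimates covered 2005-2009 (vintage 2009).
    if vintage >= 2009:
        products.append("5-year")
    return products


print(acs_products(40_000, "place", 2008))   # a city the size of Jefferson City
print(acs_products(40_000, "place", 2014))
print(acs_products(25_000, "tract", 2012))
```

The sketch ignores the table-level suppression discussed in the next section, so it indicates only which products exist for an area, not which tables survive.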

Suppressing Tables

In addition to the population threshold rules used to limit the publication of data for geographic areas, the Bureau also applies its "data release rules" to each table for each geographic area that passes the total population threshold filter. Basically, it analyzes the cells of a table and assigns a measure of statistical reliability to each cell based on the margin of error. We have indicated our dismay at the algorithm elsewhere (see "Thing 6" in our Ten Things to Know... page). The link there to the document that describes the algorithm is broken, but we located a copy (or at least something quite similar). The following excerpt from the 8-page document outlines the method. Be sure to note the final sentence:

Data Release Rules

Even with the population size thresholds described earlier, in certain geographic areas some very detailed tables might include estimates with unacceptable reliability. Data release rules, based on the statistical reliability of the survey estimates, were first applied in the 2005 ACS. These release rules apply only to the 1- and 3-year data products. The main data release rule for the ACS tables works as follows. Every detailed table consists of a series of estimates. Each estimate is subject to sampling variability that can be summarized by its standard error. If more than half of the estimates in the table are not statistically different from 0 (at a 90 percent confidence level), then the table fails. Dividing the standard error by the estimate yields the coefficient of variation (CV) for each estimate. (If the estimate is 0, a CV of 100 percent is assigned.) To implement this requirement for each table at a given geographic area, CVs are calculated for each table’s estimates, and the median CV value is determined. If the median CV value for the table is less than or equal to 61 percent, the table passes for that geographic area and is published; if it is greater than 61 percent, the table fails and is not published. Whenever a table fails, a simpler table that collapses some of the detailed lines together can be substituted for the original. If the simpler table passes, it is released. If it fails, none of the estimates for that table and geographic area are released. These release rules are applied to single- and multiyear period estimates based on 3 years of sample data. Current plans are not to apply data release rules to the estimates based on 5 years of sample data.
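The median-CV test quoted above is easy to try out on a table's published numbers. The following is a minimal sketch, not the Bureau's code: the function names are ours, and we assume the inputs are estimates with their published 90 percent margins of error, which are converted to standard errors by dividing by 1.645. (Note that 1/1.645 is about 0.608, which is presumably where the 61 percent cutoff comes from.)

```python
from statistics import median

Z_90 = 1.645  # ACS margins of error are published at the 90 percent level


def median_cv(estimates, moes):
    """Median coefficient of variation across the cells of one table."""
    cvs = []
    for estimate, moe in zip(estimates, moes):
        if estimate == 0:
            cvs.append(1.0)  # the rule assigns a CV of 100 percent to zeros
        else:
            standard_error = moe / Z_90
            cvs.append(standard_error / abs(estimate))
    return median(cvs)


def table_passes(estimates, moes, cutoff=0.61):
    """True if the table would be published under the median-CV rule."""
    return median_cv(estimates, moes) <= cutoff


# Hypothetical cell values for two small tables (not real ACS data).
print(table_passes([100, 200, 50], [30, 40, 60]))
print(table_passes([100, 50, 0], [30, 60, 10]))
```

Because the test uses the median, a table can pass even when nearly half of its cells are very unreliable, which is worth remembering when reading published tables.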

Access the Data

To access the data within the MCDC data archive via Uexplore/Dexter go to the ACS section of the Uexplore/Dexter home page and follow the link to the desired vintage (e.g. follow the acs2014 link to access data based on 2014 vintage data). Here are those links (as of early 2017):
acs2015 acs2014 acs2013 acs2012 acs2011 acs2010 acs2009 acs2008 acs2007 acs2006 acs2005

The acs directory is generic (i.e. not time or data product specific) and contains a number of interesting documents. The Readme file/web page you are currently reading is contained in this directory.

The MCDC created a series of xsamples mini tutorials (back in 2010) that demonstrated how to access data in the archive. Several of these pertained to the ACS data, including the xsamples module, acsBasetbls, which combines frames-based access to a collection of input and output modules with links to video modules that show and tell how the extraction works. Even if it is getting old, it is reassuring to note that it is still relevant and can be used to see how things work today (i.e. when accessing 2014 or 2015 vintage ACS data). We have made enhancements since 2010, of course, but the sample code created back then still works.

Base (aka "Summary") Table Access Via Uexplore/Dexter

Most casual users, for most queries, would probably be better off using American FactFinder (AFF). If you are looking for basic data for just a few geographic areas, then our ACS Profiles and/or ACS Extract Assistant apps (see that section below) are hard to beat for ease of use combined with good flexibility. Accessing the more complex summary (aka "base") tables is much more challenging on our site. You have to use the Uexplore/Dexter modules, which not everybody wants to learn about. You also have to discover that we now keep two separate subdirectories of the acs data directories for these base tables (e.g. acs2014/basetbls and acs2014/btabs5yr). Within these subdirectories you'll find data sets that come in groups of six, with names ending in 00_07, 08, 09_16, 17_20, 21_24 and 25_27. These are "topic intervals". For example, a data set ending in 17_20 will contain all tables relevant to topics 17 (Poverty), 18 (Disability), 19 (Income) and 20 (Earnings). Not everybody who needs to use ACS data knows or wants to know about topic codes. They should use American FactFinder.

One of the keys to using these large datasets is knowing what tables are available and what each data cell within those tables represents. For this we have created yet another subdirectory called Varlabs (within the basetbls subdirectory). There are (usually) 8 files within these metadata collections. Two are small and trivial but very useful. The first, TableTopicCodes.txt, contains this:

 The first 2 digits of a base table number are the topic code.  So
 if you are looking for tables related to poverty (for example) you need 
 look only at tables B17xxx and C17xxx .  These tables would be found in 
 a data set such as  ustabs17_203yr  (3-year period estimates with all 
 tables in topic 17 through 20.)  The "Topic Group" is part of the data set
Topic Group 00_07: Basic Demographics and Ancestry
  00 = Unweighted sample counts. 
  01 = Age and Sex
  02 = Race
  03 = Hispanic or Latino Origin
  04 = Ancestry
  05 = Foreign Born, Citizenship
  06 = Place of Birth
  07 = Residence Last Year, Migration

Topic Group 08: Journey to Work 
  08 = Journey to Work, Worker Characteristics

Topic Group 09_16: Households, Families, Misc. Social Characteristics
  09 = Children, Relationship
  10 = Grandparents, Age of HH members
  11 = Households, Families
  12 = Marital status
  13 = Fertility
  14 = School enrollment
  15 = Educational attainment
  16 = Language spoken at home

Topic Group 17_20: Income and Poverty 
  17 = Poverty
  18 = Disability
  19 = Income (Household, family)
  20 = Earnings (Individuals)

Topic Group 21_24: Employment and Related Items
  21 = Veteran status
  22 = Transfer Programs, Food Stamps
  23 = Employment status
  24 = Industry, Occupation, Class of Worker

Topic Group 25_27: Housing, Group Quarters and Insurance Coverage
  25 = Housing Characteristics
  26 = Group Quarters
  27 = Insurance Coverage (collected only since 2008, N.A. on 2007-2011 5-year data)
  28 = Computer Ownership and Internet Use

Topic Group 99: Imputation (NA on this site) 
  99 = Imputation tables. 

The even smaller file, BaseTableAlphaSuffixes.txt, contains:

 American Community Survey Table Alpha Suffix Codes
A  White alone
B  Black or African American alone
C  American Indian & Alaska Native alone
D  Asian alone
E  Native Hawaiian & Other PI alone 
F  Some Other Race alone
G  Two or more races
H  White alone, not Hispanic or latino
I  Hispanic or latino  
These files just about never change. We just copy them each new data cycle.
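If you work with many tables, the lookup from table number to topic group described in TableTopicCodes.txt can be automated. The sketch below is illustrative only: the function name is ours, and placing topic 28 in group 25_27 simply follows the listing above; we have not verified it against every vintage.

```python
# Topic groups as listed in TableTopicCodes.txt. Topic 28 appears under the
# 25_27 heading in that listing, so it is mapped there (an assumption).
TOPIC_GROUPS = {
    "00_07": range(0, 8),
    "08": range(8, 9),
    "09_16": range(9, 17),
    "17_20": range(17, 21),
    "21_24": range(21, 25),
    "25_27": range(25, 29),
}


def topic_group(table_id):
    """Return the topic-group suffix for a base table ID, or None."""
    topic = int(table_id[1:3])  # the two digits after the leading B or C
    for group, topics in TOPIC_GROUPS.items():
        if topic in topics:
            return group
    return None  # e.g. topic 99 (imputation), which is not on this site


print(topic_group("B17002"))
print(topic_group("C08301"))
```

The returned value is the piece of the data set name to look for (for example, a data set whose name contains 17_20 for a poverty table).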

The remaining 6 Varlabs files have names with familiar numeric suffixes. The file 17_20labels.txt contains the critical metadata for anyone wanting to know what data are contained in a dataset with 17_20 in its name. These are simple text files that can be quickly loaded into your browser and viewed, searched or printed. The exact details of what appears in these files have varied slightly over time, but the gist has remained the same. You get the table title and universe followed by "detail lines" describing each cell within the table. For example:

/* Universe: Population for whom poverty status is determined  */
/* *** Important Note ***: This table is NOT available for 5-year data at any level    */
    B17002i1   ="Total:"
    B17002i2   ="  Under .50"
    B17002i3   ="  .50 to .74"
    B17002i4   ="  .75 to .99"
    B17002i5   ="  1.00 to 1.24"
    B17002i6   ="  1.25 to 1.49"
    B17002i7   ="  1.50 to 1.74"
    B17002i8   ="  1.75 to 1.84"
    B17002i9   ="  1.85 to 1.99"
    B17002i10  ="  2.00 to 2.99"
    B17002i11  ="  3.00 to 3.99"
    B17002i12  ="  4.00 to 4.99"
    B17002i13  ="  5.00 and over"

/* Universe: Population for whom poverty status is determined  */
    C17002i1   ="Total:"
    C17002i2   ="  Under .50"
    C17002i3   ="  .50 to .99"
    C17002i4   ="  1.00 to 1.24"
    C17002i5   ="  1.25 to 1.49"
    C17002i6   ="  1.50 to 1.84"
    C17002i7   ="  1.85 to 1.99"
    C17002i8   ="  2.00 and over"   

Base (summary) table names are composed of a letter (B or C), a 5-digit code (the first 2 digits of which constitute the topic code) and, sometimes, an alpha suffix indicating a special race/Hispanic universe (per BaseTableAlphaSuffixes.txt, shown above). There can also be a "PR" suffix to indicate a table available only for Puerto Rico. We do not include PR tables in these metadata files.
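The naming convention just described can be captured in a regular expression. This is a sketch under the assumption that the pattern is exactly letter, five digits, optional race suffix, optional PR; the function and dictionary names are hypothetical and the suffix descriptions are copied from BaseTableAlphaSuffixes.txt.

```python
import re

RACE_SUFFIXES = {
    "A": "White alone",
    "B": "Black or African American alone",
    "C": "American Indian & Alaska Native alone",
    "D": "Asian alone",
    "E": "Native Hawaiian & Other PI alone",
    "F": "Some Other Race alone",
    "G": "Two or more races",
    "H": "White alone, not Hispanic or latino",
    "I": "Hispanic or latino",
}

# Letter (B or C), 2-digit topic, 3 more digits, optional A-I suffix, optional PR.
TABLE_ID = re.compile(r"^([BC])(\d{2})(\d{3})([A-I])?(PR)?$")


def parse_table_id(table_id):
    """Split a base table name into its parts; raise ValueError if malformed."""
    match = TABLE_ID.match(table_id.upper())
    if match is None:
        raise ValueError(f"not a base table name: {table_id!r}")
    prefix, topic, number, race, pr = match.groups()
    return {
        "prefix": prefix,
        "topic": topic,
        "number": number,
        "universe": RACE_SUFFIXES.get(race),  # None when there is no suffix
        "puerto_rico_only": pr is not None,
    }


print(parse_table_id("B17001A"))
```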

We now keep only one set of Varlabs metadata files, even though the available tables for a given vintage can vary with the single year vs. five year files. We handle this by including the special comment line:

/* *** Important Note ***: This table is NOT available for 5-year data at any level    */ 

Starting with the 2015 vintage data (and we may go back and update earlier versions), we display a similar comment to identify tables that are not available at the block group level on the 5-year files.

Here is another table entry from the 17_20labels.txt file:

/* Universe: Population for whom poverty status is determined  */
/* *** Important Note ***: This table is NOT available for 5-year data at any level    */
   /* *** This table is repeated for the following Race/Hispanic suffixes: _ABCDEFGHI *** */
    C17001i1   ="Total:"
    C17001i2   ="  Income in the past 12 months below poverty level:"
    C17001i3   ="    Male:"
    C17001i4   ="      Under 18 years"
    C17001i5   ="      18 to 64 years"
    C17001i6   ="      65 years and over"
    C17001i7   ="    Female:"
    C17001i8   ="      Under 18 years"
    C17001i9   ="      18 to 64 years"
    C17001i10  ="      65 years and over"
    C17001i11  ="    Income in the past 12 months at or above poverty level:"
    C17001i12  ="      Male:"
    C17001i13  ="        Under 18 years"
    C17001i14  ="        18 to 64 years"
    C17001i15  ="        65 years and over"
    C17001i16  ="      Female:"
    C17001i17  ="        Under 18 years"
    C17001i18  ="        18 to 64 years"
    C17001i19  ="        65 years and over"  
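Because the labels files are plain text with a regular layout, they are easy to load into a lookup structure. The sketch below assumes that every cell line has the form shown in the excerpts above (table name, the letter i, a cell number, then a quoted label) and that a Universe comment precedes each table; the function name and the returned structure are our own invention.

```python
import re

UNIVERSE = re.compile(r"/\*\s*Universe:\s*(.*?)\s*\*/")
CELL = re.compile(r'^\s*([BC]\d{5}[A-I]?)i(\d+)\s*="(.*)"\s*$')


def parse_labels(text):
    """Map each table name to its universe and its {cell number: label} dict."""
    tables = {}
    universe = None
    for line in text.splitlines():
        found = UNIVERSE.search(line)
        if found:
            universe = found.group(1)  # applies to the table that follows
            continue
        cell = CELL.match(line)
        if cell:
            table_id, number, label = cell.groups()
            entry = tables.setdefault(
                table_id, {"universe": universe, "cells": {}}
            )
            entry["cells"][int(number)] = label.strip()
    return tables


SAMPLE = '''
/* Universe: Population for whom poverty status is determined  */
    C17002i1   ="Total:"
    C17002i2   ="  Under .50"
    C17002i8   ="  2.00 and over"
'''

tables = parse_labels(SAMPLE)
print(tables["C17002"]["universe"])
print(tables["C17002"]["cells"][8])
```

Stripping the labels discards the leading spaces that show the nesting of the detail lines; keep the raw label instead if you need that hierarchy.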

Once You Know What Tables You Want

Once you know that what you need is not available in the standard profile datasets (meaning you need to access the base (summary) tables), and you have searched the Varlabs files to determine which table(s) you need, you are ready to Uexplore and do a Dexter query. This is the relatively easy part. You will need to access the basetbls subdirectory if what you want is single-year data (i.e. you don't need any geographic area with a population under 65,000). Then just select (click on) one of the ustabs data sets to invoke Dexter to access that collection of summary/base tables. When the Dexter query form is displayed, you will notice that Section III of the form looks different: instead of the select list on the right labeled Numerics (a list of numeric variables from which you choose), you will see a Tables select list containing table titles rather than variable names and labels. The application knows that this is a table-based dataset and displays the alternate selection tool to make life a little easier.

What To Expect (Coming Soon)

This Year (2016)

We plan to update this for 2017 later in the summer when the target dates are set for the vintage 2016 data. But you can use the dates from last year (shown below) as good estimates of when we might expect the data to be released this year. All the usual 2015 vintage data have been processed and are available on this site.

There will be two sets of estimates released this year, starting with the single-year data for large areas in September. The specific dates (announced by the Bureau on June 1, 2016) are:

  • Sept. 15: Single-year estimates for 2015. (These are available for access now.)

  • Dec. 8: Five-year period estimates for 2011-2015.

Reminder: There are no longer any 3-year period estimates being produced. Due to budget constraints the Bureau decided to stop creating those products starting with the 2014 vintage data.

Note that once these 5-year data tables are available you can do trend analysis by comparing them with the corresponding tables from vintage 2010 (years 2006-2010). You could already do this with the 2005-2009 vs. 2010-2014 estimates, but if you want to use 2010 geography (for tracts, block groups, PUMAs) this is the first time you will be able to compare consistent geographies. The tracts on the 2006-2010 estimates are 2010 tracts, whereas the tracts used in the 2005-2009 data are 2000 census tracts.
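The two conditions behind this advice (the periods should not share survey years, and the tract geography should match) can be checked mechanically. The sketch below is only illustrative: the function names are ours, the tract rule encodes just what is stated above (2005-2009 uses 2000 tracts, 2006-2010 and later use 2010 tracts), and it should not be trusted beyond the vintages discussed on this page.

```python
def periods_overlap(end_a, end_b, span=5):
    """True if two period estimates of the same length share any survey year."""
    return abs(end_a - end_b) < span


def tract_vintage(end_year):
    """Census year of the tract boundaries, for vintages 2009 through 2015."""
    return 2000 if end_year <= 2009 else 2010


def comparable_for_tract_trends(end_a, end_b):
    """True if two 5-year vintages are non-overlapping and share tract geography."""
    same_geography = tract_vintage(end_a) == tract_vintage(end_b)
    return same_geography and not periods_overlap(end_a, end_b)


print(comparable_for_tract_trends(2010, 2015))  # 2006-2010 vs. 2011-2015
print(comparable_for_tract_trends(2009, 2014))  # 2005-2009 vs. 2010-2014
```

The second pair fails only the geography check; for areas whose boundaries did not change (most counties, for example) the 2005-2009 vs. 2010-2014 comparison remains usable.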

ACS Data Profiles and the ACS Extract Assistant

Both the Census Bureau and the Missouri Census Data Center provide data profile reports (and corresponding data files) containing highlights of the very detailed information contained in the complete set of base tables. The Bureau profiles can be accessed via American FactFinder. Access to the MCDC's acsprofile application typically starts at the ACS Profile Menu page (this link is currently featured at the top of the MCDC's Quick Links box, which appears on most MCDC web pages). The closely related ACS Trends report can be accessed from a main menu page that also appears in the Quick Links box, or you can reach it via a link from the acsprofiles report page.


PUMS Data

The ACS also includes a public use microdata sample (PUMS) product. We keep all such datasets (regardless of year) in the acspums data directory. We currently have data for 2004 through 2014. This collection will only be of direct interest to researchers with access to and knowledge of how to use a statistical software package. For a more detailed discussion see the ACS PUMS item from our "Ten More Things to Know..." web page, linked to in the Where to Get More Information section, below.

PUMAs (Public Use Microdata Areas) Related to ACS

Users of this web site will notice that we place considerable emphasis on data summarized at the PUMA geographic level. This is because PUMAs are large enough (100,000 minimum population) that they qualify to have new single-year ACS data published each year. We can use PUMA data therefore to look at trends and maps that cover the entire state. We have created custom reports to help users understand where these PUMAs are (what counties and cities they contain or in which they are contained) together with links to pdf map files showing them (see our Geographic Reference Reports, the first 3 bullets). To learn more see the discussion of PUMAs that appears in our introductory Ten Things to Know About the American Community Survey page (cited just below).

CPR: Experimental Method For Estimating Data for Smaller Counties

The MCDC has experimented with a methodology for estimating county level ACS data using a County to PUMA Ratio methodology. If interested, try viewing Introduction to the County-PUMA Ratio Method and the related Time Promoting Using CPR Methodology pages. We welcome user feedback on these topics as we weigh the value of pursuing such approaches to enhancing the ACS data products.

Where to Get More Information

There has been and continues to be a lot written about the ACS. Here are some of our favorite resources.

  • American Community Survey Home Page at the Census Bureau. This is the official site. Comprehensive, with many links to the various components. From it (under the "About the ACS" tab) you'll find a link to the current (2013) ACS Questionnaire (pdf file).

  • American Community Survey Geography page showing which geographic levels are getting 1-year, 3-year and 5-year estimates (for recent and one future data cycle). This is where you can find out (for example) when to expect data for 113th Congressional Districts, or when we started getting data by 2010 census tracts instead of 2000 tracts. You can also find Release Notes regarding any anomalies in the data, such as the situation in 3 New York counties where they modified some tract boundaries starting with the 2011 release. This page contains a link to Geography and the American Community Survey, which is the ultimate source for the topic.

  • 2014 Data Release contains information specific to the 2014 vintage data release (to be mostly completed in December, 2015).

  • The ACS Compass Products page at the Bureau's ACS web site provides links to a series of educational materials related to the ACS. Includes handbooks, PowerPoint tutorials and other materials, which seem to be appearing almost every few weeks. The Compass Handbooks are all targeted to specific audiences and have titles such as What General Data Users Need to Know, What the Business Community Needs to Know, etc.

  • Stats Indiana ACS Page has some nice ACS data and links. For example, it contains the following link to the Cornell site for estimating MOEs of calculated values.

  • Online ACS Calculator at Cornell University provides a quick and easy way to calculate MOEs and do significance-of-difference tests by simply entering the relevant estimate and MOE values, and letting it do the calculations. The formulas used are the ones published in the Bureau's Compass guides and are known to be less than perfect. Nevertheless... To get the MOE for a percentage you enter the values for the numerator in the Value1 boxes, and those for the denominator in the Value2 boxes; then select Proportion as the Operation and read the answer in the Result box.

  • Ten Things to Know About the American Community Survey is a locally produced introductory document that focuses on things that new users will need to know in order to make use of this important data resource. (Parts are specific to the 2005 edition of the data.) Created in April, 2007.

  • Ten More Things to Know (and Do) About the American Community Survey is a locally produced document that is obviously a sequel to the previous item. This document (which is actually three separate web pages linked together) is considerably longer and more detailed than its predecessor, with lots of screenshots and specific examples of how to access the data. Created in early 2010.

  • The American Community Survey vs. the Decennial Census Long Form is an essay that considers the question "Are we better off now [6/2009] than we were a decade ago?" following the Big Trade involving giving up the decennial census long form in exchange for the ACS.

  • The ACS Section of MCDC's Product Inventory Showcase points to various data products/applications on this (MCDC) web site related to the ACS.

  • Population Bulletin on The American Community Survey from the Population Reference Bureau is an excellent in-depth review of the survey (written in September, 2005). This is a 20-page pdf document.

  • Map of Missouri by County (pdf file) showing the three population categories (<20K, 20K to < 65K and 65K+) relevant to ACS data products. Also displays the PUMA boundaries and labels. (Because sometimes you can use PUMA level data as the next best thing when county level is not available.)

  • The Docs subdirectory of the MCDC's (this) acs data directory has some useful links.
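For readers who would rather script the kind of MOE arithmetic that the online calculator listed above performs, here is a minimal sketch of the approximation formulas published in the Bureau's ACS handbooks: the MOE of a sum or difference, the MOE of a proportion, and a significance-of-difference test at the 90 percent level. The function names are ours, and we have not checked the results against the Cornell calculator. As noted above, these formulas are approximations; in particular they ignore any correlation between the estimates.

```python
import math

Z_90 = 1.645  # ACS margins of error are published at the 90 percent level


def moe_sum(*moes):
    """Approximate MOE of a sum or difference of estimates."""
    return math.sqrt(sum(m * m for m in moes))


def moe_proportion(num, num_moe, den, den_moe):
    """Approximate MOE of num/den, where num is a subset of den."""
    p = num / den
    radicand = num_moe ** 2 - (p ** 2) * den_moe ** 2
    if radicand < 0:
        # Fall back to the ratio formula when the value under the root
        # would be negative, as the handbooks instruct.
        radicand = num_moe ** 2 + (p ** 2) * den_moe ** 2
    return math.sqrt(radicand) / den


def significantly_different(est1, moe1, est2, moe2):
    """True if two estimates differ at the 90 percent confidence level."""
    se1, se2 = moe1 / Z_90, moe2 / Z_90
    z = (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)
    return abs(z) > Z_90


# Hypothetical values: 500 (+/-100) persons in poverty out of 2,000 (+/-150).
print(round(moe_proportion(500, 100, 2000, 150), 4))
print(significantly_different(100, 10, 130, 10))
```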

You can subscribe to the "ACS Alerts" mailing list so that you will be notified of the latest news about the ACS. See the ACS Alerts web page for more information. The alerts are also posted to the ACS web site.