Documentation: | Demographic Report |
Document: | Market Profile Data Product Documentation |
citation: | Social Explorer; "Easy Analytic Software, Inc. (EASI ®) – a New York based company" |
Easy Analytic Software, Inc. (EASI) - Methodology
Demographics, Related Data, and Life Stage Clusters
Thank you for your interest in Easy Analytic Software, Inc. (EASI) and The Right Site®. We have included a description of our company, and a description of our basic methodology.
The EASI Research Methodology – Our Philosophy
Our intent is to establish a proper benchmark or starting point for a data series, which ensures a reliable and reasonable source for updating. We then find and develop a logical and consistent set of information, from reliable sources, which we then use to develop procedures, models, and algorithms to update and forecast the data elements in a manner that allows for accountability and accuracy.
Definition Link
http://www.easidemographics.com/mdbhelp/mdbhelp.htm
Introduction
Easy Analytic Software, Inc. (EASI) is a New York-based independent developer and marketer of desktop and internet demographic data and software solutions that provide demographic reports with unique search and analysis tools. EASI provides targeted site analysis software and updated demographics and related data for standard and customized geographies (Block Groups, ZIP Codes, Cities, Counties, CBSAs, etc.). Included with all software is a simple-to-learn mapping tool that does street lookups, point maps, ring studies, quintile analysis, and much more. EASI has been in business since 1995 and has over 1,500 clients who use our databases, software, and online services.
Take a moment to read the Testimonials on our website! While there you can also test our software – for free. EASI offers key reports from the 2010 Census. Thousands of corporate, magazine, college, and other users go to our site for their Census demographics.
All of our estimates are based upon the official 2010 Census.
We have several versions of our software, The Right Site ®. They all have different data but the same software. The software has simple-to-interpret standard demographic reports, sales potential analysis, site analysis (three-ring reports), Trend reports (Census, current, and five-year projection), and user-defined demographic profiles (clusters). Our software also has unique features such as the EASI ® Significant Variable Report. This EASI-created report instantly shows what makes each study area special. Those results can then be used to find other similar areas anywhere in the US!
EASI provides targeted demographic data, site analysis, and general reference software that is really easy to use – we guarantee it.
At our web site www.easidemographics.com you can compare the reports and data contained in The Right Site – Executive, Professional, or Advanced – and determine which one is right for you. At our site, you can determine all the variables found in this and other versions.
EASI Master Database and The Right Site ® Methodology for Updates
The following is a general description of the methodology used by EASI to update the demographic and economic characteristics for the United States, States, Counties, ZIP Codes, Cities, Census Tracts, Block Groups, Carrier Routes, and ZIP Plus 4’s (and custom geographies which are derived from other geographies).
The purpose of this explanation is not to divulge any proprietary methods but to illustrate the efforts made on your behalf to create accurate updates. EASI statisticians and programmers have over 30 years of experience updating these types of data. By industry standards, EASI estimates would be considered of the highest quality.
Quick Annual Summary
While the released 2010 Census data (population by age, sex, race, and ethnicity, and certain limited household data) do not change, EASI has to use a five-year analysis of the American Community Survey (ACS) to estimate the remaining Census data (income, educational attainment, etc.). In some instances our Census data (4/1/2010) can change as we receive new information.
Each year we have a new ACS and a new five years’ worth to analyze over time at the Block Group level or Census Tract (the actual level of geography we use from ACS varies by data element and is based on sample size reported in each geography). These data form the basis of how we forecast these data elements forward.
Separately, the Census Bureau releases annual estimates (previous March) for age by sex by race and for household income by race. We use these data as the basis for our current-year and five-year forecasts for US control totals. Each year the Census Bureau releases county population estimates for July 1 of the previous years. We analyze these data by birth, death, and migration, and they form the basis of current and forecasted county control totals.
To get a handle on recent local change (Block Groups) we analyze annually a USPS file of postal deliveries by type. This is a ZIP+4 file which we use to assign annual changes to the Block Group level. These data relate well to households (change in mailable households compared to actual households). We also use this file to determine how ZIP Codes change each year.
a) With the current release EASI will benchmark at the Block Group and higher levels all of the details supplied with the 2010 Census (all related releases at the Block Group level). All data are now derived from BG data from the American Community Survey (ACS) and Public Use Microdata Sample (PUMS) combined with the released 2010 Census data for April 1, 2010. Cities are now based on the results of the 2010 Census (population 100 or more as of 4/1/2010).
Note: Details of 2010 Census are at the end of this methodology
b) In all previous estimates EASI racial data included Black Population, Asian Population, White Population, and Other Population. However, these are now based on different questions with the 2010 Census. The 2010 data are not compatible with the 2000 Census (multiple race categories are now possible and are part of the other group).
The groups now are:
White Population, Alone;
Black Population, Alone;
Asian Population, Alone;
American Indian and Alaska Native Population, Alone;
Other Race Population, Alone, and
Two or More Races Population.
EASI uses the 2010 Census Block Group data as our benchmark and makes adjustments for consistency for age and sex and for all household counts.
c) EASI has collected from the Census Bureau all current local (counties etc.) and national updates and estimates for all the key demographic information. All these official estimates have been analyzed and then incorporated into our estimates and projections using a variety of EASI models. However, starting with the 2010 Census benchmark, the largest impact on our estimates will now come from the American Community Survey and Public Use Microdata Sample (details later).
d) EASI has summarized from the United States Postal Service (USPS) mailable Households at a County, ZIP Code, Census Tract, and Block Group level. These data have been used as the primary input to estimate local current change within a small area such as a Block Group. Mailable households are not the same as Census Households but are used to indicate recent annual change in household formations. These changes are combined with an EASI proprietary model for updating and forecasting at the Block Groups.
e) The Mailable Household data match starts by identifying, for every ZIP Plus 4 (ZIP+4), which Block Group it belongs to. EASI develops a split file and a plurality file of these matches using the latest TIGER (Topologically Integrated Geographic Encoding and Referencing system) file to determine the primary Block Group for each ZIP+4. One of the key goals is to identify all correct current ZIP Codes and ZIP+4’s and then assign their Mailable Households to the correct Block Group. EASI has also reconfigured the 2000 Block Groups into the 2010 Block Group configuration to estimate the 2000 population for comparative purposes. An analysis of this decade change is also included in our model.
f) EASI has also analyzed the 2010 Census Block files in order to create a population Centroid for each Block Group. The results of that analysis are used for all ring study analysis.
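A population centroid of the kind described in f) can be sketched as a population-weighted average of Block coordinates. This is only an illustration: the function name, the tuple layout, and the sample coordinates are assumptions, and a simple weighted mean of latitude and longitude is a stand-in for whatever calculation EASI actually uses.

```python
def population_centroid(blocks):
    """Return the population-weighted (lat, lon) for one Block Group.

    `blocks` is a list of (latitude, longitude, population) tuples, one per
    Census Block. If nobody lives in the Block Group, fall back to the
    unweighted mean of the Block coordinates.
    """
    total_pop = sum(pop for _, _, pop in blocks)
    if total_pop == 0:
        n = len(blocks)
        return (sum(lat for lat, _, _ in blocks) / n,
                sum(lon for _, lon, _ in blocks) / n)
    lat = sum(lat * pop for lat, _, pop in blocks) / total_pop
    lon = sum(lon * pop for _, lon, pop in blocks) / total_pop
    return lat, lon


# Hypothetical Blocks: (latitude, longitude, population)
blocks = [(40.0, -74.0, 100), (40.2, -74.4, 300)]
centroid = population_centroid(blocks)  # pulled toward the more populous Block
```

A ring study would then treat a Block Group as inside the ring when this centroid falls within the ring radius.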
g) Specific other sources include:
1. Bureau of the Census – 2010 Census PL 94-171; American Community Survey (ACS) and Public Use Microdata Samples (PUMS). Other related sources are: Annual Demographic Survey, Current Population Reports (P20, P25, P60), and numerous special Census reports.
2. ZIP and County Business Patterns (US Department of Commerce – Economics and Statistics Administration- Bureau of the Census.)
3. US Department of Justice – Federal Bureau of Investigation.
4. National Center for Education Statistics – Common Core of Data (CCD)
5. National Oceanic and Atmospheric Administration – National Environmental Satellite, Data and Information Service - National Climatic Data Center.
6. United States Department of the Interior – Geological Survey – Office of Earthquakes, Volcanoes, and Engineering.
7. Bureau of Labor Statistics – Department of Labor.
8. Geography Changes – EASI has captured more ZIP Code demographics this year by including Point ZIPs such as PO Boxes. In the case of these areas, the physical area served by these demographics will be the area defined by the Block Groups in which these Point ZIPs are located.
9. Geography Changes (2010 Census) – EASI has added a CITYTYPE identifier field to help users determine how the EASI City relates to Census Designations. All City Records except CITYTYPE C3 and T1 are Census Places. The seven C3 records are consolidated cities such as Indianapolis and Nashville. The remaining T1 Records are County Subdivisions. Please note that where both Places and County Subdivisions exist, some of the areas may be coextensive (not mutually exclusive).
2. Data Preparation
The steps in creating the ZIP Plus 4 (ZIP+4) and Block Group mailable Households include:
a) Start with a USPS ZIP+4 file for end of December (year prior to the estimating year) which includes all valid residential ZIP+4’s in the country whether they are residential mail or not.
b) For each ZIP+4, we add Census Block Groups based upon a current TIGER file distance formula. Over 20 million records (of approximately 38 million) are processed by this direct match.
c) For each remaining ZIP+4, we match against our internal geocode file (latitude and longitude). This file is based on running through address matching/geocoding software. Approximately 18% of total are matched to their Block Group this way.
d) For each remaining ZIP+4 that cannot be geocoded by b) or c), we use a calculated carrier route or Block Group centroid. We weight the geographies to a larger area and calculate a latitude and longitude. We then determine which is the closest (by distance) Block Group. This approach is used for approximately 5% of total.
e) If a ZIP+4 is still unassigned, we use the nearest neighbor ZIP+4. Approximately 2% of the total are done through this approach (recent, six-month-old ZIP+4s are often in this category).
f) Block Group assignments are from the most recent Census TIGER file. TIGER errors, where identified (such as wrong FIPS Codes), have been corrected.
g) Each ZIP Plus 4 is assigned data based upon the data of the Block Group that it has been assigned to. (Note: There are no official Census Bureau data for ZIP+4.)
h) These mailable household data analyses are for residential ZIP+4’s (no business-exclusive ZIP+4’s are included).
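The fallback order in steps b) through e) can be sketched as a cascade that tries each source in turn. Everything named here is hypothetical: the dictionaries stand in for the TIGER match, the internal geocode file, and the calculated carrier-route centroids. The sketch matches a point to the nearest Block Group centroid and defines "nearest neighbor ZIP+4" as the numerically closest ZIP+4 code, both of which are assumptions rather than EASI's actual rules.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_block_group(lat, lon, bg_centroids):
    """Return the Block Group whose centroid is closest to (lat, lon)."""
    return min(bg_centroids, key=lambda bg: haversine_km(lat, lon, *bg_centroids[bg]))


def assign_block_groups(zip4_ids, tiger_match, geocodes, calc_centroids, bg_centroids):
    """Assign each ZIP+4 to a Block Group, trying sources in order b) to e)."""
    assigned, method, leftover = {}, {}, []
    for z in zip4_ids:
        if z in tiger_match:          # b) direct TIGER match
            assigned[z], method[z] = tiger_match[z], "tiger"
        elif z in geocodes:           # c) internal geocode file (lat, lon)
            assigned[z] = nearest_block_group(*geocodes[z], bg_centroids)
            method[z] = "geocode"
        elif z in calc_centroids:     # d) calculated carrier-route centroid
            assigned[z] = nearest_block_group(*calc_centroids[z], bg_centroids)
            method[z] = "calculated_centroid"
        else:
            leftover.append(z)
    donors = dict(assigned)           # only records placed by b) to d) act as donors
    for z in leftover:                # e) nearest-neighbor ZIP+4 (assumed: closest code)
        if not donors:
            continue                  # nothing to borrow from; leave unassigned
        key = int(z.replace("-", ""))
        donor = min(donors, key=lambda d: abs(int(d.replace("-", "")) - key))
        assigned[z], method[z] = donors[donor], "nearest_zip4"
    return assigned, method


# Hypothetical inputs
bg_centroids = {"BG1": (40.0, -74.0), "BG2": (41.0, -75.0)}
zip4_ids = ["10001-0001", "10001-0002", "10001-0003", "10001-0004"]
assigned, method = assign_block_groups(
    zip4_ids,
    tiger_match={"10001-0001": "BG1"},
    geocodes={"10001-0002": (40.9, -74.9)},
    calc_centroids={"10001-0003": (40.1, -74.1)},
    bg_centroids=bg_centroids,
)
```

Recording the method alongside each assignment makes it possible to report what share of records each step handled, as the percentages in the steps above do.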
EASI has developed a series of models which use the relationship between the count of the current mailable households at the Block Group (BG) level to develop estimates of the change in the household size relationships at the BG compared to the county and to the ZIP Code. In addition, EASI analyzes the change in relationships between these mailable households over time and compares them to the county and to the ZIP Code households using a proprietary formula. Care is taken in this approach since there can be ZIP+4 definitional changes. The analysis relates the current estimate of mailable households to the number of mailable households at the time of the 2010 Census (4/1/2010) and as time progresses.
One key component of the analysis is a proximity site review of all ZIP+4’s based upon their Block Group assignment (217,740 Block Groups in Census 2010 versus 208,790 Block Groups in Census 2000). EASI analysis includes the many new Block Groups that didn’t exist and the many Block Groups with the same codes but different shapes. This analysis prepares our input data before use in EASI demographic models.
EASI uses newly released Census county estimate information, which is analyzed and compared to prior releases to develop current and forecasted county control totals through an analysis of population component changes (births, deaths, migration, etc.). In addition, EASI uses information from the current American Community Survey and Census Population Estimates Program (PEP) to generate annual adjustments for population changes by Race and Ethnicity, Gender, and Age.
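The component analysis mentioned above rests on the standard demographic balancing equation: next year's population equals this year's population plus births, minus deaths, plus net migration. A minimal sketch follows, assuming for illustration that the annual components stay constant over the forecast horizon. That constancy, the function name, and the sample figures are assumptions, not EASI's model.

```python
def project_county(base_pop, births, deaths, net_migration, years):
    """Project a county population forward one year at a time.

    Applies population(t+1) = population(t) + births - deaths + net_migration,
    holding the annual components constant (an illustrative simplification).
    Returns the list of populations, starting with the base year.
    """
    pops = [base_pop]
    for _ in range(years):
        pops.append(pops[-1] + births - deaths + net_migration)
    return pops


# Hypothetical county: 10,000 people, 120 births, 90 deaths, 50 net in-migrants per year
forecast = project_county(10_000, births=120, deaths=90, net_migration=50, years=5)
```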
Annually, EASI also incorporates relevant national and state data as control totals. This is done for a variety of demographic factors. EASI derives these from analysis of national data, over time, from the Annual Demographic Survey, the Current Population Survey, American Community Survey, Public Use Microdata Samples, and the Annual Housing Survey. Other inputs come from a variety of sources at the Census Bureau web site (www.census.gov).
ZIP Code results are independently compared to the USPS current ZIP Code file of residential deliveries including residential post office boxes. Additional updating sources include: USPS AMS files and Postal bulletins (the ZIP Alert); these record any annual changes that take place to ZIP codes including name changes, delivery or branch changes as they become official. Other sources include: U.S. Postal Service City-State File (monthly) and Delivery Statistics File. These CD-ROMs contain the main inventory of ZIP Codes and the post office and other names associated with them. Each year EASI conducts a complete review of these files to maintain a current ZIP Code roster. EASI inventories the old ZIP Codes as well. Note: Starting with the 2011 release EASI is updating Post Office Boxes to include their demographics.
Updates to the current year and five-year projections are first done at the United States level and, for key variables, at the county level as well. Block Group (BG) level estimates are all controlled to the county control totals. That is, the Block Group data will add to the separately generated county data for all data elements. In a similar manner, other geographies are summarized from the Block Group level. However, parts of BGs are added to get ZIP Codes and to get cities.
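Controlling Block Group estimates to a county total can be illustrated with simple proportional scaling: each preliminary BG value is multiplied by the ratio of the county control total to the sum of the preliminary values. This is a sketch of the general idea only. Proportional scaling, the function name, and the figures are assumptions, and EASI's actual controlling procedure is not described in this document.

```python
def control_to_total(bg_estimates, county_total):
    """Scale preliminary Block Group estimates so they sum to the county control total."""
    raw_sum = sum(bg_estimates.values())
    if raw_sum == 0:
        # Nothing to scale; return zeros rather than divide by zero.
        return dict.fromkeys(bg_estimates, 0.0)
    factor = county_total / raw_sum
    return {bg: value * factor for bg, value in bg_estimates.items()}


# Hypothetical preliminary household estimates for a two-BG county
raw = {"BG1": 480.0, "BG2": 720.0}                 # sums to 1,200
controlled = control_to_total(raw, county_total=1260.0)
```

After scaling, the Block Group values add to the county figure while keeping their relative sizes.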
4. Consistency – year-to-year changes
Each year EASI uses all available sources to maintain the highest quality of our estimates. Sometimes the new information will make year-to-year changes less meaningful; e.g., a current ZIP Code may have a different definition of BG’s because of postal changes in the last year. Within an EASI calendar year, the changes from the current estimate to our five-year forecast are consistent, but changes from last year’s estimates are not necessarily so. Within a release, EASI estimates all use the same geography; that is, the ZIP Code estimates for April 1, 2010, for 1/1 of the current year, and for the five-year forecast are all based on the same geographic definition.
Starting in 2007 the Census Bureau began releasing the American Community Survey (ACS - www.census.gov) to supplement its Census 2010 data. EASI has incorporated all of these estimates into our current updates and forecasts. These ACS adjusted estimates will be carried forward and will annually use the latest released data as part of the updating process. Note: Use of these new ACS estimates offers an immediate improvement and a vast improvement over time. If you have questions or concerns about the impact of ACS, please call EASI at 800-HOW-EASI (469-3274) for a thorough and complete discussion.
Users must use caution when comparing data from prior censuses or even releases of new Census data. As mentioned before, the 2010 Census has a new Race question (White Population, Alone; Black Population, Alone; Asian Population, Alone; American Indian and Alaska Native Population, Alone; Other Race Population, Alone; and Two or More Races Population).
Another factor in consistency is that some data sources release information annually, while others may release data elements only once every two or even three years.
Starting with 2015 we’ve added BLS data to our methodology. Prior to that year we relied on American Community Survey (ACS) data for changes in unemployment. However, since the ACS is a rolling five-year average, the data for 2013 were based approximately on mid-year 2010 data. The last few years saw a period of rapid change in employment which wasn’t reflected in our estimates. Using current BLS numbers for the employment/unemployment estimates makes our estimates more consistent with outside sources.
Occasionally a post-Census estimate can be subject to revision for several reasons. In one instance, a data series may be deemed more important by Congress and, as a result, its sample size can be expanded to allow for more detailed results. Another change could be that the sample is framed against new data such as the 2010 Census. EASI, with decades of experience, analyzes all of this information and incorporates the results into our estimates.
ZIP Code Details - As mentioned above, ZIP Codes, even if they seem to be the same (same 5 digits), are especially difficult for consistency from year to year (they are always consistent within the EASI data and software). Since each ZIP Code area may change from year to year, EASI spends considerable time and effort to develop new ZIP Code data for each and every year. That is, EASI assigns a portion of each Block Group to a ZIP Code based on the latest information for each year (2010, current, and five-year forecast). Note: Annually EASI creates a proprietary ZIP to Block Group (partial) analysis, and we also allocate all land area to create each ZIP Code.
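Building a ZIP Code estimate from portions of Block Groups amounts to a weighted sum: each Block Group contributes its value multiplied by the share of that Block Group assigned to the ZIP Code. The sketch below assumes the shares are already known. The share values and names are hypothetical, and how EASI derives the shares is proprietary.

```python
def zip_estimate(bg_values, zip_shares):
    """Sum Block Group values weighted by the share of each BG assigned to the ZIP Code.

    `zip_shares` maps Block Group id -> fraction (0.0 to 1.0) of that BG in the ZIP.
    """
    return sum(bg_values[bg] * share for bg, share in zip_shares.items())


# Hypothetical: the ZIP Code contains all of BG1 and one quarter of BG2
bg_households = {"BG1": 400, "BG2": 600}
shares = {"BG1": 1.0, "BG2": 0.25}
zip_households = zip_estimate(bg_households, shares)  # 400 + 150
```

Applying the same set of shares to the 2010, current-year, and five-year values keeps the three ZIP Code estimates on one geographic definition.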
Income
There are many different definitions of income that are available for analysis. With the release of the 2010 Census EASI has been using the ACS 5 year data to develop a Census Income estimate (for the year 2009) as our starting point. These estimates are then modeled using the P60 Money Income in the United States (Current Population Reports – Consumer Income) as well as other data. EASI income models are based on race and by family characteristics to obtain a current estimate. All Income estimates use the 2010 Census household definition of income as a benchmark.
EASI income estimates are controlled to analysis from the Money income data after analyzing the differences in that sample compared to the actual ACS data.
EASI estimates inflation (current dollars) in all of our estimates and forecasts. EASI also maintains Income distributions based on gross income (includes all taxes).
EASI Crime Models
EASI analyzed actual county level FBI data for the various types of crimes as follows:
A. Split the counties with reported crime data into two random groups of 200 counties each.
B. We then developed a series of regression models which could then be applied to EASI updated demographics.
C. Finally we tested the models on the 200 counties that were not used to develop the regressions.
D. Statistical analysis showed that the models did an excellent job of predicting crime rates at the county level.
EASI then used this model to develop a similar methodology for estimating crime at the Block Group and other levels of geography.
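The split-and-test procedure in steps A to D resembles a standard holdout validation, which can be sketched as follows. The data here are synthetic, the single-predictor least-squares line is a stand-in for EASI's regression models, and scoring on the group not used for fitting is the usual reading of a random split. None of the numbers come from FBI data.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys, slope, intercept):
    """Share of variance in ys explained by the fitted line."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic counties: (demographic predictor, crime rate), made up for illustration
counties = []
for _ in range(400):
    x = rng.uniform(0, 100)
    counties.append((x, 5.0 + 0.3 * x + rng.gauss(0, 2.0)))

# A. Split into two random groups of 200 counties each
rng.shuffle(counties)
train, test = counties[:200], counties[200:]

# B. Develop the regression on one group
slope, intercept = fit_line(*zip(*train))

# C./D. Score the fitted model on the other group
holdout_r2 = r_squared(*zip(*test), slope, intercept)
```

A high holdout score suggests the model generalizes to counties it has not seen, which is the property needed before applying it to other geographies.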
Consumer Expenditure Survey (CEX)
The results of the CEX are analyzed annually by EASI and then combined with EASI estimates at the Block Group level. The Bureau of Labor Statistics and the Bureau of the Census conduct the CEX. There are two parts to the survey. The first part is a diary, which is completed by respondents for two consecutive one-week periods. The second part is an interview survey, which is conducted quarterly (every three months) for five quarters. The interview survey includes about 95% of all expenditures and includes large expenditures such as property, automobiles, major appliances, rent, utility payments, insurance premiums, and many others.
EASI annually models these results of about 600+ categories of expenditures against our updated demographic estimates. EASI’s models use our own BG demographic estimates to update these potential sales.
An example:
EASI models the age of respondent, income of respondent, and tenure (own home versus rent). Then for each demographic characteristic we have an average expenditure for the previous calendar year. For example, a respondent earning $50,000 to $75,000 spent $210 (for example only), and we might then see that a respondent with income of $35,000 to $50,000 spent $150 (for example only).
We take all the values for the demographics and then develop a model for this CEX characteristic that combines the factors to get one BG level estimate.
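To make the example concrete, the sketch below turns average expenditures by income bracket into one Block Group figure by multiplying each bracket's average by the number of households in that bracket. It uses only income, whereas the text combines age, income, and tenure. The household counts are hypothetical, the dollar values are the "for example only" figures above, and how EASI combines the factors is not specified here.

```python
# Average annual spend per respondent, by income bracket ("for example only" values)
avg_spend_by_income = {
    "$35,000 to $50,000": 150.0,
    "$50,000 to $75,000": 210.0,
}

# Hypothetical household counts for one Block Group
bg_households_by_income = {
    "$35,000 to $50,000": 120,
    "$50,000 to $75,000": 80,
}


def expected_expenditure(households, avg_spend):
    """Sum households in each bracket times that bracket's average spend."""
    return sum(count * avg_spend[bracket] for bracket, count in households.items())


bg_total = expected_expenditure(bg_households_by_income, avg_spend_by_income)
# 120 * 150 + 80 * 210 = 34,800 dollars of potential sales for this category
```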
EASI is estimating CEX categories by race and ethnicity.
Standard Occupational Classification Codes (SOC)
EASI has developed estimates for both major and minor occupations (over 800). These are based on estimates of occupations within NAICS employment groups (4 digits). EASI adds up each SOC category estimate within a 4 digit NAICS code (employment) and then adds up these parts to get a total for each level of geography.
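The adding-up step can be sketched as follows. It assumes, purely for illustration, that each occupation's estimate within a 4-digit NAICS group is that group's employment times an occupation share. The share values and employment counts are hypothetical, and the document does not say how EASI produces the within-NAICS estimates.

```python
def occupation_totals(naics_employment, soc_share_by_naics):
    """Add up occupation estimates across 4-digit NAICS groups for one geography."""
    totals = {}
    for naics, employment in naics_employment.items():
        for soc, share in soc_share_by_naics.get(naics, {}).items():
            # Occupation estimate within this NAICS group, then accumulate
            totals[soc] = totals.get(soc, 0.0) + employment * share
    return totals


# Hypothetical employment by 4-digit NAICS code for one Block Group
naics_employment = {"4451": 200, "7225": 300}

# Hypothetical share of each NAICS group's employment in each SOC occupation
soc_share_by_naics = {
    "4451": {"41-2011": 0.40, "11-1021": 0.05},
    "7225": {"35-3031": 0.50, "11-1021": 0.04},
}

totals = occupation_totals(naics_employment, soc_share_by_naics)
```

An occupation that appears in several industries, such as 11-1021 here, receives the sum of its parts.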
A. ZIP and County Business Patterns (US Department of Commerce - Economics and Statistics Administration- Bureau of the Census.) County Business Patterns is an annual series that provides subnational economic data by industry. This series includes the number of establishments, employment during the week of March 12, first quarter payroll, and annual payroll. This data is useful for studying the economic activity of small areas; analyzing economic changes over time; and as a benchmark for other statistical series, surveys, and databases between economic censuses. Businesses use the data for analyzing market potential, measuring the effectiveness of sales and advertising programs, setting sales quotas, and developing budgets. Government agencies use the data for administration and planning.
ZIP Code Business Patterns data are available shortly after the release of County Business Patterns. They provide the number of establishments by employment-size class by detailed industry in the U.S. Here is a link for the 2018 ZIP Business Patterns and County Business Patterns: https://www.census.gov/data/datasets/2018/econ/cbp/2018-cbp.html .
For 2017, EASI tested a new allocation method using a Carrier Route model instead of the previous Block Group model. This is a more detailed method of allocating businesses from ZIP Codes (the geography on which the Census business data are based). This new approach should yield similar results for ring studies, but it does result in a change in business counts for Block Groups, specifically the number of “zero business” counts. This method results in a better presentation of where businesses are actually located.
B. Occupational Employment Statistics - The Occupational Employment Statistics (OES) program produces employment and wage estimates for over 800 occupations. These are estimates of the number of people employed in certain occupations, and estimates of the wages paid to them. Self-employed persons are not included in the estimates. http://www.bls.gov/oes/
Note: See NAICS Business Data further down for that review.
Retail Sales and Store Groups, Minor Stores and Major Merchandise Lines
EASI’s Retail Sales Estimates include Food Service: Total Retail Sales comprises the standard 12 major store groups plus Food Service. The estimates also cover 55+ Minor Stores and 45 Major Merchandise Lines. All data are based on an extensive review of County and ZIP Code Retail Trade data for 2012. EASI created a file of benchmark data from the released Census data which is used for our annual update.
Each year, EASI creates a new consistent file of benchmark and updated data for 2002, current, and a five-year forecast. EASI re-benchmarks estimates for each update to a new set of Block Group estimates for all retail categories based on new information so our data over time are consistent. These estimates are based on our current analysis of the latest NAICS employment data for each retail store and food service. Note: EASI resolves any inconsistencies between sources as part of this annual process.
The 13 store groups that comprise Total Retail Sales are:
1. Motor Vehicle and Parts Dealers
2. Furniture and Home Furnishings Stores
3. Electronics and Appliance Stores
4. Building Material and Garden Equipment and Supplies Dealers
5. Food and Beverage Stores
6. Health and Personal Care Stores
7. Gasoline Stations
8. Clothing and Clothing Accessories Stores
9. Sporting Goods, Hobby, Book, and Music Stores
10. General Merchandise Stores
11. Miscellaneous Store Retailers
12. Non-store Retailers
13. Food Services
Call EASI for Minor stores and Major Merchandise Line information.
NAICS Business Data
ZIP Business Patterns - Each year EASI models the Business Counts and Employment from the ZIP Code level to the Block Group level. EASI has identified 2 kinds of ZIP codes.
1. Business to Business only ZIP codes – EASI assigns all the business counts to the Block Group that is the shortest distance from the Centroid of the ZIP Code.
2. All other ZIP codes – EASI assigns portions of the ZIP Code business data to each of the Block Groups that comprise the ZIP Code (this includes portions of Block Groups). We do not know whether businesses are actually present in a given Block Group, but when added up the allocations reproduce the ZIP Code business data.
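The second case can be sketched as a proportional split that preserves the ZIP Code total. The weights below are hypothetical and could stand for any measure of how much of the ZIP Code falls in each Block Group. The document does not state which measure EASI uses, so the weighting is an assumption.

```python
def allocate_zip_businesses(zip_count, bg_weights):
    """Split a ZIP Code business count across its Block Groups in proportion to weights.

    Returns fractional counts so that the parts add back to the ZIP Code total.
    """
    total_weight = sum(bg_weights.values())
    return {bg: zip_count * w / total_weight for bg, w in bg_weights.items()}


# Hypothetical ZIP Code with 50 establishments spread over three Block Groups
allocation = allocate_zip_businesses(50, {"BG1": 3, "BG2": 1, "BG3": 1})
```

Because the allocation is modeled, a Block Group can receive a fractional count even if no business is physically located there, which is the caveat noted in item 2.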
Note: ZIP Codes, even if they seem to be the same (same five digits), are especially difficult for consistency from year to year (they are always consistent within the EASI data and software). Since each ZIP Code area may change from year to year, EASI spends considerable time and effort to develop new ZIP Code data for each and every year. That is, EASI assigns a portion of each Block Group to a ZIP Code based on the latest information for each year (2010, current, and five-year forecast). Annually EASI creates a proprietary ZIP to Block Group (partial) analysis, and we also allocate all land area to create each ZIP Code.
Benchmark Methodology and Assumptions
These retail data are benchmarked at the county level from the 2012 Census. Then EASI develops a ZIP code version of this file. EASI models these actual store locations at the Block Group level using a business employment relationship developed from the latest ZIP Business Patterns. This is done in order to allow the retail sales estimates to be used as part of standard database summaries. Note: EASI does not know the actual locations of stores at the Block Group. Other geographies are estimated by adding up the Block Group estimates.
The updates are modeled against estimated changes based upon the ZIP Business Patterns. Therefore, the sum of the BG retail sales estimates within a ZIP Code is consistent with the ZIP Code Business employment data. Any inconsistencies between sources are reviewed and made consistent with the most current data from ZIP Business Patterns.
EASI models the retail trade data to a Block Group based on a proximity model. The model assigns exclusive Business or Retail ZIP Codes to the closest Block Group. For example, from ZIP Business Patterns EASI can identify point business locations and the retail configuration within each.
EASI ® Key Demographic Profiles (Nickname = Complete Name)
EASI has developed a series of independent profiles for each level of geography. The purpose is to give a single picture or image of the most significant segment found in this particular geography. EASI developed a series of independent demographic and business variables that illustrate a wide range of lifestyles and behavior-related variables.
The individual Profiles are used to estimate the Dominant Profile (Cluster) for each standard geography. All profiles are calculated using EASI ® software. Profiles are based upon the relative rankings of each of the components of the profile compared to the national average. These profiles are all independent of each other.
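One way to picture a Dominant Profile is to score each profile as an index against the national average (100 means equal to the nation) and pick the highest. This is a loose illustration only. The index formula, the sample rates, and the choice of "highest index wins" are assumptions, since the text says only that profiles are based on relative rankings compared to the national average.

```python
def dominant_profile(local, national, excluded=frozenset()):
    """Score each profile as 100 * local / national and return the top one.

    Profiles listed in `excluded` are scored out of the running, for example
    ones based on very small sample sizes.
    """
    scores = {
        name: 100.0 * local[name] / national[name]
        for name in local
        if name not in excluded and national[name] > 0
    }
    return max(scores, key=scores.get), scores


# Hypothetical rates for one Block Group and for the nation
local = {"RETIRED": 0.30, "WORK_HOME": 0.06, "RICH_FAM": 0.10}
national = {"RETIRED": 0.20, "WORK_HOME": 0.05, "RICH_FAM": 0.02}

top, scores = dominant_profile(local, national, excluded={"RICH_FAM"})
```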
EASI Profiles
1 | Above Average Education | AB_AV_EDU |
2 | Apartments (20 or more units) | APT20 |
3 | Available Renting Units | RENTAL |
4 | Pre-School | PRESCHL |
5 | Below Average Education | BEL_EDU |
6 | Blue Collar Employment | BLUE_EMPL |
7 | Born in America | BORN_USA |
8 | Expensive Homes | EXP_HOMES |
9 | Few Teens | NO_TEENS |
10 | House for Sale | FOR_SALE |
11 | In the Armed Forces | ARMFORCE |
12 | Large Families | LARGE_FAM |
13 | Long Time Residents | NO_MOVE |
14 | Lots of Cars | MANY_CARS |
15 | Median Age | MED_AGE |
16 | Median Income | MED_INC |
17 | No Cars | NO_CAR |
18 | Not in Labor Force | NO_LABFOR |
19 | Old and Rich Households | RICH_OLD |
20 | Old Homes | OLD_HOMES |
21 | New Homes | NEW_HOMES |
22 | Recent Movers | RECENT_MOV |
23 | Retired | RETIRED |
24 | Service Employment | SERV_EMPL |
25 | Subway or Bus to Work | SUB_BUS |
26 | Trailer Park City | TRAILER |
27 | Unattached | UNATTACH |
28 | Unemployed | UNEMPL |
29 | Very Asian | ASIAN_LANG |
30 | Very Rich Asians | RICH_ASIAN |
31 | Very Rich Blacks | RICH_BLK |
32 | Very Rich Families | RICH_FAM |
33 | Very Rich Hispanics | RICH_HISP |
34 | Very Rich Households | VERY_RICH |
35 | Very Rich Non Families | RICH_NFAM |
36 | Very Rich Whites | RICH_WHT |
37 | Very Spanish | SPAN_LANG |
38 | Work at Home | WORK_HOME |
39 | Young and Rich Households | RICH_YOUNG |
Note: Dominant Profiles are selected from these except for items 30 to 36, which can be based on very small sample sizes.
EASI Sales and Other Potentials
1 | Amusement Index | AMUS_INDX |
2 | Bargain Seekers Market | BARGINS |
3 | Culture Index | CULT_INDX |
4 | Education Index | EDU_INDX |
5 | Higher Priced Product Market | EXP_PROD |
6 | Luxury Priced Product Market | LUX_PROD |
7 | Medical Index | MEDI_INDX |
8 | Mortality Index (All Causes) | MORT_INDX |
9 | Religion Index | RELIG_INDX |
10 | Restaurant Index | REST_INDX |
Potential Analyses – What are they? Why the need?
Demographics have many applications in sales, marketing and advertising. “Potential Analyses” are unique and powerful evaluation tools based upon the demographics of the residents of a target area and a special type of data called Potential Variables. EASI’s extensive data library includes a variety of Potential Variables including Consumer Expenditures, Health Care Database, and the American Time Use Survey.
A potential analysis creates an estimate that can be compared directly to what you would normally expect to find in an area – where all things (demographics) are equal. For example, a metropolitan area may comprise 1% of the population in the US. Many sales, marketing, and advertising expenses are based on this value. But suppose that the key market population target of the user is actually 1.2% in this market. The aim of an EASI Potential is to estimate what that key market potential is (expressed as a US %). So, if advertising costs are related to total population for the metro then this market has a 20% bump automatically in the number of users that might hear the ad compared to the normal population (and the cost). This is a desirable situation for the advertiser.
The chart below illustrates how to interpret the relationship between Potential and Actual. Areas with High Actual counts and Low Potential are top performers. Areas with Low Actual and High Potential are problem areas.
Actual is a real count or figure derived without using any estimation techniques. The best example is a company’s sales in a known geography (sales territory) for a specific time period.
Potential is a percentage of the US based on a model, for example an EASI demographic regression-type model. Potential can also be based on a similar statistical procedure that uses weights to estimate what sales would be in an area based on who lives there, their ages, their incomes, the types of businesses, and other relevant demographics.
High or low is based on an index number that is created by dividing the Actual by the Potential for all of the areas you intend to evaluate.
Remember: Potential variables are not actual variables; they are derived estimates. Their purpose is to help you evaluate Actual data, not to replace Actual data.
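One way to build the Actual-over-Potential index described above is sketched here. Converting both measures to shares of the evaluated areas before dividing is an assumption made so that a value of 100 means "performing as expected"; the text only says that Actual is divided by Potential. The territory names and figures are hypothetical.

```python
def performance_index(actual: dict, potential_pct: dict) -> dict:
    """Index each area as (share of total Actual) / (share of total Potential) * 100."""
    total_actual = sum(actual.values())
    total_potential = sum(potential_pct[area] for area in actual)
    index = {}
    for area, value in actual.items():
        actual_share = value / total_actual
        potential_share = potential_pct[area] / total_potential
        index[area] = 100.0 * actual_share / potential_share
    return index


# Hypothetical sales (Actual) and model-based Potential (US %) for three territories.
sales = {"A": 500.0, "B": 300.0, "C": 200.0}
potential = {"A": 0.4, "B": 0.3, "C": 0.3}
idx = performance_index(sales, potential)
for area in sorted(idx, key=idx.get, reverse=True):
    print(area, round(idx[area], 1))
```

Under this convention, an index above 100 marks an area whose Actual exceeds its Potential (a top performer), and an index below 100 marks a problem area.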
5. Accuracy
As with all estimates, the higher the level of data (national is the highest), the more accurate the estimate. Our data follow standard demographic techniques, all developed with over 35 years of experience in this industry, and these techniques are considered highly accurate.
EASI data has also been “field tested”. That is, portions of our updated data are available at our web site and have been used by hundreds of thousands of users. These users raise questions about our updates, which we investigate. This input helps us review and check results and makes our estimates better.
Here are some common questions:
Why are the Post Office mailable households different than EASI’s?
a) One reason for the difference between the Census count of ZIP households and the Post Office count of mailable households is that the two are defined differently. There can be two mailable households in a residence but only one Census household. The Census will call it a single household if there is a relationship, and the Post Office does not keep track of relationships.
How close is EASI updated data to other sources?
b) EASI has made an extensive effort to obtain all relevant information and to incorporate it in a logical statistical manner. Other companies that use similar sources and a similar statistical approach should get similar results. One method of comparison is a circle or ring study. An analysis of comparable ring studies has shown a current population difference of less than 2%. In denser population areas the results of the ring analysis are within .005 percent. With the release of the 2010 Census, an analysis showed that EASI ZIP Code estimates were within .005 percent in over 98% of the cases.
EASI has made numerous checks for internal and external consistency in all our estimates. Three types of checks are rigorously reviewed: Census internal consistency, controlling updates to the definitions of estimates, and correcting for, or preventing, rounding errors, especially in small geographies.
The ACS validity check is an analysis and comparison of the results of ACS estimates at the Block Group level. Due to sample sizes and Census procedures for disseminating the ACS results, ACS results are frequently inconsistent. These results are analyzed, and EASI has developed a series of algorithms to adjust these estimates to make them consistent. (Examples are mostly in small BGs where there might be a single household, by total or by race, found in ACS but no population in ACS. Or a value for a single cell in a detailed age-by-race distribution won’t re-add to the SF1 distribution for the same results.) EASI strives to correct all of these problems with the Census data and remove them as issues that could affect EASI updates.
EASI updating validity checks involve controlling all Census 2010 distributions and updates that require a controlling definition. EASI then makes the same checks on the EASI updates in order to prevent inconsistencies from coming into the updates.
The next issue is controlling each distribution to the correct sum. A basic example is that population by age and sex must add to total population. The same applies to the male population age 0 to 5 for White, Black, Asian, American Indian and Alaska Native, Other Race, and Two or More Races: the sum of these population estimates by race must add to the total male population age 0 to 5. Another example is that educational attainment is defined for the population 25+. Note: Each distribution has a requirement like this. Many must add to population 16+ or population 3+, or households, or population, etc. Other key ones are that Hispanic must be less than or equal to Total Population less White Non-Hispanic Population. Also key is that White Non-Hispanic Population must be less than or equal to White Population. These conditions for updates apply across all estimates, including individual age groups (0 to 5, 6 to 11, etc.) and individual income groups ($0 to $15k, $15k to $25k, etc.).
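The control-total rules above can be expressed as simple checks on a Block Group record. The sketch below is illustrative only: the record layout and field names (`pop_by_age_sex`, `male_0_5_by_race`, and so on) are hypothetical and do not reflect EASI's file formats or algorithms.

```python
def find_inconsistencies(bg: dict) -> list:
    """Return a list of control-total violations for one Block Group record."""
    problems = []
    # A distribution must add to its controlling total.
    if sum(bg["pop_by_age_sex"].values()) != bg["total_pop"]:
        problems.append("age/sex cells do not add to total population")
    if sum(bg["male_0_5_by_race"].values()) != bg["male_0_5"]:
        problems.append("male 0 to 5 by race does not add to male 0 to 5")
    # Inequality constraints between related totals.
    if bg["white_non_hispanic"] > bg["white"]:
        problems.append("White Non-Hispanic exceeds White")
    if bg["hispanic"] > bg["total_pop"] - bg["white_non_hispanic"]:
        problems.append("Hispanic exceeds Total less White Non-Hispanic")
    return problems


# Hypothetical record that satisfies every rule.
good = {
    "total_pop": 100,
    "pop_by_age_sex": {"male_0_5": 6, "female_0_5": 4, "all_other_cells": 90},
    "male_0_5": 6,
    "male_0_5_by_race": {"white": 3, "black": 1, "asian": 1,
                         "aian": 0, "other": 0, "two_or_more": 1},
    "white": 60,
    "white_non_hispanic": 50,
    "hispanic": 20,
}
# Same record with two values changed so that both inequality rules fail.
bad = dict(good, hispanic=55, white_non_hispanic=65)

print(find_inconsistencies(good))
print(find_inconsistencies(bad))
```

In practice each distribution would carry its own controlling universe (population 25+, population 16+, households, and so on), so a production check would likely be table-driven rather than hard-coded.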
The last part of the validity check is to find and fix rounding errors. Rounding errors are introduced in all estimates, since the sum of a distribution will frequently not add exactly to the required estimate. To accommodate the rounding error, EASI has developed various ways of adjusting the error into the most likely cell. In EASI, rounding errors are calculated simultaneously as the distribution is being estimated, so when a group or cell sum is off by 1 (high or low), EASI immediately makes the adjustment in that actual group or cell.
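EASI's rounding adjustment is proprietary and is not specified here. As a stand-in, the sketch below uses the common largest-remainder method to round a distribution so that its integer cells add exactly to a control total. The function name and sample values are assumptions for illustration.

```python
import math


def round_to_control(estimates: list, control_total: int) -> list:
    """Round cells to integers that add exactly to control_total (largest remainder)."""
    floors = [math.floor(x) for x in estimates]
    shortfall = control_total - sum(floors)
    if not 0 <= shortfall <= len(estimates):
        raise ValueError("estimates are too far from the control total")
    # Give the leftover units to the cells with the largest fractional parts.
    order = sorted(range(len(estimates)),
                   key=lambda i: estimates[i] - floors[i],
                   reverse=True)
    for i in order[:shortfall]:
        floors[i] += 1
    return floors


# Hypothetical unrounded cell estimates that must add to a control total of 61.
cells = round_to_control([10.6, 20.3, 30.1], 61)
print(cells, sum(cells))
```

Rounding each cell independently would give 11 + 20 + 30 = 61 here, but in general independent rounding can miss the control total by one or more, which is the error the text describes.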
These checks are performed at the BG, City, and ZIP Code levels. This is required because EASI splits BGs to create cities and ZIP Codes. Since splitting BGs can introduce these validity issues, the EASI methodology requires the BG checks described above to be repeated for both cities and ZIP Codes as well.
Life Stage Clusters – The Basics
Step 1: Begin with a collection of neighborhood (Census Block Group) demographic data series to learn about what comprises a “neighborhood”.
Step 2: Through thousands of multivariate analyses, EASI synthesized and identified the independent variables, and their relationship to each other, that form the foundation of the clusters. This statistical foundation of neighborhoods forms the basis of “Life Stages”.
Step 3: Based on the unique variables characterized by the Life Stages concept of independent clusters, EASI was able to replicate and verify the accuracy and utility of its neighborhood prediction model.
Step 4: Create EASI Life Stages, an understandable, explainable, and statistically relevant group of clusters which comprise a highly predictive neighborhood model of location.
For a further discussion of these methodologies please call 800-HOW-EASI (469-3274) or email info@easidemographics.com.
Summary Review
2010 Census Data: PL 94-171 file. 2010 Census data for April 1, 2010. Cities are now based on the results of the 2010 Census (population 100 or more as of 4/1/2010). EASI reviews Census estimates for internal consistency and makes adjustments where required.
Current Estimates and Five-Year Projections: EASI has collected from the Census Bureau all current local (counties, etc.) and national updates and estimates for all the key demographic information. All these official estimates have been analyzed and then incorporated into our estimates and projections using a variety of EASI models. However, starting with the 2010 Census benchmark, the largest impact on our estimates will now come from the American Community Survey and Public Use Microdata Sample.
EASI has summarized United States Postal Service (USPS) mailable households at the County, ZIP Code, Census Tract, and Block Group levels. These data have been used as the primary input to estimate local current change within a small area such as a Block Group. Mailable households are not the same as Census households but are used to indicate recent annual change in household formation. These changes are combined with an EASI proprietary model for updating and forecasting at the Block Group level.
The Mailable Household data match starts by identifying, for every ZIP Plus 4 (ZIP+4), which Block Group it belongs to. EASI develops a split file and a plurality file of these matches using the latest TIGER (Topologically Integrated Geographic Encoding and Referencing system) file to determine the primary Block Group to which each ZIP+4 should be assigned. A key goal is to identify all correct current ZIP Codes and ZIP+4s and then assign their Mailable Households to the correct Block Group. EASI has also reconfigured the 2000 Block Groups into the 2010 Block Group configuration to estimate the 2000 population for comparative purposes. An analysis of this decade change is also included in our model.
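The relationship between a split file and a plurality file can be illustrated as follows. This is a sketch under stated assumptions: the row layout `(zip4, block_group, households)` and the identifiers are hypothetical, and the real matching against TIGER geography is not shown.

```python
from collections import defaultdict


def plurality_assignment(split_rows):
    """Pick, for each ZIP+4, the Block Group holding the most of its households.

    split_rows: iterable of (zip4, block_group, households) tuples, i.e. a
    hypothetical split file with one row per ZIP+4 / Block Group overlap.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for zip4, block_group, households in split_rows:
        totals[zip4][block_group] += households
    # Ties resolve to the Block Group encountered first for that ZIP+4.
    return {zip4: max(bgs, key=bgs.get) for zip4, bgs in totals.items()}


# Hypothetical split file: the first ZIP+4 straddles two Block Groups.
split_file = [
    ("12345-0001", "BG_A", 7),
    ("12345-0001", "BG_B", 3),
    ("12345-0002", "BG_B", 5),
]
primary = plurality_assignment(split_file)
print(primary)
```

The split file keeps every overlap so that households can be apportioned, while the plurality file records a single primary Block Group per ZIP+4.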
EASI has also analyzed the 2010 Census Block files in order to create a population Centroid for each Block Group. The results of that analysis are used for all ring study analysis.
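A population centroid of this kind is commonly computed as the population-weighted mean of the block coordinates. The sketch below assumes that definition and averages latitude and longitude directly, which is a reasonable approximation only over a small area such as a Block Group. The coordinates and counts are hypothetical.

```python
def population_centroid(blocks):
    """Population-weighted mean location of a set of Census Blocks.

    blocks: list of (latitude, longitude, population) tuples.
    """
    total = sum(pop for _, _, pop in blocks)
    if total == 0:
        raise ValueError("Block Group has no population to weight by")
    lat = sum(la * pop for la, _, pop in blocks) / total
    lon = sum(lo * pop for _, lo, pop in blocks) / total
    return lat, lon


# Two hypothetical blocks; the more populous block pulls the centroid toward it.
centroid = population_centroid([(40.0, -74.0, 100), (40.2, -74.2, 300)])
print(centroid)
```

A ring study can then include or exclude a Block Group according to whether this single point falls inside the ring.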
Specific other sources include:
Bureau of the Census – 2010 Census PL 94-171; American Community Survey (ACS) and Public Use Microdata Samples (PUMS). Other related sources are: Annual Demographic Survey, Current Population Reports (P20, P25, P60), and numerous special Census reports.
EASI has developed a series of proprietary models (based on almost 40 years of experience) for updating and forecasting age, sex, race, income, and all related variables. EASI controls estimates to Census control totals and uses the most recent ACS and PUMS as part of its modeling.
2010 Census data are also developed from the PL 94-171 file combined with ACS and PUMS data.
Current and 5-year forecasts CEX Data: An EASI proprietary model using the latest CEX (Bureau of Labor Statistics; Consumer Expenditure Survey) data modeled against EASI current and forecasted demographics. The results of the CEX are analyzed annually by EASI and then combined with EASI estimates at the Block Group level. The Bureau of Labor Statistics and the Bureau of the Census conduct the CEX. There are two parts to the survey. The first part is a diary, which is completed by respondents for two consecutive one-week periods. The second part is an interview survey, which is conducted quarterly (every 3 months) for five quarters. The interview survey includes about 95% of all expenditures and includes large expenditures such as property, automobiles, major appliances, rent, utility payments, insurance premiums, and many others.
EASI annually models the results for 600+ categories of expenditures against our updated demographic estimates. EASI’s models use our own BG demographic estimates to update these potential sales.
Current Business Counts: EASI uses the latest versions of ZIP Business Patterns and County Business Patterns (US Department of Commerce - Economics and Statistics Administration- Bureau of the Census). EASI has a proprietary method for estimating any non-disclosures.
Current Crime Data: EASI analyzed actual county level FBI data (US Department of Justice - Federal Bureau of Investigation) for the various types of crimes. EASI split the list of these counties (the only level where actual crime data exists) into two random groups of 200 or so each.
EASI then developed a series of regression models which could then be applied to EASI updated demographics. EASI then tested the models on the 200 counties that were not part of the actual regressions.
The results proved that the models do an excellent job of predicting crime rates at the county level.
EASI then used the results to develop a similar methodology for estimating crime at the Block Group and other levels of geography.
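The split-sample validation described above (fit on one random group of counties, test on the other) can be sketched as follows. Everything here is an illustrative assumption: the data are synthetic, a single demographic predictor stands in for EASI's multivariate regression models, and out-of-sample R² is used as the accuracy measure.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def holdout_r2(counties, seed=0):
    """Fit on a random half of the counties and score R^2 on the other half."""
    shuffled = counties[:]
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    train, test = shuffled[:half], shuffled[half:]
    slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
    mean_y = sum(y for _, y in test) / len(test)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in test)
    ss_tot = sum((y - mean_y) ** 2 for _, y in test)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for about 400 counties: (demographic predictor, crime rate).
rng = random.Random(42)
counties = []
for _ in range(400):
    x = rng.uniform(0, 10)
    counties.append((x, 2.0 + 3.0 * x + rng.gauss(0, 1)))

r2 = holdout_r2(counties)
print(f"Out-of-sample R^2: {r2:.3f}")
```

Scoring the model only on counties it never saw during fitting is what guards against a fit that looks good merely because it memorized the training group.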
Current Weather Data: EASI has analyzed data from the National Oceanic and Atmospheric Administration - National Environmental Satellite, Data and Information Service - National Climatic Data Center.
Current CPI Data: EASI has analyzed data from the Consumer Price Index (Bureau of Labor Statistics - Department of Labor - CPI Detailed Report December 2012).
Current and five-year forecasts of Retail Sales Data: EASI’s proprietary Retail Sales Estimates include Food Service. Total Retail Sales includes the standard 12 major stores plus Food Service. Also included are 55+ Minor Stores and 45 Major Merchandise Lines. All data are based on an extensive review of County and ZIP Code Retail Trade data for 2012. EASI created a file of benchmark data from the released Census data, which is used for our annual update.
Each year, EASI creates a new consistent file of benchmark (2012), current, and five-year forecast estimates. EASI re-benchmarks estimates for each update to a new set of Block Group estimates for all retail categories based on new information, so our data are consistent over time. These estimates are based on our current analysis of the latest NAICS employment data for each retail store and food service. Note: EASI resolves any inconsistencies between sources as part of this annual process.
Current and five-year forecast of Health Data: EASI has developed a series of demographic models using the latest reports from:
Vital and Health Statistics; Centers for Disease Control and Prevention; Summary Health Statistics for US Adults: National Health Interview Survey – Series 10, #235
Vital and Health Statistics; Centers for Disease Control and Prevention; Summary Health Statistics for US Children: National Health Interview Survey – Series 10, #234
National Vital Statistics Reports United States Life Tables; Centers for Disease Control and Prevention – Volume 56, #9
Current MRI Data: EASI has developed a proprietary model based on the results of the 26 MRI reports for the most current year.
Current and five-year forecast of Life Stage Clusters: EASI has developed a simplified clustering system. A primary goal of the EASI development effort has been the creation of a cluster system based on a Life Stage model. Life Stage Clusters are a neighborhood classification system based on the crucial factors that determine life's key decisions (84 possible Life Stages based upon Age of Head of Household, Marital Status, and Household Income). It is a community-oriented scheme that identifies and quantifies the factors that are involved in moving to a specific location. To accomplish this goal, EASI's statisticians have spent hundreds of hours analyzing the vast EASI demographic database and organizing the results into a simplified system designed for non-statisticians.
EASI Master Database and The Right Site ® Methodology for ZIP4 and Carrier Route Updates
The following is a general description of the methodology used by EASI to update the demographic and economic characteristics for the United States for the ZIP Plus 4 (Z4) geography.
Since there are no official Census estimates for detailed demographics below the Block Group level, EASI has developed a methodology that estimates, for the over 37 million residential Z4s, an approximate value for a variety of key demographics based upon likelihood.
EASI first estimates the number of Households within a Z4 based upon the relation of Z4 mailable households (developed from postal files) to mailable households in the Block Group where the Z4 is located. Once that Z4 Household estimate is created, EASI then estimates the Z4 key demographics. EASI has developed a unique approach that uses an inverse weighting formula based upon the distance of the Z4 from its nearest Block Groups.
For example, if the diagram below illustrates 2 Block Groups (the Block Group Population Centroids are the tops of the mountains) and the Z4s that surround each, then the Z4s that border another Block Group (orange) are affected by their proximity to that other Block Group.
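The two steps described above can be sketched generically. EASI's actual formula is proprietary and is not disclosed in this document, so the code below is only an illustration of the general idea: a proportional household allocation followed by a textbook inverse-distance weighting. The function names, the `power` parameter, the planar coordinates, and all figures are assumptions.

```python
import math


def z4_households(z4_mailable, bg_mailable, bg_households):
    """Scale Block Group households by the Z4's share of BG mailable households."""
    if bg_mailable == 0:
        return 0.0
    return bg_households * z4_mailable / bg_mailable


def inverse_distance_estimate(z4_xy, bg_centroids, bg_values, power=1.0):
    """Blend nearby Block Group values, weighting each by 1 / distance**power."""
    weights = []
    for (x, y), value in zip(bg_centroids, bg_values):
        distance = math.hypot(z4_xy[0] - x, z4_xy[1] - y)
        if distance == 0:
            return float(value)  # The Z4 sits exactly on a centroid.
        weights.append(1.0 / distance ** power)
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, bg_values)) / total


# Hypothetical Z4 with 10 of its Block Group's 200 mailable households.
households = z4_households(z4_mailable=10, bg_mailable=200, bg_households=180)

# Hypothetical Z4 located 1 unit from one centroid and 3 units from another,
# blending a demographic value (for example, median income) from both.
income = inverse_distance_estimate((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)],
                                   [50000.0, 80000.0])
print(households, income)
```

With these inputs the nearer Block Group receives three times the weight of the farther one, so a Z4 on the border of its Block Group takes on some of the character of its neighbor, as the example in the text describes.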
The purpose of this explanation is not to divulge any proprietary methods but to illustrate the efforts made on your behalf to create accurate updates. EASI statisticians and programmers have over 40 years of experience updating these types of data. By industry standards, EASI estimates would be considered of the highest quality.
Note: EASI has improved the consistency of its ZIP4 and Carrier Route files by eliminating Business (only business deliveries) ZIP4s from our roster. The consequence of this change will be fewer ZIP4 records in our ZIP4 Conversion File and in our ZIP4 demographic files, as well as considerably fewer records in our Carrier Route files. Almost 200,000 Business Carrier Routes have been dropped. This change helps make the demographic files (which are all residentially based) more consistent in their allocations to ZIP4s and Carrier Routes.