Purpose

The Voting Location Siting Tool is designed to help California county election officials identify optimal sites for potential polling place, vote center and vote-by-mail drop box locations, as well as to provide visualization of demographic and voter data at the community level.

Please see the Siting Tool's current methodology below. For more details, see CID's contact page.

Scope

The Tool provides vote center and drop box location modeling, along with visualization of demographic and voter data for the following counties:
The Tool provides visualization of demographic and voter data for the following counties:

Voting Location and Ballot Drop Box Location Modeling Methodology

The CID acknowledges that there are many factors that go into the decision-making process for site selection. Therefore, due to the limitations of data and the need to incorporate local on-the-ground knowledge, this tool does not identify exact sites to be used.

Suitable Areas

We identified all areas that were potentially suitable for hosting a site. We created a grid made up of 0.5-mile cells covering the entire county, where each cell is a potential area to host a site. These “suitable areas” were determined using a combined approach for road density and “points of interest” density. A suitable area needed to contain either a sufficient density of roads or at least two points of interest, while also avoiding bodies of water. This means that suitable areas have some concentration of activity and are therefore more likely to have buildings and infrastructure. Note that we are not suggesting that all points of interest are suitable sites to host a voting location, but that the presence of a point of interest suggests a general concentration of infrastructure.

Points of interest were sourced from OpenStreetMap and were defined by the research team to be both governmental and non-governmental buildings (shown as two different layers in the tool) that could serve as potential voting locations or drop box sites.
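The suitability rule described above can be sketched as a simple test on each grid cell. This is a minimal illustration, not the tool's implementation: the `GridCell` fields and the road-density threshold are hypothetical, since the methodology does not state what counts as a "sufficient density of roads".

```python
from dataclasses import dataclass

# Hypothetical threshold: the methodology does not publish the road-density
# cutoff, so this value is purely illustrative.
MIN_ROAD_KM_PER_CELL = 1.0
MIN_POINTS_OF_INTEREST = 2  # "at least two points of interest" per the text


@dataclass
class GridCell:
    cell_id: str
    road_km: float   # total road length inside the cell (assumed measure)
    poi_count: int   # points of interest inside the cell
    is_water: bool   # True if the cell falls on a body of water


def is_suitable(cell: GridCell) -> bool:
    """A cell qualifies with enough roads OR enough POIs, and never on water."""
    if cell.is_water:
        return False
    return (cell.road_km >= MIN_ROAD_KM_PER_CELL
            or cell.poi_count >= MIN_POINTS_OF_INTEREST)


cells = [
    GridCell("a", road_km=2.4, poi_count=0, is_water=False),  # dense roads
    GridCell("b", road_km=0.1, poi_count=3, is_water=False),  # enough POIs
    GridCell("c", road_km=0.1, poi_count=1, is_water=False),  # neither
    GridCell("d", road_km=3.0, poi_count=5, is_water=True),   # water
]
suitable_ids = [c.cell_id for c in cells if is_suitable(c)]
```

In this toy grid only cells "a" and "b" pass the test.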

Facility Location

To generate suggested locations based on the number of sites each county has publicly reported, we used a k-means model and a series of facility location models.

First, we conducted a k-means cluster analysis to aggregate populated census block sites into a smaller number of computationally manageable geographic clusters. These clusters were created by clustering on latitude/longitude. Next, we estimated travel time from every census block cluster to every potential area for a voting location (defined by the “suitable areas” grid).
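As an illustration of the clustering step, the sketch below runs a plain Lloyd's-algorithm k-means on latitude/longitude pairs. It is a simplified stand-in written with the standard library only; the coordinates, the number of clusters, and the use of straight Euclidean distance on degrees are assumptions, and the tool's actual software and settings are not documented here.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm on (lat, lon) pairs; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct blocks
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each block to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return centroids, labels


# Hypothetical census block centroids forming two separate groups.
blocks = [
    (38.50, -121.50), (38.51, -121.49), (38.52, -121.51),
    (38.90, -121.00), (38.91, -121.01), (38.89, -120.99),
]
centroids, labels = kmeans(blocks, k=2)
```

Each resulting centroid then stands in for its member blocks when travel times to the suitable areas are estimated.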

We then used a facility location model to determine optimal locations. The inputs our facility location model used were: the cost of site creation (which included the area score and the presence of points of interest), the travel time to neighboring sites, and the estimated capacity of the site.

The facility location model selected for both the highest gain (most voters served), and the lowest penalty (high travel time and low weighted scores). These scores were generated by the indicators listed in the Data section below, and weighted based on importance. The total score was defined as the sum of individual scores across indicators, multiplied by the weights. The facility location model prefers to locate potential voting areas on sites that are near a high number of voters and/or are near a site with a high score.

The model first selects a certain number of 11-day sites. These sites become fixed points for the 4-day site selection, and the sites from both the 11-day and 4-day selection are fixed when running the model to include additional sites. The number of final points is defined by California law. Proposed sites for voting locations and ballot drop boxes were calculated separately and allowed to overlap.
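The staged selection described above, where earlier picks stay fixed while later sites are added, can be illustrated with a greedy heuristic. This is a deliberately simplified substitute for the facility location model: it minimizes only voter-weighted travel time and ignores site cost, area scores and capacity. The travel-time matrix, voter counts and site counts are made-up values.

```python
def greedy_add_sites(travel, voters, fixed, n_new):
    """Add n_new sites to the fixed ones, one at a time.

    travel[c][s] = minutes from cluster c to candidate area s
    voters[c]    = number of voters in cluster c
    Each step picks the candidate that most reduces the total
    voter-weighted travel time to the nearest open site.
    """
    chosen = list(fixed)
    candidates = range(len(travel[0]))

    def total_cost(sites):
        return sum(v * min(row[s] for s in sites)
                   for row, v in zip(travel, voters))

    for _ in range(n_new):
        remaining = [s for s in candidates if s not in chosen]
        best = min(remaining, key=lambda s: total_cost(chosen + [s]))
        chosen.append(best)
    return chosen


# Hypothetical inputs: 3 voter clusters, 4 candidate areas.
travel = [
    [5, 20, 30, 12],
    [25, 6, 28, 10],
    [30, 27, 4, 15],
]
voters = [1000, 800, 300]

# Stage 1: choose the 11-day sites with nothing fixed.
eleven_day = greedy_add_sites(travel, voters, fixed=[], n_new=1)
# Stage 2: the 11-day sites stay fixed while 4-day sites are added.
all_sites = greedy_add_sites(travel, voters, fixed=eleven_day, n_new=2)
```

Because each stage passes its result in as `fixed`, later stages can only add to the earlier selection, never move it.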

Scoring Model

The data were normalized and combined in a weighted average. The weighting schema is described below in the section Variable Weights. A higher score indicates that there were multiple priority characteristics, whereas a medium score indicates that this area has some priority characteristics, but not all. For example, an area with a higher rate of eligible non-registered voter population, a higher percent disabled population and a higher percent limited English proficient population would receive a score that is higher than an area with a lower rate of eligible non-registered voter population, lower percent disabled population, and lower percent limited English proficient population.
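A minimal sketch of the scoring step follows. The methodology says only that the data were "normalized"; min-max scaling to the 0-1 range is assumed here for illustration, and the indicator names, values and equal weights are hypothetical.

```python
def min_max(values):
    """Rescale a list of numbers to the 0-1 range (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weighted_scores(indicators, weights):
    """Weighted average of normalized indicators, one score per area."""
    normalized = {name: min_max(vals) for name, vals in indicators.items()}
    total_weight = sum(weights[name] for name in indicators)
    n_areas = len(next(iter(indicators.values())))
    return [
        sum(weights[name] * normalized[name][i] for name in indicators)
        / total_weight
        for i in range(n_areas)
    ]


# Three hypothetical areas; each list holds one value per area.
indicators = {
    "pct_disabled": [5.0, 10.0, 15.0],
    "pct_limited_english": [2.0, 20.0, 11.0],
    "eligible_nonregistered_rate": [0.10, 0.30, 0.50],
}
weights = {name: 1.0 for name in indicators}  # equal weights for illustration
scores = weighted_scores(indicators, weights)
```

The first area is lowest on every indicator and scores 0, while the third area, which is highest on two of the three indicators, receives the highest score.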

Data

For more information on the data source and calculation of these variables see below.

Transit Stops: Transit points indicate the location of transit stops in the county and are sourced from regional General Transit Feed Specification (GTFS) feeds. Where GTFS data is missing, local transit agency data is used instead.

  • Data sources: GTFS and local transit agencies.
  • Calculation: Frequency of service to each transit stop was normalized to a range of 1-4 that indicates low to high service.
  • Scale: Point
  • Rationale: Voting location/drop off should be proximal to public transportation.
  • Limitations: Assessment of quality will be generally based on published timetables that are subject to change. Some transit stops do not have published stop frequency information. These are retained on the map for visual purposes, but excluded from the analysis.
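The 1-4 service scale above could be produced in several ways. The sketch below assumes four equal-width bins between the lowest and highest published frequency, which is an illustrative choice and not a documented one. Stop names and trip counts are hypothetical; stops without published frequency are left out, as the limitation above describes.

```python
def service_level(trips_per_day, lo, hi):
    """Map a frequency onto 1 (low) to 4 (high) using four equal-width bins."""
    if hi == lo:
        return 1
    position = (trips_per_day - lo) / (hi - lo)  # 0.0 to 1.0
    return min(4, int(position * 4) + 1)


# Hypothetical stops; None means no published frequency information.
stops = {"stop_a": 10, "stop_b": 45, "stop_c": 80, "stop_d": 150, "stop_e": None}

# Stops without frequency data stay on the map but are excluded here.
published = {s: f for s, f in stops.items() if f is not None}
lo, hi = min(published.values()), max(published.values())
levels = {s: service_level(f, lo, hi) for s, f in published.items()}
```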

Percent of Population with Vehicle Access: The percentage of the population with access to a vehicle.

  • Data sources: American Community Survey 5-Year Estimate (2014-2018), Table B25044.
  • Calculation: The percent of households with at least one vehicle available. The direction of this variable is inverted before it enters the model score, so that areas with a high percentage of car access receive lower priority for siting.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to communities with low rates of household vehicle ownership.
  • Limitations: Data shows where people live, not where they work.

County Percentage of Voting Age Citizens: The number of citizens in this tract who are voting age, divided by the county's total number of voting age citizens.

  • Data sources: American Community Survey 5-Year Estimate (2014-2018).
  • Calculation: The number of voting age citizens in the tract divided by the total number of voting age citizens in the county.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to population centers.
  • Limitations: Estimates can have high margins of error for small samples. Assumes people spread evenly across census tract.

Percent Disabled Population: The percentage of the population that is disabled.

  • Data sources: American Community Survey 5-Year Estimate (2014-2018), Table B23024.
  • Calculation: The percent of residents with disabilities in a census tract out of the total population in the census tract.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to voters with disabilities.
  • Limitations: Data shows where people live, not where they work. Data shows disabled population, not voters.

Eligible Non-Registered Voter Rate: The percentage of voting age citizens who are not registered to vote.

  • Data sources: Statewide Database voter registration data (2012 General Election, 2014 General Election, 2016 General Election); American Community Survey 5-Year Estimate (2014-2018); Census 2010.
  • Calculation: Convert voter data to the tract level and average voter registration totals for 2012-2016. Subtract the average number of registered voters from the citizen voting age population (CVAP). Divide by the total CVAP estimate in the tract. Where the incarcerated population is over 25% of the CVAP, use the 2010 Census estimate for non-institutionalized populations instead of CVAP.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to communities of eligible voters who aren’t registered to vote.
  • Limitations: Imperfect conversion of precinct to tract level. The 2010 Census is the most recent estimate for incarcerated and non-institutionalized populations at the tract level.
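The calculation above can be written out as a short function. This is a sketch under stated assumptions: the argument names are hypothetical, and when the incarcerated share exceeds 25% the 2010 non-institutionalized estimate is assumed to replace CVAP in both the subtraction and the division.

```python
def eligible_nonregistered_rate(registered_by_year, cvap, incarcerated,
                                noninstitutionalized_2010):
    """Share of eligible citizens in a tract who are not registered to vote."""
    # Average the tract-level registration totals (2012, 2014, 2016).
    avg_registered = sum(registered_by_year) / len(registered_by_year)

    # Where incarcerated residents exceed 25% of CVAP, fall back to the
    # 2010 Census non-institutionalized population as the eligible base.
    if cvap > 0 and incarcerated / cvap > 0.25:
        base = noninstitutionalized_2010
    else:
        base = cvap

    if base <= 0:
        return None  # no eligible population to compute a rate from
    return (base - avg_registered) / base


# Hypothetical tracts.
typical = eligible_nonregistered_rate([2000, 1900, 2100], cvap=4000,
                                      incarcerated=100,
                                      noninstitutionalized_2010=3900)
prison_tract = eligible_nonregistered_rate([2000, 1900, 2100], cvap=4000,
                                           incarcerated=1500,
                                           noninstitutionalized_2010=2500)
```

In the first tract the rate is (4000 - 2000) / 4000; in the second, the fallback base gives (2500 - 2000) / 2500.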

County Worker Percentage: The percent of employed county residents in a tract out of the total employed county residents in the county.

  • Data sources: Census LEHD Origin-Destination Employment Statistics (LODES) 2015, workplace area characteristics.
  • Calculation: The percent of employed county residents in a tract out of the total employed county residents in the county.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to employment centers.
  • Limitations: Data shows where people work, not where they live.
  • Note: Although this variable is displayed on the web map at the census tract level, the model input was at the census block level.

Percent Latino Population: The percentage of the population that is Hispanic or Latino.

  • Data sources: American Community Survey 5-Year Estimate (2014-2018), Table B03002.
  • Calculation: The percent of residents that are Hispanic or Latino in a tract out of the total tract population.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to communities with historically low VBM use. CID research finds Latino voters have lower VBM use.
  • Limitations: Data shows where people live, not where they work. Assumes people spread evenly across census tract.

Percent Limited English Proficient Population: The percentage of the population that has limited English proficiency.

  • Data sources: American Community Survey 5-Year Estimate (2011-2015), Table B16001.
  • Calculation: The percent of the population with limited English proficiency in a census tract. Limited English proficiency is defined as people who speak English “less than very well”.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to language minority communities.
  • Limitations: Data shows where people live, not where they work. At the time of publication the 2012-2016 data had not yet been released.

Polling Place Voter Percentage: The number of voters who voted at a polling place divided by the total number of voters who voted at a polling place in the county.

  • Data sources: Statewide Database (2016 General Election).
  • Calculation: Convert precinct to block. Divide the number of people who voted at a polling place in the 2016 General Election in the block by the total number of people who voted at a polling place in the 2016 General Election in the county.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to communities with historically low VBM use. CID research finds Latino voters have lower VBM use.
  • Limitations: Data shows where people live, not where they work. Assumes people spread evenly across census tract.
  • Note: Although this variable is displayed on the web map at the census tract level, the model input was at the census block level.

Population Density: The total population per square kilometer.

  • Data sources: Census 2010.
  • Calculation: Divide the total block population by the area of the block (square kilometers).
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to population centers.
  • Limitations: Data shows where people live, not where they work. Most recent block level population estimate is from 2010.
  • Note: Although this variable is displayed on the web map at the census tract level, the model input was at the census block level.

Percent of the Population in Poverty: The percentage of the population with income below the poverty level.

  • Data sources: American Community Survey 5-Year Estimate (2014-2018), Table B17001.
  • Calculation: The percent of residents living below the poverty level in a census tract.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to low-income communities.
  • Limitations: Data shows where people live, not where they work.

Vote by Mail Rate (Asian-American): The percentage of Asian-American voters who voted by mail out of total Asian-American voters.

  • Data sources: Statewide Database (2016 General Election).
  • Calculation: Convert from precinct to block level, calculate as percent of total vote. Calculate the VBM rate for the Asian-American vote by dividing the number of Asian-American voters who voted by mail by the total number of Asian-American voters who voted. The direction of this variable is inverted before it enters the model score, so that areas with a high VBM rate receive lower priority for siting.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to communities with low VBM usage. CID research shows that VBM use varies throughout the Asian-American community.
  • Limitations: Imperfect conversion of precinct to block.
  • Note: Although this variable is displayed on the web map at the census tract level, the model input was at the census block level.

Vote by Mail Rate (Latino): The percentage of Latino voters who voted by mail out of total Latino voters.

  • Data sources: Statewide Database (2016 General Election).
  • Calculation: Convert from precinct to block level, calculate as percent of total vote. Calculate the VBM rate for the Latino vote by dividing the number of Latino voters who voted by mail by the total number of Latino voters who voted. The direction of this variable is inverted before it enters the model score, so that areas with a high VBM rate receive lower priority for siting.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to communities with low VBM usage.
  • Limitations: Imperfect conversion of precinct to block.
  • Note: Although this variable is displayed on the web map at the census tract level, the model input was at the census block level.

Vote by Mail Rate (Youth): The percentage of voters between the age of 18 and 24 years old who voted by mail out of total youth voters.

  • Data sources: Statewide Database (2016 General Election).
  • Calculation: Convert from precinct to block level, calculate as percent of total vote. Calculate the VBM rate for the youth vote by dividing the number of youth voters who voted by mail by the total number of youth voters who voted. The direction of this variable is inverted before it enters the model score, so that areas with a high VBM rate receive lower priority for siting.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to communities with low VBM usage.
  • Limitations: Imperfect conversion of precinct to block.
  • Note: Although this variable is displayed on the web map at the census tract level, the model input was at the census block level.

Vote by Mail Rate (Total): The percentage of voters who voted by mail out of total voters.

  • Data sources: Statewide Database (2016 General Election).
  • Calculation: Convert from precinct to block level, calculate as percent of total vote. Calculate the VBM rate for the total vote by dividing the number of voters who voted by mail by the total number of voters who voted. The direction of this variable is inverted before it enters the model score, so that areas with a high VBM rate receive lower priority for siting.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to communities with low VBM usage.
  • Limitations: Imperfect conversion of precinct to block.
  • Note: Although this variable is displayed on the web map at the census tract level, the model input was at the census block level.
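The vote-by-mail variables above share the same two steps: compute a rate, then invert its direction so that low VBM use raises priority. A minimal sketch follows, assuming the inversion is a simple `1 - rate`; the methodology does not specify the exact transformation, and the block counts are hypothetical.

```python
def vbm_rate(vbm_voters, total_voters):
    """Share of voters in a block who voted by mail."""
    if total_voters == 0:
        return None  # no voters in the block, so no rate
    return vbm_voters / total_voters


def inverted(rate):
    """Flip direction so low VBM use maps to a high priority value.

    Assumes the rate is already on a 0-1 scale.
    """
    return 1.0 - rate


# Hypothetical blocks: (voters who voted by mail, total voters).
blocks = {"block_1": (80, 100), "block_2": (30, 120)}
priority = {
    name: inverted(vbm_rate(vbm, total))
    for name, (vbm, total) in blocks.items()
}
```

Here "block_2", where only a quarter of voters used VBM, ends up with the higher priority value.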

Percent of the Youth Population: The percentage of the population between the age of 18 and 24 years old.

  • Data sources: American Community Survey 5-Year Estimate (2014-2018), Table B01001.
  • Calculation: The percent of residents between the ages of 18 to 24 years in a tract out of the total tract population.
  • Scale: Census Tract
  • Rationale: Voting location/drop off should be proximal to communities with historically low VBM usage. CID research finds youth have lower VBM use.
  • Limitations: Data shows where people live, not where they work. Assumes people spread evenly across census tract.

Geographically Isolated Community:

  • Data sources: Model
  • Calculation: The model accounts for geographically isolated communities by encouraging dispersion of sites. The additional suggested areas based on distance account for any remaining communities that have a greater travel time to a suggested site.
  • Scale: NA
  • Rationale: The Voter's Choice Act (VCA) stipulates that vote centers and drop boxes be proximal to geographically isolated communities.
  • Limitations: There is no clear definition or data source for geographically isolated communities.

Travel Time By Car:

  • Data sources: OpenStreetMap
  • Calculation: Use k-means clustering to create groups of computationally-manageable census blocks. Calculate travel time from each group of blocks to each potential siting area using road network analysis. The time is estimated for travel by vehicle, assuming standard rates of travel. This data does not go into the score, but is used to locate sites throughout the county.
  • Scale: NA
  • Limitations: Assessment of how travel time is affected by traffic is imperfect.
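Road network travel time of the kind described above is commonly computed with a shortest-path search. The sketch below uses Dijkstra's algorithm on a toy road graph; it is an illustration only, and the node names, segment lengths and speeds are hypothetical stand-ins for OpenStreetMap data and "standard rates of travel".

```python
import heapq


def travel_minutes(graph, origin):
    """Dijkstra's algorithm: fastest drive time from origin to every node."""
    best = {origin: 0.0}
    queue = [(0.0, origin)]
    while queue:
        minutes, node = heapq.heappop(queue)
        if minutes > best.get(node, float("inf")):
            continue  # stale queue entry; a faster route was already found
        for neighbor, edge_minutes in graph.get(node, []):
            candidate = minutes + edge_minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return best


# Hypothetical road segments: (from, to, length in km, assumed speed in km/h).
roads = [
    ("cluster_1", "junction", 2.0, 40),
    ("junction", "area_a", 3.0, 60),
    ("junction", "area_b", 1.0, 40),
    ("cluster_1", "area_b", 6.0, 40),
]

# Build an undirected graph whose edge weights are minutes of travel.
graph = {}
for a, b, km, kmh in roads:
    minutes = km * 60 / kmh
    graph.setdefault(a, []).append((b, minutes))
    graph.setdefault(b, []).append((a, minutes))

times = travel_minutes(graph, "cluster_1")
```

Running the search once per block cluster yields the cluster-to-area travel time matrix that the facility location step consumes.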

Percent Asian-American Population: The percentage of the population that is Asian-American alone, not Hispanic or Latino.

  • Data sources: American Community Survey 5-Year Estimate (2014-2018), Table B03002.
  • Calculation: The percent of residents that are Asian-American alone, not Hispanic or Latino in a tract out of the total tract population. The categories for Asian-American alone and Native Hawaiian and Other Pacific Islander alone were summed to get the total Asian-American population.
  • Scale: Census Tract
  • Note: This data is included as contextual population information only; it was not included in the model or in the scoring of the potential areas.

Percent African-American Population: The percentage of the population that is African-American alone, not Hispanic or Latino.

  • Data sources: American Community Survey 5-Year Estimate (2014-2018), Table B03002.
  • Calculation: The percent of residents that are African-American alone, not Hispanic or Latino in a tract out of the total tract population.
  • Scale: Census Tract
  • Note: This data is included as contextual population information only; it was not included in the model or in the scoring of the potential areas.

Percent White Population: The percentage of the population that is White alone, not Hispanic or Latino.

  • Data sources: American Community Survey 5-Year Estimate (2014-2018), Table B03002.
  • Calculation: The percent of residents that are White alone, not Hispanic or Latino in a tract out of the total tract population.
  • Scale: Census Tract
  • Note: This data is included as contextual population information only; it was not included in the model or in the scoring of the potential areas.

Variable Weights

In this prototype version, variables were weighted equally, with the exception of several variables that received additional weight because they were flagged as high priority. The variables that received higher weighting were voters with disabilities, voters with limited English proficiency, areas with high populations of eligible non-registered voters, areas with low VBM use for Latino and Youth voters, and areas with relatively high worker density. The variables that received the highest weight were areas with low total VBM use, areas close to public transit stops, and areas with relatively high population density.

The same weighting system was used for modeling optimal ballot drop box areas, with the exception of areas with eligible non-registered voters, which received no extra weight.
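The three weighting tiers can be expressed as a small lookup. The numeric weights below (1, 2, 3) and the variable names are hypothetical, since the methodology names the tiers but not their values; only the tier membership and the drop box exception come from the text above.

```python
# Hypothetical weight values; the actual values are not published here.
BASE, HIGHER, HIGHEST = 1.0, 2.0, 3.0

HIGHER_PRIORITY = {
    "pct_disabled",
    "pct_limited_english",
    "eligible_nonregistered_rate",
    "vbm_rate_latino",
    "vbm_rate_youth",
    "county_worker_pct",
}
HIGHEST_PRIORITY = {"vbm_rate_total", "transit_stops", "population_density"}


def build_weights(variables, site_type="voting_location"):
    """Assign a tiered weight to each variable for the given site type."""
    weights = {}
    for name in variables:
        if name in HIGHEST_PRIORITY:
            weight = HIGHEST
        elif name in HIGHER_PRIORITY:
            weight = HIGHER
        else:
            weight = BASE
        # Drop box modeling gives no extra weight to non-registered voters.
        if site_type == "drop_box" and name == "eligible_nonregistered_rate":
            weight = BASE
        weights[name] = weight
    return weights


variables = ["pct_poverty", "eligible_nonregistered_rate", "vbm_rate_total"]
voting_weights = build_weights(variables)
drop_box_weights = build_weights(variables, site_type="drop_box")
```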

Assessing Gaps in Vote Center Coverage

After the model selects optimal areas for voting locations, we identify areas for additional siting coverage either by geographic distance or by additional need. These areas might be considered service gaps. We are interested in seeing where additional voting locations could be placed to minimize the number of people who are more than 20 minutes of travel time from a voting location, or to minimize the overall travel cost of all voters. The number of additional suggested areas is 10% of the total minimum required facilities.
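The two quantities in this step, the population beyond 20 minutes and the count of additional areas, can be sketched as follows. The input values are hypothetical, and rounding the 10% figure up to a whole site is an assumption.

```python
import math


def voters_beyond_threshold(nearest_minutes, voters, threshold=20.0):
    """Total voters whose nearest selected site is over the threshold."""
    return sum(v for m, v in zip(nearest_minutes, voters) if m > threshold)


def additional_area_count(min_required):
    """10% of the minimum required facilities, rounded up (assumed)."""
    return math.ceil(min_required / 10)


# Hypothetical clusters: travel time to the nearest selected site, and voters.
nearest_minutes = [8, 22, 35, 19]
voters = [1200, 300, 150, 900]

gap_population = voters_beyond_threshold(nearest_minutes, voters)
extra_sites = additional_area_count(30)
```

With these inputs, two clusters fall outside the 20-minute threshold, and a county required to open 30 facilities would receive 3 additional suggested areas.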

Reliability of Estimates

Some data published by the American Community Survey (ACS) rely on small sample sizes, meaning that the resulting estimates can have a high degree of uncertainty. The coefficient of variation (CV) was calculated for every ACS-based variable, where the CV is equal to the standard error (associated with the 90% confidence interval) divided by the estimate (see ACS documentation, Appendix 1). Tract estimates with a CV over 40% were considered to have a high degree of uncertainty and are flagged visually on the web map with a gray square symbol.
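The reliability check can be sketched as below. ACS margins of error are published at the 90% confidence level, so the standard error is recovered by dividing the margin of error by 1.645. The function names and sample values are hypothetical, and treating a zero estimate as unreliable is an assumption.

```python
Z_90 = 1.645  # ACS margins of error correspond to a 90% confidence level


def coefficient_of_variation(estimate, moe_90):
    """CV = standard error / estimate, with SE derived from the 90% MOE."""
    if estimate == 0:
        return None  # CV is undefined for a zero estimate
    standard_error = moe_90 / Z_90
    return standard_error / estimate


def is_unreliable(estimate, moe_90, threshold=0.40):
    """Flag estimates whose CV exceeds 40% (or cannot be computed)."""
    cv = coefficient_of_variation(estimate, moe_90)
    return cv is None or cv > threshold


# Hypothetical tract estimates with their published margins of error.
stable = is_unreliable(estimate=500, moe_90=120)   # CV is about 15%
shaky = is_unreliable(estimate=60, moe_90=50)      # CV is about 51%
```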