Prerequisites

Before scraping data you need a working Python environment. If you have not done so yet, follow the steps in the Python setup vignette:

# One-time setup (downloads ~100-200 MB the first time)
downballot_install_python()

At the start of each R session, load the package (the package name is inferred here from the downballot_ function prefix):

library(downballot)

The single entry point: scrape_elections()

All data retrieval flows through one function:

scrape_elections(state, office = "general", ...)

The scraper is selected automatically based on state and office — you never need to specify a backend by name:

| Priority | Condition                        | Scraper used                                             |
|----------|----------------------------------|----------------------------------------------------------|
| 1        | office = "school_district"       | Ballotpedia school board scraper (all US states)         |
| 2        | office = "state_elections"       | Ballotpedia state elections (2024–present)               |
| 3        | office = "municipal_elections"   | Ballotpedia municipal + mayoral elections (2014–present) |
| 4        | state is North Carolina          | NC State Board of Elections scraper                      |
| 5        | All other states                 | ElectionStats multi-state scraper                        |
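
The routing above means that calls differing only in state or office can hit entirely different backends. A sketch (year arguments follow the forms shown later on this page; exact defaults may vary):

```r
# Priority 5: Virginia falls through to the ElectionStats multi-state scraper
va <- scrape_elections(state = "virginia", year_from = 2020, year_to = 2022)

# Priority 4: North Carolina is routed to the NC State Board scraper
nc <- scrape_elections(state = "NC", year_from = 2025, year_to = 2025)

# Priority 1: office = "school_district" wins regardless of state,
# so this uses the Ballotpedia school board scraper
sb <- scrape_elections(office = "school_district", mode = "districts")
```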

Data availability

Use db_available_years() to check available year ranges programmatically:

# All sources
db_available_years()

# Filter to one office type
db_available_years(office = "school_district")
db_available_years(office = "state_elections")
db_available_years(office = "municipal_elections")

# Filter to one state
db_available_years(state = "virginia")
| Source                            | State          | Start year                           | End year |
|-----------------------------------|----------------|--------------------------------------|----------|
| ElectionStats                     | Vermont        | 1789                                 | 2024     |
| ElectionStats                     | Virginia       | 1789                                 | 2025     |
| ElectionStats                     | Colorado       | 1902                                 | 2024     |
| ElectionStats                     | Massachusetts  | 1970                                 | 2026     |
| ElectionStats                     | New Hampshire  | 1970                                 | 2024     |
| ElectionStats                     | New York       | 1994                                 | 2024     |
| ElectionStats                     | New Mexico     | 2000                                 | 2024     |
| ElectionStats                     | South Carolina | 2008                                 | 2025     |
| NC State Board                    | North Carolina | 2025                                 | 2025     |
| Ballotpedia (school boards)       | All US states  | 2013                                 | present  |
| Ballotpedia (state elections)     | All US states* | 2024                                 | present  |
| Ballotpedia (municipal elections) | All US states  | 2014 ("all") / 2020 ("mayoral")      | present  |

To see which states are supported by the ElectionStats scraper:

db_list_states("election_stats")
#> [1] "colorado"       "massachusetts"  "new_hampshire"  "new_mexico"
#> [5] "new_york"       "south_carolina" "vermont"        "virginia"
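
A common pattern is to confirm coverage before issuing a large scrape. A sketch using the two helpers above:

```r
# Check Virginia's available year range first
db_available_years(state = "virginia")

# Virginia is served by the ElectionStats scraper, so it appears here too
"virginia" %in% db_list_states("election_stats")
#> [1] TRUE
```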

Choosing the right scraper

| Goal                                                        | Call |
|-------------------------------------------------------------|------|
| Candidate + vote totals for VA, MA, CO, NH, VT              | scrape_elections(state = "virginia", ...) |
| Candidate + vote totals for SC, NM, NY (Playwright)         | scrape_elections(state = "south_carolina", ...) |
| NC precinct-level local results                             | scrape_elections(state = "NC", year_from = 2025, year_to = 2025) |
| School board district schedules, any state                  | scrape_elections(office = "school_district", mode = "districts", ...) |
| School board candidates + results                           | scrape_elections(office = "school_district", mode = "results", ...) |
| All candidates for a state (federal + state + local), 2024+ | scrape_elections(state = "Maine", office = "state_elections", year = 2024) |
| Federal candidates only, with vote counts                   | scrape_elections(state = "Maine", office = "state_elections", year = 2024, election_level = "federal", mode = "results") |
| Discover municipal/mayoral election URLs for a year         | scrape_elections(office = "municipal_elections", year = 2022) |
| Full candidate results for municipal elections              | scrape_elections(office = "municipal_elections", year = 2022, mode = "results") |
| Multi-year mayoral results, one state                       | scrape_elections(office = "municipal_elections", state = "Texas", race_type = "mayoral", mode = "results", start_year = 2020, end_year = 2024) |
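
As a concrete sketch of the last row (arguments exactly as documented above; that the return value is a data frame is an assumption):

```r
# Mayoral results for Texas across several election years
tx_mayors <- scrape_elections(
  office     = "municipal_elections",
  state      = "Texas",
  race_type  = "mayoral",
  mode       = "results",
  start_year = 2020,
  end_year   = 2024
)

# Inspect the first few rows before doing anything further
head(tx_mayors)
```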

Detailed vignettes

Each scraper has its own vignette with full argument documentation, worked examples, and column descriptions.


Performance tips

  • Start small — test with a single year and state before requesting large date ranges.
  • Playwright states are slower — South Carolina, New Mexico, and New York launch a headless browser for each scrape. Expect several seconds per year.
  • Ballotpedia mode = "results" makes one extra HTTP request per district, contest, or election page. Use mode = "districts" / mode = "listings" / mode = "links" for a fast overview without vote counts.
  • Parallel scraping is on by default for classic (requests-based) ElectionStats states. Pass parallel = FALSE to disable.
  • Be polite — built-in delays are intentional. Do not reduce or disable them; excessive requests may result in temporary IP blocks.
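
The tips above combine into a "start small" workflow, sketched here (parallel = FALSE is documented above; everything else follows the calls shown earlier on this page):

```r
# Smoke test: one state, one year, parallelism off
smoke <- scrape_elections(
  state     = "massachusetts",
  year_from = 2022,
  year_to   = 2022,
  parallel  = FALSE
)
str(smoke)

# Once the output looks right, widen the range and
# let the default parallel behaviour do the work
full <- scrape_elections(state = "massachusetts",
                         year_from = 1970, year_to = 2024)
```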