by Alton Henley
Before we talk about what AI can do for local government, we need to talk about what it has already done to people.
Not hypothetically. Robert Williams was arrested in front of his daughters. Disabled Medicaid recipients lost care hours they needed to eat and bathe. Families were separated based on risk scores that measured poverty, not danger. A New York City chatbot told business owners they could legally steal their workers’ tips.
These are not cautionary tales from some imagined future. They are the documented record. And every one of them involved a government that believed it was using technology responsibly.
The Best-Case Scenario Still Failed
The Allegheny County case is the one I keep coming back to, because it eliminates the easy explanation.
In 2016, Allegheny County, Pennsylvania, implemented the Allegheny Family Screening Tool to predict which families reported for potential child maltreatment were at highest risk. The county published its methodology. It commissioned an independent ethical review. It invited external researchers to evaluate the system. It engaged extensively with critics. Virginia Eubanks, whose Automating Inequality devotes an entire chapter to this system, acknowledges that the county “did every single thing that progressive folks who talk about these algorithms have asked people to do.”
The system still systematically disadvantaged the families it was supposed to protect.
Here’s why. The AFST draws its data from public systems: county welfare offices, Medicaid, behavioral health services, jails, public housing. Families that interact with these systems generate data points. Families with private doctors, private therapists, and private insurance generate almost none.
The algorithm doesn’t measure risk of child maltreatment. It measures visibility to government. And visibility to government correlates with poverty and race—not because poor people are more likely to harm their children, but because poor people are more likely to have their lives documented by government institutions.
An AP investigation, drawing on Carnegie Mellon University research data, found that the AFST flagged 32 percent of Black children referred for neglect for mandatory investigation, compared with 21 percent of white children. The county’s own data shows that 30 percent of cases classified at the highest risk level were thrown out as baseless upon investigation.
Caseworkers overrode the algorithm roughly a third of the time for the highest-risk scores, and those overrides reduced the racial disparity. But the fact that human judgment was needed to correct algorithmic bias rather defeats the purpose of the tool.
The Proxy-Variable Problem Is Structural
The significance of the Allegheny case for local governments isn’t that one county made a specific mistake. It’s that the problem is structural.
Any algorithm built on government administrative data will reflect who interacts with government, which correlates with poverty, which correlates with race. This isn’t a calibration error. It’s not something a better dataset would fix. The proxy-variable problem is baked into what government data measures. Child welfare, benefits eligibility, housing assistance, social services—these are functions that local governments of modest size actually perform. Any community that adopts algorithmic decision support for eligibility or risk assessment will face the same structural challenge the Allegheny team faced, with fewer resources to study it.
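The dynamic is easier to see in miniature. The sketch below is purely synthetic and is not a model of the AFST or of any county’s data: two groups of simulated families carry the same underlying risk, but only one group’s lives are recorded by public systems, and a score built on those records flags that group far more often. Every number in it is invented for illustration.

```python
# Synthetic illustration of the proxy-variable problem described above.
# All figures are invented; nothing here is drawn from the AFST or county data.
import random

random.seed(0)

def simulate_family(uses_public_systems: bool) -> dict:
    """One synthetic family. True risk is identical across groups;
    only visibility to government systems differs."""
    true_risk = random.random() < 0.05          # same 5% base rate for everyone
    # Families who rely on public systems generate records; others mostly don't.
    record_prob = 0.8 if uses_public_systems else 0.1
    recorded_contacts = sum(random.random() < record_prob for _ in range(5))
    return {"true_risk": true_risk, "recorded_contacts": recorded_contacts}

def score(family: dict) -> float:
    """A stand-in 'risk score' that, like any model built on administrative
    data, can only see what the data records: contact with public systems."""
    return family["recorded_contacts"] / 5

public = [simulate_family(True) for _ in range(10_000)]
private = [simulate_family(False) for _ in range(10_000)]

flag_rate = lambda fams: sum(score(f) >= 0.6 for f in fams) / len(fams)
print(f"Flagged, public-system families: {flag_rate(public):.1%}")
print(f"Flagged, privately insured:      {flag_rate(private):.1%}")
# Both groups share the same 5% underlying risk, yet the flag rates diverge
# sharply, because the score measures visibility to government, not risk.
```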
Common Threads
The specifics vary across the failure record—criminal justice, policing, benefits, chatbots—but the same institutional failures appear in every case.
No system was adequately tested before deployment. COMPAS wasn’t tested for racial bias. Facial recognition was used for identification despite documented accuracy disparities. Benefits algorithms were implemented without piloting on representative populations.
Once deployed, no system was adequately monitored. LAPD’s PredPol ran for six years without independent effectiveness evaluation. Benefits cuts didn’t trigger reviews. Problems accumulated until they became crises.
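One concrete, if simplified, version of adequate monitoring is a recurring disparity check on decisions after deployment. The sketch below is a generic illustration under assumed field names ("group", "flagged") and an arbitrary alert threshold; it is not how any of the systems above were audited, and a flagged gap is a prompt for human review, not proof of bias.

```python
# A minimal sketch of a recurring post-deployment check: compare flag rates
# across groups and surface large gaps for human review. Field names and the
# alert threshold are illustrative assumptions, not a standard.
from collections import defaultdict

def disparity_report(decisions, group_field="group", flagged_field="flagged",
                     alert_ratio=1.25):
    """decisions: iterable of dicts, one per case.
    Returns per-group flag rates and whether the gap warrants review."""
    totals, flags = defaultdict(int), defaultdict(int)
    for d in decisions:
        totals[d[group_field]] += 1
        flags[d[group_field]] += bool(d[flagged_field])
    rates = {g: flags[g] / totals[g] for g in totals}
    if len(rates) < 2:
        return rates, False
    ratio = max(rates.values()) / max(min(rates.values()), 1e-9)
    return rates, ratio >= alert_ratio

# Example: a small export of decisions, reviewed quarterly by people.
sample = [{"group": "A", "flagged": True}, {"group": "A", "flagged": False},
          {"group": "B", "flagged": True}, {"group": "B", "flagged": True}]
rates, needs_review = disparity_report(sample)
print(rates, needs_review)
```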
In nearly every case, the affected people—and often the government officials responsible—couldn’t explain how the AI reached its decisions. Trade secrets, technical complexity, inadequate documentation. And when things went wrong, accountability was diffuse: vendors blamed government implementation, governments blamed vendor products, workers blamed policies they didn’t create. No one was clearly responsible.
The Ethical Questions That Arrive as Operational Decisions
The failure record isn’t just a governance problem. It’s an ethical one. And the ethical questions it raises aren’t abstract—they will arrive in a council chamber or a department head’s office, dressed as operational choices.
When does efficiency come at the cost of dignity? Indiana’s welfare automation increased processing speed. It also meant that eligible people lost food assistance because a system they couldn’t understand rejected them for technical errors. The efficiency was real. So was the indignity. Every AI deployment that touches residents carries this question: is the time we save worth the experience we impose?
What do communities owe people when their system is wrong? When a government learns its AI system produces disparate outcomes, the ethical obligation isn’t just to fix the system. It’s to account for the harm already done.
Should you deploy AI that works for most but fails for some? In every documented failure, the costs fell disproportionately on people who were already vulnerable. That pattern isn’t coincidental. It’s structural.
What This Means for Communities Considering AI
The right response to this record isn’t to conclude that AI is too risky for government use. It’s to recognize that AI requires governance to use safely—and that many early implementations lacked adequate governance.
The knowledge barrier documented in survey after survey likely reflects practitioners recognizing they don’t yet know how to govern AI safely. That’s a reasonable assessment given this record. Leaders aren’t wrong to be uncertain.
But uncertainty is not a permanent condition. Communities can build governance capacity before deploying high-stakes applications. They can start where consequences are manageable, build oversight infrastructure, and advance carefully. Renée Cummings puts it directly: “These weren’t technical glitches. They were devastating design decisions and bad algorithmic policy that punished the poor, the marginalized, the historically underserved.”
The question every local government should answer before deploying AI for any consequential decision: what happens when this system is wrong, and who bears the cost?
Alton Henley is the author of The Knowledge Barrier: AI Adoption in American Local Government and Ready or Not: A City Manager’s Guide to AI in Local Government.