The Polling Imperilment

Author: Rick Perlstein
Source: The American Prospect

In 2016, I experienced the desolation of my candidate for president losing after the most respected polling experts told me she had a 71.4 percent, 85 percent, 98.2 percent, and even 99 percent chance of winning. As a historian, I was studying how Ronald Reagan’s runaway landslide in 1980 was preceded by every pollster but one proclaiming, supremely confident, that the race was just about tied. I’ve just finished reading a fine book published in 2020 that confirms an intuition I’ve been chewing on since then. It turns out this is practically the historical norm. W. Joseph Campbell’s Lost in a Gallup: Polling Failure in U.S. Presidential Elections demonstrates—for the first time, strangely enough, given the robust persuasiveness of its conclusions—that presidential polls are almost always wrong, consistently, in deeply patterned ways.

Unusually for any historical narrative, the pattern has remained almost unchanged for a good hundred years. First, someone comes forth with some new means of measuring how people will vote for president, and gets it so right it feels like magic. That was the accomplishment of a magazine called The Literary Digest between 1924 and 1932. They sent as many sample ballots as existing technological infrastructure would allow—in 1932, some 20 million—on postcards that doubled as subscription ads. Then, with the greatest care, they counted the ones that came back. For three straight elections, they got it so right the Raleigh News and Observer half-joked that it “would save millions in money and time” to “quit holding elections and accept the Digest’s poll as final.”

In 2008, that was the accomplishment of Nate Silver, who called 49 out of 50 states; in 2012, he notched 50 for 50, scored a best-selling book, and reportedly accounted in the run-up to the election for 20 percent of the traffic for his new employer, The New York Times.

In part two of the cycle, yesterday’s miracle suffers a spectacular failure—as in the poll-crazy year of 1936, when modern political polling was invented by the triumvirate of George Gallup, Elmo Roper, and Archibald Crossley, who all called it for Roosevelt over Alf Landon, while the Digest gave him only 41 percent of the popular vote. Their technical revolution (directly querying a representative sample of the electorate) seemed so obvious in retrospect, you wonder how nobody thought of it before. The same goes for Silver’s model of aggregating existing state polls, then evaluating and weighting them for accuracy.

They’re cocky about it; that’s a pattern, too. That’s what tends to precede their most spectacular failures.

In early September of 1948, Elmo Roper announced that he wouldn’t publish further results, because “the outcome is settled.” Archibald Crossley vowed to stop counting because “there had been little late shift in 1936, 1940, and 1944.” Just like in 1928, people asked why we should even bother having an election. So confident were the experts that the famous Chicago Daily Tribune early-edition headline “DEWEY DEFEATS TRUMAN” was only one of many. A German newspaper even described what it claimed was a raucous celebration of Dewey’s victory in Times Square.

Kind of like in 2016, when reporters saw Clinton associates popping champagne corks on Election Day in the campaign plane.

POLLSTERS NEXT DO WHAT ONE WOULD EXPECT: They adjust their methods—but to fight the last war. What else can they do?

In 1952, the three famous pollsters, terrified that “another blunder like that of 1948 would just about finish them off,” as one newspaper put it, were so timid that all predicted a photo finish. A Wall Street Journal columnist complained that pollsters were acting “as coy as the Delphic Oracle (remembered in history for its skill in framing answers which would be right no matter what happened).” Ultimately, Dwight D. Eisenhower scored a nationwide blowout.

George Gallup, whom Time had just deemed the “Babe Ruth of the polling profession”—oops!—gave as his alibi, “No scientific method is known today which can accurately predetermine the voting intentions of people who are … undecided.” Nate Silver offered the same truism 67 years later: “There’s not much a pollster can do when a voter hasn’t made up her mind.” But you have to try something. So Gallup weighted the 13 percent of his last 1952 sample who hadn’t yet made up their minds as going 3-to-1 for the Democrat, as they had in 1948. But this time, they mostly went for the Republican. Oops again.
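
To see how much leverage that single assumption carries, here is a minimal sketch in Python. The decided split is hypothetical; only the 13 percent undecided figure and the 3-to-1 allocation come from the episode above.

```python
# A minimal sketch with hypothetical numbers: how allocating undecided
# voters by a fixed historical ratio can swing a poll's topline.

def allocate_undecided(dem, rep, undecided, dem_share):
    """Fold undecided voters into the topline at an assumed split."""
    return dem + undecided * dem_share, rep + undecided * (1 - dem_share)

dem, rep, und = 45.0, 42.0, 13.0  # hypothetical decided split; 13% undecided

# Gallup's 1948-based assumption: undecideds break 3-to-1 Democratic.
print(allocate_undecided(dem, rep, und, 0.75))  # (54.75, 45.25): Democratic blowout

# If they break 3-to-1 Republican instead, as they did in 1952, the race flips.
print(allocate_undecided(dem, rep, und, 0.25))  # (48.25, 51.75): Republican win
```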

That error opens up onto the myriad conceptual fallacies built into the entire enterprise, if something so unavoidable can be called an “error.” Past performance is no guarantee of future results; but past performance is all a pollster has to go on. That’s why much of the process of choosing and weighting samples is … well, you can call it “more art than science.” Or you can call it “intuitive.” Or you can call it “trial and error.” But you can also call it “made up.” 

The electorate, Campbell observes, is “a self-selecting, ephemeral population that takes shape only when the time comes to vote.” To model an electorate by polling individuals, you have to measure how “likely” or “unlikely” that respondent is to vote. In 1949, Arch Crossley called it “the great question we have not answered.” In 2016, Pew released a study explaining voter likelihood, as The Atlantic summarized, as “a vexing bit of psychological prediction pollsters have never gotten quite right.”

They try by sifting voters into categories: male or female, young or old, religious or not. That latter one makes for a possible explanation for the debacle of 1980: Evangelical Christians went from being one of the least active categories of voters to pretty active in 1976, when Jimmy Carter, someone they considered one of their own, ran. But how many of them would vote in 1980, after their leaders threw Carter over for his alleged liberal heresies? With such a small “n” (in social science terms) to work with, it was no more scientific than throwing darts at a board blindfolded.

It’s always something. In 1966, when Reagan ran for governor of California, he outperformed the polls, apparently because many who voted for him were ashamed to tell a stranger they chose an actor who had been labeled an extremist. How should pollsters have weighted the “shy Reagan effect” in 1980? Should they have conjured up a revised weighting in 1984, perhaps one that ran the other way, given that Republicans were succeeding in those years at making voters feel shy about their liberalism?

You could go either way. But you won’t know whether you were right until after the election—when all a pollster can do about it is fight the last war next time.

Many pollsters’ decisions about methodology are by necessity subjective, even arbitrary. Campbell lists a quick half-dozen: how they list a candidate’s job title; the order in which the choices are stated; the gender of the interviewer; whether it’s done by phone, internet, or in person; even the day of the week. The pollsters can likewise be arbitrary once the numbers come in. Lost in a Gallup notes a fascinating experiment carried out by Nate Cohn for The New York Times. He had four pollsters interpret the same raw data from a 2016 poll of Florida. Their choices in how to weight ranged from Clinton winning by four percentage points to Trump winning by one.

Cohn concluded, “Clearly, the reported margin of error due to sampling … doesn’t even come close to capturing total survey error … There really is a lot of flexibility for pollsters to make choices that generate a fundamentally different result.”
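
What those four pollsters were doing can be caricatured in a few lines of Python. Everything here is hypothetical, and a single invented choice, how college-heavy the electorate will be, stands in for the many judgment calls real pollsters make.

```python
# A toy version of Cohn's experiment: the same raw responses, weighted
# under two defensible turnout assumptions, produce different winners.
# All numbers are hypothetical, for illustration only.

raw = [
    # (group, pct_clinton, pct_trump) among respondents in that group
    ("college",    55, 38),
    ("noncollege", 41, 50),
]

def topline(weights):
    """Weight each group's support by its assumed share of the electorate."""
    clinton = sum(w * c for (_, c, _), w in zip(raw, weights))
    trump = sum(w * t for (_, _, t), w in zip(raw, weights))
    return round(clinton, 1), round(trump, 1)

# Pollster A assumes a college-heavy electorate; Pollster B the reverse.
print(topline([0.55, 0.45]))  # (48.7, 43.4): Clinton by roughly 5
print(topline([0.30, 0.70]))  # (45.2, 46.4): Trump by roughly 1
```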

POLLSTERS TEND NOT TO INTERPRET THIS all as a spur to humility. Reading Campbell’s book, I found myself creating a section of my notes headed “Assholes.” Like George Gallup in ’48 giving the excuse that his mistakes were his audience’s fault: “Most laymen see no difference between forecasting an election and picking the winner of a horse race. In due time these people will be educated to the difference.” Or John Zogby in 2004, when he had joined the herd who said John Kerry had it in the bag. This was so taken for granted that on Election Day, senior adviser Bob Shrum said to Kerry, “May I be the first to call you Mr. President?” When this proved wrong, Zogby whined, “I don’t know that anyone was hospitalized over my prediction.”

The spin’s the thing. Admitting the enterprise’s fallibility is bad for business.

And make no mistake, this is a business. Sometimes, that drives pollsters’ herdlike caution, where everyone ends up making the same kind of mistake, as in 1996, when CBS/New York Times, Pew, Harris, and ABC/Washington Post all tipped it to Clinton by margins ranging from 11 to 18 points (he won by 8.5). Sometimes attempts at market-driven product differentiation create a temptation that sends things off the rails. According to a 1976 exposé of the polling industry called Lies, Damn Lies, and Statistics, Louis Harris’s frustration at being only the “second best-known pollster” grated on him so much that he made “mistakes of judgement in efforts to scoop Gallup.” Like when he published a poll for the Sunday papers before the 1968 election that showed Humphrey passing Nixon on the home straight by four, though he had trailed the whole campaign. Plot twists sell, after all.

In 2000, Gallup’s own bid for product differentiation was a daily tracking poll. It was advertised as “a continually changing portrait of where the American public stands.” Continually change it did—over three days in early October, from Bush +11 to Gore +7.

It’s a good example of how blithely pollsters can invent a reality they purport to describe. Those numbers could only ever be a statistical artifact of the reality that the more undecided or “no opinion” voters there are, the less predictive a poll can be. Polling so closely was inherently misleading. But the implication readers took away was that the electorate was fantastically volatile. Which at the very least makes for a more entertaining horse race. “I would love to be tracking the election that Gallup is tracking,” one more responsible practitioner lamented. “It’s a lot more interesting than the one I’m looking at.”

Another consequence of the capitalist imperatives of the polling biz is a little bit horrifying. Since 1936, pollsters have saved money by stopping their counts days or even weeks before an election. The pollsters who got 1980 wrong, for example, had all stopped before they could measure the game-changing consequences of that year’s only debate, held the Tuesday before the election.

It was a money thing. In Jack Germond and Jules Witcover’s Blue Smoke and Mirrors: How Reagan Won and Why Carter Lost the Election of 1980, you can read the classic scene in which Pat Caddell breaks the bad news to the president on Air Force One that he’s about to suffer a landslide loss. Caddell knows this thanks to their record $2 million polling budget, which let him survey right up to the end. The voters waiting for Carter on the tarmac in Georgia, on the other hand, were lost in their Gallup: They presumed the election was tied.

This should be an imperishable lesson. Except, in 2016—there you go again—Wisconsin’s “benchmark” state poll, run by the Marquette University Law School, stopped contacting voters nine days early, notched Hillary Clinton nine points ahead, then ate proverbial crow when Donald Trump won that pivotal battleground state.

THE PROBLEM OF THE MYRIAD STATE POLLS brings us to Nate Silver and his epigones. Silver’s oft-imitated method, as Campbell summarizes it, is “to assess and aggregate national and state-level polls, then crank them through a statistical model that considers past performance of the polls and the rigor of their survey methodology … among other variables.” The idea, as in an insurance risk pool, is that with a big enough mega-sample, the good cancels out the bad.
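
As a rough illustration of those mechanics, here is a minimal sketch under stated assumptions: the polls, the past-performance grades, and the simple grade-weighted average are all invented, and real models also adjust for recency, sample size, and house effects.

```python
# A minimal sketch of poll aggregation: average many polls' margins,
# weighting each by an accuracy grade derived from past performance.
# The polls, margins, and grades below are hypothetical.

polls = [
    # (pollster, margin for candidate X, past-performance grade in [0, 1])
    ("Poll A", +3.0, 0.9),
    ("Poll B", +1.0, 0.6),
    ("Poll C", -2.0, 0.3),
]

def aggregate(polls):
    """Grade-weighted average of poll margins."""
    total_grade = sum(grade for _, _, grade in polls)
    return sum(margin * grade for _, margin, grade in polls) / total_grade

print(round(aggregate(polls), 2))  # 1.5: candidate X up 1.5 in the mega-sample
```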

But an aggregator can only be as good as the polls he aggregates—and as we’ve seen, bad predictions often come in herds.

He can also only be as good as how soundly he weights them according to past performance. But of course, the performance of that Marquette University poll had been unimpeachable, until it wasn’t; as had been the Literary Digest poll’s in the 1920s; as had been the pre-1948 triumvirate’s.

Life can only be understood backward, but it must be lived forward. Subjective and arbitrary decisions must therefore be made by aggregators, just as much as by traditional pollsters—if not more so.

There is a young political analyst named Joshua Cohen, whom I admire very much for grasping, foregrounding, and skillfully applying the necessarily multifaceted tools a responsible political prognosticator must use. On his Substack, he published a devastating two-part critique of Silver that rigorously documents how atrocious Silver’s judgment can be in making these decisions. There is a polling organization called the Trafalgar Group that functions like a propaganda outfit, publishing Republican-leaning “shock polls” for media attention. Trafalgar got lucky in 2020 when other, more responsible pollsters happened to undercount eventual Republican strength: That meant, like a blind squirrel, Trafalgar was the only one that was “right.”

So Silver graded them an A- for reliability. Even though their principal, one Robert Cahaly, is an advocate of the Big Lie. Silver then denied that they “always” lean in the Republican direction, because, after all, they only started in 2016.

Cohen argues that Silver hasn’t had a truly successful election since 2012. But boy, can he spin. In fact, when it comes to petulant pollster alibis, the former baseball statistician truly is the field’s Babe Ruth.

Clinton-Trump 2016 was supposed to be the Year of Silver. But it started with a demonstration of his doofishness. Seeking a scientific-seeming method in order, for the first time, to FiveThirtyEight a primary process, he hit upon counting endorsements. Using this method, one of his staffers, Harry Enten, gave Donald Trump a “negative 10 percent” chance of the nomination. Nonetheless, by general-election time Silver-mania was in full effect, joined in the field by any number of aggregate-building imitators—for with aggregating, this whole polling problem had really been licked.

The one at HuffPost awarded Clinton a 99 percent chance of winning. The one at Princeton was run by a neuroscientist named Sam Wang who said he would eat a bug on live TV if Trump won. (He did.)

As for Silver himself, he blithely parried critics by observing that, well, a 71.4 percent chance of Clinton means a 28.6 percent chance of Trump. So was he even actually wrong?

To be fair, all the big presidential pollsters do this to greater or lesser degrees. Their never-wrongness, after all, is their value proposition. Spinning is part of the business model.

In 1952, George Gallup said that they wouldn’t be “predicting the winner without qualification.” Then, after predicting a tie that turned out to be an Eisenhower landslide, he took out a full-page ad in Editor & Publisher claiming he got it right on the nose—citing only his results for decided voters. His competitor Elmo Roper lied, claiming “that he made no forecast and never said the race was close.”

Likewise Silver. Cohen nails him dead to rights:

He was on the top of the world after the 2012 election, with everyone desperate to hear from the race’s second biggest winner on how he got it so right. He could have tempered their excitement, explaining the limits of his own role in his own forecasts, how he never technically made any calls, how much he relied on the collective polling industry getting it right. Instead, he played right into their mythical conception of him, taking full credit for “calls” as noncommittal as the 50.2% chance he gave for Obama to win Florida. There would never be a pained explanation as to why he didn’t technically get the election right, like how he explained after 2016 and 2022 that he didn’t get the election wrong. He was going all in, betting that he could fully sustain his new image as a clairvoyant mastermind.

THAT POLLS DO NOT PREDICT PRESIDENTIAL election outcomes any better now than they did a century ago is but one conclusion of this remarkable history. A second conclusion lurks more in the background—but I think it is the most important one to absorb.

For most of the last century, the work was the subject of extraordinary ambivalence, even among pollsters. In 1948, George Gallup called presidential polling (as distinguished from issue polling, which has its own problems) “this Frankenstein.” In 1980, Burns Roper admitted that “our polling techniques have gotten more and more sophisticated, yet we seem to be missing more and more elections.” All along, conventional journalists made a remarkably consistent case that polls were empty calories that actively crowded out genuine civic engagement: “Instead of feeling the pulse of democracy,” as a 1949 critic put it, “Dr. Gallup listens to its baby talk.”

Critics rooted for polls to fail. Eric Sevareid, in 1964, recorded his “secret glee and relief when the polls go wrong,” which might restore “the mystery and suspense of human behavior eliminated by clinical dissection.” If they were always right, as James Reston picked up the plaint in 1970, “Who would vote?” Edward R. Murrow argued in 1952 that polling “contributed something to the dehumanization of society,” and was delighted, that year, when “the people surprised the pollsters … It restored to the individual, I suspect, some sense of his own sovereignty” over the “petty tyranny of those who assert that they can tell us what we think.”

Still and all, the practice grew like Topsy. There was an “extraordinary expansion” in polls for the 1980 election, including the first partnerships between polling and media organizations. The increase was accompanied by a measurable failure of quality, which gave birth to a new critique: news organizations “making their own news and flacking it as if it were an event over which they had no control.”

And so, after the 1980 debacle, high-minded observers began wondering whether presidential polls had “outlived their usefulness,” whether the priesthood would end up “defrocked.” In 1992, the popular columnist Mike Royko went further, proposing sabotage: Maybe if people just lied, pollsters would have to give up. In 2000, Alison Mitchell of The New York Times proposed a polling moratorium in the four weeks leading up to elections, noting the “numbing length … to which polling is consuming both politics and journalism.”

Instead, polling proliferated: a “relentless barrage,” the American Journalism Review complained, the media obsessing over each statistically insignificant blip. Then, something truly disturbing started happening: People stopped complaining.

A last gasp was 2008, when Arianna Huffington revived Royko’s call for sabotage, until, two years later, she acquired the aggregator Pollster.com and renamed it HuffPost Pollster. “Polling, whether we like it or not,” the former skeptic proclaimed, “is a big part of how we communicate about politics.”

And so it is.

Even as the resources devoted to every other kind of journalism have atrophied, poll-based political culture has overwhelmed us, crowding out all other ways of thinking about public life. Joshua Cohen tells the story of the time Silver, looking for a way to earn eyeballs between elections, considered making a model to predict congressional votes. But voters, he snidely remarked, “don’t care about bills being passed.”

Pollsters might not be able to tell us what we think about politics. But increasingly, they tell us how to think about politics—like them. Following polls has become our vision of what political participation is. Our therapy—headlines like the one on AlterNet last week, “Data Scientist Who Correctly Predicted 2020 Election Now Betting on ‘Landslide’ Harris Win.” Our political masochism: “Holy cow, did you hear about that Times poll?” “Don’t worry, I heard it’s an outlier …”

The Washington Post’s polling director once said, “There’s something addictive about polls and poll numbers.” He’s right. When we refer to “political junkies,” polls are pretty much the junk.

For some reason, I’ve been able to pretty much swear off the stuff, beyond mild indulgence. Maybe it’s my dime-store Buddhism. I try to stay in the present—and when it comes to the future, I try to stick with things I can do. Maybe I hereby offer myself as a role model?

Because I’m a “political expert,” friends, relatives, and even strangers are always asking me, “Who’s going to win?” I say I really have no idea. People are always a little shocked: Prediction has become what people think political expertise is for.

Afterward, the novelty of the response gets shrugged off, and we can talk. Beyond polling’s baby talk. About our common life together, about what we want to happen, and how we might make it so. But no predictions about whether this sort of thing might ever prevail. No predictions at all.

[Rick Perlstein is the author of a four-volume series on the history of America’s political and cultural divisions, and the rise of conservatism, from the 1950s to the election of Ronald Reagan. He lives in Chicago.]

Read the original article at Prospect.org.

Used with permission. © The American Prospect, Prospect.org, 2024. All rights reserved.



Source URL: https://portside.org/2024-10-03/polling-imperilment