How claims of voter fraud were supercharged by bad science



During the 2016 primary season, Trump campaign staffer Matt Braynard had an unusual political strategy. Instead of targeting Republican base voters—the ones who show up for every election—he focused on the intersection of two other groups: people who knew of Donald Trump, and people who had never voted in a primary before. These were both large groups. 

Because of his TV career and ability to court controversy, Trump was already a household name. Meanwhile, about half America’s potential voters, nearly 100 million people, don’t vote in presidential elections, let alone primaries. The overlap between the groups was significant. If Trump could mobilize even a small percentage of those people, he could clinch the nomination, and Braynard was willing to put in the work. 

His strategy, built from polls, research, and studies of voting behavior, focused on two goals in particular. The first was registering, engaging, educating, and turning out non-voters, the largest electoral bloc in the country and one that’s regularly ignored. One recent survey of 12,000 “chronic non-voters” suggests they receive “little to no attention in national political conversations” and remain “a mystery to many institutions.” 

One way to turn out potentially sympathetic voters would be to use a call center to remind them, which would also help with his second goal: to investigate and expose voter fraud. 

“If you’re trying to do systematic voter fraud, you’re going to look for people who haven’t or are not going to cast their ballot,” he told me in a recent interview, “because if you do cast a ballot for them and they do show up at the polling place, that’s going to set up a red flag.”

So the plan was that after the election, the call centers would contact a sample of the people in the state who had voted for the first time to confirm that they had actually cast a ballot. 

Not only was pursuing voter fraud popular with prospective donors, Braynard says, but it was also an endeavor supported by the academic literature. “I believe it’s been documented, at least scientifically in some peer-reviewed studies, that at least one senator in the last 10 years was elected by votes that aren’t legal ballots,” he says. 

A study like this does in fact exist, and it is peer-reviewed. In fact, it goes even further than Braynard remembers. Published in 2014 by Jesse Richman, a political science professor at Old Dominion University, it argues that illegal votes have played a major role in recent political outcomes. In 2008, Richman argued, “non-citizen votes” for Senate candidate Al Franken “likely gave Senate Democrats the pivotal 60th vote needed to overcome filibusters in order to pass health care reform.” 

The paper has become canonical among conservatives. Whenever you hear that 14% of non-citizens are registered to vote, this is where it came from. Many of today’s other claims of voter fraud—such as through mail-in voting—also trace back to this study. And it’s easy to see why it has taken root on the right: higher turnout in elections generally increases the number of Democratic voters, and so proof of massive voter fraud justifies voting restrictions that disproportionately affect them.

Academic research on voting behavior is often narrowly focused and heavily qualified, so Richman’s claim offered something exceedingly rare: near certainty that fraud was happening at a significant rate. According to his study, at least 38,000 ineligible voters—and perhaps as many as 2.8 million—cast ballots in the 2008 election, meaning the “blue wave” that put Obama in office and expanded the Democrats’ control over Congress would have been built on sand. For those who were fed up with margins of error, confidence intervals, and gray areas, Richman’s numbers were refreshing. They were also very wrong.

The data dilemma

If you want to study how, whether, and for whom people are going to vote, the first thing you need is voters to ask. Want to reach them by phone? Good luck calling landlines: very few people pick up. You might have a better chance with cell phones, but don’t expect much. 

Telephone surveys are “barge in” research, says Jay H. Leve, the CEO of SurveyUSA, a polling firm based in New Jersey. These phone polls, he says, happen at a time that’s convenient to the pollster, and “to hell with the respondent.” For that reason, the company aims to limit calls to four to six minutes, “before the respondent begins to feel like he or she is being abused.” Online surveys are preferable because respondents can complete them when they want, but it’s still hard to motivate people. That’s why many survey companies offer something in return for people’s opinions, typically points that can be exchanged for gift cards. 

Even if you’ve found participants, you want to make sure you’re asking good questions, says Stephen Ansolabehere, a government professor at Harvard. He is principal investigator of the Cooperative Congressional Election Study (CCES), a national survey of more than 50,000 people about demographics, general political attitudes, and voting intentions—and the data set used in Jesse Richman’s voter fraud study. It’s easy to generate bias in your results by wording your survey questions poorly, says Ansolabehere.

“We’ll try and be literal and give brief descriptions, and we generally don’t do things too adjectivally,” Ansolabehere says. But what about when the bill you’re asking about is called something inflammatory, like the “Pain-Capable Unborn Child Protection Act?” “We don’t use that title,” he says.

Another problem with opinion polling is that what somebody thinks doesn’t really matter if it’s not going to translate into a vote. That means you have to figure out who will actually show up to the polls. 

Here, demographic data is helpful. Women vote slightly more than men. White people vote more than people of color. Those 65 and older vote at rates roughly 50% higher than those 18 to 29, and advanced degree holders up to nearly three times as often as those without a high school diploma. 

However, even if you ladle on the enticements, some demographic groups are simply less likely to respond to survey requests, which means you’ll need to adjust the numbers coming out of your survey group. Most polling firms do this by amplifying the responses they get from underrepresented groups: a survey with a small sample of Hispanic voters, say, might weight their responses more heavily if trying to predict behavior in a battleground state like Arizona, where 24% of voters are Latino. 
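
This kind of adjustment can be sketched in a few lines. The numbers below are invented for illustration (they are not from any real poll): each respondent is weighted by the ratio of their group’s population share to its sample share, so responses from underrepresented groups count for more in the final estimate.

```python
# Minimal sketch of post-stratification weighting (invented numbers only).
# Weight = population share / sample share of the respondent's group.

population_share = {"latino": 0.24, "other": 0.76}   # e.g. Arizona's electorate

# (group, supports_candidate) pairs; only 2 of 8 respondents are Latino
sample = [
    ("latino", 1), ("latino", 1),
    ("other", 1), ("other", 0), ("other", 1),
    ("other", 0), ("other", 0), ("other", 0),
]

n = len(sample)
sample_share = {g: sum(1 for grp, _ in sample if grp == g) / n
                for g in population_share}
weights = {g: population_share[g] / sample_share[g] for g in population_share}

raw = sum(v for _, v in sample) / n                       # unweighted estimate
weighted = (sum(weights[g] * v for g, v in sample)
            / sum(weights[g] for g, _ in sample))         # weighted estimate
print(round(raw, 3), round(weighted, 3))                  # 0.5 0.493
```

Here the weighting nudges the estimate down because the oversampled Latino respondents happened to be more supportive than the rest; with a sample this small, a single respondent’s answers move the weighted figure noticeably, which is exactly the failure mode described next.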

But beware: this weighting can backfire.

One 2016 presidential poll conducted by the University of Southern California and the Los Angeles Times recruited 3,000 respondents from across America, including a young Black man living in the Midwest who turned out to be a Trump supporter. Because he represented several harder-to-reach categories—young, minority, male—his responses were dramatically over-indexed. This ended up throwing the numbers off: at one point the survey estimated Trump’s support among Black voters at 20%, largely on the basis of this one man’s responses. A post-election analysis put that number at 6%.

The media, grasping for certainty, missed the error margins of the study and reached for the headline figures that amplified these overweighted responses. As a result, the survey team— which had already made raw data, weighting schemes, and methodology public—stopped releasing sub-samples of their data to prevent their study being distorted again. Not all researchers are as concerned about potential misinterpretation of their work, however.

An academic controversy

Until Richman’s 2014 paper, the virtual consensus among academics was that non-citizen voting didn’t exist on any functional level. Then he and his coauthors examined CCES data and claimed that such voters could actually number several million. 

Richman asserted that the illegal votes of non-citizens had changed not only the pivotal 60th Senate vote but also the race for the White House. “It is likely though by no means certain that John McCain would have won North Carolina were it not for the votes for Obama cast by non-citizens,” the paper says. After its publication, Richman then wrote an article for the Washington Post with a similarly provocative headline that focused on the upcoming 2014 midterms: “Could non-citizens decide the November election?”

Unsurprisingly, conservatives ran with this new support for their old narrative and have continued to do so. The study’s fans include President Trump, who used it to justify the creation of his short-lived and failed commission on voter fraud, and whose claims about illegal voting are now a centerpiece of his campaign. 

But most other academics saw the study as an example of methodological failure. Ansolabehere, whose CCES data Richman relied on, coauthored a response to Richman’s work titled “The Perils of Cherry Picking Low-Frequency Events in Large Sample Sizes.” 

Stephen Ansolabehere.

For starters, he argued, the paper overweighted the non-citizens in the survey—just as the Black Midwestern voter was overweighted to produce an illusion of widespread Black support for Trump. This was especially problematic in Richman’s study, wrote Ansolabehere, when you consider the impact that a tiny number of people who were misclassified as non-citizens would have on the data. Some people, said Ansolabehere, had likely misidentified themselves as ineligible to vote in the 2008 study by mistake—perhaps out of sloppiness, misunderstanding, or just the rush to accumulate points for gift cards. Critically, nobody who had claimed to be a non-citizen in both the 2010 survey and the follow-up in 2012 had cast a validated vote.
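
Ansolabehere’s objection can be made concrete with a toy calculation (the numbers here are invented for illustration; they are not the actual CCES figures): when one group is vastly larger than the other, even a tiny rate of check-the-wrong-box error among the majority swamps the genuinely rare group.

```python
# Toy illustration of the "low-frequency events" problem (invented numbers,
# not the actual CCES figures): a tiny response-error rate in a huge
# majority group overwhelms a genuinely rare group.

n_citizens = 49_500        # respondents who really are citizens
n_noncitizens = 500        # respondents who really are non-citizens
error_rate = 0.001         # 0.1% of citizens tick the wrong citizenship box

false_noncitizens = n_citizens * error_rate        # 49.5 misclassified citizens
observed_noncitizens = n_noncitizens + false_noncitizens

# If citizens vote at 60% and real non-citizens at 0%, the misclassified
# citizens carry their turnout into the "non-citizen" bucket:
apparent_turnout = (false_noncitizens * 0.60) / observed_noncitizens
print(f"{apparent_turnout:.1%}")   # ~5% "non-citizen turnout" from error alone
```

In this sketch, roughly 5% apparent “non-citizen turnout” is manufactured entirely by response error, even though not a single real non-citizen voted.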

Nearly 200 social scientists echoed Ansolabehere’s concerns in an open letter, but for Harold Clarke, then editor of the journal that published Richman’s paper, the blowback was hypocritical. “If we were to condemn all the papers on voting behavior that have made claims about political participation based on survey data,” he says, “well, this paper is identical. There’s no difference whatsoever.” 

As it turns out, survey data does contain a lot of errors—not least because many people who say they voted are lying. In 2012, Ansolabehere and a colleague discovered that huge numbers of Americans were misreporting their voting activity. But it wasn’t the non-citizens, or even the people who were in Matt Braynard’s group of “low propensity” voters. 

Instead, found the researchers, “well-educated, high-income partisans who are engaged in public affairs, attend church regularly, and have lived in the community for a while are the kinds of people who misreport their vote experience” when they haven’t voted at all. Which is to say: “high-propensity” voters and people likely to lie about having voted look identical. Across surveys done over the telephone, online, and in person, about 15% of the electorate may represent these “misreporting voters.” 

Ansolabehere’s conclusion was a milestone, but it relied on something not every pollster has: money. For his research, he contracted with Catalist, a vendor that buys voter registration data from states, cleans it, and sells it to the Democratic Party and progressive groups. Using a proprietary algorithm and data from the CCES, the firm validated every self-reported claim of voting behavior by matching individual survey responses with the respondents’ voting record, their party registration, and the method by which they voted. This kind of effort is not just expensive (the Election Project, a voting information source run by a political science professor at the University of Florida, says the cost is roughly $130,000) but shrouded in mystery: third-party companies can set the terms they want, including confidentiality agreements that keep the information private.

In a response to the criticism of his paper, Richman admitted his numbers might be off. The estimate of 2.8 million non-citizen voters “is itself almost surely too high,” he wrote. “There is a 97.5% chance that the true value is lower.” 

Despite this admission, however, Richman continued to promote the claims.

In March of 2018, he was in a courtroom testifying that non-citizens are voting en masse. 

Kris Kobach, the Kansas secretary of state, was defending a law that required voters to prove their citizenship before registering to vote. Such voter ID laws are seen by many as a way to suppress legitimate votes, because many eligible voters—in this case, up to 35,000 Kansans—lack the required documents. To underscore the argument and prove that there was a genuine threat of non-citizen voting, Kobach’s team hired Richman as an expert witness. 

Kris Kobach.

Paid a total of $40,663.35 for his contribution, Richman used various sources to predict the number of non-citizens registered to vote in the state. One estimate, based on data from a Kansas county that was later proved to be inaccurate, put the number at 433. Another, extrapolated from CCES data, said it was 33,104. At the time, there were an estimated 115,000 adult residents in Kansas who were not American citizens—including green card holders and people on visas. By Richman’s calculations, that would mean nearly 30% of them were illegally registered to vote. Overall, his estimates ran from roughly 11,000 to 62,000. “We have a 95% confidence that the true value falls somewhere in that range,” he testified. 
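
The mechanics of that kind of extrapolation are simple, which is part of the problem. The sketch below, using invented sample numbers rather than Richman’s actual inputs, shows how a small survey proportion becomes a statewide count with a 95% confidence interval attached.

```python
import math

# Sketch of extrapolating a survey proportion to a statewide count with a
# 95% normal-approximation confidence interval. All inputs are invented
# for illustration; they are not Richman's actual figures.
sample_size = 300          # non-citizen respondents in a national survey
registered = 40            # of those, how many report being registered
population = 115_000       # adult non-citizens in the state

p = registered / sample_size
se = math.sqrt(p * (1 - p) / sample_size)          # standard error of p
low, high = p - 1.96 * se, p + 1.96 * se           # 95% interval for p

print(f"{population * low:,.0f} to {population * high:,.0f} registered")
```

The interval is only as good as its inputs: it quantifies sampling noise, but says nothing about misclassified respondents, and scaling a handful of survey answers up to a population of 115,000 magnifies every one of those errors.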

The judge ended up ruling that the Kansas law was unconstitutional. “All four of [Richman’s] estimates, taken individually or as a whole, are flawed,” she wrote in her opinion.

Unseen impact

One consequence of this unreliable data—from citizens who lie about their voting record to those who mistakenly misidentify themselves as non-citizens—is that it further diverts attention and resources from the voters who lie outside traditional polling groups.

“For the [low-propensity] crowd it is a vicious cycle,” wrote Matt Braynard in his internal memo for the Trump campaign. “They don’t get any voter contact love from the campaigns because they don’t vote, but they don’t vote because they don’t get any voter contact. It is a persistent state of disenfranchisement.” 

Campaigns focus on constituents who are likely to vote and likely to give money, says Allie Swatek, director of policy and research for the New York City Campaign Finance Board. She experienced this bias firsthand when she moved back to New York in time for the 2018 election. Though there were races for US Senate, governor, and the state legislature, “I received nothing in the mail,” she says. “And I was like, ‘Is this what it’s like when you have no voting history? Nobody reaches out to you?’” 

According to the Knight Foundation’s survey of non-voters, 39% reported that they’ve never been asked to vote—not by family, friends, teachers, political campaigns, or community organizations, nor at places of employment or worship. However, that may be changing. 

Stacey Abrams’ campaign for governor of Georgia targeted “low propensity” voters.

Braynard’s mobilization strategy played a role in the 2018 campaign for governor of Georgia by Democrat Stacey Abrams. She specifically targeted low-propensity voters, especially voters of color, and though she ultimately lost that race, more Black and Asian voters turned out that year than for the presidential race in 2016. “Any political scientist will tell you this is not something that happens,” wrote Abrams’s former campaign manager in a New York Times op-ed. “Ever.”

But even if campaigns and experts try to break these cycles—by cleaning their data, or by targeting non-voters—there’s a much more dangerous problem at the heart of election research: it is still susceptible to those operating in bad faith.

Backtracking claims

I asked Richman earlier this summer if we should trust the sort of wide-ranging numbers he gave in his study, or in his testimony in Kansas. No, he answered, not necessarily. “One challenge is that people want to know what the levels of non-citizen registration and voting are with a level of certainty that the data at hand doesn’t provide,” he wrote me in an email. 

In fact, Richman told me, he “ultimately agreed” with the judge in the Kansas case despite the fact that she called his evidence flawed. “On the one hand, I think that non-citizen voting happens, and that public policy responses need to be cognizant of that,” he told me. “On the other hand, that doesn’t mean every public policy response makes an appropriate trade-off between the various kinds of risk.” 

Behind the academic language, he’s saying essentially what every other expert on the subject has already said: fraud is possible, so how do we balance election security with accessibility? Unlike his peers, however, Richman reached that conclusion by first publishing a paper with alarmist findings, writing a newspaper article about it, and then testifying that non-citizen voting was rampant, maybe, despite later agreeing with the decision that concluded he was wrong.

Whatever Richman’s reasons for this, his work has helped buttress the avalanche of disinformation in this election cycle.

Throughout the 2020 election campaign, President Trump has continued to make repeated, unfounded claims that vote-by-mail is insecure and that millions of votes are being illegally cast. And last year, when a ballot harvesting scandal hit the Republican Party in North Carolina and forced a new election, one operative made an appearance on Fox News to accuse the left of encouraging an epidemic of voter fraud.

“The left is enthusiastic about embracing this technique in states like California,” he said. “Voter fraud’s been one of the left’s most reliable voter constituencies.” 

The speaker? Matt Braynard.

However, Braynard is unlike some voter fraud evangelists, for whom finding no evidence of fraud is simply more evidence of a vast conspiracy. He at least purports to be able to change his mind on the basis of new facts. This suggests that there may be a way out of this current situation, where we project our own assumptions onto the uncertainty inherent in voting behavior. 

After leaving the Trump campaign, he founded Look Ahead America, a nonprofit dedicated to turning out blue-collar and rural voters and to investigating voter fraud. As part of the group’s work, he and 25 other volunteers served as poll watchers in Virginia in 2017. 

The process wasn’t as transparent as he would’ve liked. He wasn’t allowed to look over poll workers’ shoulders, and there were no cameras to photograph voters as they cast their ballots. But even though he wasn’t absolutely certain that the election was clean, he was still confident enough to issue a press release the following day. 

“At least where we were present, the local election officials faithfully followed the lawful procedures,” LAA’s statement said. “We did observe a few occasions where polling staff could benefit from better education on the relatively recent voter ID laws. Nonetheless, they worked diligently to ensure the election laws were followed.”

Mike Cagney is testing the boundaries of the banking system for himself — and others



Founder Mike Cagney is always pushing the envelope, and investors love him for it. Not long after sexual harassment allegations prompted him to leave SoFi, the personal finance company that he cofounded in 2011, he raised $50 million for a new lending startup called Figure, which has since raised at least $225 million from investors and was valued a year ago at $1.2 billion.

Now, Cagney is trying to do something unprecedented with Figure, which says it uses a blockchain to more quickly facilitate home equity, mortgage refinance, and student and personal loan approvals. The company has applied for a national bank charter in the U.S. under which it would not take FDIC-insured deposits but could take uninsured deposits of more than $250,000 from accredited investors.

Why does it matter? The approach, as American Banker explains it, would bring regulatory benefits. As it reported earlier this week, “Because Figure Bank would not hold insured deposits, it would not be subject to the FDIC’s oversight. Similarly, the absence of insured deposits would prevent oversight by the Fed under the Bank Holding Company Act. That law imposes restrictions on non-banking activities and is widely thought to be a deal-breaker for tech companies where banking would be a sidelight.”

Indeed, if approved, Figure could pave the way for a lot of fintech startups — and other retail companies that want to wheel and deal lucrative financial products without the oversight of the Federal Reserve Board or the FDIC — to nab non-traditional bank charters.

As Michelle Alt, whose year-old financial advisory firm helped Figure with its application, tells AB: “This model, if it’s approved, wouldn’t be for everyone. A lot of would-be banks want to be banks specifically to have more resilient funding sources.” But if it’s successful, she adds, “a lot of people will be interested.”

One can only guess at what the ripple effects would be, though the Bank of Amazon wouldn’t surprise anyone who follows the company.

In the meantime, the strategy would seemingly be a high-stakes, high-reward development for a smaller outfit like Figure, which could operate far more freely than banks traditionally have, but also without a safety net for itself or its customers. The most glaring danger would be a bank run, wherein those accredited individuals who are today willing to lend money to the platform at high interest rates begin demanding their money back at the same time. (It happens.)

Either way, Cagney might find a receptive audience right now with Brian Brooks, a longtime Fannie Mae executive who served as Coinbase’s chief legal officer for two years before jumping this spring to the Office of the Comptroller of the Currency (OCC), an agency that ensures that national banks and federal savings associations operate in a safe and sound manner.

Brooks was made acting head of the agency in May and green-lit one of the first national charters to go to a fintech, Varo Money, this past summer. In late October, the OCC also granted SoFi preliminary, conditional approval over its own application for a national bank charter.

While Brooks isn’t commenting on speculation around Figure’s application, in July, during a Brookings Institution event, he reportedly commented about trade groups’ concerns over his efforts to grant fintechs and payments companies charters, saying: “I think the misunderstanding that some of these trade groups are operating under is that somehow this is going to trigger a lighter-touch charter with fewer obligations, and it’s going to make the playing field un-level . . . I think it’s just the opposite.”

Christopher Cole, executive vice president at the trade group Independent Community Bankers of America, doesn’t seem persuaded. Earlier this week, he expressed concern about Figure’s bank charter application to AB, saying he suspects that Brooks “wants to approve this quickly before he leaves office.”

Brooks’s days are surely numbered. Last month, he was nominated by President Donald Trump to a full five-year term leading the federal bank regulator and is currently awaiting Senate confirmation. The move — designed to slow down the incoming Biden administration — could be undone by President-elect Joe Biden, who can fire the comptroller of the currency at will and appoint an acting replacement to serve until his nominee is confirmed by the Senate.

Still, Cole’s suggestion is that Brooks still has enough time to figure out a path forward for Figure — and if its novel charter application is approved, and it stands up to legal challenges — a lot of other companies, too.

We read the paper that forced Timnit Gebru out of Google. Here’s what it says



On the evening of Wednesday, December 2, Timnit Gebru, the co-lead of Google’s ethical AI team, announced via Twitter that the company had forced her out. 

Gebru, a widely respected leader in AI ethics research, is known for coauthoring a groundbreaking paper that showed facial recognition to be less accurate at identifying women and people of color, which means its use can end up discriminating against them. She also cofounded the Black in AI affinity group, and champions diversity in the tech industry. The team she helped build at Google is one of the most diverse in AI, and includes many leading experts in their own right. Peers in the field envied it for producing critical work that often challenged mainstream AI practices.

A series of tweets, leaked emails, and media articles showed that Gebru’s exit was the culmination of a conflict over another paper she co-authored. Jeff Dean, the head of Google AI, told colleagues in an internal email (which he has since put online) that the paper “didn’t meet our bar for publication” and that Gebru had said she would resign unless Google met a number of conditions, which it was unwilling to meet. Gebru tweeted that she had asked to negotiate “a last date” for her employment after she got back from vacation. She was cut off from her corporate email account before her return.

Online, many other leaders in the field of AI ethics are arguing that the company pushed her out because of the inconvenient truths that she was uncovering about a core line of its research—and perhaps its bottom line. More than 1,400 Google staff and 1,900 other supporters have also signed a letter of protest.

Many details of the exact sequence of events that led up to Gebru’s departure are not yet clear; both she and Google have declined to comment beyond their posts on social media. But MIT Technology Review obtained a copy of the research paper from one of the co-authors, Emily M. Bender, a professor of computational linguistics at the University of Washington. Though Bender asked us not to publish the paper itself because the authors didn’t want such an early draft circulating online, it gives some insight into the questions Gebru and her colleagues were raising about AI that might be causing Google concern.

Titled “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” the paper lays out the risks of large language models—AIs trained on staggering amounts of text data. These have grown increasingly popular—and increasingly large—in the last three years. They are now extraordinarily good, under the right conditions, at producing what looks like convincing, meaningful new text—and sometimes at estimating meaning from language. But, says the introduction to the paper, “we ask whether enough thought has been put into the potential risks associated with developing them and strategies to mitigate these risks.”

The paper

The paper, which builds off the work of other researchers, presents the history of natural-language processing, an overview of four main risks of large language models, and suggestions for further research. Since the conflict with Google seems to be over the risks, we’ve focused on summarizing those here. 

Environmental and financial costs

Training large AI models consumes a lot of computer processing power, and hence a lot of electricity. Gebru and her coauthors refer to a 2019 paper from Emma Strubell and her collaborators on the carbon emissions and financial costs of large language models. It found that their energy consumption and carbon footprint have been exploding since 2017, as models have been fed more and more data.

Strubell’s study found that one language model with a particular type of “neural architecture search” (NAS) method would have produced the equivalent of 626,155 pounds (284 metric tons) of carbon dioxide—about the lifetime output of five average American cars. A version of Google’s language model, BERT, which underpins the company’s search engine, produced 1,438 pounds of CO2 equivalent in Strubell’s estimate—nearly the same as a roundtrip flight between New York City and San Francisco.

Gebru’s draft paper points out that the sheer resources required to build and sustain such large AI models means they tend to benefit wealthy organizations, while climate change hits marginalized communities hardest. “It is past time for researchers to prioritize energy efficiency and cost to reduce negative environmental impact and inequitable access to resources,” they write.

Massive data, inscrutable models

Large language models are also trained on exponentially increasing amounts of text. This means researchers have sought to collect all the data they can from the internet, so there’s a risk that racist, sexist, and otherwise abusive language ends up in the training data.

An AI model taught to view racist language as normal is obviously bad. The researchers, though, point out a couple of more subtle problems. One is that shifts in language play an important role in social change; the MeToo and Black Lives Matter movements, for example, have tried to establish a new anti-sexist and anti-racist vocabulary. An AI model trained on vast swaths of the internet won’t be attuned to the nuances of this vocabulary and won’t produce or interpret language in line with these new cultural norms.

It will also fail to capture the language and the norms of countries and peoples that have less access to the internet and thus a smaller linguistic footprint online. The result is that AI-generated language will be homogenized, reflecting the practices of the richest countries and communities.

Moreover, because the training datasets are so large, it’s hard to audit them to check for these embedded biases. “A methodology that relies on datasets too large to document is therefore inherently risky,” the researchers conclude. “While documentation allows for potential accountability, […] undocumented training data perpetuates harm without recourse.”

Research opportunity costs

The researchers summarize the third challenge as the risk of “misdirected research effort.” Though most AI researchers acknowledge that large language models don’t actually understand language and are merely excellent at manipulating it, Big Tech can make money from models that manipulate language more accurately, so it keeps investing in them. “This research effort brings with it an opportunity cost,” Gebru and her colleagues write. Not as much effort goes into working on AI models that might achieve understanding, or that achieve good results with smaller, more carefully curated datasets (and thus also use less energy).

Illusions of meaning

The final problem with large language models, the researchers say, is that because they’re so good at mimicking real human language, it’s easy to use them to fool people. There have been a few high-profile cases, such as the college student who churned out AI-generated self-help and productivity advice on a blog, which went viral.

The dangers are obvious: AI models could be used to generate misinformation about an election or the covid-19 pandemic, for instance. They can also go wrong inadvertently when used for machine translation. The researchers bring up an example: In 2017, Facebook mistranslated a Palestinian man’s post, which said “good morning” in Arabic, as “attack them” in Hebrew, leading to his arrest.

Why it matters

Gebru and Bender’s paper has six co-authors, four of whom are Google researchers. Bender asked to avoid disclosing their names for fear of repercussions. (Bender, by contrast, is a tenured professor: “I think this is underscoring the value of academic freedom,” she says.)

The paper’s goal, Bender says, was to take stock of the landscape of current research in natural-language processing. “We are working at a scale where the people building the things can’t actually get their arms around the data,” she said. “And because the upsides are so obvious, it’s particularly important to step back and ask ourselves, what are the possible downsides? … How do we get the benefits of this while mitigating the risk?”

In his internal email, Dean, the Google AI head, said one reason the paper “didn’t meet our bar” was that it “ignored too much relevant research.” Specifically, he said it didn’t mention more recent work on how to make large language models more energy-efficient and mitigate problems of bias. 

However, the six collaborators drew on a broad body of scholarship. The paper’s citation list, with 128 references, is notably long. “It’s the sort of work that no individual or even pair of authors can pull off,” Bender said. “It really required this collaboration.”

The version of the paper we saw also nods to several research efforts on reducing the size and computational costs of large language models, and on measuring the embedded bias of models. It argues, however, that these efforts have not been enough. “I’m very open to seeing what other references we ought to be including,” Bender said.

Nicolas Le Roux, a Google AI researcher in the Montreal office, later noted on Twitter that the reasoning in Dean’s email was unusual. “My submissions were always checked for disclosure of sensitive material, never for the quality of the literature review,” he said.

Dean’s email also says that Gebru and her colleagues gave Google AI only a day for an internal review of the paper before they submitted it to a conference for publication. He wrote that “our aim is to rival peer-reviewed journals in terms of the rigor and thoughtfulness in how we review research before publication.”

Bender noted that even so, the conference would still put the paper through a substantial review process: “Scholarship is always a conversation and always a work in progress,” she said. 

Others, including William Fitzgerald, a former Google PR manager, have further cast doubt on Dean’s claim.

Google pioneered much of the foundational research that has since led to the recent explosion in large language models. Google researchers invented the Transformer architecture in 2017; it serves as the basis for the company’s later model BERT, as well as OpenAI’s GPT-2 and GPT-3. BERT, as noted above, now also powers Google search, the company’s cash cow.

Bender worries that Google’s actions could create “a chilling effect” on future AI ethics research. Many of the top experts in AI ethics work at large tech companies because that is where the money is. “That has been beneficial in many ways,” she says. “But we end up with an ecosystem that maybe has incentives that are not the very best ones for the progress of science for the world.”



Daily Crunch: Slack and Salesforce execs explain their big acquisition

We learn more about Slack’s future, Revolut adds new payment features and DoorDash pushes its IPO range upward. This is your Daily Crunch for December 4, 2020.

The big story: Slack and Salesforce execs explain their big acquisition

After Salesforce announced this week that it’s acquiring Slack for $27.7 billion, Ron Miller spoke to Slack CEO Stewart Butterfield and Salesforce President and COO Bret Taylor to learn more about the deal.

Butterfield claimed that Slack will remain relatively independent within Salesforce, allowing the team to “do more of what we were already doing.” He also insisted that all the talk about competing with Microsoft Teams is “overblown.”

“The challenge for us was the narrative,” Butterfield said. “They’re just good [at] PR or something that I couldn’t figure out.”

Startups, funding and venture capital

Revolut lets businesses accept online payments — With this move, the company is competing directly with Stripe, Adyen, Braintree and others.

Health tech venture firm OTV closes new $170M fund and expands into Asia — This year, the firm led rounds in telehealth platforms TytoCare and Lemonaid Health.

Zephr raises $8M to help news publishers grow subscription revenue — The startup’s customers already include publishers like McClatchy, News Corp Australia, Dennis Publishing and PEI Media.

Advice and analysis from Extra Crunch

DoorDash amps its IPO range ahead of a blockbuster debut — The food delivery unicorn now expects to debut at $90 to $95 per share, up from a previous range of $75 to $85.

Enter new markets and embrace a distributed workforce to grow during a pandemic — Is this the right time to expand overseas?

Three ways the pandemic is transforming tech spending — All companies are digital product companies now.

(Extra Crunch is our membership program, which aims to democratize information about startups. You can sign up here.)

Everything else

WH’s AI EO is BS — Devin Coldewey is not impressed by the White House’s new executive order on artificial intelligence.

China’s internet regulator takes aim at forced data collection — China is a step closer to cracking down on unscrupulous data collection by app developers.

Gift Guide: Games on every platform to get you through the long, COVID winter — It’s a great time to be a gamer.

The Daily Crunch is TechCrunch’s roundup of our biggest and most important stories. If you’d like to get this delivered to your inbox every day at around 3pm Pacific, you can subscribe here.
