Google said a year ago it would stop its computers from scanning the inboxes of Gmail users for information to personalize advertisements, saying it wanted users to “remain confident that Google will keep privacy and security paramount.”
But the internet giant continues to let hundreds of outside software developers scan the inboxes of millions of Gmail users who signed up for email-based services offering shopping price comparisons, automated travel-itinerary planners or other tools. Google does little to police those developers, who train their computers—and, in some cases, employees—to read their users’ emails, a Wall Street Journal examination has found.
One of those companies is Return Path Inc., which collects data for marketers by scanning the inboxes of more than two million people who have signed up for one of the free apps in Return Path’s partner network using a Gmail, Microsoft Corp. or Yahoo email address. Computers normally do the scanning, analyzing about 100 million emails a day. At one point about two years ago, Return Path employees read about 8,000 unredacted emails to help train the company’s software, people familiar with the episode say.
In another case, employees of Edison Software, another Gmail developer that makes a mobile app for reading and organizing email, personally reviewed the emails of hundreds of users to build a new feature, says Mikael Berner, the company’s CEO.
Letting employees read user emails has become “common practice” for companies that collect this type of data, says Thede Loder, the former chief technology officer at eDataSource Inc., a rival to Return Path. He says engineers at eDataSource occasionally reviewed emails when building and improving software algorithms.
“Some people might consider that to be a dirty secret,” says Mr. Loder. “It’s kind of reality.”
Neither Return Path nor Edison asked users specifically whether it could read their emails. Both companies say the practice is covered by their user agreements, and that they used strict protocols for the employees who read emails. eDataSource says it previously allowed employees to read some email data but recently ended that practice to better protect user privacy.
Google, a unit of Alphabet Inc., says it provides data only to outside developers it has vetted and to whom users have explicitly granted permission to access email. Google’s own employees read emails only “in very specific cases where you ask us to and give consent, or where we need to for security purposes, such as investigating a bug or abuse,” the company said in a written statement.
This examination of email data privacy is based on interviews with more than two dozen current and former employees of email app makers and data companies. The latitude outside developers have in handling user data shows how even as Google and other tech giants have touted efforts to tighten privacy, they have left the door open to others with different oversight practices.
Facebook Inc. for years let outside developers gain access to its users’ data. That practice, which Facebook has said it stopped by 2015, spawned a scandal when the social-media giant this year said it suspected one developer of selling data on tens of millions of users to a research firm with ties to President Donald Trump’s 2016 campaign. The episode led to renewed scrutiny from lawmakers and regulators in the U.S. and Europe over how internet companies protect user information.
There is no indication that Return Path, Edison or other developers of Gmail add-ons have misused data in that fashion. Nevertheless, privacy advocates and many tech industry executives say opening access to email data risks similar leaks.
For companies that want data for marketing and other purposes, tapping into email is attractive because it contains shopping histories, travel itineraries, financial records and personal communications. Data-mining companies commonly use free apps and services to hook users into giving up access to their inboxes without clearly stating what data they collect and what they are doing with it, according to current and former employees of these companies.
Gmail is especially valuable as the world’s dominant email service, with 1.4 billion users. Nearly two-thirds of all active email users globally have a Gmail account, according to comScore, and Gmail has more users than the next 25 largest email providers combined. The data miners generally have access to other email services besides Gmail, including those from Microsoft and Verizon Communications Inc.’s Oath unit, formed after the company acquired email pioneer Yahoo. Those are the next two largest email providers, according to comScore.
Google’s developer agreement prohibits exposing a user’s private data to anyone else “without explicit opt-in consent from that user.” Its rules also bar app developers from making permanent copies of user data and storing them in a database.
Developers say Google does little to enforce those policies. “I have not seen any evidence of human review” by Google employees, says Zvi Band, the co-founder of Contactually, an email app for real-estate agents. He says Contactually has never had employees review emails with their own eyes.
Google said it manually reviews every developer and application requesting access to Gmail. The company checks the domain name of the sender to look for anyone who has a history of abusing Google policies, and reads the privacy policies to make sure they are clear. “If we ever run into areas where disclosures and practices are unclear, Google takes quick action with the developer,” a spokesman said.
Google says it lets any user revoke access to apps at any point. Business users of Gmail can also restrict access to certain email apps to the employees in their organization, the company said, “ensuring that only apps that have been vetted and are trusted by their organization are used.”
Google has contended with privacy concerns since it launched Gmail in 2004. The company’s software scanned email messages and sold ads across the top of inboxes related to their content. That year, 31 privacy and consumer groups sent a letter to Google co-founders Larry Page and Sergey Brin saying the practice “violates the implicit trust of an email service provider.” Google responded that other email providers were already using computers to scan email to protect against spam and hackers, and that showing ads helped offset the cost of its free service.
While some users complained the ads were creepy, people signed up for Gmail in droves.
In 2014, Google said it would stop scanning Gmail inboxes of student, business and government users. In June of last year, it said it was halting all Gmail scanning for ads.
Meanwhile, Google in 2014 started promoting Gmail as a platform for developers to leverage the contents of users’ email to develop apps for such productivity tasks as scheduling meetings. A new Gmail version launched this spring adds a link next to inboxes to a curated menu of 34 add-ons, including one that offers to track users’ outgoing emails to report whether recipients open them.
Google says apps make Gmail more useful. Turning Gmail into a platform emulates Microsoft’s Windows and Apple Inc.’s iPhone, which attracted outside developers to make their software more useful to corporate users.
Google doesn’t disclose how many apps have access to Gmail. The total number of email apps in the top two mobile app stores, for Apple’s iOS and Android, jumped to 379 last year, from 142 five years earlier, according to researcher App Annie. Most can link to Gmail and other major providers.
Almost anyone can build an app that connects to Gmail accounts using Google’s software called an application programming interface, or API. When Gmail users open one of these apps, they are shown a button asking permission to access their inbox. If they click it, Google grants the developer a key to access the entire contents of their inbox, including the ability to read the contents of messages and send and delete individual messages on their behalf. Microsoft also offers API tools for email.
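For readers curious what that permission button amounts to in practice, the following is a minimal sketch, not any particular developer's code. It assumes the standard OAuth 2.0 authorization-code flow that Google uses for its APIs and the broad Gmail scope `https://mail.google.com/`. The function name, client ID and redirect address are hypothetical placeholders.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Google's OAuth 2.0 authorization endpoint and the full-mailbox Gmail scope.
AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"
FULL_MAIL_SCOPE = "https://mail.google.com/"


def build_consent_url(client_id: str, redirect_uri: str,
                      scope: str = FULL_MAIL_SCOPE) -> str:
    """Return the URL an app would send a user to when asking for inbox access.

    client_id and redirect_uri are hypothetical values an app would obtain
    when registering with Google; they are placeholders here.
    """
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",   # authorization-code flow
        "scope": scope,            # what the user is asked to grant
        "access_type": "offline",  # ask for a refresh token for ongoing access
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}"


# Example with made-up registration details.
url = build_consent_url("demo-client-id", "https://app.example/callback")
granted = parse_qs(urlsplit(url).query)["scope"]
print(granted)
```

The scope is the part that matters for privacy: a single broad scope covers reading, sending and deleting mail, while narrower scopes exist for apps that need less.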
With Gmail, the developers who get this access range from one-person startups to large corporations, and their processes for protecting data privacy vary.
Return Path, based in New York, gains access to inboxes when users sign up for one of its apps or one of the 163 apps offered by Return Path’s partners. Return Path gives the app makers software tools for managing email data in return for letting it peer into their users’ inboxes.
Return Path’s system is designed to check if commercial emails are read by their intended recipients. It provides customers including Overstock.com Inc. a dashboard where they can see which of their marketing messages reached the most customers. Overstock didn’t respond to a request for comment.
Marketers can view screenshots of some actual emails—with names and addresses stripped out—to see what their competitors are sending. Return Path says it doesn’t let marketers target emails specifically to users.
Navideh Forghani, 34 years old, of Phoenix, signed up this year for Earny Inc., a tool that compares receipts in inboxes to prices across the web. When Earny finds a better price for items its users purchase, it automatically contacts the sellers and obtains refunds for the difference, which it shares with the users.
Return Path says its computers are supposed to strip out personal emails from what it sends into its system by examining senders’ domain names and searching for specific words, such as “grandma.” The computers are supposed to delete such emails.
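A filter of the kind described could look roughly like the sketch below. It is an illustration only and not Return Path's actual system: the domain list, the word list beyond “grandma,” and all function names are assumptions made for the example.

```python
import re
from email.utils import parseaddr

# Hypothetical lists for illustration; the real rules are not public.
PERSONAL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}
PERSONAL_WORDS = {"grandma"}


def looks_personal(sender: str, subject: str, body: str) -> bool:
    """Flag a message as personal by sender domain or by telltale words."""
    _, address = parseaddr(sender)
    domain = address.rpartition("@")[2].lower()
    if domain in PERSONAL_DOMAINS:
        return True
    words = set(re.findall(r"[a-z']+", f"{subject} {body}".lower()))
    return not PERSONAL_WORDS.isdisjoint(words)


def keep_commercial(messages):
    """Drop messages flagged as personal; pass the rest along."""
    return [m for m in messages
            if not looks_personal(m["from"], m["subject"], m["body"])]


sample = [
    {"from": "Overstock <deals@overstock.com>", "subject": "Weekend sale",
     "body": "Save on rugs."},
    {"from": "Jane <jane@gmail.com>", "subject": "Dinner?",
     "body": "Are you free Friday?"},
    {"from": "news@shop.example", "subject": "A gift for Grandma",
     "body": "Order today."},
]
kept = keep_commercial(sample)
```

The third sample message shows the weakness of simple rules: a marketing email is discarded because it mentions “Grandma,” and the reverse error, a personal note from an unlisted domain with no flagged words, would slip through as commercial.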
In 2016, Return Path discovered its algorithm was mislabeling many personal emails as commercial, according to a person familiar with the matter. That meant millions of personal messages that should have been deleted were passing through to Return Path’s servers, the person says.
To correct the problem, Return Path assigned two data analysts to spend several days reading 8,000 emails and manually labeling each one, the person says. The data helped train the company’s computers to better distinguish between personal and commercial emails.
Return Path declined to comment on details of the incident, but said it sometimes lets employees see emails when fixing problems with its algorithms. The company uses “extreme caution” to safeguard privacy by limiting access to a few engineers and data scientists and deleting all data after the work is completed, says Matt Blumberg, Return Path’s chief executive.
Jules Polonetsky, CEO of the nonprofit Future of Privacy Forum, says he thinks users want to know specifically whether humans are reviewing their data, and that apps should explain that clearly.
At Edison Software, based in San Jose, Calif., executives and engineers developing a new feature to suggest “smart replies” based on emails’ content initially used their own emails for the process, but there wasn’t enough data to train the algorithm, says Mr. Berner, the CEO.
Two of its artificial-intelligence engineers signed agreements not to share anything they read, Mr. Berner says. Then, working on machines that prevented them from downloading information to other devices, they read the personal email messages of hundreds of users—with user information already redacted—along with the system’s suggested replies, manually indicating whether each made sense.
Neither Return Path nor Edison mentions the possibility of humans viewing users’ emails in their privacy policies.
Write to Douglas MacMillan at email@example.com