Microsoft: Zero Data Retention Not Possible To Keep Search Engines Viable

By eWEEK

Thu, Dec 18, 2008 at 11:43 am

Yahoo’s reduction of its retention period for users’ search data from 13 months to 3 caught the eye of privacy advocates, who called for Google to lead the way toward a zero retention policy. Microsoft’s privacy guru Brendon Lynch explains why that just isn’t possible if Live Search is to perform as a quality Web service. The issue seems headed for some resolution in 2009, with search engine providers meeting with the European Commission in February.

Yahoo’s decision to shorten its retention of user log data has some industry watchers calling for Google and Microsoft to do the same and predicting that pressure from government regulators will lead to zero retention policies in search next year.

Brendon Lynch, director of privacy strategy at Microsoft, told eWEEK that zero retention policies are just not possible for Microsoft without degrading the quality of its Live Search offering, among other issues.

The issue flared up Dec. 17 when Yahoo pledged to reduce the period for which it saves the user log data its search engine gathers (user queries, IP addresses and cookies that create digital trails) from 13 months to 3 months. Yahoo, Google and Microsoft argue that data about users is necessary to provide quality search and to protect users from malicious users and scam artists.

The move by Yahoo, the No. 2 search engine provider, is easily the most aggressive to date. Search leader Google pared its data retention period from 18 months to 9 in September. No. 3 player Microsoft has been stuck at 18 months since July 2007, though it has said it would be willing to go down to six months if Google and Yahoo agreed to do the same.

Yet Yahoo’s move was received with cautious praise by some privacy advocates who believe Yahoo, Google and Microsoft can do better. Peter Eckersley, staff technologist with consumer rights group Electronic Frontier Foundation, told eWEEK:

This looks like an attempt by Yahoo to keep a lot of information
that they can use for their own internal research and engineering
purposes, while being able to say "it would be extremely hard for us to
find your search history file in this huge stack of search history
files that we keep". That’s a big step in the right direction.

However, Eckersley noted that Yahoo still retains 24 of the 32
digits of users’ IP addresses, which means that if Yahoo had someone’s
IP address, and wanted to find their search history, it could dig out
fifty or a hundred files and say that one of them belongs to that
person. A human, or more likely a statistical analysis program, could
then read them and match a file to that person.
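
The arithmetic behind Eckersley's point can be sketched in a few lines of Python. This is only an illustration, not Yahoo's actual method: the article does not say which bits Yahoo discards, so the sketch assumes the 8 low-order bits of an IPv4 address are zeroed, and the function names and the sample address are hypothetical.

```python
import ipaddress


def truncate_ipv4(address: str, kept_bits: int = 24) -> str:
    """Keep the first kept_bits bits of an IPv4 address and zero the rest.

    Assumption: truncation drops the low-order bits (the last octet when
    kept_bits is 24). The article does not describe the real scheme.
    """
    network = ipaddress.ip_network(f"{address}/{kept_bits}", strict=False)
    return str(network.network_address)


def addresses_per_prefix(kept_bits: int = 24) -> int:
    """Count how many full IPv4 addresses share one truncated prefix."""
    return 2 ** (32 - kept_bits)


if __name__ == "__main__":
    # 203.0.113.77 is a documentation-range address used only as an example.
    print(truncate_ipv4("203.0.113.77"))
    print(addresses_per_prefix())
```

Under that assumption, at most 256 addresses share each stored prefix, so anyone holding a full IP address can narrow a search to a small set of candidate logs.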

John Simpson, a privacy advocate for the non-profit consumer rights
group Consumer Watchdog, said no less than a zero retention policy will
suffice, arguing that since most users of Google or Yahoo return daily
they are constantly providing a new stream of personal data. His group
wants users to have the option to control their data and browse
anonymously.

But Microsoft’s Lynch said the search data Live Search collects has a
number of uses. In addition to analyzing users’ search queries to
improve query relevancy, Lynch said user log data helps Microsoft Live
Search thwart security threats, keep people from gaming search ranking
results, and combat click fraud scammers.

After evaluating the issue, Microsoft concluded earlier this month that a 6-month retention policy is feasible.
If Google, Yahoo and Microsoft agreed to a 6-month timeframe, it would
keep the playing field level for the search engines. "The company that
has more data to analyze has the greater ability to improve the
relevance of the search engine," Lynch said.

Does that mean Microsoft would match Yahoo’s new 3-month policy? Lynch
wouldn’t bite, noting only that Microsoft is constantly reviewing what is
an acceptable time frame. "Ultimately what we’re looking for is a
common approach across the industry and not just on timeframe."

But  months is unlikely to satisfy privacy advocates. EFF’s Eckersley
said Yahoo is clearly doing a better job on this issue than Google,
which in most cases could look up a person’s search history very easily
for 18 months because it still keeps cookie IDs for 18 months, and
hasn’t announced any deletion of "giveaway" searches for things like names and phone numbers.

"A gold star to Yahoo, and a gold star to the European Union for
scaring the search engines into offering Internet users more privacy,"
Eckersley said.

Consumer Watchdog’s Simpson called on Google to give its search users
control over their private data; transparency about how their data is
gathered and used; and the right to give informed consent through opt
in functions, rather than having to sift through pages in order to opt
out.

For its part, Google is content with its current policy, which the company halved from 18 months to nine in September.

"When we make changes to our policies, they are dependent on what will
be best for our users both in terms of the services we provide and the
respect of their privacy. It is a balance that we are continually
evaluating," wrote Jane Horvath, senior privacy counsel at Google, in
an e-mail sent to eWEEK.

Microsoft, Google and Yahoo are expected to make their cases in presentations in February to the European Commission’s Article 29 Working Party, an advisory panel comprising data protection commissioners from each of the European Union’s 27 member countries.

The meeting will be the latest battle in the tug of war between the
government body, which is bent on protecting users’ privacy, and the
search providers, which are determined to store data to improve their
services.

The European Commission has been more aggressive toward regulating
search engines to date. With the installation of U.S. President Barack
Obama, it is unclear how this fight will evolve in 2009 at home and
abroad.
