Fionn Murtagh’s Blog

Themes: information economy, intellectual property, research

Archive for May 2009

ICT, Energy and Environment – Inextricably Intertwined


Processing of Information, Processing of Energy: A Short Review of (1) Querying, and (2) Email

I will look at just one aspect of this ICT/energy/climate picture. This is related to the fundamental Internet activities of data searching and email.

All our business, social, and indeed personal, activities and processes are built on information and they are built on energy. Some recent reports are interesting in regard to this close relationship.

Data Querying – Energy Needs and Impact on Environment

How much energy is expended with each and every search query? Following some discussion in the press in January 2009, it is clear that the carbon emissions and the energy use that accompany computing tasks are complicated to pin down.

It has been estimated that, globally, the ICT sector is directly responsible for 2% of CO2 emissions. That is the same as the totality of the aviation sector. Included in ICT are computer displays, servers, routers, printers, fixed and mobile telecoms, and so on. (Gartner Group, 26 April 2007.)

Let’s look at emissions. According to Google, a one-hit search taking less than a second produces about 0.2g of CO2. When a search involves multiple queries, the estimated CO2 byproduct varies from 1g to 10g. By way of comparison, boiling a kettleful of water entails about 15g of CO2 emission. There are about 200 million Internet searches per day, the lion’s share of them through Google.

Viewing a web page generates an estimated 0.02g of CO2 per second. If there are complex images or video on what is being viewed, then the estimate rises tenfold to 0.2g of CO2 per second. Running a PC generates an estimated 40g to 80g of CO2 emissions per hour. And maintaining an avatar on Second Life requires an estimated 1752 kilowatt hours of electric power per year, which is almost as much as the average personal use in Brazil. (The Sunday Times, 11 and updated 16 January 2009.)

Google’s response to the Sunday Times article traced out the energy as well as greenhouse gas consequences of searching. A typical search returns results in less than 0.2 seconds. “Queries vary in degree of difficulty, but for the average query, the servers it touches each work on it for just a few thousandths of a second. Together with other work performed before your search even starts (such as building the search index) this amounts to 0.0003 kWh of energy per search, or 1 kJ. For comparison, the average adult needs about 8000 kJ a day of energy from food, so a Google search uses just about the same amount of energy that your body burns in ten seconds.  In terms of greenhouse gases, one Google search is equivalent to about 0.2 grams of CO2.”  (Reference, 12 January 2009.)
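The unit conversions in the quoted passage are easy to check. The short Python sketch below uses only the figures quoted above (0.0003 kWh and 0.2 g of CO2 per search, 8000 kJ of food energy per day); the function and variable names are my own, and the "implied grams per kWh" figure is simply the ratio of two quoted numbers, not something Google states.

```python
# Unit check on the per-search figures quoted from Google's response.
KJ_PER_KWH = 3600.0              # 1 kWh = 3600 kJ (exact conversion)
SECONDS_PER_DAY = 24 * 60 * 60

ENERGY_PER_SEARCH_KWH = 0.0003   # quoted
CO2_PER_SEARCH_G = 0.2           # quoted
DAILY_FOOD_ENERGY_KJ = 8000.0    # quoted


def kwh_to_kj(kwh):
    """Convert kilowatt hours to kilojoules."""
    return kwh * KJ_PER_KWH


def body_seconds_equivalent(energy_kj, daily_kj=DAILY_FOOD_ENERGY_KJ):
    """Seconds of average adult energy use needed to burn energy_kj."""
    kj_per_second = daily_kj / SECONDS_PER_DAY
    return energy_kj / kj_per_second


search_kj = kwh_to_kj(ENERGY_PER_SEARCH_KWH)
body_seconds = body_seconds_equivalent(search_kj)
# Ratio of two quoted figures: an implied emission factor, not a quoted one.
implied_g_per_kwh = CO2_PER_SEARCH_G / ENERGY_PER_SEARCH_KWH

print(f"Energy per search: {search_kj:.2f} kJ")
print(f"Equivalent body energy use: {body_seconds:.1f} s")
print(f"Implied emissions: {implied_g_per_kwh:.0f} g CO2 per kWh")
```

The results, about 1.08 kJ and a little under 12 seconds, agree with the rounded "1 kJ" and "ten seconds" in the quotation.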

PCs and monitors account for 40% of the ICT sector’s CO2 emissions. Data centres account for 23% of global ICT CO2 emissions. (Reference.) The New York Times of 14 June 2007 described how data centres are increasingly located near where energy is readily available. The price of power in Washington state is about one fifth that of California. So data centres are being established on the Columbia River, by Google, by Microsoft at Wenatchee, and by Yahoo at Quincy. An article in Le Monde of 23 June 2007 sketched a future world where nuclear plants in less developed countries would be surrounded by data centres. (Reference.)

The Formidable Amount of Energy Wasted with Spam

It is commonly held that spam accounts for about 97% of all email (Microsoft view, 8 April 2009), although by other accounts spam can be as low as 80% of all traffic. Either way, it accounts for the greatest proportion of traffic: 62 trillion (62 × 10^12) spam emails in 2008 according to the McAfee ICF report, “The carbon footprint of email spam report”. Richi Jennings contributed in a major way to this report.

The McAfee ICF report breaks legitimate email down into the following phases, each with its percentage of the energy and associated CO2-equivalent emissions:

  • drafting – 25%
  • outgoing mail server – 12.5%
  • internet – 2%
  • incoming mail server – 12.5%
  • storage – 11%
  • viewing – 37%

The total energy expended in legitimate email, per year, is given as 120,115 million kWh, or about 120 TWh. I will now look at the energy that goes hand in hand with spam, which works out at 33,733 million kWh per year, or about 33.7 TWh.
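To put the two totals and the phase percentages side by side, here is a small Python sketch. It applies the percentages listed for legitimate email to the quoted annual total, and takes the ratio of the spam total to the legitimate total. The names are my own, and I am assuming that the percentages apply directly to the energy total; the report gives them for energy and emissions jointly.

```python
# Split the quoted annual energy of legitimate email across its phases,
# and compare the quoted spam total against it.
LEGIT_EMAIL_TOTAL_MKWH = 120_115   # million kWh per year, quoted
SPAM_TOTAL_MKWH = 33_733           # million kWh per year, quoted

LEGIT_PHASE_PERCENT = {
    "drafting": 25.0,
    "outgoing mail server": 12.5,
    "internet": 2.0,
    "incoming mail server": 12.5,
    "storage": 11.0,
    "viewing": 37.0,
}


def energy_by_phase(total, percent_by_phase):
    """Return the energy attributed to each phase, in the units of total."""
    return {phase: total * pct / 100.0 for phase, pct in percent_by_phase.items()}


legit_phase_energy = energy_by_phase(LEGIT_EMAIL_TOTAL_MKWH, LEGIT_PHASE_PERCENT)
spam_to_legit_ratio = SPAM_TOTAL_MKWH / LEGIT_EMAIL_TOTAL_MKWH

print(f"Percentages sum to {sum(LEGIT_PHASE_PERCENT.values()):.1f}%")
for phase, mkwh in legit_phase_energy.items():
    print(f"{phase:>22}: {mkwh:>10,.1f} million kWh")
print(f"Spam energy as a fraction of legitimate email energy: {spam_to_legit_ratio:.2f}")
```

The phase percentages sum to exactly 100%, and on these figures spam consumes a little over a quarter as much energy as legitimate email.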

The pipeline for analysis of the carbon footprint of spam is as follows.

  • embodied energy of servers, storage, network equipment; user machines
  • harvesting/scraping addresses
  • disseminating directly from servers or commandeering zombie (botnet) machines
  • transmission circuits
  • server-based filtering – 16%
  • receipt and storage by user machines
  • user appraising and then deleting of spam – 52%; searching for false positives – 27%

The dominant energy expenditures are: server-based filtering, 16%; user viewing and deleting of spam, which takes about 3 seconds per message on average, 52%; and searching for false positives, 27%. False positives are the genuine emails that have been wrongly labeled as spam.
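In absolute terms, the three dominant shares can be applied to the quoted spam total of 33,733 million kWh per year. The Python sketch below does this; the names are my own, and the "all other stages" entry is simply the remainder after the three quoted percentages, not a figure taken from the report.

```python
# Apply the three quoted shares to the quoted annual spam energy total.
SPAM_TOTAL_MKWH = 33_733   # million kWh per year, quoted

SPAM_SHARE_PERCENT = {
    "server-based filtering": 16.0,
    "viewing and deleting spam": 52.0,
    "searching for false positives": 27.0,
}


def with_remainder(percent_by_stage, label="all other stages"):
    """Add an entry holding whatever percentage the listed stages leave over."""
    completed = dict(percent_by_stage)
    completed[label] = 100.0 - sum(percent_by_stage.values())
    return completed


def energy_by_stage(total, percent_by_stage):
    """Return the energy attributed to each stage, in the units of total."""
    return {stage: total * pct / 100.0 for stage, pct in percent_by_stage.items()}


spam_shares = with_remainder(SPAM_SHARE_PERCENT)
spam_stage_energy = energy_by_stage(SPAM_TOTAL_MKWH, spam_shares)

for stage, mkwh in spam_stage_energy.items():
    print(f"{stage:>30}: {spam_shares[stage]:5.1f}%  {mkwh:>10,.1f} million kWh")
```

The three quoted shares cover 95% of the total, so every other stage of the pipeline together accounts for only about 5%.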

The lion’s share of the energy expended here goes to users viewing and deleting spam. Overall, approximately 104 billion user hours per year go into such viewing and deleting of spam.

Per spam message, the CO2-equivalent emissions are 0.3 g. For a legitimate email, the CO2-equivalent emissions are 4 g.
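A rough scale-up shows what these per-message figures amount to. The Python sketch below multiplies the quoted 0.3 g per spam message by the quoted 62 trillion spam messages in 2008, and divides the quoted spam energy total by the same volume. The names are my own, and because 0.3 g is a rounded figure, the totals should be read as order-of-magnitude estimates rather than as the report's own numbers.

```python
# Scale the quoted per-message figures up to the quoted annual spam volume.
SPAM_MESSAGES_PER_YEAR = 62e12   # quoted, 2008
SPAM_G_CO2E_PER_MESSAGE = 0.3    # quoted (rounded)
LEGIT_G_CO2E_PER_MESSAGE = 4.0   # quoted
SPAM_ENERGY_KWH = 33_733e6       # quoted: 33,733 million kWh per year

GRAMS_PER_TONNE = 1e6
WH_PER_KWH = 1000.0

spam_tonnes_co2e = SPAM_MESSAGES_PER_YEAR * SPAM_G_CO2E_PER_MESSAGE / GRAMS_PER_TONNE
wh_per_spam_message = SPAM_ENERGY_KWH * WH_PER_KWH / SPAM_MESSAGES_PER_YEAR
legit_to_spam_ratio = LEGIT_G_CO2E_PER_MESSAGE / SPAM_G_CO2E_PER_MESSAGE

print(f"Annual spam emissions: about {spam_tonnes_co2e / 1e6:.1f} million tonnes CO2e")
print(f"Energy per spam message: about {wh_per_spam_message:.2f} Wh")
print(f"One legitimate email emits about {legit_to_spam_ratio:.0f} times as much as one spam message")
```

A single spam message is tiny in these terms. It is the sheer volume that produces a total in the region of tens of millions of tonnes.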

Two Conclusions

This is an inspirational analysis. Let me make two comments about what arises out of it.

Firstly, optimizing a system of the sort described – e.g. to minimize overall email handling energy use – is unlikely to be feasible. The reason is that many aspects of what is dealt with here – the size of an email and its significance for the user; the odd spam that will take more than usual consideration; a new form of spam that will take time for the filters to understand and lop off; and so on – are subject to extreme or out of the ordinary events of various sorts. The distributions involved (of data, time, energy, equivalent emissions, etc.) are long tailed. With such a distribution, unusual but nonetheless realistic, it may not be possible to define the variability, and it may not even be possible to define the average! In practice therefore we can come up with figures, but each and every context will give rise to different figures.
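To illustrate the point about undefined averages, here is a minimal Python sketch. It uses the Pareto distribution as a stand-in for a long-tailed quantity. That choice is my assumption for illustration only, since nothing above says that email sizes or handling times follow a Pareto law. For a Pareto distribution with shape parameter alpha and minimum value x_min, the mean is finite only when alpha > 1, and the variance only when alpha > 2.

```python
import random


def pareto_mean(alpha, x_min=1.0):
    """Theoretical mean of a Pareto(alpha, x_min) law, or None if it is infinite."""
    if alpha <= 1.0:
        return None
    return alpha * x_min / (alpha - 1.0)


def pareto_variance(alpha, x_min=1.0):
    """Theoretical variance of a Pareto(alpha, x_min) law, or None if it is infinite."""
    if alpha <= 2.0:
        return None
    return (x_min ** 2) * alpha / ((alpha - 1.0) ** 2 * (alpha - 2.0))


def running_means(alpha, n, seed=0):
    """Sample means after 1, 2, ..., n draws from a Pareto(alpha) law with x_min = 1."""
    rng = random.Random(seed)
    total = 0.0
    means = []
    for i in range(1, n + 1):
        total += rng.paretovariate(alpha)
        means.append(total / i)
    return means


if __name__ == "__main__":
    for alpha in (3.0, 1.5, 0.8):
        means = running_means(alpha, 100_000)
        checkpoints = [round(means[k - 1], 2) for k in (100, 1_000, 10_000, 100_000)]
        print(f"alpha={alpha}: theoretical mean={pareto_mean(alpha)}, "
              f"theoretical variance={pareto_variance(alpha)}, "
              f"running means={checkpoints}")
```

With alpha = 3 the running mean should settle close to its theoretical value of 1.5. With alpha = 0.8 there is no finite value to settle on, and occasional extreme draws keep pushing the sample mean around, which is the situation described above: figures can always be computed, but they depend on the sample at hand.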

What I have just sketched out is quite a general analysis. There is a mathematical argument behind it, which means that we cannot view this as a closed, fully specified system that would offer an optimized solution.

What this points to instead is that the policy-oriented issues come to the fore. How do we handle worst case scenarios? Will we support redundancy in order to establish guaranteed levels of fault tolerance? Or to cater for extreme events?

Secondly, as we have seen here, energy and environment implications are part and parcel of spam. There are energy and environment implications to querying and searching, to handling email, and to catering for common-place and for out of the ordinary activities.

Underlying all such activities, ordinary or exceptional, power and emissions auditing are becoming every bit as important as data and information auditing.


Written by Fionn Murtagh

2009/05/19 at 23:48