Cycles for the Mind

18/09/2012 | Filed under: feature / independent / science & technology

Cloud computing entered the popular lexicon around 2008, when Google Executive Chairman Dr Eric Schmidt started giving talks on the subject. Since then, the range of cloud computing services available to businesses and consumers has expanded considerably, giving birth to entirely new industries in the process. Eric Payne reports.

Every innovation in computing needs a new name or buzzword, and the marketing bods who earn their crust in Silicon Valley are more than happy to oblige. What was at first called Software as a Service (SaaS) has since evolved into cloud computing and then utility computing and, depending upon just how closely you follow these sorts of things, grid computing, parallel computing and distributed computing. The functions they describe are distinct, but all of them rest on the notion that ‘the network is the computer’, as popularised by John Gage of Sun Microsystems.

SaaS involves leasing a single application from a remote server accessible via the Internet, whereas cloud computing and utility computing incorporate leasing hardware, software and infrastructure. The term utility computing reflects the way in which users can arbitrarily increase their processing and storage demand in accordance with their need, much the same as one might draw electricity from the grid to power a television and boil a kettle.

Big Data

Many of the most notable early innovations in the cloud computing sector were consumer focused. The first cloud-based applications to gain widespread popular recognition were web-based email services such as Hotmail, Gmail and Yahoo! Mail. The next phase started in a Harvard dorm room in 2004, when a 19-year-old Mark Zuckerberg registered the domain name thefacebook.com. That is not to say that social networks were the exclusive invention of Mark Zuckerberg, of course. The world’s first ‘social network’ was the Whole Earth ’Lectronic Link or WELL, an electronic Bulletin Board System or BBS, established by Stewart Brand and Larry Brilliant in San Francisco in 1985. The biggest difference between Facebook and the WELL is one of scale, as encapsulated in the latest piece of Silicon Valley boilerplate: Big Data.

Big Data is essentially a sign that businesses and other large organisations are beginning to embrace cloud computing. One of the most important pioneers in this area is Google. World renowned for its industry-leading search engine and highly lucrative advertising platform, Google has also taken the lead in creating many of the technologies that make cloud computing and Big Data useful. The unprecedented scale at which Google operates, processing billions of search queries every day in data centres spread around the world, led to the creation of the Google File System (GFS), which enabled it to spread data over thousands of distributed servers (some on different continents). Another piece of software, called MapReduce, was used to co-ordinate compute cycles and crunch data to create a single, searchable index. The sheer scale of the task and the ‘off-the-shelf’ components used in Google’s data centres led to the creation of a system that was resistant to network or machine failure.
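For readers who want to see the idea behind MapReduce rather than take it on trust, here is a minimal single-machine sketch in Python. It is a toy illustration of the general map, group and reduce pattern used to build a searchable index, not Google's code. The function names (map_phase, shuffle, reduce_phase, build_index) and the two sample documents are invented for this example. In a real deployment the map and reduce steps would run in parallel across many servers, with the files held in a distributed store such as GFS.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit one (word, doc_id) pair per distinct word in a document."""
    return [(word, doc_id) for word in set(text.lower().split())]


def shuffle(pairs):
    """Group the emitted values by key, as a framework would between the two steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, doc_ids):
    """Reduce step: collapse all values for one word into a sorted list of documents."""
    return word, sorted(set(doc_ids))


def build_index(documents):
    """Run map, shuffle and reduce in sequence on one machine (hypothetical helper)."""
    pairs = []
    for doc_id, text in documents.items():
        pairs.extend(map_phase(doc_id, text))
    return dict(reduce_phase(word, ids) for word, ids in shuffle(pairs).items())


# Hypothetical sample data, standing in for crawled web pages.
documents = {
    "doc1": "the network is the computer",
    "doc2": "the cloud is a utility",
}
index = build_index(documents)
```

Looking up a word in index returns the documents that contain it, which is the essence of a search index. Because each call to map_phase touches only one document and each call to reduce_phase touches only one word, the work can be divided among many cheap machines, and a failed piece can simply be run again elsewhere.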

The two papers in which these technologies are described have since spawned an entire industry now measured in billions of dollars. The open-source application Hadoop, which replicates GFS and MapReduce, has enabled enterprises and academic institutions to benefit from the Big Data movement. The ability to work with these kinds of very large datasets is all the evidence one should need to understand that cloud computing is not just about using the browser to access the same programs once stored on local computer hard drives. On the contrary, it promises to transform computing in much the same way that the personal computer once did. As minicomputer pioneer and Digital Equipment Corporation (DEC) founder Ken Olsen once said: “There is no reason for any individual to have a computer in his home.”

In the years since Olsen said that, it has become all too easy to scoff at his lack of foresight. It is worth remembering that, at the time when he said it, computers were the preserve of the military, academia and big business. Neither Olsen nor any of his mainstream contemporaries anticipated the home computing revolution that was to follow. But, with that history behind us, it may be slightly easier to plot a trajectory into the future.

Utility supercomputing

The kinds of applications cloud computing might enable are already evinced in the consumer IT and electronics sector, which seems to be taking the lead once again. The massive datasets collected by Google’s search engine have already enabled it to make unprecedented advances in the area of language-to-language translation. Siri, the ‘digital assistant’ bundled with the iPhone 4S, uses cloud computing technology to solve some extremely complicated computer science problems. Voice recognition and natural language processing techniques are predicated upon the ability to build ever-larger datasets that are improved by constant iteration, based on feedback. Every query Apple receives from Siri increases the size of its dataset and immediate user feedback iterates the program towards a better response.

In the enterprise sector, where demands are even greater, there is still more room for innovation. A start-up called Cycle Computing recently started utilising Amazon’s Elastic Compute Cloud (EC2) to provide supercomputing as a service or utility for anyone with enough money to pay the hourly fee. The service leverages Amazon’s cloud-based servers to provide access to a virtual machine with supercomputer-like capabilities. The largest virtual machine Cycle Computing has yet created had the equivalent of 50,000 processor cores and was used by Schrödinger and Nimbus Discovery to accelerate lead identification via virtual screening for a new drug. “Typically, we have to weigh tradeoffs between time and accuracy in a project,” said Ramy Farid, Schrödinger President. “With Cycle’s utility supercomputing, we didn’t have to compromise the accuracy in favour of faster throughput, and we were able to run the virtual screen using the appropriate levels of scoring and sampling.”

As with any new technology, there are trade-offs between the old and the new. Utility supercomputing is undoubtedly lower cost, because capital expenditure and maintenance costs are eliminated, although it obviously makes one more reliant upon third-party support. It also frees up time that would otherwise have been spent managing a large computing department, time that can be redirected towards a core competency. Most important is probably the fact that utility supercomputing lowers the barriers to entry for a technology that was previously the preserve of an exclusive elite. Projects that would have had to wait for access to precious computer resources, or put up with running on slower machines over longer periods of time, can now buy a ready-made solution ‘off-the-shelf’. One of the world’s largest scientific equipment manufacturers, Varian Inc., successfully used Cycle Computing’s virtual supercomputer to simulate the design of a new mass spectrometer in less than a day, an operation that would have taken nearly six calendar weeks using its own internal pool of processors.

Accelerated learning

Of course, there are detractors as well as boosters. The life sciences have embraced the technology, as have several visual effects companies working in Hollywood, and it proffers benefits for every area of industry from energy to agriculture and telecommunications to construction. The primary limiting factor is likely to be the networks themselves, which have to transmit data between servers and clients. Telecommunications consultancy TeleGeography expects this to influence how the topology of the Internet itself evolves.

According to the company’s latest Global Internet Geography report: “The rise of cloud computing will have an effect on traffic patterns and may cause a shift in the topology of the Internet as accessing data centres becomes more important. Of course, the locations of data centres housing different types of cloud services will influence traffic patterns. Some cloud services may only be located in major Internet hub cities due to the prevalence of more affordable power, space, and connectivity options and more transparent, open regulatory environments. Greater demand will occur on links connected to the major Internet hub cities. However, other types of cloud services will only drive increased local demand requirements. For example, data storage for financial establishments and enterprises may be legally required to reside within the institution’s home country, in which no international traffic is generated. In other cases, local policies for decency may lead content operators to not host any content locally.”

Privacy campaigners fear that Big Data could become a synonym for Big Brother when applied to the consumer sector. But the ability for industrial and academic institutions to access supercomputing services via the Internet has the potential to accelerate research and development in energy, architecture, telecommunications, defence, oil & gas, heavy industry, aerospace, mining, green tech, services, construction, automotive, manufacturing, entertainment, petrochemical and nuclear.