
iSGTW Opinion - Scattered, sectioned and super-sized: sysadmin challenges and grids

 



Paul Anderson, from the University of Edinburgh School of Informatics, says size is not always the biggest challenge when managing grid systems.
Image courtesy of Paul Anderson

The discipline of system administration covers all aspects of running a computing installation—from technical issues such as operating system configuration and security through to administrative tasks such as account management and software licensing.

Covering this many bases is always a challenge, but when the system you’re administering is scattered, sectioned and super-sized, a whole new family of challenges arises.

Grid applications tend to rely on large numbers of machines, and managing these appropriately requires some skilled systems administration.

However, it is often not the sheer scale of grid infrastructure that presents the real challenge, but rather the distributed and federated nature of its management.

For example, if an application depends on a particular version of a software library, then this library must be consistent and correct across all participating sites, and this consistency must be guaranteed with a high degree of confidence: if one machine in ten thousand has the wrong version of a library there may be significant consequences for the results of an experiment, especially if the error goes undetected.

Since participating grid sites are often managed by different organizations, such co-ordination can be difficult to achieve.
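The consistency requirement described above can be illustrated with a minimal sketch. It assumes each site can already report which library version is installed on each of its machines; the host names, the version strings and the `find_version_drift` helper are all hypothetical, invented here for illustration rather than taken from any real grid tool.

```python
def find_version_drift(reported, required):
    """Return {host: version} for every host that does not have the required version.

    `reported` maps a host name to the library version found on it,
    or to None if the library was not found at all.
    """
    return {host: version
            for host, version in reported.items()
            if version != required}


# Hypothetical inventory gathered from three independently managed sites.
reported = {
    "site-a-node01": "2.4.1",
    "site-a-node02": "2.4.1",
    "site-b-node01": "2.4.0",   # stale version
    "site-c-node01": None,      # library missing
}

drift = find_version_drift(reported, required="2.4.1")
for host, version in sorted(drift.items()):
    print(f"{host}: found {version or 'no library'}, expected 2.4.1")
```

The comparison itself is trivial. The hard part, as the article argues, is gathering a trustworthy inventory from sites that are run by different organizations with different tools.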

More than 1200 attendees listen to Cory Doctorow, keynote speaker at last year’s Large Installation System Administration (LISA) conference, held in Washington DC from 3-8 December 2006.
Image courtesy of LISA

Systems administration in a changing world

Systems administration is an evolving discipline and most sites have their own tools and procedures, often depending on the history and experience of particular administrators.

This means that requirements—for a specific library or service for example—are usually communicated manually between sites, and can be enforced in different ways.

But it is not only the lack of standards which hampers automatic cross-site coordination.

The system-level requirements of a particular grid application—for example installation of specific kernel modules, opening of firewall ports, or other operations requiring privileged access—can often have implications for security. Most administrators would be reluctant to devolve these operations to some process or administrator outside of their organization.

Management of grid applications benefits not only from increased co-operation between sites, but also from co-operation between application developers and system administrators.

For example, administrators and developers can work together to produce software tailored to the grid environment: applications packaged for simple installation and un-installation are easier to install and upgrade, and applications which present a clean interface to their configuration are more easily redeployed and reconfigured in the event of hardware failure.

Autonomically speaking

“Autonomics” holds the promise of automatically reconfiguring systems to recover from hardware failures or overloading, but this requires applications to be written in such a way that they co-operate with this reconfiguration.

Yielding control of the site configuration to some automatic process is also an unnerving experience for a system administrator, and requires a good deal of trust in any autonomic system.

“Autonomics” is thus an interesting area of current research; it requires systems which can operate autonomically, but act in a clear and predictable manner, under the guidance of policies laid down by the system administrator.
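One way to picture "guidance of policies" is a gate that every automatic action must pass before it runs. The sketch below is only an illustration under stated assumptions: the `Policy` fields, the action names and the `approve` function are hypothetical and do not describe any existing autonomic system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Limits laid down by the administrator (hypothetical fields)."""
    allowed_actions: frozenset
    max_hosts_per_change: int


def approve(action, hosts, policy):
    """Return (approved, reason) so that every decision is explainable."""
    if action not in policy.allowed_actions:
        return False, f"action '{action}' is not permitted by policy"
    if len(hosts) > policy.max_hosts_per_change:
        return False, (f"change touches {len(hosts)} hosts, "
                       f"limit is {policy.max_hosts_per_change}")
    return True, "within policy"


policy = Policy(
    allowed_actions=frozenset({"restart_service", "move_job"}),
    max_hosts_per_change=2,
)

# A routine recovery step is allowed; a privileged change is refused.
print(approve("move_job", ["node01"], policy))
print(approve("open_firewall_port", ["node01"], policy))
```

Returning a reason alongside each decision is a deliberate choice: an administrator is more likely to trust an automatic process whose refusals and approvals can be read and audited.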

This year’s Large Installation System Administration (LISA) conference will be held in Dallas, Texas, U.S. from 11-16 November and is designed for both beginners and experienced attendees. Early bird registration closes 19 October.

- Paul Anderson, University of Edinburgh School of Informatics and program chair of LISA 07
