iSGTW - International Science Grid This Week


iSGTW Blog Watch

iSGTW is plugging into the blogosphere ... come here for the latest from GridCast, Grids UK, Ian Foster, HASTAC, GridBus, Grid Gurus and West Coast Grid.

Suggest A Blog
E-mail: blogs@isgtw.org

Disclaimer: iSGTW is not responsible for the content in the blogs. Inclusion of a blog on this page does not imply agreement or endorsement by iSGTW of any commentary in the blog.

Monday 06 September

UK Grids 04:57 NorthGrid - How to enable Atlas squid monitoring
Atlas has started to monitor squids with mrtg.

http://frontier.cern.ch/squidstats/indexatlas.html

mrtg uses snmp, so to enable the monitoring you need your squid instance compiled with --enable-snmp. CERN binaries are already compiled with that option, but the default squid that comes with the SL5 OS might not be, and neither might your site's centralised squid service. You don't need snmpd or snmptrapd (net-snmp rpm) running to make it work.

Once you are sure the binary is compiled with the right options and that port 3401 is not blocked by any firewall you need to add these lines to squid.conf

acl SNMPHOSTS src 128.142.202.0/24 localhost
acl SNMPMON snmp_community public
snmp_access allow SNMPMON SNMPHOSTS
snmp_access deny all
snmp_port 3401


Again, if you are using the CERN rpms and the default frontier configuration you might not need to do that, as there are already ACL lines for the monitoring.
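If you are not sure whether your squid.conf already carries the monitoring lines, a quick check before reloading can help. The following is only a minimal sketch in Python, not something from the Atlas or frontier tools: the function name missing_snmp_directives is made up for illustration, and it only looks for the presence of the directives shown above, it does not validate the ACLs themselves.

```python
def missing_snmp_directives(conf_text):
    """Return the SNMP-related squid.conf directives that conf_text lacks."""
    has_port = has_access = has_community = False
    for raw in conf_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # ignore comments and blank lines
        if not fields:
            continue
        if fields[0] == "snmp_port":
            has_port = True
        elif fields[0] == "snmp_access":
            has_access = True
        elif fields[0] == "acl" and len(fields) >= 3 and fields[2] == "snmp_community":
            has_community = True
    missing = []
    if not has_community:
        missing.append("acl <name> snmp_community")
    if not has_access:
        missing.append("snmp_access")
    if not has_port:
        missing.append("snmp_port")
    return missing


# The lines from the post, used here as sample input.
SAMPLE_CONF = """\
acl SNMPHOSTS src 128.142.202.0/24 localhost
acl SNMPMON snmp_community public
snmp_access allow SNMPMON SNMPHOSTS
snmp_access deny all
snmp_port 3401
"""

if __name__ == "__main__":
    print(missing_snmp_directives(SAMPLE_CONF) or "all SNMP directives present")
```

To use it on a real host you would pass in the contents of your own squid.conf; the path depends on how squid was installed.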

Reload the configuration

service squid reload

Test it

snmpwalk -v2c -Cc -c public localhost:3401 .1.3.6.1.4.1.3495.1.1

you should get something similar to this:

SNMPv2-SMI::enterprises.3495.1.1.1.0 = INTEGER: 206648
SNMPv2-SMI::enterprises.3495.1.1.2.0 = INTEGER: 500136
SNMPv2-SMI::enterprises.3495.1.1.3.0 = Timeticks: (23672459) 2 days, 17:45:24.59


snmpwalk is part of net-snmp-utils rpm.
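If you want to script this test rather than read the output by eye, the lines are easy to pick apart. This is a rough Python sketch under the assumption that the output always has the "OID = TYPE: value" shape shown above; the function name parse_snmpwalk is hypothetical and the sample text is simply the output quoted in this post.

```python
import re

# One snmpwalk line: "<oid> = <type>: <value>"
LINE_RE = re.compile(r"^(?P<oid>\S+) = (?P<type>[A-Za-z0-9-]+): (?P<value>.*)$")


def parse_snmpwalk(output):
    """Map each OID in snmpwalk output to a (type, value) pair of strings."""
    result = {}
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # silently skip anything that is not an OID line
            result[match.group("oid")] = (match.group("type"), match.group("value"))
    return result


SAMPLE_OUTPUT = """\
SNMPv2-SMI::enterprises.3495.1.1.1.0 = INTEGER: 206648
SNMPv2-SMI::enterprises.3495.1.1.2.0 = INTEGER: 500136
SNMPv2-SMI::enterprises.3495.1.1.3.0 = Timeticks: (23672459) 2 days, 17:45:24.59
"""

if __name__ == "__main__":
    for oid, (kind, value) in parse_snmpwalk(SAMPLE_OUTPUT).items():
        print(oid, kind, value)
```

An empty dictionary would suggest that squid did not answer, for example because the port is blocked or the binary lacks SNMP support.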

It takes a while for the monitoring to catch up. Don't expect an immediate response.

Additional information on squid/snmp can be found here http://wiki.squid-cache.org/Features/Snmp

NOTE: If you are also upgrading pay attention to the fact that in the latest CERN rpms the init script fn-local-squid.sh might try to regenerate your squid.conf.

Friday 03 September

UK Grids 21:10 National Grid Service - The messages on the bus go back and forth
In a rare display of synchronised software development - a raft of NGS research and development projects have safely reached their final destination.

We are proudly showing off our cloud; our HARC acceptors are now accepting to an acceptable level; the new User Interfaces are on display in the Innovation section of the NGS web site; we have come up with a saner way to manage access to licensed software and the ngs-vo-tool to manage virtual organisation configuration has been released.

There will be more on HARC in a future posting. The rest have already been well and truly blogged.

This posting is about what we are going to do with all our newly available free time in the final 7 months of the third phase of the NGS.

We are here to do the dull-but-useful bits and the dull-but-useful things we are concentrating on are those needed as the NGS and GridPP join forces to form the UK National Grid Initiative (UK NGI) within the European Grid Initiative...
These two projects share one important feature, one they also share with DataMINX Data Transfer Service described in an earlier post. They all rely on a message bus based on ActiveMQ to pass information around.

Yes... you wait ages for a bus and then three turn up at once.

A message bus is simply a way of getting a lump of data from A to B. The important bit is that your application does not need to know how to get from A to B - it just needs to hop on the bus at A and hop off again at B.

Finding a reliable route from A to B becomes somebody else's problem.

WLCG Nagios uses its message bus to pass the results of tests to a central monitoring service. Technical details can be found on https://twiki.cern.ch/twiki/bin/view/EGEE/UseLocalActiveMQForMessaging.

The latest APEL client sends records wrapped in messages rather than attempting immediate database updates. Again there are more technical details available from http://goc.grid.sinica.edu.tw/gocwiki/ApelHome.

In both cases, they are using the message bus to avoid information being lost when there is a backlog of data to be processed. As users of the UK public transport service are all too aware, buses are very good at waiting in traffic.
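
The waiting-in-traffic idea can be shown with a toy example. This is only a sketch in Python using an in-memory queue as a stand-in for a real broker such as ActiveMQ; the MessageBus class and the "accounting" destination are invented for illustration and have nothing to do with the actual Nagios or APEL code.

```python
import queue


class MessageBus:
    """Toy in-memory bus: senders and receivers only share a destination name."""

    def __init__(self):
        self._queues = {}

    def send(self, destination, message):
        # The sender hops on at A and does not care who picks the message up.
        self._queues.setdefault(destination, queue.Queue()).put(message)

    def receive_all(self, destination):
        # The receiver hops off at B whenever it is ready; nothing is lost meanwhile.
        pending = self._queues.get(destination)
        drained = []
        while pending is not None and not pending.empty():
            drained.append(pending.get())
        return drained


def demo():
    bus = MessageBus()
    # The producer publishes three records while the consumer is busy elsewhere.
    for job_id in range(3):
        bus.send("accounting", {"job": job_id})
    # Later, the consumer works through the backlog in arrival order.
    return bus.receive_all("accounting")


if __name__ == "__main__":
    print(demo())
```

A real broker adds what this toy leaves out: persistence across restarts, delivery over the network and access control.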
GridCast 12:00 Citizen Cyberscience Wrap-up
So it's the end of the very first Citizen Cyberscience Summit here in London and, although the event has been full of discussion, I think everyone can agree that we've had a successful couple of days. To round up the Summit we had a quick chat with Francois Grey, coordinator of the Citizen Cyberscience Centre and one of the brains behind the Summit. Watch below for Francois' thoughts on the event and what he thinks we've achieved in the last two days.

GridCast 10:39 Welcome to e-ScienceTalk
With all of the exciting news coming out of the Citizen Cyberscience Summit you might have missed the news that GridTalk, the project behind GridCast, has now come to an end. But don't fret - although GridTalk is over, it is being replaced by a new (and some might say better) project called e-ScienceTalk. e-ScienceTalk will take over GridTalk's role in bringing you the latest news in grid computing and, what's more, it will also cover super computing, networking and volunteer computing too. Project manager Catherine Gater gave us a quick introduction to e-ScienceTalk at the Summit earlier today.

GridCast 08:50 Creating maps for Africa
One of the projects presented this morning was AfricaMap, a Citizen Cyberscience Centre project, which aims to use volunteers in Africa to create maps for Africa. We chatted to Peter Amoako Yirenkyi who told us a little more about the project.

GridCast 07:51 When computers were human

The photo above may look rather ordinary but in reality this office is much more exciting than it seems. The men and women working away at these desks are in fact a sort of human computer, employed by the Mathematical Tables Project in New York City in the 1940s.

The Mathematical Tables Project comprised 450 'human computers', mostly people with low incomes who were close to homelessness. The large majority of the staff had not even completed high school, yet they were brought together to perform calculations for government and scientists in an era when computers were not yet up to the task.

The Mathematical Tables Office Computing Floor shown in the photo was split up according to arithmetic function. Workers were assigned to either addition, subtraction, multiplication, or (if they were deemed skilled enough) division calculations. Working from 1938 to 1948, they produced tables of powers, trig functions, probability functions, and contributed to the Handbook of Mathematical Functions, the largest selling scientific book in scientific history. In fact you've most probably used the book yourself.

The Mathematical Tables Project was in a sense an early form of crowd surfing, says David Grier, who introduced the group to us in the opening session of the Summit yesterday. And for those of you who weren't lucky enough to hear his talk you can find out more about the group and the remarkable stories behind it in David's book 'When Computers Were Human' from Princeton University Press.
GridCast 06:40 Stormy weather at the CCC summit
Navy mariners sailing the chilly waters around the Falkland Islands in 1914, gearing up for the battles on the horizon, were probably not thinking very closely about the weather. Nevertheless, strict and meticulous records were kept in the ships’ logbooks throughout WW1 and beyond. For climate scientists at today’s Met Office, trying to see back through the fog of missing weather data, to get a better picture of what the world’s climate looked like before proper records began, these log books are a gold mine of information.

Soon to be launched on www.zooniverse.org is a volunteer computing project to help dig out this key data from the log books: pressures, temperatures, date, location. Optical character recognition is not an option; this project needs human eyes – lots of them, as over 250,000 images have been scanned in to date.

Using this data, scientists can build up a much more complete picture of weather data, potentially covering hundreds of years. Philip Brohan of the Met Office reminded us of the devastating storm that hit Britain on October 16 in 1987 – those of us in the audience who were nodding in recognition giving away our age. But comparing this event to the last major storm of 1703 is difficult without the intervening data – not too many in the audience were able to recall that one quite so easily.

So for anyone interested in helping to fill in the gaps, it’s as simple as going online, picking a ship and joining the crew. Along the way, there’s also the chance of revealing fascinating historical details that perhaps no one else has spotted among the thousands of pages. The keeper of HMS Invincible’s log book, in between jotting down the weather that morning and evening, describes picking up survivors during the Battle of Falkland in December 1914. This is a well known event, but there could be others hidden away that are not yet uncovered. This is definitely a project I’ll be looking out for!
GridCast 04:54 The Citizen Cyberscience Summit online
This morning we've been hearing about volunteer computing projects in the pipeline. From solving Sudoku puzzles to creating detailed maps of Africa, the possibilities of citizen cyberscience appear to be endless and there are some amazing ideas coming out of the Summit today.

However, if you're not lucky enough to be at the Summit today, there are still plenty of ways to keep up with all the action online. We're streaming the talks live on the web at the King's College Anatomy Theatre and Museum website. And for those of you in America or Australia, we're also hoping to archive filmed versions of the talks, plus the speakers' presentations, after the Summit is over, so there's no need to wake up before sunrise or stay up until the wee hours.

For a quick snapshot of the attendees' thoughts on the speakers and talks, take a look at Twitter; just search for the #cybersci tag. And last but not least, don't forget to keep checking the blogosphere. We'll be updating this blog throughout the day. There are also some great summaries of all the speakers' talks on Suw Charman-Anderson's blog, Strange Attractor; check it out here.
GridCast 03:24 An interview with Becky Parker
One person that got everybody talking yesterday was the amazing Becky Parker, a 'superteacher', from Langton Grammar School in Kent. Becky is inspiring her pupils to get interested in physics by setting up CERN@school, a network of cosmic ray detectors, and her enthusiasm for the project was infectious.

We grabbed Becky to find out a little more about the project - watch below to find out more about her fab ideas. And if that's not enough, you can read more in this article from iSGTW.

Thursday 02 September

GridCast 11:44 CERN@school @ the Citizen Cyberscience Summit

Today at the Citizen Cyberscience Summit in London we have had one of the most lively and entertaining events I’ve attended in ages. Under the banner of citizen cyberscience in all its guises, we’ve heard about human computers from David Grier, about finding prime numbers, now running to hundreds of thousands of digits, from PrimeGrid, and from the founder of BOINC and SETI@home, David Anderson.

The story of herbaria@home struck a particular chord as well for me – as a former museum curator in a previous life, the idea of using volunteers to help catalogue otherwise inaccessible botanical collections seems like an inspired idea. Instead of spending several weeks in a darkened museum cupboard cataloguing 200 pairs of 19th century spectacles, as I found myself doing a few years ago, today I could have asked a team of enthusiastic volunteers to give me a hand from the comfort of their own homes.

Among a host of enthusiastic presenters, a highlight for me was Becky Parker’s energetic presentation on building cosmic ray detector networks in schools. Through regular trips to CERN her class has had a unique insight into the world of particle physics and cosmic rays – a far cry from the experiments we did in school rolling little trucks down slopes attached to ticker tapes. Becky introduced us to the CERN@school initiative, which sends Medipix detector chips, as used in the Large Hadron Collider, into schools. A pilot group of schools in Kent is using these chips to gather data about secondary cosmic rays in the atmosphere, and Queen Mary University’s GridPP project is helping them to process the data. Hopefully this is a pilot that will take off into a whole host of schools, bringing cyberscience to the citizens of tomorrow.
GridCast 10:13 Herbaria@home
Citizen cyberscience projects don't just help physicists and astronomers. Scientists from all sorts of disciplines can benefit from the efforts put in by citizens across the world.

A Herbaria@home specimen
With this in mind, this morning we heard from Tom Humphrey of the Botanical Society of the British Isles who explained how volunteers are helping to classify UK plant specimens.

His project, Herbaria@Home, asks volunteers to decipher plant specimen labels, providing information such as site name and date. This allows botanists to map these plants and track the spread (or not) of species across the UK. This provides vital biodiversity information which can be used for future studies of taxonomy, ecology, conservation and genetic biodiversity.

For hard-to-read handwritten labels volunteers need an opportunity to collaborate and discuss with each other, so like many other successful projects Herbaria@home has a very active message board. All information provided by volunteers is open to peer review and can be edited, with the public edit history accessible to all.

As an added bonus, uploading these plant specimens onto the internet for analysis means that Herbaria@home is opening up museums' collections of plant material, which are otherwise largely inaccessible.

To date Herbaria@home has classified a total of 70,000 species and looks set to continue well into the future.
GridCast 09:14 The Charity Engine by Mark McAndrew
Mark McAndrew is talking from the Citizen Cyberscience Summit in London about his startup: The Charity Engine.
You can find more on their website: http://www.charity-engine.com/


GridCast 07:58 Citizen Cyberscience Summit: Hanny Van Arkel's interview
While volunteering for the Galaxy Zoo project, teacher Hanny van Arkel came across an unexplained astronomical object - Hanny's Voorwerp. Hanny tells us more about her find below - you can also read more at her website http://www.hannysvoorwerp.com/

UK Grids 07:38 National Grid Service - A flurry of activity
A batch of new speakers has recently been announced for the NGS Innovation Forum '10 and further information about their presentations is now on the website. The latest presentations are NGS tool demos, which will consist of walk-throughs of NGS tools using real research examples so delegates can leave the event with knowledge of new tools to use in their research.

NGS tool demos
1. Transcriptome Analysis using the NGS User Interface /Workload Management System (UI/WMS) – Jonathan Churchill, NGS, STFC RAL
The UI/WMS is a tool which allows users to easily submit jobs to the whole of the NGS, relying on the WMS to choose which NGS resources to use for their jobs. Use of the UI/WMS will be demonstrated with a user case study in which analysis time of mRNA was decreased from a month to less than 12 hours.

2. Accessing the NGS using the Application Hosting Environment (AHE) – Stefan Zasada, UCL
An overview of how access to the NGS can be simplified using the Application Hosting Environment, a lightweight application portal system.

3. Using the HERMES data management tool – David Wallom, NGS, University of Oxford
Here we will show how easy it is to install and connect into various NGS resources to move data between them, your home institution and your desktop.

4. The NGS from the CCP4 desktop – Matteo Turilli, NGS, University of Oxford
The NGS R&D theme has been working to build access to the NGS into the desktop tools that researchers use on a day-to-day basis. In this presentation we look at the example of CCP4: Software for Macromolecular X-Ray Crystallography.

We also have a presentation from the Director of the NGS -
The future of the NGS – Neil Geddes, NGS Director, STFC RAL
This presentation will look at the focus of activities for the NGS for the coming 2-3 years and possible longer term opportunities.

Remember that registration for the event is now open and that the call for poster abstracts closes on the 10th of September!
GridCast 05:40 The Citizen Cyberscience Summit
Hello!

As you can probably tell from the last post, over the next two days we're going to be blogging from the Citizen Cyberscience Summit, being held at King's College London. The Summit is a great chance for scientists, developers and citizens to get together and talk about how they can all work together to solve problems that would otherwise be rather tricky. Citizen cyberscience has so far given birth to projects such as Galaxy Zoo, Where's George? and SETI@home, all of which have speakers at the conference.

All in all it promises to be an interesting couple of days. We'll be posting videos and interesting snippets from the conference right here on GridCast but, in the meantime if you want to get up-to-date on the conference, take a look at the Citizen Cyberscience blog which has lots more info from the speakers and organisers on their opinions of this exciting branch of science. You can also catch all the action on Twitter - just use the #cybersci tag.
GridCast 03:55 Welcome to Citizen Cyberscience Summit in London!
The Citizen Cyberscience Summit just started one hour ago.
Here are the 1st images:




Wednesday 01 September

UK Grids 09:14 SouthGrid - APEL on ngsce-test
APEL was failing on ngsce-test with the following error.

java.io.FileNotFoundException: /var/spool/pbs/server_priv/accounting/20090522 (Too many open files)

The solution was to type:
ulimit -n 10240

I've added this to the /opt/glite/bin/apel-pbs-log-parser script.
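
For anyone hitting the same "Too many open files" error in their own scripts, the same idea can be applied from inside a process. The following is a sketch in Python, not the APEL fix itself: the function names are hypothetical, and it assumes a Unix-like system where the standard resource module is available. It raises the soft limit on open files towards 10240 without going above the hard limit and without ever lowering an existing limit.

```python
import resource


def plan_soft_limit(soft, hard, wanted, unlimited=resource.RLIM_INFINITY):
    """Choose a soft limit: as close to `wanted` as `hard` allows, never below `soft`."""
    if soft == unlimited:
        return soft  # already unlimited, nothing to raise
    target = wanted if hard == unlimited else min(wanted, hard)
    return max(soft, target)


def raise_open_file_limit(wanted=10240):
    """Raise this process's open-file soft limit, like `ulimit -n` in a shell."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_soft = plan_soft_limit(soft, hard, wanted)
    if new_soft != soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft


if __name__ == "__main__":
    print("open-file soft limit is now", raise_open_file_limit())
```

As with ulimit in a shell script, the change only affects the current process and its children.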

A fix is in test, so a new version of APEL will fix it.
see GGUS ticket
https://gus.fzk.de/ws/ticket_info.php?ticket=60674
UK Grids 03:47 NorthGrid - Atlas jobs in Manchester
August has seen a really notable increase in Atlas user pilot jobs: over 34000 jobs, of which more than 12000 were in the last 4 days alone. Plotting the number of jobs since the beginning of the year shows an inversion between production and user pilots.



The trend in August was probably helped by moving all the space to the DATADISK space token and attracting more interesting data. LOCALGROUP is also heavily used in Manchester.

In the past 4 days we have also applied the XFS file system tuning suggested by John, which solves the load problem the data servers have experienced since upgrading to SL5. The tweak has notably increased the data throughput and reduced the load on the data servers practically to zero, allowing us to increase the number of concurrent jobs. This has allowed a bigger job throughput and has clearly improved job efficiency. The least efficient jobs are now the very short ones (<10 mins CPU time), and even there the improvement is notable, as can be seen from the plots below.


Before applying the tweak




After applying the tweak




This also means we can keep on using XFS for the data servers, which currently gives us more flexibility as far as partition sizes are concerned.

Tuesday 31 August

UK Grids 13:24 NorthGrid - Tuning Areca RAID controllers for XFS on SL5
Sites (including Liverpool) running DPM on pool nodes running SL5 with XFS file systems have been experiencing very high load (load averages of up to multiple 100s and close to 100% CPU IO wait) when a number of analysis jobs were accessing data simultaneously with rfcp.

The exact same hardware and file systems under SL4 had shown no excessive load, and the SL5 systems had shown no problems under system stress testing/burn-in. Also, the problem was occurring from a relatively small number of parallel transfers (about 5 or more on Liverpool's systems were enough to show an increased load compared to SL4).

Some admins have found that using ext4 at least alleviates the problem, although apparently it still occurs under enough load. Migrating production servers with TBs of live data from one FS to another isn't hard but would be a drawn-out process for many sites.

The fundamental problem for either FS appears to be IOPS overload on the arrays rather than sheer throughput, although why this is occurring so much under SL5 and not under SL4 is still a bit of a mystery. There may be changes in controller drivers, XFS, kernel block access, DPM access patterns or default parameters.

When faced with an IOPS overload (one that leaves you well below the theoretical throughput of the array), one solution is to make each IO operation fetch more data from the storage device, so that you make fewer but larger read requests.

This leads to the actual fix (we have been doing this by default on our 3ware systems but we just assumed the Areca defaults were already optimal).

blockdev --setra 16384 /dev/$RAIDDEVICE

This sets the block device read ahead to (16384/2)kB (8MB). We have previously (on 3ware controllers) had to do this to get the full throughput from the controller. The default on our Areca 1280MLs is 128 (64kB read ahead). So when lots of parallel transfers are occurring, our arrays have been thrashing spindles pulling off small 64kB chunks from each different file. These files are usually many hundreds or thousands of MB, where reading MBs at a time would be much more efficient.

The mystery for us is more why the SL4 systems *don't* overload rather than why SL5 does, as the SL4 systems use the exact same default values.

Here is a ganglia plot of our pool nodes under about as much load as we can put on them at the moment. Note that previously our SL5 nodes would have LAs in the 10s or 100s under this load or less.

http://hep.ph.liv.ac.uk/~jbland/xfs-fix.html

Any time the systems go above 1LA now is when they're also having data written at a high rate. On that note, we also hadn't configured our Arecas to have their block max sector size aligned with the RAID chunk size with

echo "64" > /sys/block/$RAIDDEVICE/queue/max_sectors_kb

although we don't think this had any bearing on the overloading and it might not be necessary.

We expect the tweak to also work for systems running ext4, as the underlying hardware access would still be a bottleneck, just at a different level of access.

Note that this 'fix' doesn't fix the even more fundamental problem, as pointed out by others, that DPM doesn't rate limit connections to pool nodes. All this fix does is (hopefully) push the current limit where overload occurs above the point that our WNs can pull data.

There is also a concern that using a big read ahead may affect small random (RFIO) access, although sites can tune this parameter very quickly to get optimum access. 8MB is slightly arbitrary but 64kB is certainly too small for any sensible access I can envisage to LHC data. Most access is via full file copy (rfcp) reads at the moment.
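
The read-ahead arithmetic above is easy to check. Here is a small Python sketch of it, relying on the fact that blockdev --setra counts 512-byte sectors. The 1000 MB file size is just an illustrative figure in line with "many hundreds or thousands of MB", and the read counts are idealised: they assume every read fetches exactly one read-ahead window.

```python
SECTOR_BYTES = 512  # blockdev --setra takes its value in 512-byte sectors


def readahead_kib(sectors):
    """Convert a --setra value in sectors to kB of read ahead."""
    return sectors * SECTOR_BYTES // 1024


def reads_needed(file_mib, sectors):
    """Idealised number of read requests to stream a file of file_mib MB."""
    window_kib = readahead_kib(sectors)
    file_kib = file_mib * 1024
    return -(-file_kib // window_kib)  # ceiling division


if __name__ == "__main__":
    for setra in (128, 16384):
        print(setra, "sectors =", readahead_kib(setra), "kB read ahead,",
              reads_needed(1000, setra), "reads for a 1000 MB file")
```

With the Areca default of 128 sectors a 1000 MB file needs 16000 such reads, against 125 with the 16384 setting, which is the "fewer but larger read requests" effect described above.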
UK Grids 08:37 National Grid Service - Registration for NGS Innovation Forum open now!
The registration for the NGS Innovation Forum is now open - details of how to register can be found on the event page on the NGS website.

We are pleased to announce another speaker for the event. Neil Geddes, director of the NGS, will be speaking about the future of the NGS so if you are a long term NGS user who wants to know the future direction of the NGS or a new user who is planning to use the NGS for the long term, then make sure you attend!

A reminder that we are still looking for NGS users to submit poster abstracts demonstrating how they use our resources in their research. The deadline for abstracts is the 10th of September so it's approaching soon! There are many benefits of submitting an abstract and attending the event -

  • Walk through demos of new NGS tools
  • NGS staff on-hand to answer your questions
  • The opportunity to contribute and feedback to the future of the NGS
  • The poster abstracts will be peer reviewed by the NGS IF'10 programme committee
  • Publicity for your research both at the event and through accepted posters being placed on the NGS website
  • The chance to win a prize as "best poster" as voted for by IF'10 delegates
All that is required is a short 200 word abstract! Of course you are more than welcome to attend the event without submitting an abstract and you can attend for one or both days. We hope you can come along!




 iSGTW 1 September 2010

Feature - The forecast before the storm

Q&A - Joe Hellerstein on cloud programming

Q&A - People behind EGI: Steve Brewer steps in as the voice of the user

Poll of the week - Rock stars of scientific computing

Videos of the week - NoHardware.com destroys server huggers' equipment

 Announcements

Symposium on Authentication Technologies for Research and Education abstracts due

Grace Hopper early bird registration due

Gordon Conference 2010 abstracts due

Jobs in distributed computing


 Mark your calendar

September 2010

6-8, IASTED in Botswana

6-9, PRACE Training Week

6-10, GridKa School 2010

13-15, CaBIG

13-16, UK All Hands Meeting

14-17, EGI Technical Forum

20-24, Cluster 2010

27-29, ICT 2010

21-23, Cybera Summit 2010
