
How grid computing helped CERN hunt the Higgs

The first ATLAS inner detector end-cap after complete insertion within the liquid argon cryostat. Image courtesy ATLAS experiment © CERN.

“As a layman, I’d say we have it.” It was with these words that CERN’s director general, Rolf Heuer, last month announced the discovery of a particle consistent with the Higgs boson, the long-sought cornerstone of particle physics’ standard model. The scientific results on which Heuer based his statement, taken from the two experiments involved, ATLAS and CMS, are set to be published in an upcoming issue of Physics Letters B. What many people outside particle physics may not know is that distributed computing played a crucial role in the race towards this discovery.

“Particle physics is nowadays an international and highly data-intensive field of science, and it requires a massive international computing effort,” said Roger Jones, ATLAS physicist and collaboration board chair of the Worldwide LHC Computing Grid (WLCG), the organization that supplies this huge computing effort. Founded in 2002, the WLCG today comprises more than 170 computing centers in 36 countries, making it the largest scientific computing grid in the world.

The Worldwide LHC Computing Grid (WLCG), which serves all the experiments on the LHC. Top left is the ATLAS detector; ‘evt’ stands for ‘event’, a physics term used to describe a set of particle interactions from the collision of two proton bunches.

Of its three tiers, Tier 0 is located at CERN in Switzerland, next to the ATLAS experiment. It has a capacity of about 68,000 cores, roughly a third of the grid’s total of approximately 235,000 cores. Tier 0 is linked to the Tier 1 centers, which are typically regional research institutes, and each of those connects to a series of Tier 2 computer centers, mostly situated at universities. The bandwidth used is impressive: 1.5–2 gigabytes per second flow continuously from CERN to the Tier 1 centers, and the worldwide flow of LHC-related data is 7.5–10 gigabytes per second.
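As a rough sanity check on those figures, the sustained rates translate into daily volumes like so. This is a back-of-the-envelope sketch using only the numbers quoted above, not official WLCG accounting:

```python
# Back-of-the-envelope conversion of sustained transfer rates into
# daily data volumes, using the figures quoted in the article.

SECONDS_PER_DAY = 24 * 60 * 60

def daily_volume_tb(rate_gb_per_s: float) -> float:
    """Terabytes moved per day at a sustained rate in gigabytes/second."""
    return rate_gb_per_s * SECONDS_PER_DAY / 1000.0

# CERN -> Tier 1 traffic at 1.5-2 GB/s:
low, high = daily_volume_tb(1.5), daily_volume_tb(2.0)
print(f"CERN to Tier 1: {low:.0f}-{high:.0f} TB per day")  # → 130-173 TB per day
```

In other words, the continuous CERN-to-Tier-1 stream alone amounts to well over a hundred terabytes every day.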

The process of handling particle physics data can be broken down into three main parts: first, reconstruction of the raw data from the detectors; second, production of simulations of what theory predicts should be seen in the detector (the most data-intensive part); and third, the physics analysis itself.
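The three stages can be caricatured as a toy pipeline. Every function name and all the logic here are purely illustrative stand-ins, nothing like the experiments’ actual software frameworks:

```python
# Toy three-stage workflow: simulate -> reconstruct -> analyse.
# All names and logic are illustrative stand-ins, not real ATLAS/CMS code.

def simulate(n_events):
    """Stage 2 (Monte Carlo): produce predicted raw-like events."""
    return [{"hits": 9} for _ in range(n_events)]

def reconstruct(raw_event):
    """Stage 1: turn raw detector readout into reconstructed quantities."""
    return {"tracks": raw_event["hits"] // 3}

def analyse(events):
    """Stage 3: select and count events of interest."""
    return sum(1 for e in events if e["tracks"] >= 3)

reconstructed = [reconstruct(e) for e in simulate(5)]
print(analyse(reconstructed))  # all 5 toy events pass the selection
```

The same reconstruction and analysis code runs over both real detector data and the simulated events, which is what makes the simulation step so computationally expensive at scale.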

“In 2011 and 2012 ATLAS alone generated roughly six petabytes of raw data and a similar amount of derived data,” said Jamie Boyd, data preparation coordinator for ATLAS. “This is orders of magnitude more than previous particle physics experiments.” The challenge is not unique to ATLAS, but is shared by CMS, its sister experiment on the LHC. “CMS is today able to routinely sustain weekly data traffic of about 1.5 petabytes over a complex topology of Tier centers,” said Daniele Bonacorsi, deputy CMS computing coordinator. “The smooth handling of large volumes of data has been crucial for the LHC experiments to explore their physics potential.”

The data volume is exceptionally high because ATLAS, for instance, has about 100 million readout channels. Proton-proton collisions produce a high number of particles, requiring more computing power to reconstruct them. In addition, the LHC was unprecedentedly successful in 2012, resulting in what physicists call ‘pile-up’ (multiple overlapping proton-proton interactions per bunch crossing) beyond the design level of the machine.


The amount of data, or luminosity, delivered to ATLAS by the LHC. Note that the blue line for 2012 is much steeper than the red one for 2011, indicating the extent of the challenge for the experiments and the WLCG.

Add to that the pressure to produce a result in time for the biggest particle physics conference of the year, ICHEP, held in Melbourne, Australia, and the race was well and truly on.

Of course, the physicists were not entirely taken by surprise, as they had been preparing for the demands of LHC operation for some time. “We reduced the CPU time to reconstruct one event to 25 seconds per event,” said ATLAS computing coordinator Hans von der Schmitt. “From 2010 to 2012 we increased the capacity of the WLCG by 50%.” But when the crunch came in the first half of this year, even this was not enough. “At one point we had to borrow access to 2,000 more CPUs in Tier 0 to keep up with the LHC, and currently many of the funding agencies are providing us with about 20% more capacity than they had originally pledged.”
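To put the 25-seconds-per-event figure in context, a naive throughput estimate looks like this. It assumes perfect parallel efficiency and that every Tier 0 core does nothing but reconstruction, neither of which holds in practice:

```python
# Naive Tier 0 reconstruction throughput from the figures quoted above.
CPU_SECONDS_PER_EVENT = 25      # reconstruction cost per event (article's figure)
TIER0_CORES = 68_000            # approximate Tier 0 capacity (article's figure)
SECONDS_PER_DAY = 86_400

events_per_day = TIER0_CORES * SECONDS_PER_DAY // CPU_SECONDS_PER_EVENT
print(f"~{events_per_day:,} events/day at full, ideal utilization")
```

Even this idealized ceiling of a few hundred million events per day shows why shaving seconds off the per-event reconstruction time matters so much.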

Following last month's announcement, physicists are now trying to more closely identify the “Higgs-like” particle. “Because the next step in the science requires us to look for differential distributions in the data, we will be needing much more of it,” said Boyd. This translates into even more computing capacity, but scientists also hope to make the existing capacity more efficient. 


The total number of ATLAS jobs on the WLCG from January to July 2012. The majority are the blue Monte Carlo (MC) simulation production jobs, closely followed by physics analysis in red. Note the peak at the end of June, when scientists were intensively preparing for the Higgs announcement on 4 July.

“We would like to change the computing models to require fewer copies of the data world-wide, and to minimize the number of bytes needed to store a physics event,” said von der Schmitt. CMS also has some performance-improving tricks up its sleeve. “We are working on more dynamic data placement tactics and extending use of remote access techniques,” said Bonacorsi. Specialized processing units (originally developed for computer graphics) and multi-threading will also help.
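Multi-threading in this context means processing many events concurrently instead of one at a time. A minimal sketch of the idea follows; the code is hypothetical, and the experiments’ actual frameworks use far more sophisticated schedulers:

```python
# Minimal sketch of concurrent per-event processing (illustrative only).
from concurrent.futures import ThreadPoolExecutor

def reconstruct(event_id: int) -> int:
    """Stand-in for a CPU-heavy per-event reconstruction step."""
    return (event_id * event_id) % 97

# Process a batch of events concurrently rather than sequentially.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(reconstruct, range(8)))
print(results)  # → [0, 1, 4, 9, 16, 25, 36, 49]
```

Because events are independent of one another, this kind of per-event parallelism maps naturally onto multi-core CPUs, and the same independence is what makes graphics-style processors attractive.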

Physicists are not so secretly hoping that their more detailed analyses will reveal something unusual about the Higgs boson leading to new physics. Who knows? But one thing is for sure: without distributed computing, their scientific advances would not be possible.
 
