Figure 1 - available via license: CC BY
Dislocation of the INFN divisions


Source publication
The year 2017 was most likely a turning point for the INFN Tier-1. Early in the morning of November 9th, 2017, a large pipe of the city aqueduct, located under the road next to CNAF, broke. As a consequence, a river of water and mud flowed towards the Tier-1 data center. The level of the water did not exceed the threshold of safety of the wate...

Context in source publication

Context 1
... Physics (INFN) is the research agency, funded by the Italian government, dedicated to the study of the fundamental constituents of matter and the laws that govern them. The INFN is composed of more than 20 divisions located at the main Italian University Physics Departments, 4 Laboratories and 3 National Centers dedicated to specific tasks (Fig. ...


Citations

... All power distribution is carried out over two separate physical lines (referred to as the "red" and "green" lines); consequently, it is technically possible to provide a fully redundant dual power supply to all the IT hardware installed in our data center. At the end of 2017 a main water pipeline located under the street in front of the building broke [2], flooding the lower underground levels of our Tier-1, including the whole power room and parts of the two IT resource rooms. This event made our Tier-1 unavailable for several months and forced us to take note of the physical weaknesses of our building and the limits of the current monitoring and alarm system. ...
Over the last few years we have renewed the Building Management System (BMS) software of our data center with the aim of improving its data collection capability. Considering the complex physical distribution of the technical plants and the limits of the building that currently hosts our center, a system that simply monitors and collects all the necessary information and raises alarms only in case of major failures has proven to be unsatisfactory. In 2017 we suffered a major flood caused by the failure of a main water pipeline in the public street. This disastrous event was clearly far beyond our control, but it forced us to reconsider completely the physical robustness of our site, in addition to the capabilities of the current monitoring and alarm system. It became clear that in some specific cases alerts should be triggered hours or days before the actual problem arises, in order to allow efficient human intervention and a proper escalation process. This paradigm could be applied to almost all the infrastructure components at our site, mainly the electric power distribution and continuity systems and the cooling devices. For this reason, in parallel with a substantial increase in the number and distribution of the sensors feeding our BMS data collector, we have started a study of the applicability of a predictive maintenance approach to our site. Predictive maintenance techniques aim at preventing unexpected failures of infrastructure components, or major events, by studying the whole body of collected monitoring data and building appropriate statistical models with the help of big data analysis and machine learning techniques. Improved monitoring of the Power Distribution Units (PDUs) and the introduction of a dedicated network of water leak sensors were the first steps towards increasing the data at our disposal.
With sufficient monitoring data stored in our BMS, a preliminary and exploratory proof of concept for predictive data analysis could be constructed. This could lead to the model building phase and the creation of a prototype aimed at forecasting future major infrastructure failures and forthcoming error conditions. The general idea is an approach to predictive maintenance in which scheduled corrective actions can be introduced to prevent potential failures in the near future and increase the overall reliability of the site.
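The idea of raising an alert hours before a problem arises can be illustrated with a very small sketch. It is not the authors' model: it assumes a single sensor whose readings drift roughly linearly, fits a least-squares line to the recent readings, and extrapolates the time left before a threshold is crossed. The function names (`linear_fit`, `hours_until_threshold`, `should_alert`), the threshold, and the lead time are all hypothetical and chosen only for illustration; a real predictive maintenance model would likely combine many sensors and more robust statistics.

```python
from typing import Optional, Sequence, Tuple


def linear_fit(times: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of values = slope * t + intercept."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two (time, value) pairs of equal length")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    if var_t == 0:
        raise ValueError("all timestamps are identical")
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = cov / var_t
    return slope, mean_v - slope * mean_t


def hours_until_threshold(
    times_h: Sequence[float], values: Sequence[float], threshold: float
) -> Optional[float]:
    """Estimate hours from the latest reading until the fitted trend reaches threshold.

    Returns 0.0 if the fitted trend is already at or above the threshold,
    and None if the trend is flat or decreasing (no crossing expected).
    """
    slope, intercept = linear_fit(times_h, values)
    latest_t = times_h[-1]
    if slope * latest_t + intercept >= threshold:
        return 0.0
    if slope <= 0:
        return None
    return (threshold - intercept) / slope - latest_t


def should_alert(
    times_h: Sequence[float],
    values: Sequence[float],
    threshold: float,
    lead_time_h: float,
) -> bool:
    """Alert when the extrapolated crossing falls within the desired lead time."""
    eta = hours_until_threshold(times_h, values, threshold)
    return eta is not None and eta <= lead_time_h


if __name__ == "__main__":
    # Hypothetical hourly temperature readings (degrees C) from one sensor.
    hours = [0.0, 1.0, 2.0, 3.0]
    temps = [20.0, 21.0, 22.0, 23.0]
    print(hours_until_threshold(hours, temps, threshold=30.0))
    print(should_alert(hours, temps, threshold=30.0, lead_time_h=8.0))
```

The lead time is the design parameter that matters here: it should be at least as long as the time needed for human intervention and escalation, which is the requirement stated in the text above.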