Advancements in Big Data Processing in the ATLAS and CMS Experiments
Author: Vaniachine A. V.
Bibliographic reference: Vaniachine A. V. Advancements in Big Data Processing in the ATLAS and CMS Experiments // URL: (accessed: 07.09.2013).

The ever-increasing volumes of scientific data present new challenges for distributed computing and Grid technologies. The emerging Big Data revolution is driving exploration in scientific fields including nanotechnology, astrophysics, high-energy physics, biology, and medicine. New initiatives are transforming data-driven scientific fields, enabling massive data analysis in new ways. In petascale data processing scientists deal with datasets, not individual files. As a result, a task (comprising many jobs) has become the unit of petascale data processing on the Grid. Splitting a large data processing task into jobs enables fine-granularity checkpointing, analogous to the splitting of a large file into smaller TCP/IP packets during data transfer. Transferring large data in small packets achieves reliability through automatic re-sending of dropped TCP/IP packets. Similarly, transient job failures on the Grid can be recovered by automatic retries, achieving reliable six-sigma production quality in petascale data processing on the Grid. The computing experience of the ATLAS and CMS experiments provides a foundation for the reliability engineering needed to scale Grid technologies for data processing beyond the petascale.
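The retry arithmetic behind this reliability claim can be sketched as follows. This is a minimal illustration, not the experiments' actual workflow code; the per-attempt success probability, job count, and retry limit are assumed numbers chosen only to show the effect of automatic retries:

```python
def task_success_probability(p_job: float, n_jobs: int, max_retries: int) -> float:
    """Probability that every job in a task eventually succeeds,
    assuming independent transient failures and automatic retries.

    p_job       -- per-attempt success probability of one job (assumed)
    n_jobs      -- number of jobs the task is split into (assumed)
    max_retries -- automatic retries allowed after the first attempt
    """
    # A single job succeeds if any of its (max_retries + 1) attempts succeeds.
    p_one_job = 1.0 - (1.0 - p_job) ** (max_retries + 1)
    # The task as a whole succeeds only if all of its jobs succeed.
    return p_one_job ** n_jobs

# Illustrative numbers: 100,000 jobs with 99% per-attempt success.
# Without retries the task almost surely fails, because some job among
# the 100,000 is virtually certain to hit a transient failure; three
# automatic retries push each job's failure probability to 0.01**4 = 1e-8,
# so the whole task now succeeds with probability ~0.999.
print(task_success_probability(0.99, 100_000, 0))   # effectively 0
print(task_success_probability(0.99, 100_000, 3))   # ~0.999
```

This mirrors the TCP/IP analogy in the abstract: just as re-sending dropped packets makes an unreliable channel reliable, retrying transiently failed jobs makes the task-level success probability approach one.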

Keywords: Parallel computing, Algorithms, Big Data, Grid, Data transfer, ATLAS experiment, CMS experiment
Code: Vaniachine 13
Last edited: 02.11.2013 21:34:00