On this page I briefly describe the projects I participated in within the CMS experiment:
I designed and implemented the CMS Data Aggregation System (DAS).
The idea behind DAS is quite simple. You have a bunch of data-services, each of which can
deliver information about some data (the meta-data). Some of the fields in the meta-data
are related, e.g. an item name. Instead of visiting every vendor web site, you want a queryable
solution to pull out information about your items and present it in one place. In a nutshell,
you want to build a cartesian product from multiple data-sources.
The DAS project was designed to address these issues.
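The aggregation idea can be illustrated with a minimal sketch (this is not the actual DAS code; the service names and record fields below are hypothetical): two independent data-services return records about the same items, and we join them on a shared meta-data key to present one merged view.

```python
# A toy illustration of the DAS aggregation idea: merge records coming
# from two hypothetical data-services on a common meta-data field.

def aggregate(records_a, records_b, key):
    """Join two lists of records on the shared field `key`."""
    index = {rec[key]: dict(rec) for rec in records_a}
    for rec in records_b:
        index.setdefault(rec[key], {}).update(rec)
    return list(index.values())

# Hypothetical responses from two independent data-services
bookkeeping = [{"dataset": "/A/B/RECO", "nevents": 1000}]
locations   = [{"dataset": "/A/B/RECO", "site": "T1_US_FNAL"}]

merged = aggregate(bookkeeping, locations, key="dataset")
```

In the real system the merge logic must of course also handle conflicting fields and missing keys; the sketch only shows the core idea of building a combined view across services.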
In CMS we had a dozen data-services that were designed to collect and store
meta-data information from various sub-systems of the detector: for example, the
Data Bookkeeping System, SiteDB, the file transfer and location systems, Monte Carlo DB,
Luminosity DB, Run Summary DB, etc. All of them were developed at different stages
of the experiment using different technologies. The DAS project provides a
uniform umbrella for our users to access all information from those data-services
via the simple, flexible and intuitive DAS Query Language (QL). DAS also
provides a common cache layer, based on CouchDB and memcached technologies,
for all CMS data-services.
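To give a flavor of a key/value-style query language, here is an illustrative parser for queries of the form "entity field=value field2=value2". The syntax is a simplified stand-in for demonstration, not the exact DAS QL grammar.

```python
# Illustrative parser for a DAS-like query; the grammar here is a
# simplified stand-in, not the real DAS Query Language.

def parse_query(query):
    """Split a query into its entity and a dict of field=value conditions."""
    parts = query.split()
    entity, conditions = parts[0], {}
    for cond in parts[1:]:
        field, _, value = cond.partition("=")
        conditions[field] = value
    return entity, conditions

entity, conds = parse_query("dataset site=T1_US_FNAL tier=RECO")
```

A real query language would also support operators, wildcards and aggregation functions; the point here is only that the user names an entity and constrains it by meta-data fields, without knowing which data-service holds which field.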
More information about DAS can be found in the
DAS blog.
See also cmssh, an interactive, programmable shell environment for the CMS
experiment. The cmssh shell allowed CMS users to easily discover their favorite data,
perform data transfers across distributed storage elements and
carry out data-management tasks.
I designed and implemented the CMS Data Bookkeeping System (DBS) in collaboration
with the software development team from FNAL (Fermi National Accelerator Laboratory).
My primary responsibilities were schema and API design, and the design and
development of the Data Discovery toolkit and the DBS Query Language, a flexible and
powerful approach enabling users to find data within the CMS physics data catalog.
The DBS comprises a database and
the services used to store and access metadata related to CMS physics data.
In addition to the existing web-based and programmatic APIs, a generalized
query system has been designed and built. This query system has a query
language that hides the complexity of the underlying database structure,
providing a way of querying the system that is straightforward for
CMS data managers and physicists. The DBS Query Language uses the ANTLR tool
to build the input query parser and tokenizer; a query builder then uses
a graph representation of the DBS schema to construct the actual SQL sent
to the underlying database.
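The "query builder over a schema graph" step can be sketched as follows. The schema is modeled as a graph whose nodes are tables and whose edges are foreign-key relations; a shortest path between the table the user selects from and the table holding a constraint yields the JOIN chain. The table and column names below are hypothetical, not the real DBS schema.

```python
# Sketch of SQL construction from a graph representation of a schema.
# Tables and join conditions are hypothetical placeholders.
from collections import deque

SCHEMA = {  # table -> {neighbor table: join condition}
    "dataset": {"block": "dataset.id = block.dataset_id"},
    "block":   {"dataset": "dataset.id = block.dataset_id",
                "files":   "block.id = files.block_id"},
    "files":   {"block": "block.id = files.block_id"},
}

def join_path(start, target):
    """Breadth-first search for the chain of tables linking start to target."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in SCHEMA[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def build_sql(select_table, where_table, condition):
    """Emit SELECT ... JOIN ... WHERE along the discovered join path."""
    path = join_path(select_table, where_table)
    sql = f"SELECT {select_table}.* FROM {select_table}"
    for a, b in zip(path, path[1:]):
        sql += f" JOIN {b} ON {SCHEMA[a][b]}"
    return sql + f" WHERE {condition}"

sql = build_sql("dataset", "files", "files.name = :fname")
```

This is the essential benefit of the approach: the user states a constraint on one entity and asks for another, and the join chain between them is derived from the schema graph rather than written by hand.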
For more information please see the
DBS twiki.
References:
- CMS CR-2009/076 note or the up-coming CHEP 2009 paper.
I designed and developed (in collaboration with Brian Bockelman)
the CMS FileMover web-service. Its primary goal was to simplify access to
data files located at different CMS Tier centers.
The CMS experiment has a distributed computing model,
supporting thousands of physicists at hundreds of sites around the world.
While this is a suitable solution for "day to day" work in the LHC era,
there are edge use-cases that Grid solutions do not satisfy. Occasionally
it is desirable to have direct access to a file on a user's desktop or
laptop, e.g. for code development, debugging or examining event displays.
FileMover is a user-friendly, web-based tool that bridges the gap
between the large-scale Grid resources and the smaller, simpler user
edge cases.
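The workflow can be pictured with a toy model (purely illustrative; the real FileMover service, its API and its state names are not reproduced here): a user requests a logical file name, the service stages the file from a Tier site to local storage, and the user polls until it is ready for a plain HTTP download.

```python
# Toy model of a FileMover-like request/stage/poll workflow.
# States and method names are illustrative, not the real service API.

class FileMoverSketch:
    def __init__(self):
        self.requests = {}  # logical file name -> status

    def request(self, lfn):
        """A user asks for a file by its logical file name."""
        self.requests[lfn] = "queued"

    def stage(self, lfn):
        """In reality this step triggers a Grid transfer to local storage."""
        if self.requests.get(lfn) == "queued":
            self.requests[lfn] = "ready"

    def status(self, lfn):
        """Users poll this until the file is ready to download over HTTP."""
        return self.requests.get(lfn, "unknown")

fm = FileMoverSketch()
fm.request("/store/data/Run1/file.root")
fm.stage("/store/data/Run1/file.root")
```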
For reference please see the CHEP 09 FileMover
talk.
For more information see the following twiki
page.
My involvement in the CMS C++ framework was limited to the following items:
- Adding access to provenance information for "non-Event" data
- Replacing boost::signals with libsigc++ signals
My involvement in the CMS C++ event display was limited to the following items:
- Initial design of a Table widget based on ROOT
- Remote access to data files via the FileMover interface
For reference please see the CHEP 09 Fireworks paper.