We are at the start of a new era of information-rich astronomy. Several ongoing sky surveys over a range of wavelengths are now generating data sets measured in many terabytes (a terabyte is approximately equal to the amount of information contained in 2 million large books). The surveys are creating catalogs of objects (stars, galaxies, quasars, etc.) numbering in billions, with up to a hundred measured numbers for each object. Yet, this is just a foretaste of much larger data sets to come.
This vast amount of new information about the universe presents both a great scientific opportunity and a great technological challenge. Astronomers will be able to tackle some major problems with unprecedented accuracy, e.g., mapping the large-scale structure of the universe, the structure of our Galaxy, etc. The unprecedented size of the data sets will enable searches for extremely rare types of astronomical objects, and may well lead to surprising discoveries of previously unknown types of objects or previously unknown astrophysical phenomena. Combining surveys done at different wavelengths, from radio and infrared, through visible light, ultraviolet, and x-rays, both from ground-based telescopes and from space observatories, will provide a new, panchromatic picture of our universe, and lead to a better understanding of the objects in it. These are the kinds of scientific investigations which were not practically possible with the much more limited data sets of the past.
For the first time, astronomers will have data sets whose full information content exceeds the original purposes for which the data were obtained. This opens a new field of data mining of digital sky surveys: using the data for newly conceived projects and exploring the vast data parameter spaces.
This will be a new way of doing observational astronomy: with a computer, rather than a telescope. Anyone, anywhere, with a network connection and a clever idea will be able to do first-rate science. This may include scientists and students far away from the major, established universities and institutes: from North Dakota to the south of India, and all points between. This new way of doing science will enable talented people anywhere to make valuable contributions to astronomy.
This great opportunity comes with a commensurate technological challenge: how to manage, combine, analyse and explore these vast amounts of information, and to do it quickly and efficiently? The data volumes here are several orders of magnitude larger than what astronomers are used to dealing with, and old methods simply do not work. There are questions of how to optimally store and access such complex data, how to combine sky surveys done at different wavelengths, how to visualise them, how to search through them, etc. These technical problems, which are common to all data-intensive fields, require the development of a new generation of computing tools, and the implications and possible applications of such techniques reach well beyond astronomy.
There is now a rapidly growing interest in these concepts among astronomers and space scientists world-wide, and we expect to see major new national initiatives along these lines in the next few years. But the initial exploration has already started. Astronomers at Caltech and JPL, in collaboration with computer scientists, have begun to lay the groundwork for such a future National Virtual Observatory (NVO).
Starting from the ongoing digital sky surveys, e.g., the Digital Palomar Observatory Sky Survey (DPOSS), led by Prof. George Djorgovski, and the infrared Two-Micron All-Sky Survey (TMASS), conducted out of IPAC, the Digital Sky project, led by Prof. Tom Prince, aims to develop some of the tools necessary for these tasks. Some of the tools may include applications of artificial intelligence, and many aspects of supercomputing technology.
Our vision is ultimately to combine a wide range of large digital sky surveys, to include new data sets as they become available, and to provide a set of sophisticated tools which scientists and students can use to explore and scientifically exploit these enormous data sets, perhaps through a web interface. We must develop these new tools in order to achieve the scientific potential enabled by the data, and test and improve them by doing actual science demonstration projects with them. This would be a working prototype of the future National Virtual Observatory, and it is likely that Caltech and JPL will be positioned as major contributors to its development and use.
The NVO will likely grow into a Global Virtual Observatory, serving as the fundamental information infrastructure for astronomy and astrophysics at the turn of the century and beyond. The time to start on this path is now.
For more info, please contact George Djorgovski:
Aug '99