• managing and processing exponentially growing data volumes, often arriving in time-sensitive streams from arrays of sensors and instruments, or as the output of simulations; and
• significantly reducing data analysis cycles so that researchers can make timely decisions.
• new algorithms that can scale to search and process massive datasets;
• new metadata management technologies that can scale to handle complex, heterogeneous, and distributed data sources;
• advances in high-performance computing platforms to provide uniform high-speed memory access to multiterabyte data structures;
• specialized hybrid interconnect architectures to process and filter multigigabyte data streams coming from high-speed networks, scientific instruments, and simulations;
• high-performance, high-reliability, petascale distributed file systems;
• new approaches to software mobility, so that algorithms can execute on the nodes where the data resides when it is too expensive to move the raw data to another processing site (a compute-to-data sketch follows this list);
• flexible and high-performance software integration technologies that facilitate the plug-and-play integration of software components running on diverse computing platforms to quickly form analytical pipelines (see the pipeline-composition sketch after this list); and
• data signature generation techniques for data reduction and rapid processing (see the signature sketch after this list).
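The software-mobility item amounts to shipping the algorithm to the data rather than the data to the algorithm. The following is a minimal single-process sketch of that dispatch pattern; the `Node` and `dispatch` names, and the simulation of remote nodes as in-memory objects, are illustrative assumptions rather than any particular system's API.

```python
# Minimal compute-to-data sketch: nodes are simulated in-process, and only the
# analysis function and its small result cross the "network"; names are hypothetical.
from typing import Callable, Dict, List


class Node:
    """Simulated processing node that holds a local data partition."""

    def __init__(self, name: str, partition: List[float]):
        self.name = name
        self.partition = partition

    def run(self, func: Callable[[List[float]], float]) -> float:
        # The algorithm executes where the data resides; only a scalar returns.
        return func(self.partition)


def dispatch(nodes: Dict[str, Node], func: Callable[[List[float]], float]) -> Dict[str, float]:
    """Send the same analysis function to every node instead of moving raw data."""
    return {name: node.run(func) for name, node in nodes.items()}


if __name__ == "__main__":
    nodes = {
        "node-a": Node("node-a", [1.0, 2.0, 3.0]),
        "node-b": Node("node-b", [10.0, 20.0]),
    }
    # Per-node means (a few bytes each) are returned; the partitions never move.
    print(dispatch(nodes, lambda xs: sum(xs) / len(xs)))
```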
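The plug-and-play integration item is, in essence, a requirement that independently developed components share a common streaming interface so they can be chained into analytical pipelines. Below is a minimal sketch under that assumption; the `Stage` alias, the example stages, and `build_pipeline` are hypothetical names, not a reference to any existing integration framework.

```python
# Minimal pipeline-composition sketch: each stage maps one record stream to another,
# so stages developed separately can be plugged together; names are hypothetical.
from typing import Callable, Iterable, Iterator

Record = dict
Stage = Callable[[Iterable[Record]], Iterator[Record]]


def filter_valid(records: Iterable[Record]) -> Iterator[Record]:
    """Drop records flagged as invalid by the upstream instrument."""
    return (r for r in records if r.get("valid", True))


def normalize(records: Iterable[Record]) -> Iterator[Record]:
    """Rescale the measured value into a common unit."""
    for r in records:
        yield {**r, "value": r["value"] / 1000.0}


def build_pipeline(*stages: Stage) -> Stage:
    """Compose independently developed stages into a single analytical pipeline."""
    def run(records: Iterable[Record]) -> Iterator[Record]:
        for stage in stages:
            records = stage(records)
        return iter(records)
    return run


if __name__ == "__main__":
    pipeline = build_pipeline(filter_valid, normalize)
    data = [{"value": 1500.0}, {"value": 800.0, "valid": False}]
    print(list(pipeline(data)))
```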
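The final item, data signature generation, reduces a large dataset to compact fingerprints that can be compared or filtered far faster than the raw bytes. The sketch below assumes fixed-size blocks and a SHA-256 digest as the signature; the block size and function names are illustrative choices only.

```python
# Minimal data-signature sketch: fixed-size blocks are summarized by SHA-256 digests,
# and changed blocks are located by comparing signatures rather than raw data.
import hashlib
from typing import Dict, Iterator, List, Tuple

BLOCK_SIZE = 64 * 1024  # 64 KiB blocks; an assumed value, not a prescribed one


def block_signatures(data: bytes) -> Iterator[Tuple[int, str]]:
    """Yield (block index, SHA-256 hex digest) for each fixed-size block."""
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        yield offset // BLOCK_SIZE, hashlib.sha256(block).hexdigest()


def changed_blocks(old: Dict[int, str], new: Dict[int, str]) -> List[int]:
    """Compare two signature maps; only blocks whose digests differ need reprocessing."""
    return sorted(i for i in new if old.get(i) != new[i])


if __name__ == "__main__":
    original = bytes(range(256)) * 1024          # ~256 KiB of synthetic data
    updated = bytearray(original)
    updated[70000] ^= 0xFF                       # perturb one byte in the second block
    old_sigs = dict(block_signatures(original))
    new_sigs = dict(block_signatures(bytes(updated)))
    # Comparing 32-byte signatures localizes the change without retaining or
    # shipping the original raw data.
    print("changed blocks:", changed_blocks(old_sigs, new_sigs))
```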