Community Access to MODIS Satellite Reprojection and Reduction Pipeline and Data Sets

Val Hendrix 1 Jie Li 2 Keith Jackson 1 Lavanya Ramakrishnan 3 Youngryel Ryu 4 Keith Beattie 1 Christine Morin 5, 6 David Skinner 7 Catharine Van Ingen 8 Deb Agarwal 1
1 Advanced Computing for Science Department
ACS - Advanced Computing for Science Department
2 eScience Group
CS - Department of Computer Science [Charlottesville]
3 Advanced Computing for Science
ACS - Advanced Computing for Science Department
4 Dept. Landscape Architecture and Rural Systems Engineering
Department of Landscape Architecture and Rural Systems Engineering
5 MYRIADS - Design and Implementation of Autonomous Distributed Systems
IRISA-D1 - SYSTÈMES LARGE ÉCHELLE, Inria Rennes – Bretagne Atlantique
7 National Energy Research Scientific Computing Center
NERSC - National Energy Research Scientific Computing Center
8 San Francisco
Microsoft Research [Redmond]
Abstract: The Moderate Resolution Imaging Spectroradiometer (MODIS), the key instrument aboard NASA's Terra and Aqua satellites, continuously generates data as the satellites cover the entire surface of the Earth every one to two days. This data is important to many scientific analyses; however, data procurement and processing can be challenging and cumbersome for user communities. Our current work is focused on enabling calculations using a combination of land and atmosphere products over land. Before performing the calculation, the data must be downloaded and transformed from a swath space and time system to a sinusoidal tiling system. Downloading an entire year of data for a single product can take several days and involves downloading many small files (on average ~83,000 files) in Hierarchical Data Format (HDF4) via FTP. The data processing, a swath-to-sinusoidal reprojection, is computationally intensive, and currently available community tools only work for single sinusoidal tiles. We have developed a data-processing pipeline that downloads the MODIS products and reprojects them on HPC systems. HPC systems do not traditionally run these high-throughput, data-intensive jobs, and hence we need to address unique challenges for our pipeline. The first stage in the pipeline uses a catalog to determine which files need to be downloaded and downloads the identified data sets. In the future, the downloaded files will trigger an event that causes the reprojection job to be entered into a job queue. The output data is stored in an archival system. The resulting reprojected data will soon be widely available to the community through a front-end web portal. The portal will allow users to download reprojected data (~1 TB/year) for the following land and atmosphere products: MOD04_L2 (Aerosol), MOD05_L2 (Water Vapor), MOD06_L2 (Cloud), MOD07_L2 (Atmosphere Profile) and MOD11_L2 (Land Surface Temperature and Emissivity).
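The pipeline stages described in the abstract (a catalog lookup, the download step, and a download event that enqueues a reprojection job) could be modelled roughly as follows. This is only an illustrative sketch and not the authors' implementation: the `Pipeline` class, its method names, and the file names are hypothetical, the download is a placeholder rather than a real FTP transfer, and the event hook reflects behaviour the abstract describes as planned.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """Toy model: catalog-driven download followed by queued reprojection."""

    catalog: set                                      # files listed by the remote archive (hypothetical)
    local: set = field(default_factory=set)           # files already downloaded
    job_queue: deque = field(default_factory=deque)   # pending reprojection jobs

    def files_to_download(self):
        # Stage 1: compare the catalog with local holdings.
        return sorted(self.catalog - self.local)

    def download(self, name):
        # Placeholder for the real transfer (the abstract mentions FTP).
        self.local.add(name)
        self.on_downloaded(name)

    def on_downloaded(self, name):
        # Event hook: a finished download enqueues a reprojection job.
        self.job_queue.append(("reproject", name))

    def run_downloads(self):
        for name in self.files_to_download():
            self.download(name)


pipeline = Pipeline(catalog={"a.hdf", "b.hdf", "c.hdf"}, local={"a.hdf"})
pipeline.run_downloads()
print(list(pipeline.job_queue))
```

In this sketch only the two files missing locally are downloaded, and each one produces exactly one queued job, so re-running `run_downloads()` adds nothing new.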
In this talk we will describe the architecture of the overall system and pipeline. Our long-term plan is to allow users to reproject data on demand and/or run algorithms, such as an evapotranspiration calculation, on the reprojected MODIS data.
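To make the sinusoidal tiling system more concrete, the sketch below maps a latitude/longitude pair to a MODIS sinusoidal tile index (h, v). It shows only the forward projection and tile lookup, not the swath-to-sinusoidal reprojection performed by the pipeline. The grid constants are the values commonly cited for the MODIS sinusoidal grid and are assumptions here, since the abstract does not state them; the function name `latlon_to_tile` is hypothetical.

```python
import math

# Commonly cited MODIS sinusoidal grid constants (assumed, not from the abstract).
R = 6371007.181      # radius of the reference sphere, metres
T = 1111950.0        # edge length of one tile, metres
X_MIN = -20015109.0  # western edge of the grid, metres
Y_MAX = 10007555.0   # northern edge of the grid, metres


def latlon_to_tile(lat_deg, lon_deg):
    """Return the (h, v) sinusoidal tile containing a geographic point."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Sinusoidal forward projection on a sphere.
    x = R * lon * math.cos(lat)
    y = R * lat
    # Tiles are counted eastward from X_MIN and southward from Y_MAX.
    h = int(math.floor((x - X_MIN) / T))
    v = int(math.floor((Y_MAX - y) / T))
    return h, v


print(latlon_to_tile(37.87, -122.27))  # (8, 5)
```

A swath granule generally overlaps several such tiles, which is one reason tools limited to a single sinusoidal tile are restrictive.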
Document type:
Conference paper
AGU, 2012, San Francisco, United States. 2012

https://hal.inria.fr/hal-00762851
Contributor: Christine Morin
Submitted on: Saturday, 8 December 2012 - 15:43:09
Last modified on: Tuesday, 16 January 2018 - 15:54:19

Identifiers

  • HAL Id : hal-00762851, version 1

Citation

Val Hendrix, Jie Li, Keith Jackson, Lavanya Ramakrishnan, Youngryel Ryu, et al. Community Access to MODIS Satellite Reprojection and Reduction Pipeline and Data Sets. AGU, 2012, San Francisco, United States. 2012. 〈hal-00762851〉
