LAADS Data Download Scripts

Downloading Files Using LAADS DAAC App Keys

ESDIS, our parent organization, requires that we track who downloads files. In addition, MODIS and VIIRS Science teams require that we protect some data from download by the general public but permit it for authorized users.

To meet these requirements, ESDIS has implemented Earthdata Profile (URS), a profile manager that helps us keep a record of anyone using our services. In addition to URS, LAADS manages authorizations that restrict access to certain resources by user and by type of access.

To access a restricted resource (e.g. SENTINEL-3), a user must first obtain authorization from the owner of that resource and then log in. Users who are already logged in to LAADS when authorization is granted should not need to log out and back in to see the change.

Deprecation of FTP

File Transfer Protocol (FTP) has been around for decades and has been used successfully for transferring files via FTP clients and numerous scripting languages. Unfortunately, many cybersecurity experts also consider FTP a security risk.

NASA’s Goddard Space Flight Center will be blocking all requests to public facing FTP servers—including LAADS DAAC and LANCE NRT—as of 20 April 2018.

Downloading via HTTP

Hypertext Transfer Protocol (HTTP) is the protocol that drives most web site internet traffic today. A variant of the protocol, called “HTTPS”, “S” for “Secure”, has been chosen to replace FTP.

HTTPS encrypts all transactions between client and server. This makes it extremely difficult for third parties to intercept what is being transferred.

All HTTPS downloads will require either an active login session or an app key (i.e. token) passed in via an Authorization HTTP header. LAADS DAAC currently requires HTTPS downloading of all data.

Earthdata Profile

Each user must create an Earthdata URS profile in order to download files. LAADS tracks each URS profile by email address, not by URS username.

Authorization

After creating an Earthdata profile, you may request authorization to access a resource from the owner of that resource. The following rubric will guide you through the process for each type of resource:

Resource: MERIS or SENTINEL-3

Instructions: Any user can agree to the corresponding license agreements when attempting to download a MERIS or SENTINEL-3 file. Each user will be prompted to agree to the terms of the license at that time.

Resource: Other

Instructions: Have your PI contact us with the details. We need to know the following:

  1. what resource to grant access to (e.g. a private data set)
  2. the email address (not your username) that you used when registering with Earthdata.

If you are not directly associated with a MODIS, VIIRS, or GOES-R Science Team then you will need to contact us directly. Please tell us:

  1. which resource you need access to, along with a justification
  2. the email address (not your username) that you used when registering with Earthdata

We usually confirm access with the project PI. Once confirmed, you will be notified that you have access to that resource.

App Keys

Users wishing to download via a browser simply need to log in and click links within the LAADS Archive. Scripted downloads will need to use LAADS app keys in order to be properly authorized.

LAADS app keys are string tokens that identify who you are. App keys get passed in the Authorization header of each HTTP GET request. See code samples below.
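
As a minimal sketch of what passing an app key in the Authorization header looks like in code, the Python 3 snippet below builds (but does not send) a GET request. The helper name build_request is hypothetical and not part of any LAADS tooling; MY_APP_KEY and PATH_TO_MY_FILE are placeholders to be replaced with your own values.

```python
from urllib.request import Request


def build_request(url, app_key):
    """Return a GET request carrying the app key as a Bearer token."""
    # The header has the form "Authorization: Bearer <app key>"
    return Request(url, headers={"Authorization": "Bearer " + app_key})


# Placeholders: substitute a real file path and your own app key.
req = build_request(
    "https://ladsweb.modaps.eosdis.nasa.gov/PATH_TO_MY_FILE", "MY_APP_KEY")
print(req.get_method(), req.get_header("Authorization"))
```

To actually perform the download, the request object could be passed to urllib.request.urlopen.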

Requesting an App Key

Any user that does not already have an app key for LAADS DAAC can perform the following steps:

  1. log in by going to Profile -> Earthdata Login
  2. select Profile -> App Keys from the top menu
  3. create your new app key by entering a description and clicking the "Create New App Key" button

Retrieving an App Key

Any forgotten or lost app keys can be retrieved via the following steps:

  1. log in to the LAADS DAAC web site using the Profile -> Earthdata Login menu
  2. select Profile -> App Keys from the menu
  3. highlight and copy any existing app key from the list

How to use an App Key

This example uses the curl command to make a request for the file on our web server at the URL https://ladsweb.modaps.eosdis.nasa.gov/PATH_TO_MY_FILE

curl -v -H 'Authorization: Bearer MY_APP_KEY' 'https://ladsweb.modaps.eosdis.nasa.gov/PATH_TO_MY_FILE' > result

The example above passes your app key via the Authorization HTTP header using the Bearer scheme. When the command finishes, the download will have been written to a file called “result” in whatever directory (folder) you ran the command from.

The app key located in the header is how LAADS identifies users. If the app key is invalid, missing, or an -h is used instead of -H, the curl command will not work. The -v parameter tells curl to be verbose so extra information about the request and response will be printed out. If this extra information is not needed the -v parameter can be left off.

curl is available for all current operating systems, including Linux, macOS, and Microsoft Windows.

Note

  1. all characters in the command are important, including dashes, colons, and quotation marks
  2. copy your app key in place of the string MY_APP_KEY
  3. copy the path to the file you need in place of the string PATH_TO_MY_FILE
  4. Most LAADS DAAC file paths look like

    archive/allData/COLLECTION/PRODUCT/YEAR/DAY_OF_YEAR/FILENAME

    An example of an existing file is below.

    https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/6/MOD02QKM/2007/018/MOD02QKM.A2007018.0105.006.2014227230926.hdf

    This path should return a MODIS Terra quarter kilometer (250 m) top of atmosphere reflectance product for year 2007, day-of-year 018 (i.e. January 18), from collection 6.
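
The path layout above can be sketched in Python as follows. This is an illustrative helper only: the function name laads_file_url is hypothetical, and it assumes the file follows the common COLLECTION/PRODUCT/YEAR/DAY_OF_YEAR/FILENAME pattern, which not every LAADS path does.

```python
from datetime import date

BASE = "https://ladsweb.modaps.eosdis.nasa.gov/archive/allData"


def laads_file_url(collection, product, day, filename):
    """Assemble BASE/COLLECTION/PRODUCT/YEAR/DAY_OF_YEAR/FILENAME."""
    # %j formats the date as a zero-padded day of year (001-366)
    day_of_year = day.strftime("%j")
    return "/".join([BASE, str(collection), product,
                     str(day.year), day_of_year, filename])


# January 18, 2007 is day-of-year 018, matching the example file above.
url = laads_file_url(6, "MOD02QKM", date(2007, 1, 18),
                     "MOD02QKM.A2007018.0105.006.2014227230926.hdf")
print(url)
```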

Automation

If all you need is one file and you know which file it is, it is much easier to go to the LAADS archive and click to download as needed. If you need many files (e.g. all of last month's MOD09 data) you might prefer to rely on scripts. We have samples for Shell Script, Perl, and Python.
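
For a request such as a whole month of one product, a script first needs the list of day-of-year directories to visit. The sketch below shows one way to generate them in Python. The function name month_directories is hypothetical, and the collection and product values are placeholders; check the LAADS archive for the directory names that actually exist for your product.

```python
from datetime import date, timedelta


def month_directories(collection, product, year, month):
    """Return one archive directory path per day of the given month."""
    day = date(year, month, 1)
    paths = []
    # Step one day at a time until the calendar rolls into the next month
    while day.month == month:
        paths.append("archive/allData/%s/%s/%d/%s"
                     % (collection, product, day.year, day.strftime("%j")))
        day += timedelta(days=1)
    return paths


# Placeholder values for illustration only.
dirs = month_directories(6, "MOD09", 2018, 2)
print(len(dirs), dirs[0], dirs[-1])
```

Each resulting path can then be used as the source directory for one of the download methods described on this page.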

Code Samples

Most current programming languages support HTTPS communication or can call on applications that support HTTPS communication. See sample scripts below. We provide samples for wget, Linux shell script, Perl, and Python. When recursively downloading entire directories of files, wget will likely require the least amount of code to run.

To use these, click "Download source" to download or copy and paste the code into a file with an extension reflecting the programming language (.sh for Shell Script, .pl for Perl, .py for Python). Be sure the Unix execute permissions are set for the file. Lastly, open a terminal or shell and execute the file. Command-line examples are also included below.
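
The shell, Perl, and Python samples share one pattern: request the directory URL with .json appended, treat entries whose size is 0 as subdirectories to recurse into, and download everything else. The Python sketch below isolates that classification step. The listing format (a JSON array of objects with name and size fields) is inferred from the sample scripts, the function name split_listing is hypothetical, and the sample data is made up.

```python
import json


def split_listing(listing_json):
    """Split a directory listing into (subdirectory names, file names)."""
    entries = json.loads(listing_json)
    # The sample scripts treat a size of 0 as "this entry is a directory"
    dirs = [e["name"] for e in entries if int(e["size"]) == 0]
    files = [e["name"] for e in entries if int(e["size"]) != 0]
    return dirs, files


# Made-up listing for illustration; a real one comes from SOURCE_URL + ".json"
sample = '[{"name": "018", "size": 0}, {"name": "example.hdf", "size": 1024}]'
print(split_listing(sample))
```

One consequence of this convention is that a genuinely empty file would be mistaken for a directory, which is worth keeping in mind when adapting the scripts.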

wget

wget is an open source utility that can download entire directories of files with just one command. The only path that is required is the root directory. wget will automatically traverse the directory and download any files it locates.

wget is free and available for Linux, macOS, and Windows.

Installation

  1. Linux
    1. Launch a command-line terminal
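    2. Type yum install wget -y (for RPM-based distributions such as CentOS or RHEL; on Debian or Ubuntu, use apt-get install wget -y instead; root privileges may be required)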
  2. macOS
    1. Install Homebrew (admin privileges required)
    2. Launch Applications > Utilities > Terminal
    3. Type brew install wget
  3. Windows
    1. Download the latest 32-bit or 64-bit binary (.exe) for your system
    2. Move it to C:\Windows\System32 (admin privileges will be required)
    3. Click the Windows Menu > Run
    4. Type cmd and hit Enter
    5. In the Windows Command Prompt, type wget -h to verify the binary is being executed successfully
    6. If any errors appear, the wget.exe binary may not be located in the correct directory, or you may need to switch between the 32-bit and 64-bit versions

Command-Line/Terminal Usage:

wget -e robots=off -m -np -R .html,.tmp -nH --cut-dirs=3 "https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/PATH_TO_DATA_DIRECTORY" --header "Authorization: Bearer APP_KEY" -P TARGET_DIRECTORY_ON_YOUR_FILE_SYSTEM

Be sure to replace the following:

  • PATH_TO_DATA_DIRECTORY: location of source directory in LAADS Archive
  • APP_KEY: Your app key
  • TARGET_DIRECTORY_ON_YOUR_FILE_SYSTEM: Where you would like to download the files. Examples include /Users/jdoe/data for macOS and Linux or C:\Users\jdoe\data for Windows
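
When the wget command needs to be launched from a script, building it as an argument list avoids shell quoting problems with the header value. The sketch below assembles the same command in Python, with comments on what each wget option does. The function name wget_command is hypothetical, and the arguments shown are placeholders.

```python
def wget_command(data_path, app_key, target_dir):
    """Return the wget argument list for a recursive LAADS download."""
    url = "https://ladsweb.modaps.eosdis.nasa.gov/archive/allData/" + data_path
    return [
        "wget",
        "-e", "robots=off",      # do not let robots.txt limit the crawl
        "-m",                    # mirror: recursive download with timestamping
        "-np",                   # never ascend to the parent directory
        "-R", ".html,.tmp",      # reject index pages and temporary files
        "-nH",                   # do not create a directory named after the host
        "--cut-dirs=3",          # drop the first three remote path components
        url,
        "--header", "Authorization: Bearer " + app_key,
        "-P", target_dir,        # local directory to save into
    ]


# Placeholder values; substitute your own path, app key, and target directory.
cmd = wget_command("6/MOD02QKM/2007/018", "APP_KEY", "/Users/jdoe/data")
print(" ".join(cmd))
```

The list can then be handed to subprocess.call(cmd) to run the download.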

Linux Shell Script

Download source (remove .txt extension when downloaded)

Command-Line/Terminal Usage:

% laads-data-download.sh
#!/bin/bash

function usage {
  echo "Usage:"
  echo "  $0 [options]"
  echo ""
  echo "Description:"
  echo "  This script will recursively download all files if they don't exist"
  echo "  from a LAADS URL and stores them to the specified path"
  echo ""
  echo "Options:"
  echo "    -s|--source [URL]         Recursively download files at [URL]"
  echo "    -d|--destination [path]   Store directory structure to [path]"
  echo "    -t|--token [token]        Use app token [token] to authenticate"
  echo ""
  echo "Dependencies:"
  echo "  Requires 'jq' which is available as a standalone executable from"
  echo "  https://stedolan.github.io/jq/download/"
}

function recurse {
  local src=$1
  local dest=$2
  local token=$3
  
  echo "Querying ${src}.json"

  for dir in $(curl -s -H "Authorization: Bearer ${token}" "${src}.json" | jq '.[] | select(.size==0) | .name' | tr -d '"')
  do
    echo "Creating ${dest}/${dir}"
    mkdir -p "${dest}/${dir}"
    echo "Recursing ${src}/${dir} for ${dest}/${dir}"
    # pass the token along so nested requests stay authorized; no trailing
    # slash, so that "${src}.json" remains a valid listing URL
    recurse "${src}/${dir}" "${dest}/${dir}" "${token}"
  done

  for file in $(curl -s -H "Authorization: Bearer ${token}" "${src}.json" | jq '.[] | select(.size!=0) | .name' | tr -d '"')
  do
    if [ ! -f "${dest}/${file}" ]
    then
      echo "Downloading $file to ${dest}"
      # replace '-s' with '-#' below for download progress bars
      curl -s -H "Authorization: Bearer ${token}" "${src}/${file}" -o "${dest}/${file}"
    else
      echo "Skipping $file ..."
    fi
  done
}

POSITIONAL=()
while [[ $# -gt 0 ]]
do
  key="$1"

  case $key in
    -s|--source)
    src="$2"
    shift # past argument
    shift # past value
    ;;
    -d|--destination)
    dest="$2"
    shift # past argument
    shift # past value
    ;;
    -t|--token)
    token="$2"
    shift # past argument
    shift # past value
    ;;
    *)    # unknown option
    POSITIONAL+=("$1") # save it in an array for later
    shift # past argument
    ;;
  esac
done

if [ -z ${src+x} ]
then 
  echo "Source is not specified"
  usage
  exit 1
fi

if [ -z ${dest+x} ]
then 
  echo "Destination is not specified"
  usage
  exit 1
fi

if [ -z ${token+x} ]
then 
  echo "Token is not specified"
  usage
  exit 1
fi

recurse "$src" "$dest" "$token"

Perl

Download source (remove .txt extension when downloaded)

Command-Line/Terminal Usage:

% perl laads-data-download.pl
#!/usr/bin/env perl
use strict;
use warnings;
use Getopt::Long qw( :config posix_default bundling no_ignore_case no_auto_abbrev);
use LWP::UserAgent;
use LWP::Simple;
use JSON;

my $source      = undef;
my $destination = undef;
my $token       = undef;

GetOptions( 's|source=s' => \$source, 'd|destination=s' => \$destination, 't|token=s' => \$token) or die usage();

sub usage {
  print "Usage:\n";
  print "  $0 [options]\n\n";
  print "Description:\n";
  print "  This script will recursively download all files if they don't exist\n";
  print "  from a LAADS URL and stores them to the specified path\n\n";
  print "Options:\n";
  print "  -s|--source [URL]         Recursively download files at [URL]\n";
  print "  -d|--destination [path]   Store directory structure to [path]\n";
  print "  -t|--token [token]        Use app token [token] to authenticate\n";
}

sub recurse {
  my $src   = $_[0];
  my $dest  = $_[1];
  my $token = $_[2];
  my $ua = LWP::UserAgent->new;
  print "Recursing $dest\n";
  my $req = HTTP::Request->new(GET => $src.".json");
  $req->header('Authorization' => 'Bearer '.$token);
  my $resp = $ua->request($req);
  if ($resp->is_success) {
    my $message = $resp->decoded_content;
    my $listing = decode_json($message);
    for my $entry (@$listing){
      if($entry->{size} == 0){
        mkdir($dest."/".$entry->{name});
        recurse($src.'/'.$entry->{name}, $dest.'/'.$entry->{name}, $token);
      }
    }

    for my $entry (@$listing){
      # directories (size 0) were already handled in the loop above
      next if $entry->{size} == 0;
      # Set below to 1 for download progress, or consider LWP::UserAgent::ProgressBar
      $ua->show_progress(0);
      if(! -e $dest.'/'.$entry->{name}){
        print "Downloading $dest/$entry->{name}\n";
        my $req = HTTP::Request->new(GET => $src.'/'.$entry->{name});
        $req->header('Authorization' => 'Bearer '.$token);
        my $resp = $ua->request($req, $dest.'/'.$entry->{name});
      } else {
        print "Skipping $entry->{name} ...\n";
      }
    }
  }
  else {
    print "HTTP GET error code: ", $resp->code, "\n";
    print "HTTP GET error message: ", $resp->message, "\n";
  }
}


if(!defined($source)){
  print "Source not set\n";
  usage();
  die;
}

if(!defined($destination)){
  print "Destination not set\n";
  usage();
  die;
}

if(!defined($token)){
  print "Token not set\n";
  usage();
  die;
}

recurse($source, $destination, $token);

Python

Download source (remove .txt extension when downloaded)

Command-Line/Terminal Usage:

% python laads-data-download.py
#!/usr/bin/env python

# script supports either python2 or python3
#
# Attempts to do HTTP GETs with urllib2 (py2), urllib.request (py3), or subprocess
# if tlsv1.1+ isn't supported by the python ssl module
#
# Will download csv or json depending on which python module is available
#

from __future__ import (division, print_function, absolute_import, unicode_literals)

import argparse
import os
import os.path
import shutil
import sys

try:
    from StringIO import StringIO   # python2
except ImportError:
    from io import StringIO         # python3


################################################################################


USERAGENT = 'tis/download.py_1.0--' + sys.version.replace('\n','').replace('\r','')


def geturl(url, token=None, out=None):
    headers = { 'user-agent' : USERAGENT }
    if token is not None:
        headers['Authorization'] = 'Bearer ' + token
    try:
        import ssl
        CTX = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
        if sys.version_info.major == 2:
            import urllib2
            try:
                fh = urllib2.urlopen(urllib2.Request(url, headers=headers), context=CTX)
                if out is None:
                    return fh.read()
                else:
                    shutil.copyfileobj(fh, out)
            except urllib2.HTTPError as e:
                # HTTPError.code is an integer attribute, not a method
                print('HTTP GET error code: %d' % e.code, file=sys.stderr)
                print('HTTP GET error message: %s' % e.msg, file=sys.stderr)
            except urllib2.URLError as e:
                print('Failed to make request: %s' % e.reason, file=sys.stderr)
            return None

        else:
            from urllib.request import urlopen, Request, URLError, HTTPError
            try:
                fh = urlopen(Request(url, headers=headers), context=CTX)
                if out is None:
                    return fh.read().decode('utf-8')
                else:
                    shutil.copyfileobj(fh, out)
            except HTTPError as e:
                # HTTPError.code is an integer attribute; python3 has no e.message
                print('HTTP GET error code: %d' % e.code, file=sys.stderr)
                print('HTTP GET error message: %s' % e.msg, file=sys.stderr)
            except URLError as e:
                print('Failed to make request: %s' % e.reason, file=sys.stderr)
            return None

    except AttributeError:
        # OS X Python 2 and 3 don't support tlsv1.1+ therefore... curl
        import subprocess
        try:
            args = ['curl', '--fail', '-sS', '-L', '--get', url]
            for (k,v) in headers.items():
                args.extend(['-H', ': '.join([k, v])])
            if out is None:
                # python3's subprocess.check_output returns stdout as a byte string
                result = subprocess.check_output(args)
                return result.decode('utf-8') if isinstance(result, bytes) else result
            else:
                subprocess.call(args, stdout=out)
        except subprocess.CalledProcessError as e:
            print('curl GET error message: %s' % (e.message if hasattr(e, 'message') else e.output), file=sys.stderr)
        return None



################################################################################


DESC = "This script will recursively download all files if they don't exist from a LAADS URL and stores them to the specified path"


def sync(src, dest, tok):
    '''synchronize src url with dest directory'''
    try:
        import csv
        files = [ f for f in csv.DictReader(StringIO(geturl('%s.csv' % src, tok)), skipinitialspace=True) ]
    except ImportError:
        import json
        files = json.loads(geturl(src + '.json', tok))

    # use os.path since python 2/3 both support it while pathlib is 3.4+
    for f in files:
        # currently we use filesize of 0 to indicate directory
        filesize = int(f['size'])
        path = os.path.join(dest, f['name'])
        url = src + '/' + f['name']
        if filesize == 0:
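            try:
                print('creating dir:', path)
                # tolerate an existing directory so the script can be re-run
                if not os.path.isdir(path):
                    os.mkdir(path)
                sync(src + '/' + f['name'], path, tok)
            except (IOError, OSError) as e:
                # os.mkdir raises OSError, which python2 does not treat as IOError
                print("mkdir `%s': %s" % (e.filename, e.strerror), file=sys.stderr)
                sys.exit(-1)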
        else:
            try:
                if not os.path.exists(path):
                    print('downloading: ' , path)
                    with open(path, 'w+b') as fh:
                        geturl(url, tok, fh)
                else:
                    print('skipping: ', path)
            except IOError as e:
                print("open `%s': %s" % (e.filename, e.strerror), file=sys.stderr)
                sys.exit(-1)
    return 0


def _main(argv):
    parser = argparse.ArgumentParser(prog=argv[0], description=DESC)
    parser.add_argument('-s', '--source', dest='source', metavar='URL', help='Recursively download files at URL', required=True)
    parser.add_argument('-d', '--destination', dest='destination', metavar='DIR', help='Store directory structure in DIR', required=True)
    parser.add_argument('-t', '--token', dest='token', metavar='TOK', help='Use app token TOK to authenticate', required=True)
    args = parser.parse_args(argv[1:])
    if not os.path.exists(args.destination):
        os.makedirs(args.destination)
    return sync(args.source, args.destination, args.token)


if __name__ == '__main__':
    try:
        sys.exit(_main(sys.argv))
    except KeyboardInterrupt:
        sys.exit(-1)

Last updated: June 11, 2018