Handling Coverages

This document explains the basic principles of handling the most important EOxServer data model: coverages. The layout of the data models is explained in its own chapter.

Since all data models in EOxServer are based on the django.db.models.Model class, the Django model documentation applies to all EOxServer models as well. The Django QuerySet documentation is also highly recommended.

Creating Coverages

As already mentioned, coverages are basically Django models and are created as such.

The following example creates a Rectified Dataset.

from eoxserver.core.util.timetools import parse_iso8601
from django.contrib.gis import geos
from eoxserver.resources.coverages import models


dataset = models.RectifiedDataset(
    identifier="SomeIdentifier",
    size_x=1024, size_y=1024,
    min_x=0, min_y=0, max_x=90, max_y=90, srid=4326,
    begin_time=parse_iso8601("2014-05-10"),
    end_time=parse_iso8601("2014-05-12"),
    footprint=geos.MultiPolygon(geos.Polygon.from_bbox((0, 0, 90, 90)))
)

dataset.full_clean()
dataset.save()

Of course, in a production environment, all of the above values would come from actual data and metadata files and would be parsed by metadata readers.

Also, our dataset is currently not linked to any actual raster files. To do this, we need to create at least one DataItem and add it to our Dataset.

from eoxserver.backends import models as backends


data_item = backends.DataItem(
    dataset=dataset, location="/path/to/your/data.tif", format="image/tiff",
    semantic="bands"
)

data_item.full_clean()
data_item.save()

This would link the dataset to a local file with the path /path/to/your/data.tif.

Note

Be cautious with relative paths! Depending on the deployment of the server instance, the actual meaning of the paths might differ. If you are using Storages or Packages, relative paths are of course fine and unambiguous, since they are relative to the package or storage base location.
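For local files that are neither on a Storage nor in a Package, one way to avoid this ambiguity is to check the location before creating the DataItem. The following is only a minimal sketch; require_absolute is a hypothetical helper and not part of EOxServer.

```python
import os.path


def require_absolute(location):
    """Hypothetical guard (not an EOxServer function).

    Returns the normalized path if it is absolute and raises a
    ValueError for relative paths, whose meaning would depend on the
    working directory of the server process.
    """
    if not os.path.isabs(location):
        raise ValueError("expected an absolute path, got %r" % location)
    return os.path.normpath(location)
```

The returned value could then be used as the location of a DataItem that has neither a storage nor a package.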

If you want to set up a data item that resides in a package (such as a .zip or .tar file) or on a storage (like an HTTP or FTP server), you first need to set up the Package or Storage:

http_storage = backends.Storage(
    url="http://example.com/base_path/",
    storage_type="HTTP"
)
http_storage.full_clean()
http_storage.save()

data_item.storage = http_storage
data_item.full_clean()
data_item.save()

# *or* in case of a package

zip_package = backends.Package(
    location="/path/to/package.zip",
    format="ZIP"
)
zip_package.full_clean()
zip_package.save()

data_item.package = zip_package
data_item.full_clean()
data_item.save()

Note

A DataItem can reside in either a storage or a package, not both. If both a storage and a package are defined, the storage takes precedence. If you want a Package that resides on a Storage, assign the Storage to the Package rather than to the DataItem.

Creating Collections

Collections are created like Coverages, but usually require less initial information, because their metadata is usually collected from the datasets they contain.

The following creates a DatasetSeries, a collection that can contain almost any object of any subtype of EOObject.

dataset_series = models.DatasetSeries(identifier="CollectionIdentifier")
dataset_series.full_clean()
dataset_series.save()

The handling of collections is fairly simple: use insert() to add a dataset or subcollection to a collection and remove() to remove it. Whenever either action is performed, the EO metadata of the collection is updated according to the datasets it contains.

dataset_series.insert(dataset)
dataset_series.footprint  # is now exactly the same as dataset.footprint
dataset_series.begin_time # is now exactly the same as dataset.begin_time
dataset_series.end_time   # is now exactly the same as dataset.end_time

dataset_series.remove(dataset)
dataset_series.footprint  # is now None
dataset_series.begin_time # is now None
dataset_series.end_time   # is now None
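To illustrate the kind of update that insert() and remove() perform, here is a minimal plain-Python sketch of how a combined time extent could be derived. It assumes that the begin_time of a collection is the earliest begin time and its end_time the latest end time of the contained objects. This rule is an assumption, as is the helper name summarize_time_extent, which is not part of EOxServer. Footprints are left out of the sketch.

```python
from datetime import datetime


def summarize_time_extent(extents):
    """Combine (begin_time, end_time) pairs into one overall extent.

    Hypothetical helper: returns (None, None) for an empty collection,
    otherwise the earliest begin and the latest end.
    """
    extents = list(extents)
    if not extents:
        return None, None
    begin = min(b for b, _ in extents)
    end = max(e for _, e in extents)
    return begin, end


first = (datetime(2014, 5, 10), datetime(2014, 5, 12))
second = (datetime(2014, 5, 11), datetime(2014, 5, 20))

single = summarize_time_extent([first])            # same as `first`
combined = summarize_time_extent([first, second])  # 2014-05-10 .. 2014-05-20
empty = summarize_time_extent([])                  # (None, None)
```

With a single dataset the result equals that dataset's extent, and with no datasets both values are None, which matches the behaviour shown above.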

Accessing Coverages

The simplest way to retrieve a coverage is by its ID:

from eoxserver.resources.coverages import models

dataset = models.Coverage.objects.get(identifier="SomeIdentifier")

This always returns an object of type Coverage. To “cast” it to its actual type:

dataset = dataset.cast()

Note

The cast() method only makes a database lookup if the actual type and the current type do not match. Otherwise (and only in this case), the object itself is returned and no lookup is performed.

If you know the exact type of the coverage you want to look up you can also make the query with the desired type:

dataset = models.RectifiedDataset.objects.get(identifier="SomeIdentifier")

If the get() query does not match any object (or matches more than one), an exception is raised (DoesNotExist or MultipleObjectsReturned, respectively).

If you want to query more than one coverage at once (e.g. all coverages in a certain time period), the filter() method is what you want:

from eoxserver.core.util.timetools import parse_iso8601

start = parse_iso8601("2014-05-10")
stop = parse_iso8601("2014-05-12")
coverages_qs = models.Coverage.objects.filter(
    begin_time__gte=start, end_time__lte=stop
)
for coverage in coverages_qs:
    ... # Do whatever you like with the coverage

Note

filter() returns a Django QuerySet, which can be chained to further refine the query. The Django documentation on QuerySets covers this topic in depth and is highly recommended.
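Note that the lookups begin_time__gte=start, end_time__lte=stop select only coverages whose time span lies entirely within the requested period. If coverages that merely overlap the period should match as well, the comparison has to be turned around (in Django lookup terms: begin_time__lte=stop, end_time__gte=start). The following plain-Python sketch, using hypothetical helper names, shows the difference between the two conditions:

```python
from datetime import datetime


def contained_in(begin, end, start, stop):
    """Mirrors begin_time__gte=start, end_time__lte=stop."""
    return begin >= start and end <= stop


def overlaps(begin, end, start, stop):
    """Mirrors begin_time__lte=stop, end_time__gte=start."""
    return begin <= stop and end >= start


start = datetime(2014, 5, 10)
stop = datetime(2014, 5, 12)

# A coverage that starts before the period and ends inside it.
begin, end = datetime(2014, 5, 9), datetime(2014, 5, 11)

inside = contained_in(begin, end, start, stop)  # False
touching = overlaps(begin, end, start, stop)    # True
```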

Usually coverages are organized in collections. If you want to iterate over a collection simply do so:

dataset_series = models.DatasetSeries.objects.get(
    identifier="CollectionIdentifier"
)
for eo_object in dataset_series:
    ...

It is important to note that such an iteration does not yield coverages, but EOObjects. This is because collections might also contain other collections, which do not necessarily inherit from Coverage. If you explicitly want only the Coverages from a collection, you can do it like this:

coverages_qs = models.Coverage.objects.filter(
    collections__in=[dataset_series.pk]
)

You can also combine the filters for searches within a collection:

coverages_qs = dataset_series.eo_objects.filter(
    begin_time__gte=start, end_time__lte=stop
)

# append an additional geometry search
coverages_qs = coverages_qs.filter(
    footprint__intersects=geos.Polygon.from_bbox((30,30,40,40))
)

Note

There is no intrinsic order of EOObjects in a Collection, but the EOObjects can be sorted when they are retrieved from a collection (e.g. by identifier, begin_time or end_time) using the QuerySet's order_by() method.

Accessing Coverage Data

As already discussed, the actual data and metadata files of a coverage are referenced via its associated DataItems. First, it is necessary to select the DataItems that are actually relevant. This depends on the current situation: for example, in a metadata-oriented request (such as the WCS DescribeCoverage operation) only metadata items will be accessed (and only if they are of relevance):

metadata_items = dataset.data_items.filter(
    semantic="metadata", format="eogml"
)

The above example selected only metadata items with the format “eogml”.

In some cases the bands of a coverage are separated into multiple files that have a semantic like this: “bands[x:y]”. To select only those, we can use the startswith field lookup:

band_items = dataset.data_items.filter(
    semantic__startswith="bands"
)
for band_item in band_items:
    # TODO: parse the band index or start/stop indices
    ...
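The parsing step left open in the loop above could look like the following minimal sketch. It assumes that the semantic is written either as "bands", "bands[i]" or "bands[start:stop]" with non-negative integers. The helper parse_band_semantic is hypothetical and not part of EOxServer, and it returns the indices exactly as written, without deciding whether they are zero-based or whether the stop index is inclusive.

```python
import re

# Assumed notation: "bands", "bands[i]" or "bands[start:stop]".
_BAND_SEMANTIC = re.compile(r"^bands(?:\[(\d+)(?::(\d+))?\])?$")


def parse_band_semantic(semantic):
    """Return the (start, stop) indices named in a band semantic.

    Hypothetical helper: a plain "bands" yields (None, None), a single
    index yields the same number twice.
    """
    match = _BAND_SEMANTIC.match(semantic)
    if match is None:
        raise ValueError("not a band semantic: %r" % semantic)
    start, stop = match.groups()
    if start is None:
        return None, None
    start = int(start)
    return start, (int(stop) if stop is not None else start)


whole_file = parse_band_semantic("bands")        # (None, None)
single_band = parse_band_semantic("bands[2]")    # (2, 2)
band_range = parse_band_semantic("bands[0:3]")   # (0, 3)
```

Inside the loop, the helper would be called with band_item.semantic.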

Now that we have our relevant DataItems we can start using them.

We also explained that DataItems can reside on a Storage or inside a Package. Each storage has a specific storage type and each package has a specific format. Which types and formats are available depends on your instance configuration, since they are implemented as Components. EOxServer ships with support for local, HTTP, FTP and Rasdaman storages and for ZIP and TAR packages. The list of both storages and packages can easily be extended by creating plugin Components that implement either the FileStorageInterface, the ConnectedStorageInterface or the PackageInterface. See the documentation for writing Plugins for further information.

To ease the actual data access, there are two main functions: retrieve() and connect().

Both functions operate on DataItems, which are passed as the first parameter.

The function retrieve() returns a path to a local file: for files that are already local, the path is simply passed through; in all other cases the file is downloaded, unpacked, or otherwise retrieved as necessary to make it locally accessible.

from eoxserver.backends.access import retrieve  # assumed module path

data_item = dataset.data_items.get(semantic="metadata")
local_path = retrieve(data_item)

You do not have to care about cleanup afterwards, since this is handled by EOxServer's cache layer.

The function connect() works similarly, except that it also takes into account storages that do not provide files, but streams of data. Currently this only includes the Rasdaman Storage. If this function does not deal with a Connected Storage, it behaves like the retrieve() function.