Handling Coverages¶
This document will explain the basic principles of handling the most important EOxServer data models: coverages. The layout of the data models is explained in its own chapter.
Since all data models in EOxServer are based upon the django.db.models.Model class, all associated Django documentation also applies to EOxServer models. The Django QuerySet documentation is also highly recommended.
Creating Coverages¶
As already mentioned, coverages are basically Django models and are created as such.
The following example creates a Rectified Dataset.
from eoxserver.core.util.timetools import parse_iso8601
from django.contrib.gis import geos
from eoxserver.resources.coverages import models

dataset = models.RectifiedDataset(
    identifier="SomeIdentifier",
    size_x=1024, size_y=1024,
    min_x=0, min_y=0, max_x=90, max_y=90, srid=4326,
    begin_time=parse_iso8601("2014-05-10"),
    end_time=parse_iso8601("2014-05-12"),
    footprint=geos.MultiPolygon(geos.Polygon.from_bbox((0, 0, 90, 90)))
)
dataset.full_clean()
dataset.save()
Of course, in a production environment, all of the above values would come from actual data and metadata files and would be parsed by metadata readers.
Also, our dataset is currently not linked to any actual raster files. To link it, we need to create at least one DataItem and add it to our Dataset.
from eoxserver.backends import models as backends

data_item = backends.DataItem(
    dataset=dataset, location="/path/to/your/data.tif", format="image/tiff",
    semantic="bands"
)
data_item.full_clean()
data_item.save()
This would link the dataset to a local file with the path /path/to/your/data.tif.
Note
Be cautious with relative paths! Depending on the deployment of the server instance, the actual meaning of the paths might differ. If you are using Storages or Packages, relative paths are of course fine and unambiguous, since they are relative to the package or storage base location.
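To avoid the ambiguity described in the note, a local path can be turned into an absolute one before it is stored as the location of a DataItem. The following is a minimal sketch using only the Python standard library; the helper name `to_absolute_location` and its `base_dir` parameter are hypothetical and not part of EOxServer.

```python
import os.path


def to_absolute_location(path, base_dir):
    """Return `path` as a normalised absolute path.

    Absolute paths are only normalised; relative paths are resolved
    against `base_dir` (for example the directory holding the data).
    """
    if not os.path.isabs(path):
        path = os.path.join(base_dir, path)
    return os.path.normpath(path)
```

The result could then be passed as the `location` argument when the DataItem is created, so that the stored path no longer depends on the working directory of the server process.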
If you want to set up a data item that resides in a package (such as a .zip or .tar file) or on a storage (like an HTTP or FTP server), you need to set up the corresponding Package or Storage:
http_storage = backends.Storage(
    url="http://example.com/base_path/",
    storage_type="HTTP"
)
http_storage.full_clean()
http_storage.save()
data_item.storage = http_storage
data_item.full_clean()
data_item.save()

# *or* in case of a package
zip_package = backends.Package(
    location="/path/to/package.zip",
    format="ZIP"
)
zip_package.full_clean()
zip_package.save()
data_item.package = zip_package
data_item.full_clean()
data_item.save()
Note
A DataItem can be in either a storage or a package, but not both. If both a storage and a package are defined, the storage takes precedence. If you want to have a Package that resides on a Storage, you must set the storage of the Package.
Creating Collections¶
Collections are created like Coverages, but usually require less initial information (because the metadata is usually collected from all contained datasets).
The following creates a DatasetSeries, a collection that can contain almost any object of any subtype of EOObject.
dataset_series = models.DatasetSeries(identifier="CollectionIdentifier")
dataset_series.full_clean()
dataset_series.save()
The handling of collections is fairly simple: you use insert() to add a dataset or subcollection to a collection and remove() to remove it. Whenever either action is performed, the EO metadata of the collection is updated according to the contained datasets.
dataset_series.insert(dataset)
dataset_series.footprint # is now exactly the same as dataset.footprint
dataset_series.begin_time # is now exactly the same as dataset.begin_time
dataset_series.end_time # is now exactly the same as dataset.end_time
dataset_series.remove(dataset)
dataset_series.footprint # is now None
dataset_series.begin_time # is now None
dataset_series.end_time # is now None
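The update rule for the time span can be pictured with plain Python. This is an illustrative sketch only, not EOxServer's implementation: it assumes that the collection's time span is the envelope of its members (earliest begin, latest end), and the function name `collect_time_extent` is hypothetical. The footprint is left out to keep the example free of GEOS dependencies.

```python
from datetime import datetime


def collect_time_extent(members):
    """Return (begin_time, end_time) covering all members.

    `members` is an iterable of (begin_time, end_time) tuples.
    An empty collection yields (None, None).
    """
    members = list(members)
    if not members:
        return None, None
    begin = min(begin for begin, _ in members)
    end = max(end for _, end in members)
    return begin, end


a = (datetime(2014, 5, 10), datetime(2014, 5, 12))
b = (datetime(2014, 5, 11), datetime(2014, 5, 20))

single = collect_time_extent([a])   # same extent as the only member
both = collect_time_extent([a, b])  # envelope of both members
empty = collect_time_extent([])     # nothing left in the collection
```

With a single member the result equals that member's time span, and with no members both values are None, which matches the behaviour shown in the comments of the example above.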
Accessing Coverages¶
The simplest way to retrieve a coverage is by its ID:
from eoxserver.resources.coverages import models
dataset = models.Coverage.objects.get(identifier="SomeIdentifier")
This always returns an object of type Coverage. To “cast” it to its actual type, use:
dataset = dataset.cast()
Note
The cast() method only performs a database lookup if the actual type and the current type do not match. If they already match (and only then), the object itself is returned and no lookup is performed.
If you know the exact type of the coverage you want to look up, you can also make the query with the desired type:
dataset = models.RectifiedDataset.objects.get(identifier="SomeIdentifier")
If the get() query does not match exactly one object (that is, it matches none or more than one), an exception is raised: DoesNotExist or MultipleObjectsReturned, respectively.
If you want to query more than one coverage at once (e.g. all coverages in a certain time period), the filter() method is what you want:
from eoxserver.core.util.timetools import parse_iso8601

start = parse_iso8601("2014-05-10")
stop = parse_iso8601("2014-05-12")
coverages_qs = models.Coverage.objects.filter(
    begin_time__gte=start, end_time__lte=stop
)
for coverage in coverages_qs:
    ...  # Do whatever you like with the coverage
Note
filter() returns a Django QuerySet, which can be chained to further refine the actual query. The Django documentation covers this topic extensively and is highly recommended.
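Note that the lookup pair used in the filter() example above, `begin_time__gte=start, end_time__lte=stop`, only matches coverages whose time span lies entirely within the period. The difference to an overlap test can be sketched in plain Python; the two function names are hypothetical and merely mirror the comparison logic of the lookups.

```python
from datetime import datetime


def is_contained(begin, end, start, stop):
    """Mirror of begin_time__gte=start, end_time__lte=stop."""
    return begin >= start and end <= stop


def overlaps(begin, end, start, stop):
    """Mirror of begin_time__lte=stop, end_time__gte=start."""
    return begin <= stop and end >= start


start = datetime(2014, 5, 10)
stop = datetime(2014, 5, 12)

# A coverage that starts inside the period but ends after it.
begin, end = datetime(2014, 5, 11), datetime(2014, 5, 14)

contained = is_contained(begin, end, start, stop)
overlapping = overlaps(begin, end, start, stop)
```

Here the coverage is not contained in the period but does overlap it. If overlapping coverages should also be found, the filter would use `begin_time__lte=stop, end_time__gte=start` instead.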
Usually coverages are organized in collections. If you want to iterate over a collection simply do so:
dataset_series = models.DatasetSeries.objects.get(
    identifier="CollectionIdentifier"
)
for eo_object in dataset_series:
    ...
It is important to note that such an iteration does not yield coverages, but EOObjects. This is because collections might also contain other collections, which do not necessarily inherit from Coverage. If you explicitly want only the Coverages of a collection, you can retrieve them like this:
coverages_qs = models.Coverage.objects.filter(
    collections__in=[dataset_series.pk]
)
You can also combine the filters for searches within a collection:
coverages_qs = dataset_series.eo_objects.filter(
    begin_time__gte=start, end_time__lte=stop
)
# append an additional geometry search
coverages_qs = coverages_qs.filter(
    footprint__intersects=geos.Polygon.from_bbox((30, 30, 40, 40))
)
Note
There is no intrinsic order of EOObjects in a Collection, but the EOObjects can be sorted when they are retrieved from a collection (e.g. by identifier, begin_time or end_time) using the QuerySet's order_by() method.
Accessing Coverage Data¶
As already discussed, the actual data and metadata files of a coverage are referenced via its associated DataItems. First, it is necessary to select the DataItems that are actually relevant. This depends on the situation: for example, in a metadata-oriented request (such as the WCS DescribeCoverage operation) only metadata items will be accessed (and only if they are relevant):
metadata_items = dataset.data_items.filter(
    semantic="metadata", format="eogml"
)
The above example selects only metadata items with the format “eogml”.
In some cases the bands of a coverage are separated into multiple files that have a semantic like this: “bands[x:y]”. To select only those, we can use the startswith field lookup:
band_items = dataset.data_items.filter(
    semantic__startswith="bands"
)
for band_item in band_items:
    # TODO: parse the band index or start/stop indices
    ...
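The TODO above could be filled in along the following lines. The sketch assumes that a semantic is either the bare string `bands`, a single index such as `bands[2]`, or a range such as `bands[1:3]`; the exact syntax and index base used by EOxServer are not specified here, and the helper `parse_band_semantic` is hypothetical.

```python
import re

# Assumed syntax: "bands", "bands[i]" or "bands[i:j]".
_BANDS_RE = re.compile(r"^bands(?:\[(\d+)(?::(\d+))?\])?$")


def parse_band_semantic(semantic):
    """Return the (start, stop) band indices encoded in a semantic.

    Returns (None, None) for the bare "bands" semantic and (i, i)
    for a single index. Raises ValueError for any other string.
    """
    match = _BANDS_RE.match(semantic)
    if match is None:
        raise ValueError("not a band semantic: %r" % semantic)
    start, stop = match.groups()
    if start is None:
        return None, None
    start = int(start)
    stop = int(stop) if stop is not None else start
    return start, stop
```

Inside the loop, `parse_band_semantic(band_item.semantic)` would then yield the indices covered by each file.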
Now that we have our relevant DataItems, we can start using them.
We also explained that the DataItems can reside on a Storage or inside a Package. Each storage has a specific storage type and each package has a specific format. Which types and formats are available depends on your instance configuration, since they are implemented as Components. EOxServer ships with support for local, HTTP, FTP and Rasdaman storages and for ZIP and TAR packages. Both lists can easily be extended by creating plugin Components that implement the FileStorageInterface, the ConnectedStorageInterface or the PackageInterface.
See the documentation for writing Plugins for further info.
To ease the actual data access, there are two main functions: retrieve() and connect().
Both operate on DataItems, which are passed as the first parameter to the function.
The function retrieve() returns a path to a local file: for files that are already local, the path is simply passed through; in all other cases the file is downloaded, unpacked or otherwise retrieved as necessary to make it locally accessible.
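# assumption: retrieve() is importable from eoxserver.backends.access
from eoxserver.backends.access import retrieve

data_item = dataset.data_items.get(semantic="metadata")
local_path = retrieve(data_item)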
You do not have to take care of cleanup afterwards, since this is handled by EOxServer's cache layer.
The function connect() works similarly, except that it also takes into account storages that do not provide files but streams of data. Currently this only includes the Rasdaman Storage. If the data item does not reside on a Connected Storage, connect() behaves like the retrieve() function.
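The relationship between the two functions can be illustrated with a small stand-alone sketch. All classes and functions below are hypothetical stand-ins written for this illustration; they do not reproduce EOxServer's implementation or signatures and only model the fallback rule described above.

```python
class FileStorageSketch:
    """Hypothetical stand-in for a storage that provides files."""

    def retrieve(self, location):
        # A real storage would download the file; here the
        # location is simply treated as an already local path.
        return location


class ConnectedStorageSketch:
    """Hypothetical stand-in for a stream-based storage."""

    def connect(self, location):
        # A real storage would build a connection string.
        return "connection:" + location


def retrieve_sketch(storage, location):
    """Return a local path for the given location."""
    return storage.retrieve(location)


def connect_sketch(storage, location):
    """Return a connection for connected storages, else a local path."""
    if isinstance(storage, ConnectedStorageSketch):
        return storage.connect(location)
    # Not a connected storage: behave exactly like retrieve.
    return retrieve_sketch(storage, location)


local = connect_sketch(FileStorageSketch(), "/path/to/your/data.tif")
stream = connect_sketch(ConnectedStorageSketch(), "collection_name")
```

For the file-based stand-in, `connect_sketch` returns the same value as `retrieve_sketch`; only the connected stand-in yields a connection instead of a path.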