datasafe.server module
Server components of the LabInform datasafe.
Different server components can be distinguished:
user-facing components (frontends)
storage components (backends)
Note that “user” is a broad term here, meaning any person or program accessing the datasafe. In this respect, the clients contained in datasafe.client are users as well.
The backend components deal with the actual storage of data (in the file system) and the access to them.
Frontends
Frontends allow a “user” (mostly another program) to access the datasafe, without needing any details of how the data are actually stored.
Currently, two frontends are implemented, each with a different use case:
- General frontend that can be used locally with datasafe.client.LocalClient.
- API for the HTTP server running via flask.
  HTTP frontend that can be used via HTTP, e.g. using the datasafe.client.HTTPClient class. Using HTTP generally allows client and server to be completely separated in terms of their locations, so that data can be accessed even remotely. However, keep in mind that remote access comes with security implications that are currently not dealt with.
  The actual HTTP server is created with the function create_http_server(), but the API class is the interesting part here.
Backends
Backends deal with actually storing the data.
Currently, there is only one backend implemented:
- A backend using the file system for storing data.
Things to decide
Some things that still need to be decided:
Where to store configuration?
At least the base directory for the datasafe needs to be defined in some way.
Other configuration values could be the issuer (number after the “42.” of a LOI).
Perhaps one could store the configuration in a separate configuration class to start with and see how this goes…
Module documentation
- class datasafe.server.Server
Bases: object
Server part of the datasafe.
The server interacts with the storage backend to store and retrieve contents and provides the user interface.
It retrieves and stores datasets, and should check whether their content is complete and not compromised.
The transfer occurs as bytes of the zipped dataset that is received by the server, decoded, unzipped, and archived into the correct directory.
- storage
- Type:
- loi
- Type:
- new(loi='')
Create new LOI.
The storage corresponding to the LOI will be created and the LOI returned if successful. This does, however, not add any data to the datasafe. Therefore, calling new() will usually be followed by calling upload() at some later point. Conversely, before calling upload(), you need to call new() to create the new LOI storage space.
- Parameters:
loi (str) – LOI for which the resource should be created
- Returns:
loi – LOI the resource has been created for
- Return type:
- Raises:
datasafe.exceptions.MissingLoiError – Raised if no LOI is provided
datasafe.exceptions.InvalidLoiError – Raised if LOI is not valid (for the given operation)
- upload(loi='', content=None)
Upload data to the datasafe.
Data are uploaded as bytes of the zipped content (dataset).
- Parameters:
- Returns:
integrity – dict with fields data and all containing boolean values
For details see datasafe.manifest.Manifest.check_integrity().
- Return type:
- Raises:
datasafe.exceptions.MissingLoiError – Raised if no LOI is provided
datasafe.exceptions.LoiNotFoundError – Raised if resource corresponding to LOI does not exist
datasafe.exceptions.ExistingFileError – Raised if resource corresponding to LOI is not empty
- download(loi='')
Download data from the datasafe.
- Parameters:
loi (str) – LOI the data should be downloaded for
- Returns:
content – byte representation of a ZIP archive containing the contents of the directory corresponding to path
- Return type:
- Raises:
datasafe.exceptions.MissingLoiError – Raised if no LOI is provided
datasafe.exceptions.LoiNotFoundError – Raised if resource corresponding to LOI cannot be found
datasafe.exceptions.MissingContentError – Raised if resource corresponding to LOI has no content
- update(loi='', content=None)
Update data in the datasafe.
Data are uploaded as bytes of the zipped content (dataset).
- Parameters:
- Returns:
integrity – dict with fields data and all containing boolean values
For details see datasafe.manifest.Manifest.check_integrity().
- Return type:
- Raises:
datasafe.exceptions.MissingLoiError – Raised if no LOI is provided
datasafe.exceptions.LoiNotFoundError – Raised if resource corresponding to LOI does not exist
datasafe.exceptions.NoFileError – Raised if resource corresponding to LOI has no content
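The call order of the four methods can be illustrated with a small in-memory stand-in. This is only a sketch: `ToyServer` and the exception classes defined here are hypothetical and merely mimic the documented exceptions, the LOI is made up, and the sketch assumes that update() requires content to be present already (as the PATCH handler of the HTTP API describes). Unlike the real server, it returns no integrity dict and hands back the LOI passed to new() unchanged.

```python
class MissingLoiError(Exception):
    """Stand-in for datasafe.exceptions.MissingLoiError."""


class LoiNotFoundError(Exception):
    """Stand-in for datasafe.exceptions.LoiNotFoundError."""


class ExistingFileError(Exception):
    """Stand-in for datasafe.exceptions.ExistingFileError."""


class NoFileError(Exception):
    """Stand-in for datasafe.exceptions.NoFileError."""


class MissingContentError(Exception):
    """Stand-in for datasafe.exceptions.MissingContentError."""


class ToyServer:
    """Hypothetical in-memory model of the call order, not the real Server."""

    def __init__(self):
        # Maps LOI -> stored bytes; b"" marks "created, but still empty".
        self._resources = {}

    def _check(self, loi):
        if not loi:
            raise MissingLoiError("No LOI provided.")
        if loi not in self._resources:
            raise LoiNotFoundError(f"LOI {loi!r} does not exist.")

    def new(self, loi=""):
        if not loi:
            raise MissingLoiError("No LOI provided.")
        self._resources[loi] = b""
        return loi

    def upload(self, loi="", content=None):
        self._check(loi)
        if self._resources[loi]:
            raise ExistingFileError("Resource is not empty; use update().")
        self._resources[loi] = content

    def update(self, loi="", content=None):
        self._check(loi)
        if not self._resources[loi]:
            raise NoFileError("Resource is empty; use upload().")
        self._resources[loi] = content

    def download(self, loi=""):
        self._check(loi)
        if not self._resources[loi]:
            raise MissingContentError("Resource has no content.")
        return self._resources[loi]


server = ToyServer()
loi = server.new("42.1001/ds/exp/sa/1/cwepr/1")  # made-up LOI
server.upload(loi, b"first version")
server.update(loi, b"second version")
print(server.download(loi))  # b'second version'
```

Calling upload() twice on the same LOI, or update() directly after new(), raises the corresponding stand-in exception in this sketch.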
- class datasafe.server.StorageBackend
Bases: object
File system backend for the datasafe, actually handling directories.
The storage backend does not care at all about LOIs, but only operates on paths within the file system. As far as datasets are concerned, the backend requires a manifest file to accompany each dataset. However, it does not create such a file. Furthermore, data are deposited (using deposit()) and retrieved (using retrieve()) as streams containing the contents of ZIP archives.
- working_path(path='')
Full path to working directory in datasafe
- Returns:
working_path – full path to work on
- Return type:
- create(path='')
Create directory for given path.
- Parameters:
path (str) – path to create directory for
- Raises:
datasafe.exceptions.MissingPathError – Raised if no path is provided
- remove(path='', force=False)
Remove directory corresponding to path.
Usually, non-empty directories will not be removed. Attempting to remove one raises an OSError exception.
- get_highest_id(path='')
Get the highest number among the subdirectories of the directory corresponding to path.
Return the last element of a sorted list of directory contents, assuming that the directory contains only subdirectories with numeric IDs.
In case there is no numeric ID yet in the directory, it returns 0.
Todo
Handle directories whose names are not convertible to integers
- create_next_id(path='')
Create next subdirectory in directory corresponding to path.
- Parameters:
path (str) – path the subdirectory should be created in
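The numbering scheme behind get_highest_id() and create_next_id() can be sketched with the standard library alone. The two functions below are illustrative stand-ins that take a plain directory path, not the actual methods. As one possible answer to the todo item above, this sketch simply skips entries whose names are not numeric, and it compares IDs as integers so that "10" ranks above "2".

```python
import os
import tempfile


def get_highest_id(directory):
    """Return the highest numeric subdirectory name in directory, or 0."""
    ids = [
        int(entry.name)
        for entry in os.scandir(directory)
        if entry.is_dir() and entry.name.isdecimal()
    ]
    return max(ids, default=0)


def create_next_id(directory):
    """Create the subdirectory following the highest ID and return its number."""
    next_id = get_highest_id(directory) + 1
    os.mkdir(os.path.join(directory, str(next_id)))
    return next_id


with tempfile.TemporaryDirectory() as root:
    print(get_highest_id(root))   # 0, as no numeric ID exists yet
    for name in ("1", "2", "10", "notes"):
        os.mkdir(os.path.join(root, name))
    print(get_highest_id(root))   # 10, not 2: IDs are compared as integers
    print(create_next_id(root))   # 11
```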
- deposit(path='', content=None)
Deposit data provided as content in directory corresponding to path.
Content is the byte representation of a ZIP archive containing the actual content. This byte representation is saved in a temporary file and afterwards unpacked in the directory corresponding to path.
After depositing the content (including unzipping), the checksums in the manifest are checked for consistency with newly generated checksums, and in case of inconsistencies, an exception is raised.
- Parameters:
- Returns:
integrity – dict with fields data and all containing boolean values
For details see datasafe.manifest.Manifest.check_integrity().
- Return type:
- Raises:
datasafe.exceptions.MissingPathError – Raised if no path is provided
datasafe.exceptions.MissingContentError – Raised if no content is provided
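The idea of comparing stored checksums with freshly generated ones can be sketched as follows. This is not the implementation of datasafe.manifest.Manifest.check_integrity(), which remains the authoritative reference. The sketch assumes that the field data covers the data files only and the field all covers data plus metadata files, and it uses SHA-256 over in-memory bytes. Both the interpretation of the fields and the hash algorithm are assumptions, and the file names are made up.

```python
import hashlib


def checksum(files):
    """Hash the contents of files (dict: name -> bytes) in name order."""
    digest = hashlib.sha256()
    for name in sorted(files):
        digest.update(files[name])
    return digest.hexdigest()


def check_integrity(expected, data_files, metadata_files):
    """Compare stored checksums with freshly generated ones."""
    return {
        "data": checksum(data_files) == expected["data"],
        "all": checksum({**data_files, **metadata_files}) == expected["all"],
    }


data = {"spectrum.dat": b"1 2 3"}
metadata = {"spectrum.info": b"operator: jane"}
expected = {"data": checksum(data), "all": checksum({**data, **metadata})}

print(check_integrity(expected, data, metadata))
# {'data': True, 'all': True}

data["spectrum.dat"] = b"1 2 4"  # simulate a corrupted transfer
print(check_integrity(expected, data, metadata))
# {'data': False, 'all': False}
```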
- retrieve(path='')
Obtain data from directory corresponding to path.
The data are compressed as a ZIP archive, and the contents of the ZIP file are returned as bytes.
- Parameters:
path (str) – path the data should be retrieved for
- Returns:
content – byte representation of a ZIP archive containing the contents of the directory corresponding to path
- Return type:
- Raises:
datasafe.directory.MissingPathError – Raised if no path is provided
OSError – Raised if path does not exist
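The round trip between a directory and the byte representation of a ZIP archive, as used by deposit() and retrieve(), might look like the following sketch. The two functions are simplified stand-ins written with the standard library. They take plain directory paths, skip the manifest check, and raise none of the datasafe exceptions.

```python
import io
import os
import tempfile
import zipfile


def retrieve(directory):
    """Return the contents of directory as bytes of a ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(directory):
            for name in files:
                full_path = os.path.join(root, name)
                # Store paths relative to the directory being archived.
                archive.write(full_path, os.path.relpath(full_path, directory))
    return buffer.getvalue()


def deposit(directory, content):
    """Write ZIP bytes to a temporary file and unpack it into directory."""
    with tempfile.NamedTemporaryFile(suffix=".zip", delete=False) as tmp:
        tmp.write(content)
    try:
        with zipfile.ZipFile(tmp.name) as archive:
            archive.extractall(directory)
    finally:
        os.remove(tmp.name)


with tempfile.TemporaryDirectory() as source, tempfile.TemporaryDirectory() as target:
    with open(os.path.join(source, "data.txt"), "w") as file:
        file.write("42")
    deposit(target, retrieve(source))
    print(os.listdir(target))  # ['data.txt']
```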
- get_index()
Return list of paths to datasets.
Such a list of paths to datasets is pretty useful if one intends to check locally for existing LOIs (corresponding to paths in the datasafe).
If a path has been created already, but no data yet saved in there, as may happen during an experiment to reserve the corresponding LOI, this path will nevertheless be included.
- Returns:
paths – list of paths to datasets
- Return type:
- datasafe.server.create_http_server(test_config=None)
Create an HTTP server for accessing the datasafe.
- Parameters:
test_config (dict) – Configuration for HTTP server
- Returns:
app – WSGI application created via flask
- Return type:
- class datasafe.server.HTTPServerAPI
Bases: MethodView
API view used in the HTTP server.
The actual server is created via create_http_server() and operates via flask. This API view provides the actual API functionality to access the datasafe and its underlying storage backend via HTTP.
The API provides one method per supported HTTP method, currently GET, POST, PUT, and PATCH.
Furthermore, exceptions are converted into the appropriate HTTP status codes and the message of the exception is contained in the response body. Thus, clients such as datasafe.client.HTTPClient can convert the HTTP status codes back into Python exceptions.
- server
Server backend that communicates with the storage backend.
- Type:
- get(loi='')
Handle GET requests.
The following responses are currently returned, depending on the status the request resulted in:

=========  ====  ==============================
Status     Code  data
=========  ====  ==============================
success    200   dataset contents (ZIP archive)
no data    204   message
not found  404   error message
invalid    404   error message
=========  ====  ==============================

The status “no data” results from querying a LOI that has been created (using POST), but to which no data have been uploaded so far.
The status “invalid” differs from “not found” in that the LOI requested is invalid.
- Parameters:
loi (str) – LOI of GET request
- Returns:
response – Response object
- Return type:
- post(loi='')
Handle POST requests.
A POST request will only create a new empty resource connected to the LOI, but never upload data. For uploading, use PUT. While this may seem not to conform to the typical usage of POST requests, the reason is simple: post() returns the newly created LOI, while put() returns the JSON representation of the integrity check dict. Hence, to be able to check that the data have successfully arrived at the datasafe storage backend, it is essential to separate POST and PUT requests.
The following responses are currently returned, depending on the status the request resulted in:

=======  ====  =================
Status   Code  data
=======  ====  =================
created  201   newly created LOI
invalid  404   error message
=======  ====  =================

- Parameters:
loi (str) – LOI of POST request
- Returns:
response – Response object
- Return type:
flask.Response
- put(loi='')
Handle PUT requests.
PUT requests are used to transfer data to an existing resource of the datasafe. To create a new resource, use post() beforehand. If data exist already at the resource, this will result in an error (status code 405, see table below).
The following responses are currently returned, depending on the status the request resulted in:

================  ====  ===========================================
Status            Code  data
================  ====  ===========================================
success           200   JSON representation of integrity check dict
does not exist    400   error message
missing content   400   error message
invalid           404   error message
existing content  405   error message
================  ====  ===========================================

The status “does not exist” means that the LOI the data should be put to does not exist (in this case, you first need to create it using POST). Therefore, in this particular case, status code 400 instead of 404 (“not found”) is returned.
The status “missing content” refers to the request missing data.
The status “existing content” refers to data already present at the storage referred to with the LOI. As you could generally update the content using another method, a status code 405 (“method not allowed”) is returned in this case.
- Parameters:
loi (str) – LOI of PUT request
- Returns:
response – Response object
- Return type:
flask.Response
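From a client's perspective, the separation of POST and PUT leads to a two-step sequence: create the resource, then transfer the data to the LOI that was returned. The sketch below uses only the standard library and makes several assumptions that the documentation above does not confirm: the URL layout (base URL followed by the LOI), the host and port, the LOI being returned as plain text in the response body, and the LOI values themselves. The `send` parameter and `fake_send` are hypothetical helpers that let the example run without a server. In practice, datasafe.client.HTTPClient is the intended client.

```python
import io
import json
import urllib.request


def create_and_upload(base_url, loi, content, send=urllib.request.urlopen):
    """Create a resource via POST, then transfer data to it via PUT."""
    create = urllib.request.Request(base_url + loi, method="POST")
    with send(create) as response:
        new_loi = response.read().decode()  # assumed: LOI as plain text

    upload = urllib.request.Request(base_url + new_loi, data=content, method="PUT")
    with send(upload) as response:
        integrity = json.loads(response.read())  # integrity check dict
    return new_loi, integrity


def fake_send(request):
    """Stand-in for a running server, returning canned response bodies."""
    if request.get_method() == "POST":
        return io.BytesIO(b"42.1001/ds/exp/sa/1/cwepr/1")
    return io.BytesIO(b'{"data": true, "all": true}')


print(create_and_upload("http://localhost:5000/", "42.1001/ds/exp/sa/1/cwepr",
                        b"zip bytes", send=fake_send))
# ('42.1001/ds/exp/sa/1/cwepr/1', {'data': True, 'all': True})
```

With the default `send`, urllib raises urllib.error.HTTPError for the 4xx status codes listed in the tables, so the error cases surface as exceptions on the client side.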
- patch(loi='')
Handle PATCH requests.
PATCH requests are used to update data at an existing resource of the datasafe. To upload new data to an existing resource, use put(). If no data exist at the resource, this will result in an error (status code 405, see table below).
The following responses are currently returned, depending on the status the request resulted in:

===================  ====  ===========================================
Status               Code  data
===================  ====  ===========================================
success              200   JSON representation of integrity check dict
does not exist       400   error message
missing content      400   error message
invalid              404   error message
no resource content  405   error message
===================  ====  ===========================================

The status “does not exist” means that the LOI the data should be put to does not exist (in this case, you first need to create it using POST). Therefore, in this particular case, status code 400 instead of 404 (“not found”) is returned.
The status “missing content” refers to the request missing data.
The status “no resource content” refers to no data being present at the storage referred to with the LOI. As you could generally upload new content using another method, a status code 405 (“method not allowed”) is returned in this case.
- Parameters:
loi (str) – LOI of PATCH request
- Returns:
response – Response object
- Return type:
flask.Response
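For quick reference, the status tables of the four handlers can be collected in a single lookup structure. The values below are copied from the tables above. The names `STATUS_CODES` and `describe` are hypothetical helpers for this sketch and not part of the datasafe package.

```python
from http import HTTPStatus

# Status name -> HTTP status code, per HTTP method, as documented above.
STATUS_CODES = {
    "GET": {"success": 200, "no data": 204, "not found": 404, "invalid": 404},
    "POST": {"created": 201, "invalid": 404},
    "PUT": {"success": 200, "does not exist": 400, "missing content": 400,
            "invalid": 404, "existing content": 405},
    "PATCH": {"success": 200, "does not exist": 400, "missing content": 400,
              "invalid": 404, "no resource content": 405},
}


def describe(method, status):
    """Return code and standard reason phrase for a documented status."""
    code = STATUS_CODES[method.upper()][status]
    return f"{code} {HTTPStatus(code).phrase}"


print(describe("put", "existing content"))  # 405 Method Not Allowed
print(describe("get", "no data"))           # 204 No Content
```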