labthings_fastapi.outputs.blob

BLOB Output Module.

The Blob class is used when you need to return something file-like that can’t easily (or efficiently) be converted to JSON. This is useful for returning large objects like images, especially where an existing file type is the obvious way to handle it.

There is a documentation page on Blob input/output that explains how to use this mechanism.

To return a file from an action, you should declare its return type as a Blob subclass, defining the Blob.media_type attribute.

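from labthings_fastapi.outputs.blob import Blob

# Thing and action are assumed to be imported from labthings_fastapi;
# the exact import path depends on the installed version.


class MyImageBlob(Blob):
    media_type = "image/png"


class MyThing(Thing):
    @action
    def get_image(self) -> MyImageBlob:
        # Obtain the raw image bytes (placeholder for your own acquisition code)
        data = self._get_image_data()
        # Wrap the in-memory bytes in a Blob of the declared media type
        return MyImageBlob.from_bytes(data)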

The action should then return an instance of that subclass, with data supplied either as a bytes object or a file on disk. If files are used, it’s your responsibility to ensure the file is deleted after the Blob object is garbage-collected. Constructing it using the class methods Blob.from_bytes or Blob.from_temporary_directory will ensure this is done for you.

Bear in mind a tempfile.TemporaryFile object only holds a file descriptor and is not safe for concurrent use, which does not work well with the HTTP API: action outputs may be retrieved multiple times after the action has completed, possibly concurrently. Creating a temp folder and making a file inside it with Blob.from_temporary_directory is the safest way to deal with this.
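As a rough illustration of why a path inside a temporary directory suits repeated or concurrent downloads, the sketch below lets several readers open the same file independently, so no file position is shared between them. It uses only the standard library; `read_copy` and the file name are illustrative and are not part of LabThings.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

payload = b"\x89PNG" + bytes(1000)  # stand-in for real image data

# The directory (and everything in it) lives until `folder` is cleaned up.
folder = tempfile.TemporaryDirectory()
path = os.path.join(folder.name, "image.png")
with open(path, "wb") as f:
    f.write(payload)


def read_copy(_: int) -> bytes:
    # Every call opens an independent handle with its own file position.
    with open(path, "rb") as f:
        return f.read()


# Simulate several clients downloading the same output at once.
with ThreadPoolExecutor(max_workers=4) as pool:
    copies = list(pool.map(read_copy, range(8)))

all_identical = all(copy == payload for copy in copies)

folder.cleanup()  # removes the directory and the file inside it
file_removed = not os.path.exists(path)
```

A single shared file handle would not behave this way, because concurrent readers would move the same file position.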

Serialisation

Blob objects are serialised to a JSON representation that includes a download href. This is generated using middleware.url_for, which uses a context variable to pass the function that generates URLs to the serialiser code. That context variable is available in every response handler function in the FastAPI app, but it is not, in general, available in action or property code (because actions and properties run their code in separate threads).

The sequence of events that leads to a Blob being downloaded as a result of an action is roughly:

  • A POST request invokes the action.
    • middleware.url_for.url_for_middleware makes url_for accessible via a context variable.
    • A 201 response is returned that includes an href to poll the action.
    • Action code is run in a separate thread (without url_for in the context):
      • The action creates a Blob object.
      • The function that creates the Blob object also creates a BlobData object as a property of the Blob.
      • The BlobData object’s constructor adds it to the blob_manager and sets its id property accordingly.
      • The Blob is returned by the action.
    • The output value of the action is stored in the Invocation thread.
  • A GET request polls the action. Once it has completed:
    • middleware.url_for.url_for_middleware makes url_for accessible via a context variable.
    • The Invocation model is returned, which includes the Blob in the output field.
    • FastAPI serialises the invocation model, which in turn serialises the Blob and uses url_for to generate a valid download href including the id of the BlobData object.
  • A further GET request actually downloads the Blob.

This slightly complicated sequence ensures that we only ever send URLs back to the client using url_for from the current fastapi.Request object. That means the URL used should be consistent with the URL of the request - so if an action is started by a client using one IP address or DNS name, and polled by a different client, each client will get a download href that matches the address they are already using.
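The idea of deferring URL generation until a request is being handled can be sketched with the standard library alone. This is not the LabThings implementation: `FakeBlob`, `handle_request`, `run_action` and the `/blob/` path are hypothetical, and an empty `contextvars.Context` stands in for the worker thread that runs the action.

```python
import contextvars
from dataclasses import dataclass
from typing import Callable

# Holds the URL-building function for the request currently being handled.
url_for_var: contextvars.ContextVar[Callable[[str], str]] = contextvars.ContextVar(
    "url_for"
)


@dataclass
class FakeBlob:
    blob_id: str

    def serialise(self) -> dict:
        # url_for is looked up only at serialisation time.
        url_for = url_for_var.get()  # LookupError if no request is active
        return {"href": url_for(f"/blob/{self.blob_id}")}


def handle_request(base_url: str, blob: FakeBlob) -> dict:
    # Plays the role of the middleware plus a response handler.
    token = url_for_var.set(lambda path: base_url + path)
    try:
        return blob.serialise()
    finally:
        url_for_var.reset(token)


def run_action() -> FakeBlob:
    # Action code: creates the blob but does not try to build a URL.
    return FakeBlob(blob_id="1234")


# An empty Context stands in for the worker thread that runs the action.
blob = contextvars.Context().run(run_action)

try:
    contextvars.Context().run(blob.serialise)
    failed_outside_request = False
except LookupError:
    failed_outside_request = True

# Two clients poll via different addresses; each gets a matching href.
by_ip = handle_request("http://192.168.0.10:5000", blob)
by_name = handle_request("http://microscope.local:5000", blob)
```

Because the blob stores only an identifier, the same object can be serialised for each request with whatever base URL that request used.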

In the future, it may be possible to respond to the original POST request directly with the Blob data. However, this only works for quick actions, so for now we use the sequence above, which works for both quick and slow actions.

Attributes

router

A FastAPI router for BlobData download endpoints.

Classes

BlobData

The data store of a Blob.

RemoteBlobData

A BlobData subclass that references remote data via a URL.

LocalBlobData

A BlobData subclass where the data is stored locally.

BlobBytes

A Blob that holds its data in memory as a bytes object.

BlobFile

A BlobData backed by a file on disk.

BlobModel

A model for JSON-serialised Blob objects.

Blob

A container for binary data that may be retrieved over HTTP.

Functions

parse_media_type(→ tuple[str, str])

Parse a media type string into its type and subtype.

match_media_types(→ bool)

Check if a media type matches a pattern.

download_blob(→ fastapi.responses.Response)

Download a Blob.

url_to_id(→ uuid.UUID | None)

Extract the blob ID from a URL.

Module Contents

class labthings_fastapi.outputs.blob.BlobData(media_type: str)

The data store of a Blob.

Blob objects can represent their data in various ways. Each of those options must provide three ways to access the data, which are the content property, the save() method, and the open() method.

This base class defines the interface needed by any data store used by a Blob.

Blobs that store their data locally should subclass LocalBlobData which adds a response() method and id property, appropriate for data that would need to be downloaded from a server. It also takes care of generating a download URL when it’s needed.
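To make the three access routes concrete, here is a minimal sketch of an abstract data store and one toy implementation. `DataStore` and `InMemoryStore` are hypothetical names chosen for illustration; the real BlobData classes may differ in detail.

```python
import io
import os
import tempfile
from abc import ABC, abstractmethod


class DataStore(ABC):
    """Hypothetical stand-in for the BlobData interface."""

    def __init__(self, media_type: str) -> None:
        self._media_type = media_type

    @property
    def media_type(self) -> str:
        return self._media_type

    @property
    @abstractmethod
    def content(self) -> bytes:
        """The data as a bytes object."""

    @abstractmethod
    def save(self, filename: str) -> None:
        """Write the data to ``filename``."""

    @abstractmethod
    def open(self) -> io.IOBase:
        """Return a readable binary file-like object."""


class InMemoryStore(DataStore):
    """Toy implementation holding the data in a bytes object."""

    def __init__(self, data: bytes, media_type: str) -> None:
        super().__init__(media_type)
        self._bytes = data

    @property
    def content(self) -> bytes:
        return self._bytes

    def save(self, filename: str) -> None:
        # `open` here is the builtin, not the method defined below.
        with open(filename, "wb") as f:
            f.write(self._bytes)

    def open(self) -> io.IOBase:
        return io.BytesIO(self._bytes)


store = InMemoryStore(b"hello", "text/plain")
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "out.txt")
    store.save(target)
    with open(target, "rb") as f:
        saved = f.read()
```

All three routes yield the same bytes; which one is cheapest depends on where the data already lives.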

Initialise a BlobData object.

Parameters:

media_type – the MIME type of the data.

_media_type
property media_type: str

The MIME type of the data, e.g. ‘image/png’ or ‘application/json’.

abstract get_href() str

Return the URL to download the blob.

The implementation of this method for local blobs will need url_for.url_for and thus it should only be called in a response handler when the middleware.url_for middleware is enabled.

Returns:

the URL as a string.

Raises:

NotImplementedError – always, as this must be implemented by subclasses.

abstract property content: bytes

The data as a bytes object.

Raises:

NotImplementedError – always, as this must be implemented by subclasses.

abstract save(filename: str) None

Save the data to a file.

Parameters:

filename – the path where the file should be saved.

Raises:

NotImplementedError – always, as this must be implemented by subclasses.

abstract open() io.IOBase

Return a file-like object that may be read from.

Returns:

an open file-like object.

Raises:

NotImplementedError – always, as this must be implemented by subclasses.

class labthings_fastapi.outputs.blob.RemoteBlobData(media_type: str, href: str, client: httpx.Client | None = None)

Bases: BlobData

A BlobData subclass that references remote data via a URL.

This BlobData implementation will download data lazily, and provides it in the three ways defined by BlobData. It does not cache downloaded data: if the content attribute is accessed multiple times, the data will be downloaded again each time.
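The practical consequence of lazy, uncached access is that each read costs one download. The sketch below imitates that behaviour with a hypothetical `LazyRemoteData` class and a fake fetch function in place of a real HTTP request; it is an illustration, not the RemoteBlobData code.

```python
import io
from typing import Callable

calls: list[str] = []  # records every simulated download


def fake_fetch(href: str) -> bytes:
    # Stands in for an HTTP GET.
    calls.append(href)
    return b"payload from " + href.encode()


class LazyRemoteData:
    """Hypothetical sketch: fetches on every access, never caches."""

    def __init__(self, href: str, fetch: Callable[[str], bytes]) -> None:
        self._href = href
        self._fetch = fetch

    @property
    def content(self) -> bytes:
        return self._fetch(self._href)

    def open(self) -> io.BytesIO:
        return io.BytesIO(self.content)


remote = LazyRemoteData("http://example.invalid/blob/1", fake_fetch)
count_before_access = len(calls)  # nothing is downloaded on construction

data = remote.content  # one download; keep the result in a local variable
header, size = data[:7], len(data)  # reuse `data` without further downloads
count_after_one_read = len(calls)

_ = remote.content  # a second access triggers a second download
count_after_two_reads = len(calls)
```

Reading the property once and keeping the result in a variable avoids repeated transfers.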

Note

This class is rarely instantiated directly. It is usually best to use Blob.from_url on a Blob subclass.

Create a reference to remote Blob data.

Parameters:
  • media_type – the MIME type of the data.

  • href – the URL where it may be downloaded.

  • client – if supplied, this httpx.Client will be used to download the data.

_href
_client
get_href() str

Return the URL to download the data.

Returns:

the URL as a string.

property content: bytes

The binary data, as a bytes object.

save(filepath: str) None

Save the output to a file.

Note that the current implementation retrieves the data into memory in its entirety, and saves to file afterwards.

Parameters:

filepath – the file will be saved at this location.

open() io.IOBase

Open the output as a binary file-like object.

Internally, this will download the file to memory, and wrap the resulting bytes object in an io.BytesIO object to allow it to function as a file-like object.

To work with the data on disk, use save instead.

Returns:

a file-like object containing the downloaded data.

class labthings_fastapi.outputs.blob.LocalBlobData(media_type: str)

Bases: BlobData

A BlobData subclass where the data is stored locally.

Blob objects can reference data by a URL, or can wrap data held in memory or on disk. For the non-URL options, we need to register the data with the BlobManager and allow it to be downloaded. This class takes care of registering with the BlobManager and adds the response method that must be overridden by subclasses to allow downloading.

See BlobBytes or BlobFile for concrete implementations.

Initialise the LocalBlobData object.

Parameters:

media_type – the MIME type of the data.

_all_blobdata: ClassVar[weakref.WeakValueDictionary[uuid.UUID, LocalBlobData]]

A way to retrieve LocalBlobData objects by their ID.

Note that this does not interfere with garbage collection, as it only holds weak references to the LocalBlobData objects.

_id
classmethod from_id(id: uuid.UUID) LocalBlobData

Retrieve a LocalBlobData object by its ID.

Note that this does not imply LocalBlobData objects are permanently stored: if there are no strong references to the object, it may have been garbage collected and will no longer be available.

Parameters:

id – the UUID of the desired LocalBlobData object.

Returns:

the corresponding LocalBlobData object.

Raises:

KeyError – if no such object exists.
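The lookup-by-ID behaviour described above follows from how weakref.WeakValueDictionary works. The sketch below uses a hypothetical `RegisteredData` class to show that an entry disappears once the last strong reference to the object is dropped.

```python
import gc
import uuid
import weakref


class RegisteredData:
    """Hypothetical sketch of a class that tracks instances by UUID."""

    _registry: "weakref.WeakValueDictionary[uuid.UUID, RegisteredData]" = (
        weakref.WeakValueDictionary()
    )

    def __init__(self, payload: bytes) -> None:
        self.payload = payload
        self.id = uuid.uuid4()
        self._registry[self.id] = self  # weak reference only

    @classmethod
    def from_id(cls, id: uuid.UUID) -> "RegisteredData":
        return cls._registry[id]  # KeyError if collected or unknown


item = RegisteredData(b"abc")
item_id = item.id
found_while_alive = RegisteredData.from_id(item_id) is item

del item  # drop the only strong reference
gc.collect()  # make collection explicit for non-refcounting interpreters

try:
    RegisteredData.from_id(item_id)
    still_registered = True
except KeyError:
    still_registered = False
```

Anything that must remain downloadable therefore needs a strong reference held somewhere else for as long as it is required.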

classmethod all_ids() list[uuid.UUID]

Return a list of all currently registered BlobData IDs.

Returns:

a list of UUIDs for all registered LocalBlobData objects.

property id: uuid.UUID

A unique identifier for this BlobData object.

The ID is set when the BlobData object is added to the BlobDataManager during initialisation.

get_href() str

Return a URL where this data may be downloaded.

Note that this should only be called in a response handler, as it relies on url_for.url_for.

Returns:

the URL as a string.

abstract response() fastapi.responses.Response

Return a fastapi.Response object that sends binary data.

Returns:

a response that streams the data from disk or memory.

Raises:

NotImplementedError – always, as this must be implemented by subclasses.

class labthings_fastapi.outputs.blob.BlobBytes(data: bytes, media_type: str)

Bases: LocalBlobData

A Blob that holds its data in memory as a bytes object.

Blob objects use objects conforming to the BlobData protocol to store their data either in memory or on disk. This implements the protocol using a bytes object in memory.

Note

This class is rarely instantiated directly. It is usually best to use Blob.from_bytes on a Blob subclass.

Create a BlobBytes object.

Parameters:
  • data – is the data to be wrapped.

  • media_type – is the MIME type of the data.

_id: uuid.UUID
_bytes
property content: bytes

The wrapped data, as a bytes object.

save(filename: str) None

Save the wrapped data to a file.

Parameters:

filename – where to save the data.

open() io.IOBase

Return an open file-like object containing the data.

This wraps the underlying bytes in an io.BytesIO.

Returns:

an io.BytesIO object wrapping the data.

response() fastapi.responses.Response

Send the underlying data over the network.

Returns:

a response that streams the data from memory.

class labthings_fastapi.outputs.blob.BlobFile(file_path: str, media_type: str, **kwargs: Any)

Bases: LocalBlobData

A BlobData backed by a file on disk.

Only the filepath is retained by default. If you are using e.g. a temporary directory, you should add the TemporaryDirectory as an instance attribute, to stop it being garbage collected. See Blob.from_temporary_directory.

Note

This class is rarely instantiated directly. It is usually best to use Blob.from_file on a Blob subclass.

Create a BlobFile to wrap data stored on disk.

BlobFile objects wrap data stored on disk as files. They are not usually instantiated directly, but made using Blob.from_temporary_directory or Blob.from_file.

Parameters:
  • file_path – is the path to the file.

  • media_type – is the MIME type of the data.

  • **kwargs – will be added to the object as instance attributes. This may be used to stop temporary directories from being garbage collected while the Blob exists.

Raises:

IOError – if the file specified does not exist.
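The keyword-argument mechanism can be pictured as follows. This is a sketch under assumptions: `FileHolder`, `make_holder` and the `_temporary_directory` attribute name are hypothetical, chosen only to show how storing the TemporaryDirectory on the object keeps the file available.

```python
import os
import tempfile
from typing import Any


class FileHolder:
    """Hypothetical sketch: keeps a path plus any objects it depends on."""

    def __init__(self, file_path: str, **kwargs: Any) -> None:
        if not os.path.exists(file_path):
            raise IOError(f"{file_path} does not exist")
        self._file_path = file_path
        # Extra keyword arguments become attributes, so they live as long
        # as this object does.
        for name, value in kwargs.items():
            setattr(self, name, value)

    def read(self) -> bytes:
        with open(self._file_path, "rb") as f:
            return f.read()


def make_holder() -> FileHolder:
    folder = tempfile.TemporaryDirectory()
    path = os.path.join(folder.name, "data.bin")
    with open(path, "wb") as f:
        f.write(b"\x00\x01\x02")
    # Without passing `folder`, the directory could be deleted as soon as
    # this function returns and `folder` is garbage collected.
    return FileHolder(path, _temporary_directory=folder)


holder = make_holder()
data_after_return = holder.read()
directory = holder._temporary_directory.name
exists_while_held = os.path.isdir(directory)

holder._temporary_directory.cleanup()
exists_after_cleanup = os.path.isdir(directory)
```

The file outlives the function that created it only because the holder keeps the directory object alive.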

_file_path
property content: bytes

The wrapped data, as a bytes object in memory.

This reads the file on disk into a bytes object.

Returns:

the contents of the file in a bytes object.

save(filename: str) None

Save the wrapped data to a file.

BlobFile objects already store their data on disk. Currently, this method copies the file to the given filename. In the future, this may change to move for increased efficiency.

Parameters:

filename – the path where the file should be saved.

open() io.IOBase

Return an open file-like object containing the data.

In the case of BlobFile, this is an open file handle to the underlying file, which is where the data is already stored. It is opened with mode "rb", i.e. read-only and binary.

Returns:

an open file handle.

response() fastapi.responses.Response

Generate a response allowing the file to be downloaded.

Returns:

a response that streams the file from disk.

class labthings_fastapi.outputs.blob.BlobModel(/, **data: Any)

Bases: pydantic.BaseModel

A model for JSON-serialised Blob objects.

This model describes the JSON representation of a Blob and does not offer any useful functionality.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

href: str

The URL where the data may be retrieved.

media_type: str

The MIME type of the data. This should be overridden in subclasses.

rel: Literal['output'] = 'output'

The relation of this link to the host object.

Currently, Blob objects are found in the output of Actions, so they always have rel = "output".

description: str = 'The output from this action is not serialised to JSON, so it must be retrieved as a file. This...

This description is added to the serialised Blob.
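For orientation, the JSON shape implied by these fields might look like the output of the sketch below. It uses a plain dataclass rather than pydantic; `BlobRecord`, the href value and the description text are placeholders, not values produced by LabThings.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class BlobRecord:
    """Hypothetical plain-Python mirror of the BlobModel fields."""

    href: str
    media_type: str
    rel: str = "output"
    description: str = "Placeholder description"  # not the real default text


record = BlobRecord(
    href="http://localhost:5000/blob/00000000-0000-0000-0000-000000000000",
    media_type="image/png",
)
serialised = json.dumps(asdict(record), indent=2)
round_tripped = json.loads(serialised)
```

A client would follow the href to fetch the binary data and use media_type to decide how to interpret it.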

labthings_fastapi.outputs.blob.parse_media_type(media_type: str) tuple[str, str]

Parse a media type string into its type and subtype.

Parameters:

media_type – the media type string to parse.

Returns:

a tuple of (type, subtype) where each is a string or None.

Raises:

ValueError – if the media type is invalid.

labthings_fastapi.outputs.blob.match_media_types(media_type: str, pattern: str) bool

Check if a media type matches a pattern.

The pattern may include wildcards, e.g. image/* or */*.

Parameters:
  • media_type – the media type to check.

  • pattern – the pattern to match against.

Returns:

True if the media type matches the pattern, False otherwise.
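One plausible way to implement wildcard matching is sketched below. The function names are hypothetical, and two details are assumptions rather than documented behaviour: parameters after `;` are ignored, and comparison is case-insensitive.

```python
def split_media_type(media_type: str) -> tuple[str, str]:
    """Split ``type/subtype``, ignoring any ``;parameter`` suffix."""
    essence = media_type.split(";", 1)[0].strip().lower()
    parts = essence.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"Invalid media type: {media_type!r}")
    return parts[0], parts[1]


def media_type_matches(media_type: str, pattern: str) -> bool:
    """Return True if ``media_type`` fits ``pattern`` (``*`` is a wildcard)."""
    type_, subtype = split_media_type(media_type)
    pattern_type, pattern_subtype = split_media_type(pattern)
    return pattern_type in ("*", type_) and pattern_subtype in ("*", subtype)


png_is_image = media_type_matches("image/png", "image/*")
png_is_anything = media_type_matches("image/png", "*/*")
text_is_image = media_type_matches("text/plain", "image/*")
png_is_jpeg = media_type_matches("image/png", "image/jpeg")
```

With this scheme, a Blob subclass declaring `image/*` would accept both PNG and JPEG data, while `*/*` accepts anything.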

class labthings_fastapi.outputs.blob.Blob(data: BlobData, description: str | None = None)

A container for binary data that may be retrieved over HTTP.

See Blob input/output for more information on how to use this class.

A Blob may be created to hold data using the class methods Blob.from_bytes, Blob.from_file or Blob.from_temporary_directory. It may also reference remote data, using Blob.from_url, though this is currently only used on the client side. The constructor requires a BlobData instance, so the methods mentioned previously are likely a more convenient way to instantiate a Blob.

You are strongly advised to use a subclass of this class that specifies the Blob.media_type attribute, as this will propagate to the auto-generated documentation and make the return type of your action clearer.

This class is pydantic compatible, in that it provides a schema, validator and serialiser. However, it may use url_for.url_for during serialisation, so it should only be serialised in a request handler function. This functionality is intended for use by LabThings library functions only. Validation and serialisation behaviour is described in the docstrings of Blob._validate and Blob._serialise.

Create a Blob object wrapping the given data.

Parameters:
  • data – the BlobData object that stores the data.

  • description – an optional description of the blob.

Raises:

ValueError – if the media_type of the data does not match the media_type of the Blob subclass.

media_type: str = '*/*'

The MIME type of the data. This should be overridden in subclasses.

description: str | None = None

An optional description that may be added to the serialised Blob.

_data: BlobData

This object stores the data - in memory, on disk, or at a URL.

classmethod __get_pydantic_core_schema__(source: type[Any], handler: pydantic.GetCoreSchemaHandler) pydantic_core.core_schema.CoreSchema

Get the pydantic core schema for this type.

This magic method allows pydantic to serialise Blob instances, and generate a JSONSchema for them.

We tell pydantic to base its handling of Blob on the BlobModel schema, with custom validation and serialisation. Validation and serialisation behaviour is described in the docstrings of Blob._validate and Blob._serialise.

The JSONSchema is generated for BlobModel but is then refined in __get_pydantic_json_schema__ to include the media_type and description defaults.

Parameters:
  • source – The source type being converted.

  • handler – The pydantic core schema handler.

Returns:

The pydantic core schema for the Blob type.

classmethod __get_pydantic_json_schema__(core_schema: pydantic_core.core_schema.CoreSchema, handler: pydantic.GetJsonSchemaHandler) pydantic.json_schema.JsonSchemaValue

Customise the JSON Schema to include the media_type.

Parameters:
  • core_schema – The core schema for the Blob type.

  • handler – The pydantic JSON schema handler.

Returns:

The JSON schema for the Blob type, with media_type included.

classmethod _validate(value: Any, handler: collections.abc.Callable[[Any], BlobModel]) Self

Validate and convert a value to a Blob instance.

Parameters:
  • value – The input value, as passed in or loaded from JSON.

  • handler – A function that runs the validation logic of BlobModel.

If the value is already a Blob, it will be returned directly. Otherwise, we first validate the input using the BlobModel schema.

When a Blob is validated, we check to see if the URL given as its href looks like a Blob download URL on this server. If it does, the returned object will hold a reference to the local data.

If we can’t match the URL to a Blob on this server, we will raise an error. Handling of Blob input is currently experimental, and limited to passing the output of one Action as input to a subsequent one.

Returns:

a Blob object pointing to the data.

Raises:

ValueError – if the href does not contain a valid Blob ID, or if the Blob ID is not found on this server.

classmethod _serialise(obj: Self, handler: collections.abc.Callable[[BlobModel], Mapping[str, str]]) Mapping[str, str]

Serialise the Blob to a dictionary.

See Blob.to_blobmodel for a description of how we serialise.

Parameters:
  • obj – the Blob instance to serialise.

  • handler – the handler (provided by pydantic) takes a BlobModel and converts it to a dictionary. The handler runs the serialiser of the core schema we’ve wrapped, in this case the BlobModel serialiser.

Returns:

a JSON-serialisable dictionary with a URL that allows the Blob to be downloaded from the BlobManager.

to_blobmodel() BlobModel

Represent the Blob as a BlobModel to get ready to serialise.

When pydantic serialises this object, we first generate a BlobModel with just the information to be serialised. We use url_for.url_for to generate the URL, so this will raise an error if it is serialised anywhere other than a request handler with the middleware from middleware.url_for enabled.

Returns:

a BlobModel with a URL that allows the Blob to be downloaded from the BlobManager.

property data: BlobData

The data store for this Blob.

It is recommended to use the Blob.content property or Blob.save or Blob.open methods rather than accessing this property directly.

Returns:

the data store wrapping data on disk or in memory.

property content: bytes

Return the output as a bytes object.

This property may return the bytes object, or if we have a file it will read the file and return the contents. Client objects may use this property to download the output.

This property is read-only. You should also only read it once, as no guarantees are given about caching - reading it many times risks reading the file from disk many times, or re-downloading an artifact.

Returns:

a bytes object containing the data.

save(filepath: str) None

Save the output to a file.

This may remove the need to hold the output in memory, especially if it is already stored on disk.

Parameters:

filepath – The location to save the data on disk.

open() io.IOBase

Open the data as a binary file-like object.

This will return a file-like object that may be read from. It may be either on disk (i.e. an open file handle) or in memory (e.g. an io.BytesIO wrapper).

Returns:

a binary file-like object.
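Code that consumes the result of open() does not need to know whether the data is on disk or in memory. As a sketch, the hypothetical helper `sha256_of` below reads any binary file-like object in chunks; io.BytesIO and a regular file handle stand in for the two cases.

```python
import hashlib
import io
import os
import tempfile
from typing import BinaryIO


def sha256_of(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """Hash a binary stream without loading it all into memory."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


payload = bytes(range(256)) * 1024

# In-memory case: a BytesIO wrapper around a bytes object.
with io.BytesIO(payload) as in_memory:
    memory_digest = sha256_of(in_memory)

# On-disk case: an ordinary file opened in "rb" mode.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "payload.bin")
    with open(path, "wb") as f:
        f.write(payload)
    with open(path, "rb") as on_disk:
        disk_digest = sha256_of(on_disk)
```

Reading in chunks keeps memory use bounded when the underlying data is a large file.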

classmethod from_bytes(data: bytes) Self

Create a Blob from a bytes object.

This is the recommended way to create a Blob from data that is held in memory. It should ideally be called on a subclass that has set the media_type.

Parameters:

data – the data as a bytes object.

Returns:

a Blob wrapping the supplied data.

classmethod from_temporary_directory(folder: tempfile.TemporaryDirectory, file: str) Self

Create a Blob from a file in a temporary directory.

This is the recommended way to create a Blob from data that is saved to a file, when the file should not be retained. It should ideally be called on a subclass that has set the media_type.

The tempfile.TemporaryDirectory object will persist as long as this Blob does, which will prevent it from being cleaned up until the object is garbage collected. This means the file will stay on disk until it is no longer needed.

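Parameters:
  • folder – the tempfile.TemporaryDirectory that contains the file.

  • file – the file within folder that holds the data.

Returns: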

a Blob wrapping the file.

classmethod from_file(file: str) Self

Create a Blob from a regular file.

This is the recommended way to create a Blob from a file, if that file will persist on disk. It should ideally be called on a subclass of Blob that has set media_type.

Note

The file should exist for at least as long as the Blob does; this is assumed to be the case and nothing is done to ensure it’s not temporary. If you are using temporary files, consider creating your Blob with from_temporary_directory instead.

Parameters:

file – is the path to the file. This file must exist.

Returns:

a Blob object referencing the specified file.

classmethod from_url(href: str, client: httpx.Client | None = None) Self

Create a Blob that references data at a URL.

This is the recommended way to create a Blob that references data held remotely. It should ideally be called on a subclass of Blob that has set media_type.

Parameters:
  • href – the URL where the data may be downloaded.

  • client – if supplied, this httpx.Client will be used to download the data.

Returns:

a Blob object referencing the specified URL.

response() fastapi.responses.Response

Return a suitable response for serving the output.

This method is called by the ThingServer to generate a response that returns the data over HTTP.

Returns:

an HTTP response that streams data from memory or file.

Raises:

NotImplementedError – if the data is not local. It’s not currently possible to serve remote data via the BlobManager.

labthings_fastapi.outputs.blob.router

A FastAPI router for BlobData download endpoints.

labthings_fastapi.outputs.blob.download_blob(blob_id: uuid.UUID) fastapi.responses.Response

Download a Blob.

This function returns a fastapi.Response allowing the data to be downloaded, using the LocalBlobData.response method.

Parameters:

blob_id – the unique ID of the blob data.

Returns:

a fastapi.Response object that will send the content of the blob over HTTP.

Raises:

HTTPException – if the requested blob is not found.

labthings_fastapi.outputs.blob.url_to_id(url: str) uuid.UUID | None

Extract the blob ID from a URL.

Currently, this checks for a UUID at the end of a URL. In the future, it might check if the URL refers to this server.

Parameters:

url – a URL previously generated by blobdata_to_url.

Returns:

the UUID blob ID extracted from the URL.
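Checking for a UUID at the end of a URL could be done along these lines. This is a sketch, not the library's code: `trailing_uuid` is a hypothetical name, the example URLs are invented, and returning None for a URL without a valid UUID is an assumption based on the return type in the signature.

```python
import uuid
from typing import Optional
from urllib.parse import urlparse


def trailing_uuid(url: str) -> Optional[uuid.UUID]:
    """Return the UUID in the last path segment of ``url``, or None."""
    last_segment = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    try:
        return uuid.UUID(last_segment)
    except ValueError:
        # The final segment is not a well-formed UUID.
        return None


blob_id = uuid.uuid4()
found = trailing_uuid(f"http://localhost:5000/blob/{blob_id}")
missing = trailing_uuid("http://localhost:5000/blob/not-a-uuid")
```

A check like this does not confirm that the URL points at this server, only that it ends in something shaped like a blob ID.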