dlg

This package contains the modules implementing the core functionality of the system.

dlg.event

class dlg.event.Event

An event sent through the DALiuGE framework.

Events have at least a field describing the type of event they are (instead of having subclasses of the Event class), and therefore this class makes sure that at least that field exists. Any other piece of information can be attached to individual instances of this class, depending on the event type.

class dlg.event.EventFirer

An object that fires events.

Objects that have an interest on receiving events from this object subscribe to it via the subscribe method; likewise they can unsubscribe from it via the unsubscribe method. Events are handled to the listeners by calling their handleEvent method with the event as its sole argument.

Listeners can specify the type of event they listen to at subscription time, or can also prefer to receive all events fired by this object if they wish so.

subscribe(listener, eventType=None)

Subscribes listener to events fired by this object. If eventType is not None then listener will only receive events of eventType that originate from this object, otherwise it will receive all events.

unsubscribe(listener, eventType=None)

Unsubscribes listener from events fired by this object.

dlg.io

class dlg.io.DataIO

A class used to read/write data stored in a particular kind of storage in an abstract way. This base class simply declares a number of methods that deriving classes must actually implement to handle different storage mechanisms (e.g., a local filesystem or an NGAS server).

An instance of this class represents a particular piece of data. Thus at construction time users must specify a storage-specific unique identifier for the data that this object handles (e.g., a filename in the case of a DataIO class that works with local filesystem storage, or a host:port/fileId combination in the case of a class that works with an NGAS server).

Once an instance has been created it can be opened via its open method indicating an open mode. If opened with OpenMode.OPEN_READ, only read operations will be allowed on the instance, and if opened with OpenMode.OPEN_WRITE only writing operations will be allowed.

close(**kwargs)

Closes the underlying storage where the data represented by this instance is stored, freeing underlying resources.

delete()

Deletes the data represented by this DataIO

exists()

Returns True if the data represented by this DataIO exists indeed in the underlying storage mechanism

open(mode, **kwargs)

Opens the underlying storage where the data represented by this instance is stored. Depending on the value of mode subsequent calls to self.read or self.write will succeed or fail.

read(count, **kwargs)

Reads count bytes from the underlying storage.

write(data, **kwargs)

Writes data into the storage

class dlg.io.ErrorIO

An DataIO method that throws exceptions if any of its methods is invoked

delete()

Deletes the data represented by this DataIO

exists()

Returns True if the data represented by this DataIO exists indeed in the underlying storage mechanism

class dlg.io.FileIO(filename, **kwargs)
delete()

Deletes the data represented by this DataIO

exists()

Returns True if the data represented by this DataIO exists indeed in the underlying storage mechanism

dlg.io.IOForURL(url)

Returns a DataIO instance that handles the given URL for reading. If no suitable DataIO class can be found to handle the URL, None is returned.

class dlg.io.MemoryIO(buf, **kwargs)

A DataIO class that reads/write from/into the BytesIO object given at construction time

delete()

Deletes the data represented by this DataIO

exists()

Returns True if the data represented by this DataIO exists indeed in the underlying storage mechanism

class dlg.io.NgasIO(hostname, fileId, port=7777, ngasConnectTimeout=2, ngasTimeout=2, length=-1)

A DROP whose data is finally stored into NGAS. Since NGAS doesn’t support appending data to existing files, we store all the data temporarily in a file on the local filesystem and then move it to the NGAS destination

delete()

Deletes the data represented by this DataIO

exists()

Returns True if the data represented by this DataIO exists indeed in the underlying storage mechanism

class dlg.io.NgasLiteIO(hostname, fileId, port=7777, ngasConnectTimeout=2, ngasTimeout=2, length=-1)

An IO class whose data is finally stored into NGAS. It uses the ngaslite module of DALiuGE instead of the full client-side libraries provided by NGAS itself, since they might not be installed everywhere.

The ngaslite module doesn’t support the STATUS command yet, and because of that this class will throw an error if its exists method is invoked.

delete()

Deletes the data represented by this DataIO

exists()

Returns True if the data represented by this DataIO exists indeed in the underlying storage mechanism

class dlg.io.NullIO

A DataIO that stores no data

delete()

Deletes the data represented by this DataIO

exists()

Returns True if the data represented by this DataIO exists indeed in the underlying storage mechanism

class dlg.io.ShoreIO(doid, column, row, rows=1, address=None, **kwargs)
delete()

Deletes the data represented by this DataIO

exists()

Returns True if the data represented by this DataIO exists indeed in the underlying storage mechanism

dlg.drop

Module containing the core DROP classes.

class dlg.drop.AbstractDROP(**kwargs)

Base class for all DROP implementations.

A DROP is a representation of a piece of data. DROPs are created, written once, potentially read many times, and they finally potentially expire and get deleted. Subclasses implement different storage mechanisms to hold the data represented by the DROP.

If the data represented by this DROP is written through this object (i.e., calling the write method), this DROP will keep track of the data’s size and checksum. If the data is written externally, the size and checksum can be fed into this object for future reference.

DROPs can have consumers attached to them. ‘Normal’ consumers will wait until the DROP they ‘consume’ (their ‘input’) moves to the COMPLETED state and then will consume it, most typically by opening it and reading its contents, but any other operation could also be performed. How the consumption is triggered depends on the producer’s executionMode flag, which dictates whether it should trigger the consumption itself or if it should be manually triggered by an external entity. On the other hand, streaming consumers receive the data that is written into its input as it gets written. This mechanism is driven always by the DROP that acts as a streaming input. Apart from receiving the data as it gets written into the DROP, streaming consumers are also notified when the DROPs moves to the COMPLETED state, at which point no more data should be expected to arrive at the consumer side.

DROPs’ data can be expired automatically by the system after the DROP has transitioned to the COMPLETED state if they are created by a DROP Manager. Expiration can either be triggered by an interval relative to the creation time of the DROP (via the lifespan keyword), or by specifying that the DROP should be expired after all its consumers have finished (via the expireAfterUse keyword). These two methods are mutually exclusive. If none is specified no expiration occurs.

addConsumer(**kwargs)

Adds a consumer to this DROP.

Consumers are normally (but not necessarily) AppDROPs that get notified when this DROP moves into the COMPLETED or ERROR states. This is done by firing an event of type dropCompleted to which the consumer subscribes to.

This is one of the key mechanisms by which the DROP graph is executed automatically. If AppDROP B consumes DROP A, then as soon as A transitions to COMPLETED B will be notified and will probably start its execution.

addProducer(**kwargs)

Adds a producer to this DROP.

Producers are AppDROPs that write into this DROP; from the producers’ point of view, this DROP is one of its many outputs.

When a producer has finished its execution, this DROP will be notified via the self.producerFinished() method.

addStreamingConsumer(**kwargs)

Adds a streaming consumer to this DROP.

Streaming consumers are AppDROPs that receive the data written into this DROP as it gets written, and therefore do not need to wait until this DROP has been moved to the COMPLETED state.

checksum

The checksum value for the data represented by this DROP. Its value is automatically calculated if the data was actually written through this DROP (using the self.write() method directly or indirectly). In the case that the data has been externally written, the checksum can be set externally after the DROP has been moved to COMPLETED or beyond.

See:self.checksumType
checksumType

The algorithm used to compute this DROP’s data checksum. Its value if automatically set if the data was actually written through this DROP (using the self.write() method directly or indirectly). In the case that the data has been externally written, the checksum type can be set externally after the DROP has been moved to COMPLETED or beyond.

See:self.checksum
close(**kwargs)

Closes the given DROP descriptor, decreasing the DROP’s internal reference count and releasing the underlying resources associated to the descriptor.

consumers

The list of ‘normal’ consumers held by this DROP.

See:self.addConsumer()
dataURL()

A URL that points to the data referenced by this DROP. Different DROP implementations will use different URI schemes.

decrRefCount()

Decrements the reference count of this DROP by one atomically.

delete()

Deletes the data represented by this DROP.

executionMode

The execution mode of this DROP. If ExecutionMode.DROP it means that this DROP will automatically trigger the execution of all its consumers. If ExecutionMode.EXTERNAL it means that this DROP will not trigger its consumers, and therefore an external entity will have to do it.

exists()

Returns True if the data represented by this DROP exists indeed in the underlying storage mechanism

getIO()

Returns an instance of one of the dlg.io.DataIO instances that handles the data contents of this DROP.

handleEvent(**kwargs)

Handles the arrival of a new event. Events are delivered from those objects this DROP is subscribed to.

handleInterest(drop)

Main mechanism through which a DROP handles its interest in a second DROP it isn’t directly related to.

A call to this method should be expected for each DROP this DROP is interested in. The default implementation does nothing, but implementations are free to perform any action, such as subscribing to events or storing information.

At this layer only the handling of such an interest exists. The expression of such interest, and the invocation of this method wherever necessary, is currently left as a responsibility of the entity creating the DROPs. In the case of a Session in a DROPManager for example this step would be performed using deployment-time information contained in the dropspec dictionaries held in the session.

incrRefCount()

Increments the reference count of this DROP by one atomically.

initialize(**kwargs)

Performs any specific subclass initialization.

kwargs contains all the keyword arguments given at construction time, except those used by the constructor itself. Implementations of this method should make sure that arguments in the kwargs dictionary are removed once they are interpreted so they are not interpreted by accident by another method implementations that might reside in the call hierarchy (in the case that a subclass implementation calls the parent method implementation, which is usually the case).

isBeingRead()

Returns True if the DROP is currently being read; False otherwise

isCompleted()

Checks whether this DROP is currently in the COMPLETED state or not

oid

The DROP’s Object ID (OID). OIDs are unique identifiers given to semantically different DROPs (and by consequence the data they represent). This means that different DROPs that point to the same data semantically speaking, either in the same or in a different storage, will share the same OID.

open(**kwargs)

Opens the DROP for reading, and returns a “DROP descriptor” that must be used when invoking the read() and close() methods. DROPs maintain a internal reference count based on the number of times they are opened for reading; because of that after a successful call to this method the corresponding close() method must eventually be invoked. Failing to do so will result in DROPs not expiring and getting deleted.

parent

The DROP that acts as the parent of the current one. This parent/child relationship is created by ContainerDROPs, which are a specific kind of DROP.

phase

This DROP’s phase. The phase indicates the resilience of a DROP.

precious

Whether this DROP should be considered as ‘precious’ or not

producerFinished(**kwargs)

Method called each time one of the producers of this DROP finishes its execution. Once all producers have finished this DROP moves to the COMPLETED state (or to ERROR if one of the producers is on the ERROR state).

This is one of the key mechanisms through which the execution of a DROP graph is accomplished. If AppDROP A produces DROP B, as soon as A finishes its execution B will be notified and will move itself to COMPLETED.

producers

The list of producers that write to this DROP

See:self.addProducer()
read(descriptor, count=4096, **kwargs)

Reads count bytes from the given DROP descriptor.

setCompleted(**kwargs)

Moves this DROP to the COMPLETED state. This can be used when not all the expected data has arrived for a given DROP, but it should still be moved to COMPLETED, or when the expected amount of data held by a DROP is not known in advanced.

setError(**kwargs)

Moves this DROP to the ERROR state.

size

The size of the data pointed by this DROP. Its value is automatically calculated if the data was actually written through this DROP (using the self.write() method directly or indirectly). In the case that the data has been externally written, the size can be set externally after the DROP has been moved to COMPLETED or beyond.

status

The current status of this DROP.

streamingConsumers

The list of ‘streaming’ consumers held by this DROP.

See:self.addStreamingConsumer()
uid

The DROP’s Unique ID (UID). Unlike the OID, the UID is globally different for all DROP instances, regardless of the data they point to.

write(**kwargs)

Writes the given data into this DROP. This method is only meant to be called while the DROP is in INITIALIZED or WRITING state; once the DROP is COMPLETE or beyond only reading is allowed. The underlying storage mechanism is responsible for implementing the final writing logic via the self.writeMeta() method.

class dlg.drop.AppDROP(**kwargs)

An AppDROP is a DROP representing an application that reads data from one or more DROPs (its inputs), and writes data onto one or more DROPs (its outputs).

AppDROPs accept two different kind of inputs: “normal” and “streaming” inputs. Normal inputs are DROPs that must be on the COMPLETED state (and therefore their data must be fully written) before this application is run, while streaming inputs are DROPs that feed chunks of data into this application as the data gets written into them.

This class contains two methods that should be overwritten as needed by subclasses: dropCompleted, invoked when input DROPs move to COMPLETED, and dataWritten, invoked with the data coming from streaming inputs.

How and when applications are executed is completely up to the user, and is not enforced by this base class. Some applications might need to be run at initialize time, while other might start during the first invocation of dataWritten. A common scenario anyway is to start an application only after all its inputs have moved to COMPLETED (implying that none of them is an streaming input); for these cases see the BarrierAppDROP.

dataWritten(uid, data)

Callback invoked when data has been written into the DROP with UID uid (which is one of the streaming inputs of this AppDROP). By default no action is performed

dropCompleted(uid, drop_state)

Callback invoked when the DROP with UID uid (which is either a normal or a streaming input of this AppDROP) has moved to the COMPLETED or ERROR state. By default no action is performed.

execStatus

The execution status of this AppDROP

handleEvent(e)

Handles the arrival of a new event. Events are delivered from those objects this DROP is subscribed to.

initialize(**kwargs)

Performs any specific subclass initialization.

kwargs contains all the keyword arguments given at construction time, except those used by the constructor itself. Implementations of this method should make sure that arguments in the kwargs dictionary are removed once they are interpreted so they are not interpreted by accident by another method implementations that might reside in the call hierarchy (in the case that a subclass implementation calls the parent method implementation, which is usually the case).

inputs

The list of inputs set into this AppDROP

outputs

The list of outputs set into this AppDROP

streamingInputs

The list of streaming inputs set into this AppDROP

class dlg.drop.BarrierAppDROP(**kwargs)

A BarrierAppDROP is an InputFireAppDROP that waits for all its inputs to complete, effectively blocking the flow of the graph execution.

initialize(**kwargs)

Performs any specific subclass initialization.

kwargs contains all the keyword arguments given at construction time, except those used by the constructor itself. Implementations of this method should make sure that arguments in the kwargs dictionary are removed once they are interpreted so they are not interpreted by accident by another method implementations that might reside in the call hierarchy (in the case that a subclass implementation calls the parent method implementation, which is usually the case).

class dlg.drop.ContainerDROP(**kwargs)

A DROP that doesn’t directly point to some piece of data, but instead holds references to other DROPs (its children), and from them its own internal state is deduced.

Because of its nature, ContainerDROPs cannot be written to directly, and likewise they cannot be read from directly. One instead has to pay attention to its “children” DROPs if I/O must be performed.

dataURL()

A URL that points to the data referenced by this DROP. Different DROP implementations will use different URI schemes.

delete()

Deletes the data represented by this DROP.

exists()

Returns True if the data represented by this DROP exists indeed in the underlying storage mechanism

getIO()

Returns an instance of one of the dlg.io.DataIO instances that handles the data contents of this DROP.

initialize(**kwargs)

Performs any specific subclass initialization.

kwargs contains all the keyword arguments given at construction time, except those used by the constructor itself. Implementations of this method should make sure that arguments in the kwargs dictionary are removed once they are interpreted so they are not interpreted by accident by another method implementations that might reside in the call hierarchy (in the case that a subclass implementation calls the parent method implementation, which is usually the case).

class dlg.drop.DirectoryContainer(**kwargs)

A ContainerDROP that represents a filesystem directory. It only allows FileDROPs and DirectoryContainers to be added as children. Children can only be added if they are placed directly within the directory represented by this DirectoryContainer.

delete()

Deletes the data represented by this DROP.

exists()

Returns True if the data represented by this DROP exists indeed in the underlying storage mechanism

initialize(**kwargs)

Performs any specific subclass initialization.

kwargs contains all the keyword arguments given at construction time, except those used by the constructor itself. Implementations of this method should make sure that arguments in the kwargs dictionary are removed once they are interpreted so they are not interpreted by accident by another method implementations that might reside in the call hierarchy (in the case that a subclass implementation calls the parent method implementation, which is usually the case).

class dlg.drop.FileDROP(**kwargs)

A DROP that points to data stored in a mounted filesystem.

Users can (but usually don’t need to) specify both a filepath and a dirname parameter for each FileDrop. The combination of these two parameters will determine the final location of the file backed up by this drop on the underlying filesystem. When no filepath is provided, the drop’s UID will be used as a filename. When a relative filepath is provided, it is relative to dirname. When an absolute filepath is given, it is used as-is. When a relative dirname is provided, it is relative to the base directory of the currently running session (i.e., a directory with the session ID as a name, placed within the currently working directory of the Node Manager hosting that session). If dirname is absolute, it is used as-is.

In some cases drops are created outside the context of a session, most notably during unit tests. In these cases the base directory is a fixed location under /tmp.

The following table summarizes the calculation of the final path used by the FileDrop class depending on its parameters:

. filepath
dirname empty relative absolute
empty /$B/$u /$B/$f /$f
relative /$B/$d/$u /$B/$d/$f ERROR
absolute /$d/$u /$d/$f ERROR

In the table, $f is the value of filepath, $d is the value of dirname, $u is the drop’s UID and $B is the base directory for this drop’s session, namelly /the/cwd/$session_id.

delete()

Deletes the data represented by this DROP.

getIO()

Returns an instance of one of the dlg.io.DataIO instances that handles the data contents of this DROP.

initialize(**kwargs)

FileDROP-specific initialization.

class dlg.drop.InMemoryDROP(**kwargs)

A DROP that points data stored in memory.

getIO()

Returns an instance of one of the dlg.io.DataIO instances that handles the data contents of this DROP.

initialize(**kwargs)

Performs any specific subclass initialization.

kwargs contains all the keyword arguments given at construction time, except those used by the constructor itself. Implementations of this method should make sure that arguments in the kwargs dictionary are removed once they are interpreted so they are not interpreted by accident by another method implementations that might reside in the call hierarchy (in the case that a subclass implementation calls the parent method implementation, which is usually the case).

class dlg.drop.InputFiredAppDROP(**kwargs)

An InputFiredAppDROP accepts no streaming inputs and waits until a given amount of inputs (called effective inputs) have moved to COMPLETED to execute its ‘run’ method, which must be overwritten by subclasses. This way, this application allows to continue the execution of the graph given a minimum amount of inputs being ready. The transitions of subsequent inputs to the COMPLETED state have no effect.

Normally only one call to the run method will happen per application. However users can override this by specifying a different number of tries before finally giving up.

The amount of effective inputs must be less or equal to the amount of inputs added to this application once the graph is being executed. The special value of -1 means that all inputs are considered as effective, in which case this class acts as a BarrierAppDROP, effectively blocking until all its inputs have moved to the COMPLETED state.

An input error threshold controls the behavior of the application given an error in one or more of its inputs (i.e., a DROP moving to the ERROR state). The threshold is a value within 0 and 100 that indicates the tolerance to erroneous effective inputs, and after which the application will not be run but moved to the ERROR state itself instead.

dropCompleted(uid, drop_state)

Callback invoked when the DROP with UID uid (which is either a normal or a streaming input of this AppDROP) has moved to the COMPLETED or ERROR state. By default no action is performed.

execute(**kwargs)

Manually trigger the execution of this application.

This method is normally invoked internally when the application detects all its inputs are COMPLETED.

exists()

Returns True if the data represented by this DROP exists indeed in the underlying storage mechanism

initialize(**kwargs)

Performs any specific subclass initialization.

kwargs contains all the keyword arguments given at construction time, except those used by the constructor itself. Implementations of this method should make sure that arguments in the kwargs dictionary are removed once they are interpreted so they are not interpreted by accident by another method implementations that might reside in the call hierarchy (in the case that a subclass implementation calls the parent method implementation, which is usually the case).

run()

Run this application. It can be safely assumed that at this point all the required inputs are COMPLETED.

class dlg.drop.ListAsDict(my_set)

A list that adds drop UIDs to a set as they get appended to the list

append(drop)

L.append(object) – append object to end

class dlg.drop.NgasDROP(**kwargs)

A DROP that points to data stored in an NGAS server

getIO()

Returns an instance of one of the dlg.io.DataIO instances that handles the data contents of this DROP.

initialize(**kwargs)

Performs any specific subclass initialization.

kwargs contains all the keyword arguments given at construction time, except those used by the constructor itself. Implementations of this method should make sure that arguments in the kwargs dictionary are removed once they are interpreted so they are not interpreted by accident by another method implementations that might reside in the call hierarchy (in the case that a subclass implementation calls the parent method implementation, which is usually the case).

class dlg.drop.NullDROP(**kwargs)

A DROP that doesn’t store any data.

getIO()

Returns an instance of one of the dlg.io.DataIO instances that handles the data contents of this DROP.

class dlg.drop.PathBasedDrop

Base class for data drops that handle paths (i.e., file and directory drops)

class dlg.drop.RDBMSDrop(**kwargs)

A Drop that stores data in a table of a relational database

getIO()

Returns an instance of one of the dlg.io.DataIO instances that handles the data contents of this DROP.

initialize(**kwargs)

Performs any specific subclass initialization.

kwargs contains all the keyword arguments given at construction time, except those used by the constructor itself. Implementations of this method should make sure that arguments in the kwargs dictionary are removed once they are interpreted so they are not interpreted by accident by another method implementations that might reside in the call hierarchy (in the case that a subclass implementation calls the parent method implementation, which is usually the case).

insert(vals)

Inserts the values contained in the vals dictionary into the underlying table. The keys of vals are used as the column names.

select(columns=None, condition=None, vals=())

Returns the selected values from the table. Users can constrain the result set by specifying a list of columns to be returned (otherwise all table columns are returned) and a condition to be applied, in which case a list of vals to be applied as query parameters can also be given.

class dlg.drop.ShoreDROP(**kwargs)
getIO()

Returns an instance of one of the dlg.io.DataIO instances that handles the data contents of this DROP.

initialize(**kwargs)

Performs any specific subclass initialization.

kwargs contains all the keyword arguments given at construction time, except those used by the constructor itself. Implementations of this method should make sure that arguments in the kwargs dictionary are removed once they are interpreted so they are not interpreted by accident by another method implementations that might reside in the call hierarchy (in the case that a subclass implementation calls the parent method implementation, which is usually the case).

class dlg.drop.dropdict

An intermediate representation of a DROP that can be easily serialized into a transport format such as JSON or XML.

This dictionary holds all the important information needed to call any given DROP constructor. The most essential pieces of information are the DROP’s OID, and its type (which determines the class to instantiate). Depending on the type more fields will be required. This class doesn’t enforce these requirements though, as it only acts as an information container.

This class also offers a few utility methods to make it look more like an actual DROP class. This way, users can use the same set of methods both to create DROPs representations (i.e., instances of this class) and actual DROP instances.

Users of this class are, for example, the graph_loader module which deals with JSON -> DROP representation transformations, and the different repositories where graph templates are expected to be found by the DROPManager.

dlg.s3_drop

Drops that interact with AWS S3

class dlg.s3_drop.S3DROP(oid, uid, **kwargs)

A DROP that points to data stored in S3

bucket

Returns the bucket name :return: the bucket name

exists()

Returns True if the data represented by this DROP exists indeed in the underlying storage mechanism

getIO()

This type of DROP cannot be accessed directly :return:

initialize(**kwargs)
Parameters:kwargs – the dictionary of arguments
key

Return the S3 key :return: the S3 key

path

Returns the path to the S3 object :return: the path

size()

The size of the data pointed by this DROP. Its value is automatically calculated if the data was actually written through this DROP (using the self.write() method directly or indirectly). In the case that the data has been externally written, the size can be set externally after the DROP has been moved to COMPLETED or beyond.

dlg.droputils

Utility methods and classes to be used when interacting with DROPs

class dlg.droputils.DROPFile(drop)

A file-like object (currently only supporting the read() operation, more to be added in the future) that wraps the DROP given at construction time.

Depending on the underlying storage of the data the file-like object returned by this method will directly access the data pointed by the DROP if possible, or will access it through the DROP methods instead.

Objects of this class will automatically close themselves when no referenced anymore (i.e., when __del__ is called), but users should still try to invoke close() eagerly to free underlying resources.

Objects of this class can also be used in a with context.

class dlg.droputils.DROPWaiterCtx(test, drops, timeout=1)

Class used by unit tests to trigger the execution of a physical graph and wait until the given set of DROPs have reached its COMPLETED status.

It does so by appending an EvtConsumer consumer to each DROP before they are used in the execution, and finally checking that the events have been set. It should be used like this inside a test class:

# There is a physical graph that looks like: a -> b -> c
with DROPWaiterCtx(self, c):
    a.write('a')
    a.setCompleted()
class dlg.droputils.EvtConsumer(evt)

Small utility class that sets the internal flag of the given threading.Event object when consuming a DROP. Used throughout the tests as a barrier to wait until all DROPs of a given graph have executed.

dlg.droputils.allDropContents(drop, bufsize=4096)

Returns all the data contained in a given DROP

dlg.droputils.breadFirstTraverse(toVisit)

Breadth-first iterator for a DROP graph.

This iterator yields a tuple where the first item is the node being visited, and the second is a list of nodes that will be visited subsequently. Callers can alter this list in order to remove certain nodes from the graph traversal process.

This implementation is non-recursive.

dlg.droputils.copyDropContents(source, target, bufsize=4096)

Manually copies data from one DROP into another, in bufsize steps

dlg.droputils.depthFirstTraverse(node, visited=[])

Depth-first iterator for a DROP graph.

This iterator yields a tuple where the first item is the node being visited, and the second is a list of nodes that will be visited subsequently. Callers can alter this list in order to remove certain nodes from the graph traversal process.

This implementation is recursive.

dlg.droputils.getDownstreamObjects(drop)

Returns a list of all direct “downstream” DROPs for the given DROP. An DROP A is “downstream” with respect to DROP B if any of the following conditions are true: * A is an output of B (therefore B is an AppDROP) * A is a normal or streaming consumer of B (and A is therefore an AppDROP)

In practice if A is a downstream DROP of B means that it cannot advance to the COMPLETED state until B does so.

dlg.droputils.getLeafNodes(nodes)

Returns a list of all the “leaf nodes” of the graph pointed by nodes. nodes is either a single DROP, or a list of DROPs.

dlg.droputils.getUpstreamObjects(drop)

Returns a list of all direct “upstream” DROPs for the given DROP. An DROP A is “upstream” with respect to DROP B if any of the following conditions are true:

  • A is a producer of B (therefore A is an AppDROP)
  • A is a normal or streaming input of B (and B is therefore an AppDROP)

In practice if A is an upstream DROP of B means that it must be moved to the COMPLETED state before B can do so.

dlg.droputils.get_leaves(pg_spec)

Returns a set with the OIDs of the dropspecs that are the leaves of the given physical graph specification.

dlg.droputils.get_roots(pg_spec)

Returns a set with the OIDs of the dropspecs that are the roots of the given physical graph specification.

dlg.droputils.has_path(x)

Returns True if x has a path attribute

dlg.droputils.listify(o)

If o is already a list return it as is; if o is a tuple returns a list containing the elements contained in the tuple; otherwise returns a list with o being its only element

dlg.droputils.replace_dataurl_placeholders(cmd, inputs, outputs)

Replaces any placeholder found in cmd with the dataURL property of the respective input or output Drop from inputs or outputs. Placeholders have the different formats:

  • %iDataURLN, with N starting from 0, indicates the path of the N-th element from the inputs argument; likewise for %oDataURLN.
  • %iDataURL[X] indicates the path of the input with UID X; likewise for %oDataURL[X].
dlg.droputils.replace_path_placeholders(cmd, inputs, outputs)

Replaces any placeholder found in cmd with the path of the respective input or output Drop from inputs or outputs. Placeholders have the different formats:

  • %iN, with N starting from 0, indicates the path of the N-th element from the inputs argument; likewise for %oN.
  • %i[X] indicates the path of the input with UID X; likewise for %o[X].

dlg.utils

Module containing miscellaneous utility classes and functions.

class dlg.utils.ZlibCompressedStream(content)

An object that takes a input of uncompressed stream and returns a compressed version of its contents when .read() is read.

class dlg.utils.ZlibUncompressedStream(content)

A class that reads gzip-compressed content and returns uncompressed content each time its read() method is called.

dlg.utils.b2s(b, enc='utf8')

Converts bytes into a string

dlg.utils.browse_service(zc, service_type_name, protocol, callback)

ZeroConf: Browse for services based on service type and protocol

callback signature: callback(zeroconf, service_type, name, state_change)
zeroconf: ZeroConf object service_type: zeroconf service name: service name state_change: ServiceStateChange type (Added, Removed)

Returns ZeroConf object

dlg.utils.check_port(host, port, timeout=0, checking_open=True, return_socket=False)

Checks that the port specified by host:port is either open or closed (depending on the value of checking_open) within a given timeout. When checking for an open port, this method will keep trying to connect to it either until the given timeout has expired or until the socket is found open. When checking for a closed port this method will keep trying to connect to it until the connection is unsuccessful, or until the timeout expires. Additionally, if some data is passed and the method is checking_open then data will be written to the socket if it connects successfully.

This method returns True if the port was found on the expected state within the time limit, and False otherwise.

dlg.utils.connect_to(host, port, timeout=None)

Connects to host:port within the given timeout and return the connected socket. If no connection could be established a socket.timeout error is raised

dlg.utils.createDirIfMissing(path)

Creates the given directory if it doesn’t exist

dlg.utils.deregister_service(zc, info)

ZeroConf: Deregister service

dlg.utils.escapeQuotes(s, singleQuotes=True, doubleQuotes=True)

Escapes single and double quotes in a string. Useful to include commands in a shell invocation or similar.

dlg.utils.fname_to_pipname(fname)

Converts a graph filename (assuming it’s a .json file) to its “pipeline” name (the basename without the extension).

dlg.utils.getDlgDir()

Returns the root of the directory structure used by the DALiuGE framework at runtime.

dlg.utils.getDlgLogsDir()

Returns the location of the directory used by the DALiuGE framework to store its logs. If createIfMissing is True, the directory will be created if it currently doesn’t exist

dlg.utils.getDlgPidDir()

Returns the location of the directory used by the DALiuGE framework to store its PIDs. If createIfMissing is True, the directory will be created if it currently doesn’t exist

dlg.utils.get_all_ipv4_addresses()

Get a list of all IPv4 interfaces found in this computer

dlg.utils.get_local_ip_addr()

Enumerate all interfaces and return bound IP addresses (exclude localhost)

dlg.utils.isabs(path)

Like os.path.isabs, but handles None

dlg.utils.object_tracking(name)

Returns a decorator that helps classes track which object is currently under execution. This is done via a thread local object, which can be accessed via the ‘tlocal’ attribute of the returned decorator.

dlg.utils.portIsClosed(host, port, timeout)

Checks if a given host/port is closed, with a given timeout.

dlg.utils.portIsOpen(host, port, timeout=0)

Checks if a given host/port is open, with a given timeout.

dlg.utils.prepare_sql(sql, paramstyle, data=())

Prepares the given SQL statement for proper execution depending on the parameter style supported by the database driver. For this the SQL statement must be written using the “{X}” or “{}” placeholders in place for each, parameter which is a style-agnostic parameter notation.

This method returns a tuple containing the prepared SQL statement and the values to be bound into the query as required by the driver.

dlg.utils.register_service(zc, service_type_name, service_name, ipaddr, port, protocol='tcp')

ZeroConf: Register service type, protocol, ipaddr and port

Returns ZeroConf object and ServiceInfo object

dlg.utils.terminate_or_kill(proc, timeout)

Terminates a process and waits until it has completed its execution within the given timeout. If the process is still alive after the timeout it is killed.

dlg.utils.to_externally_contactable_host(host, prefer_local=False)

Turns host, which is an address used to bind a local service, into a host that can be used to externally contact that service.

This should be used when there is no other way to find out how a client to that service is going to connect to it.

dlg.utils.write_to(host, port, data, timeout=None)

Connects to host:port within the given timeout and write the given piece of data into the connected socket.

dlg.utils.zmq_safe(host_or_addr)

Converts host_or_addr to a format that is safe for ZMQ to use

dlg.graph_loader

Module containing functions to load a fully-functional DROP graph from its full JSON representation.

Adds a link from lhDropSpec to point to rhOID. The link type (e.g., a consumer) is signaled by linkType.

dlg.graph_loader.loadDropSpecs(dropSpecList)

Loads the DROP definitions from dropSpectList, checks that the DROPs are correctly specified, and return a dictionary containing all DROP specifications (i.e., a dictionary of dictionaries) keyed on the OID of each DROP. Unlike readObjectGraph and readObjectGraphS, this method doesn’t actually create the DROPs themselves.

dlg.delayed

dlg.delayed(x, *args, **kwargs)

Like dask.delayed, but quietly swallowing anything other than nout