arx

Arx offers a structured, auditable, repeatable way to combine data files from disparate sources with commands to run against them. It draws inspiration from Dockerfiles and Cloud Config and (even) curl ... | sh.

Arx is built around four core data types:

A Bundle is built from Codes and Datas, which themselves reference Sources. A Source can reference local files, inline data or a URL. Sources can have file or directory nature: a Git repository has directory nature while a direct HTTP reference has file nature.

The Arx API provides for flexible reading and writing of bundles and direct, programmatic construction thereof. Running, repacking and auditing bundles can be performed from the command line or through the Arx API.

The following URL schemes are supported as Sources:

URL scheme(s)                                    Source class
file://...                                       File
tar+file://...                                   FileTar
git+ssh://..., git+http://..., git+file://...    Git
http://..., https://...                          HTTP
tar+http://..., tar+https://...                  HTTPTar
s3://...                                         S3
tar+s3://...                                     S3Tar
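
The scheme table can be read as a lookup from URL scheme to source class. The following is a minimal sketch of that lookup using only the standard library; the function name source_class_name and the dictionary are illustrative assumptions, not part of the Arx API, and the class names are returned as plain strings.

```python
from urllib.parse import urlsplit

# Scheme -> source class name, transcribed from the table of supported schemes.
SCHEME_TO_SOURCE = {
    "file": "File",
    "tar+file": "FileTar",
    "git+ssh": "Git",
    "git+http": "Git",
    "git+file": "Git",
    "http": "HTTP",
    "https": "HTTP",
    "tar+http": "HTTPTar",
    "tar+https": "HTTPTar",
    "s3": "S3",
    "tar+s3": "S3Tar",
}


def source_class_name(url):
    """Return the name of the source class the table assigns to `url`.

    Hypothetical helper for illustration; Arx's own dispatch may differ.
    """
    scheme = urlsplit(url).scheme
    try:
        return SCHEME_TO_SOURCE[scheme]
    except KeyError:
        raise ValueError(f"unsupported scheme: {scheme!r}") from None
```

For instance, source_class_name('tar+s3://bucket/release.tgz') looks up the tar+s3 scheme and yields the S3Tar entry.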

An Example

Consider a from-source deployment of a web application. Maybe our application is hosted on GitHub, its Nginx configuration is separately maintained, and it needs an internal secrets file. In Arx, it looks like this:

code:
  - [sh, -c, 'service app stop ; service app start']
  - [sh, -c, 'service nginx stop ; service nginx start']
data:
  - /srv/app: git+ssh://github.com/examplecom/app.git
  - /etc/nginx: git+ssh://github.com/examplecom/sys.git#nginx
  - /etc/default/app: https://secrets.internal.example.com/generate/env

Or:

from arx import arx

b = arx.Bundle(
    arx.Code('sh', '-c', 'service app stop ; service app start'),
    arx.Code('sh', '-c', 'service nginx stop ; service nginx start'),
    arx.Data('git+ssh://github.com/examplecom/app.git', '/srv/app'),
    arx.Data('git+ssh://github.com/examplecom/sys.git#nginx', '/etc/nginx'),
    arx.Data('https://secrets.internal.example.com/generate/env',
             '/etc/default/app'),
)

Calling arx.run(b) (or arx /path/to/arx.yaml) will step through all the data, unpacking it to the specified locations, and then run the two commands specified with arx.bundle.Code (or in YAML, code).
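
To make the ordering concrete, here is a small sketch that turns the dictionary form of the example manifest into an ordered list of steps: every data entry first, then every command. The function plan and the step tuples are assumptions made for illustration; they do not fetch or run anything and are not how Arx itself is implemented.

```python
# The example manifest, as the dictionary a YAML parser would produce.
MANIFEST = {
    "code": [
        ["sh", "-c", "service app stop ; service app start"],
        ["sh", "-c", "service nginx stop ; service nginx start"],
    ],
    "data": [
        {"/srv/app": "git+ssh://github.com/examplecom/app.git"},
        {"/etc/nginx": "git+ssh://github.com/examplecom/sys.git#nginx"},
        {"/etc/default/app": "https://secrets.internal.example.com/generate/env"},
    ],
}


def plan(manifest):
    """List the steps in the order described: unpack all data, then run code.

    Hypothetical helper; it only describes the steps, it does not execute them.
    """
    steps = []
    for entry in manifest.get("data", []):
        for target, url in entry.items():
            steps.append(("unpack", url, target))
    for argv in manifest.get("code", []):
        steps.append(("run", list(argv)))
    return steps
```

Calling plan(MANIFEST) gives three unpack steps followed by two run steps, regardless of the order in which the code and data keys appear in the manifest.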

Convenience API

from arx import arx imports an Arx API object [1] that provides a few convenience methods for object creation.

Arx.Bundle(*args, **kwargs)[source]
A convenience method for obtaining Bundles which can be called with a variety of argument types.
Bundle(code=[Code], data=[Data], cwd=str, label=str, env=dict)

With keyword arguments. These keyword arguments can all be empty – the default bundle has no additional environment, runs in a temporary directory and does not have a custom label.

Bundle(a: Code|Data, b: Code|Data, c: Code|Data, ...)

With Code and Data objects as arguments, in any order.

Bundle(h: io.IOBase)

With a handle, that is, an open file or StringIO-like object. The contents are read and passed through a YAML parser. The resulting dictionary object is passed to the dictionary form mentioned below.

Bundle({...}: dict)

With a dict, as is provided by parsing a JSON or YAML file. The arguments of the dict should mirror those of the keyword arguments form, above.

Note that the keyword arguments cwd, label and env from the first form can also be passed to any of the other forms.
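
One way to picture the four call forms is as a function that reduces each of them to the keyword form. The sketch below does this with stand-in Code and Data classes and a hypothetical normalize function; none of these names are the Arx implementation. To stay within the standard library it parses the handle with json rather than a YAML parser, which is a deliberate substitution.

```python
import io
import json
from dataclasses import dataclass


@dataclass
class Code:
    """Hypothetical stand-in for arx.bundle.Code."""
    argv: tuple


@dataclass
class Data:
    """Hypothetical stand-in for arx.bundle.Data."""
    source: str
    target: str


def normalize(*args, **kwargs):
    """Reduce the handle, dict, positional and keyword forms to one dict."""
    spec = {"code": [], "data": [], "cwd": None, "label": None, "env": {}}
    if len(args) == 1 and isinstance(args[0], io.IOBase):
        # Handle form: parse the contents, then continue as the dict form.
        # JSON stands in for YAML here to avoid a third-party dependency.
        args = (json.load(args[0]),)
    if len(args) == 1 and isinstance(args[0], dict):
        spec.update(args[0])
    else:
        # Positional form: Code and Data objects in any order.
        for item in args:
            if isinstance(item, Code):
                spec["code"].append(item)
            elif isinstance(item, Data):
                spec["data"].append(item)
            else:
                raise TypeError(f"unexpected argument: {item!r}")
    # cwd, label and env (and the keyword form itself) are applied last.
    spec.update(kwargs)
    return spec
```

With this shape, normalize(Data(...), Code(...), label='demo') and normalize(io.StringIO('{"cwd": "/tmp"}')) both end up as the same kind of dictionary, which is the property the description of the forms relies on.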

Arx.Code(*args, **kwargs)[source]

Convenience method for constructing Code, which handles translation of strings and simple datatypes to Sources.

Arx.Data(*args, **kwargs)[source]

Convenience method for constructing Data, which handles translation of strings and simple datatypes to Sources.

Core API

The core API is composed of less convenient, but also less magical and more uniform, types and functions.

class arx.bundle.Bundle(*args, **kwargs)[source]
class arx.bundle.Code(*args, **kwargs)[source]
class arx.bundle.Data(*args, **kwargs)[source]
class arx.sources.core.Source[source]

ABC for sources.

cache(path) → File[source]

Generates a filesystem local source from this source.

The File thus returned can be used in place of the original source.

The method is passed a created, temporary directory which is cleared by the implementation. The File source should be stored somewhere under this directory, but it is allowed to store metadata, like checksums, alongside it.

It is advisable that this method reuse data when it is already present in the cache; but this is not a requirement. An implementation that clears the cache and pulls the data anew each time will not break clients.

For File and its subclasses, cache returns self.
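
The contract of cache can be sketched with a toy hierarchy. The classes below are stand-ins written for this illustration, not the classes in arx.sources; CopiedFile in particular is hypothetical and simply copies a local file to play the role of a remote source. It reuses data already present in the cache directory, which the contract recommends but does not require.

```python
import shutil
from abc import ABC, abstractmethod
from pathlib import Path


class Source(ABC):
    """Stand-in for the sources ABC: every source can be cached locally."""

    @abstractmethod
    def cache(self, path):
        """Return a File under `path` usable in place of this source."""


class File(Source):
    """Stand-in for a filesystem-local source."""

    def __init__(self, path):
        self.path = Path(path)

    def cache(self, path):
        # Already local, so there is nothing to fetch.
        return self


class CopiedFile(Source):
    """Hypothetical non-local source, simulated by copying a file."""

    def __init__(self, origin):
        self.origin = Path(origin)

    def cache(self, path):
        target = Path(path) / "data"
        if not target.exists():
            # Only fetch when the cache directory has no copy yet.
            shutil.copyfile(self.origin, target)
        return File(target)
```

A client holding any Source can call cache(directory) and continue with the returned File, without caring whether the data was fetched just now or reused.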

class arx.err.Err(*args, **kwargs)[source]

Base class for custom exceptions in this package.

Sources

class arx.sources.files.File(*args, **kwargs)[source]

Local files and directories as Arx sources.

Local file URLs are resolved at the bundling step, allowing vendored dependencies and local configuration to be captured and forwarded as part of sending an Arx manifest off to be run.

File URLs are generally absolute paths; but in practice one will want paths relative to the Arx manifest file for vendored dependencies and relative to the user’s home directory for settings. To accommodate this need, Arx introduces a virtual directory, @.

  • file:///@/./ refers to the directory in which the Arx manifest file is present. Say one wants to reference package.json; this would be file:///@/./package.json.
  • The URL file:///@/~/ refers to the user’s home directory.
  • The URL file:///@/~x/ refers to the home directory of user x and so on and so forth for every user of the system.

Now the question arises, what if we have a directory @ at the system root? That is what file:///@/ normally refers to. Because @ is a reserved character in URIs, a literal interpretation can be forced by percent encoding it: file:///%40/.
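
The rules for the virtual directory can be sketched as a small resolver. This is an illustration under stated assumptions: resolve_file_url and its manifest_dir parameter are hypothetical names, home directories are expanded with os.path.expanduser, and any other path under /@/ is rejected because the text does not say what it should mean.

```python
import os
from urllib.parse import unquote, urlsplit

VIRTUAL_ROOT = "/@/"


def resolve_file_url(url, manifest_dir):
    """Map a file:// URL to a filesystem path, honouring the @ directory.

    Hypothetical helper; `manifest_dir` is the directory holding the manifest.
    """
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError(f"not a file URL: {url!r}")
    raw = parts.path
    if not raw.startswith(VIRTUAL_ROOT):
        # Ordinary absolute path. A percent-encoded %40 decodes to a literal
        # @ only here, after the virtual-directory check has been skipped.
        return unquote(raw)
    rest = unquote(raw[len(VIRTUAL_ROOT):])
    head = rest.split("/", 1)[0]
    if head.startswith("~"):
        # /@/~/ is the current user's home, /@/~x/ is the home of user x.
        return os.path.expanduser(rest)
    if head in (".", ".."):
        # /@/./ is the manifest's directory.
        return os.path.normpath(os.path.join(manifest_dir, rest))
    raise ValueError(f"unrecognised path under the virtual directory: {raw!r}")
```

The check for the virtual directory runs on the still-encoded path, so file:///%40/data is treated as the literal /@/data while file:///@/./package.json is resolved next to the manifest.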

class arx.sources.files.FileTar(*args, **kwargs)[source]
class arx.sources.git.Git(*args, **kwargs)[source]

Git repositories as Arx sources.

These sources have directory nature by default but do support fragments to indicate that only certain files or directories should be extracted.

Query parameters can be used to indicate a particular branch, tag or SHA:

# Try to find a branch or a tag called beta.
git+ssh://abc.example.com/web/server.git?beta

# Same as above.
git+ssh://abc.example.com/web/server.git?ref=beta

# Find a SHA beginning with 0abc3df.
git+ssh://abc.example.com/web/server.git?0abc3df

# Same as above.
git+ssh://abc.example.com/web/server.git?ref=0abc3df

The git+ssh:// and git+http:// schemes are passed, minus the leading git+, to git.
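
As a rough sketch of how such a URL decomposes, the function below splits a git+ URL into the remote handed to git (the URL minus the leading git+), the requested ref, and the fragment. The name parse_git_url and the returned tuple are assumptions for illustration, not the Arx implementation.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def parse_git_url(url):
    """Split a git+... URL into (remote, ref, fragment).

    Hypothetical helper. `ref` is the bare query string or the value of
    the `ref` parameter; `fragment` is the part after `#`, if any.
    """
    parts = urlsplit(url)
    if not parts.scheme.startswith("git+"):
        raise ValueError(f"not a git+ URL: {url!r}")
    query = parts.query
    if "=" in query:
        # Long form: ?ref=beta
        ref = parse_qs(query).get("ref", [None])[0]
    else:
        # Short form: ?beta (or no query at all)
        ref = query or None
    # Drop the leading "git+" and strip the query and fragment.
    remote = urlunsplit((parts.scheme[4:], parts.netloc, parts.path, "", ""))
    return remote, ref, parts.fragment or None
```

Both git+ssh://abc.example.com/web/server.git?beta and the ?ref=beta spelling produce the same ref, matching the "same as above" pairs in the examples.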

As discussed in File, git+file:/// URLs can point to repositories in home or the project directory using /@/~, /@/. or /@/.. as the path prefix. To reference the Git repository local to the manifest, use: git+file:///@/..

class arx.sources.http.HTTP(*args, **kwargs)[source]

Links to files available over HTTP/S.

The URL can contain query parameters (?...) but not a fragment (#...). All HTTP URLs are treated as having file nature.

class arx.sources.http.HTTPTar(*args, **kwargs)[source]

Links to tar archives available over HTTP/S.

These URLs have directory nature unless a fragment is passed, as described under Tar.

class arx.sources.s3.S3(*args, **kwargs)[source]

Links to objects in S3.

The URL can end with a / to give it directory nature; otherwise it has file nature. With directory nature, the directory is unpacked recursively.
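
The trailing-slash rule is simple enough to sketch directly. The helper below is hypothetical (s3_nature is not an Arx function) and only inspects the URL; it does not contact S3.

```python
from urllib.parse import urlsplit


def s3_nature(url):
    """Return (bucket, key, nature) for an s3:// URL.

    Hypothetical helper: nature is "directory" when the URL ends with a
    slash and "file" otherwise, following the rule stated above.
    """
    parts = urlsplit(url)
    if parts.scheme != "s3":
        raise ValueError(f"not an s3 URL: {url!r}")
    nature = "directory" if parts.path.endswith("/") else "file"
    # The bucket is the authority part; the key is the path without
    # its leading slash.
    return parts.netloc, parts.path.lstrip("/"), nature
```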

class arx.sources.s3.S3Tar(*args, **kwargs)[source]

Links to tar archives available over S3.

Note that these URLs may not end with a slash.

These URLs have directory nature unless a fragment is passed, as described under Tar.

Source Mixins

class arx.sources.tar.Tar[source]

With tar+... type URLs, passing a fragment can give the source file nature; but by default it has directory nature.

A fragment can reference a subdirectory or a particular file in the archive; in the latter case, the URL can be treated as a runnable program.

There is a convention with release tarballs, to have a single top-level directory that is named something like <project>-<version>. Tar offers --strip-components for this situation, where the archive is expanded as though only the top-level directory had been asked for. To access this functionality, use #/ (this is the same as --strip-components 1). Supplying #// would expand from all second-level directories (of which there is hopefully only one), &c.
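
The slash-only fragments can be sketched as a count of leading path components to drop from each archive member. The two functions below are illustrative assumptions, not Arx code: strip_count handles only fragments made up entirely of slashes, and strip_components mimics the effect of tar's --strip-components on a single member name.

```python
def strip_count(fragment):
    """Number of components to strip for a slash-only fragment.

    "/" means 1 (like --strip-components 1), "//" means 2, and so on.
    Hypothetical helper; other fragment forms are out of scope here.
    """
    if not fragment or fragment.strip("/"):
        raise ValueError(f"not a slash-only fragment: {fragment!r}")
    return len(fragment)


def strip_components(member, count):
    """Drop the first `count` components of an archive member name.

    Returns None when the member has nothing left after stripping,
    which corresponds to the member being skipped on extraction.
    """
    parts = [p for p in member.split("/") if p and p != "."]
    if len(parts) <= count:
        return None
    return "/".join(parts[count:])
```

With the fragment #/ and a release tarball whose members all sit under app-1.2.3/, a member such as app-1.2.3/bin/run would be written to bin/run, and the top-level directory entry itself would be skipped.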

APIs for Extension

To customize how URLs are interpreted, how tasks are run and how logging is performed, you’ll need these APIs.

class arx.Arx[source]

An Arx API object encapsulates configuration for the convenience API.

API configuration includes an interpreter (a callable) that translates strings and simple string dictionaries to URLs, as well as settings for temporary directory setup and task logging.

class arx.sources.interpreter.Interpreter(uri_handlers=[], data_handlers=[])[source]

An interpreter translates strings and simple string dictionaries to sources.
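
The shape of the handlers is not spelled out here, so the following is only a guess at how such an interpreter could be organised: each handler is assumed to be a callable that returns a source for specs it recognises and None otherwise, with string specs going to uri_handlers and dictionary specs to data_handlers. SketchInterpreter and the example handlers are hypothetical and are not the Arx classes.

```python
class SketchInterpreter:
    """Illustrative interpreter: try handlers in order until one accepts."""

    def __init__(self, uri_handlers=(), data_handlers=()):
        self.uri_handlers = list(uri_handlers)
        self.data_handlers = list(data_handlers)

    def __call__(self, spec):
        # Assumption: strings are URLs, dictionaries are inline data specs.
        handlers = self.uri_handlers if isinstance(spec, str) else self.data_handlers
        for handler in handlers:
            result = handler(spec)
            if result is not None:
                return result
        raise ValueError(f"no handler accepted {spec!r}")


def http_handler(url):
    """Hypothetical handler: claims http:// and https:// URLs."""
    if url.startswith(("http://", "https://")):
        return ("HTTP", url)
    return None


def inline_handler(spec):
    """Hypothetical handler: claims dictionaries with an `inline` key."""
    if "inline" in spec:
        return ("Inline", spec["inline"])
    return None
```

An instance built as SketchInterpreter([http_handler], [inline_handler]) is itself a callable from spec to source, which is the role the text assigns to an interpreter.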

arx.sources.interpreter.default(str|dict) → Source

Provides the default mapping of source specs to classes.

Footnotes

[1] It is possible to construct another such object, an arx.Arx, to customize how URLs are interpreted.