Server Management and Configuration
Installation and Requirements
The server and its dependencies can be installed from PyPI like so:
$ pip install spalloc_server
The server is currently only compatible with Linux due to its use of the
poll() system call.
Operation
The SpiNNaker partitioning server is started like so:
$ spalloc_server configfile.cfg
The server is configured by a configuration file whose name is supplied as the
mandatory command-line argument. See "Configuration file format" below for an
overview of the config file format.
You can make the server reread its config file by sending a SIGHUP signal to
the server process. Where possible, running jobs will continue to execute
without interruption and queued jobs will automatically be enqueued on any
newly added machines.
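As a minimal sketch, the reload can be triggered from Python (the equivalent of running kill -HUP with the server's process ID in a shell). The helper name request_config_reload is hypothetical, and how the server's PID is obtained (for example from a process supervisor) is left as an assumption.

```python
import os
import signal


def request_config_reload(server_pid):
    """Send SIGHUP to the process with the given PID.

    Hypothetical helper: it only delivers the signal. Rereading the
    config file is done by the server when it receives SIGHUP.
    """
    os.kill(server_pid, signal.SIGHUP)
```

Sending the signal requires permission to signal the server process, so this would normally be run as the same user as the server.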
Command-line usage
There is one mandatory argument, the name of the file holding the configuration definitions, and a number of options that may be specified.
spalloc_server [OPTIONS] CONFIG_FILE [OPTIONS]
The options that may be given are:
- --version
Print the version of spalloc_server and quit.
- --quiet
Hide non-error output.
- --cold-start
Force a cold start, discarding any existing saved state.
- --port PORT
The network port that the service should listen on. Defaults to 22244. (The older style of setting the port through the configuration file is deprecated.)
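The options above can be assembled into an argument list when launching the server from a script. This is only a sketch: build_server_command is a hypothetical helper, and it uses nothing beyond the options listed above.

```python
def build_server_command(config_file, port=None, cold_start=False,
                         quiet=False):
    """Build an argv list for spalloc_server (hypothetical helper)."""
    command = ["spalloc_server", config_file]
    if port is not None:
        # --port takes the port number as a separate argument.
        command += ["--port", str(port)]
    if cold_start:
        command.append("--cold-start")
    if quiet:
        command.append("--quiet")
    return command


print(build_server_command("configfile.cfg", port=22244, cold_start=True))
```

The resulting list can be passed to subprocess.Popen or a similar launcher.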
Stopping the server
The server runs until it is terminated by SIGINT, i.e. pressing Ctrl+C.
When terminated, the server attempts to shut down gracefully, completing any
outstanding board power-management commands and saving its state to disk. When
the server is subsequently restarted, the saved state is restored and operation
may continue as if the server had never been shut down. Alternatively, a
cold start may be enforced using the --cold-start argument when starting the
server.
When the server is terminated, machines allocated to running jobs are left powered on, meaning that users’ jobs are not interrupted by the partitioning server being restarted. If it is necessary to perform major maintenance and fully shut down the server and all controlled SpiNNaker machines, the recommended approach is to add the following line to the bottom of the server’s config file:
configuration = Configuration()
This causes the server to believe all machines have been removed, resulting in all queued and running jobs being cancelled and all previously allocated SpiNNaker boards being powered down. To maximise the chances of all clients realising their jobs have been cancelled, the server should then be left running for a few minutes before being finally shut down.
Configuration file format
A configuration file is used to describe the machines which are to be
managed. Configuration files are Python scripts which define a global
configuration variable which is an instance of the Configuration class.
Note
Everything in the spalloc_server.configuration and spalloc_server.coordinates
modules is implicitly imported into the namespace of the config file.
A minimal (though useless) configuration file would look like so:
configuration = Configuration()
This causes the server to listen on all interfaces on the default port but does
not define any machines for the server to manage. As a result, this server will
cancel all jobs sent to it for lack of a suitable machine. See the
Configuration class for a description of all available configuration options
and default values.
Machines are defined using Machine objects. These specify the dimensions,
broken boards and links, physical layout and IP addresses of a SpiNNaker
machine. All machines are presumed to be interconnected in a valid hexagonal
torus topology constructed from a rectangular array of triads of boards. (See
also spalloc_server.coordinates for details of the coordinate systems used when
referring to boards.)
Defining Machines
Since defining machines completely by hand can be quite verbose (see example below), some convenience functions are provided to deal with the common case of machines constructed in the standard manner.
To define an isolated single-board machine, the Machine.single_board()
constructor may be used as follows:
m = Machine.single_board("my-board",
                         bmp_ip="spinn-board-bmp",
                         spinnaker_ip="spinn-board")
configuration = Configuration(machines=[m])
Most multi-board systems follow a standardised IP addressing scheme and have
their physical layout defined by SpiNNer. The board_locations_from_spinner()
function reads CSV files produced by spinner-ethernet-chips describing machine
layouts, and the Machine.with_standard_ips() constructor produces a Machine
with IP addresses based on the standard IP addressing scheme. These may be used
together like so:
# spinner-ethernet-chips -n 1200 > ethernet_chips.csv
m = Machine.with_standard_ips(
    "million-core-machine",
    board_locations=board_locations_from_spinner("ethernet_chips.csv"),
    base_ip="10.2.0.0",
)
configuration = Configuration(machines=[m])
If neither of the above convenience functions applies to your machine, you can
also explicitly define your machine’s parameters. (Be sure to read about the
coordinates used when referring to boards.)
For example, a desktop 3-board machine may look something like:
m = Machine(name="my-three-board-machine",
            board_locations={
                #X  Y  Z    C  F  B
                (0, 0, 0): (0, 0, 0),
                (0, 0, 1): (0, 0, 2),
                (0, 0, 2): (0, 0, 5),
            },
            # Just one BMP
            bmp_ips={
                #C  F
                (0, 0): "192.168.240.0",
            },
            # Each SpiNNaker board has an IP
            spinnaker_ips={
                #X  Y  Z
                (0, 0, 0): "192.168.240.1",
                (0, 0, 1): "192.168.240.17",
                (0, 0, 2): "192.168.240.41",
            })
configuration = Configuration(machines=[m])
Remember that, since the configuration file is just a normal Python file, you can use any code you like to programmatically define the machines and other settings it contains.
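For instance, a config file might generate several single-board machines in a loop rather than listing each one by hand. This is a sketch only: the machine names and the "spinn-N" / "spinn-N-bmp" hostname pattern are hypothetical, and the fragment is meant to be read as the body of a config file (where Machine and Configuration are already available), not as a standalone script.

```python
# Config-file sketch: Machine and Configuration are implicitly imported
# into the namespace of a spalloc_server config file.

# Hypothetical hostnames following a "spinn-N" / "spinn-N-bmp" pattern.
machines = [
    Machine.single_board(
        "board-{}".format(n),
        bmp_ip="spinn-{}-bmp".format(n),
        spinnaker_ip="spinn-{}".format(n),
    )
    for n in range(4)
]

configuration = Configuration(machines=machines)
```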
Configuration File API Reference
- class Configuration(machines=None, port=22244, ip='', timeout_check_interval=5.0, max_retired_jobs=1200, seconds_before_free=30)
Defines the configuration of a server.
- Parameters:
- machines : [Machine, …]
The list of machines, highest priority first, the server is to manage. (Default: [])
- port : int
The port number the server should listen on. Note that this is now deprecated; the port should be specified by the --port option on the spalloc_server command line. (Default: 22244)
- ip : str
The IP the server should listen on. (Default: “”, i.e. all interfaces)
- timeout_check_interval : float
The number of seconds between the server’s checks for job timeouts. (Default: 5.0)
- max_retired_jobs : int
The number of retired jobs to keep records of. (Default: 1200)
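As an illustration of the parameters above, a config file could restrict the server to the loopback interface and adjust the timeout and history settings. The values shown are arbitrary examples rather than recommendations, and the fragment assumes it is the body of a config file, where Machine and Configuration are already available.

```python
# Config-file sketch: Machine and Configuration are implicitly imported
# into the namespace of a spalloc_server config file.
m = Machine.single_board("my-board",
                         bmp_ip="spinn-board-bmp",
                         spinnaker_ip="spinn-board")

configuration = Configuration(
    machines=[m],
    ip="127.0.0.1",              # listen on the loopback interface only
    timeout_check_interval=1.0,  # check for job timeouts every second
    max_retired_jobs=100,        # keep records of 100 retired jobs
)
```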
- class Machine(name, tags=frozenset({'default'}), width=None, height=None, dead_boards=frozenset({}), dead_links=frozenset({}), board_locations=None, bmp_ips=None, spinnaker_ips=None)
Defines a SpiNNaker machine.
- Parameters:
- name : str
The name of the machine.
- tags : set([str, …])
A set of tags which jobs may use to filter machines by. Note that by default jobs are assigned the ‘default’ tag and thus machines probably ought to have this tag too.
- width, height : int
The dimensions of the machine in triads of boards. If omitted, these are inferred from the boards defined in board_locations and dead_boards.
- dead_boards : set([(x, y, z), …])
The board coordinates of all dead boards in the machine.
- dead_links : set([(x, y, z, Links), …])
The board coordinates of all dead links in the machine. Links to dead boards are implicitly dead and may or may not be included in this set.
- board_locations : {(x, y, z): (c, f, b), …}
Lookup from board coordinate to its physical location in a SpiNNaker machine in terms of cabinet, frame and board position. Must give the coordinates of all working boards.
- bmp_ips : {(c, f): hostname, …}
The IP address of a BMP in every frame of the machine which contains working boards.
- spinnaker_ips : {(x, y, z): hostname, …}
For every working board, gives the IP address of the SpiNNaker board’s Ethernet-connected chip.
- static __new__(cls, name, tags=frozenset({'default'}), width=None, height=None, dead_boards=frozenset({}), dead_links=frozenset({}), board_locations=None, bmp_ips=None, spinnaker_ips=None)
Create new instance of Machine(name, tags, width, height, dead_boards, dead_links, board_locations, bmp_ips, spinnaker_ips)
- classmethod single_board(name, tags=frozenset({'default'}), bmp_ip=None, spinnaker_ip=None)
Convenience constructor. Construct a Machine representing a single SpiNNaker board.
- Parameters:
- name : str
The name of the machine.
- tags : set([tag, …])
The tags to assign to the machine.
- bmp_ip : str
The hostname of the BMP controlling the board.
- spinnaker_ip : str
The hostname of the SpiNNaker board.
- classmethod with_standard_ips(name, tags=frozenset({'default'}), width=None, height=None, dead_boards=frozenset({}), dead_links=frozenset({}), board_locations=None, base_ip='192.168.0.0', cabinet_stride='0.0.5.0', frame_stride='0.0.1.0', board_stride='0.0.0.8', bmp_offset='0.0.0.0', spinnaker_offset='0.0.0.1')
Convenience constructor. Construct a Machine which infers IP addresses of the form conventionally used by SpiNNaker installations.
In standard SpiNNaker installations, IP addresses are allocated in a regular fashion as described below.
IP addresses for a particular machine are allocated within an address range, e.g. 192.168.0.0 - 192.168.255.255.
This address range is then subdivided into address ranges for each frame, for example:
Cabinet 0, Frame 0: 192.168.0.0 - 192.168.0.255
Cabinet 0, Frame 1: 192.168.1.0 - 192.168.1.255
Cabinet 0, Frame 2: 192.168.2.0 - 192.168.2.255
Cabinet 0, Frame 3: 192.168.3.0 - 192.168.3.255
Cabinet 0, Frame 4: 192.168.4.0 - 192.168.4.255
Cabinet 1, Frame 0: 192.168.5.0 - 192.168.5.255
…
Boards within a frame are each allocated their own range of IPs, for example:
Cabinet 0, Frame 0, Board 0: 192.168.0.0 - 192.168.0.7
Cabinet 0, Frame 0, Board 1: 192.168.0.8 - 192.168.0.15
Cabinet 0, Frame 0, Board 2: 192.168.0.16 - 192.168.0.23
…
The IP addresses of the BMP and the Ethernet-connected SpiNNaker chip of each board are at fixed offsets within this range, for example:
Cabinet 0, Frame 0, Board 0, BMP: 192.168.0.0
Cabinet 0, Frame 0, Board 0, SpiNNaker: 192.168.0.1
Cabinet 0, Frame 0, Board 1, BMP: 192.168.0.8
Cabinet 0, Frame 0, Board 1, SpiNNaker: 192.168.0.9
Finally, we assume that board 0’s BMP is to be used as the BMP for controlling all boards in its frame.
- Parameters:
- name : str
The name of the machine.
- tags : iterable([str, …])
A set of tags which jobs may use to filter machines by. Note that by default jobs are assigned the ‘default’ tag and thus machines probably ought to have this tag too.
- width, height : int
The dimensions of the machine in triads of boards. If omitted, these are inferred from the boards defined in board_locations and dead_boards.
- dead_boards : iterable([(x, y, z), …])
The board coordinates of all dead boards in the machine.
- dead_links : iterable([(x, y, z, Links), …])
The board coordinates of all dead links in the machine. Links to dead boards are implicitly dead and may or may not be included in this set.
- board_locations : {(x, y, z): (c, f, b), …}
Lookup from board coordinate to its physical location in a SpiNNaker machine in terms of cabinet, frame and board position. Must give the coordinates of all working boards.
- base_ip : str
The IPv4 address from which the IP address range assigned to the machine starts.
- cabinet_stride : str
The stride in IP addresses between individual cabinets, expressed as an IPv4 address.
- frame_stride : str
The stride in IP addresses between individual frames within a cabinet, expressed as an IPv4 address.
- board_stride : str
The stride in IP addresses between individual boards within a frame, expressed as an IPv4 address.
- bmp_offset : str
The offset of a board’s BMP IP from the start of a board’s IP address range, expressed as an IPv4 address.
- spinnaker_offset : str
The offset of a board’s Ethernet-connected SpiNNaker chip IP from the start of a board’s IP address range, expressed as an IPv4 address.
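The addressing arithmetic described for with_standard_ips can be sketched with the standard-library ipaddress module. This is an illustration of the scheme under the documented default strides, not the library's own implementation; standard_ip and _as_int are hypothetical names.

```python
import ipaddress


def _as_int(dotted):
    """Interpret a dotted-quad string as a 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


def standard_ip(cabinet, frame, board, offset,
                base_ip="192.168.0.0",
                cabinet_stride="0.0.5.0",
                frame_stride="0.0.1.0",
                board_stride="0.0.0.8"):
    """Address = base + cabinet, frame and board strides + offset."""
    value = (_as_int(base_ip)
             + cabinet * _as_int(cabinet_stride)
             + frame * _as_int(frame_stride)
             + board * _as_int(board_stride)
             + _as_int(offset))
    return str(ipaddress.IPv4Address(value))


# Offsets within a board's range, matching the documented defaults.
BMP_OFFSET = "0.0.0.0"
SPINNAKER_OFFSET = "0.0.0.1"

print(standard_ip(0, 0, 1, SPINNAKER_OFFSET))  # 192.168.0.9
print(standard_ip(1, 0, 0, BMP_OFFSET))        # 192.168.5.0
```

Expressing strides as IPv4 addresses means a stride such as 0.0.5.0 is simply the integer 5 × 256, so cabinets, frames and boards can be combined with plain addition.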
- board_locations_from_spinner(filename)
Utility function which converts a CSV file produced by the spinner-ethernet-chips utility into a board_locations dictionary suitable for defining Machine objects.
- Parameters:
- filename : str
The name of a CSV file produced by spinner-ethernet-chips defining the relationship between Ethernet-connected chip coordinates and physical board locations.
This file is expected to have five columns, named in the first line of the CSV: ‘board’, ‘cabinet’, ‘frame’, ‘x’, and ‘y’.
- Returns:
- {(x, y, z): (c, f, b), …}
The mapping from board coordinates to physical locations.
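To make the expected file layout concrete, the sketch below reads a CSV with the five named columns into a dictionary keyed by Ethernet chip coordinates. It is not the library's implementation: read_spinner_csv is a hypothetical name, the sample rows are made up, and the further conversion from chip coordinates to (x, y, z) board coordinates that board_locations_from_spinner performs is deliberately left out.

```python
import csv
import io


def read_spinner_csv(csv_file):
    """Map Ethernet chip (x, y) to physical (cabinet, frame, board)."""
    locations = {}
    # DictReader uses the first line of the file as the column names.
    for row in csv.DictReader(csv_file):
        chip = (int(row["x"]), int(row["y"]))
        locations[chip] = (int(row["cabinet"]),
                           int(row["frame"]),
                           int(row["board"]))
    return locations


# Made-up sample data in the documented five-column layout.
sample = io.StringIO(
    "board,cabinet,frame,x,y\n"
    "0,0,0,0,0\n"
    "2,0,0,8,4\n"
    "5,0,0,4,8\n"
)
print(read_spinner_csv(sample))
```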
Coordinate systems
Utilities for working with board/triad coordinates.
This software assumes that all machines provided to it are interconnected in (a valid subset of) a hexagonal torus topology. Board locations are expressed in one of several forms depending on circumstance.
Board coordinates
(x, y, z) giving the logical location of the board within a hexagonal torus configuration. This is the coordinate system most frequently used by this software.
Physical coordinates
(cabinet, frame, board) giving the physical location of a board in a cabinet. Generally only used when dealing with board power management.
Ethernet chip coordinates
(x, y) giving the chip coordinates of the Ethernet-connected chip at the bottom-left coordinate of a SpiNNaker board. Generally only used when relating information to a client.
To deal with these coordinate systems, a selection of utility functions is
provided in the spalloc_server.coordinates module.
Board coordinates
Board coordinates are given as a tuple (x, y, z).
Systems of SpiNNaker boards are defined in terms of ‘triads’ of boards. The figure below shows a single triad. The ‘z’ part of a board coordinate is the index of the board within its triad; boards are numbered as follows:
 ___
/ 2 \___
\___/ 1 \
/ 0 \___/
\___/
Larger systems are defined by replicating this pattern of triads. Triads are indexed along the X axis as follows:
 ___     ___     ___     ___
/ 2 \___/ 2 \___/ 2 \___/ 2 \___
\___/ 1 \___/ 1 \___/ 1 \___/ 1 \
/ 0 \___/ 0 \___/ 0 \___/ 0 \___/
\___/   \___/   \___/   \___/
  0       1       2       3
And then along the Y axis thus:
 ___     ___     ___     ___
/ 2 \___/ 2 \___/ 2 \___/ 2 \___
\___/ 1 \___/ 1 \___/ 1 \___/ 1 \                3
/ 0 \___/ 0 \___/ 0 \___/ 0 \___/
\___/ 2 \___/ 2 \___/ 2 \___/ 2 \___
    \___/ 1 \___/ 1 \___/ 1 \___/ 1 \            2
    / 0 \___/ 0 \___/ 0 \___/ 0 \___/
    \___/ 2 \___/ 2 \___/ 2 \___/ 2 \___
        \___/ 1 \___/ 1 \___/ 1 \___/ 1 \        1
        / 0 \___/ 0 \___/ 0 \___/ 0 \___/
        \___/ 2 \___/ 2 \___/ 2 \___/ 2 \___
            \___/ 1 \___/ 1 \___/ 1 \___/ 1 \    0
            / 0 \___/ 0 \___/ 0 \___/ 0 \___/
            \___/   \___/   \___/   \___/
              0       1       2       3
Physical coordinates
Physical coordinates are given as a tuple (cabinet, frame, board).
Physical coordinates give the positions of boards within a set of cabinets containing several frames containing several boards. These are indexed as illustrated below, starting from the top-right corner:
          2             1                0
Cabinet --+-------------+----------------+
          |             |                |
    +-------------+  +-------------+  +-------------+    Frame
    |             |  |             |  |             |      |
    | +---------+ |  | +---------+ |  | +---------+ |      |
    | | : : : : | |  | | : : : : | |  | | : : : : |--------+ 0
    | | : : : : | |  | | : : : : | |  | | : : : : | |      |
    | +---------+ |  | +---------+ |  | +---------+ |      |
    | | : : : : | |  | | : : : : | |  | | : : : : |--------+ 1
    | | : : : : | |  | | : : : : | |  | | : : : : | |      |
    | +---------+ |  | +---------+ |  | +---------+ |      |
    | | : : : : | |  | | : : : : | |  | | : : : : |--------+ 2
    | | : : : : | |  | | : : : : | |  | | : : : : | |      |
    | +---------+ |  | +---------+ |  | +---------+ |      |
    | | : : : : | |  | | : : : : | |  | | : : : : |--------+ 3
    | | : : : : | |  | | : : : : | |  | | : : : : | |
    | +---------+ |  | +|-|-|-|-|+ |  | +---------+ |
    |             |  |  | | | | |  |  |             |
    +-------------+  +--|-|-|-|-|--+  +-------------+
                        | | | | |
             Board -----+-+-+-+-+
                        4 3 2 1 0
A mapping from board coordinates to physical coordinates is supplied by the user and is unique to the machine being built. A tool such as SpiNNer may be used to generate such mappings.
Ethernet chip coordinates
Ethernet chip coordinates are given as a tuple (x, y).
Ethernet chip coordinates give the chip coordinates of Ethernet-connected chips at the bottom-left corner of SpiNNaker boards.
Utilities
The following utilities are provided for working with the above coordinate systems.
- link_to_vector = {(0, <Links.east: 0>): (0, -1, 2), (0, <Links.north_east: 1>): (0, 0, 1), (0, <Links.north: 2>): (0, 0, 2), (0, <Links.west: 3>): (-1, 0, 1), (0, <Links.south_west: 4>): (-1, -1, 2), (0, <Links.south: 5>): (-1, -1, 1), (1, <Links.east: 0>): (1, 0, -1), (1, <Links.north_east: 1>): (1, 0, 1), (1, <Links.north: 2>): (1, 1, -1), (1, <Links.west: 3>): (0, 0, 1), (1, <Links.south_west: 4>): (0, 0, -1), (1, <Links.south: 5>): (0, -1, 1), (2, <Links.east: 0>): (0, 0, -1), (2, <Links.north_east: 1>): (1, 1, -2), (2, <Links.north: 2>): (0, 1, -1), (2, <Links.west: 3>): (0, 1, -2), (2, <Links.south_west: 4>): (-1, 0, -1), (2, <Links.south: 5>): (0, 0, -2)}
A lookup from (z, spalloc_server.links.Links) to (dx, dy, dz).
- board_down_link(x1, y1, z1, link, width, height)
Get the coordinates of the board down the specified link.
- Parameters:
- x1, y1, z1 : int
The board coordinates from which a link will be traversed.
- link : spalloc_server.links.Links
The link to follow.
- width, height : int
The dimensions of the system in triads.
- Returns:
- x, y, z : int
The coordinates of the board down the specified link.
- wrapped : WrapAround
The way in which we wrapped around, if at all, when following that link.
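The way link_to_vector is used to step from one board to its neighbour can be sketched as follows. The Links class here is a stand-in with the member values shown in the lookup above, the table is copied from link_to_vector, and follow_link is a hypothetical name. Wrapping with modulo arithmetic is an assumption about how the torus is closed, and unlike board_down_link this sketch does not report which wrap-around occurred.

```python
from enum import IntEnum


class Links(IntEnum):
    """Stand-in for spalloc_server.links.Links (values as listed above)."""
    east = 0
    north_east = 1
    north = 2
    west = 3
    south_west = 4
    south = 5


# Same content as link_to_vector: (z, link) -> (dx, dy, dz).
LINK_TO_VECTOR = {
    (0, Links.east): (0, -1, 2), (0, Links.north_east): (0, 0, 1),
    (0, Links.north): (0, 0, 2), (0, Links.west): (-1, 0, 1),
    (0, Links.south_west): (-1, -1, 2), (0, Links.south): (-1, -1, 1),
    (1, Links.east): (1, 0, -1), (1, Links.north_east): (1, 0, 1),
    (1, Links.north): (1, 1, -1), (1, Links.west): (0, 0, 1),
    (1, Links.south_west): (0, 0, -1), (1, Links.south): (0, -1, 1),
    (2, Links.east): (0, 0, -1), (2, Links.north_east): (1, 1, -2),
    (2, Links.north): (0, 1, -1), (2, Links.west): (0, 1, -2),
    (2, Links.south_west): (-1, 0, -1), (2, Links.south): (0, 0, -2),
}


def follow_link(x, y, z, link, width, height):
    """Return the board reached from (x, y, z) via link on a torus."""
    dx, dy, dz = LINK_TO_VECTOR[(z, link)]
    # Assumed: triad coordinates wrap modulo the machine dimensions.
    return ((x + dx) % width, (y + dy) % height, z + dz)


print(follow_link(0, 0, 0, Links.north, 2, 2))  # (0, 0, 2)
print(follow_link(0, 0, 0, Links.west, 2, 2))   # (1, 0, 1)
```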
- board_to_chip(x, y, z)
Convert a board coordinate into a chip coordinate.
Assumes a regular torus composed of SpiNN-5 boards.
- Parameters:
- x, y, z : int
Board coordinates.
- Returns:
- x, y : int
Chip coordinates.
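A rough sketch of this conversion is given below. It rests on assumptions not stated in this document: that each triad of SpiNN-5 boards spans 12 × 12 chips, and that the Ethernet chips of boards z = 0, 1 and 2 sit at offsets (0, 0), (8, 4) and (4, 8) within the triad. The name board_to_chip_sketch is hypothetical; consult the library source for the definitive behaviour.

```python
# Assumed size of one triad of SpiNN-5 boards, in chips per axis.
TRIAD_SIZE = 12

# Assumed Ethernet-chip offsets of boards z = 0, 1, 2 within a triad.
TRIAD_CHIP_OFFSETS = {0: (0, 0), 1: (8, 4), 2: (4, 8)}


def board_to_chip_sketch(x, y, z):
    """Board (x, y, z) -> Ethernet chip (x, y), under the assumptions above."""
    dx, dy = TRIAD_CHIP_OFFSETS[z]
    return (x * TRIAD_SIZE + dx, y * TRIAD_SIZE + dy)


print(board_to_chip_sketch(1, 0, 1))  # (20, 4)
```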
- chip_to_board(x, y, w, h)
Convert a chip coordinate into a board coordinate.
Assumes a regular torus composed of SpiNN-5 boards.
- Parameters:
- x, y : int
Chip coordinates.
- w, h : int
Dimensions of the system, in chips.
- Returns:
- x, y, z : int
Board coordinates.
- triad_dimensions_to_chips(w, h, torus)
Convert the dimensions of a system from numbers of triads to numbers of chips in the underlying network.
Assumes a regular torus composed of SpiNN-5 boards.
- Parameters:
- w, h : int
Dimensions of the system in triads.
- torus : WrapAround
What wrap-around connections are present?
- Returns:
- w, h : int
Dimensions of the SpiNNaker chip network in the specified machine, e.g. for booting.
- class WrapAround(value)
Defines what type of wrap-around links a torus has, if any.
Values chosen have the following useful properties:
>>> # Can be meaningfully cast to bool
>>> assert bool(WrapAround.none) is False
>>> assert bool(WrapAround.x) is True
>>> assert bool(WrapAround.y) is True
>>> assert bool(WrapAround.both) is True
>>> # Bit-operations make sense
>>> assert bool(WrapAround.both & WrapAround.x) is True
>>> assert bool(WrapAround.both & WrapAround.y) is True
>>> assert bool(WrapAround.x & WrapAround.x) is True
>>> assert bool(WrapAround.x & WrapAround.y) is False
- none = 0
No wrap-around links.
- x = 1
Has wrap-around links around the X axis.
- y = 2
Has wrap-around links around the Y axis.
- both = 3
Has wrap-around links on both the X and Y axes.