Available Plugins¶
In this section we present all the plugins that are shipped along with Watcher. If you want to know which plugins your Watcher services have access to, you can use the Guru Meditation Reports to display them.
Goals¶
airflow_optimization¶
AirflowOptimization
This goal is used to optimize the airflow within a cloud infrastructure.
cluster_maintaining¶
ClusterMaintenance
This goal is used to maintain compute nodes without interrupting the user's applications.
hardware_maintenance¶
HardwareMaintenance
This goal is used to migrate instances and volumes away from a set of compute and storage nodes that are under maintenance.
noisy_neighbor¶
NoisyNeighborOptimization
This goal is used to identify and migrate a noisy neighbor: a low-priority VM that negatively affects the performance of a high-priority VM in terms of IPC by over-utilizing the Last Level Cache.
server_consolidation¶
ServerConsolidation
This goal is for efficient usage of compute server resources in order to reduce the total number of servers.
thermal_optimization¶
ThermalOptimization
This goal is used to balance the temperature across different servers.
unclassified¶
Unclassified
This goal is used to ease the development process of a strategy. Containing no actual indicator specification, this goal can be used whenever a strategy has yet to be formally associated with an existing goal. If the goal to achieve has been identified but there is no available implementation yet, this goal can also be used as a transitional stage.
workload_balancing¶
WorkloadBalancing
This goal is used to evenly distribute workloads across different servers.
Scoring Engines¶
dummy_scorer¶
Sample Scoring Engine implementing simplified workload classification.
Typically a scoring engine would be implemented using machine learning techniques. For example, for a workload classification problem the solution could consist of the following steps:
Define a problem to solve: we want to detect the workload on the machine based on the collected metrics like power consumption, temperature, CPU load, memory usage, disk usage, network usage, etc.
The workloads could be predefined, e.g. IDLE, CPU-INTENSIVE, MEMORY-INTENSIVE, IO-BOUND, … or we could let the ML algorithm find the workloads based on the learning data provided. This decision determines the type of learning algorithm used (supervised vs. unsupervised learning).
Collect metrics from sample servers (learning data).
Define the analytical model, pick ML framework and algorithm.
Apply the learning data to the model. Once trained, the model becomes a scoring engine and can start making predictions or classifications.
Wrap the scoring engine in a class like this one, so it has a standard interface and can be used inside Watcher.
This class is a greatly simplified version of the above model. The goal is to provide an example of how such a class could be implemented and used in Watcher, without adding extra dependencies like machine learning frameworks (which can be quite heavy) or over-complicating its internal implementation, which would distract from the overall picture.
That said, this class implements a workload classification “manually” (in plain Python code) and is not intended to be used in production.
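For illustration only, a minimal, hypothetical sketch of such a hand-written classifier could look like the following; the class name, metric names, and JSON-based input/output format are assumptions and do not reflect Watcher's actual scoring engine API:
import json

class SimpleWorkloadScorer(object):
    """Hypothetical rule-based workload classifier (not Watcher code)."""

    def calculate_score(self, features):
        # 'features' is assumed to be a JSON string of collected metrics,
        # e.g. '{"cpu_util": 0.9, "memory_util": 0.2}'.
        metrics = json.loads(features)
        cpu = metrics.get('cpu_util', 0.0)
        mem = metrics.get('memory_util', 0.0)

        # Plain-Python rules stand in for a trained ML model.
        if cpu < 0.1 and mem < 0.1:
            workload = 'IDLE'
        elif cpu >= mem:
            workload = 'CPU-INTENSIVE'
        else:
            workload = 'MEMORY-INTENSIVE'

        # The classification result is returned as a JSON string too.
        return json.dumps({'workload': workload})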
Scoring Engine Containers¶
dummy_scoring_container¶
Sample Scoring Engine container returning a list of scoring engines.
Please note that it can be used in dynamic scenarios, where the returned list is built from some external configuration (e.g. in a database). In order for these scoring engines to become discoverable in the Watcher API and Watcher CLI, a database re-sync is required. It can be executed using the watcher-sync tool, for example.
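As a purely illustrative sketch (the class and method names are assumptions, not Watcher's actual container API), a container essentially exposes the list of scoring engine instances it provides, and that list could just as well be assembled dynamically; the SimpleWorkloadScorer reused here is the hypothetical class sketched in the previous section:
class SimpleScoringContainer(object):
    """Hypothetical container listing the scoring engines it provides."""

    @classmethod
    def get_scoring_engine_list(cls):
        # In a dynamic scenario this list could be assembled from external
        # configuration (e.g. rows in a database) instead of being static.
        return [SimpleWorkloadScorer()]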
Strategies¶
actuator¶
Actuator
Actuator that simply executes the actions given as parameters
This strategy allows anyone to create an action plan with a predefined set of actions. This strategy can be used for 2 different purposes:
Test actions
Use this strategy based on an event trigger to perform some explicit task
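As a hedged illustration of the kind of input such a strategy could receive (the exact parameter format accepted by a deployment may differ), a predefined set of actions could look like this, reusing the nop and sleep actions documented below:
# Hypothetical list of predefined actions handed to the actuator strategy;
# 'action_type' and 'input_parameters' mirror the action fields documented
# in this section.
actions = [
    {'action_type': 'nop',
     'input_parameters': {'message': 'triggered by an external event'}},
    {'action_type': 'sleep',
     'input_parameters': {'duration': 5.0}},
]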
Actions¶
change_node_power_state¶
Compute node power on/off
By using this action, you will be able to power a compute node on or off.
The action schema is:
schema = Schema({
    'resource_id': str,
    'state': str,
})
The resource_id references an ironic node id (the list of available ironic nodes is returned by this command: ironic node-list).
The state value should either be on or off.
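For example, assuming the Schema above is a voluptuous Schema (as the syntax suggests), the input parameters for powering a node off could be validated as follows; the node id is a placeholder:
from voluptuous import Schema

schema = Schema({
    'resource_id': str,
    'state': str,
})

# Placeholder ironic node id; use a real one from the ironic node-list output.
schema({
    'resource_id': '6f8a3571-2d4e-4f1b-9c0a-0e4f3d2b1a55',
    'state': 'off',
})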
migrate¶
Migrates a server to a destination nova-compute host
This action will allow you to migrate a server to another destination compute host. Migration type ‘live’ can only be used for migrating active VMs. Migration type ‘cold’ can be used for migrating non-active VMs as well as active VMs, which will be shut down while migrating.
The action schema is:
schema = Schema({
    'resource_id': str,  # should be a UUID
    'migration_type': str,  # choices -> "live", "cold"
    'destination_node': str,
    'source_node': str,
})
The resource_id is the UUID of the server to migrate.
The source_node and destination_node parameters are respectively the source and the destination compute hostnames (the list of available compute hosts is returned by this command: nova service-list --binary nova-compute).
Note
Nova API version must be 2.56 or above if destination_node parameter is given.
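For example, assuming a voluptuous Schema as above, a live migration request could be expressed as follows; the server UUID and compute hostnames are placeholders:
from voluptuous import Schema

schema = Schema({
    'resource_id': str,
    'migration_type': str,
    'destination_node': str,
    'source_node': str,
})

# Placeholder server UUID and compute hostnames.
schema({
    'resource_id': '2c0b4e78-8f1d-4a3b-9c6e-5d7f0a1b2c3d',
    'migration_type': 'live',
    'source_node': 'compute-01',
    'destination_node': 'compute-02',
})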
nop¶
Logs a message
The action schema is:
schema = Schema({
    'message': str,
})
The message is the actual message that will be logged.
resize¶
Resizes a server with the specified flavor.
This action will allow you to resize a server to another flavor.
The action schema is:
schema = Schema({
    'resource_id': str,  # should be a UUID
    'flavor': str,  # should be either ID or Name of Flavor
})
The resource_id is the UUID of the server to resize. The flavor is the ID or name of the flavor (Nova's resize() accepts either the flavor ID or its name).
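For example, assuming a voluptuous Schema as above, the parameters for resizing a server could be validated like this; the server UUID and flavor name are placeholders:
from voluptuous import Schema

schema = Schema({
    'resource_id': str,
    'flavor': str,
})

# Placeholder server UUID; the flavor can be given by ID or by name.
schema({
    'resource_id': '9e2d1c0b-4a3f-4e5d-8b7a-6c5d4e3f2a1b',
    'flavor': 'm1.large',
})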
sleep¶
Makes the executor of the action plan wait for a given duration
The action schema is:
schema = Schema({
    'duration': float,
})
The duration is expressed in seconds.
volume_migrate¶
Migrates a volume to destination node or type
By using this action, you will be able to migrate a cinder volume. Migration type ‘swap’ can only be used for migrating an attached volume. Migration type ‘migrate’ can be used for migrating a detached volume to a pool of the same volume type. Migration type ‘retype’ can be used for changing the volume type of a detached volume.
The action schema is:
schema = Schema({
    'resource_id': str,  # should be a UUID
    'migration_type': str,  # choices -> "swap", "migrate", "retype"
    'destination_node': str,
    'destination_type': str,
})
The resource_id is the UUID of the cinder volume to migrate.
The destination_node is the destination block storage pool name (the list of available pools is returned by this command: cinder get-pools); it is mandatory for migrating a detached volume to a pool with the same volume type.
The destination_type is the destination block storage type name (the list of available types is returned by this command: cinder type-list); it is mandatory for migrating a detached volume or swapping an attached volume to one with a different volume type.
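For example, assuming a voluptuous Schema as above, migrating a detached volume to another pool of the same volume type could be expressed like this; the volume UUID and pool name are placeholders:
from voluptuous import Schema

schema = Schema({
    'resource_id': str,
    'migration_type': str,
    'destination_node': str,
    'destination_type': str,
})

# Placeholder volume UUID and pool name (as listed by cinder get-pools).
schema({
    'resource_id': '1a2b3c4d-5e6f-4a0b-8c9d-0e1f2a3b4c5d',
    'migration_type': 'migrate',
    'destination_node': 'hostgroup@backend#pool1',
})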
Workflow Engines¶
taskflow¶
Taskflow as a workflow engine for Watcher
Full documentation on taskflow at https://docs.openstack.org/taskflow/latest
Planners¶
weight¶
Weight planner implementation
This implementation builds actions with parents in accordance with weights. A set of actions having a higher weight will be scheduled before the other ones. There are two config options to tune: action_weights and parallelization.
Limitations
This planner requires the action_weights and parallelization config options to be tuned well.
workload_stabilization¶
Workload Stabilization planner implementation
This implementation comes with basic rules based on a set of weighted action types. An action having a lower weight will be scheduled before the other ones. The set of action types can be specified by ‘weights’ in watcher.conf. You need to associate a different weight with every available action in the configuration file; otherwise, you will get an error when a new action is referenced in the solution produced by a strategy.
Limitations
This is a proof of concept that is not meant to be used in production.