Introduction
Assumptions
- All hardware is uniform.
- Same number of NICs, with the same PCI IDs.
- Same number of disks, with the same addresses.
- Everything is named, and those names are used for reference. In Airship, the filename of a document is not important, but the name inside the document (the metadata/name field) is important, as sketched below.
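For illustration, here is a minimal sketch of a document header; the schema and name values are hypothetical, and it is the metadata/name value, not the filename, that other documents reference.

```yaml
schema: drydock/HardwareProfile/v1   # the document type
metadata:
  schema: metadata/Document/v1
  name: intel_2600                   # this name is what other documents reference,
                                     # not the name of the file that holds it
```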
Deployment Configuration and Strategy
This section is included mainly for completeness. The user may choose to configure these values only if required; for example, on a site with slow internet access some timeouts may need to be increased, or the user may want to perform a check between two actions.
| Parameter | Sub-Category-1 | Sub-Category-2 | Description | Example Value |
|---|---|---|---|---|
| physical_provisioner | | | | |
| | deployment_strategy | | Name of the strategy to use. The user can use the one defined in airshipit/treasuremap/global/deployment; see the Deployment-Strategy section below. | deployment-strategy |
| | deploy_interval | | The seconds delayed between checks for progress of the step that performs deployment of servers | 30 |
| | deploy_timeout | | The maximum seconds allowed for the step that performs deployment of all servers | 3600 |
| | destroy_interval | | The seconds delayed between checks for progress of destroying hardware nodes | 30 |
| | destroy_timeout | | The maximum seconds allowed for destroying hardware nodes | 900 |
| | join_wait | | The number of seconds allowed for a node to join the Kubernetes cluster | 0 |
| | prepare_node_interval | | The seconds delayed between checks for progress of preparing nodes | 30 |
| | prepare_node_timeout | | The maximum seconds allowed for preparing nodes | 1800 |
| | prepare_site_interval | | The seconds delayed between checks for progress of preparing the site | 10 |
| | prepare_site_timeout | | The maximum seconds allowed for preparing the site | 300 |
| | verify_interval | | The seconds delayed between checks for progress of verification | 10 |
| | verify_timeout | | The maximum seconds allowed for verification | 60 |
| kubernetes | | | | |
| | node_status_interval | | | |
| | node_status_timeout | | | |
| kubernetes_provisioner | | | | |
| | drain_timeout | | Maximum seconds allowed for draining a node | 3600 |
| | drain_grace_period | | Seconds provided to Promenade as a grace period for pods to cease | 1800 |
| | clear_labels_timeout | | Maximum seconds provided to Promenade to clear labels on a node | 1800 |
| | remove_etcd_timeout | | Maximum seconds provided to Promenade to allow for removing etcd from a node | 1800 |
| | etcd_ready_timeout | | Maximum seconds allowed for etcd to reach a healthy state after a node is removed | 600 |
| armada+ | | | | |
| | get_releases_timeout | | Timeout for retrieving Helm chart releases after deployment | 300 |
| | get_status_timeout | | Timeout for retrieving status | 300 |
| | manifest+ | | The name of the manifest document that the workflow will use during site deployment activities | 'full-site' |
| | post_apply_timeout | | | 7200 |
| | validate_design_timeout | | Timeout to validate the design | 600 |
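For reference, a minimal sketch of a DeploymentConfiguration document using a few of the parameters above is shown here; the document name and layering values are illustrative, and omitted parameters fall back to their defaults.

```yaml
schema: shipyard/DeploymentConfiguration/v1
metadata:
  schema: metadata/Document/v1
  name: deployment-configuration     # hypothetical document name
  layeringDefinition:
    abstract: false
    layer: site
  storagePolicy: cleartext
data:
  physical_provisioner:
    deployment_strategy: deployment-strategy   # see the next section
    deploy_interval: 30
    deploy_timeout: 3600
  kubernetes_provisioner:
    drain_timeout: 3600
    drain_grace_period: 1800
  armada:
    manifest: 'full-site'
```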
Deployment-Strategy | ||||
| Parameter | Sub-Category-1 | Sub-Category-2 | Description | Example Value |
|---|---|---|---|---|
| groups | | | Named sets of nodes that will be deployed together | |
| | name | | Name of the group | masters |
| | critical | | Whether this group is required to continue to additional phases of deployment | true |
| | depends_on | | Group names that must be successful before this group can be processed | [] |
| | selectors | | A list of identifying information to indicate the nodes that are members of this group. Each selector has the following four filter values | |
| | | node_names | Name of the node | node01 |
| | | node_labels | Label of the node | ucp_control_plane: enabled |
| | | node_tags | Tags on the node | control |
| | | rack_names | Name of the rack | rack01 |
| | success_criteria | | The criteria a group must meet to be considered successful. When no criteria are specified, no checks are done and processing continues as if nothing is wrong | |
| | | percent_successful_nodes | The calculated success rate of nodes completing the deployment phase | 75 (i.e., 3 of 4 nodes must complete the phase successfully) |
| | | minimum_successful_nodes | An integer indicating how many nodes must complete the phase to be considered successful | 3 |
| | | maximum_failed_nodes | An integer indicating the number of nodes allowed to fail the deployment phase while still considering the group successful | 0 |
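A sketch of a DeploymentStrategy document with two groups follows; the group names, selectors, and criteria are illustrative and depend on the site.

```yaml
schema: shipyard/DeploymentStrategy/v1
metadata:
  schema: metadata/Document/v1
  name: deployment-strategy          # referenced from the deployment configuration
  layeringDefinition:
    abstract: false
    layer: site
data:
  groups:
    - name: masters
      critical: true
      depends_on: []
      selectors:
        - node_names: []
          node_labels: []
          node_tags:
            - control
          rack_names: []
      success_criteria:
        percent_successful_nodes: 75
    - name: compute-nodes-1
      critical: false
      depends_on:
        - masters                    # processed only after masters succeeds
      selectors:
        - node_tags:
            - compute
      success_criteria:
        minimum_successful_nodes: 3
```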
Typical ordering of groups is shown below.

     __________          __________________
    | ntp-node |        | monitoring-nodes |
     ----------          ------------------
         |
     ____V__________
    | control-nodes |
     ---------------
         |_________________________
         |                         |
     ____V____________    ______V__________
    | compute-nodes-1 |  | compute-nodes-2 |
     -----------------    -----------------
Profiles
There are two important categories of profiles that the user should create to match their environment:
- Hardware (site/<site_name>/profiles/hardware/<profile_name>.yaml)
- Host (site/<site_name>/profiles/host/<profile_name(s)>.yaml)
Hardware Profile
Under the hardware profile, the user provides details about the server and a few device (network and disk) aliases. The user can contact the administrator to obtain this information, or obtain it from the 'lshw' command; for example, sudo lshw -c network -businfo lists the NIC names and PCI IDs. Once the hardware information is available, the following parameters can be configured (a complete example document is sketched at the end of this subsection):
Server:
Parameter | Description | Example-Value |
---|---|---|
vendor | Vendor of the server chassis | Intel |
generation | Generation of the chassis model | '4' |
hw_version | Version of the chassis model within its generation | '3' |
bios_version | The certified version of the chassis BIOS | 'SE5C....' |
boot_mode | Mode of the default boot of hardware - bios, uefi | bios |
bootstrap_protocol | Protocol of boot of the hardware - pxe, usb, hdd | 'pxe' |
pxe_interface | Which interface to use for network booting within the OOB manager, not OS device | 0 |
Device-Aliases:
NICs:
The user can categorize the NICs in the hardware as control-plane NICs or data-plane NICs, and each category can have one or more NICs; for example, ctrl_nic1, ctrl_nic2, ctrl_nic3, and data_nic1, data_nic2, data_nic3, etc. It is better to use names that are self-explanatory; for example, if a separate NIC is used for PXE, name it pxe_nic. This categorization is referred to in the host profiles. For every NIC defined, the information below can be configured.
Parameter | Description | Example Value |
---|---|---|
address | The PCI address of the NIC | 0000:04:00.0 |
dev_type | Description of the NIC | 'I350 Gigabit Network Connection' |
bus_type | The bus supported | 'pci' |
Disks:
Disks can be either a boot disk or data disk(s). As with NICs, self-explanatory names should be chosen; for example, cephjournal1 can be the name of one of the disks used as a Ceph journal.
For every disk defined, the below information can be configured:
Parameter | Description | Example Value |
---|---|---|
address | The bus address of the Disk | 0:2.0.0 |
dev_type | Description of the disk. | 'INTEL SSDSC2BB48' |
bus_type | The bus supported | 'scsi' |
Others
| Parameter | Sub-category-1 | Sub-category-2 | Description | Example Value |
|---|---|---|---|---|
| cpu_set | | | | |
| | kvm | | | '4-43,48-87' |
| huge_pages | | | | |
| | dpdk | | | |
| | | size | | '1G' |
| | | count | | 32 |
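Putting the above together, a trimmed sketch of a hardware profile document is shown below. The profile and alias names are illustrative; note that the treasuremap profiles spell the last two sections cpu_sets and hugepages.

```yaml
schema: drydock/HardwareProfile/v1
metadata:
  schema: metadata/Document/v1
  name: intel_2600                   # hypothetical profile name
  layeringDefinition:
    abstract: false
    layer: site
  storagePolicy: cleartext
data:
  vendor: Intel
  generation: '4'
  hw_version: '3'
  bios_version: 'SE5C...'
  boot_mode: bios
  bootstrap_protocol: pxe
  pxe_interface: 0
  device_aliases:
    ctrl_nic1:                       # NIC alias used by host profiles
      address: '0000:04:00.0'
      dev_type: 'I350 Gigabit Network Connection'
      bus_type: 'pci'
    bootdisk:                        # disk alias used by host profiles
      address: '0:2.0.0'
      dev_type: 'INTEL SSDSC2BB48'
      bus_type: 'scsi'
  cpu_sets:
    kvm: '4-43,48-87'
  hugepages:
    dpdk:
      size: '1G'
      count: 32
```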
Host Profiles
The host profiles cover the following:
- Mapping the host's NICs to the networks they belong to. NOTE: For the definition of networks, please refer to the Networks section below.
- How the boot disk is partitioned.
- Which software components are enabled on the particular host.
- Which hardware profile the host is using.
- Platform-specific configuration for the host.
For the majority of cases, only two host profiles are needed: data plane and control plane. Of course, the user can create more than two and use them accordingly. The table below summarizes the configurable parameters for host profiles.
Note: One host profile can adopt values from another host profile; it just has to name that profile in its host_profile field (see the spec row in the table below).
| Parameter Category | Sub-Category-1 | Sub-Category-2 | Sub-Category-3 | Sub-Category-4 | Description | Example Value |
|---|---|---|---|---|---|---|
| hardware_profile | | | | | The hardware profile used by the host | intel_2600.yaml |
| primary_network | | | | | The main network used for administration | dmz |
| interfaces | | | | | Defines each and every interface of the host in detail | |
| | name | | | | Name of the interface | dmz, data1 |
| | device_link | | | | The name of the NetworkLink that will be attached to this interface. The NetworkLink definition includes part of the interface configuration, such as bonding (see below) | dmz, data1 |
| | slaves | | | | NIC aliases: the list of hardware interfaces used for creating this interface. This value can be a device alias defined in the HardwareProfile or the kernel name of the hardware interface. For bonded interfaces, this lists all the slaves; for non-bonded interfaces, it lists the single hardware interface used | ctrl_nic1, data_nic1 |
| | networks | | | | The list of networks to enable on this interface. If multiple networks are listed, the NetworkLink attached to this interface must have trunking enabled, or the design validation will fail | dmz, private, management |
| storage | | | | | Appears in either a HostProfile or a BaremetalNode document. The storage configuration can describe the creation of partitions on physical disks, the assignment of physical disks and/or partitions to volume groups, and the creation of logical volumes | |
| | physical_devices* | | | | A physical device can either be carved up into partitions (including a single partition consuming the entire device) or added to a volume group as a physical volume. Each key in the physical_devices mapping represents a device on a node. The key should be either a device alias defined in the HardwareProfile or the name of the device published by the OS. The value of each key must be a mapping with the following keys | |
| | | labels | | | A mapping of key/value strings providing generic labels for the device | bootdrive: true |
| | | volume_group | | | A volume group name to add the device to as a physical volume. Incompatible with the partitions specification | |
| | | partitions* | | | A sequence of mappings listing the partitions to be created on the device. Incompatible with the volume_group specification | |
| | | | name | | Metadata describing the partition in the topology | 'root' |
| | | | size | | The size of the partition | '30g' |
| | | | part_uuid | | A UUID4-formatted UUID to assign to the partition. If not specified, one will be generated | |
| | | | volume_group | | The name of the volume group this partition is assigned to | |
| | | | labels | | | |
| | | | bootable | | Boolean indicating whether this partition should be the bootable device | true |
| | | | filesystem | | An optional mapping describing how the partition should be formatted and mounted | |
| | | | | mountpoint | Where the filesystem should be mounted. If not specified, the partition will be left as a raw device | '/' |
| | | | | fstype | The format of the filesystem. Defaults to ext4 | 'ext4' |
| | | | | mount_options | fstab-style mount options. Default is 'defaults' | 'defaults' |
| | | | | fs_uuid | A UUID4-formatted UUID to assign to the filesystem. If not specified, one will be generated | |
| | | | | fs_label | A filesystem label to assign to the filesystem. Optional | |
| | volume_groups | | | | | |
| | | vg_uuid | | | A UUID4-formatted UUID applied to the volume group. If not specified, one is generated | |
| | | logical_volumes* | | | A sequence of mappings listing the logical volumes to be created in the volume group | |
| | | | name | | Used as the logical volume name | |
| | | | lv_uuid | | A UUID4-formatted UUID applied to the logical volume. If not specified, one is generated | |
| | | | size | | The logical volume size | |
| | | | filesystem | | A mapping specifying how the logical volume should be formatted and mounted | |
| | | | | mountpoint | Same as above | |
| | | | | fstype | | |
| | | | | mount_options | | |
| | | | | fs_uuid | | |
| | | | | fs_label | | |
| platform | | | | | Defines the operating system image and kernel to use, as well as kernel configuration | |
| | image | | | | Image name | 'xenial' |
| | kernel | | | | Kernel version | 'hwe-16.04' |
| | kernel_params | | | | A mapping in which each key should have either a string or boolean value. For boolean true values, the key is added to the kernel parameter list as a flag; for string values, the key:value pair is added to the kernel parameter list as key=value | kernel_package: 'linux-image-4.15.0-46-generic' |
| oob | | | | | The ipmi OOB type requires additional configuration to allow OOB management | |
| | network | | | | The node network used for OOB access | oob |
| | account | | | | A valid account that can access the BMC via IPMI over LAN | root |
| | credential | | | | A valid password for the account that can access the BMC via IPMI over LAN | root |
| spec | host_profile | | | | Name of the HostProfile that this profile adopts and overrides values from | defaults |
| metadata | | | | | | |
| | owner_data | | | | | |
| | | <software-component-name> | | | enabled/disabled | openstack-l3-agent: enabled |
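Putting it together, a trimmed sketch of a host profile is shown below; the profile name, alias names, and exact key placement are illustrative and should be checked against the Drydock schema of the deployed version.

```yaml
schema: drydock/HostProfile/v1
metadata:
  schema: metadata/Document/v1
  name: cp-intel-pod10               # hypothetical profile name
  layeringDefinition:
    abstract: false
    layer: site
  storagePolicy: cleartext
data:
  host_profile: defaults             # profile adopted and overridden (see spec row above)
  hardware_profile: intel_2600
  primary_network: dmz
  interfaces:
    dmz:
      device_link: dmz
      slaves:
        - ctrl_nic1                  # device alias from the hardware profile
      networks:
        - dmz
  storage:
    physical_devices:
      bootdisk:
        labels:
          bootdrive: 'true'
        partitions:
          - name: 'root'
            size: '30g'
            bootable: true
            filesystem:
              mountpoint: '/'
              fstype: 'ext4'
              mount_options: 'defaults'
  platform:
    image: 'xenial'
    kernel: 'hwe-16.04'
    kernel_params:
      kernel_package: 'linux-image-4.15.0-46-generic'
  metadata:
    owner_data:
      openstack-l3-agent: enabled    # software component toggle
```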
Nodes
This is defined under Baremetal. Node network attachment can be described in a HostProfile or a BaremetalNode document, but node addressing is allowed only in a BaremetalNode document. Hence, this section focuses mostly on addressing. A node adopts all values from the profile it is mapped to and can then override or append any configuration that is specific to that node. A separate document is created for each node; that is, the contents of the table below are repeated for each and every node of the deployment.
| Parameter Category | Sub-Category-1 | Description | Example Value |
|---|---|---|---|
| addressing* | | Contains the IP address assignment for all the networks. It is a valid design to omit networks from this; in that case, the interface attached to the omitted network will be configured as link up with no address | |
| | address | Defines a static IP address or dhcp for each network on which a node should have a configured layer 3 interface | 10.10.100.12 or dhcp |
| | network | The network name | oob, private, mgmt, pxe, etc. |
| host_profile | | The host profile to assign to this node | cp-intel-pod10 |
| metadata | | | |
| | tags | | 'masters' |
| | rack | | pod10-rack |
*: Array of Values.
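A sketch of a node document is shown below; the node name, networks, and addresses are illustrative.

```yaml
schema: drydock/BaremetalNode/v1
metadata:
  schema: metadata/Document/v1
  name: node01                       # hypothetical node name
  layeringDefinition:
    abstract: false
    layer: site
  storagePolicy: cleartext
data:
  host_profile: cp-intel-pod10       # host profile assigned to this node
  addressing:
    - network: oob
      address: 10.10.100.12
    - network: pxe
      address: dhcp                  # dhcp instead of a static address
    - network: mgmt
      address: 10.10.101.12
  metadata:
    tags:
      - 'masters'
    rack: pod10-rack
```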
Network Definition
Network
| Parameter | Sub-Category | Description | Example Value |
|---|---|---|---|
| cidr | | | |
| ranges* | | | |
| | type | | |
| | start | | |
| | end | | |
| dns | | | |
| | domain | | |
| | servers | | |
| dhcp_relay | | | |
| | self_ip | | |
| | upstream_target | | |
| mtu | | | |
| vlan | | | |
| routedomain | | | |
| routes* | | | |
| | subnet | | |
| | gateway | | |
| | metric | | |
| | routedomain | | |
| labels | | | |
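As an illustration of the fields above, a sketch of a network document follows; all names and values are hypothetical.

```yaml
schema: drydock/Network/v1
metadata:
  schema: metadata/Document/v1
  name: private                      # hypothetical network name
  layeringDefinition:
    abstract: false
    layer: site
  storagePolicy: cleartext
data:
  cidr: 10.10.101.0/24
  ranges:
    - type: static                   # address range handed out statically
      start: 10.10.101.2
      end: 10.10.101.254
  dns:
    domain: example.com
    servers: '8.8.8.8,8.8.4.4'
  mtu: 9000
  vlan: '101'
  routes:
    - subnet: 0.0.0.0/0              # default route for this network
      gateway: 10.10.101.1
      metric: 100
```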
Network Link
| Parameter | Sub-Category | Description | Example Value |
|---|---|---|---|
| bonding | | | |
| | mode | | |
| | hash | | |
| | peer_rate | | |
| | mon_rate | | |
| | up_delay | | |
| | down_delay | | |
| mtu | | | |
| linkspeed | | | |
| trunking | | | |
| | mode | | |
| | default_network | | |
| allowed_networks* | | | |
| labels | | | |
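Similarly, a sketch of a network link document with bonding and trunking enabled; all values are hypothetical.

```yaml
schema: drydock/NetworkLink/v1
metadata:
  schema: metadata/Document/v1
  name: data                         # hypothetical link name
  layeringDefinition:
    abstract: false
    layer: site
  storagePolicy: cleartext
data:
  bonding:
    mode: 802.3ad                    # LACP bonding across the slave NICs
    hash: layer3+4
    peer_rate: fast
  mtu: 9000
  linkspeed: auto
  trunking:
    mode: 802.1q                     # required when multiple networks share the link
    default_network: private
  allowed_networks:
    - private
    - management
```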
Software
OpenStack services are now commonly deployed as containers, and various container management platforms such as Kubernetes are used to manage these containers.
Airship also uses the approach of OpenStack on Kubernetes (OOK). For deployment/configuration of services/applications/pods (in this case OpenStack, monitoring, etc.) on Kubernetes, users have two options: (a) Kolla-Kubernetes and (b) OpenStack-Helm. Both options use Helm for packaging the Kubernetes definitions for each service. However, OpenStack-Helm uses Helm charts for deployment/orchestration, whereas Kolla-Kubernetes uses Ansible. Airship uses the former option: Helm charts. Accordingly, under software, user configurations fall into two important categories: charts and configurations.
Charts
Kubernetes
For the Kubernetes system (namespace: kube-system), the user just has to make a few substitutions for the control nodes. This definition builds the list of control plane nodes (i.e., the genesis node plus the master node list) on which calico etcd will run and which will need certificates. It is assumed that Airship sites have 3 control plane nodes, so this should not need to change for a new site.
First, a hostname mapping is created:
Source (as mentioned in commonaddress.yaml) | Destination |
---|---|
.genesis.hostname | .values.nodes[0].name |
.masters[0].hostname | .values.nodes[1].name |
.masters[1].hostname | .values.nodes[2].name |
Then, the calico etcd certificates and keys are mapped:
Source | Destination |
---|---|
certificate of calico-etcd-<podname>-node1 | .values.nodes[0].tls.client.cert |
certificate-key calico-etcd-<podname>-node1 | .values.nodes[0].tls.client.key |
certificate of calico-etcd-<podname>-node1-peer | .values.nodes[0].tls.peer.cert |
certificate-key of calico-etcd-<podname>-node1-peer | .values.nodes[0].tls.peer.key |
certificate of calico-etcd-<podname>-node2 | .values.nodes[1].tls.client.cert |
certificate-key calico-etcd-<podname>-node2 | .values.nodes[1].tls.client.key |
certificate of calico-etcd-<podname>-node2-peer | .values.nodes[1].tls.peer.cert |
certificate-key of calico-etcd-<podname>-node2-peer | .values.nodes[1].tls.peer.key |
certificate of calico-etcd-<podname>-node3 | .values.nodes[2].tls.client.cert |
certificate-key calico-etcd-<podname>-node3 | .values.nodes[2].tls.client.key |
certificate of calico-etcd-<podname>-node3-peer | .values.nodes[2].tls.peer.cert |
certificate-key of calico-etcd-<podname>-node3-peer | .values.nodes[2].tls.peer.key |
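In manifest terms, each row above becomes a Deckhand substitution entry on the chart document. A sketch of one hostname row and one certificate row follows; the source document names are illustrative, and <podname> remains site-specific.

```yaml
substitutions:
  # hostname taken from the common addresses document
  - src:
      schema: pegleg/CommonAddresses/v1
      name: common-addresses         # hypothetical source document name
      path: .genesis.hostname
    dest:
      path: .values.nodes[0].name
  # client certificate for the first control plane node
  - src:
      schema: deckhand/Certificate/v1
      name: calico-etcd-<podname>-node1
      path: .
    dest:
      path: .values.nodes[0].tls.client.cert
```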
Undercloud Platform
Ceph
OpenStack-Helm Infra
This includes configuring parameters of various infrastructure components, such as Elasticsearch, Fluent Bit, Fluentd, Grafana, Ingress, MariaDB, and Prometheus.
Users can leave all the values as-is.
OpenStack-Helm - Compute Kit
Under this, there are three important configurations:
- Libvirt
  - Network backend: openvswitch or sriov
- Neutron
- Nova
Tenant-Ceph
Config
Under this configuration, the user can only set the region name for OpenStack-Helm.
| Parameter | Sub-Category | Description | Example Value |
|---|---|---|---|
| osh | | | |
| | region_name | The region name to use. Typically the site name is provided. | intel-pod10 |
PKI-Catalog
| Parameter | Sub-Category-1 | Sub-Category-2 | Description | Example Value |
|---|---|---|---|---|
| certificate_authorities | | | | |
| | description | | | |
| | certificates | | | |
| | | document_name | | |
| | | description | | |
| | | common_name | | |
| | | hosts | | |
| | | groups | | |
| keypairs | | | | |
| | name | | | |
| | description | | | |
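A trimmed sketch of a PKI catalog document, assuming the promenade/PKICatalog/v1 schema used in treasuremap; all names are illustrative.

```yaml
schema: promenade/PKICatalog/v1
metadata:
  schema: metadata/Document/v1
  name: cluster-certificates         # hypothetical catalog name
  layeringDefinition:
    abstract: false
    layer: site
  storagePolicy: cleartext
data:
  certificate_authorities:
    calico-etcd:                     # hypothetical CA name
      description: CA for calico etcd client and peer certificates
      certificates:
        - document_name: calico-etcd-node01
          description: Client certificate for calico etcd on node01
          common_name: calico-etcd-node01
          hosts:
            - node01
            - 10.10.101.12
  keypairs:
    - name: service-account
      description: Keypair for Kubernetes service account token signing
```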
Secrets
Public Keys of the Users
Path: site/<site_name>/secrets/publickey/<username>_ssh_public_key.yaml
The public key of the user is added as 'data'.
Passphrases of the Users
Path: site/<site_name>/secrets/passphrases/<username>_crypt_password.yaml
The passphrase of the user is added as 'data'.
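For example, a public key document would look like the following sketch; the user name and key value are illustrative.

```yaml
schema: deckhand/PublicKey/v1
metadata:
  schema: metadata/Document/v1
  name: username_ssh_public_key      # hypothetical user name
  layeringDefinition:
    abstract: false
    layer: site
  storagePolicy: cleartext
data: ssh-rsa AAAAB3NzaC1yc2E... user@host
```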
Actions
| Parameter | Sub-Category-1 | Sub-Category-2 | Description | Example Value |
|---|---|---|---|---|
| signaling | | | | |
| assets | | | | |
| | items | | | |
| | | path | | |
| | | location | | |
| | | type | | 'unit', 'file', 'pkg_list' |
| | | data | | |
| | | location_pipeline | | 'template' |
| | | data_pipeline | | 'base64_encode', 'template', 'base64_decode', 'utf8_encode', 'utf8_decode' |
| | | permissions | | |
| node_filter | | | | |
| | filter_set_type | | | |
| | filter_set items | | | |
| | | filter_type | | |
| | | node_names | | |
| | | node_tags | | |
| | | node_labels | | |
| | | rack_names | | |
| | | rack_labels | | |
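A sketch of a boot action document that fetches and runs a join script, loosely modeled on the promjoin boot action in treasuremap; the name, path, and location URL are illustrative.

```yaml
schema: drydock/BootAction/v1
metadata:
  schema: metadata/Document/v1
  name: promjoin                     # hypothetical boot action name
  layeringDefinition:
    abstract: false
    layer: site
  storagePolicy: cleartext
data:
  signaling: false
  assets:
    - path: /opt/promjoin.sh
      type: file
      permissions: '555'
      # hypothetical endpoint; rendered per node via the template pipeline
      location: promenade+http://promenade-api/join-scripts?hostname={node.hostname}
      location_pipeline:
        - template
      data_pipeline:
        - base64_decode
  node_filter:
    filter_set_type: union
    filter_set:
      - filter_type: union
        node_names:
          - node01
```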
Rack
| Parameter | Sub-Category | Description | Example Value |
|---|---|---|---|
| tor_switches | | | |
| | mgmt_ip | | |
| | sdn_api_uri | | |
| location | | | |
| | clli | | |
| | grid | | |
| local_networks* | | | |
| labels | | | |
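A sketch of a rack document; the switch name, addresses, and location codes are illustrative.

```yaml
schema: drydock/Rack/v1
metadata:
  schema: metadata/Document/v1
  name: rack01
  layeringDefinition:
    abstract: false
    layer: site
  storagePolicy: cleartext
data:
  tor_switches:
    switch01:                        # hypothetical switch name
      mgmt_ip: 10.10.100.2
      sdn_api_uri: https://sdn-api.example.com/switchmgmt?switch=switch01
  location:
    clli: EXMPLTX01                  # hypothetical CLLI code
    grid: EG12
  local_networks:
    - pxe
  labels:
    rack_type: compute               # hypothetical label
```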
Region
| Parameter | Sub-Category | Description | Example Value |
|---|---|---|---|
| tag_definitions* | | | |
| | tag | | |
| | definition_type | | lshw_xpath |
| | definition | | |
| authorized_keys* | | | |
| repositories | | | |
| | remove_unlisted | | |
| | repo_type+ | | |
| | url+ | | |
| | distributions | | |
| | subrepos | | |
| | components | | |
| | gpgkey | | |
| | arches+ | | |
| | options | | |
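Finally, a sketch of a region document; the tag definition, key, and repository values are illustrative.

```yaml
schema: drydock/Region/v1
metadata:
  schema: metadata/Document/v1
  name: intel-pod10                  # hypothetical region name
  layeringDefinition:
    abstract: false
    layer: site
  storagePolicy: cleartext
data:
  tag_definitions:
    - tag: high_memory               # hypothetical tag
      definition_type: lshw_xpath
      definition: //node[@id="memory"]/size > 137438953472
  authorized_keys:
    - ssh-rsa AAAAB3NzaC1yc2E... admin@example.com
  repositories:
    remove_unlisted: true
    docker:                          # hypothetical repository name
      repo_type: apt
      url: https://download.docker.com/linux/ubuntu
      distributions:
        - xenial
      components:
        - stable
      arches:
        - amd64
```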