Introduction
Assumptions
- All the hardware is uniform.
- Same number of NICs with the same PCI IDs.
- Same number of disks with the same addresses.
- Everything is named, and these names are used for reference. In Airship, the filename of a document is not important; what matters is the name given inside the document (found at metadata/name, next to the schema), as shown in the sketch below.
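For reference, a minimal document header is sketched below; the schema and name values are illustrative only.

```yaml
---
# Illustrative Airship document header. The filename is arbitrary;
# other documents refer to this one by the name under metadata.
schema: drydock/HostProfile/v1
metadata:
  schema: metadata/Document/v1
  name: cp-intel-pod10          # this is the name that matters
  storagePolicy: cleartext
  layeringDefinition:
    abstract: false
    layer: site
data: {}
...
```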
Deployment Configuration and Strategy
This section is included mainly for completeness. Users may choose to configure these values only if required; for example, timeouts may be raised for a site with slow internet access, or adjusted if a check needs to be performed between two actions.
| Parameter | Sub-Category-1 | Sub-Category-2 | Description | Example Value |
|---|---|---|---|---|
| physical_provisioner | | | | |
| | deployment_strategy | | Name of the strategy to use. Users can use the one defined in airshipit/treasuremap/global/deployment; see the Deployment-Strategy table below. | deployment-strategy |
| | deploy_interval | | The seconds delayed between checks for progress of the step that performs deployment of servers. | 30 |
| | deploy_timeout | | The maximum seconds allowed for the step that performs deployment of all servers. | 3600 |
| | destroy_interval | | The seconds delayed between checks for progress of destroying hardware nodes. | 30 |
| | destroy_timeout | | The maximum seconds allowed for destroying hardware nodes. | 900 |
| | join_wait | | The number of seconds allowed for a node to join the Kubernetes cluster. | 0 |
| | prepare_node_interval | | The seconds delayed between checks for progress of preparing nodes. | 30 |
| | prepare_node_timeout | | The maximum seconds allowed for preparing nodes. | 1800 |
| | prepare_site_interval | | The seconds delayed between checks for progress of preparing the site. | 10 |
| | prepare_site_timeout | | The maximum seconds allowed for preparing the site. | 300 |
| | verify_interval | | The seconds delayed between checks for progress of verification. | 10 |
| | verify_timeout | | The maximum seconds allowed for verification. | 60 |
| kubernetes | | | | |
| | node_status_interval | | | |
| | node_status_timeout | | | |
| kubernetes_provisioner | | | | |
| | drain_timeout | | The maximum seconds allowed for draining a node. | 3600 |
| | drain_grace_period | | The seconds provided to Promenade as a grace period for pods to cease. | 1800 |
| | clear_labels_timeout | | The maximum seconds provided to Promenade to clear labels on a node. | 1800 |
| | remove_etcd_timeout | | The maximum seconds provided to Promenade to allow for removing etcd from a node. | 1800 |
| | etcd_ready_timeout | | The maximum seconds allowed for etcd to reach a healthy state after a node is removed. | 600 |
| armada+ | | | | |
| | get_releases_timeout | | Timeout for retrieving Helm chart releases after deployment. | 300 |
| | get_status_timeout | | Timeout for retrieving status. | 300 |
| | manifest+ | | The name of the manifest document that the workflow will use during site deployment activities. | 'full-site' |
| | post_apply_timeout | | | 7200 |
| | validate_design_timeout | | Timeout to validate the design. | 600 |
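As an illustration, a site can override a subset of these values in its deployment-configuration document. The sketch below only shows a few fields, and the values are indicative.

```yaml
---
schema: shipyard/DeploymentConfiguration/v1
metadata:
  schema: metadata/Document/v1
  name: deployment-configuration
  layeringDefinition:
    abstract: false
    layer: site
data:
  physical_provisioner:
    deployment_strategy: deployment-strategy
    deploy_interval: 30
    deploy_timeout: 3600        # raise on sites with slow internet access
  kubernetes_provisioner:
    drain_timeout: 3600
  armada:
    manifest: 'full-site'
...
```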
Deployment-Strategy
| Parameter | Sub-Category-1 | Sub-Category-2 | Description | Example Value |
|---|---|---|---|---|
| groups | | | Named sets of nodes that will be deployed together. | |
| | name | | Name of the group. | masters |
| | critical | | Whether this group is required to continue to additional phases of deployment. | true |
| | depends_on | | Group names that must be successful before this group can be processed. | [] |
| | selectors | | A list of identifying information to indicate the nodes that are members of this group. Each selector has the following four filter values: | |
| | | node_names | Name of the node. | node01 |
| | | node_labels | Label of the node. | ucp_control_plane: enabled |
| | | node_tags | Tags of the node. | control |
| | | rack_names | Name of the rack. | rack01 |
| | success_criteria | | Criteria that must be met for the deployment of this group to be considered successful. When no criteria are specified, no checks are done and processing continues as if nothing is wrong. | |
| | | percent_successful_nodes | The percentage of nodes that must complete the deployment phase successfully. | 75 (meaning 3 of 4 nodes must complete the phase successfully) |
| | | minimum_successful_nodes | An integer indicating how many nodes must complete the phase to be considered successful. | 3 |
| | | maximum_failed_nodes | An integer indicating the number of nodes that are allowed to fail the deployment phase while still considering the group successful. | 0 |
A typical ordering of groups is shown below: ntp-node is deployed first, followed by control-nodes; compute-nodes-1 and compute-nodes-2 are deployed after control-nodes completes; monitoring-nodes has no dependencies and can be processed in parallel with the others.
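Expressed as a deployment-strategy document, that ordering could be sketched as follows. The group names, selectors, and criteria are illustrative, and monitoring-nodes and compute-nodes-2 are omitted for brevity.

```yaml
---
schema: shipyard/DeploymentStrategy/v1
metadata:
  schema: metadata/Document/v1
  name: deployment-strategy
  layeringDefinition:
    abstract: false
    layer: site
data:
  groups:
    - name: ntp-node
      critical: true
      depends_on: []
      selectors:
        - node_names:
            - node01
      success_criteria:
        minimum_successful_nodes: 1
    - name: control-nodes
      critical: true
      depends_on:
        - ntp-node
      selectors:
        - node_tags:
            - control
      success_criteria:
        percent_successful_nodes: 75
    - name: compute-nodes-1
      critical: false
      depends_on:
        - control-nodes
      selectors:
        - rack_names:
            - rack01
...
```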
Profiles
There are two important categories of profiles that the user should create to match their environment:
- Hardware: site/<site_name>/profiles/hardware/<profile_name>.yaml
- Host: site/<site_name>/profiles/host/<profile_name(s)>.yaml
Hardware Profile
In the hardware profile, the user provides details about the server and defines a few device aliases (network and disk). This information can be obtained from the site administrator, or from the 'lshw' command; for example, sudo lshw -c network -businfo lists the NIC names and PCI IDs. Once the hardware information is available, the following parameters can be configured:
Server:
Parameter | Description | Example Value |
---|---|---|
vendor | Vendor of the server chassis | Intel |
generation | Generation of the chassis model | '4' |
hw_version | Version of the chassis model within its generation | '3' |
bios_version | The certified version of the chassis BIOS | 'SE5C...' |
boot_mode | Mode of the default boot of hardware - bios, uefi | bios |
bootstrap_protocol | Protocol to boot the hardware - pxe, usb, hdd | 'pxe' |
pxe_interface | Which interface to use for network booting within the OOB manager, not OS device | 0 |
Device-Aliases:
NICs:
The NICs in the hardware can be categorized as control-plane NICs or data-plane NICs, and each category can contain one or more NICs; for example, ctrl_nic1, ctrl_nic2, ctrl_nic3 and data_nic1, data_nic2, data_nic3. It is better to use names that are self-explanatory; for example, if there is a separate NIC for PXE, name it pxe_nic. These aliases are referenced in the host profiles. For every NIC defined, the following information can be configured:
Parameter | Description | Example Value |
---|---|---|
address | The PCI address of the NIC | 0000:04:00.0 |
dev_type | Description of the NIC | 'I350 Gigabit Network Connection' |
bus_type | The bus supported | 'pci' |
Disks:
The disks can be either a boot disk or data disk(s). As with NICs, self-explanatory names should be chosen; for example, cephjournal1 could be the name of one of the disks used as a Ceph journal.
For every disk defined, the below information can be configured:
Parameter | Description | Example Value |
---|---|---|
address | The bus address of the Disk | 0:2.0.0 |
dev_type | Description of the disk. | 'INTEL SSDSC2BB48' |
bus_type | The bus supported | 'scsi' |
Others
| Parameter | Sub-Category-1 | Sub-Category-2 | Description | Example Value |
|---|---|---|---|---|
| cpu_sets | | | CPU pinning sets, keyed by the component that uses them. | |
| | kvm | | CPUs set aside for KVM/libvirt. | '4-43,48-87' |
| hugepages | | | Huge page pools, keyed by the component that uses them. | |
| | dpdk | | Huge pages reserved for DPDK. | |
| | | size | Size of a single huge page. | '1G' |
| | | count | Number of huge pages. | 32 |
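A hardware profile combining the server details, device aliases, and the values above could look like the following sketch. The schema, key names, addresses, and values are illustrative and should be checked against the treasuremap global hardware profiles.

```yaml
---
schema: drydock/HardwareProfile/v1
metadata:
  schema: metadata/Document/v1
  name: intel_2600
  layeringDefinition:
    abstract: false
    layer: site
data:
  vendor: Intel
  generation: '4'
  hw_version: '3'
  bios_version: 'SE5C...'
  boot_mode: bios
  bootstrap_protocol: pxe
  pxe_interface: 0
  device_aliases:
    ctrl_nic1:                  # alias referenced by host profiles
      address: '0000:04:00.0'
      dev_type: 'I350 Gigabit Network Connection'
      bus_type: 'pci'
    bootdisk:
      address: '0:2.0.0'
      dev_type: 'INTEL SSDSC2BB48'
      bus_type: 'scsi'
  cpu_sets:
    kvm: '4-43,48-87'
  hugepages:
    dpdk:
      size: '1G'
      count: 32
...
```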
Host Profiles
The host profiles cover the following:
- Mapping the NICs of the host to the networks they belong to. NOTE: for the definition of networks, refer to the Networks section below.
- How the boot disk is partitioned.
- Which software components are enabled on the particular host.
- Which hardware profile the host uses.
- Platform-specific configuration for the host.
For the majority of cases, only two host profiles are needed - data plane and control plane - though the user can create more and use them accordingly. The table below summarizes the configurable parameters for host profiles.
Note: a host profile can adopt values from another host profile; it just has to name that profile in its host_profile field (see the spec row below).
| Parameter Category | Sub-Category-1 | Sub-Category-2 | Sub-Category-3 | Sub-Category-4 | Description | Example Value |
|---|---|---|---|---|---|---|
| hardware_profile | | | | | The hardware profile used by the host. | intel_2600.yaml |
| primary_network | | | | | The main network used for administration. | dmz |
| interfaces | | | | | Defines each interface of the host in detail. | |
| | name | | | | Name of the interface. | dmz, data1 |
| | | device_link | | | The name of the network link. | dmz, data1 |
| | | slaves | | | NIC aliases (from the hardware profile). | ctrl_nic1, data_nic1 |
| | | networks | | | The networks this interface belongs to. | dmz, private, management |
| storage | | | | | | |
| | physical_devices | | | | | |
| | | labels | | | | |
| | | volume_group | | | | |
| | | partitions* | | | | |
| | | | name | | | |
| | | | size | | | |
| | | | part_uuid | | | |
| | | | volume_group | | | |
| | | | labels | | | |
| | | | bootable | | | |
| | | | filesystem | | | |
| | | | | mountpoint | | |
| | | | | fstype | | |
| | | | | mount_options | | |
| | | | | fs_uuid | | |
| | | | | fs_label | | |
| | volume_groups | | | | | |
| | | vg_uuid | | | | |
| | | logical_volumes* | | | | |
| | | | name | | | |
| | | | lv_uuid | | | |
| | | | size | | | |
| | | | filesystem | | | |
| | | | | mountpoint | | |
| | | | | fstype | | |
| | | | | mount_options | | |
| | | | | fs_uuid | | |
| | | | | fs_label | | |
| platform | | | | | | |
| | image | | | | | |
| | kernel | | | | | |
| | kernel_params | | | | | |
| metadata | | | | | | |
| | tags* | | | | | |
| | owner_data | | | | | |
| | rack | | | | | |
| | boot_mac | | | | | |
| host_profile | | | | | | |
| hardware_profile | | | | | | |
| primary_network | | | | | | |
| interfaces | | | | | | |
| | device_link | | | | | |
| | slaves* | | | | | |
| | networks* | | | | | |
| oob | | | | | The ipmi OOB type requires additional configuration to allow OOB management. | |
| | network | | | | Which node network is used for OOB access. | oob |
| | account | | | | Valid account that can access the BMC via IPMI over LAN. | root |
| | credential | | | | Valid password for the account that can access the BMC via IPMI over LAN. | root |
| spec | host_profile | | | | Name of the HostProfile that this profile adopts and overrides values from. | defaults |
| metadata | | | | | | |
| | owner_data | | | | | |
| | | <software-component-name> | | | enabled/disabled | openstack-l3-agent: enabled |
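Putting the pieces together, a control-plane host profile might be sketched as below. The interface, storage, and platform values are illustrative only and must be adapted to the actual hardware profile and networks of the site.

```yaml
---
schema: drydock/HostProfile/v1
metadata:
  schema: metadata/Document/v1
  name: cp-intel-pod10
  layeringDefinition:
    abstract: false
    layer: site
data:
  host_profile: defaults            # profile adopted and overridden
  hardware_profile: intel_2600
  primary_network: dmz
  oob:
    type: ipmi
    network: oob
    account: root
    credential: root
  interfaces:
    dmz:
      device_link: dmz
      slaves:
        - ctrl_nic1                 # NIC alias from the hardware profile
      networks:
        - dmz
  storage:
    physical_devices:
      bootdisk:                     # disk alias from the hardware profile
        labels:
          bootdrive: true
        partitions:
          - name: root
            size: 30g
            bootable: true
            filesystem:
              mountpoint: /
              fstype: ext4
              mount_options: defaults
  platform:
    image: xenial
    kernel: hwe-16.04
  metadata:
    owner_data:
      openstack-l3-agent: enabled   # software component toggle
...
```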
Nodes
A node adopts all values from the host profile it is mapped to, and can then override or append any configuration that is specific to that node.
| Parameter | Sub-Category | Description | Example Value |
|---|---|---|---|
| addressing* | | Contains the IP address assignments for all the networks. It is a valid design to omit networks from this list; in that case, the interface attached to the omitted network will be configured as link up with no address. | |
| | address | Defines a static IP address or dhcp for each network on which the node should have a configured layer 3 interface. | 10.10.100.12 or dhcp |
| | network | The network name. | oob, private, mgmt, pxe, etc. |
*: Array of Values.
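A node definition that maps to the host profile above and adds per-node addressing could look like this sketch (names and addresses are illustrative):

```yaml
---
schema: drydock/BaremetalNode/v1
metadata:
  schema: metadata/Document/v1
  name: node01
  layeringDefinition:
    abstract: false
    layer: site
data:
  host_profile: cp-intel-pod10      # host profile adopted by this node
  addressing:
    - network: oob
      address: 10.10.100.12
    - network: pxe
      address: dhcp
    - network: mgmt
      address: 10.10.101.12
  metadata:
    rack: rack01
    tags:
      - 'masters'
...
```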
Network Definition
Network
| Parameter | Sub-Category | Description | Example Value |
|---|---|---|---|
| cidr | | | |
| ranges* | | | |
| | type | | |
| | start | | |
| | end | | |
| dns | | | |
| | domain | | |
| | servers | | |
| dhcp_relay | | | |
| | self_ip | | |
| | upstream_target | | |
| mtu | | | |
| vlan | | | |
| routedomain | | | |
| routes* | | | |
| | subnet | | |
| | gateway | | |
| | metric | | |
| | routedomain | | |
| labels | | | |
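For reference, a network definition using these fields might be sketched as follows (all values are illustrative):

```yaml
---
schema: drydock/Network/v1
metadata:
  schema: metadata/Document/v1
  name: private
  layeringDefinition:
    abstract: false
    layer: site
data:
  cidr: 10.10.101.0/24
  vlan: '101'
  mtu: 9000
  ranges:
    - type: static
      start: 10.10.101.10
      end: 10.10.101.200
  dns:
    domain: example.com
    servers: '8.8.8.8,8.8.4.4'
  routes:
    - subnet: 0.0.0.0/0
      gateway: 10.10.101.1
      metric: 100
...
```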
Network Link
| Parameter | Sub-Category | Description | Example Value |
|---|---|---|---|
| bonding | | | |
| | mode | | |
| | hash | | |
| | peer_rate | | |
| | mon_rate | | |
| | up_delay | | |
| | down_delay | | |
| mtu | | | |
| linkspeed | | | |
| trunking | | | |
| | mode | | |
| | default_network | | |
| allowed_networks* | | | |
| labels | | | |
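A corresponding network link, here with bonding and VLAN trunking enabled, might be sketched as follows (values are illustrative):

```yaml
---
schema: drydock/NetworkLink/v1
metadata:
  schema: metadata/Document/v1
  name: data1
  layeringDefinition:
    abstract: false
    layer: site
data:
  bonding:
    mode: 802.3ad
    hash: layer3+4
    peer_rate: fast
  mtu: 9000
  linkspeed: auto
  trunking:
    mode: 802.1q
    default_network: private
  allowed_networks:
    - private
    - management
...
```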
Software
OpenStack services are now commonly deployed as containers, and container orchestration platforms such as Kubernetes are used to manage them.
Airship follows this OpenStack-on-Kubernetes (OOK) approach. For deploying and configuring services/applications/pods (in this case OpenStack, monitoring, etc.) on Kubernetes, there are two common options: (a) Kolla-Kubernetes and (b) OpenStack-Helm. Both use Helm for packaging the Kubernetes definitions of each service; however, OpenStack-Helm relies on Helm charts for deployment/orchestration, whereas Kolla-Kubernetes relies on Ansible. Airship uses the former - Helm charts. Accordingly, under software, user configuration falls into two important categories: Charts and Configurations.
Charts
Kubernetes
For the Kubernetes system (namespace: kube-system), the user only has to provide some substitutions for the control-plane nodes. This definition builds the list of control-plane nodes (i.e. the genesis node plus the master node list) on which calico etcd will run and which therefore need certificates. It is assumed that Airship sites have three control-plane nodes, so this should not need to change for a new site; only the substitutions below must be performed.
First, create the hostname mapping:
Source (as mentioned in commonaddress.yaml) | Destination |
---|---|
.genesis.hostname | .values.nodes[0].name |
.masters[0].hostname | .values.nodes[1].name |
.masters[1].hostname | .values.nodes[2].name |
Then, map the calico etcd certificates and keys:
Source | Destination |
---|---|
certificate of calico-etcd-<podname>-node1 | .values.nodes[0].tls.client.cert |
certificate-key of calico-etcd-<podname>-node1 | .values.nodes[0].tls.client.key |
certificate of calico-etcd-<podname>-node1-peer | .values.nodes[0].tls.peer.cert |
certificate-key of calico-etcd-<podname>-node1-peer | .values.nodes[0].tls.peer.key |
certificate of calico-etcd-<podname>-node2 | .values.nodes[1].tls.client.cert |
certificate-key of calico-etcd-<podname>-node2 | .values.nodes[1].tls.client.key |
certificate of calico-etcd-<podname>-node2-peer | .values.nodes[1].tls.peer.cert |
certificate-key of calico-etcd-<podname>-node2-peer | .values.nodes[1].tls.peer.key |
certificate of calico-etcd-<podname>-node3 | .values.nodes[2].tls.client.cert |
certificate-key of calico-etcd-<podname>-node3 | .values.nodes[2].tls.client.key |
certificate of calico-etcd-<podname>-node3-peer | .values.nodes[2].tls.peer.cert |
certificate-key of calico-etcd-<podname>-node3-peer | .values.nodes[2].tls.peer.key |
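In the manifests, each of these mappings is expressed as a Deckhand substitution in the corresponding chart document. The excerpt below is only a sketch of two such substitutions; the chart document name, source schemas, document names, and destination paths must be taken from the actual site manifests.

```yaml
# Excerpt of the metadata section of the calico etcd chart document;
# only two of the substitutions are shown and all names are illustrative.
metadata:
  schema: metadata/Document/v1
  name: kubernetes-calico-etcd
  substitutions:
    # hostname of the genesis node -> first etcd node entry
    - src:
        schema: pegleg/CommonAddresses/v1
        name: common-addresses
        path: .genesis.hostname
      dest:
        path: .values.nodes[0].name
    # client certificate of the first node -> its tls.client.cert value
    - src:
        schema: deckhand/Certificate/v1
        name: calico-etcd-node1      # hypothetical certificate document name
        path: .
      dest:
        path: .values.nodes[0].tls.client.cert
```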
Undercloud Platform
Ceph
OpenStack-Helm Infra
This includes configuring parameters of various infrastructure components, such as Elasticsearch, Fluent Bit, Fluentd, Grafana, Ingress, MariaDB, and Prometheus.
Users can leave all the values as they are.
OpenStack-Helm - Compute Kit
Under this, the important configurations are:
- Libvirt
- Network backend: openvswitch or sriov
- Neutron
- Nova
Tenant-Ceph
Config
Under this configuration, the user only sets the region name for OpenStack-Helm.
| Parameter | Sub-Category | Description | Example Value |
|---|---|---|---|
| osh | | | |
| | region_name | The region name to use; typically the site name is provided. | intel-pod10 |
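In the site definition this typically reduces to a one-line override, sketched below; the document that carries it, and its layering, vary between sites.

```yaml
# Illustrative fragment only; the enclosing document is site specific.
data:
  osh:
    region_name: intel-pod10   # typically the site name
```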
PKI-Catalog
| Parameter | Sub-Category-1 | Sub-Category-2 | Description | Example Value |
|---|---|---|---|---|
| certificate_authorities | | | | |
| | description | | | |
| | certificates | | | |
| | | document_name | | |
| | | description | | |
| | | common_name | | |
| | | hosts | | |
| | | groups | | |
| keypairs | | | | |
| | name | | | |
| | description | | | |
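A fragment of a PKI catalog using these fields might look like the sketch below; the schema name and certificate entries are illustrative and should be compared with the treasuremap PKI catalog.

```yaml
---
schema: promenade/PKICatalog/v1     # some versions use pegleg/PKICatalog/v1
metadata:
  schema: metadata/Document/v1
  name: cluster-certificates
  layeringDefinition:
    abstract: false
    layer: site
data:
  certificate_authorities:
    kubernetes:
      description: CA for Kubernetes components
      certificates:
        - document_name: apiserver
          description: Service certificate for the Kubernetes apiserver
          common_name: apiserver
          hosts:
            - localhost
            - 127.0.0.1
          groups:
            - system:masters
  keypairs:
    - name: service-account
      description: Signing key for Kubernetes service account tokens
...
```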
Secrets
Public keys of the users
Path: site/<site_name>/secrets/publickey/<username>_ssh_public_key.yaml
The public key of the user is added as 'data'.
Passphrases of the users
Path: site/<site_name>/secrets/publickey/<username>_crypt_password.yaml
The passphrase of the user is added as 'data'.
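Both secrets are plain Deckhand documents with the secret material under data. For example, a public key document might be sketched as follows (schema, name, and key are illustrative; passphrases follow the same pattern with a Passphrase schema):

```yaml
---
schema: deckhand/PublicKey/v1
metadata:
  schema: metadata/Document/v1
  name: ubuntu_ssh_public_key
  layeringDefinition:
    abstract: false
    layer: site
  storagePolicy: cleartext
data: ssh-rsa AAAAB3NzaC1yc2E... user@example
...
```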
Actions
| Parameter | Sub-Category-1 | Sub-Category-2 | Description | Example Value |
|---|---|---|---|---|
| signaling | | | | |
| assets | | | | |
| | items | | | |
| | | path | | |
| | | location | | |
| | | type | | 'unit', 'file', 'pkg_list' |
| | | data | | |
| | | location_pipeline | | 'template' |
| | | data_pipeline | | 'base64_encode', 'template', 'base64_decode', 'utf8_encode', 'utf8_decode' |
| | | permissions | | |
| node_filter | | | | |
| | filter_set_type | | | |
| | filter_set items | | | |
| | | filter_type | | |
| | | node_names | | |
| | | node_tags | | |
| | | node_labels | | |
| | | rack_names | | |
| | | rack_labels | | |
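These fields map onto Drydock boot action documents. The sketch below shows a single file asset restricted to one node; the name, location URL, pipeline steps, and filter are illustrative.

```yaml
---
schema: drydock/BootAction/v1
metadata:
  schema: metadata/Document/v1
  name: example-bootaction
  layeringDefinition:
    abstract: false
    layer: site
data:
  signaling: false
  assets:
    - path: /opt/example.sh          # hypothetical script dropped onto the node
      type: file
      permissions: 555
      location: http://repo.example.com/example.sh
      location_pipeline:
        - template
      data_pipeline:
        - utf8_decode
  node_filter:
    filter_set_type: union
    filter_set:
      - filter_type: union
        node_names:
          - node01
...
```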
Rack
| Parameter | Sub-Category | Description | Example Value |
|---|---|---|---|
| tor_switches | | | |
| | mgmt_ip | | |
| | sdn_api_uri | | |
| location | | | |
| | clli | | |
| | grid | | |
| local_networks* | | | |
| labels | | | |
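A rack definition using these fields could be sketched as below (switch names, addresses, and location codes are illustrative):

```yaml
---
schema: drydock/Rack/v1
metadata:
  schema: metadata/Document/v1
  name: rack01
  layeringDefinition:
    abstract: false
    layer: site
data:
  tor_switches:
    switch01:
      mgmt_ip: 10.10.100.2
      sdn_api_uri: https://sdn.example.com/api   # hypothetical SDN endpoint
  location:
    clli: EXAMPLE01
    grid: EG12
  local_networks:
    - pxe
...
```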
Region
| Parameter | Sub-Category | Description | Example Value |
|---|---|---|---|
| tag_definitions* | | | |
| | tag | | |
| | definition_type | | lshw_xpath |
| | definition | | |
| authorized_keys* | | | |
| repositories | | | |
| | remove_unlisted | | |
| | repo_type+ | | |
| | url+ | | |
| | distributions | | |
| | subrepos | | |
| | components | | |
| | gpgkey | | |
| | arches+ | | |
| | options | | |
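A region document tying these fields together might be sketched as follows; the repository entry simply mirrors the table above, and all names, URLs, and keys are illustrative.

```yaml
---
schema: drydock/Region/v1
metadata:
  schema: metadata/Document/v1
  name: intel-pod10
  layeringDefinition:
    abstract: false
    layer: site
data:
  tag_definitions: []
  authorized_keys:
    - ssh-rsa AAAAB3NzaC1yc2E... user@example
  repositories:
    remove_unlisted: true
    docker:                          # hypothetical repository entry
      repo_type: apt
      url: https://download.docker.com/linux/ubuntu
      distributions:
        - xenial
      components:
        - stable
      arches:
        - amd64
...
```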