Create staging network, Ironic-based OpenStack deployment, and provide support for prototype provisioning for Ubuntu 12.04

This document describes the design options for using OpenStack to provide a common end-user API that facilitates self-service provisioning across the Mozilla build/test environment.

Recommended solution: non-SDN design with a minimum of 4 new VLANs:

  • Core
  • Try
  • Loaner
  • Admin
    • connects all administration hosts (e.g. conductors, compute hosts)
    • All admin hosts are ESX VMs
    • Secondary NICs connect to core, try, and loaner networks as necessary for DHCP/TFTP

  • The default gateway for all VLANs is the releng firewall (not Neutron)
  • Each VLAN is isolated as a separate availability zone (AZ) within OpenStack
  • IPMI, iLO, PDUs, etc. stay on the existing releng BU 'inband' VLAN (these can be out-of-band as long as they are routable; there is no need to have them on the same VLAN)


For loaners, bare-metal (BM) nodes can be loaned out through the standard scheduler. For production, each BM node will be requested through the standard OpenStack scheduler but filtered down to a metadata tag unique to that node (e.g. asset tag or serial number) via individual aggregate groups or some other scheduler filter.

  • Our two options for doing this are:
    • Create an AZ for each node. When requesting a node, the caller must request that node's AZ within the group needed (e.g. AZ = nodeXYZ, which is within the Try VLAN).
    • Write a secondary scheduler filter, similar to an AZ but exposed to the end user (a sketch follows this list).
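For the secondary-filter option, a minimal sketch of what such a Nova scheduler filter could look like is below. This is an illustration under stated assumptions, not a settled design: the class name, the node_tag scheduler hint, and the aggregate metadata key are invented here, the host_passes() signature matches Icehouse-era Nova, and the aggregate-metadata lookup shown follows newer Nova releases (an Icehouse deployment may need to read the metadata from the database instead).

  # Hypothetical custom scheduler filter: pass only the host whose
  # (single-node) aggregate carries the tag the caller supplied, e.g.
  #   nova boot ... --hint node_tag=asset-12345
  from nova.scheduler import filters


  class NodeTagFilter(filters.BaseHostFilter):
      """Limit scheduling to the bare-metal node tagged with the requested
      asset/serial tag via its per-node host aggregate."""

      def host_passes(self, host_state, filter_properties):
          hints = filter_properties.get('scheduler_hints') or {}
          wanted = hints.get('node_tag')
          if not wanted:
              # No tag requested: do not restrict the host list.
              return True
          # Assumes aggregate metadata is exposed on the host state
          # (true on newer Nova; Icehouse would read it from the DB).
          for aggregate in getattr(host_state, 'aggregates', []):
              if aggregate.metadata.get('node_tag') == wanted:
                  return True
          return False

The filter would then be registered through the scheduler_available_filters / scheduler_default_filters options in nova.conf.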


Bare-metal nodes will be grouped into an AZ matching their VLAN. When requesting an AZ, or a single node through a node-dedicated AZ, the caller must also select a network matching the one the node is located on. Additional aggregate host groups will define hardware groups (mac mini rX, ix7000, etc.), as sketched below.
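A hedged sketch of that grouping with python-novaclient follows; the Keystone endpoint, credentials, aggregate names, and host name are placeholders, and it assumes each BM node is represented to Nova by its own compute host name.

  # Illustrative only: create per-VLAN availability zones plus a hardware
  # group aggregate, then add a (placeholder) bare-metal host to both.
  from keystoneauth1 import loading, session
  from novaclient import client as nova_client

  loader = loading.get_plugin_loader('password')
  auth = loader.load_from_options(
      auth_url='http://keystone.example.com:5000/v3',   # placeholder
      username='admin', password='SECRET', project_name='admin',
      user_domain_name='Default', project_domain_name='Default')
  nova = nova_client.Client('2', session=session.Session(auth=auth))

  # One aggregate per VLAN, each exposed as its own availability zone.
  try_az = nova.aggregates.create('try', 'try')
  core_az = nova.aggregates.create('core', 'core')

  # Additional aggregates (no AZ) describing hardware groups.
  minis = nova.aggregates.create('mac-mini-r5', None)
  nova.aggregates.set_metadata(minis, {'hardware': 'mac-mini-r5'})

  # A bare-metal node's host joins both its VLAN AZ and its hardware group.
  nova.aggregates.add_host(try_az, 'bm-node-0001')
  nova.aggregates.add_host(minis, 'bm-node-0001')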

DHCP will be provided through each conductor, and all VLANs containing BM nodes should forward DHCP traffic to all conductors via DHCP helpers.


Networks

    • These sizes may need adjustment based on host counts + moderate growth (a quick sizing check follows this list)
    • /21 = try.cloud.releng.scl3.mozilla.com
    • /22 = core.cloud.releng.scl3.mozilla.com
    • /23 = loaner.cloud.releng.scl3.mozilla.com
    • /23 = admin.cloud.releng.scl3.mozilla.com
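As a rough check of those prefix lengths (2046, 1022, 510, and 510 usable addresses respectively), the Python 3 snippet below computes usable host counts; the base addresses are placeholders, since only the prefix lengths are specified above.

  # Sanity-check usable host counts for the proposed prefixes
  # (standard library only; base addresses are examples, not allocations).
  import ipaddress

  plan = {
      'try.cloud.releng.scl3.mozilla.com':    '10.0.0.0/21',
      'core.cloud.releng.scl3.mozilla.com':   '10.0.8.0/22',
      'loaner.cloud.releng.scl3.mozilla.com': '10.0.12.0/23',
      'admin.cloud.releng.scl3.mozilla.com':  '10.0.14.0/23',
  }

  for name, cidr in plan.items():
      net = ipaddress.ip_network(cidr)
      # num_addresses includes the network and broadcast addresses.
      print('{:<40} {:<14} {:>4} usable hosts'.format(
          name, cidr, net.num_addresses - 2))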

Decisions/Assumptions

  • Network traffic should not flow through Linux hosts, because we don't have the skill or hardware support to handle the massive data flows that would result
  • Avoid SDN, for similar reasons. If we do have to do SDN, do it using "normal" OpenStack tools operating on dedicated network hardware, rather than trying to build a solution that will work for all of Mozilla's systems.
  • While isolating each host from all others is desirable, it is difficult given the above decisions, so we will fall back to the current level of isolation: hosts at each trust level can only communicate with other hosts at that trust level
  • The scheduling implementation relies on deployment of taskcluster, or a tool like it, that will interface properly with the dynamic provisioning API

Notes

  • "loaner" is considered a different trust level than "try", so loaner hosts must be isolated from try, in a distinct VLAN. Moves between VLANs are manual and rare, so that VLAN needs to have a minimum pool of dedicated hosts for each platform, unlike the current practice of taking specific machines out of production and then adding them back in.

Costs/Risks

  • Need some time from netops to set up networks/VLANs/flows
  • The solution is based on Icehouse (just released) and Ironic (beta), so many bugs and issues are expected
  • Significant development work required to support more than Ubuntu installs
  • Interfacing with existing deployment tools such as JAMF and WDS is an unknown


Alternatives

  • Use SDN to allow bare metal hosts' VLANs to be changed dynamically
    • costs/risks: configuring SDN, probably with dedicated network hardware
      • unfamiliar technology
      • needs time from netops, dcops
      • definitely needs dedicated switches; may require dedicated firewall
      • may not work
    • benefit: all nodes can be in the same AZ
    • timeline: plus multiple quarters, depending on how much needs to be dedicated hardware
  • configure one VLAN per host with SDN on the firewall
    • costs/risks: SDN configuration
      • unfamiliar technology
      • security issues with dynamic configuration of firewalls
      • must interact with netops to establish SDN techniques
      • high-risk work: firewall changes generally require TCWs, so we'll wait for a TCW for every incremental change
      • expensive option: purchase a staging firewall to test against
      • may not work
    • benefit: inter-node network isolation
    • timeline: plus several quarters
    • limited by the 802.1Q VLAN ID space (~4094 usable VLANs)
  • configure one VLAN per host with flows through neutron
    • costs/risks: neutron configuration, building appropriate neutron hardware
      • high-volume network flows require extensive tuning
      • HA is hard to achieve at this level
      • may not be possible with neutron and without SDN
      • may not even be possible with SDN
    • benefit: inter-node network isolation
    • timeline: plus several months, if it works

Next Steps

  • Choose a preferred solution
  • Build the prototype out in puppet
    • Install process should be easily customized with local patches and/or custom components -- so don't use upstream packages verbatim
    • Be generic enough that another organization can use the same modules to install a more "normal" OpenStack (e.g., webeng may be interested)
    • Something functional in a matter of 2-3 weeks
  • Build out infrastructure
    • VLANs + plumbing for ESX, firewall
    • Flow Requests
    • Set up with hosts in place in 1-2 weeks
  • After:
    • Begin working on back end drivers
      • timeline depends on level of engineering assistance, compatibility of existing tools like JAMF, WDS, etc.

Open Questions

  • Does Ironic support AZ-based conductor isolation, so that BM node X in VLAN A gets mapped to a conductor in VLAN A?
    • Ironic has no knowledge of host aggregation boundaries.
  • Do conductors have to be in the same VLAN, or can we get away with DHCP helpers?
    • Because Ironic has no knowledge of host aggregation boundaries, conductors should not be isolated to separate VLANs. We should use DHCP helpers to forward DHCP traffic to all conductors.