vac:dst:ift:vaclab

Description

The VacLab is a resource provided to Vac by Riff Labs Limited, intended to help us perform detailed simulations and deployments of distributed systems at scale as well as the systems and dependencies that surround them.

The VacLab is now mature enough to be used comfortably to advance IFT’s research, development, testing and quality control efforts. We want to ensure it is used to its full potential and that teams know the resource exists. Any team that sees a way its work could be improved by collaborating with the DST team through the VacLab should feel comfortable reaching out to us, confident that we will try to help. That confidence should rest on our attitude and, more importantly, on the track record and results we produce using these tools and our expertise.

The lab will be treated as an IaaS-style service at first, with the raw underlying infrastructure developed in partnership with Riff Labs, who handle the details of making that IaaS layer available and reliable for us.

As the lab matures, we will transition to a more PaaS-style or even SaaS-style model, in which as much as possible is accessible to the IFT ecosystem and its teams. They should be able to use and benefit from the lab without concerning themselves with the details of the underlying infrastructure, and without being blocked by the need to build and manage their own.

We will move towards self-service testing and deployment. Doing so will unblock and accelerate the development, R&D and productionisation of IFT’s projects by providing a safe and reliable place to experiment and test.

The lab will continue to provide significant efficiency benefits in cost versus output compared with cloud providers and even with traditional on-premises infrastructure. It does this by using many independent, cheaper nodes rather than larger, more powerful vertically scaled machines, by building on second-hand equipment, and by “patching around” the unreliability of individual hardware so that everything remains resilient and reliable even when individual components fail. In doing so it will continue to reduce and control the cost of testing our systems at scale.

Through the use of the VacLab we will support the Conduit of Expertise narrative by:

  • providing a unique capability to the IFT ecosystem that would not otherwise be available, and lowering the barrier to entry for teams that need research, or services that require infrastructure, by lowering the cost and removing the need to obtain it themselves from cloud providers that offer less flexibility and direct control,

  • using our knowledge of what is possible with these resources, based on who is already using them, and applying that knowledge to identify new use cases that will unlock better collaboration between teams and the DST, driving and accelerating the development of IFT projects such as Waku, Codex, Nomos and more,

  • accelerating initiatives by providing the means, capability and encouragement to test every aspect of anything that can be tested in a simulation, across every team and use case that is interested in doing so, up to the limits of what the DST team can support.

We will also provide support for the Premier Research destination narrative by:

  • Allowing public access to non-sensitive telemetry and metrics from non-sensitive systems such as Codex storage nodes, and potentially even probes that measure the state of networks such as The Waku Network and Status.

Task List

Status Page Known

  • fully qualified name: vac:dst:ift:vaclab:status-page-known
  • owner: Wings
  • status: 0%
  • start-date: 2024/12/01
  • end-date: 2024/12/10

Description

A status page for the VacLab that is widely accepted and used by anyone who wants to know the lab’s current status and availability.

Deliverables

  • Status page reflects reality and is accepted by the users as being a good fit for their needs.
  • Status page sees widespread use among its users.
  • An external probe and a fallback status page that can be used if everything else is down.
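
As a rough illustration of the external probe and fallback page, the sketch below checks a set of endpoints from outside the lab and renders a small static HTML page from the results. It is a minimal sketch under assumptions: the endpoint URLs and the names TARGETS, probe and render_page are hypothetical placeholders, not existing VacLab tooling, and it assumes the probe runs on a host that does not depend on the lab itself.

```python
import urllib.error
import urllib.request
from datetime import datetime, timezone
from html import escape

# Hypothetical endpoints: the real VacLab service URLs are not given here.
TARGETS = {
    "status-page": "https://status.example.invalid/",
    "auth": "https://auth.example.invalid/",
}


def probe(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True if the URL answers with a 2xx/3xx status within the timeout.

    `opener` is injectable so the function can be exercised without network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Connection errors, timeouts, HTTP error codes and malformed URLs
        # are all treated as "down".
        return False


def render_page(results, checked_at):
    """Render a self-contained HTML page from a {name: is_up} mapping."""
    rows = "\n".join(
        f"<li>{escape(name)}: {'up' if ok else 'DOWN'}</li>"
        for name, ok in sorted(results.items())
    )
    if all(results.values()):
        overall = "All systems reachable"
    else:
        overall = "Some systems unreachable"
    return (
        "<!doctype html>\n"
        "<title>VacLab status (fallback)</title>\n"
        f"<h1>{overall}</h1>\n"
        f"<p>Checked at {escape(checked_at.isoformat())}</p>\n"
        f"<ul>\n{rows}\n</ul>\n"
    )


def run_once(targets=TARGETS):
    """Probe every target once and return the rendered page as a string."""
    results = {name: probe(url) for name, url in targets.items()}
    return render_page(results, datetime.now(timezone.utc))
```

One possible deployment, again an assumption rather than a decision, is to call run_once on a schedule from a machine outside the lab and publish the returned HTML to static hosting, so the page remains available when the lab and its primary status page are unreachable.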

Better Time Slicing

  • fully qualified name: vac:dst:ift:vaclab:better-time-slicing
  • owner: Wings
  • status: 0%
  • start-date: 2024/12/01
  • end-date: 2024/12/14

Description

Improve how the lab’s time is sliced between workloads, so that more real-world time is spent running useful workloads and tests.

Deliverables

  • A report on the current state of time slicing in the lab.
  • A plan for how to improve time slicing in the lab.
  • A timeline for implementing the plan.
  • Measurable improvements in lab usage, with an initial target of 25% of real-world time being used for useful workloads and tests.

Later iterations of the VacLab commitment will aim to raise this to 50%, then 75%, and then as far as the underlying infrastructure and our actual needs allow.
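
One way to make the usage target measurable is to compute, for a reporting window, the fraction of wall-clock time during which at least one workload was running. The sketch below is illustrative only. It assumes job start and end times can be exported from some scheduler log (a hypothetical source), counts any recorded job as useful, and treats the lab as a single resource rather than measuring per-node usage. The function names are placeholders, not existing VacLab tooling.

```python
from datetime import datetime, timedelta


def merge_intervals(intervals):
    """Merge overlapping (start, end) pairs so concurrent jobs are not double counted."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def utilisation(jobs, window_start, window_end):
    """Fraction of wall-clock time in the window during which at least one job ran."""
    # Keep only jobs that intersect the window, clipped to its bounds.
    clipped = [
        (max(start, window_start), min(end, window_end))
        for start, end in jobs
        if end > window_start and start < window_end
    ]
    busy = sum((end - start for start, end in merge_intervals(clipped)), timedelta())
    return busy / (window_end - window_start)


# Example with made-up data: two overlapping jobs covering 00:00-06:00 of one day.
day_start = datetime(2024, 12, 1)
day_end = datetime(2024, 12, 2)
example_jobs = [
    (datetime(2024, 12, 1, 0), datetime(2024, 12, 1, 4)),
    (datetime(2024, 12, 1, 2), datetime(2024, 12, 1, 6)),
]
print(f"{utilisation(example_jobs, day_start, day_end):.0%}")  # 25%
```

Merging intervals before summing means that several tests running at once count as busy time only once. A per-node or per-core measure would be stricter and may be a better fit once the lab is shared between more teams.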

Train Lab Staff

  • fully qualified name: vac:dst:ift:vaclab:train-lab-staff
  • owner: Wings
  • status: 30%
  • start-date: 2024/12/01
  • end-date: 2024/12/31

Description

Fully dedicate all time outside of core DST deliverable work to training Michaela, the VacLab (Riff Labs Perth) custodian, in all aspects of not just managing the VacLab but also supporting the DST work that uses it, with a focus on improving the reliability of the lab and providing a better systems testing service.

For practical reasons, this must be done in person in Perth.

The time will also be used to improve the reliability and capabilities of the VacLab as a platform for IFT’s research and development needs.

This must not impact other work outside of this task.

Deliverables

  • Full automation for anything we know needs doing regularly
    • Automated patching for security updates (Debian, Authentik, SeaweedFS)
    • Secure key management and rotation automation (for SSH keys)
  • Michaela fully comfortable operating the lab independently
  • A report on what was learned in this process and how we believe it improved VacLab support and operations
  • Improvements to the lab that are documented, implemented and recorded.

Automation Uplift

  • fully qualified name: vac:dst:ift:vaclab:automation-uplift
  • owner: Wings
  • status: 0%
  • start-date: 2024/12/01
  • end-date: 2024/12/31

Description

Significantly improve the automation and management of the VacLab, freeing up Wings to focus on other work.

Deliverables

  • Full automation for anything we know needs doing regularly
    • Automated patching for security updates (Debian, Authentik, SeaweedFS)
    • Secure key management and rotation automation (for SSH keys)
  • A report on what was learned in this process and how we believe it improved VacLab support and operations
    • What was automated? Why? What did that change?
    • What remains manual and needs improving?
  • Improvements to the lab that are documented, implemented and recorded.