How Riskfuel is using Inlets to build machine learning models at scale

Introduction

Riskfuel trains neural network models on datasets consisting of hundreds of millions of data points. Running at such a scale presents its own set of unique challenges, especially when looking to maintain reliable pipelines and infrastructure.

Our clients are often large institutions like banks and insurance companies who consider security a top priority. As a result, it’s often preferred to run some or all of the processes required to produce a model on a client’s on-prem or cloud environment. When looking to deploy to a client’s environment, there are often many challenges which stem from the limited observability granted to us.

A few months ago, I reached out to Alex looking for a solution to a specific problem we’d been having:

Is there a better way to do remote deployments which doesn’t involve passing commands over zoom and email?

In this blog, we’ll show how Riskfuel is using Inlets to securely oversee fully remote and hybrid cloud deployments. I’ll also touch on how we train machine learning models on our clients’ infrastructure using Inlets to send millions of control messages.

Why remote environment management is important to us

When running our software in a client’s environment, it’s important that we make sure everything is working as expected before we attempt the generation of hundreds of millions of datapoints and trigger GPU instances galore. Having better visibility of the environments we are managing allows for smoother deployments and more effective use of compute.

Our software stack is built on the strong foundation that is Kubernetes. In order for us to be able to remotely manage and maintain our deployments we need access to the Kubernetes API server. If we are able to securely access the Kubernetes API server, then we have the ability to debug and perform maintenance tasks with optimal efficiency.

We had two different deployment models to consider:

fully remote
hybrid

In the case of a fully remote deployment the entire process of generating a dataset and training a neural network happens exclusively on the client’s compute environment. In this case, we are running Riskfuel software on our client’s hardware.

Pictured: An overview of the model creation process running exclusively on the client’s compute environment

In many cases, we might only need to perform a subset of tasks on the client’s compute. For example, the generation of training data will typically happen inside of the client’s environment while training often leverages Riskfuel’s existing GPU clusters and infrastructure.

In these cases, we may opt for a hybrid deployment where some of the processes occur in the client’s compute environment and the remainder is performed on Riskfuel’s compute environment. In this case we might look to expose not only the Kubernetes API server, but also our Apache Pulsar instance which passes messages between mircroservices.

Pictured: An overview of the model creation process using a hybrid deployment model

Connecting to a remote Kubernetes cluster using an inlets Pro tunnel

We solved our problem of servicing remote deployments by leveraging an inlets Pro tunnel to connect to Kubernetes API server endpoints which run outside of our environment.

How Riskfuel remotely connects to the Kubernetes API servers using Inlets tunnels

Deploying the inlets Pro server

We will first need to generate a token which will be used by the client to authenticate with the server.

export TOKEN=$(head -c 16 /dev/random | shasum|cut -d" " -f1)
kubectl create secret generic -n default \
  inlets-pro-secret \
  --from-literal token=$TOKEN
# Save a copy for later
echo $TOKEN > token.txt

Then, using the secret we can deploy the inlets Pro exit server using the following Kubernetes deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: inlets-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: inlets-server
  template:
    metadata:
      labels:
        app: inlets-server
    spec:
      containers:
      - name: inlets-server
        image: ghcr.io/inlets/inlets-pro:0.9.5
        imagePullPolicy: IfNotPresent
        env:
        - name: TOKEN
          valueFrom:
            secretKeyRef:
              name: inlets-pro-secret
              key: token
        command:
        - /bin/bash
        - -c
        args:
        - >-
          inlets-pro server \
            --auto-tls=true \
            --control-port=8001 \
            --token=$TOKEN \
            --auto-tls-san=INLETS_SERVER_IP

Make sure you update the value for INLETS_SERVER_IP

Running the inlets Pro client

The inlets-pro repository provides charts for both the client and the server.

The client helm chart expects a secret containing the same token used by the server so we will need to create it on both ends:

kubectl create secret generic -n default \
  inlets-pro-secret \
  --from-literal token=<token>

Next, we can quickly deploy the client using the helm chart:

helm repo add inlets-pro https://inlets.github.io/inlets-pro/charts/
helm repo update

helm install \
  --namespace default \
  --set autoTLS=true \
  --set ports=6443 \
  --set upstream=https://<kube-apiserver-ip>:6443 \
  --set url=wss://<inlets-server-ip>:<control-port> \
  --set fullnameOverride="inlets-client" \
    inlets-client \
    ./inlets-tcp-client

Now, provided our Kubernetes API server is configured to allow incoming requests to come in via <inlets-server-ip>, we can run kubectl commands as if this were any other cluster in our kubeconfig.

Sending messages to a message queue through an inlets Pro tunnel

In most cases, using the Kubernetes API server to relay information describing how a set of experiments are performing is enough to ensure the prompt delivery of trained models. In some cases however, we might look to leverage a more iterative approach where the state of our experiments needs to be evaluated in real time.

A common task associated with this is to send data generation requests to a message queue. Using Inlets we’ve been able to pass messages through an Inlets tunnel at speeds of up to 400mbps (around 40% saturation of a single outbound connection from our datacenter). For us, this is sufficient for passing instructions which describe the experiments we wish to run in a particular remote environment.

How Riskfuel is using Inlets to allow for message passing across firewall boundaries

Wrapping up

Having the ability to securely watch over our deployments is a game changer. We are currently building this into our deployment process in order to reduce the need for back and forth emails and zoom calls with clients.

Looking ahead, we’d like to expand our use cases to include:

Remote development environments like Jupyterlab and VSCode
Product usage metering and metric collection using Prometheus and Grafana

If you have any questions about Riskfuel or our Inlets use-cases, feel free to check out our website or reach out at addison@riskfuel.com.

Taking things further

To contact Alex or the inlets team, use the contact form.

Watch a free webinar explaining inlets use-cases:

A tale of two networks - demos and use-cases for inlets tunnels

See a quick demo: