Merged
8 changes: 5 additions & 3 deletions docs/docs.json
@@ -157,6 +157,7 @@
"pages": [
"geneva/index",
"geneva/overview/index",
"geneva/getting-started",
{
"group": "Transforms",
"pages": [
@@ -181,26 +182,27 @@
{
"group": "Running Jobs",
"pages": [
"geneva/jobs/contexts",
"geneva/jobs/backfilling",
"geneva/jobs/bulk-load-columns",
"geneva/jobs/materialized-views",
"geneva/jobs/advanced-job-configuration",
"geneva/jobs/lifecycle",
"geneva/jobs/conflicts",
"geneva/jobs/performance",
"geneva/jobs/job_metrics",
"geneva/jobs/console",
"geneva/jobs/troubleshooting"
"geneva/jobs/troubleshooting",
"geneva/jobs/contexts"
]
},
{
"group": "Deployment",
"pages": [
"geneva/deployment/index",
"geneva/deployment/helm",
"geneva/jobs/startup",
"geneva/deployment/dependency-verification",
"geneva/udfs/advanced-configuration",
"geneva/deployment/index",
"geneva/deployment/troubleshooting"
]
},
4 changes: 2 additions & 2 deletions docs/geneva/deployment/dependency-verification.mdx
@@ -1,13 +1,13 @@
---
title: Dependency Verification
sidebarTitle: Dependency Verification
description: Diagnose and resolve package version mismatches between local and Ray worker environments.
description: Diagnose and resolve package version mismatches between local and distributed worker environments.
icon: magnifying-glass-chart
---

import { PyQuickFixManifest, PyEnvVarsViaCluster, PyPipManifest, PyCondaClusterPath, PyCondaClusterInline } from '/snippets/geneva_dependency_verification.mdx';

When running Geneva UDFs on Ray, your code is serialized locally and executed on remote workers. If the worker environment differs from your local environment, you may encounter subtle and difficult-to-debug errors.
When running Geneva UDFs on distributed workers, your code is serialized locally and executed on remote workers. If the worker environment differs from your local environment, you may encounter subtle and difficult-to-debug errors.
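
One way to spot such a mismatch early is to compare installed package versions on both sides. The sketch below is a minimal illustration and not part of Geneva: `installed_versions` and `diff_versions` are hypothetical helpers, and how the worker-side versions are collected (for example, by running the same function on a worker) is left as an assumption.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} for the current environment."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return versions


def diff_versions(local, remote):
    """Return {package: (local_version, remote_version)} where they differ."""
    names = sorted(set(local) | set(remote))
    return {
        name: (local.get(name), remote.get(name))
        for name in names
        if local.get(name) != remote.get(name)
    }


if __name__ == "__main__":
    local = installed_versions(["pyarrow", "lancedb", "pylance"])
    # Hypothetical worker-side result; in practice, gather this on a worker.
    remote = {"pyarrow": "17.0.0", "lancedb": None, "pylance": None}
    for name, (mine, theirs) in diff_versions(local, remote).items():
        print(f"{name}: local={mine} worker={theirs}")
```

Any package reported by `diff_versions` is a candidate cause for serialization or import errors on the workers.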

## Example environment mismatch errors

96 changes: 94 additions & 2 deletions docs/geneva/deployment/helm.mdx
@@ -64,14 +64,14 @@ geneva:

azure:
# Azure managed identity client ID for the Geneva client.
# This identity should have a federated credential for the Geneva namespace
# This identity should have a federated credential for the LanceDB namespace
# and Storage Blob Data Contributor role on the storage account.
clientPrincipalId: ""
```

3. Install kuberay operator
```bash
export NAMESPACE=geneva
export NAMESPACE=lancedb

helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update
@@ -89,4 +89,96 @@ kubectl apply -f nvidia-device-plugin.yml
5. Install Geneva Helm chart
```bash
helm install geneva ./geneva -n $NAMESPACE --create-namespace
```

## Default cluster and manifest

In LanceDB Enterprise, backfill and refresh jobs use a **default cluster** (the compute
pool that jobs run on) and a **default manifest** (the Python dependency environment: image and
packages). Configuring both in the LanceDB Enterprise chart lets jobs run out of the box
without per-job configuration. Set them under `geneva.defaults` in the chart's
`values.yaml`:

```yaml
geneva:
defaults:
cluster:
cluster_type: external_ray
name: deployment-default
ray_address: "ray://raycluster-kuberay-head-svc.lancedb.svc.cluster.local:10001"
manifest:
name: deployment-default
pip: [geneva, pyarrow, lancedb, pylance]
head_image: rayproject/ray:2.54.0-py312
worker_image: rayproject/ray:2.54.0-py312
skip_site_packages: true
```

If no default is configured, jobs must specify a manifest explicitly. Individual transforms can override the default manifest by pinning one
with `@udf` / `@chunker` / `@udtf` (see
[Advanced Job Configuration](/geneva/jobs/advanced-job-configuration)); to override the cluster
at runtime, use an [Advanced Execution Context](/geneva/jobs/contexts).

## Providing a Ray cluster

The LanceDB Helm chart can be configured to deploy a static KubeRay cluster, provision KubeRay clusters on demand per job, or
use an existing Ray cluster.
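
As a rough summary of how the three options relate, the sketch below classifies a set of Helm values into one of them. It is an illustration only, not the chart's actual logic: `ray_cluster_mode` is a hypothetical helper, the value names (`raycluster.enabled`, `global.rayclusterUri`) are taken from the sections that follow, and treating an empty URI as taking precedence over `raycluster.enabled` is an assumption.

```python
def ray_cluster_mode(values):
    """Classify Helm values (as a dict) into one of the three cluster options."""
    uri = (values.get("global") or {}).get("rayclusterUri") or ""
    static_enabled = bool((values.get("raycluster") or {}).get("enabled"))

    if not uri:
        # An empty URI means no fixed cluster to connect to.
        return "on-demand KubeRay cluster per job"
    if static_enabled:
        # The chart deploys the cluster and jobs connect to it by URI.
        return "static KubeRay cluster deployed by the chart"
    # A URI is set but the chart does not deploy a cluster itself.
    return "existing (external) Ray cluster"


if __name__ == "__main__":
    values = {
        "raycluster": {"enabled": True},
        "global": {
            "rayclusterUri": "ray://raycluster-kuberay-head-svc.lancedb.svc.cluster.local:10001"
        },
    }
    print(ray_cluster_mode(values))
```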

### Use default LanceDB Enterprise Ray cluster (default)

By default, LanceDB Enterprise will use a shared, statically provisioned Ray cluster for job execution.

The following Helm values enable this cluster and point jobs at it.

```yaml
raycluster:
enabled: true

global:
rayclusterUri: "ray://raycluster-kuberay-head-svc.lancedb.svc.cluster.local:10001"
```
Comment on lines +122 to +139

Contributor

I'm kind of confused how this interacts with the previous section. Like, "I just said what the default cluster was in the deployment-default bit above - now why do I have to set up raycluster: and global:?"

(I mean, I know this as an experienced user, but it still made me double take, which makes me think that it might be confusing to a new user.)

I guess the point to make is something like "geneva.defaults tells your jobs what cluster to use. But we assume you don't already have a Ray cluster, so you have to deploy one there; here's how you do that."


The Ray cluster's configuration can be changed through the `raycluster.yaml` Helm values.

### Provision KubeRay clusters on demand

Set `global.rayclusterUri` to an empty value to provision ephemeral KubeRay clusters on-demand for each execution job. The default KubeRay cluster configuration
is specified in `geneva.defaults.cluster`, i.e.

```yaml
geneva:
defaults:
Comment on lines +145 to +150

Contributor

I'm not good at helm but I'd love it if this were super explicit (do you mean ""? or some other empty value?) so like this:

Suggested change
Set `global.rayclusterUri` to an empty value to provision ephemeral KubeRay clusters on-demand for each execution job. The default KubeRay cluster configuration
is specified in `geneva.defaults.cluster`, i.e.
```yaml
geneva:
defaults:
Set `global.rayclusterUri` to an empty value to provision ephemeral KubeRay clusters on-demand for each execution job. The default KubeRay cluster configuration
is specified in `geneva.defaults.cluster`, i.e.
```yaml
global:
rayclusterUri: ""
geneva:
defaults:

cluster:
cluster_type: kuberay
name: deployment-default
kuberay:
namespace: lancedb
config_method: IN_CLUSTER
head_group:
service_account: geneva-service-account
num_cpus: 2
memory: 8Gi
image: rayproject/ray:2.54.0-py312
worker_groups:
- name: cpu
service_account: geneva-service-account
num_cpus: 4
memory: 8Gi
replicas: 2
min_replicas: 0
max_replicas: 4
idle_timeout_seconds: 60
node_selector:
geneva.lancedb.com/ray-worker-cpu: "true"
image: rayproject/ray:2.54.0-py312
```
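
To get a feel for how much compute a configuration like this can request, the sketch below adds up CPU counts across the head and worker groups. It assumes that `num_cpus` applies to each replica and that a worker group scales between `min_replicas` and `max_replicas`; `cpu_bounds` is a hypothetical helper, not part of Geneva or KubeRay.

```python
def cpu_bounds(head_group, worker_groups):
    """Return (min_cpus, max_cpus) across the head and all worker groups."""
    low = high = head_group["num_cpus"]
    for group in worker_groups:
        # Assumption: num_cpus is requested per replica.
        low += group["num_cpus"] * group["min_replicas"]
        high += group["num_cpus"] * group["max_replicas"]
    return low, high


head = {"num_cpus": 2}
workers = [{"name": "cpu", "num_cpus": 4, "min_replicas": 0, "max_replicas": 4}]
print(cpu_bounds(head, workers))  # (2, 18)
```

Under these assumptions, the example configuration ranges from 2 CPUs (head only) to 18 CPUs (head plus four workers).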

### Use an external Ray cluster

Self-managed enterprise customers can bring an existing Ray cluster to run Geneva jobs. Set `global.rayclusterUri` in the Helm values
to a Ray address that is reachable from the LanceDB Enterprise deployment.

```yaml
global:
rayclusterUri: "ray://my-ray-cluster.my-ns.svc.cluster.local:10001"
```
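
Before installing, it can help to confirm that the address is well formed and reachable from inside the cluster. The sketch below uses only the Python standard library; `parse_ray_address` and `is_reachable` are hypothetical helpers, and a successful TCP connection only shows that the port is open, not that the Ray versions are compatible.

```python
import socket
from urllib.parse import urlparse


def parse_ray_address(address):
    """Split a ray:// address into (host, port), validating the scheme."""
    parsed = urlparse(address)
    if parsed.scheme != "ray" or not parsed.hostname or parsed.port is None:
        raise ValueError(f"expected ray://host:port, got {address!r}")
    return parsed.hostname, parsed.port


def is_reachable(address, timeout=3.0):
    """Return True if a TCP connection to the Ray address succeeds."""
    host, port = parse_ray_address(address)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    address = "ray://my-ray-cluster.my-ns.svc.cluster.local:10001"
    print(parse_ray_address(address))
    print(is_reachable(address))
```

Run the check from a pod in the same namespace as the LanceDB Enterprise deployment so that DNS resolution and network policies match what jobs will see.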
8 changes: 4 additions & 4 deletions docs/geneva/deployment/index.mdx
@@ -32,7 +32,7 @@ via the instructions below.
In the following sections we'll use these variables:

```bash
NAMESPACE=geneva # replace with your actual namespace if different
NAMESPACE=lancedb # replace with your actual namespace if different
KSA_NAME=geneva-ray-runner # replace with an identity name
```

@@ -142,7 +142,7 @@ Geneva needs the ability to deploy a KubeRay cluster and submit jobs to Ray. The
In the following sections we'll use these variables:

```bash
NAMESPACE=geneva # replace with your actual namespace if different
NAMESPACE=lancedb # replace with your actual namespace if different
KSA_NAME=geneva-ray-runner # replace with an identity name
PROJECT_ID=... # replace with your google cloud project name
GSA_EMAIL=${KSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com
@@ -218,7 +218,7 @@ Geneva can be used to provision Ray clusters running in Amazon Web Services (AWS
In the following sections we'll use these variables:

```bash
NAMESPACE=geneva # replace with your actual namespace if different
NAMESPACE=lancedb # replace with your actual namespace if different
CLUSTER=geneva # replace with your actual cluster name if different
KSA_NAME=geneva-ray-runner # replace with an identity name
```
@@ -428,7 +428,7 @@ worker_spec = _WorkerGroupSpec(

with ray_cluster(
name="my-ray-cluster",
namespace="geneva",
namespace="lancedb",
cluster_name="geneva",
config_method=K8sConfigMethod.EKS_AUTH,
region="us-east-1",