Installing Trino

This section provides a step-by-step guide to installing Trino.

Trino is a distributed SQL query engine; it does not store data itself. Instead, it queries data where it lives, either in databases or directly in object storage. Trino parses and analyzes the SQL query you submit, creates and optimizes an execution plan that accounts for the data sources involved, and then schedules the work across worker nodes, which query the underlying data sources they are connected to.

In this guide, Trino is deployed alongside MinIO as the dataset store, Hive Metastore for metadata, and Redis for the table schema.

Prerequisites:

Before starting the installation, make sure the following tools are installed and can reach your cluster:

  • kubectl: The primary command-line tool for managing Kubernetes clusters. It lets you inspect, manipulate, and administer cluster resources.
  • helm: A package manager for Kubernetes. Helm allows you to deploy, upgrade, and manage applications within your cluster using pre-defined charts.
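
A quick way to confirm both prerequisites before continuing is sketched below. The helper name require_tool is hypothetical and not part of the referenced repository; the check only confirms that each tool is on your PATH, not that it can reach a cluster.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command-line tool is on PATH.
# Prints "found: <tool>" on success; prints "missing: <tool>" to stderr
# and returns 1 otherwise.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check the two prerequisites named above without aborting on the first miss.
missing=0
for tool in kubectl helm; do
    require_tool "$tool" || missing=1
done

if [ "$missing" -ne 0 ]; then
    echo "Install the missing tools before continuing." >&2
fi
```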

To install Trino and its components, follow the steps below:

  1. Clone the GitHub repository that contains the resources needed to deploy Trino on Kubernetes, then change into the trino-on-kubernetes directory. The remaining commands are run from that directory.
    git clone https://github.com/minio/blog-assets.git
    cd blog-assets/trino-on-kubernetes
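  2. Create a namespace for Trino to isolate its resources from other applications in the cluster. Piping the --dry-run=client output into kubectl apply makes the command safe to re-run if the namespace already exists.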
    kubectl create namespace trino --dry-run=client -o yaml | kubectl apply -f -
  3. Create a generic Kubernetes secret from the redis/test.json file to hold the Redis table definition. The trailing || true suppresses a non-zero exit status, for example when the secret already exists.
    kubectl create secret generic redis-table-definition --from-file=redis/test.json -n trino || true
  4. Add the Bitnami and Trino chart repositories to your Helm configuration.
    helm repo add bitnami https://charts.bitnami.com/bitnami || true
    helm repo add trino https://trinodb.github.io/charts/ || true
  5. Install PostgreSQL, which serves as the backing database for the Hive Metastore.
    helm upgrade --install hive-metastore-postgresql bitnami/postgresql -n trino -f hive-metastore-postgresql/values.yaml
  6. Update hive-metastore/values.yaml (under blog-assets/trino-on-kubernetes) with your MinIO details, then deploy the Hive Metastore in the trino namespace.
    helm upgrade --install my-hive-metastore -n trino -f hive-metastore/values.yaml ./charts/hive-metastore
  7. Deploy Redis in the trino namespace using the Bitnami Helm chart. Redis is a fast in-memory data store, used here to hold the Trino table schema.
    helm upgrade --install my-redis bitnami/redis -n trino -f redis/values.yaml
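  8. On OpenShift, grant the default service account of the trino namespace access to the anyuid security context constraint (SCC). This step uses the OpenShift oc client: run the command below, then add the service account entry under users: in the editor that opens.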
    oc edit scc anyuid
    users:
    - system:serviceaccount:trino:default
  9. Update trino/values.yaml (under blog-assets/trino-on-kubernetes) with your MinIO details, then deploy Trino, pinned here to chart version 0.7.0.
    helm upgrade --install my-trino trino/trino --version 0.7.0 --namespace trino -f trino/values.yaml
  10. After the deployment completes, verify that all components are running.
    kubectl get pods -n trino
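  11. The output should look similar to the following, with every pod in the Running state. This sample was captured on OpenShift with oc get pods, which is equivalent to the kubectl command in step 10 when the current project is trino.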
    $ oc get pods
    NAME                                    READY   STATUS    RESTARTS   AGE
    hive-metastore-postgresql-0             1/1     Running   0          6h59m
    my-hive-metastore-0                     1/1     Running   0          6h57m
    my-redis-master-0                       1/1     Running   0          6h56m
    my-redis-replicas-0                     1/1     Running   0          6h56m
    my-redis-replicas-1                     1/1     Running   0          6h56m
    my-redis-replicas-2                     1/1     Running   0          6h55m
    my-trino-coordinator-767ff55b85-vp7wm   1/1     Running   0          6h40m
    my-trino-worker-5f6bcd5c46-7n9gf        1/1     Running   0          6h47m
    my-trino-worker-5f6bcd5c46-gchqd        1/1     Running   0          6h46m
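
The verification in steps 10 and 11 can also be scripted rather than checked by eye. The following is a minimal sketch, assuming the default column layout of the pod listing shown above (NAME, READY, STATUS, ...); the function name pods_not_running is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a pod listing on stdin and print the name of
# every pod whose STATUS column (third field) is not "Running".
# The first line is assumed to be the header row and is skipped, and lines
# with fewer than three fields are ignored.
pods_not_running() {
    awk 'NR > 1 && NF >= 3 && $3 != "Running" { print $1 }'
}

# Intended usage against the cluster (requires kubectl access):
#   kubectl get pods -n trino | pods_not_running
# No output means every listed pod reports Running.
```

Note that this checks only the STATUS column; a pod can report Running while its READY count is still below the expected value, so the READY column is worth a glance as well.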