Can you set up Trino in HA mode?

Originally posted on the Trino Slack channel.

Can you set up Trino in HA mode?

Here’s a popular repository if you want to role your own version of this: GitHub - lyft/presto-gateway: A load balancer / proxy / gateway for prestodb. It says presto but it is actually Trino, just needs a brand update.

I have run such a test case,
step1 : Launch the presto-gateway and start up two Trino cluster says clusterA and clusterB.
step2 : Then I issue some queries to the presto-gateway , it can distribute query between clusters, Its ok.
step3 : Then I shutdown clusterA and I issue some queries to the presto-gateway I found some query failed because clusterA is dead .So I am not very sure that can presto-gateway switch request to the active cluster automatically or is my problem? Thanks!

1 Like

That’s not how it works. Gateway tracks which cluster each query goes against and keeps it there. So, you can mark a cluster inactive and gateway will stop sending new queries there. The old queries are still running. You should only shut down the cluster when no queries are left (this can he done easily with graceful shut down). At the time of writing, Trino has no way to get true HA and avoid having any query fail. If your coordinator fails, running queries will fail.

You can leverage gateway for:

  1. Zero downtime release.
  2. Convenient way to stack multiple clusters behind a DNS.
  3. We coded it so the users can add tags to route to high memory or high cpu clusters, etc.

We actually run our clusters in Kubernetes and just make sure an extra node is around for the coordinator to jump to if it dies (30 seconds or less downtime). So, that’s their HA mechanism.

Pre k8s we did HA with HAProxy and actually kept 2 running, I have an old blog post for that here → Presto Coordinator High Availability (HA) | Coding Stream of Consciousness.

To be honest, we find coordinator failures super rare. They run close to 20PB (from s3) of data in around 2 million queries a month now, and the coordinator often stays up for 100+ days without restart if they don’t do a release. If it’s got the needed CPU/memory/config it’s pretty solid.

3 Likes

I agree with Coordinator resiliency overview. Leveraging a K8s platform that has built in node/pod recovery helps to reduce the need for otherwise complex solution.

Where a proxy/gateway and multiple coordinators is a nice solution is when lots of smaller queries come flooding in and leveraging something like resources groups is not enough to keep the coordinator from getting saturated. (i.e. lots of jobs are queued and waiting for “planning” yet the workers are almost idle)

A round-robin approach to balancing the load across 2 (or more) moderately sized coordinators with fewer workers each can provide better response time than a single massively sized coordinator with lots of workers.

1 Like