142 changes: 142 additions & 0 deletions core/src/main/java/io/temporal/samples/temporalmetricsdemo/README.md
@@ -0,0 +1,142 @@
# Temporal Cloud OpenMetrics → Prometheus → Grafana (Step-by-step)

This demo shows how to **scrape [Temporal Cloud OpenMetrics](https://docs.temporal.io/cloud/metrics/openmetrics/)**
with **Prometheus** and **visualize them in Grafana**.

It uses the Grafana Temporal mixin dashboard template:
https://github.com/grafana/jsonnet-libs/blob/master/temporal-mixin/dashboards/temporal-overview.json

Once imported/provisioned, the dashboard shows the key Temporal metrics in a ready-made layout.
It may take a moment for the panels to populate; refresh the page if they stay empty.

![Grafana dashboard 4](docs/images/img4.png)

Grafana dashboard views:

**Temporal Cloud OpenMetrics**
![Grafana dashboard 1](docs/images/img1.png)
![Grafana dashboard 2](docs/images/img2.png)


**Worker metrics**
![Grafana dashboard 3](docs/images/img6.png)
![Grafana dashboard 4](docs/images/img7.png)

Prometheus:

**Cloud metrics**
![Prometheus Cloud metrics](docs/images/img3.png)
**Worker metrics**
![Prometheus Worker metrics](docs/images/img3.png)

---

## 1) Create Service Account + API Key (Temporal Cloud)

OpenMetrics auth reference:
https://docs.temporal.io/production-deployment/cloud/metrics/openmetrics/api-reference#authentication

In Temporal Cloud UI:
- **Settings → Service Accounts**
- Create a service account with **Metrics Read-Only** role
- Generate an **API key** (copy it now; it will be needed in step 2)

---


## 2) Create the `secrets/` folder + API key file

Change into the demo folder (the one that contains `docker-compose.yml`) and create the key file:

```
cd temporalmetricsdemo
mkdir -p secrets
echo "put your CLOUD API KEY HERE" > secrets/temporal_cloud_api_key
```
The folder should now look like the tree below, with `secrets/temporal_cloud_api_key` containing the API key generated in step 1:

```
temporalmetricsdemo/
├── docker-compose.yml
├── prometheus/
├── grafana/
├── secrets/
│ └── temporal_cloud_api_key
```
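You can confirm the key file contains only the key (the Prometheus config is expected to read this file for authentication):

```
cat secrets/temporal_cloud_api_key
```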

## 3) Configure TemporalConnection.java

Edit `TemporalConnection.java` and set your defaults:

```
public static final String NAMESPACE = env("TEMPORAL_NAMESPACE", "<namespace>.<account-id>");
public static final String ADDRESS = env("TEMPORAL_ADDRESS", "<namespace>.<account-id>.tmprl.cloud:7233");
public static final String CERT = env("TEMPORAL_CERT", "/path/to/client.pem");
public static final String KEY = env("TEMPORAL_KEY", "/path/to/client.key");
public static final String TASK_QUEUE = env("TASK_QUEUE", "openmetrics-task-queue");
public static final int WORKER_SECONDS = envInt("WORKER_SECONDS", 60);
```
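Alternatively, because these values are read through `env(...)`, you can leave the defaults alone and export the corresponding environment variables before running the worker and starter:

```
export TEMPORAL_NAMESPACE=<namespace>.<account-id>
export TEMPORAL_ADDRESS=<namespace>.<account-id>.tmprl.cloud:7233
export TEMPORAL_CERT=/path/to/client.pem
export TEMPORAL_KEY=/path/to/client.key
export TASK_QUEUE=openmetrics-task-queue
```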

## 4) Update the Prometheus scrape config

Edit `prometheus/config.yml` and set the `namespaces` parameter to your namespace:
```
params:
namespaces: [ '<namespace>.<account-id>' ]
```
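For orientation, the full scrape job looks roughly like the sketch below. The job name, metrics path, endpoint host, and auth field here are placeholders/assumptions — keep whatever the repo's `prometheus/config.yml` already defines and only change the `namespaces` value (and, if needed, the path to the API key file mounted into the container):

```
scrape_configs:
  - job_name: temporal-cloud-openmetrics      # placeholder name
    scheme: https
    metrics_path: /v1/metrics                 # placeholder; use the path from the OpenMetrics docs
    authorization:
      type: Bearer
      credentials_file: /etc/prometheus/secrets/temporal_cloud_api_key
    params:
      namespaces: [ '<namespace>.<account-id>' ]
    static_configs:
      - targets: [ '<openmetrics-endpoint-host>' ]   # placeholder for the Temporal Cloud endpoint
```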


## 5) Start Prometheus + Grafana

```
docker compose up -d
docker compose ps
```
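For reference, the compose file is expected to wire things up roughly like the sketch below (image tags and volume paths are assumptions; the host ports match the URLs used in steps 6 and 7):

```
services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9093:9090"
    volumes:
      - ./prometheus/config.yml:/etc/prometheus/prometheus.yml
      - ./secrets:/etc/prometheus/secrets:ro
  grafana:
    image: grafana/grafana
    ports:
      - "3001:3000"
    volumes:
      - ./grafana:/etc/grafana/provisioning
```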


## 6) View Grafana dashboard

Open http://localhost:3001/ and log in with:

- Username: admin
- Password: admin

You should see the Temporal Cloud OpenMetrics dashboard.
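If the dashboard does not show up automatically, it can also be imported manually via Grafana's dashboard Import option, using the `temporal-overview.json` from the mixin link above. When provisioning from files, Grafana reads a provider config along these lines (the paths are assumptions about this repo's `grafana/` folder):

```
apiVersion: 1
providers:
  - name: temporal
    type: file
    options:
      path: /var/lib/grafana/dashboards
```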

## 7) Verify metrics in Prometheus

Prometheus: http://localhost:9093/

Go to:
- **Status → Targets** and make sure the scrape target is **UP**
- the **Graph** tab, search for Temporal metrics, and run a query
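A quick sanity check in the Graph tab is the built-in `up` metric; every healthy target reports `1`:

```
up
# or, to list only failing targets:
up == 0
```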

## 8) Run the sample and view the cloud metrics

- `./gradlew -q execute -PmainClass=io.temporal.samples.temporalmetricsdemo.WorkerMain`
- `METRICS_PORT=9465 ./gradlew -q execute -PmainClass=io.temporal.samples.temporalmetricsdemo.Starter`

Starter logs (the starter uses `METRICS_PORT=9465` so its `/metrics` endpoint does not collide with the worker's default port 9464):
```
➜ samples-java git:(deepika/openmetrics-demo) ✗ METRICS_PORT=9465 ./gradlew -q execute -PmainClass=io.temporal.samples.temporalmetricsdemo.Starter
Worker metrics exposed at http://0.0.0.0:9465/metrics
13:58:44.102 { } [main] INFO i.t.s.WorkflowServiceStubsImpl - Created WorkflowServiceStubs for channel: ManagedChannelOrphanWrapper{delegate=ManagedChannelImpl{logId=1, target=deepika-test-namespace.a2dd6.tmprl.cloud:7233}}

=== Starting scenario: success workflowId=scenario-success-6120c15f-b2f0-40cb-b3a6-f39bf0af6698 ===
Scenario=success Result=Hello Temporal

=== Starting scenario: fail workflowId=scenario-fail-0ff499af-1c3c-4a78-b75d-7d24c66ef46d ===
Scenario=fail ended: WorkflowFailedException - Workflow execution {workflowId='scenario-fail-0ff499af-1c3c-4a78-b75d-7d24c66ef46d', runId='', workflowType='ScenarioWorkflow'} failed. Metadata: {closeEventType='EVENT_TYPE_WORKFLOW_EXECUTION_FAILED', retryState='RETRY_STATE_RETRY_POLICY_NOT_SET', workflowTaskCompletedEventId=10'}

=== Starting scenario: timeout workflowId=scenario-timeout-e63954fb-c39b-4fb4-9dc9-ee2fa2835b52 ===
Scenario=timeout ended: WorkflowFailedException - Workflow execution {workflowId='scenario-timeout-e63954fb-c39b-4fb4-9dc9-ee2fa2835b52', runId='', workflowType='ScenarioWorkflow'} timed out. Metadata: {closeEventType='EVENT_TYPE_WORKFLOW_EXECUTION_TIMED_OUT', retryState='RETRY_STATE_RETRY_POLICY_NOT_SET'}

=== Starting scenario: continue workflowId=scenario-continue-26e780ab-1f0a-4222-9b2b-5c09c62cb824 ===
Scenario=continue Result=Hello Temporal

=== Starting scenario: cancel workflowId=scenario-cancel-08c9d1f6-6bfb-4ee9-9a5d-23b35fe1af7c ===
Scenario=cancel ended: WorkflowFailedException - Workflow execution {workflowId='scenario-cancel-08c9d1f6-6bfb-4ee9-9a5d-23b35fe1af7c', runId=''} was cancelled. Metadata: {closeEventType='EVENT_TYPE_WORKFLOW_EXECUTION_CANCELED', retryState='RETRY_STATE_NON_RETRYABLE_FAILURE'}
<===========--> 87% EXECUTING [18m 6s]

```
Some failures will appear in the worker logs; the starter intentionally fails, times out, and cancels workflows to generate data.
Give it a few seconds for data to show up in both Prometheus and the Grafana dashboard, and run the starter a couple of times so the rate
queries have enough samples to display properly.
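As an example of such rate queries (the exact metric names depend on the SDK and metrics setup, so use the Graph tab's autocomplete to find the names your workers actually expose — the ones below are illustrative):

```
# workflows completed vs. failed over the last 5 minutes (metric names illustrative)
rate(temporal_workflow_completed_total[5m])
rate(temporal_workflow_failed_total[5m])
```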
@@ -0,0 +1,60 @@
package io.temporal.samples.temporalmetricsdemo;

import io.temporal.client.WorkflowClient;
import io.temporal.client.WorkflowOptions;
import io.temporal.client.WorkflowStub;
import io.temporal.samples.temporalmetricsdemo.workflows.ScenarioWorkflow;
import java.time.Duration;
import java.util.UUID;

public class Starter {

public static void main(String[] args) throws Exception {
WorkflowClient client = TemporalConnection.client();

String name = "Temporal";
String[] scenarios = {"success", "fail", "timeout", "continue", "cancel"};

for (String scenario : scenarios) {
String wid = "scenario-" + scenario + "-" + UUID.randomUUID();

WorkflowOptions.Builder optionsBuilder =
WorkflowOptions.newBuilder()
.setTaskQueue(TemporalConnection.TASK_QUEUE)
.setWorkflowId(wid);

// workflow timeout
if ("timeout".equalsIgnoreCase(scenario)) {
optionsBuilder.setWorkflowRunTimeout(Duration.ofSeconds(3));
}

ScenarioWorkflow wf = client.newWorkflowStub(ScenarioWorkflow.class, optionsBuilder.build());

System.out.println("\n=== Starting scenario: " + scenario + " workflowId=" + wid + " ===");

try {
if ("cancel".equalsIgnoreCase(scenario)) {
WorkflowClient.start(wf::run, scenario, name);
Thread.sleep(2000);
WorkflowStub untyped = client.newUntypedWorkflowStub(wid);
untyped.cancel();
untyped.getResult(String.class);
continue;
}

// normal synchronous execution
String result = wf.run(scenario, name);
System.out.println("Scenario=" + scenario + " Result=" + result);

} catch (Exception e) {
System.out.println(
"Scenario="
+ scenario
+ " ended: "
+ e.getClass().getSimpleName()
+ " - "
+ e.getMessage());
}
}
}
}
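The `ScenarioWorkflow` interface is not part of this excerpt; based on the calls above (`wf.run(scenario, name)` returning a `String`), it is expected to look roughly like this sketch:

```
// Sketch only (assumed shape, not the actual file from this change).
package io.temporal.samples.temporalmetricsdemo.workflows;

import io.temporal.workflow.WorkflowInterface;
import io.temporal.workflow.WorkflowMethod;

@WorkflowInterface
public interface ScenarioWorkflow {
  // scenario selects the behavior (success, fail, timeout, continue, cancel); name is the greeting input.
  @WorkflowMethod
  String run(String scenario, String name);
}
```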
@@ -0,0 +1,141 @@
package io.temporal.samples.temporalmetricsdemo;

import com.sun.net.httpserver.HttpServer;
import com.uber.m3.tally.RootScopeBuilder;
import com.uber.m3.tally.Scope;
import com.uber.m3.tally.StatsReporter;
import com.uber.m3.util.Duration;
import io.grpc.netty.shaded.io.grpc.netty.GrpcSslContexts;
import io.grpc.netty.shaded.io.netty.handler.ssl.SslContext;
import io.grpc.netty.shaded.io.netty.handler.ssl.SslContextBuilder;
import io.micrometer.prometheus.PrometheusConfig;
import io.micrometer.prometheus.PrometheusMeterRegistry;
import io.temporal.client.WorkflowClient;
import io.temporal.client.WorkflowClientOptions;
import io.temporal.common.reporter.MicrometerClientStatsReporter;
import io.temporal.serviceclient.WorkflowServiceStubs;
import io.temporal.serviceclient.WorkflowServiceStubsOptions;
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public final class TemporalConnection {
private TemporalConnection() {}

// Read from environment (docker-compose env_file: .env)
public static final String NAMESPACE = env("TEMPORAL_NAMESPACE", "<namespace>.<account-id>");
public static final String ADDRESS =
env("TEMPORAL_ADDRESS", "<namespace>.<account-id>.tmprl.cloud:7233");
public static final String CERT =
env("TEMPORAL_CERT", "path/to/client.pem");
public static final String KEY =
env("TEMPORAL_KEY", "path/to/client.key");
public static final String TASK_QUEUE = env("TASK_QUEUE", "openmetrics-task-queue");
private static final int METRICS_PORT = envInt("METRICS_PORT", 9464);
private static final int METRICS_REPORT_SECONDS = envInt("METRICS_REPORT_SECONDS", 10);

private static volatile WorkflowClient CLIENT;
private static volatile WorkflowServiceStubs SERVICE;

private static volatile boolean METRICS_STARTED = false;
private static volatile PrometheusMeterRegistry PROM_REGISTRY;

public static WorkflowClient client() {
if (CLIENT != null) return CLIENT;
synchronized (TemporalConnection.class) {
if (CLIENT != null) return CLIENT;

SERVICE = serviceStubs();
CLIENT =
WorkflowClient.newInstance(
SERVICE, WorkflowClientOptions.newBuilder().setNamespace(NAMESPACE).build());
return CLIENT;
}
}

// create service stubs used by worker + starter
private static WorkflowServiceStubs serviceStubs() {
try (InputStream clientCert = new FileInputStream(CERT);
InputStream clientKey = new FileInputStream(KEY)) {

SslContext sslContext =
GrpcSslContexts.configure(SslContextBuilder.forClient().keyManager(clientCert, clientKey))
.build();

Scope metricsScope = metricsScope(); // ✅ tally scope that writes into Prometheus registry

WorkflowServiceStubsOptions options =
WorkflowServiceStubsOptions.newBuilder()
.setTarget(ADDRESS)
.setSslContext(sslContext)
.setMetricsScope(metricsScope) // ✅ Temporal SDK emits metrics here
.build();

return WorkflowServiceStubs.newServiceStubs(options);
} catch (Exception e) {
throw new RuntimeException("Failed to create Temporal TLS connection", e);
}
}

private static Scope metricsScope() {
synchronized (TemporalConnection.class) {
if (PROM_REGISTRY == null) {
PROM_REGISTRY = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
}

StatsReporter reporter = new MicrometerClientStatsReporter(PROM_REGISTRY);

Scope scope =
new RootScopeBuilder()
.reporter(reporter)
.reportEvery(Duration.ofSeconds(METRICS_REPORT_SECONDS));

if (!METRICS_STARTED) {
METRICS_STARTED = true;
startMetricsHttpServer(PROM_REGISTRY);
}

return scope;
}
}

private static void startMetricsHttpServer(PrometheusMeterRegistry registry) {
try {
HttpServer server = HttpServer.create(new InetSocketAddress("0.0.0.0", METRICS_PORT), 0);
server.createContext(
"/metrics",
exchange -> {
byte[] body = registry.scrape().getBytes(StandardCharsets.UTF_8);
exchange
.getResponseHeaders()
.add("Content-Type", "text/plain; version=0.0.4; charset=utf-8");
exchange.sendResponseHeaders(200, body.length);
try (OutputStream os = exchange.getResponseBody()) {
os.write(body);
}
});
server.setExecutor(null);
server.start();
System.out.println("Worker metrics exposed at http://0.0.0.0:" + METRICS_PORT + "/metrics");
} catch (Exception e) {
throw new RuntimeException("Failed to start /metrics endpoint", e);
}
}

private static String env(String key, String def) {
String v = System.getenv(key);
return (v == null || v.isBlank()) ? def : v;
}

private static int envInt(String key, int def) {
String v = System.getenv(key);
if (v == null || v.isBlank()) return def;
try {
return Integer.parseInt(v.trim());
} catch (Exception e) {
return def;
}
}
}
@@ -0,0 +1,30 @@
package io.temporal.samples.temporalmetricsdemo;

import io.temporal.client.WorkflowClient;
import io.temporal.samples.temporalmetricsdemo.activities.ScenarioActivitiesImpl;
import io.temporal.samples.temporalmetricsdemo.workflows.ScenarioWorkflowImpl;
import io.temporal.worker.WorkerFactory;

public class WorkerMain {
public static void main(String[] args) throws Exception {
WorkflowClient client = TemporalConnection.client();

WorkerFactory factory = WorkerFactory.newInstance(client);
io.temporal.worker.Worker worker = factory.newWorker(TemporalConnection.TASK_QUEUE);

worker.registerWorkflowImplementationTypes(ScenarioWorkflowImpl.class);
worker.registerActivitiesImplementations(new ScenarioActivitiesImpl());

factory.start();
System.out.println(
"Worker started. namespace="
+ TemporalConnection.NAMESPACE
+ " taskQueue="
+ TemporalConnection.TASK_QUEUE
+ " metrics=http://0.0.0.0:"
+ System.getenv().getOrDefault("METRICS_PORT", "9464")
+ "/metrics");

Thread.currentThread().join();
}
}
@@ -0,0 +1,10 @@
package io.temporal.samples.temporalmetricsdemo.activities;

import io.temporal.activity.ActivityInterface;
import io.temporal.activity.ActivityMethod;

@ActivityInterface
public interface ScenarioActivities {
@ActivityMethod
String doWork(String name, String scenario);
}
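The `ScenarioActivitiesImpl` registered in `WorkerMain` is also not included in this excerpt; a minimal implementation consistent with the interface above (and with the `Result=Hello Temporal` lines in the starter logs) could look like this sketch:

```
// Sketch only: a trivial implementation consistent with the ScenarioActivities interface.
package io.temporal.samples.temporalmetricsdemo.activities;

public class ScenarioActivitiesImpl implements ScenarioActivities {
  @Override
  public String doWork(String name, String scenario) {
    // The greeting text is an assumption based on the starter output.
    return "Hello " + name;
  }
}
```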