Continuous CPU Profiling
Every service on Guara Cloud is profiled continuously. The platform samples CPU with eBPF, stores the profiles in Grafana Pyroscope, and renders them as interactive flamegraphs in the dashboard. There is no SDK to install, no environment variable to set, and no opt-in toggle: profiling is always on for every service you deploy and for every catalog service you provision.
Where to find it
There are two surfaces:
- Service detail page → Profiling tab. The flamegraph for a single service over the time window you pick. Use this when you’re focused on one service.
- Profiling Explorer. A standalone page that lets you pick any project / service / time range from a sidebar and compare flamegraphs side-by-side. Use this when you’re investigating across services or sharing a profile with a teammate.
Both surfaces let you zoom into stack frames, see self and total CPU percentages, and copy a deep-link URL that captures the project, service, time range, and zoom state.
Time-range retention by plan
How far back you can query profiles depends on your plan. The dashboard’s time-range picker hides options outside your retention.
| Plan | Profile retention |
|---|---|
| Hobby | 24 hours |
| Pro | 7 days |
| Business | 7 days |
| Enterprise | 7 days |
If you request a window through the API that goes beyond your plan’s retention, the platform returns a PROFILING_TIER_RETENTION_EXCEEDED error that includes the maximum window your plan allows.
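A minimal sketch of handling that error from a script, using the documented --json flag. The response shape (an error.code field) is an assumption for illustration; check the API reference for the actual schema.

```bash
#!/usr/bin/env bash
# Ask for a 30-day window; on the Hobby or Pro plan this exceeds retention.
out=$(guara services profiles get \
  --from 2026-04-02T15:00:00Z --to 2026-05-02T15:00:00Z --json)

# NOTE: the .error.code path is an assumed response shape, not a documented one.
if echo "$out" | jq -e '.error.code == "PROFILING_TIER_RETENTION_EXCEEDED"' >/dev/null 2>&1; then
  echo "Window exceeds plan retention; retrying with the last hour." >&2
  guara services profiles get --time-range 1h --json
else
  echo "$out"
fi
```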
Supported runtimes
Profiling works for every runtime Guara Cloud supports. Symbol fidelity (whether stack frames show function names or raw addresses) depends on the runtime: JIT-compiled runtimes need extra flags before their frames can be symbolized. The platform injects the right environment variables automatically based on your service’s runtime — you don’t need to do anything.
| Runtime | Symbol fidelity | Notes |
|---|---|---|
| Node.js | Full | Auto-injects NODE_OPTIONS=--perf-basic-prof-only-functions --interpreted-frames-native-stack. |
| JVM (Java, Kotlin, Scala) | Full | Auto-injects JAVA_TOOL_OPTIONS=-XX:+PreserveFramePointer. |
| Python | Full | Pure eBPF — no env vars needed. |
| Go | Full | Pure eBPF — no env vars needed. |
| Rust | Full | Pure eBPF — no env vars needed. |
| Other native runtimes | Best-effort | Frames may show as addresses if the runtime doesn’t expose symbols. |
If you’re already setting NODE_OPTIONS or JAVA_TOOL_OPTIONS in your service’s environment variables, the platform appends to your value rather than replacing it.
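For example, if a Node.js service already sets NODE_OPTIONS (the heap limit below is just an illustrative value), the container receives the combined result:

```bash
# What you configure on the service:
NODE_OPTIONS="--max-old-space-size=2048"

# What the container actually receives (your value, with the profiling flags appended):
NODE_OPTIONS="--max-old-space-size=2048 --perf-basic-prof-only-functions --interpreted-frames-native-stack"
```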
Reading a flamegraph
A flamegraph is a top-down view of where your service spent CPU time during the selected window:
- Width of a frame = how much CPU time it consumed (its total percentage).
- Stacking = caller above, callee below. The widest leaf at the bottom of any tower is usually the hot loop.
- Self % (shown on hover) = time spent directly in that function, excluding callees.
- Total % = self time plus all callees (see the worked example below).
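As a worked example, using the numbers from the sample CLI output later on this page, where pg.client.send is the only callee of db.query visible in the truncated graph:

```
db.query            38.1% total
└─ pg.client.send   25.8% total

self(db.query) ≤ 38.1% − 25.8% = 12.3%
```

The bound is an upper one: if db.query has other callees hidden by the truncation, its self time is smaller still.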
Use the search box to highlight every frame whose function name matches a substring — handy when you remember a function name but not where it sits in the call graph.
Click a frame to zoom in (the rest of the graph collapses to that subtree). Click the breadcrumb above the graph to zoom back out.
Trace → profile
When you open a trace span in the Traces view, the View profile action jumps to the flamegraph for the same service over the surrounding time window. This is the fastest way to answer “the request was slow — was the service CPU-bound or waiting on I/O?”.
Errors you might see
| Error code | What it means |
|---|---|
| PROFILING_NO_DATA | No samples for the selected window. The service may have just started, been idle, or crashed. |
| PROFILING_TIER_RETENTION_EXCEEDED | Your time range goes further back than your plan’s retention. Pick a shorter window or upgrade. |
| PROFILING_PYROSCOPE_UNAVAILABLE | The profiling backend is temporarily unavailable. Try again in a moment. |
Privacy and overhead
eBPF-based profiling samples the CPU at a fixed rate (typically ~100 Hz per CPU) and only captures call stacks — never application data, request bodies, environment variables, or memory contents. Overhead is bounded to a few percent of CPU per service in the worst case and is invisible in practice.
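For a rough sense of scale, assuming the ~100 Hz rate, a 15-minute window like the one in the CLI example below rests on:

```
100 samples/s × 900 s = 90,000 samples per CPU
→ a frame shown at 1.0% total corresponds to roughly 900 samples
```

Short windows over mostly idle services rest on far fewer samples, so their percentages are correspondingly noisier.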
From the CLI
guara services profiles get renders an ASCII flamegraph for the same window the dashboard would show, plus a top-N self-time table for at-a-glance hot frames. Use it from a shell session or an SSH jump-box, or pipe --json into your own tooling (see the scripting example at the end of this section).
```bash
guara services profiles get
guara services profiles get --time-range 1h --top 20
guara services profiles get --from 2026-05-02T15:00:00Z --to 2026-05-02T15:15:00Z
guara services profiles get --json | jq .data.metadata
```
Sample output (truncated):
```
my-api · profile · 2026-05-02T15:00:00Z → 2026-05-02T15:15:00Z
▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 100.0% total
▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 62.4% handleRequest
▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 38.1% db.query
▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 25.8% pg.client.send
▇▇▇▇▇▇▇▇ 12.6% serialize
▇▇▇▇▇▇▇▇▇▇▇▇ 18.7% gc.minorMark

Top 10 by self time
 #  SELF %  FRAME
 1  25.8%   pg.client.send
 2  18.7%   gc.minorMark
 3  12.6%   serialize
 4   8.4%   crypto.randomFillSync
 5   7.1%   net.write
 …

Tier: Pro · retention 168h · window 15m
```
When stdout is not a TTY (piped into another command or redirected to a file), the flamegraph itself is omitted and only the top-N table is printed for legibility.
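As one way to pipe --json into your own tooling, here is a sketch of a CI guard that fails when any frame exceeds a self-time threshold. It assumes the JSON exposes frames as a .data.frames array with name and self fields (an illustrative schema, not a documented one):

```bash
#!/usr/bin/env bash
# Fail the pipeline if any frame spends more than 30% of CPU time in itself.
# NOTE: .data.frames[] with .name/.self fields is an assumed schema.
hot=$(guara services profiles get --time-range 1h --json \
  | jq -r '.data.frames[] | select(.self > 30) | .name')
if [ -n "$hot" ]; then
  echo "Frames above 30% self time in the last hour:" >&2
  echo "$hot" >&2
  exit 1
fi
```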
See guara services profiles get in the CLI reference for the full flag list.