Definite runs as a private deployment in your own cloud (AWS, GCP, Azure, or anywhere with Kubernetes); you keep control of data, compute, and networking. Request a demo →

§ Agent · Datadog

The Datadog data agent that acts the way you would.

It keeps an eye on your Datadog telemetry alongside your deploys, releases, and warehouse, on a schedule you set or whenever fresh data lands. When something drifts, it tells you, or handles it the way you'd want.

See it on your Datadog data → · Book a demo

Definite APP · 9:14 AM · #platform-alerts

⚠️ checkout-api p99 latency up 2.4x, SLO error budget half-spent in 6 hours

APM duration on checkout-api jumped from ~180ms to 430ms right after the 14:10 deploy, well past your baseline, and the 99.9% SLO is burning budget fast enough to breach by Thursday.

Review & approve → · Dismiss

Datadog APM Request Duration Metrics + SLO · joined to your deploy log · audit log

How an agent works

An agent watches one thing and acts on it. Not a workflow, just a standing watch that usually does nothing and acts the moment it should.

Watch

A number that matters

on your schedule, or when fresh data lands

→

Judge

Worth your attention?

remembers what it already flagged

→

Act

Tells you, or acts

Slack · webhook · hand off to Fi

◄ repeats on the schedule you set ►
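
The watch, judge, act cycle above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Definite's implementation: `run_once`, `judge_latency`, and the 300 ms threshold are hypothetical names and values chosen for the example.

```python
from typing import Callable, Optional


def run_once(
    watch: Callable[[], float],
    judge: Callable[[float], Optional[str]],
    act: Callable[[str], None],
) -> bool:
    """One scheduled pass. Returns True only if the agent acted."""
    value = watch()            # Watch: read the number that matters
    finding = judge(value)     # Judge: a finding, or None if nothing is wrong
    if finding is None:
        return False           # the usual case: do nothing
    act(finding)               # Act: notify or hand off
    return True


def judge_latency(p99_ms: float) -> Optional[str]:
    # Illustrative threshold only; a real agent would compare against a baseline.
    return f"p99 at {p99_ms:.0f} ms" if p99_ms > 300 else None


sent: list[str] = []
quiet = run_once(lambda: 180.0, judge_latency, sent.append)  # below threshold
fired = run_once(lambda: 430.0, judge_latency, sent.append)  # above threshold
```

Only the second pass produces a message; the first returns without doing anything, which is what a standing watch looks like most of the time.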

Get started with your Datadog agent today →

You stay in control

An agent does what you'd do, and only what you've authorized.

The same trusted numbers

It acts on the same governed metrics as your dashboards, and every action is logged and traceable.

You approve anything that writes

It alerts and recommends on its own; anything that changes data is yours to approve.

Try it on a test channel first

Point a new agent at a throwaway channel and watch its judgment before it touches anything real.

No repeat alarms

It remembers what it already flagged and waits before acting again, so it won't alert you about the same thing twice.
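
One way to picture "remembers what it already flagged and waits before acting again" is a cooldown keyed by the thing being flagged. The sketch below is an assumption about how such memory could work, not a description of Definite's internals; `AlertMemory`, the key format, and the six-hour cooldown are all hypothetical.

```python
from datetime import datetime, timedelta


class AlertMemory:
    """Hypothetical memory of what was flagged and when."""

    def __init__(self, cooldown: timedelta):
        self.cooldown = cooldown
        self.last_sent = {}  # key -> datetime of the last alert for that key

    def should_alert(self, key: str, now: datetime) -> bool:
        """True if this key has not been flagged within the cooldown window."""
        last = self.last_sent.get(key)
        if last is not None and now - last < self.cooldown:
            return False  # already flagged recently, stay quiet
        self.last_sent[key] = now
        return True


memory = AlertMemory(cooldown=timedelta(hours=6))
t0 = datetime(2024, 1, 1, 9, 0)
a = memory.should_alert("checkout-api:p99", t0)                       # first sighting
b = memory.should_alert("checkout-api:p99", t0 + timedelta(hours=1))  # inside cooldown
c = memory.should_alert("checkout-api:p99", t0 + timedelta(hours=7))  # cooldown elapsed
```

The same condition seen an hour later is suppressed, while one that is still present after the cooldown is raised again.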

What you can put an agent on

Correlate · ACROSS YOUR SOURCES

Tie a latency spike back to the deploy that caused it

It watches your APM duration metrics next to your deploy and release history, so when p99 jumps it already knows which deploy lined up with it, not just that something got slow. You get the regression and the likely cause together, instead of pivoting between Datadog and your CI history at 2am.

APM Request Duration Metrics · API Request Logs
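
Lining a spike up with a deploy is, at its simplest, a time-window join. Here is a rough sketch, assuming you already have the spike timestamp and a list of deploys; the function name `deploy_before_spike`, the 30-minute window, and the sample deploy names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def deploy_before_spike(
    spike_at: datetime,
    deploys: list[tuple[datetime, str]],
    window: timedelta = timedelta(minutes=30),
) -> Optional[str]:
    """Return the most recent deploy that shipped within `window` before the spike."""
    candidates = [
        (shipped_at, name)
        for shipped_at, name in deploys
        if timedelta(0) <= spike_at - shipped_at <= window
    ]
    if not candidates:
        return None
    return max(candidates)[1]  # latest timestamp wins


deploys = [
    (datetime(2024, 1, 1, 11, 45), "checkout-api v41"),
    (datetime(2024, 1, 1, 14, 10), "checkout-api v42"),
]
suspect = deploy_before_spike(datetime(2024, 1, 1, 14, 18), deploys)
```

A match in time only makes the deploy a likely cause, not a proven one, which is why the finding is framed as something to review.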

SLOs

Catch an error budget burning before it breaches

When an SLO starts burning budget faster than the period can absorb, it tells you which objective is at risk and how long you have, and looks at the logs behind the burn. You hear about the risk while there's still room to act, not in the postmortem.

Service Level Objective (SLO) · APM Request Duration Metrics
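
"How long you have" comes down to burn-rate arithmetic. The sketch below uses the common definition of burn rate (observed error rate divided by the error rate the SLO allows) and assumes traffic and error rate stay constant. The function name and the sample numbers are illustrative, not taken from Definite or Datadog.

```python
def hours_until_breach(
    slo_target: float,           # e.g. 0.999 for a 99.9% objective
    observed_error_rate: float,  # fraction of bad requests right now
    budget_remaining: float,     # fraction of the period's budget left, 0..1
    period_hours: float,         # length of the SLO period in hours
) -> float:
    """Project how many hours of error budget are left at the current burn."""
    allowed_error_rate = 1.0 - slo_target
    burn_rate = observed_error_rate / allowed_error_rate
    if burn_rate <= 0:
        return float("inf")  # nothing is burning
    # At burn rate 1 the full budget lasts exactly one period.
    return budget_remaining * period_hours / burn_rate


# 99.9% SLO over 30 days, 0.5% of requests failing, half the budget left.
remaining = hours_until_breach(0.999, 0.005, 0.5, 30 * 24)
```

With these made-up inputs the burn rate is about 5, so the remaining half of a 720-hour budget lasts roughly 72 hours.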

Logs

Surface a new error pattern in the request logs

It watches your API request logs for error rates and patterns that break from trend, surfaces the endpoints and status codes involved, and lines up the context for you before it turns into a page. You find the quiet 500s before a customer does.

API Request Logs
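
As a rough illustration of "breaking trend", the sketch below compares 5xx counts per endpoint and status code across two equal-length windows. It is a simplified stand-in, assuming log rows have already been reduced to `(endpoint, status)` pairs; `new_error_patterns`, `min_count`, and `factor` are hypothetical names and thresholds.

```python
from collections import Counter


def new_error_patterns(baseline, current, min_count=5, factor=3.0):
    """Flag (endpoint, status) pairs whose 5xx count is new or well above baseline.

    `baseline` and `current` are lists of (endpoint, status) tuples taken from
    two windows of equal length.
    """
    base = Counter(p for p in baseline if p[1] >= 500)
    now = Counter(p for p in current if p[1] >= 500)
    flagged = []
    for pattern, count in now.items():
        # Ignore tiny counts, then require a clear jump over the baseline.
        if count >= min_count and count > factor * base.get(pattern, 0):
            flagged.append(pattern)
    return sorted(flagged)


baseline = [("/cart", 500)] * 2 + [("/pay", 200)] * 50
current = [("/cart", 500)] * 3 + [("/pay", 502)] * 8 + [("/pay", 200)] * 40
patterns = new_error_patterns(baseline, current)
```

Here the handful of `/cart` 500s stays under the minimum count, while the `/pay` 502s, absent from the baseline, are surfaced.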

Custom

Run any Python it needs to get the job done

Beyond alerts and write-backs, an agent can run arbitrary Python, so it can do whatever the task actually requires: hit the Datadog API, open an incident, reshape the metrics, or wire into your own tooling. The action space is yours to define.

Why not just build it yourself?

You could rig one of these with a cron job and a Slack webhook in an afternoon. The watching is the easy part. Here's what you'd own forever if you did, and what you don't have to own here:

→ The cross-source join: not just one tool's data, but that data reconciled against the rest of your stack
→ A trusted, consistent metric: the same number your dashboards use
→ The investigation into why, when something fires
→ A full audit trail of everything it did
→ The upkeep, when the schema drifts or the script breaks at 2am

See how autonomous agents work →

The data it works from

Every Datadog object, modeled and query-ready the moment you connect.

◆API Request Logs

infrastructure_devops

◆APM Request Duration Metrics

infrastructure_devops

◆Service Level Objective (SLO)

infrastructure_devops · operations

It runs on your real Datadog account (noisy services, half-defined SLOs, log volume spikes and all), not a tidy demo.

Explore the Datadog connector →

Where it acts

Slack

A message in the channel you choose, with the context and a button to act on it.

Email

A summary in the inbox of the people who need to see it.

Webhook

A payload to your own systems, to wire the agent into whatever you already run.

Warehouse write-back

A flag written back to your warehouse for everything downstream to pick up.

Hand off to Fi

Kick the question to Fi to investigate the why and propose the fix.

MCP

Expose it to your own agents and tools over MCP, and drive it from your stack.

Run it in your own VPC or fully self-hosted. Everything it does is pure SQL and Python you can inspect.

See autonomous agents →

Build your agents with Fi

Fi is your AI analyst. It helps you build and customize everything in Definite, including the agents that watch and act.

Fi

Your AI analyst. Ask questions in plain English, and let it help you build and customize everything in Definite, including your agents.

Meet Fi →

Agents

The watchers and actors. Once you've built one, it runs on its own, keeping an eye on what matters and acting the way you would.

Autonomous agents →

Get started

1. Connect Datadog, and the sources it needs to reconcile against. Synced and modeled in an afternoon.
2. See the numbers tie out to what you already trust.
3. Put an agent on one thing you can't afford to miss. Fi helps you build it.

Book a demo · Try Definite free

§ FAQ

Common questions

You set the schedule, and it also re-checks whenever fresh Datadog data lands. Each agent watches the one thing you point it at, nothing else.

It alerts and recommends on its own. Anything that writes, whether to a tool, your warehouse, or a customer, is yours to approve. You can also point a new agent at a test channel first and watch its judgment before it touches anything real.

When something fires, it can hand off to Fi to investigate, drilling into the data it has across your connected sources to find what's behind the move, and showing its work.

Datadog's built-in alerts fire on a threshold you hard-code inside Datadog, on Datadog data alone. This watches continuously, reasons across your telemetry plus your deploys and warehouse, and hands off to Fi to investigate why, so you get the cause with the alert instead of a number with no context.

Your answer engine is one afternoon away.

Book a 30-minute call and we'll answer your first questions live, with your own data.

Schedule a demo → · Try Definite free