Everybody has a Datadog cost problem

Today we're announcing a new specialized agent available to use on Firetiger, designed to help platform engineering teams cut their Datadog bill in half.

Datadog costs are out of control, and everyone knows it

I've used Datadog since 2016 (what? 10 years now?), and from the start the question of how to manage cost has been top of mind. I've heard the same story a thousand times from peers: everybody who uses Datadog has a Datadog cost problem.

So it's no surprise that one of the first things Firetiger customers asked their agents to do was audit their Datadog usage and find ways to cut it.

It's also no surprise that some customers found opportunities to cut their bill by over 60%!

How did we get there?

At the root of this issue is the incentive misalignment between the vendor (Datadog) and the customer. The vendor makes money when the customer writes more data, not when they get more value out of the product.

This sentiment is everywhere. Pierre de Wulf captured it perfectly on LinkedIn, highlighting how the cost structure is the problem.

In the early days, the cost makes sense. Every metric and tag you add brings genuine visibility, and you get tremendous value from the data flowing into Datadog. But over time, metrics go unused as teams move on, tags get added without anyone considering how they'll impact cardinality, and cost grows quietly in the background.
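To make the cardinality effect concrete, here is a rough back-of-the-envelope sketch. It assumes each distinct combination of tag values on a metric counts as its own time series, and that tag values combine freely (a worst case). The tag names and counts are hypothetical.

```python
from math import prod


def max_series(tag_value_counts: dict[str, int]) -> int:
    """Upper bound on distinct time series for one metric.

    Assumes every combination of tag values can occur, so the bound
    is the product of the number of distinct values per tag.
    """
    return prod(tag_value_counts.values())


# Hypothetical metric tagged by service and environment.
before = {"service": 20, "env": 3}

# Someone adds a customer_id tag to get per-customer visibility.
after = {**before, "customer_id": 500}

print(max_series(before))  # 60
print(max_series(after))   # 30000
```

One extra tag multiplies the worst-case series count by the number of values that tag can take, which is why a single well-meaning change can blow up the bill.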

Before you know it, you've gone from the honeymoon phase where sending more data to Datadog felt like a superpower, to being afraid of it. You want to add a tag to get visibility into a new dimension, but you hesitate, because last time someone did that it blew up the bill. The same action that used to bring clarity now brings dread. You're doing exactly what made you love the product, and it's punishing you for it.

Agents know how to fix your bill

A couple of years ago, when I was at Segment, I had the pleasure of working with Ben Yolken. Ben was the kind of engineer you could throw any problem at and know he would do an amazing job solving it, so eventually the problem of controlling Datadog cost landed on his plate.

Ben developed a dashboard displaying cost estimates for all our metrics, which allowed us to track down the worst offenders and stop leaking cash on metrics that nobody ever looked at.

However, the situation would degrade week over week: new metrics would pop up, or new tags would cause cardinality explosions, and the cycle would repeat. Every month Ben had to be called in for damage control once observability cost had become unbearable again.

This is the story of so many engineering teams. The work required to manage this cost center is substantial, and analyzing the problem and understanding what to do about it always requires expertise across multiple layers of the stack: monitoring cost, correlating it with the codebase, understanding what is being measured, and so on.

The problem isn't building the dashboard; it's keeping someone watching it, forever.
That's what agents do.

Instead of you having to task people ad hoc to avert a crisis, the agents work tirelessly on those problems, identifying issues before they become emergencies and either proposing changes or applying automatic remediations.

Get your agents to work!

The newcomer to the Firetiger agent catalog is the Datadog Cost Expert, and it's ready to start digging into your custom metrics, scanning every dashboard, monitor, SLO, and notebook for what's unused and what's costing you more than it should.

Start by creating a Datadog connection with API and App keys that give the agents permission to access the Datadog APIs.

Then, in the Firetiger agent catalog, select the Datadog Cost Expert and hit "Add to My Agents".

The agent will then start exploring your setup to draft a cost optimization plan tailored to your usage.

You can configure the agent to run on a schedule, or execute it manually as needed, to receive a report of actionable changes such as:

  • Disabling metrics that are not used in any dashboard, monitor, SLO, or notebook
  • Disabling tags that drive up cardinality and aren't being used either
  • Analyzing trends to surface unexpected changes in usage
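As an illustration of the first kind of check, here is a minimal sketch of how unused metrics could be spotted. It is not how the Datadog Cost Expert is implemented; it assumes you have already exported the list of metric names and the query strings from your dashboards and monitors, and the function name and sample data are hypothetical.

```python
import re

# Dotted identifiers such as "checkout.latency" or "avg".
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")


def find_unused_metrics(metric_names, queries):
    """Return the metrics whose names never appear in any query string."""
    referenced = set()
    for query in queries:
        # Collect every identifier-like token; metric names are among them.
        referenced.update(TOKEN.findall(query))
    return sorted(name for name in metric_names if name not in referenced)


# Hypothetical inventory of custom metrics.
metrics = ["api.requests.count", "checkout.latency", "legacy.cache.hits"]

# Hypothetical queries pulled from dashboards and monitors.
queries = [
    "avg:checkout.latency{env:prod} by {service}",
    "sum:api.requests.count{*}.as_count()",
]

print(find_unused_metrics(metrics, queries))  # ['legacy.cache.hits']
```

A metric that shows up in no query is only a candidate for removal: it may still be read by scripts or other tools outside the exported queries, so the result is best treated as a list to review rather than a list to delete.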

To summarize:

  1. Create a Datadog connection to give agents access to the API
  2. Add the Datadog Cost Expert agent that runs ad-hoc or on a schedule
  3. Receive actionable reports on how to cut down your Datadog bill

Your Datadog bill doesn't have to be something you dread every month. You can sign up for Firetiger to create your Datadog Cost Expert agent today. Get your own tireless Ben Yolken, and let us know how much it saved you!
