Every few months, a CEO asks me: “I keep hearing about DORA metrics. Should we be tracking them?” The short answer is yes. The longer answer is: yes, but how you use them matters more than whether you collect them.

DORA — DevOps Research and Assessment — started as a research program that Google acquired in 2018. The team studied thousands of engineering organizations over several years and identified four metrics that consistently predict both technical performance and business outcomes. Not opinion. Not theory. Statistical correlation across a massive dataset.

Here’s what they are and what they actually tell you.

The Four Metrics

Deployment frequency. How often does your team ship code to production? This measures your ability to deliver value to customers. Elite teams deploy on demand, multiple times per day. Low performers deploy monthly or less frequently.

Why it matters: deployment frequency is a proxy for batch size. Teams that deploy frequently ship small changes. Small changes are easier to review, easier to test, easier to roll back, and easier to understand when something goes wrong. Teams that deploy monthly are shipping massive bundles of changes where any one of fifty things could be the cause of a production issue.
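To make the measurement concrete, here's a minimal Python sketch that turns a list of deploy dates — the kind you'd export from a CI/CD system — into a per-week frequency. The data and function name are illustrative, not a prescribed implementation:

```python
from datetime import date

# Hypothetical deployment log: one entry per production deploy,
# as pulled from a CI/CD system's deployment history.
deploys = [
    date(2024, 3, 4), date(2024, 3, 4), date(2024, 3, 5),
    date(2024, 3, 6), date(2024, 3, 8), date(2024, 3, 11),
]

def deploys_per_week(deploy_dates):
    """Average deployments per week over the observed window."""
    if not deploy_dates:
        return 0.0
    span_days = (max(deploy_dates) - min(deploy_dates)).days + 1
    return len(deploy_dates) / (span_days / 7)

print(round(deploys_per_week(deploys), 2))  # 5.25
```

Counting over the whole observed window (rather than per calendar week) smooths out quiet weeks, which is usually what you want for a trend line.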

Lead time for changes. The clock starts when a developer commits code and stops when that code is running in production. Elite teams measure this in minutes to hours. Low performers measure it in weeks to months.

Long lead times mean your pipeline has bottlenecks — manual approval gates, slow CI/CD, environment provisioning delays, or heavyweight change management processes. Every day of lead time is a day your customers aren’t getting value from work that’s already done.
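Given pairs of commit and deploy timestamps, the calculation is straightforward. A sketch, with illustrative data; the median is used here because lead-time distributions are typically skewed by a few outliers:

```python
from datetime import datetime
from statistics import median

# Hypothetical (committed, deployed) timestamp pairs, joined from
# version control and deployment records.
changes = [
    (datetime(2024, 3, 4, 9, 0),  datetime(2024, 3, 4, 11, 30)),
    (datetime(2024, 3, 4, 14, 0), datetime(2024, 3, 5, 10, 0)),
    (datetime(2024, 3, 6, 8, 0),  datetime(2024, 3, 6, 8, 45)),
]

def median_lead_time_hours(pairs):
    """Median commit-to-production time in hours."""
    hours = [(deployed - committed).total_seconds() / 3600
             for committed, deployed in pairs]
    return median(hours)

print(median_lead_time_hours(changes))  # 2.5
```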

Change failure rate. What percentage of deployments cause a production incident, require a rollback, or need a hotfix? Elite teams stay below 5%. Low performers are above 45%.

This is your quality signal. A high change failure rate means your testing, review, and deployment processes aren’t catching problems before they reach production. And here’s the counterintuitive part: the fix isn’t to deploy less often. Teams that deploy less frequently typically have higher failure rates because their deployments are bigger and riskier.
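The calculation itself is simple division — the hard part is honestly flagging which deploys count as failures. A sketch with illustrative outcome data:

```python
# Hypothetical deployment records: True means the deploy caused an
# incident, required a rollback, or needed a hotfix.
deploy_outcomes = [False, False, True, False, False,
                   False, False, True, False, False]

def change_failure_rate(outcomes):
    """Fraction of deployments that degraded production."""
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)

print(f"{change_failure_rate(deploy_outcomes):.0%}")  # 20%
```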

Mean time to restore service (MTTR). When something breaks in production, how quickly do you fix it? Elite teams restore service in under an hour. Low performers take days to weeks.

MTTR measures your operational maturity — monitoring, alerting, runbooks, on-call processes, and your team’s ability to diagnose and resolve problems under pressure. A team that deploys frequently with a low failure rate but takes three days to fix production issues has a serious operational gap.
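Restore time falls out of incident open/resolve timestamps, which your monitoring or incident-management tool should already record. A sketch with illustrative incidents:

```python
from datetime import datetime

# Hypothetical incidents from monitoring: (opened, resolved).
incidents = [
    (datetime(2024, 3, 5, 10, 0),  datetime(2024, 3, 5, 10, 40)),
    (datetime(2024, 3, 12, 2, 15), datetime(2024, 3, 12, 4, 15)),
]

def mttr_hours(incident_windows):
    """Mean time to restore service, in hours."""
    durations = [(resolved - opened).total_seconds() / 3600
                 for opened, resolved in incident_windows]
    return sum(durations) / len(durations)

print(round(mttr_hours(incidents), 2))
```

One incident of 40 minutes and one of 2 hours average out to about 1.33 hours — squarely in elite territory for this sample.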

What the Benchmarks Look Like

The annual Accelerate State of DevOps report categorizes teams into four performance levels (the exact cutoffs shift a bit from year to year):

  • Elite: Deploy on demand (multiple times per day), lead time under one hour, change failure rate 0-5%, restore time under one hour.
  • High: Deploy between once per day and once per week, lead time between one day and one week, change failure rate 6-10%, restore time under one day.
  • Medium: Deploy between once per week and once per month, lead time between one week and one month, change failure rate 11-15%, restore time between one day and one week.
  • Low: Deploy less than once per month, lead time between one month and six months, change failure rate 46-60%, restore time more than six months (yes, really).

Most of the teams I work with start somewhere between medium and low. That’s not a failure — it’s a baseline. The value of DORA isn’t knowing where you rank. It’s knowing where to focus improvement.

How to Actually Use Them

Start by measuring, not optimizing. Instrument your pipeline to capture these four numbers automatically. Don’t rely on self-reporting — engineers will unconsciously round in favorable directions. Pull deployment data from your CI/CD system, incident data from your monitoring tools, and lead time from your version control and deployment timestamps.
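The automated join looks something like this: match deploy events to commits by SHA and derive lead time directly, with no self-reporting in the loop. Record shapes and names here are illustrative — your CI/CD system's API will differ:

```python
from datetime import datetime

# Commit timestamps keyed by SHA, e.g. parsed from `git log`.
commits = {
    "a1b2c3": datetime(2024, 3, 4, 9, 12),
    "d4e5f6": datetime(2024, 3, 5, 16, 40),
}

# Deployment events, e.g. from a CI/CD deployment-history API.
deploy_log = [
    {"sha": "a1b2c3", "deployed_at": datetime(2024, 3, 4, 11, 0)},
    {"sha": "d4e5f6", "deployed_at": datetime(2024, 3, 6, 9, 10)},
]

# Join the two sources by SHA to get lead times in hours.
lead_times_h = [
    (d["deployed_at"] - commits[d["sha"]]).total_seconds() / 3600
    for d in deploy_log
    if d["sha"] in commits
]
print(lead_times_h)  # [1.8, 16.5]
```

The same join pattern works for the other metrics: deploy events give you frequency, deploy events tagged by incident records give you failure rate, and incident timestamps give you restore time.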

Track trends, not snapshots. A single measurement is meaningless. What matters is direction. Are you deploying more frequently this quarter than last? Is your lead time shrinking? Is your failure rate trending down? Improvement velocity matters more than absolute position.
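A direction check is all the reporting you need to start with. A sketch over hypothetical quarterly measurements, remembering that for failure rate and restore time, down is good:

```python
# Hypothetical quarterly snapshots of three of the four metrics.
quarters = {
    "2024Q1": {"deploys_per_week": 2.0, "cfr": 0.32, "mttr_h": 30.0},
    "2024Q2": {"deploys_per_week": 3.5, "cfr": 0.24, "mttr_h": 18.0},
}

def trend(metric, higher_is_better):
    """Compare the two most recent quarters for one metric."""
    prev, curr = (quarters[q][metric] for q in sorted(quarters))
    improved = curr > prev if higher_is_better else curr < prev
    return "improving" if improved else "not improving"

print(trend("deploys_per_week", higher_is_better=True))  # improving
print(trend("cfr", higher_is_better=False))              # improving
```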

Use them as team metrics, never individual metrics. The moment you start comparing individual developers by DORA numbers, you incentivize gaming. An engineer will avoid deploying to keep their personal failure rate low, or rush deploys to inflate frequency. These are team health indicators, full stop.

Focus on one metric at a time. If your change failure rate is 40%, don’t simultaneously try to increase deployment frequency. Fix quality first, then accelerate. Trying to improve all four at once usually means you improve none.

The Metrics DORA Doesn’t Cover

DORA tells you about your delivery pipeline. It doesn’t tell you whether you’re building the right things, whether your engineers are happy, or whether your architecture will support growth. I complement DORA with developer experience metrics (how long to onboard, how painful is the local dev environment) and business outcome metrics (feature adoption rate, time from idea to customer value).

DORA is a diagnostic tool, not a complete picture. But as diagnostic tools go, it’s the best one we have.

