"We hired a dev shop and we're not sure what we're getting." This is one of the most common conversations in my discovery calls. A founder or COO engaged an offshore team — often at an attractive hourly rate — and 6 months in, they can't tell whether the code is good, the pace is reasonable, or the money is well spent.
They're not asking because the relationship has obviously failed. They're asking because they lack the technical frame of reference to evaluate it. Is a 3-week estimate for a user settings page reasonable? Is 40% test coverage acceptable? Is it normal to have this many production bugs?
Here's the audit I run. It takes about a week and gives you a clear picture.
Dimension 1: Code Quality
Pull up the last 10 pull requests (or the equivalent if the team doesn't use PRs, which is itself a finding). For each one, check:
Does the PR include tests? Not just "tests exist" but "tests that would catch the most likely bugs." If a PR adds a payment processing feature with no test for what happens when a payment fails, that's a negative quality signal.
Is there error handling? Does the code account for what happens when things go wrong — network failures, invalid input, missing data? Or does it only handle the happy path?
Are there hardcoded values that should be configuration? API endpoints, feature flags, business rules — these should be configurable, not buried in code.
Are there security basics? Input validation, parameterized queries (not string concatenation for database queries), no secrets in code, proper authentication checks on protected routes.
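The first check on that list is mechanical enough to script. Here's a minimal sketch that flags PRs touching source code with no accompanying test changes — it assumes you've already pulled each PR's changed file paths (from `git diff --name-only` or your code host's API) into a dict, and the file-naming conventions it matches on are assumptions you'd adjust to your codebase:

```python
def prs_missing_tests(prs):
    """Return the ids of PRs whose diffs touch source files but no test files.

    `prs` maps a PR id to its list of changed file paths. The source/test
    heuristics below are illustrative, not universal.
    """
    def is_test(path):
        name = path.rsplit("/", 1)[-1].lower()
        return name.startswith("test_") or ".test." in name or "/tests/" in path.lower()

    flagged = []
    for pr_id, files in prs.items():
        touches_source = any(f.endswith((".py", ".js", ".ts", ".java")) for f in files)
        has_tests = any(is_test(f) for f in files)
        if touches_source and not has_tests:
            flagged.append(pr_id)
    return flagged


# Example: PR 102 changed source code but shipped no tests.
prs_missing_tests({
    101: ["src/payments.py", "tests/test_payments.py"],
    102: ["src/settings.py"],
})
```

A script like this won't tell you whether the tests are *good* — that still takes a human reading the diff — but it turns "do PRs include tests at all?" into a five-minute query instead of an afternoon.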
You don't need to be a senior engineer to run this audit. You need someone technical enough to read code and check these boxes. If you don't have that person internally, this is exactly the kind of bounded engagement a fractional CTO handles in a few hours.
Dimension 2: Velocity Honesty
Get the team's estimates for the last 3 months of work alongside the actual delivery dates. Calculate the estimate accuracy: (actual time / estimated time) for each significant piece of work.
Healthy teams estimate within 1.5x of actual. Struggling teams are routinely 2-3x off. If a team consistently estimates 2 weeks and delivers in 6 weeks, they're either bad at estimating (trainable) or overpromising to avoid difficult conversations (cultural problem).
Also look at the trend. Is accuracy improving over time (good — the team is learning) or staying flat or worsening (concerning — the feedback loop isn't working)?
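The ratio and the trend check can be sketched in a few lines. This assumes you have (estimated, actual) duration pairs in chronological order; the "trend" here is a deliberately crude first-half vs. second-half comparison, not anything statistical:

```python
def accuracy_report(work_items):
    """Return (ratios, improving) for chronological (estimated, actual) pairs.

    ratio = actual / estimated; 1.0 is a perfect estimate, 3.0 means the work
    took three times as long as promised.
    """
    ratios = [actual / estimated for estimated, actual in work_items]
    # Crude trend: is the mean ratio of the recent half lower than the early half?
    half = len(ratios) // 2
    early = sum(ratios[:half]) / half
    recent = sum(ratios[half:]) / (len(ratios) - half)
    return ratios, recent < early


# Example: estimates in days. Still 1.5x off at the end, but clearly improving.
accuracy_report([(10, 30), (10, 25), (10, 18), (10, 15)])
```

In practice you'd pull these pairs from your project tracker; the point is that both numbers — the accuracy and its direction — come straight from data the team already produces.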
Dimension 3: Bus Factor
For each major area of the codebase, how many team members can independently make changes and ship them? If the answer is "one person" for any critical area, that's a bus factor problem — and it's more dangerous with offshore teams where turnover can be sudden and communication barriers make knowledge transfer harder.
Ask the team lead to describe who can work on each major component. Then verify by checking git blame — who actually commits to each area? If the team lead says "anyone can work on payments" but one person has authored 95% of the payment code, reality disagrees with the narrative.
Dimension 4: Communication Overhead
Track how many hours per week your onshore team spends on offshore-team-related communication: writing detailed specifications, clarifying requirements mid-sprint, reviewing and requesting changes on pull requests, and fixing issues in work that was marked "complete."
Some communication overhead is expected and healthy. Requirements clarification and code review are part of any development process. But if your onshore team is spending 10+ hours per week rewriting specifications because the offshore team doesn't understand the product, or fixing bugs in "completed" features, the offshore team's effective hourly rate is much higher than what you're paying.
The math: if an offshore developer costs $40/hour and your onshore engineer spends 2 hours fixing every 8 hours of offshore work, the real cost is $40 × 8 + $150 × 2 = $620 for 8 hours of output, or $77.50/hour. At that rate, you might be better off with fewer, more expensive engineers who require less oversight.
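That math generalizes to a one-line calculator. The rates and fix-up hours below are the article's example numbers, not benchmarks:

```python
def effective_hourly_rate(offshore_rate, onshore_rate,
                          offshore_hours, onshore_fixup_hours):
    """Blended cost per hour of offshore output once onshore rework is counted."""
    total_cost = (offshore_rate * offshore_hours
                  + onshore_rate * onshore_fixup_hours)
    return total_cost / offshore_hours


# The example above: (40 * 8 + 150 * 2) / 8 = 77.5
effective_hourly_rate(40, 150, 8, 2)
```

Run it with your own numbers — the onshore fix-up hours are the figure worth tracking honestly, because they're the ones nobody invoices for.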
Dimension 5: Rework Ratio
What percentage of "completed" work items come back for significant revision? Not minor polish — significant revision where the feature doesn't meet the requirement, doesn't handle edge cases, or has bugs that should have been caught during development.
In a healthy team, the rework ratio is under 15%. At 15-25%, there's a systemic quality issue that might be addressable with better specifications, code review processes, or test requirements. Above 25%, the team is costing you more than they're saving.
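For completeness, here's the ratio and those thresholds as a sketch — the inputs assume you've already made the judgment call on which items count as "significant revision":

```python
def rework_ratio(items):
    """Fraction of completed items that came back for significant revision.

    `items` is a list of booleans: True if the item needed significant rework.
    """
    return sum(items) / len(items)


def rework_verdict(ratio):
    """Map a rework ratio onto the bands discussed above."""
    if ratio < 0.15:
        return "healthy"
    if ratio <= 0.25:
        return "systemic quality issue"
    return "costing more than saving"


# Example: 3 of 20 completed items bounced back -> 15%, the warning band.
rework_verdict(rework_ratio([True] * 3 + [False] * 17))
```

The hard part isn't the arithmetic; it's holding the line on what counts as "significant" so the number stays comparable month over month.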
What the Audit Tells You
The audit results cluster into three categories:
The team is solid, your management process needs work. High communication overhead but low rework ratio suggests the team delivers quality work but needs better input. Invest in better specifications, clearer acceptance criteria, and more structured communication.
The team has gaps but is salvageable. Moderate issues across dimensions suggest specific skill or process gaps. Targeted interventions — requiring test coverage, implementing code review, adding a senior developer to the team — can close the gaps.
The team isn't working. High rework ratio, low velocity honesty, bus factor of 1 across critical areas, and communication overhead eating your onshore team's productivity. It's time for a frank conversation about whether this engagement is achieving its goals.
Related: How to Evaluate Your Offshore Development Team | Scaling From 1 to 20 Engineers | Signs Your Engineering Team Needs Outside Leadership