Catching tech debt in ecommerce: Why APMs aren't enough
You have New Relic, Dynatrace, Datadog all ready to go for your latest release or site update.
You launch the new update, your engineers eagerly staring at their APM platform waiting to catch regressions, calculate error rates, identify improperly-written lines of code that are affecting your users. Your development team resolves the major issues and the site starts to normalize again… Now what?
As time goes on, are your developers focusing their efforts on the new issues that affect less than a percent of site traffic? In all likelihood, those errors are put into the low-priority bucket.
When the Director of Finance is browsing the site and flags an error to the rest of the team… That error is likely to be prioritized, right?
And with the data APMs are providing you, why wouldn’t you want to focus on the bugs that your director caught or the ones affecting only a large enough percent of your web traffic to be flagged?
It makes sense to operate this way… Until it suddenly doesn’t.
Why we love APMs
Application Performance Monitoring tools are great for initial releases.
No code is perfect, and app monitoring tools like New Relic and Dynatrace give you a view into the velocity rate of your site errors, which is super helpful when doing a major launch.
But after that launch, unless your web store is completely static and isn’t using any third party apps, there’s a surprising amount of nooks and crannies that could let an error flow in. (To learn about how errors leak into your system without your knowledge, check out this article on Why Site Errors Aren’t Always Your Engineering Team’s Fault)
Does error occurrence data really matter?
I’ve heard this argument quite a few times now, "What if an error is only happening sometimes, and only to a tiny portion of our traffic? Is that worth our developers' time and resources?"
Honestly, it’s a fair argument.
Armed with only that data, there’s no reason to prioritize that fix, especially if there’s another bug that’s interacting with 10% of your visitors.
So I wouldn’t fault you for putting that into a low priority bucket and stamping on a "We’ll get to it eventually" sticker. You have limited time, and your APM is telling you there are more pressing issues to address!
But what makes that other error truly more pressing than the one only affecting fringe cases of traffic? Because if it’s simply just the number of times said error has occurred, I argue that it might not be as pressing as you may think.
What if the fringe error only occurs to approximately 5 people 3 times a week, but the error itself is blocking those users from completing their order? That’s 15 sales in a week that could have happened, which you could argue is not a lot. It would be easy enough to sweep that under the proverbial rug and deal with that when your developers don’t have active projects on the go.
BUT
With ecommerce AOV sitting at around $58 USD, that’s $870 in revenue lost in a week. And the week after that, and the week after that.
After a year of that low-velocity, “low priority” error sitting in the backlog, you’ve lost $45K in revenue that should have come in.
Now imagine you’ve backlogged a few of those low velocity errors, or the new errors being introduced into your environment aren’t significant enough to be flagged due to minimum occurrence thresholds... (By the way, here’s why that happens)
There’s going to be a lot of tech debt looming in the background that, without something else in place, is going to go unaddressed.
How to eliminate ecommerce tech debt
Okay, so how do you get that something else that could make error-debt a thing of the past for your teams?
You’re going to need something that will detect ALL errors across all browsers and devices, that will capture session recordings and (more importantly) full stack traces for your engineers, that will assign a hard dollar value to every individual bug so engineers and business users can know the true impact of any issue, and that will auto-prioritize based on potential losses in revenue so the team knows what to focus on… And have this running at all times.
By implementing the above system (Surprise! There's a platform that does all this for you), your ecommerce and Engineering teams are going to have solid data-driven way to plan sprints, work on projects with demonstrated financial impact to the business, solve issues more efficiently, and consistently improve the customer user experience.
Major brands have already been getting on board with this new system:
Guess has improved their checkout conversions by 6%.
Champion has seen an ROI of 18x, saying, "We have been able to resolve and address issues, without waiting for customer generated tickets, proactively and become better aligned on the impact of what we’re working on."
Moschino touts an impressive 99.57x ROI having identified and resolved over 100 errors on their ecommerce platform over an 18 month period.
To name a few...
What’s the common denominator between these brands that led to the numbers we’re seeing above?
It’s Noibu. The industry leading error detection, revenue recovery software built specifically for ecommerce sites.
Using Noibu’s auto-detection and prioritization software, 100s of ecommerce brands worldwide have eliminated their tech-debt, made their developers happier, improved user experience and reduced customer complaints, refocused teams around data, and accelerated error resolution time.
All that stuff? You’re not going to get from an APM tool.
So sure, APMs have their use and we’re not saying you should abandon yours… Just that it shouldn’t be the only tech you have at your disposal.
Because when you get down to it, what’s really more important for your brand? The amount of people seeing something, or the amount of actual revenue being blocked?