14 Oct, 2022

Observability for real-time performance data during high heat launches at EQL

Background

Demand for rare or popular goods has shifted from in-person camp-outs to online, with eCommerce retailers hosting exciting raffles and limited releases. With hype culture spreading beyond sneakers into whiskey, collectibles, ticketing, and more, social media has made it extremely easy for consumers to discover the next big thing.

EQL recognised the opportunity to better connect eCommerce businesses with real fans through a cloud-based launch platform. Online retailers use this platform to drop hyped goods to eager customers so they have a fair opportunity to purchase goods and products that they’re passionate about. Co-founder and CTO, Patrick Donelan, describes the experience: "When half the internet is sending you traffic all at once, your first challenge is to keep your website up and running so that you can capture all of those entries... You need the right tech in place to meet the demand of high heat launches, purpose-built, battle-tested, scalable cloud infrastructure that eats hype for breakfast... We're always investing in the infrastructure and smarts that allow us to capture and analyse an ever-growing number of entries in real-time."

Popular retailers like Foot Locker, Culture Kings, and Fast Times use EQL's platform to manage the massive traffic spikes that come with hype product launches.

The platform's infrastructure gives retailers the confidence to effectively service bursts of entries in short timeframes. Cutting edge technology around verification and payment capture significantly limits scammers and bots. As a result, retailers are able to transact with real fans smoothly, and their customers love the experience.

The challenge

The efficiency of the processes and tools employed by EQL's Engineering team is crucial to incident management and to ensuring the platform handles the immense volume of traffic during high-demand product launches. As EQL looked to scale internationally, with a growing number of popular retailers jumping on board, Midnyte City was engaged to assist with an Observability initiative designed to:

  • Investigate and implement a suitable monitoring and logging solution

  • Increase engineer productivity

  • Improve speed and efficiency of incident resolution (MTTR - Mean Time To Recovery)

We also expected the Observability engagement to help EQL achieve greater visibility into their cloud platform, leading to:

  • An increase in confidence in the performance of their environment

  • Better identification of bottlenecks or negative impacts to engineers' experience

  • Improved visualisation and unification of data to expedite incident resolution

Solution

To get started, Midnyte City ran workshops with key stakeholders and the Engineering team to understand EQL's code base, gather requirements, identify constraints and agree on trade-offs. This provided the basis to devise a project plan and set milestones with EQL's input and feedback.

The next phase of the engagement involved researching and identifying suitable options for the observability tool. Vendors were engaged and various tools were mapped for comparison against EQL's goals and objectives. EQL wanted an observability tool that was fast, easy to use, collaborative, and informative, and that could collect, process, and correlate data from across their entire stack in one place. Our showcase included observability tool recommendations, a proposal for sending the telemetry data to the new tool, and a high-level implementation and capability building plan.

Midnyte City encourages collaboration and co-sourced delivery during our engagements. This approach ensures any implemented solution becomes sustainably integrated with the client's existing teams, technology, and processes. Most importantly, the solution is owned, managed, and evolved by permanent teams long after Midnyte City has finished up.

Once the implementation was in train, the focus shifted to creating dashboards to visually track key events across EQL's systems, apps, and services. Platform observability features like actionable alerts and threat detection rules were put to use, and the performance of 3rd party API calls and the number of entries per second could be tracked. This provided engaging and useful insight and metrics for the Leadership group and helped both the Engineering and Operations teams to quickly determine whether something had gone wrong and, if so, the size and scale of the impact.

To increase awareness and adoption of the new tool, Midnyte City planned and delivered a range of pairing sessions, workshops, showcases and brown bags around the observability platform with the Engineering team. These sessions focused on how to get started and ways the team could maximise the value of the new insights, dashboards, and functionality. 

Results and benefits

Greater platform visibility
Not only can EQL see into the health and performance of each layer in their cloud environment, they can also see what's happening in real time on the frontend of their platform. They're able to track key user behaviour when customers are participating in a product launch on a retailer's website, as well as determine whether changes made by engineers in the system backend did in fact result in improvements for users. With the new observability tool, frontend issues that were previously difficult to replicate in development or staging environments can now be troubleshot quickly. EQL's engineers can also validate hypotheses with minimal effort. For example, the business wanted to tweak its analysis of suspected malicious activity related to a specific behaviour on the frontend. We were able to quickly put together a dashboard to help assess the quality of the suggested approach before EQL's engineers spent any effort on building the solution.

Review performance of 3rd party integrations
With code instrumentation implemented as part of the observability tool, the performance of 3rd party API calls can now be seen at the code level. This helps engineers quickly pinpoint latency issues, or where in the code there might be a problem, without pushing new code changes to track the occurrence. With this awareness, engineers can monitor a problem over time to determine whether it has a growing and significant impact on users. With more information on how the 3rd party integrations are performing, EQL can better evaluate SLAs with the vendors and keep them on track so that their platform continues to deliver a great user experience.
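
As a rough illustration of the kind of code instrumentation described above, the Python sketch below wraps a 3rd party call in a timer and summarises the recorded latencies. It is not EQL's implementation or the API of any particular observability tool. The names `timed_integration`, `LATENCIES`, `p95`, and `capture_payment`, along with the vendor label `payment_provider`, are hypothetical, and a real setup would send the measurements to the observability tool rather than keep them in memory.

```python
import time
from collections import defaultdict
from functools import wraps

# Hypothetical in-memory store: integration name -> list of call durations (seconds).
LATENCIES = defaultdict(list)


def timed_integration(name):
    """Decorator that records how long each call to a 3rd party integration takes."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the duration even when the wrapped call raises.
                LATENCIES[name].append(time.perf_counter() - start)
        return wrapper
    return decorator


def p95(samples):
    """Return the 95th percentile (nearest-rank method) of a non-empty list."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceiling of 0.95 * n
    return ordered[rank - 1]


@timed_integration("payment_provider")  # hypothetical vendor label
def capture_payment(amount_cents):
    time.sleep(0.01)  # stand-in for a real network call to the vendor
    return {"status": "ok", "amount_cents": amount_cents}
```

Comparing a figure such as `p95(LATENCIES["payment_provider"])` against the latency agreed with a vendor is one simple way to ground an SLA conversation in measured data.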

Improved proactivity in spotting errors
We created dashboards with the new observability tool, so problems such as launch mechanic exploitation by bots can be spotted immediately and fixed before customers are impacted. The dashboards allow EQL's engineers to determine the magnitude of an issue by visually showcasing the impact it has created. Addressing customer support queries has also improved with the Support team directly accessing the dashboards to see what happened rather than involving the Engineering team to investigate for them.

Faster at finding the root cause
Engineers' search capabilities have increased across multiple logs, as they now have an easy-to-use interface that helps them gather more information about the events in EQL's cloud environment. The new observability tool includes queries and dashboards that help engineers start their investigation immediately, providing more context around incidents or errors so they can respond correctly. Alerts have been configured with greater logic to help engineers with incident management. For example, to reduce noise, a notification is sent only if traffic reaches a certain volume. This is particularly beneficial when alerts come through outside of business hours.
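
The volume-gated alert mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the rule EQL configured: the class `AlertRule`, its fields, and the pairing of a traffic floor with an error-rate threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AlertRule:
    """Hypothetical alert rule that stays silent when traffic is low."""
    error_rate_threshold: float  # fraction of failed requests treated as unhealthy
    min_requests: int            # traffic floor below which no notification is sent

    def should_notify(self, requests: int, errors: int) -> bool:
        # Below the traffic floor a handful of errors is treated as noise.
        if requests == 0 or requests < self.min_requests:
            return False
        return errors / requests >= self.error_rate_threshold


rule = AlertRule(error_rate_threshold=0.05, min_requests=1000)
quiet_night = rule.should_notify(requests=50, errors=25)       # low traffic: suppressed
launch_spike = rule.should_notify(requests=2000, errors=150)   # high traffic, 7.5% errors
```

With these assumed numbers, a 50% error rate on 50 requests does not page anyone, while a 7.5% error rate on 2,000 requests does.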

Real-time performance data
The observability tool provides valuable data that stakeholders beyond the Engineering and Support teams now use to facilitate decision making. Being able to see the platform's performance in real time gives EQL in-depth insight into where and how they can make their platform even better for real fans who bring the hype. For example, EQL can now more easily measure entries per second to confirm whether backend changes made by the Engineering team improved both the system's performance and the frontend user experience.
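
To make the entries-per-second metric concrete, here is a minimal Python sketch of a sliding-window rate counter. It only illustrates the idea and is not how EQL or its observability tool computes the figure. The class `EntriesPerSecond` and its methods are hypothetical, and the sketch assumes timestamps (in seconds) are recorded in non-decreasing order.

```python
from collections import deque


class EntriesPerSecond:
    """Hypothetical sliding-window counter for launch entries."""

    def __init__(self, window_seconds: float = 10.0):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> None:
        """Register one entry received at the given time (in seconds)."""
        self._timestamps.append(timestamp)

    def rate(self, now: float) -> float:
        """Average entries per second over the window ending at `now`."""
        cutoff = now - self.window_seconds
        # Discard entries that have fallen out of the window.
        while self._timestamps and self._timestamps[0] < cutoff:
            self._timestamps.popleft()
        return len(self._timestamps) / self.window_seconds


meter = EntriesPerSecond(window_seconds=10)
for second in range(10):      # seconds 0..9 of a simulated launch
    for _ in range(3):        # three entries arrive each second
        meter.record(second)
```

Reading `meter.rate(now=...)` before and after a backend change, under comparable traffic, gives a simple before-and-after comparison of throughput.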

Final Words

“Midnyte City has been instrumental in building the foundation for observability of our hype commerce platform. We now have extremely valuable insights into the real-time performance of our system, frontend user behaviour, and integrations with 3rd party vendors. This supports EQL in continuing to meet the demand of high heat launches as we're rapidly scaling and always improving our launch platform. Midnyte City took the time to deeply understand our requirements and code base to sustainably implement an observability tool. The leadership they displayed through one-on-one pairing and knowledge sharing helped our engineers adapt to the new tool with ease and enthusiasm. With end-to-end visibility, we have a richer and holistic understanding of our launch platform.”  

Alex Barreto
Head of Engineering at EQL
