Val is a skilled backend engineer who is capable of carrying an entire initiative on his shoulders end-to-end if needed. He is never afraid to roll up his sleeves and dive deep into a problem. Whether we are talking about hunting down a pesky bug, designing an integration with a 3rd party API provider or building our own API - the "I can do it" attitude is something we can all learn from Val 💪
To clearly observe a system's operation and execution, we use a monitoring and observability approach that generates data useful for debugging. It also helps us monitor app performance from the users’ perspective.
The approach relies on three major components:
We need to constantly monitor the health and performance of the product that our customers are using.
Monitoring is a key practice for keeping engineering, ops and product groups aligned on the health and performance of the app.
There are 2 key pillars that will help us gain visibility into the health of a system:
The following techniques will help us perform forensic analysis:
Our goal is to understand what’s going on with a customer while they are using the app. Did they encounter an error? Did everything work as expected? Was the app responsive?
To achieve this level of visibility, we need to monitor all the components involved in a specific user action: the frontend, the backend, any internal services and, most importantly, the integration points with 3rd party APIs.
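As an illustration of tracing one user action across components, here is a minimal Python sketch using only the standard library. It is not the setup described in this document: the `monitored` helper, the component and action names, and the record fields are all hypothetical. The idea is that every step of a user action shares one request ID and reports its outcome and duration, so a slow or failing 3rd party call can be tied back to the user action that triggered it.

```python
import json
import logging
import time
import uuid
from contextlib import contextmanager

logger = logging.getLogger("monitoring_sketch")  # hypothetical logger name


def new_request_id():
    """Create one ID that every component handling the same user action shares."""
    return uuid.uuid4().hex


@contextmanager
def monitored(component, action, request_id, records=None):
    """Time one step of a user action and log its outcome as a JSON line.

    `records` is an optional list that collects the emitted records,
    which is convenient for tests; a real setup would ship them to a tool.
    """
    record = {
        "request_id": request_id,
        "component": component,
        "action": action,
        "outcome": "ok",
    }
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        # Record the failure, then re-raise so callers still see the error.
        record["outcome"] = "error"
        record["error"] = type(exc).__name__
        raise
    finally:
        record["duration_ms"] = round((time.perf_counter() - start) * 1000, 2)
        logger.info(json.dumps(record))
        if records is not None:
            records.append(record)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    request_id = new_request_id()
    with monitored("backend", "checkout", request_id):
        # Hypothetical 3rd party call, wrapped with the same request ID.
        with monitored("third_party_api", "charge_card", request_id):
            time.sleep(0.01)
```

Searching the logs for a single `request_id` then answers the three questions above for that user action: whether an error occurred, which component raised it, and how long each step took.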
When it comes to tools, there's a vast list of options to help you solve a specific problem. Here I will cover a few tools we have experience with; however, there are plenty more in each category.
Used ONLY for error tracking.
Was chosen because of:
Q: Is it possible to converge on a single tool in the future - perhaps Splunk? A: Yes and no. Yes, because it is technically possible to track MOST of the metrics in Splunk and build corresponding dashboards and queries that give a similar level of visibility. No, because building these dashboards would require substantial effort, time and SKILL. So when deciding between Build vs Buy, we lean towards Buy and only build what we can’t buy.
Q: Are we using any form of alerts from these tools? A: Yes. Rollbar sends deploy notifications to a dedicated Slack channel. It also sends messages about production errors to dedicated Slack channels.