The Platform team at Hootsuite builds and manages microservices, and we obsess over response times. We write all of our code with that in mind, so that users get the best possible experience. To achieve this, we rely on many cool open source projects such as Scala and Akka, and we do our best to give back to those communities whenever we can. In keeping with that, we are happy to announce that we are open sourcing our Scala circuit breaker.
Circuit Breakers, Responsiveness, and Complexity
So, what is a circuit breaker? In the context of a service, a circuit breaker deals with situations in which one of the service’s dependencies (a DB, another service, a web site, a toaster, etc.) stops working as expected and slows down interactions with end users. The circuit breaker mitigates the problem by detecting slow or failed dependencies and stopping further requests to them. This fail-fast behaviour tells users quickly that there is a problem, and it reduces the resources spent talking to the broken dependency. Once tripped, the circuit breaker should not stay tripped, or the dependency would be quarantined forever. To avoid that, the circuit breaker can periodically retry the dependency and/or wait for a predefined amount of time before checking whether the dependency is available again. Without the circuit breaker, the server would keep trying to use the unavailable dependency, and users would see pages timing out or returning an error after many seconds.
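The behaviour described above can be sketched as a small state machine. This is a minimal illustration and not the Hootsuite or Akka implementation: `SimpleBreaker`, `CircuitOpenException`, and the injected `now` clock are hypothetical names, calls are synchronous, and the lock is held during the call to keep the sketch short.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical names throughout; this is not the Hootsuite or Akka API.
sealed trait State
case object Closed extends State   // requests flow through
case object Open extends State     // requests fail fast
case object HalfOpen extends State // one trial request is allowed

final class CircuitOpenException extends RuntimeException("circuit breaker is open")

// `now` is injected so the reset timeout can be exercised without sleeping.
final class SimpleBreaker(maxFailures: Int, resetMillis: Long, now: () => Long) {
  private var state: State = Closed
  private var failures = 0
  private var openedAt = 0L

  def currentState: State = synchronized {
    // An open breaker becomes half-open once the reset timeout has elapsed.
    if (state == Open && now() - openedAt >= resetMillis) state = HalfOpen
    state
  }

  def call[T](body: => T): Try[T] = synchronized {
    currentState match {
      case Open =>
        Failure(new CircuitOpenException) // fail fast; the dependency is not touched
      case _ =>
        val result = Try(body)
        result match {
          case Success(_) =>
            failures = 0
            state = Closed
          case Failure(_) =>
            failures += 1
            // A failed trial call in the half-open state re-opens immediately.
            if (state == HalfOpen || failures >= maxFailures) {
              state = Open
              openedAt = now()
            }
        }
        result
    }
  }
}
```

With `new SimpleBreaker(maxFailures = 2, resetMillis = 1000L, now = () => System.currentTimeMillis())`, two consecutive failing calls open the breaker, later calls return a failed `Try` without running their body, and the first call after one second acts as the trial request.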
There are many circuit breakers available, such as the one you get out of the box with Akka. Unsurprisingly, the Akka implementation does exactly what you need a circuit breaker to do: it trips (opens) after a predefined number of consecutive failures and attempts to reset after a predefined amount of time. A call counts as failed when it throws an exception or exceeds a configurable timeout.
Hootsuite Scala Circuit Breaker
Since we are familiar with Akka and use it a lot, we looked into using Akka’s circuit breaker but ultimately decided to write our own in Scala. When thinking about circuit breakers in general, we wanted to define exactly what we consider a “failed request”. Our first attempt came up with the following list:
- A 500 response from a dependency
- A timed out request
- A failed Scala future that contains a Throwable
But we quickly identified the following edge cases:
- A request that normally takes longer than the timeout
- Custom exceptions in our Scala SDK, such as BadRequest or Forbidden, that extend Throwable
- Custom exceptions in our Scala service, such as FacebookBadRequest, that wrap a 4xx response from Facebook
- A valid response from Twitter that contains thousands of data objects
The timeout on a request is at our discretion, so the first edge case is easy to solve. As for the “good” exceptions and “bad” valid responses, we can keep a whitelist and a blacklist to check against when deciding whether to trip the circuit breaker. These are real scenarios we ran into, and they ultimately drove the decision to write our own circuit breaker. On top of these configurations, we added hooks that fire when the circuit breaker changes state and when it is invoked. They allow extensible actions, which in our case we use to log and monitor the system.
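To make the whitelist, blacklist, and hooks more concrete, here is a rough sketch of how such a policy could look. It is an illustration only and does not reflect the actual API of our library: `FailurePolicy`, `FailureTracker`, `Response`, and the hook names are all hypothetical, `BadRequest` is a stand-in for the SDK exception of the same name, and resetting a tripped breaker is left out.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical stand-ins for the SDK exceptions and responses mentioned above.
final class BadRequest(msg: String) extends Exception(msg)
final case class Response(status: Int, body: String)

// Decides which outcomes count against the dependency.
final case class FailurePolicy[T](
  isIgnoredException: Throwable => Boolean, // "good" exceptions (whitelist)
  isBadResult: T => Boolean                 // "bad" valid responses (blacklist)
) {
  def isFailure(outcome: Try[T]): Boolean = outcome match {
    case Failure(e) => !isIgnoredException(e)
    case Success(v) => isBadResult(v)
  }
}

// Counts consecutive failures under a policy and fires hooks,
// which could be used for logging and monitoring.
final class FailureTracker[T](
  policy: FailurePolicy[T],
  maxFailures: Int,
  onInvoked: Try[T] => Unit, // fires on every recorded call
  onTripped: () => Unit      // fires once, when the threshold is reached
) {
  private var consecutive = 0
  private var isTripped = false

  def tripped: Boolean = synchronized(isTripped)

  def record(outcome: Try[T]): Unit = synchronized {
    onInvoked(outcome)
    if (policy.isFailure(outcome)) {
      consecutive += 1
      if (!isTripped && consecutive >= maxFailures) {
        isTripped = true
        onTripped() // state-change hook
      }
    } else consecutive = 0
  }
}
```

A policy built with `isIgnoredException = { case _: BadRequest => true; case _ => false }` and `isBadResult = _.status >= 500` would let a 4xx-style exception pass without counting against the dependency, while a 500 response returned as a normal value would count as a failure.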
Real World Example
Hootsuite helps customers post content to different social media networks. If one of our posting requests to a social network is slow or unresponsive, it should not affect our customer’s ability to post to their other social media networks. We rely heavily on our circuit breaker to detect such an issue, fail fast, and attend to the problem.
Another use case for the circuit breaker is our session service, which is backed by ElastiCache and Redis. In rare cases, a primary Redis node can fail and trigger automatic failover to a read replica, if one exists. If the service caches DNS, this can leave it in a readable (but not writable) state. Our solution, besides fixing the DNS caching, was to use separate circuit breakers for reads and writes. That way only write requests fail while read requests still go through, so users who are already logged in are not affected.
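The read/write split can be illustrated with a deliberately tiny sketch. Everything here is hypothetical (`TinyBreaker`, `SessionStore`, the threshold of 3), the breaker has no recovery logic, and the `read` and `write` functions stand in for a real Redis client.

```scala
import scala.util.{Failure, Try}

// Hypothetical, deliberately tiny breaker: opens after `maxFailures`
// consecutive failures. Recovery (half-open/reset) is omitted to keep
// the focus on isolating reads from writes.
final class TinyBreaker(maxFailures: Int) {
  private var failures = 0

  def isOpen: Boolean = synchronized(failures >= maxFailures)

  def call[T](body: => T): Try[T] =
    if (isOpen) Failure(new IllegalStateException("circuit open"))
    else {
      val result = Try(body)
      synchronized { if (result.isFailure) failures += 1 else failures = 0 }
      result
    }
}

// Hypothetical session store wrapper: reads and writes trip independently,
// so a write outage does not block session lookups.
final class SessionStore(read: String => String, write: (String, String) => Unit) {
  private val readBreaker = new TinyBreaker(maxFailures = 3)
  private val writeBreaker = new TinyBreaker(maxFailures = 3)

  def get(id: String): Try[String] = readBreaker.call(read(id))
  def put(id: String, value: String): Try[Unit] = writeBreaker.call(write(id, value))
}
```

If every `put` fails because the node is read-only, the write breaker opens after three attempts and later writes fail fast, while `get` keeps going through the untouched read breaker.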
Give it a Try!
Hootsuite Scala Circuit Breaker is available on GitHub. Instructions and a sample project are available in the same repository. We hope our Scala Circuit Breaker will be helpful for your project, and we invite you to reach out to us if you have any comments or questions.
Andres Rama is a Co-op Software Developer on the Build and Deploy Team at Hootsuite. He works on creating and maintaining tools to help deploy code to Hootsuite’s customers. He also plays Counter Strike and reads articles on the internet. Follow him on GitHub.
Steve Song is a Software Developer on the Platform Team at Hootsuite. When not working on writing distributed and scalable microservices in Scala, he likes to explore beautiful Vancouver, read books, and discover technology. Follow Steve on Twitter @ssongvan.