Distributed Computing
What can we do, when a single computer system cannot cope with the load of data to be processed, especially when it comes to response times?
How can we build scalable applications which support both up- and down scaling?
The answer is simple: Distribute the load on a cluster of hosts, on-premise or on a system provided by a cloud provider.
The implementation on the other hand might not be that simple, as this approach creates additional challenges: How to manage distributed data (state machines, caches), deal with split brain scenarios, up- and down-scaling etc.
Characteristics of distributed systems
- Distributed data: Clustering, sharding
- Distributed computing: Process streams in parallel
- Optimized for low latency, high speed messaging
- Distributed state machines
- Distributed caching
- Upscaling, downscaling according to system load
- Locking in a cluster, cluster singletons
- Asynchronous computing
Asynchronous computing (“futures”) is not limited to distributed systems, but it fits naturally when a message driven approach in a cluster is used.
The actor model
One possibility to build distributed systems is to use the Actor model
It provides a framework to deal with sending messages, state management and locking.
Instead of having a microservice which is backed (and limited) by a database, the actor model scales down to the instance level: For example, a customer or a bank account is represented by an actor:
“Everything is an actor” as well as “one actor is no actor, they come in systems” - Eric Meijer Carl Hewitt
This quote is from an introduction to Akka, and Akka is one implementation of the actor model.
Sharding guarantees that there is only one actor to represent an entity in the cluster, and this provides a means to implement caches.
Stream processing
Stream processing can be implemented using Akka streams
Another possibility to deal with distributed systems is to use a framework like Kafka Streams
Kafka uses messaging and durable topics to implement data flows, branching, transformations etc.
Projects
Akka
- Reporting front end for sustainability reporting
- Distributed fare query engine
- Manage emergency calls
Kafka
Tools
-
REST APIs
-
Cassandra
-
Infinispan
Programming languages
- Scala, Java