Scalability and HA requirements for distributed systems
- Read Fallacies of distributed computing
- Read Fallacies of distributed computing again!
- Guaranteed delivery is difficult.
- Handle duplicates and use at-least-once semantics.
- XA transactions don't work.
- Build for at-least-once semantics.
- Use retries and tolerate duplicates, instead of transactions, to meet consistency requirements.
- High Availability is mandatory for all services.
- Automated deployment and upgrades are mandatory for all services.
- Optimize for writing, handle read performance with caching.
- Persistence is expensive. Optimize for it early.
- Disks are slow.
- Minimize the number of I/O operations.
- When disks become too slow, consider technology that can guarantee not to lose data without persisting to disk.
- For example, Coherence.
- Latency requirements will probably vary for different services. Design for it!
- Reduce coupling between services by defaulting to asynchronous integration.
- There will be many consumers of the data.
- Data will be needed in different forms.
- Polyglot persistence
- Not only SQL!
- Duplicating data is (usually) not a problem.
- Consumers can get their own copy
- Duplicating data is especially relevant when consumers have separate bounded contexts.
- Producers, especially hardware sensors (or hardware systems), will behave badly!
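
The advice above on at-least-once semantics, retries, and duplicate handling can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the names `IdempotentConsumer`, `LossyAckChannel`, and `send_at_least_once` are hypothetical, the "network" is simulated in-process, and a real sender would add backoff between attempts and a durable store for seen message IDs.

```python
class IdempotentConsumer:
    """Hypothetical consumer that ignores messages it has already processed."""

    def __init__(self):
        self.seen = set()  # assumption: in-memory; a real service would persist this
        self.total = 0

    def handle(self, msg_id, amount):
        if msg_id in self.seen:
            return False  # duplicate delivery: no side effect
        self.seen.add(msg_id)
        self.total += amount
        return True


class LossyAckChannel:
    """Simulated transport: always delivers, but drops the first `lost_acks` acks."""

    def __init__(self, consumer, lost_acks):
        self.consumer = consumer
        self.lost_acks = lost_acks
        self.deliveries = 0

    def send(self, msg_id, amount):
        self.consumer.handle(msg_id, amount)
        self.deliveries += 1
        if self.lost_acks > 0:
            self.lost_acks -= 1
            # The message arrived, but the sender cannot know that.
            raise TimeoutError("ack lost")


def send_at_least_once(channel, msg_id, amount, max_attempts=5):
    """Retry until an ack arrives; returns the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        try:
            channel.send(msg_id, amount)
            return attempt
        except TimeoutError:
            continue  # unknown outcome, so send again with the same msg_id
    raise RuntimeError(f"no ack for {msg_id} after {max_attempts} attempts")


consumer = IdempotentConsumer()
channel = LossyAckChannel(consumer, lost_acks=2)
attempts = send_at_least_once(channel, "m-1", 10)
print(attempts, channel.deliveries, consumer.total)
```

The sender reuses the same message ID on every retry, so the consumer can apply the effect once even though the message is delivered several times. No transaction spans the two sides.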