Over the last few weeks we’ve looked at cloud computing and microservices, and some of the technologies that support them.
So what have fallacies got to do with microservices? Cloud computing and microservices are distributed applications. When we develop these applications, we need to keep the problems of distributed computing in mind, otherwise we’ll get severely burnt.
What is a Fallacy?
A fallacy is an argument built on faulty reasoning: an incorrect, invalid or mistaken assumption, idea or belief.
Sometimes people use fallacies intentionally to manipulate or deceive others. In these cases, fallacies are used with the intention to persuade others to follow a course of action that might not be beneficial in the long term. If the fallacy isn’t discovered, then the argument will appear stronger than it really is.
Often people use fallacies unintentionally due to ignorance, careless reasoning or personal biases. Sometimes these are unsubstantiated beliefs argued with a conviction that makes them sound like proven facts.
The Eight Fallacies
The eight fallacies of distributed computing originated at Sun Microsystems: L. Peter Deutsch formulated the first seven around 1994, and James Gosling added the eighth a few years later. They describe the false assumptions that programmers new to distributed applications often make.
Almost everyone makes these assumptions when they first build a distributed application. In the long run, all of them turn out to be incorrect, and they all cause lots of problems and some painful learning experiences along the way.
The fallacies are:
- The network is reliable.
- Latency is zero.
- Bandwidth is infinite.
- The network is secure.
- Topology doesn’t change.
- There is one administrator.
- Transport cost is zero.
- The network is homogeneous.
Fallacy 1: The network is reliable
This is obviously a fallacy. The network is never reliable! Packets drop, connections break and data gets corrupted as it’s sent over the wire. Network outages are caused by hardware and software failures. Routers and switches can fail, stall or restart.
If a software developer assumes that the network is reliable, they probably won't write enough network error-handling code. The application will then either stall or wait indefinitely for a response, consuming memory and other resources. When the network becomes available again, the application might not retry the stalled operations, or might even require a manual restart.
Reliability becomes more complicated if we have to communicate with external business partners. Their side of the connection is not under our control, and anything can go wrong on their side of the network. We must think about all of the possible failure points, and find ways of reducing their effect or eliminating them altogether.
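One common defence against an unreliable network is to retry failed calls with exponential backoff instead of stalling or failing on the first error. Here is a minimal sketch; `retry` and the simulated `flaky` operation are hypothetical names, not from the original post.

```python
import random
import time

def retry(operation, attempts=4, base_delay=0.1):
    """Call `operation`, retrying on connection errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to the caller
            # Back off 0.1s, 0.2s, 0.4s... plus jitter to avoid thundering herds.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.05))

# Simulate a network that drops the first two calls, then recovers.
calls = {"count": 0}
def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network drop")
    return "ok"

print(retry(flaky))  # prints "ok" after two failed attempts
```

Note that blind retries are only safe for idempotent operations; a real system would also cap the total retry budget.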
Fallacy 2: Latency is zero
Latency is a measure of the time it takes to move data from one computer to another. Latency can be low on a local area network (LAN), but it grows when we send data over a wide area network (WAN) or the Internet, and routing data through undersea cables or satellites increases it further. We must never assume that data transfer is instantaneous.
If developers don't account for network latency, and for the packet losses and timeouts it can cause, they may allow unbounded data traffic. This greatly increases the number of dropped packets and wastes the already limited bandwidth.
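Because every round trip pays the latency cost, batching requests is one of the simplest mitigations. This back-of-the-envelope sketch (with made-up numbers) shows how much waiting time batching can remove:

```python
def total_latency(requests, round_trip_ms, batch_size=1):
    """Estimated time spent waiting on the network when `requests` calls
    are issued in batches of `batch_size` (illustrative numbers only)."""
    round_trips = -(-requests // batch_size)  # ceiling division
    return round_trips * round_trip_ms

# 100 calls at a 50 ms round-trip time:
print(total_latency(100, 50))                  # one at a time: 5000 ms
print(total_latency(100, 50, batch_size=20))   # batches of 20:  250 ms
```

The arithmetic ignores per-byte transfer time, which is exactly the tension with the next fallacy: bigger batches cost more bandwidth per request.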
Fallacy 3: Bandwidth is infinite
Bandwidth is a measure of the amount of data we can transfer over a period of time. Bandwidth is not infinite. Bandwidth has increased radically over the years, but the amount of data we transfer has increased to match it. Applications such as video streaming services, video conferencing and voice-over-IP (VoIP) take up huge amounts of bandwidth. Applications with rich user interfaces and verbose data formats also need lots of bandwidth.
We need to strike a balance between this fallacy and the previous one. Fewer, larger requests minimize the number of network round trips, while smaller requests minimize bandwidth usage. The right approach is to transfer only the data we actually need: no more, no less. A developer who ignores bandwidth limits will eventually create bottlenecks.
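One practical way to send "only the data we need" is to let the client name the fields it wants and project the response down to those fields. A minimal sketch, with a hypothetical `project` helper and example record:

```python
def project(record, fields):
    """Return only the fields the client asked for (a sparse response)."""
    return {k: record[k] for k in fields if k in record}

# A full record might carry large fields the caller doesn't need.
user = {
    "id": 7,
    "name": "Ada",
    "avatar_blob": "<large binary>",
    "bio": "<long text>",
}

print(project(user, ["id", "name"]))  # {'id': 7, 'name': 'Ada'}
```

This is the same idea behind sparse fieldsets in REST APIs and field selection in GraphQL: the caller states its needs, and the large fields never cross the wire.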
Fallacy 4: The network is secure
Even if we've been living under a rock, we've all heard of malware, phishing, data breaches, password and identity theft, ransomware, denial-of-service attacks and the like. The network is definitely not secure!
Malicious agents are constantly trying to sniff every packet going over the wire and steal as much data as possible. At the very least, we must encrypt all application data transferred across the network.
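In practice, "encrypt all application data" usually means wrapping every client connection in TLS. A minimal sketch using Python's standard `ssl` module; `open_tls_connection` is an illustrative helper, not a prescribed API:

```python
import socket
import ssl

def open_tls_connection(host, port=443, timeout_seconds=5.0):
    """Open a TCP connection wrapped in TLS.

    ssl.create_default_context() loads the system's trusted CA certificates
    and enables both certificate verification and hostname checking, so a
    man-in-the-middle presenting the wrong certificate is rejected.
    """
    context = ssl.create_default_context()
    raw = socket.create_connection((host, port), timeout=timeout_seconds)
    return context.wrap_socket(raw, server_hostname=host)
```

The important point is to use the secure defaults rather than hand-rolling a context: disabling verification "to make it work" silently reintroduces this fallacy.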
We need a layered approach to security. There are different security checks needed at the network, infrastructure and application level. There are many components in a distributed system and each one of them can be a target for malicious attacks. We must follow best practices for secure software design. The list of the top ten vulnerabilities doesn’t change much from year to year, so there’s no excuse for not reviewing our code for common security problems.
Security is difficult and expensive to do properly, but is absolutely essential for the integrity of our system.
Fallacy 5: Topology doesn’t change
Network topology is the physical and logical arrangement of the various devices and connections in a network. We can think of the network as a city, and the topology as the road map through the city.
Network topology changes all the time. Sometimes it changes accidentally when hardware fails and is replaced. Most of the time we change it by design when we add new servers and other hardware. This is inherent when we move to cloud-based computing with containerized applications. We also need to be able to add and remove servers as the workload scales up and down.
When the topology changes it will cause changes in latency and data transfer times. We must monitor these in case they deviate too much from an acceptable level. Our systems should be able to handle changes in topology.
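Handling topology changes usually means resolving service addresses at call time, through DNS or a service registry, rather than hard-coding them. A toy sketch with an in-memory registry (all names and addresses here are invented for illustration):

```python
# A toy service registry: addresses are looked up at call time,
# so the application keeps working when instances come and go.
registry = {"orders": ["10.0.0.5:8080", "10.0.0.6:8080"]}

def resolve(service_name):
    instances = registry.get(service_name, [])
    if not instances:
        raise LookupError(f"no healthy instance of {service_name}")
    return instances[0]  # a real resolver would load-balance and health-check

print(resolve("orders"))  # 10.0.0.5:8080

# Later the topology changes: old instances are drained, a new one appears.
registry["orders"] = ["10.0.0.7:8080"]
print(resolve("orders"))  # 10.0.0.7:8080
```

Real systems delegate this to DNS, a service mesh, or a registry such as Consul or etcd, but the principle is the same: never cache an address as if it were permanent.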
Fallacy 6: There is one administrator
In the past, there was usually one person responsible for maintaining the computing environment, and installing and upgrading applications. This has changed with the move to DevOps and cloud computing.
Cloud-based applications consist of many services which work together, but have been developed by different teams. It’s almost impossible for a single person to know and understand the whole application, let alone try to fix all the problems.
One problem with multiple administrators is that they can set up conflicting policies, which can lower data throughput.
We should develop procedures that make it easy to fix any issues that crop up. These procedures should cover areas such as version control, release management, logging, metrics analysis, monitoring and the like.
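For the logging part of those procedures, structured (machine-parseable) log lines make it possible for any of several administrators, or a log aggregator, to trace a request across services. A minimal sketch; `log_event` and the field names are hypothetical:

```python
import json
import logging

logging.basicConfig(level=logging.INFO)

def log_event(service, event, **fields):
    """Emit one JSON log line; every service uses the same shape,
    so the entries can be correlated no matter who operates which part."""
    entry = {"service": service, "event": event, **fields}
    logging.getLogger(service).info(json.dumps(entry))
    return entry

log_event("billing", "invoice_created", invoice_id=42, amount_cents=1999)
```

A shared correlation ID field (passed along with each request) is the usual next step, letting one logical operation be followed across every service it touches.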
Fallacy 7: Transport cost is zero
This fallacy is related to the second fallacy (latency is zero). Transferring data over the network has a cost, both in terms of time and resources. The second fallacy addresses the time taken to transfer data. This fallacy addresses the cost of data transfer.
There is always a cost of hardware, software and maintenance when running a distributed system. In a corporate environment, this cost tends to be hidden behind capital expenses, maintenance projects and general IT. However, if we use a cloud service provider such as Amazon AWS, Microsoft Azure, Google Cloud, Alibaba Cloud, etc., then there is a real data transfer cost. The cost is very small for a minimal application, but becomes significant when an application operates at scale. We must be aware of these costs, and monitor the application usage carefully.
Object serialization and deserialization are used extensively in both SOAP and REST web services, and both operations are expensive in terms of performance. We need to know how much serialization and deserialization our application does, and we may need to optimise or restructure our code if resource consumption is high.
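Before optimising, it pays to measure what serialization actually costs. A quick sketch using the standard library (the payload shape and sizes are made up for illustration):

```python
import json
import timeit

# A hypothetical response payload: 1000 small records.
payload = {"items": [{"id": i, "name": f"item-{i}"} for i in range(1000)]}

# Time 100 serializations; on a hot path this cost is paid per request.
seconds = timeit.timeit(lambda: json.dumps(payload), number=100)
print(f"100 serializations took {seconds:.3f}s")

# Round-trip check: deserialization must reproduce the original payload.
assert json.loads(json.dumps(payload)) == payload
```

If the measured cost is significant, options include caching serialized responses, trimming the payload, or switching to a faster or more compact format.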
Fallacy 8: The network is homogeneous
A homogeneous network is a network of computers with similar configurations, all using the same communication protocol. But networks aren’t homogeneous: they are wildly heterogeneous. The computer hardware will be different; the network configurations will be different; even the protocols will differ. There will be a wide variety of different mobile devices connecting to our application.
We need to choose standard protocols so that the different components can communicate, regardless of the hardware. This will also mean using common file and data transfer formats, like XML or JSON.
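As a small illustration of a platform-neutral format, the record below is produced in Python but can be consumed by any component that speaks JSON, whatever its hardware or language (the order fields are invented for the example):

```python
import json

order = {"order_id": "A-1001", "quantity": 3, "unit_price": 9.99}

# sort_keys gives a stable, predictable encoding across producers.
wire = json.dumps(order, sort_keys=True)
print(wire)  # {"order_id": "A-1001", "quantity": 3, "unit_price": 9.99}
```

The receiving side, in Java, Go, JavaScript or anything else, parses the same bytes back into its own native structures, which is exactly what makes the format a bridge across a heterogeneous network.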
Further Reading and Conclusion
Wikipedia has a short article on the fallacies of distributed computing, and there are several more detailed but still easy-to-read write-ups available online.
In this post, we learned about the fallacies of distributed computing and how to avoid them. Being aware of these issues will help us design more robust, secure and better-performing distributed systems.
Don’t forget to share your comments and experiences.