Blockchain offers a glimpse of a future computing world built on a trustless, highly secure consensus mechanism. The important qualities of these distributed systems are:
- Security designed in, able to handle bad/corrupt nodes
- Increased resilience to outages, able to recover and repair
- Highly distributed, capable of running in the cloud or on an edge device
Looking back at history, it feels like there are large 20-year epochs in which computing is either centralizing or decentralizing. For the last 20 years we have been centralizing our computing infrastructure, so it is logical to think the next 20 years might bring a shift to decentralized computing. Take, for example, Web3.0, which is a shift away from centralized computing toward decentralized computing.
- 1960 - 1980 - Centralization: working on the big mainframe
- 1980 - 2000 - Decentralization: the rise of the desktop
- 2000 - 2020 - Centralization: data centers and the rise of big data repositories
- 2020 - 2040 - Decentralization??
If you are working in technology, it may not feel like centralization. Working on sharding data or on multi-region computing to support your organization feels like decentralization. I would argue this work is really an effort to scale centralized services to higher levels of capacity and utilization. Take a step back and think about the systems you build; they typically have:
- Synchronous Messages, because the order of updates is very important
- Single Master or Single Leader to route work, order updates, and manage work
In addition, these systems aren't very robust to high latency or security exploits; the sketch after the next list makes this centralized shape concrete.
- Require Low Latency. Systems fall apart under high latency, as nodes lose track of state
- Assume Trust. Systems fall apart if one node is exploited.
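To make that shape concrete, here is a minimal sketch. The `Leader` and `Follower` classes are purely illustrative, not any real system: a single node decides the order of every update, and each write waits synchronously on every follower.

```python
# Minimal sketch of the centralized shape: one leader orders every update and
# waits synchronously for all followers before a write can succeed.
class Follower:
    def __init__(self):
        self.log = []

    def apply(self, seq, update):
        self.log.append((seq, update))   # the follower trusts the leader completely
        return True                      # a slow or missing ack stalls the whole system

class Leader:
    def __init__(self, followers):
        self.followers = followers
        self.seq = 0

    def write(self, update):
        self.seq += 1                    # the one place where ordering is decided
        for follower in self.followers:  # synchronous: wait on every follower in turn
            if not follower.apply(self.seq, update):
                raise RuntimeError("replication failed; the write cannot complete")
        return self.seq

leader = Leader([Follower(), Follower()])
leader.write("balance += 10")            # one leader, one ordering, full trust
```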
Single Masters and Synchronous Message Passing, combined with the assumption of trust and a stable, low-latency network, are the hallmarks of centralized computing. Today we spend a great deal of time and effort working on the control plane of our systems to work around these limitations. By control plane I mean routing of requests, allocation of resources, deallocation of resources, secure handoffs, logging, and monitoring. We need to make sure sharded requests are routed to the correct endpoint. We need to grant permissions and access controls so new applications may be spun up. We write code to handle a surge of traffic or an unresponsive service. We spend time repairing data sets and queues that go out of sync. This is all custom work, and this is all control plane work.
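As an illustration of how much of this is hand-written glue, here is a hypothetical sketch of shard routing with retry. The shard map, endpoints, and `call_endpoint` helper are assumptions for the example, not any particular framework's API.

```python
import hashlib
import time

# Hypothetical shard map: which endpoint owns which slice of the key space.
SHARD_MAP = {0: "https://shard-0.internal", 1: "https://shard-1.internal"}

def call_endpoint(endpoint: str, payload: dict) -> dict:
    """Stand-in for a real RPC call; a production version would hit the network."""
    return {"endpoint": endpoint, "ok": True}

def pick_shard(key: str) -> str:
    """Route the request to the endpoint that owns this key."""
    shard_id = int(hashlib.sha256(key.encode()).hexdigest(), 16) % len(SHARD_MAP)
    return SHARD_MAP[shard_id]

def call_with_retry(key: str, payload: dict, attempts: int = 3) -> dict:
    """Custom control plane glue: retry an unresponsive service with backoff."""
    endpoint = pick_shard(key)
    for attempt in range(attempts):
        try:
            return call_endpoint(endpoint, payload)
        except TimeoutError:
            time.sleep(2 ** attempt)     # back off and try the same shard again
    raise RuntimeError(f"{endpoint} unresponsive after {attempts} attempts")
```

None of this logic is part of the application itself; it exists only to keep a centralized design running at scale.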
Contrast that with a distributed system. Think about your average blockchain. Transactions in a blockchain wait in a pool where a miner will pick them up and try to create a new block. Once a new block is created it may be added to the blockchain. At every step along the way there are multiple competing agents involved.
- Miners compete with each other to form new blocks
- Many nodes contain the same state
- Nodes talk to each other to validate blocks
- A bad/corrupt node is excluded until it aligns with the majority
- Nodes will automatically re-synchronize with the majority
In the blockchain example we can see how the control plane logic is distributed among the miners and the nodes. Take, for example, aligning state with the majority: this control plane logic is built into the system; it isn't coded up after the fact. You can add new records and grow or shrink the set of nodes, and there are no access controls or permissions to add.
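A toy model of that re-synchronization behaviour, not any real blockchain client: a node compares its view of the chain tip with its peers and adopts the majority view when it has drifted.

```python
from collections import Counter

# Toy model: a node's "state" is simply the chain tip it currently believes in.
class Node:
    def __init__(self, tip):
        self.tip = tip

    def resync(self, peers):
        """Align with the majority; no central coordinator tells the node to do this."""
        votes = Counter(peer.tip for peer in peers)
        majority_tip, _ = votes.most_common(1)[0]
        if self.tip != majority_tip:
            self.tip = majority_tip      # a bad or stale node falls back in line

nodes = [Node("block_42"), Node("block_42"), Node("block_41")]   # one node has drifted
nodes[2].resync(nodes[:2])
assert nodes[2].tip == "block_42"
```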
There is still a lot of work to do to support distributed systems. While blockchains give us distributed ledgers, we don't yet have a distributed compute or distributed function platform. Distributed compute would need to support the same qualities as a blockchain: being secure, resilient, and distributed. We could have generally supported secure platforms that allocate compute resources on the fly and run multiple copies of a function, comparing the results. Or we could utilize zero-knowledge proofs that perform a function and return a true/false result.
A distributed compute environment might work in the following way (a rough sketch follows the list below). Note that the data needed by the compute function would need to be packaged along with the function to run.
- Broadcast a request for resources along with the offered compensation
- Organizations return signed responses confirming they can supply the required resources for the given compensation
- Select resources from the organizations or people you trust and exchange keys
- Send over the data/function, run it on n or more providers, and escrow the compensation
- Collect the returned results, compare them, and select the majority answer
- Release compensation, paying out for timely results and taking back compensation for late or bad results
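A minimal sketch of the run-and-compare step, assuming each provider simply returns a value for the same packaged function. The provider names, payment amounts, and escrow bookkeeping are placeholders, not an existing platform API.

```python
from collections import Counter

def run_on_providers(package, providers, payment):
    """Run the same packaged function on n untrusted providers, compare answers,
    and only release compensation to providers that match the majority."""
    escrow = {name: payment for name in providers}                   # hold payment up front
    results = {name: run(package) for name, run in providers.items()}

    majority, _ = Counter(results.values()).most_common(1)[0]
    payouts = {name: (escrow[name] if results[name] == majority else 0)
               for name in providers}                                # reclaim from bad results
    return majority, payouts

# Toy usage: three providers run the packaged function; one returns a bad answer.
package = {"code": "sum(data)", "data": [1, 2, 3]}
providers = {
    "org_a": lambda pkg: sum(pkg["data"]),
    "org_b": lambda pkg: sum(pkg["data"]),
    "org_c": lambda pkg: 0,                                          # corrupt or faulty provider
}
answer, payouts = run_on_providers(package, providers, payment=10)
print(answer, payouts)   # 6 {'org_a': 10, 'org_b': 10, 'org_c': 0}
```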
A zero-knowledge proof would work differently. First, there would need to be domain providers that both hold data and can run functions on that data. The statements asked would always return a true/false answer. Range queries could be asked; for example, is this person's income above $100,000, so they can qualify for this mortgage?
- Request a result from a domain provider. A domain provider would be a trusted service, for example, a holder of income data.
- Exchange keys
- Return an encrypted response with either a true or false answer
In this case you would know the person's income is above the required threshold, but you wouldn't know their exact income level.
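The sketch below models only that interface, using a hypothetical income-holding domain provider: a real zero-knowledge proof would also return a cryptographic proof the requester can verify without trusting the provider, and that part is omitted here.

```python
# Interface sketch only: the provider answers a range query with a bare true/false,
# so the requester never sees the underlying value. A real zero-knowledge proof
# would additionally supply a proof that can be verified without trusting the
# provider; that cryptography is omitted from this toy example.
class IncomeDomainProvider:
    def __init__(self, incomes):
        self._incomes = incomes            # private data that never leaves the provider

    def income_above(self, person_id: str, threshold: int) -> bool:
        return self._incomes[person_id] > threshold

provider = IncomeDomainProvider({"alice": 120_000})
print(provider.income_above("alice", 100_000))   # True: qualifies, exact income stays hidden
```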
Hopefully the next 20 years will bring more secure, resilient, and distributed computing, and I feel Web3 offers a glimpse of how that future will materialize.