Chapter 1. Origins of Blockchain Technology

The term blockchain may sound mysterious or even scary to the uninitiated. Its literal meaning—a chain of blocks of information—is perhaps the simplest way to explain blockchain. But what is it for? Why does anyone need something called a blockchain?

To find the answer we need to look back to an earlier time, closer to the start of the web. The internet is about storage and distribution of information to large numbers of people. Blockchain has a similar goal, and it builds on previous experiments looking for ways to improve that distribution.

Electronic Systems and Trust

Before blockchain, cryptocurrency, or the systems that use them, could ever be a reality, the internet needed to exist in a reliable and distributed manner, and it needed to be used by a lot of people. In its infancy in the 1960s, the internet was a simple, relatively small network, and it was primarily used as a tool for university researchers and the US government to share information digitally.

Over time, early internet pioneers made the system more usable. The biggest impacts came from the development of TCP/IP, which established a standard for communication; HTTP, which enabled web browsing; and SMTP, which delivered electronic mail. These protocols made the internet accessible not just to researchers, but to everyone, and on a growing number of devices, including computers and later tablets and smartphones.

The evolution of the internet has changed life forever—incredibly large amounts of information and services are now available in the palm of anyone’s hand, much of it for free. However, using most online products or services requires a person or entity, known as a third party, to act as a trusted gatekeeper. These systems require two types of trust:

Intermediary trust
A third party is relied on to make rational and fair decisions.
Issuance trust
A third party is relied on to ensure the safety and security of any value.

Financial transactions are one major area where this trust is relied upon, since most money has become digital. For various reasons, the use of fiat paper money, or government-issued physical cash, is on the decline—people today utilize electronic financial tools like debit and credit cards more than ever before. In some countries, such as Sweden, payment systems are almost entirely electronic, with most customers using smartphones and cards at the point of sale. But while for consumers the shift of payment interfaces from physical to digital is a relatively recent trend, the systems powering this accounting have long been electronic. Although cash is still readily available to most, money has largely gone from paper and coins to just numbers in a computer system, without many people even noticing.

When value is moved from physical items to a database, there must be an element of trust among the multiple parties involved. Huge payment companies around the world have been created based on the idea that people storing value digitally can trust these brands. However, trust hasn’t always been a reliable factor in finance. In fact, the 2008 financial crisis gave people pause, and many began to think that perhaps blind trust and faith in financial institutions wasn’t all it was cracked up to be.

Bitcoin was the first working system to use a blockchain. But before Bitcoin came into existence, several predecessors tried—and failed—to create similar concepts. One of the main reasons they failed was the inability to put together a truly distributed system on the internet.

Distributed Versus Centralized Versus Decentralized

The internet today is a mix of centralized and distributed applications, though it was designed as a distributed technology. Rather than building a centralized structure with one point of failure, early internet architects wanted to create a more resilient system. The idea for a distributed internet came from the goal (inspired by the military) of ensuring that if one part of the system were attacked, the rest of a properly distributed network would still be able to operate.

On a bike wheel (see Figure 1-1), many spokes connect to a single hub (the axle). This design facilitates a distributed approach—if some spokes are broken, the wheel can still work. Distributed means that no single point of failure can bring down an entire system, such as the network of computers that powered the early implementations of the internet.

Figure 1-1. A bicycle wheel has a distributed design

The early internet as designed decades ago was distributed to protect the network from any type of disruption, and this system has proven itself to this day. In more recent times, centralized companies such as Google, Facebook, Apple, and Amazon have come to largely dominate the internet. It is the hope of some that blockchain technology’s distributed nature could help to mitigate the dominance of the web by these few powerful companies by giving individual users more control—a topic that will be explored later in this book.

In the field of computing, a distributed system is one where processing is not done solely on one computer. Rather, computation is shared across a number of computing resources. These systems communicate with one another using some form of messaging. Figure 1-2 illustrates a few different network designs. A distributed system has characteristics of decentralization, in that the failure of a single entity (or node) does not mean the failure of the whole network. The common goal is to use processing power to collectively accomplish a task by distributing responsibility across many computers. However, decentralization changes the concept of common goals and messaging. In a fully decentralized system, a given node does not necessarily collaborate with every other node to achieve its objective, and decision-making is done through some form of consensus rather than having this responsibility rest in the hands of a single entity.

Figure 1-2. Centralized, decentralized, and distributed network designs

Figures 1-3 through 1-5 illustrate the differences between centralized, distributed, and decentralized systems in the form of databases that store information.

Figure 1-3. In a centralized database, like PayPal, all nodes connect to a single, central node that is controlled by one entity
Figure 1-4. In a distributed database, like multiple databases hosted on Amazon Web Services (AWS), each node can maintain a replicated copy of the same data, each node knows the identity of other nodes, and all nodes are controlled by one entity
Figure 1-5. In a decentralized database, like Bitcoin’s blockchain, each node can maintain a replicated copy of the same data, each node may not know the identity of other nodes, and all nodes are controlled by many entities who may be anonymous

Bitcoin Predecessors

The internet’s ubiquity has been disruptive and changed many industries. To name just a few examples, over the past few decades Wikipedia has more or less replaced encyclopedias, Craigslist has taken the place of newspaper classified ads, and Google Maps has mostly rendered printed atlases obsolete.

Yet the financial industry was able to resist the internet’s turbulent changes for quite a while. Prior to 2009, when Bitcoin launched, control of money had not changed much outside of the switch for users from analog (physical currency and checkbooks) to digital (electronic banking). Because of this shift the idea of digital money was a familiar concept, but control was still centralized.

Many pre-Bitcoin concepts were tried before ultimately failing for various reasons, but the ultimate goal was always the same: increased financial sovereignty, or better control for users over their money. Looking at a few of the early failures can bring the reasons for Bitcoin’s growing popularity into greater focus.

DigiCash

Founded by David Chaum in 1989, DigiCash was a company that facilitated anonymous digital payments online. Chaum is the inventor of blind signature technology, which proposed using cryptography to protect the privacy of payments online. Cryptography uses encryption-based mathematics to obscure sensitive information and has long been used by governments worldwide as a communications tool. Chapter 2 covers cryptography and encryption in a bit more detail.

The DigiCash platform had its own currency, known as cyberbucks. Users who signed up for the service would receive $100 in cyberbucks, which were often referred to as tokens or coins. The company pioneered secure microchipped smart cards, similar to the system used in most credit cards today. It was also an early innovator in terms of the concept of a digital wallet for storing value—in this case, cyberbucks.

DigiCash systems were trialed by a few banks, including Deutsche Bank. A handful of merchants also signed up to accept cyberbucks, including the book publisher Encyclopaedia Britannica. In the 1990s commerce on the internet was very new, and because of concerns about fraud, most people were hesitant to even use credit cards on the web, much less adopt an entirely new type of payment system. However, many privacy-conscious users did begin using cyberbucks and even developed a mailing-list marketplace that was in operation for some time. The platform was never able to achieve traction due to a lack of merchants, though, and DigiCash ultimately filed for bankruptcy in 1998.

E-Gold

A digital store of value established in 1996, E-gold was backed by real units of precious metal. Operated by a company called Gold & Silver Reserve, E-gold enabled instant transfers between its users on the internet. Everything on the platform was denominated in units of gold or other precious metals. By 2006 there were over 3.5 million E-gold accounts. At that time, the company was processing $5.9 million in daily volume.

With denominations as small as one ten-thousandth of a gram of gold, the platform was the first to introduce the concept of making micropayments, or transferring tiny amounts of value, on the internet. Innovative for the time, E-gold also offered developers an API that allowed others to create additional services on top of the platform. Merchants accepted E-gold as a form of payment alongside credit cards in online shopping carts. Support for mobile payments was introduced in 1999.

E-gold was technologically ingenious in the context of its features during the 1990s and early 2000s. However, the system was plagued with problems from the outset, which ultimately led to its demise. A centralized system, it had no mechanism to tie accounts to anyone’s identity. As such, the platform was used for nefarious purposes, facilitating money laundering, online scams, and other illegal activity. The US government shut down E-gold in 2008, seizing its assets and establishing a system of redemption for account holders.

Hashcash

Invented by Adam Back in 1997, Hashcash introduced the idea of using proof-of-work to verify the validity of digital funds, including the concept of money that exists solely on the internet. Proof-of-work means that computers need to produce some kind of verifiable, computation-intensive output for electronic money to have any value (Chapter 2 explains this in more detail). Hashcash used cryptography to enable proof-of-work, and Back proposed using an algorithm called SHA1 in order to accomplish this.

In his initial proposal for Hashcash, Back referenced DigiCash and raised the idea that adding a fee or “postage” on emails with digitized currency could reduce spam. By utilizing a hash, or a function requiring computer processing, Hashcash would impose an economic cost, which would limit spam in email systems. For digital currency, the concept of using hashes would solve what’s called the double spend problem, in which a digital unit can be copied like a file and thus spent more than once. Computers, after all, make it easy to duplicate files; anyone can copy an image file and reproduce it over and over. The use of hashing is meant to limit that possibility with digital money by imposing a cost through proof-of-work, or computing power.
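The cost that Hashcash imposes can be illustrated with a minimal Python sketch. It is not the real Hashcash stamp format: the function names `mint_stamp` and `verify_stamp`, the `resource:counter` layout, and the 12-bit difficulty are assumptions chosen for illustration. The sender searches for a counter whose SHA-1 digest starts with a given number of zero bits, while the recipient checks the result with a single hash.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def mint_stamp(resource: str, bits: int = 12) -> str:
    """Search for a counter so SHA-1(resource:counter) has `bits` leading zero bits."""
    for counter in count():
        stamp = f"{resource}:{counter}"
        if leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= bits:
            return stamp


def verify_stamp(stamp: str, resource: str, bits: int = 12) -> bool:
    """Checking a stamp costs one hash, however long it took to mint."""
    digest = hashlib.sha1(stamp.encode()).digest()
    return stamp.startswith(resource + ":") and leading_zero_bits(digest) >= bits


if __name__ == "__main__":
    stamp = mint_stamp("alice@example.com")
    print(stamp, verify_stamp(stamp, "alice@example.com"))
```

In this sketch, raising `bits` by one doubles the expected work to mint a stamp, while verification stays at one hash. That asymmetry is what makes the "postage" costly for a bulk sender and cheap for a recipient.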

Although Hashcash was tested in email systems from Microsoft and the open source software provider Apache, it never took off. Conceptually, Hashcash was a great example of how to introduce the digital scarcity required for internet-based money, but the technology itself wasn’t really a good form of digital currency.

B-Money

Proposed by Wei Dai in 1998, B-Money introduced the concept of using computer science to facilitate monetary creation outside of governmental systems. Like Hashcash, B-Money suggested that digital money could be produced through computation, or proof-of-work. Similar to Adam Back, Wei proposed that the cost of creating digital money could be calculated from the computer power used to create it. This digital money would be priced based on a basket of real-world assets such as gold and other commodities and limited in its supply to protect it from inflation, or losing value over time.

B-Money advanced the idea of broadcasting transactions to a network. For example, if one party wanted to pay another, a message would be sent to the network saying, “Person 1 will send $X to Person 2.” The system would be enforceable via a system of digital contracts. These contracts would in theory be used to resolve any disputes, similar to how credit card companies deal with problems like fraud. This system would use cryptography instead of a centralized system for both payments and the enforcement of contractual issues, enabling users of the network to be anonymous; no identity would be required.

The concept of B-Money brought together a number of components of digital cash. It applied the idea of contracts to provide order to an anonymous and distributed system. And it introduced the concept of using proof-of-work to create money. However, B-Money was mostly just a theoretical exercise by Wei. Its purpose was to explore the concept of nongovernmental money that could not be subject to inflation via a controlled money supply.

The Bitcoin Experiment

By 2008, the world was already relying on the internet as a distributed entity for a large number of services. With electronic maps and GPS apps, people looked to the internet to help them get from point A to point B. Email, texting, Skype, WhatsApp, and other communication apps allowed almost instantaneous connections with friends and family near and far.

In addition, people had begun buying more and more goods and services online rather than in-store. Credit and debit cards had become popular payment methods, along with PayPal and other services. However, as mentioned in the previous section, many still desired a tamper-proof, distributed way to transfer value via the internet—and amazingly, that had still not yet been devised.

The Whitepaper

On August 18, 2008, the domain bitcoin.org was registered. Then, written by someone or a group using the pseudonym Satoshi Nakamoto, a whitepaper was published on October 31, 2008, and shared on numerous software developer mailing lists. Titled “Bitcoin: A Peer-to-Peer Electronic Cash System,” the paper provided a detailed proposal for creating a value system that existed only on the internet. The aim was to create a digital currency that could operate without any connection to a bank or central government, and to build a more transparent financial system that could prevent the catastrophic events of the financial crisis from ever happening again.

The Bitcoin proposal featured a number of ideas pulled from systems that preceded it. These included:

  • Secure digital transactions, like the smart contracts outlined by Nick Szabo

  • Using cryptography to secure transactions, like in DigiCash

  • The theoretical ability to send small amounts of secured value, as E-gold was able to do

  • The creation of money outside of governmental systems, as B-Money had proposed

  • Using proof-of-work to verify validity of digital funds, as Hashcash was designed to do

The whitepaper also introduced several concepts that were new to many people, including:

Double spending
The risk that a unit of currency is spent more than once via falsified duplication.
Proof-of-work
A mathematical problem that must be solved using computational power.
Hashes
Fixed-length outputs computed from input data of any size, which allow data of different sizes and sequences to be identified and organized.
Nonces
Numbers, often random, that are used only once to ensure that a particular communication cannot be reused.
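A short Python sketch can make the last two definitions concrete. It uses SHA-256 from the standard library; the sample messages and the helper name `digest_with_nonce` are illustrative assumptions, not taken from the whitepaper. Inputs of any size map to a digest of the same length, and changing only the nonce yields an unrelated digest.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()


def digest_with_nonce(message: bytes, nonce: int) -> str:
    """Hash a message together with a nonce (a number used once)."""
    return sha256_hex(message + nonce.to_bytes(4, "little"))


if __name__ == "__main__":
    # Inputs of very different sizes produce digests of identical length.
    for data in (b"", b"pay Bob 1 BTC", b"x" * 1_000_000):
        print(len(data), sha256_hex(data))

    # The same message with a different nonce gives a different digest.
    print(digest_with_nonce(b"pay Bob 1 BTC", 0))
    print(digest_with_nonce(b"pay Bob 1 BTC", 1))
```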

Storing Data in a Chain of Blocks

In the mint-based model, a government or central authority uses standard accounting practices to keep track of transactions. The Bitcoin whitepaper introduces the concept of tracking transactions using a chain of signatures, or hashes. These are organized by blocks of time in chronological order.

This scheme, in essence, creates a unit of account that does not require any single entity to keep track of transactions. Instead, the chain of blocks, or blockchain, uses cryptographic mathematical trust to keep track of transactions in a digital system. The network does not require a complex structure, as it uses a peer-to-peer system to verify and publish these chains of blocks. Basically, it needs a distributed data structure for storage and a messaging system protocol that makes up a public network on the internet. As explained further in Chapter 2, a blockchain is made up of multiple blocks of transactions, and those blocks are connected to each other through hashes. Though many blockchains are available freely on the internet, some blockchains are not public—especially those used in some business settings, as detailed further in Chapter 9.
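The linking of blocks through hashes can be sketched in a few lines of Python. This is a toy model, not Bitcoin's actual block format: the `Block` fields, the JSON serialization, and the helper names are assumptions made for illustration. Each block stores the hash of its predecessor, so altering an old transaction breaks every later link.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    prev_hash: str           # hash of the previous block ("0" * 64 for the first)
    transactions: List[str]  # simplified: transactions are plain strings

    def hash(self) -> str:
        payload = json.dumps([self.prev_hash, self.transactions]).encode()
        return hashlib.sha256(payload).hexdigest()


def build_chain(batches: List[List[str]]) -> List[Block]:
    """Create one block per batch, each pointing at the hash of the one before."""
    chain: List[Block] = []
    prev = "0" * 64
    for txs in batches:
        block = Block(prev, list(txs))
        chain.append(block)
        prev = block.hash()
    return chain


def is_valid(chain: List[Block]) -> bool:
    """Every block must reference the current hash of its predecessor."""
    return all(b.prev_hash == a.hash() for a, b in zip(chain, chain[1:]))


if __name__ == "__main__":
    chain = build_chain([["A pays B 5"], ["B pays C 2"], ["C pays A 1"]])
    print(is_valid(chain))                     # links are intact
    chain[0].transactions[0] = "A pays B 500"  # rewrite history
    print(is_valid(chain))                     # the second block no longer matches
```

To hide the change, an attacker would have to recompute the hash stored in every block that follows the altered one, which is the property the real network makes expensive through proof-of-work.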

Here is the challenge Bitcoin sought to overcome: how can multiple parties who don’t know each other and don’t trust each other collaborate? Maintaining a global ledger where they all agree which transactions are valid and should be processed is Bitcoin’s solution to this challenge. The Bitcoin blockchain is the global ledger that all parties in the Bitcoin network agree is valid and accurate. Disagreement can mean a fork in the chain and the creation of a new root, a subject that is covered in Chapter 3.

The following are important attributes of every Bitcoin block:

Block hash
A unique identifier for the block. The block hash is generated from input data that provides a snapshot of the current state of the blockchain within 256 bits of data. This snapshot is like a technical version of a balance sheet for the entire Bitcoin blockchain. A Bitcoin block does not contain its own block hash, but it does contain the hash of the previous block it is building on, which is what makes the blocks chained. A block hash can be found by hashing the block header.
Coinbase transaction
This is the first transaction of each new block mined on the network. It adds new bitcoin to the supply, which is given as a reward to the miner who adds the block to the chain. Miners are discussed further in Chapter 2.
Block height number
This number identifies how many blocks there are between the current block and the first block in the chain (also known as the Genesis block).
Merkle root
This is a hash that summarizes all of the transactions in the block and allows proof that a given transaction is included in it (Chapter 2 talks more about Merkle roots).
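The statement that a block hash can be found by hashing the block header can be checked with a small Python sketch. It serializes the 80-byte header of the Genesis block and applies SHA-256 twice. The header field values below are the widely published ones for that block and do not appear in this chapter, so treat them as assumptions; the function name `block_hash` is likewise illustrative.

```python
import hashlib
import struct


def block_hash(version: int, prev_hash: str, merkle_root: str,
               timestamp: int, bits: int, nonce: int) -> str:
    """Double SHA-256 of the 80-byte block header, shown in the usual hex form."""
    header = (
        struct.pack("<I", version)
        + bytes.fromhex(prev_hash)[::-1]    # hashes are stored byte-reversed
        + bytes.fromhex(merkle_root)[::-1]
        + struct.pack("<III", timestamp, bits, nonce)
    )
    assert len(header) == 80
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()


# Published header fields of the Genesis block (assumed, not from this chapter).
GENESIS = dict(
    version=1,
    prev_hash="00" * 32,
    merkle_root="4a5e1e4baab89f3a32518a88c31bc87f618f76673e2cc77ab2127b7afdeda33b",
    timestamp=1231006505,
    bits=0x1D00FFFF,
    nonce=2083236893,
)

if __name__ == "__main__":
    print(block_hash(**GENESIS))
```

Note that the header carries the previous block's hash and the Merkle root but not the block's own hash, which matches the description above: the identifier is derived from the header rather than stored in it.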

Figure 1-7 shows a Bitcoin block.

Figure 1-7. Bitcoin block #170, which records a transaction of 10 BTC sent from Satoshi Nakamoto to developer and early blockchain pioneer Hal Finney

Figure 1-8 illustrates why it would be hard to change a past transaction.

Figure 1-8. Why it’s difficult to roll back bitcoin transactions

Bringing Bitcoin to Life

The initial Bitcoin concept as outlined in the 2008 whitepaper brought together technologies in cryptography, privacy, and distributed computing to rethink financial platforms. However, a lot of work remained to be done to bring these ideas to fruition. Fortunately, a number of computer programmers devoted to open source software and Bitcoin’s ideals believed in its potential. Bringing the network to life was the next task, and it required the efforts of some early pioneers.

Achieving Consensus

On January 3, 2009, Satoshi Nakamoto “mined” the first 50 bitcoins, utilizing processing power to create the first Bitcoin block. Known as the Genesis block, this first block in the Bitcoin blockchain referred to the financial crisis as the purpose for bringing the network to life. In the coinbase, or transaction content input, the Genesis block has this information:

The Times 03/Jan/2009 Chancellor on brink of second bailout for banks

Bitcoin is a distributed network, which means people were needed to act as miners in the system. So, Satoshi produced the first Bitcoin client. Running the client allowed users to run nodes and mine Bitcoin blocks. “If you can keep a node running that accepts incoming connections, you’ll really be helping the network a lot,” Satoshi wrote in the message posting the software, titled “Bitcoin v0.1 released - P2P e-cash.”

A blockchain is a living, constantly updating document. As time goes on, more and more transactions are added to it. Users of a centralized payments network like PayPal trust that the central authority will update its ledger with new transactions as time goes on. But in a decentralized payments network like Bitcoin, there is no central authority—just thousands of anonymous miners powering the network.

So who should users trust to update Bitcoin’s blockchain with a new block of transactions? Gaining that trust is called achieving consensus. It is a process that all the miners powering the network use for the following two purposes:

Block discovery
To agree on which miner gets the right to add a block of transactions.
Validation of transactions
To agree that the transactions included in that new block are legitimate.

Most blockchains used for cryptocurrency follow one of two approaches to achieve consensus (Chapter 2 covers these in more detail):

  • Proof-of-work

  • Proof-of-stake

Enterprise blockchains use other methods of consensus, which are discussed in Chapter 9.

Generating keys

A private key is a 256-bit number that is chosen at random. Private keys are almost always shown in hexadecimal format. The private key is generated by a computer; most programming languages have a function to generate random numbers, though a key should come from a cryptographically secure source.
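As a minimal sketch of this step in Python, the standard `secrets` module can draw the number from the operating system's cryptographically secure source (a general-purpose generator such as `random` is not suitable for keys). The constant `N`, the order of the secp256k1 curve, and the function names are additions for illustration; a valid key must fall between 1 and N - 1.

```python
import secrets

# Order of the secp256k1 curve; valid private keys lie in the range [1, N - 1].
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def generate_private_key() -> int:
    """Draw a uniformly random integer in [1, N - 1] from a secure source."""
    return secrets.randbelow(N - 1) + 1


def to_hex(private_key: int) -> str:
    """Format the key as 64 hexadecimal characters (256 bits)."""
    return f"{private_key:064x}"


if __name__ == "__main__":
    # Printed only for demonstration; a key shown on screen should not hold funds.
    print(to_hex(generate_private_key()))
```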

A private key can be paired with a public key to make transactions on the Bitcoin network. Without a private key it is, by design, nearly impossible to do so (more on this in Chapter 2). In cryptography, a public key can be generated by running the private key through an Elliptic Curve Digital Signature Algorithm (ECDSA) secp256k1 function. A public key hash is then generated by running the public key through the cryptographic SHA256 and RIPEMD160 functions. The Bitcoin address is generated by first prepending the version byte 00 to the public key hash and then running that value through a Base58Check function. Figure 1-9 illustrates the process.

Figure 1-9. Process of generating a Bitcoin address from a private key
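The final step of this process can be sketched in Python using only the standard library. The sketch starts from a 20-byte public key hash and omits the earlier elliptic curve and RIPEMD160 steps, which need additional code or library support; the function names are illustrative assumptions. The checksum is the first four bytes of a double SHA-256 over the version byte plus the hash.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes in Base58; each leading zero byte becomes a leading '1'."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def address_from_pubkey_hash(pubkey_hash: bytes) -> str:
    """Prepend version byte 0x00, append a 4-byte checksum, then Base58-encode."""
    if len(pubkey_hash) != 20:
        raise ValueError("public key hash must be 20 bytes")
    payload = b"\x00" + pubkey_hash
    checksum = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return base58_encode(payload + checksum)


if __name__ == "__main__":
    # An all-zero hash is used only as a placeholder input.
    print(address_from_pubkey_hash(bytes(20)))
```

Because the version byte is 00 and Base58 maps each leading zero byte to the character 1, every address produced this way starts with 1.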

Some people use a Bitcoin client that has an option to generate an address, following certain rules:

  • Starts with 1, 3, or bc1

  • Rest of string is between 25 and 34 characters long

  • Valid characters include 0–9, A–Z, and a–z

  • Most addresses do not include l (lowercase L), I (uppercase i), O (uppercase o), or 0 (zero), to prevent visual ambiguity
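Beyond these surface rules, an address that starts with 1 or 3 carries a built-in checksum that software can verify before sending funds. The Python sketch below decodes the Base58 string and recomputes the checksum. It does not handle bc1 addresses, which use a different encoding, and the function name `is_valid_base58_address` is an illustrative assumption.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def is_valid_base58_address(address: str) -> bool:
    """Return True if the address decodes to 25 bytes with a correct checksum."""
    number = 0
    for char in address:
        index = BASE58_ALPHABET.find(char)
        if index < 0:  # rejects 0, O, I, and l, among other characters
            return False
        number = number * 58 + index
    # Each leading '1' stands for one leading zero byte.
    leading_ones = len(address) - len(address.lstrip("1"))
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    raw = b"\x00" * leading_ones + body
    if len(raw) != 25:
        return False
    payload, checksum = raw[:-4], raw[-4:]
    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    return checksum == expected


if __name__ == "__main__":
    print(is_valid_base58_address("1111111111111111111114oLvT2"))  # checksum matches
    print(is_valid_base58_address("1111111111111111111114oLvT3"))  # one character changed
```

A check like this catches most typing mistakes, since changing a single character almost always invalidates the checksum.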

An alternative is to use https://www.bitaddress.org, a website that generates randomness in the address based on a user’s mouse movement; however, users have to trust that the website’s owners will not record their private keys. Most people generate a new Bitcoin address through an exchange like Coinbase, which does it for them using its internal software.

An Early Vulnerability

As a new protocol, Bitcoin was not without its share of issues early on. It was not easy to use, so not a lot of people downloaded the Bitcoin client. Some of the earliest proponents of Bitcoin were those who had already proposed some of the concepts it used. They included Wei Dai, who proposed B-Money, and Nick Szabo, whose bit gold concept led to a lot of development on securing transactions. Another early Bitcoin advocate was Hal Finney, who received the first bitcoin transaction from Satoshi Nakamoto.

A major security flaw was found less than two years into Bitcoin’s existence. On August 15, 2010, a member of the community noticed an abnormally large output transaction and posted about it on a popular message board. “The ‘value out’ in this block #74638 is quite strange,” developer Jeff Garzik wrote, after someone attempted to create more than 184 billion bitcoin out of thin air. Example 1-1 shows the transaction.

Example 1-1. An abnormally large bitcoin transaction
CBlock(hash=0000000000790ab3, ver=1, hashPrevBlock=0000000000606865, hashMerkleRoot=618eba, nTime=1281891957, nBits=1c00800e, nNonce=28192719, vtx=2)
  CTransaction(hash=012cd8, ver=1, vin.size=1, vout.size=1, nLockTime=0)
    CTxIn(COutPoint(000000, -1), coinbase 040e80001c028f00)
    CTxOut(nValue=50.51000000, scriptPubKey=0x4F4BA55D1580F8C3A8A2C7)
  CTransaction(hash=1d5e51, ver=1, vin.size=1, vout.size=2, nLockTime=0)
    CTxIn(COutPoint(237fe8, 0), scriptSig=0xA87C02384E1F184B79C6AC)
    CTxOut(nValue=92233720368.54275808, scriptPubKey=OP_DUP OP_HASH160 0xB7A7)
    CTxOut(nValue=92233720368.54275808, scriptPubKey=OP_DUP OP_HASH160 0x1512)
  vMerkleTree: 012cd8 1d5e51 618eba
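The two identical outputs in Example 1-1 hint at what went wrong. A brief Python sketch, assuming amounts are held as signed 64-bit counts of the smallest unit (one hundred-millionth of a bitcoin) and assuming an input worth 0.5 BTC, shows how adding the outputs wraps around to a small negative number, so a check that outputs do not exceed inputs would pass. The helper `wrap_int64` emulates fixed-width arithmetic, since Python integers do not overflow; all names here are illustrative.

```python
UNITS_PER_BTC = 100_000_000


def to_units(btc: str) -> int:
    """Convert a decimal BTC string into an integer count of the smallest unit."""
    whole, _, frac = btc.partition(".")
    return int(whole) * UNITS_PER_BTC + int(frac.ljust(8, "0")[:8])


def wrap_int64(value: int) -> int:
    """Emulate signed 64-bit wraparound."""
    return (value + 2**63) % 2**64 - 2**63


outputs = [to_units("92233720368.54275808")] * 2  # the two CTxOut values
true_total = sum(outputs)
wrapped_total = wrap_int64(true_total)

input_value = to_units("0.5")      # assumed input amount, not shown in the log
fee = input_value - wrapped_total  # what a naive check would compute

if __name__ == "__main__":
    print(true_total / UNITS_PER_BTC)     # roughly 184 billion BTC
    print(wrapped_total / UNITS_PER_BTC)  # small negative amount after wraparound
    print(fee / UNITS_PER_BTC)            # compare with the 50.51 coinbase output
```

Under these assumptions each output sits just below the 64-bit limit, the wrapped sum comes to -0.01 BTC, and the apparent fee is 0.51 BTC, which is consistent with the coinbase output of 50.51 in the log.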

The vulnerability was subsequently patched, and the blockchain was “forked” so that the chain no longer reflected the erroneous transaction (more on forks in Chapter 3). To this day the vulnerability found in 2010 remains the largest security flaw in Bitcoin’s history, a testament to the cryptocurrency community’s growing strength.