Storing Data on Blockchain

Though we are experiencing crypto winter at the moment, with major coins devalued more than 80% in 2018, the underlying blockchain technology is still exciting. The blockchain provides a democratized trust, distributed and validation protocol that has already disrupted banking and financial services and is on the verge of overhauling other industries like healthcare, supply chain, HR and more.

Despite the hype and its promising future, blockchain still has its shortcomings, the issue of data storage is one of them. The transactions based on the POW consensus for bitcoin, Ethereum, and other cryptocurrencies are extremely slow and therefore not suitable for storage of large data. For example, the deployment of dApp Cryptokitties nearly crippled the Ethereum network

The main problem of storing data on a blockchain is the limitation of the amount of data we can store because of its protocol and the high transaction costs. As a matter of fact, a block in blockchain can store data from a few kilobytes to maybe a few megabytes. For example, the block size of the Bitcoin is only 1Mb. The block size limitation has a serious impact on the scalability of most cryptocurrencies and the bitcoin community is debating whether to increase the block size.

Another issue is the high cost of the transactions. Why is storing data on the blockchain so expensive? It is because the data has to be stored by every full node on the blockchain network. When storing data on the blockchain, we do pay the base price for the transaction itself plus an amount per byte we want to store. If smart contracts are involved, we also pay for the execution time of the smart contract. This is why even storing kilobytes of data on the blockchain can cost you a fortune.

Therefore, it is not viable to store large data files like images and videos on the blockchain. Is there a possible solution to solve the storage issue? Yes, there are quite a few solutions but the most promising one is IPFS.

What is IPFS?

IPFS or Interplanetary File System is an innovative open-source project created by the developers at Protocol Labs. It is a peer-to-peer filesharing system that aims to change the way information is distributed across a wide area network. IPFS has innovated some communication protocols and distributed systems and combine them to produce a unique file-sharing system.

The current HTTP client-server protocol is location-based addressing which faces some serious drawbacks. First of all, location-based addressing consumes a huge amount of bandwidth, and thus costs us a lot of money and time. On top of that, HTTP downloads a file from a single server at a time, which can be slow if the file is big. In addition, it faces single-point of failure. If the webserver is down or being hacked, you will encounter 404 Not Found error. Besides that, it also allows for powerful entities like the governments to block access to certain locations.

On the other hand, IPFS is a content-based addressing system. It is a decentralized way of storing files, similar to BitTorrent. In the IPFS network, every node stores a collection of hashed files. The user can refer to the files by their hashes. The process of storing a file on IPFS is by uploading the file to IPFS, store the file in the working directory, generate a hash for the file and his file will be available on the IPFS network. A user who wants to retrieve any of those files simply needs to call the hash of the file he or she wants. IPFS then search all the nodes in the network and deliver the file to the user when it is found.

IPFS will overcome the aforementioned HTTP weaknesses. As files are stored on the decentralized IPFS network, if a node is down, the files are still available on other nodes, therefore there is no single point of failure. Data transfer will be cheaper and faster as you can get the files from the nearest node. On top of that, it is almost impossible for the powerful entities to block access to the files as the network is decentralized.

The following figure shows the difference between the centralized client-server protocol(HTTP) and the peer-to-peer IPFS protocol.


 [Source: https://www.maxcdn.com/one/visual-glossary/interplanetary-file-system/]

Blockchain and IPFS

IPFS is the perfect match for the blockchain. As I have mentioned, the blockchain is inefficient in storing large amounts of data in a block because all the hashes need to be calculated and verified to preserve the integrity of the blockchain. Therefore, instead of storing data on the blockchain, we simply store the hash of the IPFS file. In this way, we only need to store a small amount of data that is required on the blockchain but get to enjoy the file storage and decentralized peer-to-peer properties of IPFS.

One of the real-world use cases of blockchain and IPFS is Nebulis. It is a new project exploring the concept of a distributed DNS that supposedly never fails under an overwhelming access request. Nebulis uses the Ethereum blockchain and the Interplanetary Filesystem (IPFS), a distributed alternative to HTTP, to register and resolve domain names. We shall see more integration of Blockchain and IPFS in the future.

References

Building the blockchain using JavaScript

The blockchain is a data structure that comprises blocks of data that are chained together using a cryptographic hash. In this article, we shall attempt to build a blockchain prototype using JavaScript. This is a bit technical for non-technical people but should be a piece of cake for computer nerds.

First of all, let’s examine the content of a block. A block consists mainly of the block header containing metadata and a list of transactions appended to the block header. The metadata includes the hash of the previous block, the hash of the block, timestamp, nonce, difficulty and block height. For more information, please refer to my earlier article Blockchain in a Nutshell.

Prior to writing the code, you need to install the following software:

  • Chocolatey
  • Visual Studio Code
  • node.js

The installations are based on Windows 10, but you can do the the same thing easily in Ubuntu. Chocolatey is a software management solution for Windows, Visual Studio Code is a streamlined code editor with support for development operations like debugging, task running and version control and Node.js is an open source server environment.

Let’s start writing the code using Visual Studio Code. The first line is

const SHA256 = require("crypto-js/sha256");

This code meas we are using the JavaScript library of crypto standards. We require the crypto-js library because the sha256 hash function is not available in JavaScript.The crypto module provides cryptographic functionality.

SHA-256 is a cryptographic hash algorithm. A cryptographic hash is a kind of ‘signature’ for a text or a data file. SHA-256 generates an unique 256-bit signature for a text.  SHA256 is always 256 bits long, equivalent to 32 bytes, or 64 bytes in an hexadecimal string format. In blockchain hash, we use hexadecimal string format, so it is 64 characters in length.

Next, we create the block using the following scripts:

class Block {
   constructor(index, timestamp, data, previousHash = '') {
       this.index = index;
       this.previousHash = previousHash;
       this.timestamp = timestamp;
       this.data = data;
       this.hash = this.calculateHash();
   }

The class Block comprises a constructor that initializes the properties of the block. Each block is given an index that tells us at what position the block sits on the chain. We also include a timestamp, some data to store in the block, the hash of the block and the hash of the previous block.

We also need to write a function calculateHash() that creates the hash, as follows:

 calculateHash() {
       return SHA256(this.index + this.previousHash + this.timestamp + JSON.stringify(this.data)).toString();
   }
}

Finally, we write the script that creates the blockchain, as follows:

class Blockchain{
   constructor() {
       this.chain = [this.createGenesisBlock()];
   }

   createGenesisBlock() {
       return new Block(0, "01/01/2017", "Genesis block", "0");
   }

   getLatestBlock() {
       return this.chain[this.chain.length - 1];
   }

   addBlock(newBlock) {
       newBlock.previousHash = this.getLatestBlock().hash;
       newBlock.hash = newBlock.calculateHash();
       this.chain.push(newBlock);
   }
   

Notice that the class blockchain comprises a constructor that consists of a few functions, createGenesisBlock(),  getLatestBlock() and  addBlock(newBlock).

Save the file as myblkchain.js or any name you like.

Now, execute the file in the VS terminal using the following command

node myblkchain.js

The output is as follows:

Let’ examine the output. Notice that each block comprises the properties index, previous hash, data and hash.

The index for the Genesis is always 0 . Notice that the hash of the genesis block and the previous hash of the second block are the same. It is the same for other blocks.

References