Is this the Future of the Internet?

The InterPlanetary File System

Editorial @ TRN
The Research Nest

(Picture Courtesy: Wikimedia Commons)

In today’s world, where terms like blockchain and cryptocurrency are thrown around casually in conversation, decentralization is the newest craze among computer geeks. This trend is leading to exciting innovations, one of which is the InterPlanetary File System (IPFS). The protocol does not run on the Bitcoin blockchain; rather, it borrows ideas from earlier peer-to-peer systems such as BitTorrent, Git, and Bitcoin. Like Bitcoin, IPFS is a peer-to-peer system, and it provides a way to store and share files in a decentralized manner. It aims to connect all computing devices with the same system of files. In some ways it is like the HTTP web, but IPFS is more similar to a single BitTorrent swarm exchanging Git objects. If built according to expectations, it could complement or even replace HTTP and build a better web.

Unlike HTTP, IPFS is not just a transfer protocol but a hypermedia distribution protocol that aims to make the web faster and safer. This article will take you a little closer to what IPFS is, how it works, and how it could shape the future of our society.

BASIC ARCHITECTURE

IPFS began as an effort by Juan Benet to build a distributed file system. It is now an open-source project with implementations in Go and JavaScript. There is a Python implementation too, although it is still in progress at this point in time. As we have seen so far, IPFS uses a peer-to-peer model in which multiple nodes store files that are submitted to the network.

Let’s understand how it works. At its core, IPFS is a versioned file system. Each file on the network, and each block within it, is given a unique fingerprint called a cryptographic hash. This allows the IPFS network to remove duplicate data automatically (identical blocks share the same hash, so they need to be stored only once) and to track the version history of every file.
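To make the deduplication idea concrete, here is a minimal Python sketch. It is not how any IPFS implementation is written: the BlockStore class, the 4-byte block size, and the use of plain SHA-256 hex digests as keys are all assumptions chosen to keep the example small.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, unrealistically small block size for readability


class BlockStore:
    """Toy in-memory store that keeps each distinct block once, keyed by its hash."""

    def __init__(self):
        self.blocks = {}

    def put(self, data):
        # The key is derived from the content, so identical blocks collide on purpose.
        key = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(key, data)
        return key

    def add_file(self, content):
        # Split the file into fixed-size blocks and return the list of block hashes.
        return [
            self.put(content[i:i + BLOCK_SIZE])
            for i in range(0, len(content), BLOCK_SIZE)
        ]

    def read_file(self, keys):
        # Reassemble a file from its list of block hashes.
        return b"".join(self.blocks[k] for k in keys)


store = BlockStore()
v1 = store.add_file(b"AAAABBBBCCCC")  # first version of a file
v2 = store.add_file(b"AAAABBBBDDDD")  # later version: only the last block differs

print(len(store.blocks))      # number of distinct blocks actually stored
print(store.read_file(v2))    # the second version, rebuilt from its blocks
```

The two versions share their first two blocks, so the store holds four blocks instead of six, and each version is simply a different list of hashes.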

IPFS follows a set of rules for how data should move around the network, much as BitTorrent does. It uses content addressing, which means that information is retrieved by what it is rather than where it is. In other words, the content determines the address. The idea is to take a file and compute its cryptographic hash, which then serves as the file's address. Because it is computationally infeasible to find another file with the same hash, other users on the network cannot pass off a different file under that address. Thus, IPFS refers to everything by the hash of its content.
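Content addressing can be illustrated in a few lines of Python. This is only a sketch: the content_address function is a hypothetical helper, and it returns a bare SHA-256 hex digest, whereas real IPFS addresses wrap the hash in additional self-describing encoding.

```python
import hashlib


def content_address(data):
    """Derive an address from the bytes themselves (assumed here: SHA-256, hex)."""
    return hashlib.sha256(data).hexdigest()


original = b"Hello, IPFS!"
tampered = b"Hello, IPFS?"

addr = content_address(original)

# The same bytes always map to the same address, wherever they are stored.
print(content_address(original) == addr)
# Changing even one character produces a different address.
print(content_address(tampered) == addr)
```

Anyone who receives data for a given address can recompute the hash and check that it matches, which is what ties the address to the content rather than to a location.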

When a user requests a particular page, IPFS asks the network, “does anyone have the content that corresponds to this hash?” A node that has the file can return it, and the user will have the page in their browser. Because the content can come from any node that holds it, including a nearby one, this can be a faster way to store and retrieve data.
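The request flow described above can be simulated with a toy network. Everything here is an assumption made for illustration: the Peer class, the fetch function, and the idea of simply looping over a list of peers (a real network would use routing to find providers instead of asking everyone).

```python
import hashlib


def address_of(data):
    # Assumed addressing scheme for this sketch: SHA-256 hex digest.
    return hashlib.sha256(data).hexdigest()


class Peer:
    """Hypothetical node holding some content, keyed by address."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def add(self, data):
        addr = address_of(data)
        self.store[addr] = data
        return addr

    def get(self, addr):
        return self.store.get(addr)


def fetch(addr, peers):
    """Ask each peer for the address and accept only data whose hash matches."""
    for peer in peers:
        data = peer.get(addr)
        if data is not None and address_of(data) == addr:
            return data, peer.name
    return None, None


alice, bob, mallory = Peer("alice"), Peer("bob"), Peer("mallory")
page_addr = bob.add(b"<h1>page</h1>")
mallory.store[page_addr] = b"forged page"  # a dishonest peer serving the wrong bytes

data, source = fetch(page_addr, [alice, mallory, bob])
print(source, data)
```

Alice does not have the page and Mallory's answer fails the hash check, so the page is accepted only from Bob. The requester does not need to trust any particular peer, only the hash.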

THE BACKGROUND

IPFS combines many internet technologies to provide a successful peer-to-peer system. These are:

Distributed Hash Tables (DHTs):

Distributed hash tables are used to locate files on the network. They coordinate and maintain the metadata of the peer-to-peer network.

Some examples of DHTs are:

Kademlia is a popular DHT that is widely used in peer-to-peer applications. It keeps the number of messages that nodes must exchange during a lookup low. Each node has a node ID as its identity, and keys such as file hashes live in the same ID space. The Kademlia algorithm uses this to map a file's hash to the nodes whose IDs are closest to it, and those nodes store information on where to get the file or resource. Kademlia also prefers long-lived nodes in its routing tables, which makes it resistant to various attacks.
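Kademlia measures closeness between a key and a node ID with the XOR of the two values. The sketch below shows just that one idea; the 4-bit IDs and the closest_nodes helper are assumptions for readability (real deployments use much longer IDs and a routing table rather than a full list of nodes).

```python
def xor_distance(a, b):
    """Kademlia's distance metric: the bitwise XOR of two IDs, read as an integer."""
    return a ^ b


def closest_nodes(key, node_ids, k=2):
    """Return the k node IDs nearest to the key under XOR distance."""
    return sorted(node_ids, key=lambda node_id: xor_distance(node_id, key))[:k]


# Hypothetical 4-bit node IDs and a 4-bit key derived from some file hash.
node_ids = [0b0001, 0b0100, 0b0110, 0b1100]
file_key = 0b0111

for node_id in closest_nodes(file_key, node_ids):
    print(format(node_id, "04b"), "distance", xor_distance(node_id, file_key))
```

Here the nodes 0110 and 0100 are closest to the key 0111, so in this toy setup they would be the ones responsible for remembering who can provide the file.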

Coral DSHT is an extension of Kademlia. Kademlia stores a value only on the nodes whose node IDs are nearest to the key, whether or not those nodes are close to the requester on the network. Coral instead organizes nodes into a hierarchy of separate clusters depending on region and size. This saves lookup time, because data can be found within the region without querying distant nodes.

BitTorrent: It is a communication protocol widely used for peer-to-peer sharing. It is one of the most popular protocols for sharing large files such as digital video or audio files. It offers an alternative to older single-source download techniques and can work effectively over lower-bandwidth networks.

Version Control Systems (Git): Git is a version control system that captures changes to a file system tree over time. Every copy of a repository holds the complete history and full tracking abilities, independent of network access or a central server. Git provides a powerful Merkle DAG object model that captures changes in a distributed-friendly way.
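Git's objects are themselves content-addressed, which is why it is a useful model here. The sketch below computes the ID Git assigns to a file's contents (a "blob") in a repository that uses the default SHA-1 object format; the git_blob_id function name is our own.

```python
import hashlib


def git_blob_id(content):
    """Compute a Git blob ID: SHA-1 over a small header plus the file's bytes."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_id(b"hello world\n"))
# Two files with identical contents get the same ID, so Git stores them once.
print(git_blob_id(b"same bytes") == git_blob_id(b"same bytes"))
```

The output for a given file should match what `git hash-object` prints for it. Trees and commits are hashed in a similar way and refer to other objects by these IDs, which is what makes the whole history a Merkle DAG.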

Why do we need IPFS?

The web we use today is largely centralized. A website is typically hosted on a server, or a group of servers, run by a single provider. When a user requests a page, the browser connects to that particular server, which sends back the file. The address identifies where the content lives, not what it is, so if that location becomes unreachable the content is effectively gone.

HTTP, which we use these days, is described by many developers as inefficient and expensive. It downloads a file from a single server at a time, whereas IPFS can fetch pieces of a file from multiple nodes simultaneously. If that server, or the link to it, goes down, the transfer breaks. So the web as we use it today depends heavily on centralized servers that may go down at any moment. A solution to this problem is a distributed system. IPFS is one such innovation that could help the internet grow into the system we wish for.

BASIC IPFS DESIGN

IPFS is a distributed file system that combines internet technologies such as DHTs, BitTorrent, and Git. It is a peer-to-peer system with multiple nodes that store IPFS objects. Nodes connect to each other and transfer objects. The objects represent files and other data. Here is the top-level view of the protocol.

IPFS and the Merkle DAG

A Merkle DAG is a directed acyclic graph whose links are the hashes of the objects they point to. It is the core of IPFS and gives IPFS files and objects the following characteristics:

  • Authenticated: Content on the network can be hashed and verified against its address.
  • Permanent: Content behind a hash never changes, so once retrieved, objects can be cached forever.
  • Decentralized: Objects can be created by anyone, without centralized writers.
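A tiny Merkle DAG can be built in Python to show why these properties follow from hash links. This is a sketch under stated assumptions: nodes are serialized as JSON and hashed with SHA-256, and put_node, verify, and the in-memory store are hypothetical names, not the IPFS object format.

```python
import hashlib
import json

store = {}  # toy object store: hash -> serialized node


def put_node(data, links):
    """Serialize a node, store it under the hash of its bytes, return that hash."""
    encoded = json.dumps({"data": data, "links": links}, sort_keys=True).encode("utf-8")
    key = hashlib.sha256(encoded).hexdigest()
    store[key] = encoded
    return key


def verify(key):
    """Check that a node and everything it links to still match their hashes."""
    encoded = store[key]
    if hashlib.sha256(encoded).hexdigest() != key:
        return False
    node = json.loads(encoded)
    return all(verify(child) for child in node["links"].values())


# Two leaf chunks and a root that links to them by hash.
leaf_a = put_node("chunk A", {})
leaf_b = put_node("chunk B", {})
root = put_node("", {"a": leaf_a, "b": leaf_b})

# Editing one chunk yields a new leaf hash, and therefore a new root hash.
edited_root = put_node("", {"a": put_node("chunk A, edited", {}), "b": leaf_b})

print(verify(root))          # the untouched graph checks out
print(edited_root != root)   # any change below the root changes the root
```

Because the root hash depends on every hash beneath it, checking one hash is enough to authenticate the whole graph, and unchanged pieces such as "chunk B" are shared between versions rather than copied.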

Nodes and Network Model

The IPFS network uses PKI-based identity. As mentioned earlier, nodes have their own node IDs. A node ID is the cryptographic hash of the node's public key. Each node stores both its private and public keys.
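The link between a node ID and a public key can be sketched as follows. Note the assumptions: the "public key" is just random placeholder bytes rather than a real key pair (Python's standard library does not generate one), and node_id and check_peer are hypothetical helpers using SHA-256.

```python
import hashlib
import os


def node_id(public_key):
    """Derive a node ID by hashing the public key bytes (assumed here: SHA-256)."""
    return hashlib.sha256(public_key).hexdigest()


def check_peer(claimed_id, presented_key):
    """On connecting, confirm that the presented key really hashes to the claimed ID."""
    return node_id(presented_key) == claimed_id


public_key = os.urandom(32)  # placeholder bytes standing in for a real public key
my_id = node_id(public_key)

print(check_peer(my_id, public_key))          # honest peer
print(check_peer(my_id, b"some other key"))   # peer claiming an ID it does not own
```

A peer cannot claim someone else's ID without presenting a key that hashes to it, and it would still need the matching private key to prove ownership of that key.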

Multihash and upgradable hashing

Multihash is a self-describing hash format, and all hashes in IPFS are encoded with it: each value records which hash function produced it and how long the digest is. Upgradable hashing means that as hash functions are broken, networks can shift to stronger ones.
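The multihash layout is a function code, then the digest length, then the digest itself. The sketch below encodes a SHA-256 digest this way, using the code 0x12 that the multihash table assigns to sha2-256. As a simplifying assumption it writes the code and length as single bytes, which matches the real variable-length encoding only for values below 128; the function names are our own.

```python
import hashlib

SHA2_256_CODE = 0x12  # multihash function code for sha2-256


def multihash_sha256(data):
    """Encode a SHA-256 digest as <function code><digest length><digest>."""
    digest = hashlib.sha256(data).digest()
    return bytes([SHA2_256_CODE, len(digest)]) + digest


def decode_multihash(value):
    """Split a multihash back into (function code, length, digest)."""
    code, length, digest = value[0], value[1], value[2:]
    if len(digest) != length:
        raise ValueError("digest length does not match the declared length")
    return code, length, digest


mh = multihash_sha256(b"Hello, IPFS!")
code, length, digest = decode_multihash(mh)

print(hex(code), length)   # which function was used, and how many bytes follow
print(mh.hex()[:4])        # the two-byte prefix every sha2-256 multihash starts with
```

Because the prefix names the function, a network could start issuing hashes from a stronger function later and readers would still know how to verify both old and new values.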

The Stack

The IPFS protocol is divided into sub-protocols, each with its own function:

  • Merkledag: Data structure format.
  • Network: This protocol manages connections among the nodes of IPFS.
  • Routing: The nodes of IPFS require a routing system that finds peer network addresses. This can be achieved using a distributed hash table such as Kademlia or Coral DSHT.
  • Exchange: Data is distributed in IPFS by exchanging blocks with other peers using BitSwap, a BitTorrent-inspired protocol.
  • Naming: Provides self-certifying names.

APPLICATIONS AND FUTURE DIRECTIONS

Well, it sounds a bit crazy, but yes, IPFS could one day carry much of the internet's content. It could push the web to a whole new level, where it serves as a next-generation file-sharing system. IPFS is a new piece of decentralized internet infrastructure upon which different kinds of applications can be built. With IPFS, we can think of a better, safer, and more permanent web.

Start-Ups and companies working in this field and products being developed

The Wikipedia logo has a hash address through which it can be accessed on IPFS. There are also plans to use IPFS to make Wikipedia reachable in countries where it is blocked. (Picture Courtesy: Wikipedia)

IPFS seems to be a natural match for the blockchain. As the latter is also a distributed system, the two complement each other: large files can live on IPFS while the chain stores only their hashes. Some projects that integrate blockchain with IPFS have already started.

Novus, a software development company, is using IPFS and the Novusphere blockchain to create the Advanced File Index (AFIX), a web-based index. It works as a decentralized search engine for content within IPFS. Anyone can submit content that is on IPFS by referencing its unique cryptographic hash along with a title, description, and so on. Users can then search and view the content through AFIX.

Another interesting project that uses this technology is Embermine. It uses IPFS to store files and encrypted Embermine application data.

Protocol Labs is a technology company whose main focus is building decentralized network products. Led by Juan Benet, it emphasizes open-source development. The company is best known for IPFS and Filecoin. In addition, Protocol Labs develops libp2p and IPLD.

As blockchain technology continues to develop and IPFS seeks to solve many problems with the web, the system has the potential to become one of the most important upcoming technologies and a backbone of the internet.

(This article was written by Research Nest’s technical writer, Akshita Kapoor)
