Everyone knows what file transfers are, but transferring files over a pub/sub system is an atypical use case for a distributed messaging system. At Streamr, we have been experimenting with different proofs of concept and use cases, and we are now ready to introduce a proof-of-concept command-line interface (CLI) that can handle one-to-one and one-to-many file transfers over the Streamr Network.
In essence, the Streamr file transfer CLI combines the Streamr JavaScript SDK with the Oclif CLI framework. Custom logic on the sender side reads files asynchronously as binary chunks and wraps each chunk in a JSON object. These file chunks are then sent over the Streamr Network to all receivers listening to the same Streamr stream that the sender is using. On the receiving end, each chunk is read asynchronously, decoded and appended to a file that is stored locally. The integrity of the received file is checked once the last file chunk has been received.
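The chunk-and-append flow described above can be sketched in a few lines of TypeScript. This is a minimal in-memory illustration, not the CLI's actual code: the function names `encodeChunks` and `decodeChunks` are hypothetical, and the Streamr publish and subscribe steps are left out entirely.

```typescript
import { Buffer } from "node:buffer";

// Sender side: split raw file bytes into fixed-size pieces and Base64-encode
// each piece so that it can be carried inside a JSON message.
// `chunkSize` is the number of raw (unencoded) bytes per chunk.
function encodeChunks(data: Buffer, chunkSize: number): string[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks: string[] = [];
  for (let offset = 0; offset < data.length; offset += chunkSize) {
    // subarray clamps the end index, so the last chunk may be shorter.
    chunks.push(data.subarray(offset, offset + chunkSize).toString("base64"));
  }
  return chunks;
}

// Receiver side: decode each Base64 chunk and append it in arrival order.
// (Hypothetical helper; a real receiver would append to a file on disk.)
function decodeChunks(chunks: string[]): Buffer {
  return Buffer.concat(chunks.map((chunk) => Buffer.from(chunk, "base64")));
}
```

Because the receiver simply appends chunks as they arrive, this sketch only reproduces the file when every chunk arrives, and arrives in order.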
Each file chunk that is sent over Streamr is a JSON object that contains the unique sender identifier, the Base64-encoded binary chunk of the file, the message ID, the file chunk ID, the total number of chunks, the file size and the MD5 hash of the complete file. Because multiple people may be using the same stream for sending and receiving, a new unique sender identifier is generated every time the CLI is instantiated, which makes it possible to distinguish who is sending what. The MD5 message digest algorithm is used to create a fingerprint of the file, which the receiving end uses to check the integrity of the file after the transfer has completed.
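As a rough illustration of such a message and of the integrity check, consider the sketch below. The field names (`senderId`, `chunkId`, `data` and so on) and the use of UUIDs for the identifiers are assumptions made for this example; the CLI's actual wire format may differ.

```typescript
import { Buffer } from "node:buffer";
import { createHash, randomUUID } from "node:crypto";

// Hypothetical message shape; field names are illustrative only.
interface FileChunkMessage {
  senderId: string;    // regenerated each time the CLI starts
  messageId: string;
  chunkId: number;     // 0-based position of this chunk in the file
  totalChunks: number;
  fileSize: number;    // size of the complete file in bytes
  md5: string;         // hex MD5 digest of the complete file
  data: string;        // Base64-encoded chunk bytes
}

function md5Hex(data: Buffer): string {
  return createHash("md5").update(data).digest("hex");
}

// Wrap every chunk of `file` in a JSON-serializable message.
function buildMessages(
  file: Buffer,
  chunkSize: number,
  senderId: string = randomUUID(),
): FileChunkMessage[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const totalChunks = Math.ceil(file.length / chunkSize);
  const md5 = md5Hex(file);
  const messages: FileChunkMessage[] = [];
  for (let chunkId = 0; chunkId < totalChunks; chunkId++) {
    const start = chunkId * chunkSize;
    messages.push({
      senderId,
      messageId: randomUUID(),
      chunkId,
      totalChunks,
      fileSize: file.length,
      md5,
      data: file.subarray(start, start + chunkSize).toString("base64"),
    });
  }
  return messages;
}

// Receiver side: compare the rebuilt file against the size and digest
// carried in any chunk message (for example, the last one received).
function verifyFile(received: Buffer, message: FileChunkMessage): boolean {
  return received.length === message.fileSize && md5Hex(received) === message.md5;
}
```

Carrying the digest and file size in every message is redundant, but it means the receiver can run the check with whichever message it sees last.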
What kind of performance can I expect?
We’ve tested file transfers with file sizes ranging from a few megabytes to a few gigabytes, using different chunk sizes and wait times between chunks. As the CLI is at the proof-of-concept stage, slower uploads have produced more consistent file transfers in our testing. Keeping the speed in the range of 700 to 1000 kilobytes per second has produced the most reliable transfers.
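The relationship between chunk size, wait time and transfer speed is simple arithmetic, sketched below. The function names and the `publish` callback are hypothetical and stand in for whatever actually sends a chunk; the calculation also assumes the speed is measured in bytes per second and ignores the time spent sending each chunk.

```typescript
// Wait time after each chunk so that the average rate stays at or below
// the target. Ignores the time the send itself takes.
function delayBetweenChunksMs(chunkSizeBytes: number, targetBytesPerSecond: number): number {
  if (chunkSizeBytes <= 0 || targetBytesPerSecond <= 0) {
    throw new RangeError("chunk size and target rate must be positive");
  }
  return Math.ceil((chunkSizeBytes / targetBytesPerSecond) * 1000);
}

// Approximate rate achieved with a given chunk size and wait time.
function throughputBytesPerSecond(chunkSizeBytes: number, waitMs: number): number {
  if (chunkSizeBytes <= 0 || waitMs <= 0) {
    throw new RangeError("chunk size and wait time must be positive");
  }
  return chunkSizeBytes / (waitMs / 1000);
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Send chunks one at a time, pausing between them.
// `publish` is a placeholder for the real send function.
async function sendPaced(
  chunks: string[],
  waitMs: number,
  publish: (chunk: string) => Promise<void>,
): Promise<void> {
  for (const chunk of chunks) {
    await publish(chunk);
    await sleep(waitMs);
  }
}
```

For example, with 200,000-byte chunks and a target of 800,000 bytes per second, `delayBetweenChunksMs(200_000, 800_000)` gives a 250 ms pause between chunks.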
What is required?
Naturally, you’ll need the Streamr file transfer CLI. In addition, you’ll need the private key of a Web3 wallet that has access to the stream that you are listening to or sending to.
How do I use it and where can I get it?
The file transfer CLI operates in two modes: receive mode and send mode. For a transfer to succeed, the receivers must first set the CLI to receive mode. Once a receiver is waiting, the sender can send the file over Streamr to the receiver.
You can find the CLI and detailed instructions on how to operate it on GitHub or on npm.
Why build file transfer over Streamr?
The point of this proof of concept was to demonstrate that Streamr can support use cases outside of the typical pub/sub use cases. Now that we have shown that it is possible to send files over Streamr, we hope that it’ll spark some ideas and thoughts on what else it could be used for: for example, streaming live audio and video to millions of viewers, ‘airdropping’ files to any number of devices at once, or sending over-the-air updates to a large fleet of IoT devices. Any one-to-many use case should scale well thanks to the Streamr Network and the Streamr Protocol.
What are the caveats of the PoC file transfer client?
There are some caveats regarding the current implementation of the Proof-of-Concept (PoC) client. We have taken a simplistic approach to the implementation and have not addressed all the potential issues that may arise, such as intermittent connectivity problems between the sender and receiver. The primary objective of the command-line interface (CLI) was to showcase the feasibility of file transfers. Presented below are some of these limitations.
Sending faster than the recipients can receive may lead to lost file chunks
Other caveats:
- File transfer PoC is a naive implementation of transferring files over pub/sub
- Base64 encoding and wrapping chunks into JSON is not as performant as sending binary data. Base64-encoded data takes approximately 33% more space than the original binary. Streamr will support native binary data transport in H2 2023.
- Using private streams and encrypting the data will take approximately twice as much space in comparison to public streams
- Underlying WebRTC data channel payload size, message encoding and encryption set an upper limit for the chunk size, in our case roughly 750 kB for public streams and 350 kB for private streams. Recommended reading: https://tensorworks.com.au/blog/webrtc-stream-limits-investigation
- As the file transfer speed increases, so does the flakiness of the current solution
- Missing chunks will lead to a corrupted file, as chunks are expected to be received in order and are appended to the file in that order
- Resuming a file transfer is not yet supported: if chunks are missed during the transfer, the resulting file will be incomplete
- Synchronously asynchronous: files are read asynchronously but transferred synchronously. Receivers need to be listening to the stream before the file transfer starts so that they can receive and rebuild the file as it’s being sent
- Major differences between sending and receiving speeds may lead to lost chunks
- There are more robust strategies for file integrity, e.g. torrent-like protocols that keep track of which chunks have been successfully received and support sending file chunks in any order
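To make the last caveat concrete, here is a minimal sketch of a receiver that tracks which chunks have arrived instead of blindly appending them. It is not part of the PoC client: the `ChunkTracker` class and its methods are hypothetical, and it holds all chunks in memory for simplicity, which would not be practical for multi-gigabyte files.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical receiver-side bookkeeping: store chunks by id so that
// arrival order and duplicates do not matter, and gaps can be reported.
class ChunkTracker {
  private readonly received = new Map<number, Buffer>();
  private readonly totalChunks: number;

  constructor(totalChunks: number) {
    if (!Number.isInteger(totalChunks) || totalChunks <= 0) {
      throw new RangeError("totalChunks must be a positive integer");
    }
    this.totalChunks = totalChunks;
  }

  add(chunkId: number, data: Buffer): void {
    if (!Number.isInteger(chunkId) || chunkId < 0 || chunkId >= this.totalChunks) {
      throw new RangeError("chunkId out of range: " + chunkId);
    }
    this.received.set(chunkId, data);
  }

  // Ids of chunks that have not arrived yet, in ascending order.
  missing(): number[] {
    const ids: number[] = [];
    for (let id = 0; id < this.totalChunks; id++) {
      if (!this.received.has(id)) ids.push(id);
    }
    return ids;
  }

  // Concatenate chunks in id order; refuses to build an incomplete file.
  assemble(): Buffer {
    const gaps = this.missing();
    if (gaps.length > 0) {
      throw new Error("missing chunks: " + gaps.join(", "));
    }
    const ordered: Buffer[] = [];
    for (let id = 0; id < this.totalChunks; id++) {
      ordered.push(this.received.get(id)!);
    }
    return Buffer.concat(ordered);
  }
}
```

With this kind of bookkeeping, a receiver could ask the sender to resend only the ids returned by `missing()`, which is the basic idea behind resumable transfers.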
There’s a lot of room for improvement in the CLI, and we would love to hear your feedback.
You can find us on the Streamr Discord, or you can create a new issue on the CLI's GitHub repository.