Everyone knows what file transfers are, but transferring files over a pub/sub system is an atypical use case for a distributed messaging system. At Streamr, we have been experimenting with various proofs of concept and use cases, and we are now ready to introduce a proof-of-concept command-line interface (CLI) that can handle one-to-one and one-to-many file transfers over the Streamr network.
Each file chunk sent over Streamr is a JSON object that contains the unique sender identifier, the Base64-encoded binary chunk of the file, the message id, the file chunk id, the total number of chunks, the file size, and the MD5 hash of the complete file. Because multiple people may be using the same stream for sending and receiving, a new unique identifier is generated every time the CLI is instantiated, which makes it possible to distinguish who is sending what. The MD5 message digest algorithm is used to create a fingerprint of the file, which the receiving end uses to check the integrity of the file once the transfer has finished.
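To make the message layout concrete, here is a minimal Python sketch of how such chunk messages could be built and checked. It is not the CLI's actual code: the JSON field names (`senderId`, `messageId`, `chunkId`, and so on), the use of UUIDs for identifiers, and the helper names are assumptions made for illustration, and nothing is published to Streamr here.

```python
import base64
import hashlib
import json
import uuid


def make_chunk_messages(data: bytes, chunk_size: int, sender_id: str) -> list[str]:
    """Split data into JSON messages, one per chunk (field names are hypothetical)."""
    file_md5 = hashlib.md5(data).hexdigest()
    total = max(1, -(-len(data) // chunk_size))  # ceiling division, at least one chunk
    messages = []
    for chunk_id in range(total):
        chunk = data[chunk_id * chunk_size:(chunk_id + 1) * chunk_size]
        messages.append(json.dumps({
            "senderId": sender_id,
            "messageId": str(uuid.uuid4()),  # assumed format for the message id
            "chunkId": chunk_id,
            "totalChunks": total,
            "fileSize": len(data),
            "md5": file_md5,
            "data": base64.b64encode(chunk).decode("ascii"),
        }))
    return messages


def rebuild_file(messages: list[str]) -> bytes:
    """Append chunks in arrival order, then compare the MD5 of the result."""
    parsed = [json.loads(m) for m in messages]
    data = b"".join(base64.b64decode(p["data"]) for p in parsed)
    if hashlib.md5(data).hexdigest() != parsed[0]["md5"]:
        raise ValueError("MD5 mismatch: file is corrupted or incomplete")
    return data


sender_id = uuid.uuid4().hex  # a fresh identifier for every run
payload = bytes(range(256)) * 40  # 10,240 bytes of sample data
msgs = make_chunk_messages(payload, chunk_size=4096, sender_id=sender_id)
restored = rebuild_file(msgs)
```

Passing only some of the messages to `rebuild_file` raises a `ValueError`, which is how a lost chunk would surface once the transfer ends.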
What kind of performance can I expect?
We’ve tested file transfers with file sizes from a few megabytes to a few gigabytes, using different chunk sizes and wait times between chunks. As the CLI is at the proof-of-concept stage, slower uploads have produced more consistent transfers in our testing. Keeping the speed in the range of 700 to 1000 kilobytes has produced the most reliable transfers during our internal testing.
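The relationship between chunk size, wait time, and transfer speed can be sketched with a little arithmetic. The Python snippet below is only an illustration: it assumes the speed figures above are per second, treats 1 kB as 1000 bytes, ignores the time spent actually publishing each chunk, and uses hypothetical function names that are not CLI options.

```python
import math


def wait_between_chunks(chunk_size_bytes: int, target_bytes_per_second: float) -> float:
    """Seconds to wait per chunk so the average rate matches the target."""
    if chunk_size_bytes <= 0 or target_bytes_per_second <= 0:
        raise ValueError("chunk size and target rate must be positive")
    return chunk_size_bytes / target_bytes_per_second


def estimated_duration(file_size_bytes: int, chunk_size_bytes: int,
                       target_bytes_per_second: float) -> float:
    """Rough transfer time: number of chunks times the wait per chunk."""
    chunks = math.ceil(file_size_bytes / chunk_size_bytes)
    return chunks * wait_between_chunks(chunk_size_bytes, target_bytes_per_second)


# 500 kB chunks paced to roughly 800 kB per second.
wait = wait_between_chunks(500_000, 800_000)
# A 100 MB file at the same settings.
duration = estimated_duration(100_000_000, 500_000, 800_000)
```

With these example numbers the wait is 0.625 seconds per chunk, and a 100 MB file takes about 125 seconds.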
What is required?
Naturally, you’ll need the Streamr file transfer CLI. In addition, you’ll need the private key of a Web3 wallet that has access to the stream you are either listening to or sending to.
How do I use it and where can I get it?
The file transfer CLI operates in two modes: receive mode and send mode. For a transfer to succeed, the receivers must first set the CLI into receive mode. Once a receiver is waiting, the sender can send the file over Streamr to the receiver.
Why build file transfer over Streamr?
The point of this proof of concept was to demonstrate that Streamr can support use cases outside the typical pub/sub ones. Now that we have shown that it is possible to send files over Streamr, we hope it will spark some ideas about what else it could be used for. Examples include streaming live audio and video to millions of viewers, ‘airdropping’ files to any number of devices at once, or sending over-the-air updates to a large fleet of IoT devices. Any one-to-many use case should scale well thanks to the Streamr Network and the Streamr Protocol.
What are the caveats of the PoC file transfer client?
The current implementation of the proof-of-concept (PoC) client comes with some caveats. We have taken a simplistic approach and have not addressed all the issues that may arise, such as intermittent connectivity problems between the sender and the receiver. The primary objective of the CLI was to showcase the feasibility of file transfers. Some of the limitations are listed below.
- File transfer PoC is a naive implementation of transferring files over pub/sub
- Base64 encoding and wrapping chunks into JSON is not as performant as sending binary data. Encoding binary information to Base64 takes approximately 33% more space. Streamr will support native binary data transport in H2 2023.
- Using private streams and encrypting the data will take approximately twice as much space in comparison to public streams
- The underlying WebRTC data channel payload size, message encoding, and encryption set an upper limit for the chunk size, in our case roughly 750 kB for public streams and 350 kB for private streams. Recommended reading: https://tensorworks.com.au/blog/webrtc-stream-limits-investigation
- As the file transfer speed increases, so does the flakiness of the current solution
- Missing chunks will lead to a corrupted file, as chunks are expected to be received in order and are appended to the file in that order
- Resuming a file transfer is not yet supported; if chunks are missed during the transfer, the resulting file will be incomplete
- Synchronously asynchronous: files are read asynchronously but transferred synchronously. Receivers need to be listening to the stream before the file transfer starts so that they can receive and rebuild the file as it’s being sent
- Major differences between sending and receiving speeds may lead to lost chunks
- There are more robust strategies for file integrity, e.g. torrent-like protocols that track which chunks have been successfully received and support sending file chunks in any order
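To illustrate the last point, the following Python sketch shows one way a receiver could track chunks by index instead of appending them in arrival order. It is not how the PoC client works, and it is far simpler than a torrent-like protocol; the `ChunkAssembler` class and its methods are hypothetical names used only for this example.

```python
import hashlib


class ChunkAssembler:
    """Collects chunks by index so that arrival order does not matter."""

    def __init__(self, total_chunks: int, expected_md5: str) -> None:
        self.total_chunks = total_chunks
        self.expected_md5 = expected_md5
        self.chunks: dict[int, bytes] = {}

    def add(self, chunk_id: int, chunk: bytes) -> None:
        if not 0 <= chunk_id < self.total_chunks:
            raise ValueError(f"chunk id {chunk_id} out of range")
        self.chunks.setdefault(chunk_id, chunk)  # keep the first copy, ignore duplicates

    def missing(self) -> list[int]:
        """Indices that have not arrived yet, e.g. to request a resend."""
        return [i for i in range(self.total_chunks) if i not in self.chunks]

    def assemble(self) -> bytes:
        """Join chunks in index order and verify the MD5 of the whole file."""
        if self.missing():
            raise ValueError(f"missing chunks: {self.missing()}")
        data = b"".join(self.chunks[i] for i in range(self.total_chunks))
        if hashlib.md5(data).hexdigest() != self.expected_md5:
            raise ValueError("MD5 mismatch: file is corrupted")
        return data


original = b"streamr file transfer demo " * 100  # 2,700 bytes
parts = [original[i:i + 512] for i in range(0, len(original), 512)]
assembler = ChunkAssembler(len(parts), hashlib.md5(original).hexdigest())

# Deliver the chunks in reverse order and drop chunk 2 on the way.
for index in reversed(range(len(parts))):
    if index != 2:
        assembler.add(index, parts[index])

missing_before_resend = assembler.missing()
assembler.add(2, parts[2])  # simulated resend of the lost chunk
result = assembler.assemble()
```

Because the receiver knows exactly which indices are absent, a lost chunk becomes a resend request rather than a corrupted file.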
There’s a lot of room for improvement in the CLI, and we would love to hear your feedback.