01 February 2013

p2p File Sharing - a Dropbox Killer?


I've been a big fan of Dropbox since it came out, even with the security ups and downs along the way.  They offer plenty of storage for free and it Just Works.  However, two recent announcements have got me thinking about how a competitor might go after Dropbox and other similar (mostly inferior!) products like Box.Net, Microsoft Skydrive and Google Drive.

Background - Some Recent Announcements


The first announcement was from a company called LogMeIn, and their new network storage and sync product Cubby.  Cubby's subtle twist versus Dropbox is that they offer p2p syncing called "DirectSync" (no "cloud" centralised data store required).  LogMeIn has been around for a long time and their product/service Hamachi has been a semi-tech way to set up a VPN between systems - basically a "Goto My PC" for a slightly more technical audience.  Here is the important bit: they have a long history as a p2p communications enabler (UDP hole punching, STUN, TURN, and related tricks) to make the p2p Hamachi service (and now Cubby) work behind NAT and firewalls.

The second announcement was by BitTorrent and their new Sync product.  BitTorrent is of course the eponymous creator of the BitTorrent protocol, a very successful and broadly adopted way of sharing files.  Sync is also a p2p, no cloud required, network sync product.  BitTorrent, the underlying protocol, also has a very well established p2p communications enabler.

(Postnote: See also subsequent blog article A Closer Look at BitTorrent's SyncApp for related information.)

Now, who is the biggest, best-known p2p service out there, one with a very effective p2p communications enabler scheme and plenty of money and technical capability behind them?  Skype of course, with Microsoft's deep pockets and Skydrive capability.

Dropbox: You've been warned.

Network storage and sync: Cloud vs p2p


Of course, we don't really need a cloud service to sync directories and files between multiple devices.  rsync and boatloads of scripting around it have been around for many years.  What Dropbox and others did was combine rsync, a cloud-based backup and control location, and It Just Works software with sufficient and simple usability to make it all work.

Skype has shown us that a p2p approach, with highly functional p2p communications enablement to hook everything together, works just fine.  Articles on Skype discuss this very point of how they keep their infrastructure costs down by only setting up communications between two people (two devices), not actually handling the communications data itself unless they have to (setting aside group coms, super nodes and must-do coms relays to simplify a bit).  Skype wisely focused on value-added services like Skype Out to generate revenue.

BitTorrent has shown us that a flexible and lightweight p2p connection broker service can copy (sync!) massive amounts of data quickly.  And because p2p "cloud-based" components (mostly) only manage connection and data location coordination and don't manage the actual content like Dropbox does, they are light, inexpensive, and resilient (especially against content-rights litigators!).
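To make the "coordination only" idea concrete, here is a minimal sketch of a connection broker that knows which peers hold which shared folder but never sees any file content.  The class name `Coordinator`, its methods, and the folder/peer identifiers are all hypothetical and for illustration only; this is not BitTorrent's tracker protocol or anyone's actual service.

```python
class Coordinator:
    """Hypothetical broker: maps folder IDs to peer addresses, nothing more."""

    def __init__(self) -> None:
        # folder_id -> set of peer addresses currently offering that folder
        self._peers: dict[str, set[str]] = {}

    def announce(self, folder_id: str, peer_address: str) -> None:
        """A peer tells the broker it holds (part of) this folder."""
        self._peers.setdefault(folder_id, set()).add(peer_address)

    def withdraw(self, folder_id: str, peer_address: str) -> None:
        """A peer goes offline or stops sharing this folder."""
        self._peers.get(folder_id, set()).discard(peer_address)

    def lookup(self, folder_id: str, asking_peer: str) -> list[str]:
        """Return the other peers to contact directly for this folder."""
        return sorted(self._peers.get(folder_id, set()) - {asking_peer})


broker = Coordinator()
broker.announce("photos", "laptop.example:7000")
broker.announce("photos", "nas.example:7000")
print(broker.lookup("photos", "laptop.example:7000"))
```

The broker's state is only a small lookup table, which is why this kind of component stays light and inexpensive compared with hosting the data itself.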

What's the Minimum Viable Product feature set to push p2p sync over the line?


In addition to the fundamental idea of p2p sync, I think there are a few additional key features to create this new, hypothetical, revenue-generative p2p-based Dropbox killer:

1. Just Works
  • Dropbox absolutely gets this one right
  • With less reliance on the cloud, this should be even easier in a p2p world
  • Stable (no Microsoft BSODs), fast
  • Doesn't crush system performance by hogging all available disk I/O for indexing, CPU for sync calculations, or Internet or local network I/O for sync transfers (if possible, sync directly over the local LAN so no public Internet bandwidth is required to shovel data between devices)
  • At least the efficiency of rsync in the sync approach (e.g., block level not file level syncing)
  • Works in corporate environments with strong firewalling (e.g., relay if necessary, uses typically unblocked outbound ports 80/443)
2. Pervasive - available on all high-use platforms and applications
  • Desktop, Laptop - OS X, Windows, Linux
  • Mobile - iOS, Android
  • Application access: simply local filesystem use where possible (desktop/laptop) or an API for sandboxed environments like mobile
3. Secure
  • All data sent over the network is encrypted, both to peers and to the cloud based service coordinator (e.g., public/private key)
  • Service coordination:
    • Authenticates you as a valid user, your account information, including billing information for billable value-add services
    • Authorises the nodes you want to keep in sync
    • Authenticates and authorises others' nodes you are sharing data with
    • Knows nothing about your data (no relay function unless required - and even then only passes through encrypted information)
  • All data is encrypted at rest in each participating node (see BoxCryptor - a big hole and opportunity in Dropbox's offering)
4. Usability
  • BitTorrent really struggles here (indexing, trackers) but a viable connection coordinator (e.g., Skype, Cubby) would solve this problem.
  • Tight filesystem integration (like Dropbox)
    • Async recognition of folder/file changes
    • Iconic representation of sync status in Finder/Explorer windows
  • Clear, simple communication of sync status between devices within a client and potentially iconically:
    • Which device(s) has the master/newest version of a file (think BitTorrent's Seed/Leech)?
    • How synchronised is a file, directory, or everything? (as a % complete)
    • How "safe" is a file?
      • # of master/complete copies (# of devices on which file fully exists)
      • Are devices geographically distributed or co-located? (client asks OS for location info if available)
  • Basic management of sync conflicts
    • Visual notification of conflicts (Notification, Finder/Explorer iconics)
    • Filesystem filename changes (as per Dropbox)
    • Client logfile of existing, unreconciled conflicts with basic guidance on how to clear them
  • Can flag and manage favourites and high-priority sync choices
  • Camera/photo find and sync - I use this for all photo sources now
5. Sharing folders/files with others via syncing
  • BitTorrent excels at this, but not in a secure way
  • Use the cloud service coordinator to authenticate invited users; local node authorises access to authenticated users
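As an illustration of the "block level not file level" point in the list above, here is a minimal sketch that sends only the blocks of a file that differ.  It is a deliberate simplification: it compares fixed-offset blocks by SHA-256 hash and is not rsync's rolling-checksum algorithm, so an insertion near the start of a file would mark every later block as changed.  The function names and the tiny `BLOCK_SIZE` are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4  # tiny for demonstration; a real client would use kilobytes


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Hash each fixed-size block of the data."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def changed_blocks(old: bytes, new: bytes,
                   block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Return {block index: block bytes} for blocks of `new` that differ from `old`."""
    old_hashes = block_hashes(old, block_size)
    delta: dict[int, bytes] = {}
    for index, offset in enumerate(range(0, len(new), block_size)):
        block = new[offset:offset + block_size]
        is_new_block = index >= len(old_hashes)
        if is_new_block or hashlib.sha256(block).hexdigest() != old_hashes[index]:
            delta[index] = block
    return delta


def apply_delta(old: bytes, delta: dict[int, bytes], new_length: int,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the new file from the old copy plus the changed blocks."""
    block_count = (new_length + block_size - 1) // block_size
    blocks = [delta.get(i, old[i * block_size:(i + 1) * block_size])
              for i in range(block_count)]
    return b"".join(blocks)[:new_length]


old_copy = b"aaaabbbbcccc"
new_copy = b"aaaaXXXXccccdd"
delta = changed_blocks(old_copy, new_copy)
rebuilt = apply_delta(old_copy, delta, len(new_copy))
print(sorted(delta), rebuilt == new_copy)
```

Only the second block and the new trailing block cross the network here; the unchanged blocks are reused from the copy the receiving node already has.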

What does this new p2p network storage and sync world look like?


Fast forward to a world where a company like BitTorrent Sync or LogMeIn Cubby has successfully deployed a working, free + billable services p2p sync product that has the required MVP features to compete with and beat Dropbox.

The user downloads/buys the software for each device they want to sync/copy their data between - just like Dropbox.  The client includes a simple dashboard indicating sync status and data "safety".  The client can administer all participating nodes and shared folders, and is a gateway to buying additional services.  The user can see the status of all devices, a summary of the content on all devices, and when each device was last sync'ed/seen.

Offer a user a "free" version of something that feels close to what Dropbox is today, but enables the synchronisation of an unlimited amount of data across any number of personal devices.  In fact, in general the more devices you store the data on, the faster sync happens and the "safer" the data is.

Why can't Dropbox and similar existing services do this today?


They could certainly do some of it, particularly around simplified usability.

However, Dropbox and most others have a legacy business model and investment in cloud storage, not p2p.  This will hold them back from p2p.

I'm not sure Skype can pull this off either.  Now that Skype is owned by Microsoft, one can guess the internal politics between Microsoft's Skydrive and the new Skype team will slow any progress to develop this to a crawl.  Besides, Microsoft rarely demonstrates Internet-first thinking.

What that means of course is that a new player like Cubby or BitTorrent Sync may be able to slip in.

How to make money in p2p


BitTorrent never made any money on p2p.  Just a big "thanks" from the tech-savvy Internet community for the protocol that has enabled fast and resilient file sharing for years.  However, I have a penchant for actually making money as well as technical elegance, so here are some ways that might happen with a p2p sync product.

Software sales.  Clients must be purchased for each platform you want to synchronise between.  I think users would generally accept a single shot, nominal software fee for client software purchases.  Certainly after beta and before a mass-market ramp.  Each new device type is a new software purchase.  BoxCryptor and 1Password successfully use this model.

Services.  Much to the frustration of the media rights holders, p2p (a la BitTorrent) doesn't enable a path for them to make money.  The key of course is offering truly valuable services on top of the p2p service.

Cloud based service coordination.  I think a small fee per year could be charged after the first year for data relay, authentication and similar coordination/security services.  Certainly enough to cover related operational costs.  Looks like Cubby is doing this today.

The most obvious way to generate revenue is to create value-add services on top of cloud storage and data transfer to/from cloud storage.

A backup service is the most obvious value-add offering.

Basic backup service.  Users want to be confident that their precious documents and digital photographs are backed up to a safe and secure location.  Particularly if you make a separate geographic location an important criterion for receiving a "You're safe!" rating for data redundancy in the status dashboard.  From a security perspective, cloud based backups must be a "locked box" with only the encrypted format of the file backed up to the cloud and with only the user having the key to decrypt the files (client side de/encrypt) - not like Dropbox, which stores all your data on their cloud servers in an unencrypted format.
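A "You're safe!" rating like this could be derived from the two numbers discussed earlier: how many complete copies exist and whether they sit in more than one location.  The sketch below is one possible scoring rule, not a description of any existing product; the `Replica` fields, the coarse location labels, and the thresholds (two copies, two locations) are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    device: str
    location: str   # hypothetical coarse label, e.g. a city or "cloud"
    complete: bool  # True if the device holds the whole file


def safety_rating(replicas: list[Replica]) -> str:
    """Rate a file by its number of complete copies and their spread."""
    complete = [r for r in replicas if r.complete]
    locations = {r.location for r in complete}
    if len(complete) >= 2 and len(locations) >= 2:
        return "safe"
    if len(complete) >= 2:
        return "at risk: all copies in one location"
    if len(complete) == 1:
        return "at risk: single copy"
    return "unsafe: no complete copy"


copies = [
    Replica("laptop", "home", complete=True),
    Replica("nas", "home", complete=True),
    Replica("phone", "home", complete=False),
]
print(safety_rating(copies))
print(safety_rating(copies + [Replica("backup", "cloud", complete=True)]))
```

In this example the file only reaches "safe" once the cloud backup is added, which is exactly the nudge toward the billable backup service.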

Versioning service.  Deliver as an extension to backup, perhaps like Apple's Time Capsule as a growing number of people understand the Time Machine concept.  Pay as you use for frequency and size of backups/versions.

High value local application data backup services.  Some of the below are structured data files meaning backup may not just be a simple copy.  Seek and suggest high value targets for sync and backup by offering a billable backup extension:
- Photos, iPhoto DB
- Contacts
- Calendar
- Email (local)
- Video
- 1Password DB
- BoxCryptor DB
- iTunes purchased music
- Purchased software
- ...

High value social/internet backup services.  Offer each as a billable backup extension.  A little like Facebook's timeline, but platform neutral.  Use APIs to pull out social media contributions and references:
- Email (hotmail, google, ...)
- Facebook
- Twitter
- LinkedIn
- Blogger
- ...

Web-based publication service.  Provide cloud storage and controls to Internet publish and share your content.

Specialised hardware.  Offer dedicated NAS devices that directly support the p2p sync protocol or license the protocol to NAS makers (BitTorrent Sync hits this point).  Offer from 1 to 5 disk chassis.  The sync client will automatically discover any new device connected to the local network and offer to configure it for you.  Usability will be critical.

Technical Challenges


I didn't say there aren't technical challenges here.  A few of them might be:

1. How does each node determine which file is the master among all nodes?  When should it sync the newest version versus flag a sync conflict and rename files?  This will need a strategy for dealing with inaccurate system clocks.

2. Mobile versus desktop/laptop.  How do you manage limited space and CPU access on mobile devices versus always-connected and plenty-of-horsepower desktop/laptops?  Caching and sync prioritisation is tricky in a mixed node environment.
  • User can explicitly mark high/low priority data (BitTorrent client "Transmission" has this feature) - always a high priority or just a high priority until everything in sync then back to normal priority
  • User can explicitly mark favourites to imply an on-going high priority
  • Automatically identify "hot" files through regular/frequent/recent use.  Files you are actively working on are implicitly prioritised for sync across all nodes.
  • Amazon Kindle's approach to "cloud" vs "device" location of books is a good usability model to consider and will educate mainstream users on this model
  • Always prioritise what user is asking for right now in the front of the sync work stream, ahead of what is pending (for other reasons) to be sync'ed.
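On the first challenge above (deciding which copy is newest without trusting system clocks), one well-known option is a version vector: each node keeps a counter per node and bumps its own counter on every local edit.  The sketch below is a minimal illustration of that general technique, not something either Cubby or BitTorrent Sync is known to use; the function names and node labels are hypothetical.

```python
def record_edit(vector: dict[str, int], node: str) -> dict[str, int]:
    """Return a copy of the vector with this node's edit counter bumped."""
    updated = dict(vector)
    updated[node] = updated.get(node, 0) + 1
    return updated


def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Classify two versions of a file without looking at any timestamps."""
    nodes = set(a) | set(b)
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "conflict"  # edited independently: flag and rename
    if a_ahead:
        return "a_newer"   # safe to overwrite b with a
    if b_ahead:
        return "b_newer"   # safe to overwrite a with b
    return "equal"


def merge(a: dict[str, int], b: dict[str, int]) -> dict[str, int]:
    """Vector to record once the user has reconciled a conflict."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


laptop = record_edit({}, "laptop")        # file created on the laptop
phone = record_edit(laptop, "phone")      # synced to the phone, then edited there
print(compare(phone, laptop))             # phone descends from laptop's version
laptop = record_edit(laptop, "laptop")    # laptop edited before seeing phone's edit
print(compare(phone, laptop))             # neither descends from the other
```

Because only counters are compared, a device with a badly wrong clock cannot cause a newer edit to be silently overwritten; at worst the client reports a conflict for the user to resolve.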

Bits and pieces


There are a few other factors to consider when looking for Dropbox weaknesses.

Dropbox incurs a competitive disadvantage born of their very successful "share with a friend" referral model to build up membership - a whole bunch of freeloaders aren't paying for the service but are still creating operational costs.  Of course, freeloaders don't cost Dropbox - the costs are borne by customers who pay for Dropbox services by paying somewhat more than the true cost of their service.  Assuming you don't sync media and just stick with documents, 5GB of free storage plus the refer-a-friend space bonuses is room for *a lot* of documents.

Unfortunately, Cubby appears to have copied the Dropbox business model instead of offering (e.g.) a limited duration trial and aggressive shutdown of freeloaders.  I would recommend that Cubby emphasise unlimited space free syncing for one year then charge for value-add services like data relaying and cloud backup.

Dropbox also has a higher cost by hosting their storage in AWS' S3.  They must be at a point where a dedicated in-house equivalent service with an S3 simulation wrapper around it would be cheaper.

Dropbox is still semi-technical in nature - further usability improvements are possible.

BitTorrent including their name in marketing their Sync product will be a mistake if mass-market usage is their goal.  BitTorrent file sharing has in part been inhibited from widespread public acceptance because of its association with illegal activity (media rights violation).  However, so long as I'm sharing my own (rightfully obtained) files... at least between my own devices... for my own use... there should be no violation.  Of course, with the BitTorrent approach there is bound to be a temptation to co-mingle media sharing with sync, which will also inhibit mass-market acceptance.

The p2p approach to storage will take a long time to be adopted in corporate environments.  Just look at the struggle of Skype and Dropbox in the Enterprise continuing even today.

Conclusion


There is an emerging trend now to think "mobile first" in development.  Companies that are oriented toward browser based "traditional" Internet service consumption are at risk because a mobile first equivalent can come along and end-run them.  Similarly, p2p for network file storage and sync could easily become a disintermediating force for file sync and share services that currently think cloud first.  Is p2p sync a viable product in the "post cloud" world?  At least in this use case?

So who might pull this off?
- Although Skype *should* be the best horse to bet on, the Microsoft purchase, "Internet Last" thinking and internal politics may kill all hope
- I don't think BitTorrent Sync will be able to go mass market - too much baggage.

So that leaves LogMeIn Cubby as the one that could pull this off and steal market share from Dropbox and other "last year's tech" cloud storage and sync providers.  Of course, there are all the other MVP features above they need to get right as well, which is chancy.  Or perhaps some other startup or early/quiet competitor who's ramping up their operation right now…

(Postnote: A blog article specific to Bit Torrent's SyncApp was written a few days after this one.)

3 comments:

  1. What about the free LifeStuff ? Supposed to be released soon.

  2. Looks like Cubby have dropped the ball with a way too expensive pricing structure for those that are mainly interested in peer to peer file sync - if I haven't dropped money on a Dropbox subscription, slim chance I will do so on Cubby!

    I'm in the process of trying the Bitorrent Sync solution, and so far so good!!!

  3. very good analysis of requirements! - thank you!
    andrew

