How hard is it to build a large open source community? [spoiler alert: it’s very hard]
In this post I want to tell you about the challenges we faced in getting open source community traction and what I would call the “tyranny of numbers” (wink). The numbers presented are not scientific by any measure, but they are based on actual data from the Parallella project over the last 3 years.
To grow an open source community you need a diverse set of collaborators. Active collaborators have the following necessary traits:
- Common interests and goals (with you)
- Skills to contribute
- Time to contribute
- Motivation to contribute
Common Interests (C): There will never be multi-party collaboration unless there is a common goal. In the case of Parallella, the common interest factor was open source parallel computing. While we found thousands of Kickstarter backers ready to buy our 18-core, $99 Parallella computers, only a small fraction of our backers (~10%) were actually interested in programming the device.
Skills (S): Easy tasks can be knocked out by the core team in no time; it’s the really hard stuff that you need co-conspirators for. The beauty of open collaboration is that one person’s hard might be another person’s easy. In our case, chip design was the easy part! Everything else was new (and thus hard!). The harder the problem is, the fewer people there are in the world with the skills needed to contribute. For the Parallella project, the skills needed weren’t exactly mainstream. We needed collaborators with skills in parallel programming, low-level C programming, runtimes, algorithms, board design, FPGA development, and Linux kernel drivers.
Time (T): Any task takes time, and difficult tasks take more time than simple ones. I don’t know of any skilled engineer who has a big surplus of time! (If you know of any, send them my way please!) There have been billions of dollars spent on parallel programming research to date, so clearly the task at hand for Parallella was non-trivial. There have been over 55 academic publications/reports to date around the Parallella and 8 parallel programming frameworks, with the average time spent per project estimated at more than 200 hours. At $50/hour that’s approximately $500,000 worth of contribution. We didn’t pay for this time, but somebody did (grants, universities, tuition, etc.).
Motivation (M): Open source developers with interest, skills, and excess time are exceedingly rare, but that’s not enough! There might be 10-100 other projects and technologies out there competing with your project for attention. To “win” a collaborator, you will need to convince her/him to join your project and invest precious time. Parallella as a hardware platform has competed with 100 different low-power ARM-based Linux computers, 10 different low-cost FPGA platforms, and two incumbent HPC technologies (CPUs & GPUs).
If C, S, T, M are expressed as fractions of some Contributor Availability Total (“CAT”), then the potential actual contributors can be expressed as:
Contributors = CAT x C x S x T x M
In the case of Parallella:
- CAT: Equal to boards sold, ~10,000
- Common Interest (C): Less than ~10% of the customers care about open source enough to contribute back out of good will.
- Skills (S): Based on our experience, only ~10% of Parallella customers had the skills needed to program the device effectively.
- Time (T): Only ~10% of those who bought the board ever actually found the time to program it (regardless of skill level).
- Motivation (M): Assumed to be 100% if they bought the board, but that could change over time (e.g. timing, a change of priorities, or a bad fit after purchase).
Plugging in these rough estimates, we find that 1/1000 of all Parallella CATs (i.e. ten people in total) are candidates for being active project contributors. In actuality, based on the Parallella examples and publications, the number seems closer to 1/100, but you get the point. It’s a great start, but it’s certainly not Linux or Raspberry Pi type community numbers. I will be forever thankful to the ~100 friends who helped drive the Parallella project forward over the last 3 years, but of course I wish the community had been larger and more active.
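For anyone who wants to play with the numbers, here is a minimal Python sketch of the formula above. The function name `potential_contributors` and the dictionary layout are my own illustration, and the rough "~10%" estimates are treated as exactly 0.10, which is an assumption made purely for the sake of the arithmetic.

```python
from math import prod


def potential_contributors(cat, factors):
    """Contributors = CAT x C x S x T x M.

    `cat` is the Contributor Availability Total and `factors` maps each
    fall out factor to a fraction between 0 and 1.
    (Hypothetical helper, written only to illustrate the formula.)
    """
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return cat * prod(factors.values())


# Rough Parallella estimates; "~10%" is assumed to be exactly 0.10 here.
parallella = {"C": 0.10, "S": 0.10, "T": 0.10, "M": 1.00}

print(round(potential_contributors(10_000, parallella)))     # ~10,000 boards sold
print(round(potential_contributors(1_000_000, parallella)))  # a CAT of one million
```

With these inputs the first call gives about 10 candidate contributors, and the same fall out factors applied to a CAT of one million give about 1,000. Because the factors multiply, raising any single one of them from 10% to 20% doubles the result.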
In the case of hardware projects like Parallella, the CAT number is highly correlated with hardware cost, since people actually have to buy something to be part of the community. In this respect, open source software communities are much easier to grow. If we could have reduced the selling price of the Parallella from $99 to $25, we could have increased our CAT numbers greatly, but our fall out factors would also likely have been lower overall (i.e. a higher ratio of consumers to programmers). Thus it’s not clear it would have been worth it, even if it had been possible (which it wasn’t).
My conclusion: If you want to grow a large and flourishing community (software or hardware) you need to make sure your “CAT” is well over 1 million users or do everything possible to improve all of the fall out factors. [The next post will discuss some of the things you can do to improve collaborator fall out factors]