ARN

Why open source is selfish

Companies don't support open source for purely altruistic reasons. They expect a return on their investment.
  • Matt Asay (InfoWorld)
  • 07 September, 2021 11:45

A friend recently sent me a DM on Twitter, suggesting the thing AWS really needs is “a flagship [open source] project” to boost its open source bona fides. He then offered some examples of what others have done: “Where’s AWS’s Android, Kubernetes, Tensorflow, VS Code?” Most of these are from Google, with the exception of vscode, which is a Microsoft project (not to be confused with Visual Studio Code, which is built on vscode but isn’t itself open source). It’s a familiar argument, but not a persuasive one. After all, AWS has Firecracker, the CDK, and other open source projects. But that’s not really the issue.

My problem is with the implied suggestion that companies contribute to open source out of altruism, that they’ve built up positive open source reputations by blessing the world with peace, love, and open source code. This makes for clever tweets, but it’s a false narrative. Developers may contribute for the sheer love of code; companies don’t. Never.

Therefore, it’s useful to ask why a company has, or has not, contributed code.

Open source is hard work

Perhaps you’ve worked for companies with unlimited resources. I haven’t. Even fabulously wealthy organisations funded by runaway successes like Google’s advertising business, Adobe’s Photoshop, Microsoft’s Windows and Office cash cows, etc., always have finite resources.

Now, couple that with the reality that open source is hard.

How hard? Matt Klein, a senior engineer at Lyft and founder of the successful Envoy open source project, says that it’s a “f—-ing lot of work.” Not just coding, either, but all the other things (marketing, business development, etc.) that go into making a project successful. Worse, there’s no way to know in advance if all that work will pay off: “The benefits are not super clear. It’s not a slam dunk. You don’t know if you’re going to win, and if you don’t win, it’s a net negative.”

Even if you’re an unaffiliated developer building open source code in your free time, the demands on your time keep increasing, as Tidelift Cofounder Luis Villa has explained. “Developers clearly serve their self-interest by learning basic programming and people skills. It is less clear that they serve their self-interests by becoming experts in issues that, in their day jobs, are likely delegated to experts, like procurement, legal, and security.” Yet an open source project maintainer increasingly needs to think about the end-to-end security of her project, file-level licensing, and more. It’s a “f—-ing lot of work,” to borrow Klein’s phrase.

This is why Lyft now evaluates whether to open source code based on whether or not they think they can “win” with the project, attracting enough outside interest to make it worth all the bother. “I’m not an open source purist,” Klein says. “I’m a capitalist.”

He’s not alone.

The ‘why’ of open source

We can laud Facebook and Google for their contributions to open source artificial intelligence (AI) software like PyTorch and TensorFlow, respectively, but let’s not kid ourselves that the companies released this code out of dazzling benevolence. In the past, I’ve talked about cloud companies using open source as on-ramps. Recently, Brookings Institution Fellow Alex Engler picked up this theme, suggesting that “for Google and Facebook, the open sourcing of their deep learning tools (TensorFlow and PyTorch, respectively), may have [the effect of] further entrenching them in their already fortified positions.” Half a decade after releasing the code, these companies still do most of the development (which is just as true of AWS with its Firecracker and CDK projects and of Microsoft with vscode, lest you think I’m picking on Google and Facebook).

Why does it matter? Because open source gives both companies a key, strategic lever to pull, argues Engler: “By making their tools the most common in industry and academia, Google and Facebook benefit from the public research conducted with those tools, and, further, they manifest a pipeline of data scientists and machine learning engineers trained in their systems. In a sector with fierce competition for AI talent, TensorFlow and PyTorch also help Google and Facebook bolster their reputation as the leading companies to work on cutting-edge AI problems.”

I’m not suggesting the companies are bad for doing this. I’m simply suggesting that companies don’t contribute code out of charity. Resources are finite. If a company spends money and resources to contribute code, it’s because it has done the math and believes it will earn a return on that investment.

Let’s look at Microsoft as an example.

A few examples of capitalistic open source

Microsoft is the world’s largest open source contributor as measured by the total number of employees actively contributing on GitHub. (Yes, I know this is an imperfect way to measure. Happy to hear your alternatives.) Why does Microsoft contribute? A few years ago I argued that quite often, “Open source is what underdogs do to win.” Despite its heft on the desktop and enterprise data center, Microsoft used to be a rounding error in cloud. One way the company sought to earn developer love and a seat at the cloud table was by metamorphosing from open source pariah into open source hero. It took years, but it’s paying dividends in terms of rising market share for Microsoft Azure.

Then there’s Google. Beyond its high-profile projects like Kubernetes (an opening salvo in the multicloud war, which has become a primary competitive wedge for Google) or Android (helping dislodge Apple’s lock on the smartphone market), Google has also been quick to partner with open source companies. But that work, Google Open Source Director Chris DiBona said back in 2019, isn’t due to “some sort of generous magical deal.” It was a way to “give customers what they want.” At the time, it also happened to be a way to effectively position Google Cloud against its competitor AWS.

What about AWS? AWS has arguably had less need to open source its code. Why? As the cloud market leader, anything that potentially helps competitors catch up would probably not get approval within the company, unless there was overriding strategic value. Using that lens, let’s look at Firecracker, a new kind of virtualisation technology that powers AWS serverless products such as Lambda. When it was announced, company representatives noted: “As our customers increasingly adopted serverless, we realised that existing virtualisation technologies were not developed to optimise for the event-driven, sometimes short-lived nature of these kinds of workloads. We saw a need to build virtualisation technology specifically designed for serverless computing.”

I wasn’t part of the team that released Firecracker, so I have no inside knowledge of the rationale. But those two sentences suggest that the company is hoping that more Firecracker equals more serverless adoption which, presumably, will increase the AWS lead in that market. Nefarious? Absolutely not. But at AWS, as at Lyft, Microsoft, Google, and every other company, things don’t get open sourced unless there’s a compelling business reason.

Perhaps my friend is correct. Perhaps AWS does need to open source some big flagship product. But if it does, it won’t be because AWS wants to improve its reputation with random folks on Twitter (or writers like me). The reason will be, as with Google and others, to help drive greater customer adoption of its own products. This is just how (open source) business works.