Open Source and Proprietary Software Development

Open Sources 2.0

Chris DiBona

In this chapter, I present a perspective on the similarities, differences, and interactions between open source and proprietary software development.

Proprietary Versus Open Source?

Before you go any further, throw off any notion that the proprietary developer is somehow a different person from the open source developer. It is uncommon for a member of the open source developer community to do only open source for a living. Only the most prominent, or loaded, members of the open source community come close to having this kind of freedom. It is indeed rare to find a developer who develops only with proprietary tools and libraries. Even Visual C++ and C# developers benefit from a great variety of code and libraries that are free for use in their programs.[1]

My career has focused on open source development for the last 10 years, and I'm constantly pleasantly surprised by how much open source and proprietary development resemble each other. I believe this is because proprietary developers are educated by the adventures of their slightly crazy open source cousins, but I also know that open source developers have learned just as much from proprietary developers.

Don't read this as an attempt to muddy the difference between proprietary and open source programs. They are different, sometimes very much so. However, they come from the same people, and they're using a lot of the same methods and tools. It is the licenses and the ideals behind open source programs that make them remarkable, different, and revolutionary.

The Example Culture

A lot of people, when talking about open source software development, say that open source developers enjoy a great productivity gain from code reuse. This is true, but in my experience all developers, not just open source developers, benefit from the existence of free-of-charge standard libraries and code snippets. For decades, proprietary developers have had a great variety of prepackaged libraries to choose from, but these proprietary libraries haven't taken root in the same way that freely usable, open libraries have.[2]

Code reuse? Knowledge reuse!

In Linus Torvalds' essay from the first Open Sources, he talked about how the rise of open code was delivering on the promise of reuse touted by proponents of the Java programming language specifically and object-oriented programming in general.

That said, it has been my experience that there is a point at which software developers will go out of their way to avoid reusing code from other projects. In some shops, they call it "not invented here" (NIH) syndrome, and some companies are famous for it. But even those shops use standard kernels, libraries, and compilers. The real difficulty here is in figuring out where the NIH line lies. Although the answer is different for every single programmer and team, all still can (and still do) learn from the open code out there, which is a unique advantage of open code. While both open and proprietary code can be reused in a wide range of circumstances, open code enables something further: knowledge reuse. By examining the code itself, the developer can learn how a particular problem is solved, and often how that solution is an instance of a general solution type. It is this kind of reuse that Linus applauded and that the NIH developer misses.

Then why not simply use other people's code? There are a number of factors to consider before code is incorporated, and these must be weighed before one can appreciate the role that Free Software has had in development.

Speed of development

There are very real barriers to using other people's code. You have to examine how to interface with said code, and you need to review the code to make sure it meets your standards for security, license, style, and correctness. You also need to integrate it into your version control and build system.

None of these problems is insurmountable, but they have to be worth surmounting. To wit: if all I need is a routine to do something simple, such as iterate through an array of numbers and perform some simple operation on them, using someone else's software would be a waste of time.

When developing, I like to use large libraries only when I either don't want to deal with a technology, or I don't fully understand it and don't feel qualified to implement it. For a recent project, I was pulling newsfeeds from weblogs and performing a kind of natural-language processing on the English text in them. I thought that using a tool called a "stemmer" to normalize the data would make my later analysis more accurate.

Implementing the routines to download and process feeds could have taken a month or two, and this is exactly the kind of development I don't like to do. To properly implement a stemmer, I'd likely have to get my graduate degree and then write it—which would impact my deadline a bit—so I downloaded programmer-friendly libraries that did each of these tasks. The stemmer was available under the Berkeley Software License, and the feed parser was available under the Python Software License, both of which are very easy to deal with and do not require any onerous post-incorporation duties. I was thus able to save time and have better code.
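Glued together, the borrowed pieces amount to only a few lines. Here is a minimal sketch, assuming a feed parser such as feedparser and a Porter stemmer such as the one shipped with NLTK (these are stand-ins for whichever libraries were actually used, and the feed URL is a placeholder):

import feedparser                            # Universal Feed Parser
from nltk.stem.porter import PorterStemmer   # one freely available stemmer

def stemmed_words_from_feed(url):
    """Download a newsfeed and return its entry text as stemmed, normalized tokens."""
    feed = feedparser.parse(url)             # copes with the many incompatible feed formats
    stemmer = PorterStemmer()
    words = []
    for entry in feed.entries:
        text = entry.get("title", "") + " " + entry.get("summary", "")
        for word in text.split():
            words.append(stemmer.stem(word.lower()))
    return words

print(stemmed_words_from_feed("http://example.org/feed.xml")[:20])

The point is not these particular libraries but the division of labor: the parts I didn't want to write are a couple of import statements, and the analysis I did want to write starts from clean, normalized data.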

That said, some things I'm very interested in developing myself. Since I was doing this project as an excuse to learn a natural-language processing algorithm, which was interesting to me, I wanted to write that part of the program myself. I was (and am) also fascinated with a problem I think I'll have in storing the results such that I can quickly retrieve them from a database. I haven't solved that problem as of this writing, but I don't necessarily want to use other people's code for that. I have read some code and examples in textbooks and online that will help me with the former, but the storage problem is mine, for now.

This gives you an idea where the line was for me in this particular project, but others have the same reluctance for other, subtler reasons.

A particularly difficult codebase

What makes software difficult to add to your code? Sometimes the code is simply in the wrong language. Maybe you are using Perl and want to tie some code into a C or Python module. That's not always so easy. Maybe the code was really developed on only one platform—say, an Intel machine—and you want it to work on your iBook, which runs on a PowerPC processor.

The problems with using other people's code can be legion. Maybe their routines were implemented assuming a machine with a lot of memory or processor cache, making it perform poorly or, worse, unpredictably,[3] on your target platform. Maybe the software was developed for an earlier version of your programming language, so a lot of features you would have implemented with a standard library call are instead implemented from scratch, thus reducing future maintainability.

Problems arise with canned libraries as they get older. For instance, the aforementioned feed parser library is useful because its author, Mark Pilgrim, is very good at keeping it up to date with the 13 "standards" that lie behind that "xml" button on your favorite blog or web site. If the library were to fall into disuse, or Mark were to stop working on it and no one else picked up the work, I'd likely change to a different library or choose to maintain it myself.

There is another reason to not use someone else's code, and it will look amazingly petty to all but the programmers reading this.

Technically speaking, this:

void myfunction(int a)
{
    printf("My Function %d\n",a);
}

is the same as this:

void myfunction(int a) {
    printf("My Function %d\n",a);
}

which is the same as this:

void myfunction(int a){    printf("My Function %d\n",a);}

and this:

void myfunction(int a){
  printf("My Function %d\n",a);
}

They compile to the same result on any given compiler.

I could go on, but I won't. The point is that, depending on the programmer or dictated company style, each of these is wrong, evil, bad, or awful, or perhaps one is acceptable. Not all programmers and companies care about style, but many (one might argue the smartest) do. The ones that do care actively dislike the ones that don't and do not want to use their code. Should they have to touch the offending library, they will inevitably have to make it "readable." Whether you call this refactoring or prettifying or whatever, it can drive a programmer away from a hunk of code, unless it really brings something fantastic along with it.

"My Goodness," you might consider asking, "are programmers delicate, petty creatures?" No, there are some very good reasons to have consistent code style. It aids in debugging. Some say it reduces bugs (I'd agree). It makes code navigation much faster and makes it easier for people to write tools to generate and manipulate code than they might otherwise. There are other reasons too, but I don't want to get too arcane. Some languages, such as Python, have very rigid appearance rules, as appearance can dictate how a variable can be used. Style may appear to be a trivial concern, but it isn't.

Comfort

Maybe you just want to do it yourself. Businesspeople in the industry who have grown up around open source often comment that duplication of effort, or "reinventing the wheel," is not time well spent. I rarely hear this from programmers. When people hear about KDE and GNOME, or Linux and BSD, or even more esoteric arguments about which window manager to use, inevitably someone will chime in, "Obviously, they had a lot of time on their hands. Otherwise, why would they have started from scratch?"

The implication is that the programmers have somehow wasted time. When I choose to reimplement some technology or program, I know what I'm doing, and even if it is a "waste" of time or duplication of effort, I think of it as practice. And when I can enjoy the luxury of implementing from scratch, I really like the results, because they're all mine and what I've developed works exactly the way I want it to.

But Why So Many of the Same Things?

Business, of course, is interested in productive developers, and productive developers don't rewrite things, right? No, not necessarily. People rewrite code all the time. The more-informed companies recognize that this type of thing is often inevitable, and the best and most resourceful encourage this kind of mental knife sharpening, because it leads to better developers and better code. Given the time, programmers often prefer to learn from other people's code without actually using the code, and if open source ends up as one big repository of example code, I call that a success.

Also, computers change. Computers, languages, compilers, and operating systems change so quickly that a periodic rewrite of some code becomes vital, from a performance perspective. To take advantage of the newest processors, architectures, and other advancements, a recompile will certainly be required and will likely expose issues with your code (architecture changes lead to this directly).[4]

But people are using libraries, code, and examples from open source code, copying them into their codebases rapidly. Certainly this happens. Don't let my counter cases fool you. It is a rare codebase that doesn't involve some open source software, whether it is merely in the form of a standard library or a widget library, or is full of the stuff. This is by design; if every programmer had to write every instruction down to the operating system, or the machine itself, there would be no programs. The iterative building process, programs on top of libraries on top of the operating system, is so productive that I can't imagine someone ignoring it. Even for the smallest embedded systems, designers are using the GNU compilers to create great programs for their devices: compile, flash, and go.

Libraries, System Calls, and Widgets

Here we begin to see how open source ideals have changed proprietary development. When proprietary software developers create a program, they may use free software in the form of original or derived libraries, widget libraries, and tools. This includes developers targeting proprietary operating systems such as Windows and OS X. Developers creating software, whether for OS X, Unix/Linux, or Windows, commonly use free tools to do so. They almost always use free libraries in the creation of their programs and often use free user interface elements during the creation of their systems.

Some might think I'm indulging in some mission creep for free software here, assigning a larger role to it than it maybe should enjoy. I'm not. I'll take it even further: if there hadn't been free tools like the GNU Compiler Collection, the industry would have been forced to create and release them. Otherwise, the computer industry as we know it would not exist and would certainly not be as large as it is right now. This is not to imply that companies somehow owe something to the free software community. However, companies do help out when they can reap a long-term benefit. IBM understands this, as do Novell, Google (my employer), and many others. Even Microsoft uses and releases code under a variety of licenses, including the GPL (its Windows Services for UNIX) and BSD (WiX), but Microsoft is conflicted both internally and externally, so it's not as easy for it to embrace open source.

Am I saying that without free tools, compiler vendors would try to charge a per-program fee? No, I think that if free tools hadn't arrived and commoditized the compiler, other competitive concerns would have kept software development tools accessible and cheap. That said, I think free tools played a big part. Free and open source software changed expectations. Microsoft and Intel make no attempts to prevent developers from using their compilers to create free software or software that is counter to their corporate goals. Client licenses, a common fixture in the email/workflow market, are unheard of for mainstream development tools.

If there is one thing about free software that is downright scary to proprietary development shops, it may be this: software that is licensed per client almost always comes under attack from free software. This is forcing the software industry to shift away from such per-client licenses in all but the most specialized verticals—for instance, the software that runs an MRI machine, or air traffic control software, both of which are so specialized as to not count, because every client is custom. The grand irony here is that in some industries, such a high cost is attached to developing software that some are forming very open source-looking consortiums to solve common software development problems.

Distributed Development

Distributed development is more than just a fad or even a trend. Organizations and companies large and small are using diverse, globally distributed teams to develop their software. The free software development movement showed the world how to develop internationally. Well before SourceForge.net became a site that every programmer had heard of, projects working together over the Internet or far-flung connected corporate networks developed much of the software that we use today.

In fact, the tools they developed to do that are now considered the baseline standard for developers everywhere. What company in its right mind doesn't mandate that its programmers use some form of version control and bug tracking? I ask this rhetorically, but for a long time in the software business, you couldn't make this assumption. Small development shops would back up their data, for sure, but that's not version control.

Distributed development is about more than just version control. It's also about communication, bug tracking, and distribution of the finished software.

Understanding Version Control

Programming is an inherently incremental process. Code, then build, then test. Repeat. Do not fold, spindle, or mutilate.[5] Each step requires the developer to save the program and run it through a compiler or interpreter. After enough of these cycles, the program can do a new thing or an old thing better, and the developer checks the code into a repository, preferably not on his machine. Then the repository can be backed up or saved on a hierarchical storage system. Then, should a developer's workstation crash, the worst case is that the only work lost is that done since the last check-in.

What is actually stored from check-in to check-in is the difference from one version to the next. Consider a 100-line program in which three lines read:

for (i=1; i < 100; i++) {
    printf("Hello World\n");
}

and one line needs to be changed to:

for (i=1; i < 100; i++) {
    printf("Hello to a vast collection of worlds!\n");
}

which would then be checked back in. The system would note that only one line had changed and store only the difference between the two files. This way, we avoid wasting storage on what is mostly the same data.
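As a rough illustration of the idea (not of how CVS or Subversion actually represents deltas internally), Python's standard difflib module can compute the stored difference between the two versions:

import difflib

old = [
    'for (i=1; i < 100; i++) {\n',
    '    printf("Hello World\\n");\n',
    '}\n',
]
new = [
    'for (i=1; i < 100; i++) {\n',
    '    printf("Hello to a vast collection of worlds!\\n");\n',
    '}\n',
]

# Only the changed line (plus a little context) needs to be recorded.
for delta_line in difflib.unified_diff(old, new, fromfile="hello.c (rev 1)", tofile="hello.c (rev 2)"):
    print(delta_line, end="")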

The value of having these iterations can't be overstated. Having a previously known, working (or even broken) copy can help in the event of an editing problem, or when you're trying to track down a bug that simply wasn't there a revision ago. In desperate cases, you can revert to a previous version and start from there. This is like the undo option in your favorite word processor, but one that persists from day to day.

Version control isn't used just in development. I know of IT shops and people who keep entire configuration directories (/etc) in version control to protect against editing typos and to help with the rapid setup of new systems. Some people like to keep their home directory in a version control system for the ultimate in document protection. There is even a wiki project that sits on top of the Subversion version control system.

Additionally, good version control systems allow for branching—say, for a development and a release branch. The most popular version control system among open source projects is CVS.

CVS

CVS, the Concurrent Versions System, allows developers all over the world to work on a local copy of a codebase, going through the familiar "code, build, test" cycles, and check in the differences. CVS is the old standby of version control, much as RCS/SCCS was before it. There are clients for every development environment, and it is a rare professional developer who hasn't been exposed to it.

Since it is easy to use and install and it enjoys wide vendor support, CVS continues to be used all over the world and is the dominant version control platform.

Subversion

Only the rise of Subversion has brought real competition to the free version control space. With a much more advanced data store than CVS and with clients available for all platforms, Subversion (SVN) is also very good at dealing with binary data and branching, both things that CVS isn't very good at. SVN is also very efficient to use remotely, where CVS is not; CVS was designed for local users, and remote use was tacked on later. Additionally, SVN supports a variety of access control methods: because it can be served through the Apache web server, it can use any authentication scheme Apache does, including LDAP, SMB, or anything developers wish to roll for themselves.

What About SourceSafe?

SourceSafe isn't really version control. Local version control, whether CVS or SourceSafe, is just backup, requiring a level of hardware reliability that simply doesn't exist on a desktop. Since SourceSafe is not designed to be used remotely, you take the life of your codebase in your hands when you use it. There are some SourceSafe remoting programs out there if you must use SourceSafe, but I can't recommend them so long as decent, free SVN and CVS plug-ins exist for Visual Studio.

The Special Case of BitKeeper

BitKeeper, written by Larry McVoy, was chosen by Linus Torvalds for version control of the Linux kernel. It was a very good choice, given the kinds of problems that arise with kernel development. Written for distributed development, BitKeeper is very good at managing multiple repositories and multiple incoming patch streams.

Why is this important? With most version control systems, all your repositories are slaves of one master and resolving differences between different slaves and masters can be very difficult.

The only "problem" with the kernel team's use of BitKeeper was that BitKeeper was not a free software program, although it was available for the use of free software developers at no charge. I say was because Larry McVoy recently decided to pull the free version, thus making it impossible for Linux kernel developers to keep using the tool without paying a large fee.[6]

A great number of developers lamented the use of a proprietary tool for free software development, and the movement off BitKeeper, while disruptive, is a welcome change.

BitKeeper is a tool designed with the open source software model in mind. It has found success among large proprietary development houses specifically because the problems that faced the kernel team in 2001 are the same ones that increasingly face proprietary development shops. All of these teams, not just those working on open source development projects, now face multiple, far-flung teams that are engaged in collaborative development and struggle to do it effectively.

Collaborative Development

You have a developer in Tokyo, a team in Bangalore, a team in Zurich, and a shop in Seattle, all working on the same codebase. How can you possibly keep the development train from coming off the rails? Communication!

IRC/IM/Email

One might imagine that only now, with the advent of IM and VoIP, can developers keep up with each other. In fact, developers have stayed in touch in something approximating real time since the early days of Unix, when they began to have a great variety of communications tools to use.[7] Early on, two developers on the same machine used the Unix write or talk programs, which allowed for a simple exchange of text between users. This grew into Internet Relay Chat (IRC) and then instant messaging (IM).

Email itself plays the most important role in development. It is the base packet of persistent knowledge that distributed developer teams have. Wikis are also taking hold as repositories of information.

VoIP

Strangely (to nondevelopers), voice simply hasn't caught on as a terrific tool for ongoing developer communications. While a regular conference call is useful for keeping everyone moving in the same direction, the idea of vocal input while developing would drive many coders away screaming. The phone isn't evil, but maintaining an uninterrupted flow can be very important to developer productivity. Phones also do not create a logfile or other transcript that can be referred to later. Don't take my experiences for gospel here. Read the book Peopleware[8] for more information about this. Everywhere I've ever worked, the one constant has been developers wearing headphones, but listening to music, not other developers yammering in their ears.

SourceForge

The online site SourceForge.net is the largest concentration of open source projects and code on the planet. SourceForge boasts some 100,000 projects and 1 million registered developers, and people use its integrated version control, project web hosting, file release mechanism, bug control, and mailing lists to write a vast amount of software. Pulling together these features on a free platform for open source developers proved to be a revolutionary concept. Before, people were left implementing this themselves with Bugzilla (a bug-tracking mechanism) and CVS or some other version control/bug-tracking facilities.

SourceForge represents, for a lot of people, the next stage in developer environments. VA Software, the company that runs SourceForge through its Open Source Technology Group (OSTG) subsidiary, sells this sort of solution into the enterprise, as does the Brisbane, California-based Collab.net.

Software Distribution

While free software developers know how to code, what about getting the code in front of the user? In the early days of the Free Software Foundation (FSF), the answer was to send out tapes and disks to users who wanted the tools, for a reasonable fee. Now that so many people have connections to the Internet, boxed software is beginning to show its age, but software producers are really just now learning from open source how to distribute software in this way.

Dependencies

When you compile a piece of software, you sometimes end up relying on libraries that you must call from your program to do some task. If you try to run the program without the expected complement of libraries, it cannot run or it may run poorly. Open source developers have created some very smart packaging and installation systems and filesystem methods that can make this a more tractable problem. Once they created these packaging systems and combined them with the Internet, they got online updating. The irony is that, in a lot of ways, Linux and Unix were schooled in this by Windows. A common complaint regarding Linux when comparing it to Windows and OS X is that software can be very difficult to install. One could argue that Windows isn't all that easy to install either, but since Windows is preinstalled on most computers, this is an argument that often falls on deaf ears.
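The problem is easy to see at the language level. Here is a minimal Python sketch of a program that depends on an external library and can do nothing useful without it (the feed URL is a placeholder); resolving such dependencies automatically is exactly what packaging systems exist to do:

# This program needs the third-party feedparser library; without it, it cannot run.
try:
    import feedparser
except ImportError:
    raise SystemExit("This program requires the 'feedparser' library; "
                     "install it with your distribution's package manager or pip.")

feed = feedparser.parse("http://example.org/feed.xml")   # placeholder URL
print(len(feed.entries), "entries fetched")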

I don't think Linux developers have learned to do installation well yet. There are some standouts, but for the most part, installation ease is still a work in progress. One thing free and proprietary software share is an appreciation for, and the development of, online updating systems. This is something Linux distributions get very right. In short, once Linux is installed on your machine, it can be very easy to keep it up to date.

Online Updating/Installation

Online updating is a terrific way of getting software onto your machines. More importantly, it is a terrific way to maintain a secure system over time. Since Linux distributions don't have to worry about software license ownership, it is very easy for the software to determine whether to download a patch or fix, and thus many Linux distributions have systems to facilitate this. Proprietary software development houses such as Microsoft are still trying to figure this out. It is a hard problem when you mix it with licensing concerns. Additionally, when it's done wrong, you can literally crash thousands, or in the case of Microsoft and Apple, millions of machines, so it is really critical to do well. That the Debian and Fedora Core Linux distributions do this at all is quite a feat.
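Stripped of the licensing and trust questions, the core decision an updater keeps making is simple. Here is a toy sketch, not how apt, yum, or any real updater works, with invented package names and versions:

# Toy update check: compare installed versions against a repository catalog.
installed = {"openssl": (0, 9, 7), "apache": (2, 0, 52), "kernel": (2, 6, 11)}
available = {"openssl": (0, 9, 8), "apache": (2, 0, 52), "kernel": (2, 6, 12)}

def needs_update(installed, available):
    """Return the names of packages whose repository version is newer."""
    return [name for name, version in installed.items()
            if available.get(name, version) > version]

for package in needs_update(installed, available):
    print("would download and install", package, available[package])

The hard parts, of course, are everything around this loop: trust, signatures, dependency resolution, and not breaking thousands of machines at once.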

Want a sticky issue? Do you trust your software vendor to allow it to automatically update your software? For some, this question is heard in these ways:

  • Do you trust Microsoft to update your operating system?
  • Do you trust a bunch of bearded Unix programmers to update your system?

How you react to these questions has a lot to do with the realities of how difficult the problem is, how successful previous auto-updates have been in the past, and how trusting you are—which brings up the subject of the next section.

How Proprietary Software Development Has Changed Open Source

Open source isn't magic, and developers aren't magicians. No developer is immune to security problems and bugs creeping into his code.

Bugs/Security

Free, open, proprietary, closed...bugs happen. I think open source means fewer bugs, and people have written tens of thousands of words explaining how they agree or disagree with me. One thing I know I'm right about is that both kinds of code have bugs. But bugs persist longer in closed codebases; it is their closed nature that lets them linger.

If I may paraphrase Socrates, "An unexamined codebase is dead," and by dead I mean killed by the hostile environment that is viruses, worms, crackers, and Trojans. Like bugs, security flaws happen in both free and closed software. As a project matures, it must assemble a mantle of testing and quality assurance (QA) techniques that are vital to its ongoing health. I think open source development has learned much from the processes that proprietary software development houses have come up with to support their paying customers.

Testing and QA

As projects mature, so do the testing suites around them. This is a truism for free and for closed software codebases, but the research around this originated in commercial software/hardware and in academia, and open source software has been a ready consumer of this information. The most popular talk I attended recently was on unit testing for Python at the O'Reilly Open Source Conference. The room was packed, with people sitting in the aisles. Testing is huge and is required for any project, free or not.
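For readers who haven't run into it, here is a minimal example of the kind of test that talk was about, using Python's standard unittest module (the function under test is invented for illustration):

import unittest

def normalize(word):
    """Function under test: lowercase a word and strip trailing punctuation."""
    return word.lower().strip(".,;:!?")

class NormalizeTest(unittest.TestCase):
    def test_lowercases(self):
        self.assertEqual(normalize("Hello"), "hello")

    def test_strips_trailing_punctuation(self):
        self.assertEqual(normalize("worlds!"), "worlds")

    def test_leaves_clean_words_alone(self):
        self.assertEqual(normalize("stemmer"), "stemmer")

if __name__ == "__main__":
    unittest.main()

A suite like this grows alongside the code, and running it after every change is what makes the "code, build, test" cycle trustworthy.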

Project Scaling

Scaling is hard. Whether we're talking about development group size, bandwidth, space, or whatever, scaling any programming project is nontrivial.

Software development has its limits. Product teams can't grow too fast or too large without one of two things happening: either they adopt disintermediating technology, or the project ossifies. Fred Brooks's seminal book, The Mythical Man-Month, covered this in depth, and the existence of F/OSS development methodologies doesn't change that. In fact, the tools and changes free software has brought to prominence are all about disintermediation and disconnected collaboration.

F/OSS isn't magic. It isn't breaking the speed of light. Most projects, with some notable exceptions, are composed of small teams, with one to three people doing the vast majority of the coding. If you care about project size, you would be well served by reading the findings of the Boston Consulting Group's study of open source software developers on SourceForge. This revealing study analyzes project metrics and motivations. For one thing, you see that projects almost always comprise fewer than five active developers. Many projects have only one developer.

So, what am I talking about when I say disintermediating technology? Look at it this way. Imagine that one person decides to create a cake from scratch. He'd have to start with a cake mix and some milk and an egg, right? No! He'd need chocolate, milk, flour, yeast, water, and the other ingredients, right? No! Just for the milk, he'd need a cow, some food and water for the cow, a bench, a milk bottle, a chiller, a pasteurizer, a cap and a rag, some bag balm, and so on, right? Well, you're getting closer. The point is that we accept interfaces all the time, and the successful project finds these interfaces, formalizes them, and spreads the work out along these lines.

We accept power at 120 volts at 60 hertz alternating current. We don't generate the power ourselves. We accept that we don't need to dig for oil, refine it, and pour the refined gas into our cars. We use interfaces with different systems all the time. Programs, too, have interfaces, and the success of a program is in how it manages these interfaces.

Proprietary or not, a successful program is one that interfaces effectively between systems and teams working on these systems. Microsoft doesn't have 5,000 engineers working on Windows. It has them working on the kernel, the printing subsystem, the windowing system, the voice synthesis module, and other components. More importantly, it has groups that work on interfacing between the systems so that they (theoretically) work as a whole. Likewise for the Linux kernel; Linus interacts with a number of captains who control different subsystems, including networking, disk drives, memory, CPU support, and so on. Fractionation, when possible, is key, and when not possible, disastrous—which is why groups working to integrate the whole and making sure the interfaces are appropriate can make all the difference in the success or failure of a project.

This interface management is something that free software has done very well. Many commercial developers would be well served to learn from open source's interface management practices.
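In code, finding and formalizing an interface often just means agreeing on a small contract that the teams on either side can build against independently. A minimal sketch in Python, with invented subsystem and class names:

from abc import ABC, abstractmethod

class FeedStore(ABC):
    """The agreed-upon interface between the parsing team and the storage team."""

    @abstractmethod
    def save(self, entry_id, text):
        """Persist one processed feed entry."""

    @abstractmethod
    def load(self, entry_id):
        """Retrieve a previously saved entry, or None if it is absent."""

class InMemoryFeedStore(FeedStore):
    """One team's simple implementation; a database-backed one can replace it later."""

    def __init__(self):
        self._entries = {}

    def save(self, entry_id, text):
        self._entries[entry_id] = text

    def load(self, entry_id):
        return self._entries.get(entry_id)

# Code written against FeedStore doesn't care which implementation it gets.
store = InMemoryFeedStore()
store.save("post-1", "hello to a vast collection of worlds")
print(store.load("post-1"))

As long as the interface holds, the teams on either side of it can change their internals without coordinating every detail.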

Control

Control is something customers and end users have never had over their code. You don't buy proprietary software, you rent it, and that rental can be rescinded at any time. If you read the end-user license agreements (EULAs) that accompany proprietary software, you may be left with the feeling that you are not trusted and not liked all that much. For instance, in Microsoft Word's EULA, there is this charming note:

You may not copy or post any templates available through Internet-based services on any network computer or broadcast it in any media.

So, if you were to take a standard Microsoft Word template (which all templates are derived from) and make one that is suited to your business as, say, a publisher, you would be in violation of your EULA with Microsoft, and thus vulnerable to its law firm.

Controlling your software destiny is something I consider extremely important. Take, for instance, my employer, Google. We are able to fix and change the Linux kernel to fit our very specific needs. Do we have to check with Linus or one of his lieutenants before, during, or after we change the network stack? No. If we were running NT on our machines, we would be unable to get such changes made, and were we to enter into a deal where Microsoft would incorporate our requested changes, we would in effect be informing a competitor of our development strategy.

Another example is a recent service pack from Microsoft, which featured a firewall and antivirus package. This package, which is turned on by default after service pack installation, was aimed at stopping the viruses and Trojans endemic to the Windows experience. Funnily enough, it considered iTunes a virus and presented a fairly confusing message asking the user to authorize the program's use of the network.[9] That Microsoft's own media player, which has common network access methods, wasn't impeded is telling.

Your computer is not your own; you only borrow that which makes it useful, and when that is taken away, you are left with nothing but a toxic pile of heavy metals and aluminum.

I think this is a subtle but important part of open source's popularity. Many people and companies are interested in controlling their own destiny, and Linux and other open source programs make this possible.

Intellectual Property

Free-software developers believe in intellectual property, probably more so than people who never consider open source software. Developers creating open source have to believe, as the entire structure of the GPL, BSD, MPL, and other licenses depends on the existence of copyright to enforce the clever requirements of those licenses.

When you hear people criticizing free-software developers as guiltless communists or pie-in-the-sky dreamers, it is worth remembering that without copyright, there can't be free software.

Discussions concerning intellectual property and free software usually revolve around two issues: patents and trademarks. Software can be patented, and names and logos can be trademarked. Exactly how these intersect with free software is complex. Can a piece of software that is patented be released under the GPL and still hold to the letter of the license? Can a program whose name is trademarked be released under the BSD license and still be a meaningful release? Legal opinion and precedent thus far provide no definite answer.

Open source developers are learning, though, paying attention to current events around intellectual property and how those events affect them.

The reality of intellectual property is something modern developers are almost required to learn. Learning the laws concerning software is the way to protect themselves from those who might send the feds out to arrest them when they come to the United States. I know that sounds like I'm typing this with tinfoil on my head, but I am not kidding.[10]

The problem with this learning process is that it does take time away from coding, which is not good and is a net loss for free software—which may indeed have been the whole point.

Some Final Words

While open source software is about freedom and licenses, it is nonetheless true that open source costs less, under many circumstances, than proprietary software. This is an important aspect of free software. Additionally, it has to be cost competitive against other free products, just as software that costs money must compete against an open source/free offering.

Free Things Are Still Cheaper Than Expensive Things

When I say "competes against other free products," I'm talking about pirated copies of Windows, Office, SQL Server, Oracle, and many others competing against Linux, OpenOffice, MySQL, Postgres, and other best-of-breed free software applications. These applications are doing very well in environments that have little regard, legally or culturally, for software licenses.

Free things have a velocity all their own, and people forget that. I'll leave you with a little anecdote from when I was working for a large law firm in Washington, DC. I was still in college studying computer science, and I ran the law firm's email network during the day. This was 1996 or so, and TCP/IP was clearly the big winner in the network protocol wars versus NetBIOS and SNA, to a degree that no one could have appreciated. I was in the elevator with one of the intellectual property attorneys at the firm—a fairly technical guy—when he said something like: "You know, if TCP/IP had been properly protected and patented, we could have rigged it so that every packet cost money; they really missed the boat on that one."

Where would the Internet be if this were true? I don't know, but I do know one thing: the Internet would not be running TCP/IP. So, enjoy the freedom of open source software. It is there for you!

Notes

  1. Traditionally, one difference between open source and proprietary development teams has been that open source teams are, in general, geographically quite dispersed. However, in this age of outsourced, offshored, and distributed development, even proprietary development has become highly dispersed geographically.
  2. This will likely inspire many to cite their favorite commercial library. A full survey of libraries, both commercial and open source, would be required to validate this statement properly. This is an educated assumption on my part, as when commercial libraries manage to gain any sort of prominence, open source developers tend to fill the gap, thus overshadowing the commercial project.
  3. This might seem strange, but programmers are OK with the odd performance hit sometimes. Unpredictable results lead to crashed programs, however. This is not good, no matter what you've been told.
  4. For example, you write a program on your handy laptop, you compile it, and it runs great. Later, you run it on your fabulous dual Opteron server. It crashes because you assumed that pointers and long integers were 32 bits, and on the Opteron (running a 64-bit OS like Linux, of course) they are 64 bits. This is a basic error that comes up in a lot of different ways during 32-to-64-bit transitions.
  5. This sentence is famous for being printed on punch cards, an early way of providing computers with data. If they were folded, spindled, or mutilated, they jammed the readers—which makes one speculate what the punch card programmer used for version control. The answer is right there in front of you: as the cards went through revisions, they swapped out cards and retained the old, original cards.
  6. The kernel team is in the process of moving off of BitKeeper as of this writing.
  7. In fact, the Unix "write" command allowed hackers in the 1970s to communicate in a fashion not so different from IM.
  8. Tom DeMarco and Timothy Lister, Peopleware (New York, NY: Dorset House Publishing Company, 1999).
  9. iTunes has a nifty sharing mechanism whereby users stream music to other iTunes users over the network. It's pretty neat.
  10. I wish I were, but I'm not. It happened to Russian developer Dmitry Sklyarov, who intended to discuss his reverse engineering of the Adobe PDF file format at the DefCon conference in Las Vegas. Upon landing at McCarran International Airport, he was met by the FBI, which placed him under arrest under the auspices of the Digital Millennium Copyright Act on behalf of Adobe Systems. As a result, Linux kernel developers no longer have a substantial meeting in the United States, choosing instead to meet in Canada and Australia, two countries that do not have similar laws and rarely extradite for intellectual property-related crimes of this nature. Developers felt this was necessary because the Linux kernel uses code that was reverse engineered. Reverse engineering, by the way, is what made Dell, Phoenix, AMI, AMD, EMC, and a large number of other companies both possible and profitable.