Linux, OpenBSD, Windows Server Comparison:
Scalability
The most common meaning of the word scalability, when discussing
computers, seems to be how many processors in a single machine an
operating system is capable of supporting. It seems to me that if this
is really important, then the UNIX products from Compaq,
Hewlett-Packard, IBM, Sun, and perhaps others would be the primary
contenders, and none of the systems discussed here would be leading
choices. As I have no experience with any of these kinds of systems,
that's speculation.
Another way to define scalability might be the ability to build a large
computer from individual units that can be applied to a single
computing task, in other words a parallel supercomputer. Here we mean a
cluster of machines that work together to solve a common problem.
Recent projects of this type have consisted of hundreds to thousands of
Intel CPUs running Linux. Linux clusters include both high end
commercial offerings with many CPUs housed in a single cabinet and home
grown solutions, including arrays of identical machines and even
mixtures of different processor types and speeds. A Scientific American
article, describing the cluster discussed in the next paragraph, stated
"As of last November, 28 clusters of PCs, workstations or servers were
on the list of the world's 500 fastest computers." It's not clear how
many of these are Linux based.
One such cluster, not in the top 500 and described at
http://stonesoup.esd.ornl.gov/, first became operational in 1997. The
page at the Oak Ridge National Laboratory describes it as "No Cost
Parallel Computing." This cluster of over 130 machines is about 40%
Pentiums, a few Alphas and the rest 486s. The Intel machines all run
Linux and the Alphas run Digital UNIX. All are discarded, obsolete
computers. As faster machines become available, the slowest nodes are
replaced. This demonstrates how Linux can be used to build special
purpose, high performance systems at an exceptionally low cost to
performance ratio.
Since my experience is in small environments, I'm going to define
scalability in a manner that is applicable to such environments: the
ability to move and reconfigure machines as needed to make effective
use of available resources, and to add processing power when and where
it may be needed to perform necessary functions. Another way of looking
at this is to ask how easy it is to install, configure and reconfigure
computers, how easily a configuration can be moved from one computer to
another or be split between multiple systems, and how well each
operating system works with a mix of active applications and servers.
Whether there are any fundamental performance differences between the
different OSs should also be considered. If it consistently takes twice
as much CPU power to perform a similar task on one OS as on another,
the first will require more or faster computers.
System Performance
Regarding system performance, there is no simple or easy way to make a
broad and reliable generalization that one operating system is faster
or slower than another. Comparisons are generally based on benchmarks,
carefully structured sets of tests that are timed to provide relative
performance figures. To the extent that a benchmark performs operations
similar to those performed on a production machine, in a ratio that
closely corresponds to the production workload, it gives some
indication of how the production machine might perform. The more
focused the functions of a specific machine, such as a server running
one or two server applications, the more important it is that the
benchmark measure the functions the machine will actually be performing
most of the time.
Static Web Pages
One "benchmark" that's shown up for a few years, consistently shows
Windows NT and more recently Windows 2000 and IIS beating Linux and
Apache by very large margins serving static web pages. Some have used
this to suggest that Windows is faster than Linux or at least IIS is a
faster web server than Apache. Apache uses, and starts if necessary,
a separate process for each active page request. Starting a process
is resource intensive compared to the relatively simple task of
receiving an http request, reading a disk file and transmitting it to
the requesting client; it's quite likely Windows threaded architecture
is more efficient, possibly enormously more efficient at this specific
task.
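For anyone who wants to sanity check this kind of result on their own
hardware, the ApacheBench tool (ab) that ships with Apache can generate
a simple static page load. The sketch below is only an illustration;
the host name, document and request counts are placeholders, not the
configuration any magazine used.

    # Request one static page 10,000 times, 50 requests at a time,
    # and report requests per second. Host and path are placeholders.
    ab -n 10000 -c 50 http://webserver.example.com/index.html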
The reviews I'm referring to are the web server reviews that PC
Magazine has run for several years. The static page loads described
range from one to eight thousand per second, or roughly 4 to 25 million
per hour. This is up with the busiest sites in the world, except that
such sites rarely serve static pages anymore and surely not off a
single server. The May 2000 review doesn't even describe the Linux
machine(s) configuration. I doubt any configuration changes would have
made the Linux machine faster than Windows 2000, but one has to wonder
how well optimized the Windows machine was and what, if anything, might
have been done to improve the Linux machine's performance.
PC Magazine gives both Windows 2000 and Sun Solaris 8 "Editor's Choice"
acknowledgment, but leading web sites keep picking Apache on various
platforms, heavily Linux and FreeBSD as well as Sun. According to
Security Space, among the top 1000 web sites Apache has 60%, Netscape
Enterprise 15% and IIS 11%. They use an interesting weighting system
(counting links to the sites, much as Google does), but every measure
I've ever seen gives Apache approximately a two to one or better lead
over IIS. The Security Space rankings are very interesting because they
give Apache a much larger relative market share among large web sites
than the broader measures that look at millions of web sites. You
wouldn't expect Apache, or free or open source software generally, to
rank well at the larger sites that presumably could afford commercial
solutions unless they found a clear overall advantage. As PC Magazine
does not even describe the Linux system configuration, one has to
wonder what biases are affecting PC Magazine's choices and rankings.
Red Hat has built a new web server, Tux, for Linux. This is a web
server optimized for speed. June 2001 tests by eWEEK Labs (eWEEK, like
PC Magazine, is a Ziff Davis publication) show Tux to be almost three
times as fast as Apache on Linux and about two and a half times as fast
as IIS on Windows 2000 (Tux - 12,792, Apache - 4,602, IIS - 5,137
transactions per second). This shows the danger of relying on any
"benchmark", especially a single application based benchmark, to
estimate platform performance. Change the application or the task and
you can get totally different results. It's also worth noting that the
"transactions per second" test, which includes a mixture of static and
dynamic content, places Apache only modestly behind IIS, where the
"static page" test shows enormous differences.
Returning to the "poorly" performing Linux Apache web server, it's
worth noting that one single CPU Linux server running Apache was able
to serve approximately 1000 static pages a second, or 3.6 million
static pages an hour. How many web sites in the world serve pages at
this rate? How many do it with one single CPU server? How useful is
this benchmark to anyone? Who has a gigabit (twenty some T3 lines)
connection to the Internet? I mention this performance result not to
create a straw man, but because PC Magazine is well known and for
several years, particularly on this test, Linux and Apache have looked
very weak compared to NT and IIS.
The feature list included GUI management wizards for IIS but listed
none for Apache. Apache does have a GUI configuration tool, Comanche.
Comanche may have been excluded because it's technically not part of
Apache, because the authors didn't know about it, or because some don't
consider it up to IIS standards as a GUI management tool. The feature
list did not include Apache's text configuration file, even though, for
those who know what they are doing, a text configuration file is the
easiest way to access and manipulate a server configuration. For
"Development tool included", an "Optional Visual Studio Enterprise
Edition" is listed for IIS, but Apache, the only system to include full
source code plus the editors and compilers to do anything you need to
the system, is listed as "None". PC Magazine has a bias in favor of
Microsoft, a very strong bias towards systems that are easy for a
novice to set up and use, and a bias against systems that may require
professional skills to use to optimal advantage. Who you listen to
matters.
Other "Benchmarks"
Just before I wrote this, I ran a small Perl benchmark. On identical
machines, according to this test, Linux is about 20% faster than NT and
NT is about 20% faster than OpenBSD. On some other benchmarks I did in
the past, OpenBSD blew away Linux, and on others Linux outperformed
OpenBSD; I couldn't do all tests on all systems because I didn't have
comparable development tools on all of them. Earlier in 2001, I ran the
command line version of SETI@home, a floating point intensive
application, on most of my systems for about two weeks. Every SETI@home
run is different, so comparisons are only approximations. On
essentially identical machines, the OpenBSD version typically ran about
50% faster than the Windows NT version, with Linux roughly 10% slower
than NT. The then current version of the SETI@home software was not
available for OpenBSD, though the commands and output were almost
identical. Did the OpenBSD version do less, or was OpenBSD much faster
on this task?
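For what it's worth, the kind of Perl micro-benchmark mentioned above
can be as simple as timing a CPU bound one liner on each system. The
loop below is only an illustration of the idea, not the actual test I
ran.

    # Run the identical command on each OS and compare the reported times.
    # The arithmetic inside the loop is arbitrary; it just burns CPU.
    time perl -e 'for $i (1 .. 1000000) { $x += sqrt($i) * log($i + 1) }'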
Since Celerons acquired built-in cache, the trade press has often
referred to them as price performance bargains; on the same version of
OpenBSD, I found 533MHz Celerons to run 2.5 to 3 times slower than a
PIII 500 running the identical SETI application. Treat all performance
comparisons with skepticism. The only performance that matters is that
of your production systems, which is not likely to correlate closely
with any standard benchmark.
Hardware Requirements
My first NT machine was a Pentium 133. I started with 32MB RAM; this
was totally unacceptable. Increasing this to 64MB made an almost
useable machine. 128MB seemed to be the right amount of RAM. 96MB may
be a useable minimum, but it's such an odd amount today that for all
practical purposes 128MB should be considered the minimum amount of RAM
for an NT or Windows 2000 machine. The Pentium 133 is pretty sluggish
by today's standards but is still useable (in my opinion); it is the
absolute minimum that should be considered.
A no longer available ZDNet review said that 128MB on a 400MHz P2 is
about the minimum to run Windows 2000 and XP "acceptably". Any version
of Linux or OpenBSD will run in much less memory on a much slower
machine. I believe they can both, with effort, be installed on an 8MB
386, though this may not be true of the 2.4 Linux kernel. From what
I've read they are comfortable on 32MB 486s. If you tried to run the X
Window system on such a machine it would be unacceptably sluggish, but
I don't think a GUI is necessary or desirable on a server. Why does
Windows require so much more hardware than the UNIX like systems?
Obviously much of it is the need to support the GUI, which in the case
of Windows is not an optional component. It also suggests that the GUI
is likely to impose a performance overhead. Without the X Window
components loaded, a Linux or OpenBSD system should have more resources
available for the server applications that are its reason for being. I
know a Red Hat 7.1 Linux workstation typically has about twice as many
processes running with X Windows active as a Linux system without.
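The process count comparison is easy to reproduce; run the same command
once at a text console and once with X started, and compare the counts.

    # Count all processes on the system (works on Linux and the BSDs).
    ps ax | wc -l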
Red Hat 7.1 used as a desktop system with X Windows on a P3 500 with
128MB RAM seems roughly comparable to NT Workstation on a P2 450 with
256MB RAM. Both are fine with the last couple of applications that have
been used, but both can be aggravatingly slow when switching to an
application that has not been used for some time. The NT machine,
despite its much larger memory, can be especially irritating in this
regard, though I tend to have a lot of applications open at the same
time. Increasing memory on both to 384MB seems to noticeably improve
the Linux machine but makes no difference on the Windows NT machine.
Windows, at least NT, uses the memory it has incredibly inefficiently.
When I wrote this, I had 20 open windows; I frequently have more. On a
machine with 384MB of RAM, Performance Monitor shows 330MB available
and 130MB committed, for a total of 460MB. In other words, even though
there should be 254MB of physically free RAM (384 - 130), Windows NT
has paged 76MB to disk (460 - 384). How is it possible to be so stupid
about memory management? This machine has lots of memory to spare
(254MB), yet more than half of the used memory (76 of 130MB) has been
paged to disk, causing long waits when I switch to a window that hasn't
been used for a while.
Increasing the Linux machine to 384MB eliminated such waits. Because I
don't use that machine as my primary machine, I tend to have less open.
I've deliberately opened far more than I ever do in normal use, waited
minutes, hours, or days, and returned to that machine. There are no
waits no matter how long it's been since an application was used. I
assume the NT memory behavior has to do with assumptions about memory
availability, which was typically not more than 16MB when NT was
designed. This should be easy for Microsoft to fix in Windows 2000. I
hope they have, but would be curious to hear from anyone with first
hand experience.
Linux and OpenBSD's low hardware requirements allow machines that would
otherwise be discarded to be used effectively. Any process that is
not intrinsically resource intensive is a candidate. A dedicated
DHCP server could be an example.
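On a Linux system, a quick look at the DHCP daemon's memory footprint
shows why such a service is a good fit for cast off hardware. The
example assumes the ISC dhcpd and the Linux (procps) version of ps.

    # Show resident (RSS) and virtual (VSZ) memory, in kilobytes,
    # for any running dhcpd processes.
    ps -C dhcpd -o pid,rss,vsz,args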
OS Performance Comparisons
My sense, and I have no proof for this, is that if you had comparable
text based server applications on similar machines, they would run
consistently but only moderately faster on either Linux or OpenBSD than
on NT, provided that the X Window system was not loaded. How much
faster, if at all, would depend on exactly what the application did and
would likely vary as other functions were performed. If the X Window
system were loaded, the open source systems would likely lose their
advantage. If a user were actively engaged in management tasks at the
console using X, and a user were doing similar things under Windows,
I'd expect the Windows system to have somewhat of an overall
performance advantage across a broad range of management tasks and
server applications. As always, whether this was true would depend on
exactly what was being done. Since the X Window system is a GUI that
can be dropped on top of any text based UNIX system, while Microsoft
Windows systems are now built as a single system where the GUI isn't an
add on but an integral part, I would find it somewhat surprising if
Windows weren't typically faster on graphical tasks given comparable
hardware. There are advantages as well as disadvantages to tight
integration.
Except for a GUI management interface, GUI server applications are a
contradiction in terms. Still, native Windows applications are almost
invariably developed with a GUI management interface, and only
applications ported from other environments retain text based
management interfaces. It may be misleading to even refer to a "text
management interface", as these server applications have no interactive
mode. They simply get their settings from the command line and/or a
disk file. Typically the disk file is a text file that an administrator
maintains with their preferred editor.
It is difficult to see how, across a broad range of server applications
and assuming generally comparable levels of quality, properly
configured, lean applications lacking the overhead of a GUI should not
generally outperform counterparts that depend on a system that always
includes significant GUI components. The GUI management components
won't actually be part of the executable server applications. Rather,
just as the text editor is the interface to the text configuration
file, the GUI management tools will be the user interface to the
Registry, Active Directory or wherever else Microsoft or the specific
vendor has most recently decided to store configuration settings.
Since the X Window system is not closely integrated with the operating
system, if there is no recent management activity and the OS needs
memory, all the X Window components should eventually be swapped to
disk. With Windows, where the GUI and OS are one and the same, there
are much more likely to be significant GUI components that will never
be swapped to disk.
In the spring of 2001, the Linux 2.4 kernel gained multithreading
capabilities and multiprocessor efficiencies that are likely to greatly
narrow, if not eliminate, two areas where Windows systems have had
performance advantages.
Some might say that in a computing world now dominated by GUI based
operating systems, it's an unfair and unequal comparison to strip these
components from Linux and OpenBSD systems for comparisons with Windows
systems. As a counter, I'd say the only rational approach to
configuring machines that have well defined purposes, such as servers,
is to examine those purposes and tailor the machines as much as
practical to serve them. It's a major strength of Linux and all the
open source BSD systems that the GUI really is not part of the
essential OS and must be added as a specific option in the install
process. It doesn't even need to be removed; it simply never needs to
be installed in the first place. The only purpose of performance
testing is to approximate the actual configurations that will be used
in live environments; if a Linux or OpenBSD server will run in
production without a GUI, that's the only reasonable way to performance
test it. And if you performance test without a GUI, you can't then add
the GUI back for ease of learning or ease of use comparisons.
The question of the relative performance merits of Windows NT or
Windows 2000, Linux and OpenBSD cannot be answered definitively without
carefully controlled experiments. Even if there were a clear answer, a
change in the primary server application or important changes in the
infrastructure environment might yield very different results. Those
who need to deal with true high volume servers need to do a significant
amount of testing to find the best platform and application choices.
Price Performance Ratio
It's an interesting exercise to compare what two different operating
systems can do on the same hardware, but this is not likely to produce
a number many businesses really care about or should care about. A much
more important metric is the price performance ratio. If the operating
systems cost exactly the same, then comparing them on the same hardware
makes good sense. If, however, one operating system is free or very low
cost and the other costs thousands of dollars, it makes sense to
compare two machines where the total installed costs of hardware and
software are the same.
When I last checked, a 5 user license for Windows 2000 cost around
$700, and for a machine that was to be used as a public web server,
Microsoft added an item called an Internet Connector license which cost
just under $3000. When it's almost a given that additional Resource
Kit, backup and/or defragmentation software will need to be added, the
starting software cost for a public Windows 2000 web server is
effectively over $4000. You can buy a nice medium size server, hardware
only, for that; in fact, such a server could likely handle all the
traffic that several T3 lines could carry. When you add the Microsoft
licensing costs and additional necessary software to the price of
hardware, I simply do not see how Linux and OpenBSD systems could fail
to outperform Windows systems on a price performance basis across a
broad range of applications.
I have no basis for making performance distinctions between Linux and
OpenBSD except as regards multiprocessing and clustering. OpenBSD
supports neither technology, so it cannot compete against either Linux
or Windows where such solutions are applicable. OpenBSD is developing
SMP support, but given the years that Windows and Linux have already
had multiprocessor support and OpenBSD's comparatively limited resource
base, it's hard to see how OpenBSD could reasonably hope to compete
effectively in this area.
Scalability As Cost Effective Performance
There is more to scalability, as we are discussing it here, than either
the best performance for a single defined task on a specific hardware
configuration, or performing the most operations at the lowest cost per
operation when both hardware and software costs are included. These are
simply raw performance measures. In a very small environment, even one
high end, single CPU server may have more processing power than is
needed for the required tasks and thus, due to its high cost, may not
be an appropriate solution. All environments need to meet their
performance requirements at reasonable costs.
The smallest environments may start with a single server on which all
services run, or a very small number of machines may provide different
subsets of services such as file and print, e-mail, business management
applications, etc. For a business to survive, over time software will
be replaced and/or added, staff will change or grow, and machines will
age and be replaced. From a systems management perspective, there may
be a significant advantage if all server applications can be kept on or
moved to a single machine that runs on widely available and inexpensive
hardware.
Unless there are essential business management applications that run
only on non-Intel hardware platforms, there can be a strong push to
move to Windows NT, and now 2000, just because of the number of
applications that are available for it. It's worth remembering that
unless special terminal server versions of NT or 2000 are being used,
most of the applications the users work with do not run on the server
but rather on their PCs. Users may load the applications or data from
the server's disk drives, but both Linux and OpenBSD, like nearly all
other UNIXes, also support Windows disk sharing if needed.
Some of the applications or services that really are server
applications and execute on a server are FTP, Telnet, SSH, SMTP, time,
DNS, HTTP, POP3, Portmap, Auth, NNTP, NTP, NetBIOS *, IMAP, SNMP, LDAP,
HTTPS, and Kerberos *; the asterisks indicate multiple affiliated
protocols.
Nearly every business will want NetBIOS, or the Novell, Macintosh or
UNIX counterparts, for disk sharing, but will want to keep these
visible only to local computers and not to the Internet. Very small
businesses may choose to have their ISP provide all Internet related
services and thus run nothing but a file and print server and a host
for business management applications. In such a simple environment, if
the business management applications run on Windows or Novell, it makes
sense to use this platform for file and print sharing as well.
Likewise, if the business management applications run on a UNIX
platform, the options for providing Windows or Novell style file and
print services from the UNIX system should be investigated, as a
satisfactory solution here could significantly reduce system management
overhead.
If the organization is large enough to want to manage its own e-mail
system, have an intranet, or consider hosting a public web site, FTP or
list servers, then it should take the selection of servers very
seriously. Given Microsoft's large market share and the comparative
unfamiliarity of the open source alternatives, Microsoft is often
selected without there ever being an actual selection process. If they
are even thought about, commercial UNIXes are often seen as too
expensive and open source UNIXes as "not supported", which is not
correct and is addressed elsewhere.
Relocating Server Applications
Since we are focusing on scalability here, it's worth looking at
factors like how well a machine supports a mix of server applications,
how easily a specific configuration can be duplicated on a different
machine, and how easily a service can be added to or removed from a
specific configuration. Any server should be able to support a mix of
applications and varying load levels, though to some extent it's not
surprising if a server becomes less stable as the number of active
applications and the loads (CPU, memory, disk and network I/O)
increase.
If a single server has been sufficient and you now need to add another
server, the more you understand about what is going on in the current
server, the easier it will be to select wisely which processes should
be moved to the new server. Windows NT provides two tools. Task Manager
gives an interactive snapshot with a regularly changing display.
Depending on sort order the entire display may jump around, or the
numbers change in place if tasks are shown in alphabetical order; Task
Manager provides no logging. Performance Monitor can provide logging,
but you must "catch" a process actively executing before it can be
tracked. Processes that are started, execute for a second or two and
exit are almost impossible to track (with either tool) even though they
may consume most of the system's resources. I've seen an NT system
afflicted with such a process. We could see it appear in Task Manager,
but it never lasted long enough to learn anything useful.
UNIX's ps command, on the other hand, can capture just about everything
the system knows about a process, including its parent. A script
running ps in a loop, without delays, and saving its output will
generate a huge amount of information. Any process listed, regardless
of how briefly it lasted, could be tracked back via the chain of
parents to the process spawning the resource consuming children. This
is an example of a rare occurrence that cannot be monitored on Windows
but can easily be monitored on UNIX. UNIX keeps records on many factors
related to every process as it executes. ps can track them all and
display or save them at the user's choice. Though it doesn't draw nice
interactive graphs like Windows' Performance Monitor does, it's far
more informative. Systematic use of ps can develop a profile of
whatever is going on in a system.
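A minimal sketch of such a script follows. The log location and the
particular ps columns are illustrative choices, and the BSD version of
ps needs slightly different options (ps -ax -o ...).

    #!/bin/sh
    # Log every process, with its parent, as fast as ps can run.
    # Stop it with Ctrl-C or kill; the log grows very quickly.
    LOG=/var/tmp/ps-trace.log
    while :
    do
        date >> "$LOG"
        ps -e -o pid,ppid,pcpu,pmem,etime,args >> "$LOG"
    done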
Because the hardware specific components of a UNIX system are isolated
and generally well known, an existing UNIX configuration can be used to
create a custom install process, allowing the configuration to be
migrated to a more powerful machine if that is all that is required.
Such a process is documented in detail in the Hardening OpenBSD section
of this web site. The "fullback" script I use on my Linux systems could
be restored over a fresh install on a new machine to migrate a complete
machine configuration. Nothing similar is possible on Windows systems,
as even a minor hardware change can cause a restore to fail. There are
disk duplication systems available for Windows. Generally these are
intended to develop a standard configuration that will be applied to
multiple machines. Presumably the duplicating software knows how and
where to make the necessary adjustments to account for hardware
differences, assuming the duplication can be to non identical machines.
These are also normally intended to be used at the beginning of a
system's life, not late in its life cycle when its Registry has already
grown and begun to impose performance degradation.
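The fullback script itself isn't reproduced on this page, but the idea
can be sketched with GNU tar; the archive location and exclusion list
below are assumptions, and hardware specific files still need review
after the restore.

    #!/bin/sh
    # Sketch only: archive a running Linux system for restore over a
    # fresh install on different hardware.
    tar czpf /mnt/backup/fullback.tar.gz \
        --exclude=/proc --exclude=/tmp --exclude=/mnt \
        --one-file-system /
    # On the new machine, after a minimal install of the same distribution:
    #   cd / && tar xzpf /mnt/backup/fullback.tar.gz
    # then check hardware specific files such as /etc/fstab, the module
    # configuration and the boot loader setup before rebooting.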
Duplicated UNIX machines do not need to be kept as replicas of each
other. After a few minor IP adjustments, services turned off on one
machine could be left executing on the other, or vice versa,
effectively splitting the application server load between the two
machines. The ease with which UNIX machines can be duplicated makes it
relatively straightforward to have several machines performing
essentially the same function. With load balancing software, this could
work quite well for web servers or FTP servers.
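On a Red Hat style system the split can be as simple as changing which
init scripts start at boot on each duplicate; the service names below
are only examples.

    # On the machine that will no longer handle mail:
    chkconfig sendmail off
    /etc/rc.d/init.d/sendmail stop
    # On the machine that will no longer serve web pages:
    chkconfig httpd off
    /etc/rc.d/init.d/httpd stop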
Because UNIX systems are logically designed, highly modular, and
provide the tools to manipulate their parts, they lend themselves well
to being added to and changed by small and medium size organizations as
needs change. The same is not true of Windows. Migrating Windows
functions nearly always means building new machines from scratch. If
applications have been customized over time, it can be very difficult
to replicate their settings on a new machine.