Sunday 8 April 2007

LEARNING LOG TWO - web servers

To begin with, I am not entirely sure what a web server does or what makes a ‘good’ web server, so I will briefly investigate the basics before delving into the comparison of Apache and an alternative web server.

What makes a good Web server?

• One that quickly, reliably and securely transfers data between the client and the server via Hypertext Transfer Protocol (HTTP).
• One that rapidly retrieves and delivers the requested files/scripts to the browser for the client.
• Ultimately a good web server “serves content to web browsers on client machines” (http://www.siteground.com/apache%20servers-hosting.htm) consistently and promptly.
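
To make these points concrete for myself, here is a minimal sketch in Python of a server that takes an HTTP request and sends back the requested page. It is only an illustration using Python's standard library, not how Apache works internally; the `PAGES` table, the `PageHandler` class and the `fetch_once` helper are names I made up for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory "files"; a real web server would read them from disk.
PAGES = {"/index.html": b"<h1>Hello from a tiny web server</h1>"}


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            # The requested file does not exist: answer with error 404.
            self.send_error(404, "File not found")
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_once(path):
    """Start the server on a free local port, request one path, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("GET", path)
        response = conn.getresponse()
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling `fetch_once("/index.html")` plays the part of the browser: it sends a GET request over HTTP and receives the status code and the page content back.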

So from the information I found above I suspect that a logical diagram of a web server and the relationships with its neighbors is as follows:


[Via Appendix]



The “most popular WWW server on the internet” (http://web-hosting.candidinfo.com/server-operating-system.asp) is apparently Apache, which is the first web server I will investigate.

From http://www.iuk.be/ist205/sess6.html I found that Apache is said to be a “powerful [and] flexible” web server that can be configured and customized extensively as the user requires. Its speed is illustrated by Tobias Schlitt (http://schlitt.info/applications/blog/index.php), who found that for a “large” file Apache takes only “0.004” seconds to find and serve the requested content to the client browser. This is very fast; however, Schlitt does not state what a “large” file is, which would be useful to know in order to compare the number of bytes in the file with the load time. Either way, on this evidence Apache serves pages at a rapid rate.

Furthermore, Apache can run on all popular operating systems including Windows XP, Windows 2000/2003, NetWare, OS/2, NT, Linux and the majority of UNIX versions. Although Apache is clearly a multi-platform web server, it “runs best on Linux” (http://web-hosting.candidinfo.com/server-operating-system.asp), where it is faster. Considering this, a user who wants to get the best performance out of this web server should use a Linux O.S. This is an issue in itself, as Windows is more popular than Linux. Despite this, Apache still runs very well on the other operating systems.

Apache is clearly a constantly improving web server: Apache 2.0 originally ran only on “Unix based operating systems… [and]…on windows 2000” (http://www.shop-script.com/glossary.html), but it has since been made far more flexible and is still “actively being developed.” (www.iUK.be)

Apache’s constant improvement and updates are born of its open source nature. This is an advantage not only because it is free but also for users who want to “add functionality.” (http://web-hosting.candidinfo.com/server-operating-system.asp) One can write “modules using Apache module API.” (www.iUK.be) There are two main places where the configuration of the Apache web server can be changed:

• httpd config files
• Per-directory .htaccess files (http://www.garnetchaney.com/htaccess_tips_and_tricks.shtml)

The httpd config files can only be edited by server administrators, whereas .htaccess files can be created by users and placed in “their individual directories” (http://www.garnetchaney.com/htaccess_tips_and_tricks.shtml), where they override the settings in the httpd config files.
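
The idea of a per-directory file overriding the main configuration can be sketched in a few lines of Python. This is only my own simplified model: the setting names, the directory paths and the `effective_settings` function are hypothetical, and Apache's real merge rules are more involved.

```python
# Hypothetical server-wide settings, standing in for the httpd config files.
MAIN_CONFIG = {
    "DirectoryIndex": "index.html",
    "ErrorDocument 404": "/errors/404.html",
}

# Hypothetical per-directory overrides, standing in for users' .htaccess files.
HTACCESS = {
    "/users/alice": {"ErrorDocument 404": "/users/alice/notfound.html"},
}


def effective_settings(directory):
    """Start from the main config, then let the directory's own file override it."""
    settings = dict(MAIN_CONFIG)
    settings.update(HTACCESS.get(directory, {}))
    return settings
```

In this model a directory without its own file simply gets the administrator's settings, while Alice's directory gets her own 404 page and keeps everything else unchanged.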

The above website gave me an insight into why Apache allows users, as well as administrators, to configure it. However, I didn’t know exactly what .htaccess files allowed users to do, so I went on to research this Apache feature further. I found via http://www.freewebmasterhelp.com/tutorials/htaccess/ that .htaccess can provide a number of services; the most common are password protection for “specific files or directories” and a custom page to present when the requested file “is not found (error 404).” (http://www.free-webhosts.com/definition/htaccess.php)

It is clear that Apache is a powerful web server. The .htaccess files, which tell the “server how to behave” (http://www.free-webhosts.com/definition/htaccess.php), explain why users can configure the web server almost exactly how they want it.

There are alternative web servers that compete with Apache. These include Microsoft IIS, Sun, Zeus and thttpd. I have also found a strong contender I had not heard of before called LightTPD, which is also an open source web server. After my investigation I believed that Apache was without doubt the best; however, after researching LightTPD my opinion has largely changed.

LightTPD is known for its good “security, speed, compliance, and flexibility… With a small memory footprint compared to other web-servers, effective management of the cpu-load, and advanced feature set (FastCGI…).” (http://www.lighttpd.net/) Admittedly, this comes from the LightTPD website itself and so could be rather biased; therefore I went on to look for other opinions.

Mark Andrachek (http://webmages.com/archives/2005/03/16/apache-alternative) states that “it’s much faster and lightweight than Apache… [and] also has Fast CGI support…and uses less than 4MB of ram.” This is far less than the “220MB of memory” (http://forums.vpslink.com/archive/index.php/t-1033.html) reported for Apache 2.0. This supports the “small memory footprint” claim made by http://www.lighttpd.net/ itself.

I found an interesting comparison: as stated earlier, Apache can serve a large file in 0.004 seconds; however, Tobias Schlitt (http://schlitt.info/applications/blog/index.php?) also found that LightTPD serves the same large file in “0.001” seconds, a quarter of the time. LightTPD so far seems to be winning the better web server race. Even so, others argue that LightTPD “is too fast” (Durgaprasad http://durgaprasad.wordpress.com/2006/09/28/lighttpd-vs-apache-http-server/b). I think there is just no pleasing some people.

Regardless, Durgaprasad goes on to say that it has an “excellent….ability to spawn fastcgi processes”, which means that if the server comes under “heavy traffic” it can “automatically do the load balancing”. This is consistent with the statement by http://www.lighttpd.net/ that LightTPD is “perfect…for every server that is suffering load problems.”

Like Apache, LightTPD is open source and so can be configured as needed; however, unlike Apache, LightTPD does not support .htaccess files. This may be seen as a problem, but although .htaccess files give Apache flexibility they do in fact “slow apache down” (http://webmages.com/archives). LightTPD does not have that extra obstacle: Apache reads the httpd config files and then the .htaccess files, which tend to override the httpd config files, whereas with LightTPD “everything is entered directly in to the main config file.” (http://webmages.com/archives/) This may be part of the reason LightTPD serves content faster than Apache. On top of this, .htaccess files may be “overwritten very easily” (http://searchnetworking.techtarget.com/sDefinition/0,290660,sid7_gci214573,00.html), which can create issues “for users who once could access a directory's contents, but now cannot.” (same source). The same source also states that .htaccess files can be “retrieved by unauthorized users”, which points to a potential weakness in the security of Apache.
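
To see why per-directory files can slow a server down, I wrote a small Python sketch that lists every .htaccess file a server would have to look for before serving one request. It assumes overrides are enabled from the document root downward (an assumption of mine for the example), and the function name and paths are hypothetical.

```python
from pathlib import PurePosixPath


def htaccess_candidates(document_root, request_path):
    """List the .htaccess files to check, from the document root down to the
    directory that holds the requested file."""
    root = PurePosixPath(document_root)
    relative = PurePosixPath(request_path.lstrip("/"))
    candidates = [root / ".htaccess"]
    current = root
    # Walk through each directory between the root and the requested file.
    for part in relative.parent.parts:
        current = current / part
        candidates.append(current / ".htaccess")
    return [str(path) for path in candidates]
```

The deeper the requested file sits, the more files have to be looked for on every request, whereas a server that keeps everything in one main config file has nothing extra to check per request.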

Additionally, where Apache is extended by adding modules, Vincent Delft says that LightTPD “has integrated support for scgi and doesn’t need an additional module” (http://mail.mems-exchange.org/durusmail/quixote-users/5663/). Apache does: a user must use mod_scgi if SCGI is needed for Quixote applications.

LightTPD seems to come out ahead on security, speed and features when compared with Apache. The fact that it powers a number of “popular Web 2.0 sites like You Tube, Wikipedia and meebo” (http://www.lighttpd.net/) supports its claims to power, reliability and capability.

I then went on to look into relational database management systems (RDBMS). I already knew that an RDBMS is system software that stores data across related tables, where primary and foreign keys provide the links between them. I knew that a DBMS manages data as well as accessing, retrieving and securing it and sustaining its integrity. I had only worked with Microsoft Access when creating databases in the past. I had heard of the term MySQL and knew it had something to do with RDBMS but did not know exactly what, so this was where I began my investigation.

To begin with I wanted to find out exactly what SQL stands for: ‘Structured Query Language’, which is said to be “the most popular computer language used to create, retrieve, update and delete…data from” (http://en.wikipedia.org/wiki/SQL) an RDBMS. I went on to find out that MySQL is commonly used by “connecting to a MySQL server, choosing a database, and then using the SQL language to control the database.” (Andy Harris PHP 5/MySQL programming for the absolute beginner p.305) Immediately the association between MySQL and RDBMS became clear to me: it is simply an RDBMS itself, allowing “many different tables to be joined together” (Michael K. Glass Beginning PHP5, Apache, MySQL: web development p.7) like any other RDBMS.
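
To check my understanding of the connect, choose a database, then use SQL pattern, and of tables joined through primary and foreign keys, here is a minimal sketch in Python. It uses SQLite, which ships with Python, purely as a stand-in for MySQL, and the `author` and `post` tables and the function name are invented for the example.

```python
import sqlite3


def authors_with_posts():
    """Create two related tables, insert a few rows, and join them with SQL."""
    conn = sqlite3.connect(":memory:")  # stand-in for connecting to a database server
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        """CREATE TABLE post (
               id INTEGER PRIMARY KEY,
               author_id INTEGER NOT NULL REFERENCES author(id),
               title TEXT NOT NULL)"""
    )
    conn.executemany("INSERT INTO author (id, name) VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
    conn.executemany(
        "INSERT INTO post (author_id, title) VALUES (?, ?)",
        [(1, "Web servers"), (1, "Databases"), (2, "PHP")],
    )
    # The foreign key post.author_id links each post to the primary key author.id.
    rows = conn.execute(
        """SELECT author.name, post.title
           FROM post JOIN author ON author.id = post.author_id
           ORDER BY post.id"""
    ).fetchall()
    conn.close()
    return rows
```

The join brings the author's name and the post's title together in one result row even though they are stored in separate tables.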

As I was advised to investigate servers and then RDBMS, at first I didn’t understand the connection between the two. However, via Beginning PHP5, Apache, MySQL: web development p.7, I found that “MySQL is the database construct that enables PHP and Apache to work together to access and display data in a readable format to a browser.” I realised that when a user requests a web page, the server (e.g. Apache) passes the request to PHP, and PHP then gets the data from an RDBMS (e.g. MySQL). Now I could go on to investigate MySQL in further detail.

MySQL is an open source RDBMS, which means for the user “lower cost & Total Cost of Ownership” (http://www.mysql.com/news-and-events/on-demand-webinars/embedding-oem-2005-09-22.php?gclid=CLSWi6Gb5YoCFQMrlAodGicIwQ) as well as the fact that users can “tailor it to their needs” (http://www.google.co.uk/search?hl=en&q=define%3A+MySQL&meta). MySQL is apparently known for its “cross platform portability…superior performance, scalability and reliability…small footprint…[and]…ease of use.” (http://www.mysql.com/) Unsurprisingly, this glowing statement comes from the MySQL website itself, so I went on to find out where the real weaknesses were, as there were bound to be some.

It was confirmed that MySQL “works on many operating systems and with many languages” (Andy Harris PHP 5/MySQL programming for the absolute beginner p.303) due to its default table format. MySQL runs on systems from Windows to UNIX, although it works best on the latter. Furthermore, one website also confirms the ‘small footprint’ statement, as MySQL databases are “compact on disk and use less memory and CPU cycles.” MySQL can even be used on 64-bit processors due to its use “of 64 integers in the database.” (http://www.tometasoftware.com/MySQL-5-vs-Microsoft-SQL-Server-2005.asp) The praised performance appears to be due to its compactness and efficiency.

With regard to efficiency, MySQL works fast and thoroughly “even with large data sets” (Andy Harris PHP 5/MySQL programming for the absolute beginner p.303), which is why it scales easily to “large, query-heavy databases” (http://www.tometasoftware.com/MySQL) and is said to be built for heavy loads and for dealing with “complex queries” (Michael K. Glass Beginning PHP5, Apache, MySQL: web development p.7). MySQL is so far proving to be the “very powerful program” described by Andy Harris. The MySQL website does not seem to be hiding any faults, as yet.

Moreover, I found that MySQL serves “core functionality…require[ed] at a very low cost.” (http://www.tometasoftware.com/MySQL) Being open source, MySQL is free to download; however, there is a cost when a user wants to avoid the limitations of the GPL license. Yet this is merely $400, which is very little compared with commercial DBMSs, which I will discuss shortly. So far MySQL still seems to be meeting the standards first set.

What is more, MySQL supports replication, which can be done “easily and quickly” (http://www.tometasoftware.com/MySQL) to a number of slave machines. This suggests that even when the server fails the data is kept unharmed. However, as I expected, I finally found that MySQL does have its faults. Although data may be kept intact and MySQL features “password and user verification…for added security” (Michael K. Glass Beginning PHP5, Apache, MySQL: web development p.7), the Tometa comparison reveals that its basic table security support is restricted and it does not have “adequate security for government applications.” This suggests that MySQL is not suitable for government systems, which is why it is better suited to “lower-tier” applications (last two quotes: http://www.tometasoftware.com/MySQL) as opposed to enterprise level ones.

Although data stays intact when the server shuts down normally, with MySQL, if there is an unforeseen power failure “data can be lost and the data store corrupted” (http://www.tometasoftware.com/MySQL). According to this source MySQL’s recovery is poor, which is clearly a concern as its key job is to look after data. Another example of MySQL’s weak data management is that, although it may seem an advantage that data types are flexible when they are entered, “you can [also] enter dates that are not really days, such as February 30…[and]…store dates with missing information” (Julie C. Meloni Sams teach yourself PHP, MYSQL and Apache, p.275). This means there is a high risk of storing invalid data. This is disappointing; MySQL could at least validate simple dates.
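
Because the database may accept a date such as February 30, the application could check dates itself before storing them. This is a minimal sketch of such a check in Python; the function name is my own, and it says nothing about how MySQL handles dates internally.

```python
from datetime import date


def is_real_date(year, month, day):
    """Return True only if the given year, month and day form a real calendar date."""
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True
```

With this check, `is_real_date(2007, 2, 30)` is rejected while `is_real_date(2007, 4, 8)` is accepted, so an impossible date never reaches the database.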

What’s more, this RDBMS may have “a number of utility programs” (Andy Harris PHP 5/MySQL programming for the absolute beginner p.304) and may have implemented some advanced new features (cursor support, stored procedures, triggers and foreign keys); however, these are still waiting to be “stabilized and rationalized” (http://www.tometasoftware.com/MySQL). So although it serves “core functionality” as quoted earlier, this does not mean that the operations are faultless. In fact the features need to be reworked across a number of the MySQL suites including “InnoDB, MyISAM, MaxDB and the new data clusters” (http://www.tometasoftware.com/MySQL).

MySQL started off looking like RDBMS perfection; however, it now proves to have many imperfections, in line with the statement that “MySQL is nowhere near the competitive enterprise field of the more established SQL server” (http://www.tometasoftware.com/MySQL).

I then went on to find an alternative to this RDBMS, one that is not open source and which I therefore expected to have far fewer faults, as money is now in the equation. I found via www.microsoft.com/sql/default.mspx that there is an RDBMS called Microsoft SQL Server (which I will call Micro SQL). Apparently this version of SQL Server (2005) provides a “business intelligence platform” designed for gathering and analysing data to “make better decisions, faster.” In addition, this website suggests that Micro SQL runs “applications faster and more efficiently” as well as providing “secure default setting”. So far this does not seem to offer much more than MySQL.

It is also stated (via http://www.microsoft.com/sql/prodinfo/overview/whats-new-in-sqlserver2005.mspx) that their SQL server has “reduced application downtime, increased scalability and performance, and tight yet flexible security controls”. As before, this flawless picture unsurprisingly comes from Microsoft themselves, so I will look into the hidden truths about this ‘perfect’ RDBMS and compare the two.

To begin with, the most obvious disadvantage, as already mentioned, is that this RDBMS is not free and only offers a “free license for development use only.” (http://www.tometasoftware.com/MySQL) The actual cost of purchasing Micro SQL is a “whopping $1,400”; however, it is said to be worth this high price, as I have yet to find out.

Via http://www.hostmysite.com/support/sql/whatsnew I found that, although Micro SQL is a commercial system, it does allow users via the “SQL Management Tools” to “customize and extend their management environment and [ISVs]” in order to create any further tools and functions they may require. I first thought customization was only available with open source products, but this has set my knowledge straight, and an apparent advantage MySQL had over Micro SQL has in fact disappeared.

Unlike those of MySQL, Micro SQL’s advanced features have been fully implemented and have “long stabilized” (http://www.tometasoftware.com/MySQL). No rework is needed, which explains why it sits in the “high-end of database systems”. (http://www.tometasoftware.com/MySQL)

Not to get over-excited: I then found out that these extensive features mean that the system is more intricate overall and therefore puts more strain on the “memory and hard disk storage…[resulting in] poorer performance” (http://www.tometasoftware.com/MySQL) than MySQL. To benefit from such a complex system, large, powerful and dedicated hardware is needed. This can be off-putting, as it requires even more time and money on top of the purchase of the SQL server. However, one could argue that Micro SQL itself is not the problem, as it is “limited only by hardware and application design.” (SQL Server Technical Article http://download.microsoft.com/download) It is not in itself a limitation.

Micro SQL, much like MySQL, supports data replication. However, it does so in three different ways (“snapshot, transactional and merge”) (http://www.tometasoftware.com/MySQL): the snapshot method caters for static databases, those that hardly ever change; the transactional method caters for those that are constantly changing; and the merge method permits “simultaneous changes”. The above website also explains that if changes clash then a “predefined conflict resolution algorithm” will resolve the issue. Via www.google.com/search I learned that algorithms are used to “analyze data into components.” If there is data that does not match up, the algorithm will analyze it and decide which change to keep. This is a major advantage over MySQL’s single method.
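
To picture what a predefined conflict resolution rule might look like, I sketched one simple possibility in Python: when two sites change the same row, keep the most recent change. This "last writer wins" rule is purely my own illustration and not SQL Server's actual algorithm; the function name and data layout are hypothetical.

```python
def merge_changes(site_a, site_b):
    """Merge two sets of changes, each mapping row id -> (timestamp, value).

    When both sites changed the same row, the change with the later
    timestamp wins; on a tie the change from site_a is kept.
    """
    merged = dict(site_a)
    for row_id, change in site_b.items():
        if row_id not in merged or change[0] > merged[row_id][0]:
            merged[row_id] = change
    return merged
```

Rows changed at only one site are simply carried over, and only the clashing rows need the rule to decide between them.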

The above also suggests that there is a lower risk of inconsistent data than with MySQL. In addition, www.tometasoftware.com states that this SQL server provides “security at the column level.” The fact that MySQL only does so at the ‘table level’ shows that it is the less fine-grained, and in that respect less secure, system. This helps explain why Micro SQL is credited with better data security, swifter recovery and less potential for data corruption.
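
The difference between table-level and column-level security can be shown with a small Python sketch. The grants, the user and the table below are invented for illustration and do not reflect either product's real permission system.

```python
# Hypothetical table-level grants: (user, table)
TABLE_GRANTS = {("clerk", "employee")}

# Hypothetical column-level grants: (user, table, column)
COLUMN_GRANTS = {
    ("clerk", "employee", "name"),
    ("clerk", "employee", "department"),
}


def can_read_table_level(user, table, column):
    """With table-level security, access to the table means access to every column."""
    return (user, table) in TABLE_GRANTS


def can_read_column_level(user, table, column):
    """With column-level security, each column must be granted individually."""
    return (user, table, column) in COLUMN_GRANTS
```

Under the table-level rule the clerk can read the `salary` column as soon as the table is granted, whereas under the column-level rule the clerk sees only `name` and `department`.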

On the subject of security, Micro SQL is C-2 compliant, which is sufficient security for government applications, meaning that it can be used in government systems and not just small businesses. In fact, it is unlikely that small companies will purchase this system due to its heavy price tag when free systems such as MySQL are available. This explains why Micro SQL is used for “large enterprise databases” (http://www.aspfree.com/c/b/MS-SQL-Server/) as opposed to the small to medium scale databases that suit MySQL.

OK, so MySQL is said to provide ‘ease of use’, as stated at the beginning by its own website; however, Micro SQL can be integrated with “Microsoft Visual Basic .NET, and Microsoft Visual C# .NET” (http://www.hostmysite.com/support/sql/whatsnew), meaning that code can be written in those languages without the user having to know complex SQL. The integration of the .NET Framework is also said to deliver “security, scalability, and availability for your enterprise data and analytical applications” (http://www.re-invent.com/sqldatabasehosting/sql2005.aspx). Nevertheless, one must be trained in the “elaborate mechanisms” of this SQL server in order to replicate and transfer dynamic data.

Overall, it is clear that both RDBMSs have their pros and cons. MySQL’s cons are often excused because of its unbeatable price and availability, and Micro SQL’s heavy price tag is excused because of its better features, security, recovery and more intelligent platform. However, it must be admitted that overall Microsoft SQL Server is the “more secure, reliable, and productive platform for enterprise data and BI applications” (http://www.vmware.com/vmtn/appliances/directory/node/651) than the free MySQL. If a large business requires a robust, reliable and intelligent RDBMS then Microsoft SQL Server “wins hands down.” (www.tometasoftware.com)



REFERENCES:


Websites:

What are some popular Web server programs?
http://web-hosting.candidinfo.com/server-operating-system.asp
Author: Unknown
[Date accessed: 07/03/07]


Apache
http://www.iuk.be/ist205/sess6.html
Author: Dan
[Date accessed: 07/03/07]


Apache vs. Lighttpd: "echo" performance
http://schlitt.info/applications/blog/index.php?/archives/504-Apache-vs.-Lighttpd-echo-performance.html
Author: Tobias Schlitt, Saturday, October 28. 2006
[Date accessed: 08/03/07]

Glossary
http://www.shop-script.com/glossary.html
Author: Unknown
[Date accessed: 08/03/07]

HTACCESS files useful tips and tricks
http://www.garnetchaney.com/htaccess_tips_and_tricks.shtml
Author: Garnet R. Chaney
[Date accessed: 11/03/07]

.htaccess Tutorial
http://www.freewebmasterhelp.com/tutorials/htaccess/
Author: David Gowans
[Date accessed: 11/03/07]

Lighttpd
http://www.lighttpd.net
[Date accessed: 16/03/07]


Apache Alternative
http://webmages.com/archives/2005/03/16/apache-alternative
Author: Mark Andrachek, March 16th, 2005
[Date accessed: 16/03/07]

Apache RAM use
http://forums.vpslink.com/archive/index.php/t-1033.html
Author: Flamesrock
[Date accessed: 16/03/07]

http://durgaprasad.wordpress.com/2006/09/28/lighttpd-vs-apache-http-server/b
Author: Durgaprasad, September 28th, 2006
[Date accessed: 17/03/07]

htaccess
http://searchnetworking.techtarget.com/sDefinition/0,290660,sid7_gci214573,00.html
Author: Unknown
[Date accessed: 17/03/07]

lighttpd vs apache
http://mail.mems-exchange.org/durusmail/quixote-users/5663/
Author: Vincent Delft
[Date accessed: 21/03/07]

http://en.wikipedia.org/wiki/SQL
Author: Unknown
[Date accessed: 21/03/07]


For OEMs Only: Embedding MySQL
http://www.mysql.com/news-and-events/on-demand-webinars/embedding-oem-2005-09-22.php?gclid=CLSWi6Gb5YoCFQMrlAodGicIwQ
Author: Unknown
[Date accessed: 22/03/07]


http://www.google.co.uk/search?hl=en&q=define%3A+MySQL&meta
Author: Unknown
[Date accessed: 07/03/07]

MySQL 5.0 vs. Microsoft SQL Server 2005
http://www.tometasoftware.com/MySQL-5-vs-Microsoft-SQL-Server-2005.asp
Author: Ebryant
[Date accessed: 18/03/07]


Data Management and Analysis Solution for Your Enterprise
www.microsoft.com/sql/default.mspx
Author: Unknown
[Date accessed: 21/03/07]

What's New in SQL Server 2005
http://www.microsoft.com/sql/prodinfo/overview/whats-new-in-sqlserver2005.mspx
Author: Unknown
[Date accessed: 21/03/07]


http://www.hostmysite.com/support/sql/whatsnew
Author: Unknown
[Date accessed: 20/03/07]


SQL Server Technical Article
http://download.microsoft.com/download
Author: Unknown
[Date accessed: 20/03/07]


http://www.aspfree.com/c/b/MS-SQL-Server/
Author: McGraw-Hill/Osborne
[Date accessed: 19/03/07]

Microsoft SQL server 2005
http://www.re-invent.com/sqldatabasehosting/sql2005.aspx
Author: Unknown
[Date accessed: 19/03/07]


Microsoft SQL Server 2005 Enterprise Edition Virtual Appliance
http://www.vmware.com/vmtn/appliances/directory/node/651
Author: Unknown
[Date accessed: 23/03/07]


Books:

PHP 5/MySQL programming for the absolute beginner
Author: Harris. A
Publisher: Boston, MA : Premier Press, c2004.

Sams teach yourself PHP, MYSQL and Apache [electronic resource]: all in one
Author: Meloni, Julie C
Publisher: Indianapolis, Ind : Sams, c2004.

Beginning PHP5, Apache, MySQL: web development
Authors: Michael K. Glass, Yann Le Scouarnec, Elizabeth Naramore, and Jason Gerner
Publisher: Wiley Publishing, Inc, Indianapolis, Indiana, c2005.

1 comment:

Tim said...

It looks like you've provided many good MySQL references. I thought that this link to my writeup of how to quickly set up AutoMySQL Backup might also help you:
http://timarcher.com/?q=node/11

It's a really quick and easy way to back up your MySQL databases if you run on a Unix platform (mine runs on RedHat AS3)