Web Technology
Virtually from the beginning, web browsers had
a simple way to go back to documents you had previously seen.
Thus after your side trip to Japan, you could go back twice and
be right back where you had been when you left the New York
document.
A web browser is the client-side program that connects to a web
server, which makes documents available. The protocol they use is
the Hyper Text Transfer Protocol, or HTTP. Because of the jumping
around between documents that was envisioned for the web, HTTP is
a stateless protocol. All the other protocols mentioned so far
have been session based. In these, a client makes a connection to
the server and the connection is held open until the user or
originating computer takes an action to end the session. If the
user is inactive, the server drops the connection after a
predetermined timeout period.
HTTP is fundamentally different. A browser
makes a connection to a server, requests a document and as soon
as the document is sent, the server closes the connection and
retains no knowledge of that connection. If the same browser
subsequently requests another document from the same server, the
server sees it as an entirely new connection, no different from a
request from a brand new browser. While this seemed logical
when HTTP was first developed, it has had significant
implications for the design of applications as the web has moved
from the delivery of simple text documents to e-commerce
applications.
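The connect, request, receive, disconnect cycle described above can
be sketched in a few lines of Python. This is only a minimal
illustration, assuming a plain HTTP/1.0 style exchange on port 80;
the function names build_request and fetch_once are invented for
this example. Note that nothing is remembered between calls, so
every request has to carry everything the server needs.

```python
import socket


def build_request(host: str, path: str) -> bytes:
    """Build a minimal HTTP/1.0 GET request for a single document."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line marks the end of the request headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_once(host: str, path: str, port: int = 80) -> bytes:
    """Open a connection, request one document, read until the server closes."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_request(host, path))
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # empty read: the server has closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

Calling fetch_once twice for two documents on the same host would
open two unrelated connections. As far as the protocol is
concerned, the second request could just as well have come from a
different browser.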
While web servers are capable of transmitting
any type of document requested by a URL, web browsers are
designed to display only specific types of documents. The
primary document type that a web browser is designed to display
is a Hyper Text Markup Language or HTML document. An HTML
document is an ASCII document that contains a minimum set of HTML
tags that define the essential document structure plus any number
of optional tags that are designed to describe the document's
structure or appearance.
All web browsers also display plain
ASCII text documents. As browser technology has advanced,
browsers have become capable of displaying a growing number
of document types. If a browser does not know how to display a
document type, it offers the user the option to save it to disk.
Browsers also can make use of helper programs called plug-ins
to display document types that the browser does not understand.
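The choice between displaying a document, handing it to a plug-in
and offering to save it can be pictured as a simple lookup. The
sketch below is a simplification: it guesses the document type
from the filename with Python's mimetypes module, whereas real
browsers rely mainly on the content type reported by the server.
The DISPLAYABLE and PLUGINS tables and the plug-in name are
hypothetical.

```python
import mimetypes

# Hypothetical: types this imaginary browser can render by itself.
DISPLAYABLE = {"text/html", "text/plain"}

# Hypothetical: types handed off to an installed plug-in.
PLUGINS = {"application/pdf": "example-pdf-plugin"}


def choose_action(filename: str) -> str:
    """Return "display", "plug-in" or "save" for a document name."""
    doc_type, _encoding = mimetypes.guess_type(filename)
    if doc_type in DISPLAYABLE:
        return "display"
    if doc_type in PLUGINS:
        return "plug-in"
    # Unknown type: fall back to offering to save the document to disk.
    return "save"
```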
HTML is a simple, fixed subset of Standard
Generalized Markup Language (SGML), an extremely complex document
description language. HTML is composed of tags that always start
and end with paired angle brackets, < >. A number of HTML
tags are used in pairs, where the opening tag has the simple < >
form and the closing tag has the form </ >. Between the angle
brackets, or the angle bracket and slash combination, is
text that identifies the specific HTML tag. Each tag in some way
affects the presentation of the document in a browser. Where the
tags are paired, the contents between the tags are affected in a
predefined manner.
In addition to the tag name, which follows the
opening angle bracket, many HTML tags may contain attributes,
typically in the form of an attribute name, an equals sign and
the attribute value. Attributes further refine the presentation
of the document in a manner specific to the tag and attribute.
One of the key tag pairs is the so-called
anchor tag. The most important
attribute of the anchor tag is the HREF attribute, which contains
a URL. The text between the opening and closing anchor tags
typically describes the document pointed to by the URL. The form
is as follows: <a href="URL">descriptive text</a>.
HTML tag and attribute names are not case sensitive, so HREF and
href are equivalent.
When displayed in a browser the descriptive text
would typically be underlined so that users will know that this
is a hypertext link and that clicking on the descriptive text
will cause the referenced document to be retrieved. This document
does not intend to describe the HTML language, except as it
relates to significant web design issues. For a comprehensive
reference on HTTP and HTML, see the HTML Sourcebook. Be
sure to get the latest revision as HTML is being updated and
extended with some regularity.
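To make the anchor tag form concrete, the following sketch pulls
the URL and descriptive text out of anchor tags using the
html.parser module from Python's standard library. The class name
LinkCollector and the function extract_links are invented for this
example. The parser reports tag and attribute names in lower case,
so it also shows that <A HREF=...> and <a href=...> are treated
alike.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (URL, descriptive text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the anchor currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive already converted to lower case.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears between <a ...> and </a>.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html: str) -> list:
    """Return every (URL, descriptive text) pair found in the HTML."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```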
A complete URL consists of four parts. The
first part is the protocol which is HTTP by default but can also
be FTP in any contemporary web browser. Other protocols may be
supported by some browsers. The second part is the host name,
i.e. the name of the web server machine, on which the document is
located. The third part is the virtual path to the document.
The fourth and final part is the actual document filename. An
example URL might be:
http://www.xyz.com/dir1/subdir2/mydoc.htm
In this example the protocol is http and is
separated from the host name by a colon and two slashes. xyz.com
would be the domain name of the company or organization that
owned or ran the web site. Www is the host or machine name on
which the web server software runs. It could be a dedicated
machine that runs only a web server or www might be an alias for
another machine. Though web host names typically begin with www
this is purely conventional and there is no requirement that web
site names begin with www. This was once almost universal but it
is increasingly common to find web sites whose names do not begin
with www.
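The four parts of the example URL can be separated with
urllib.parse from Python's standard library. This is a sketch
only; the function name split_url and the dictionary keys are
chosen to match the terms used above. Splitting the host name at
its first dot into a machine name and a domain is a naive
assumption that fits www.xyz.com but not every host name.

```python
import posixpath
from urllib.parse import urlsplit


def split_url(url: str) -> dict:
    """Break a URL into protocol, host, virtual path and filename."""
    parts = urlsplit(url)
    # URL paths always use forward slashes, so split them POSIX style.
    directory, filename = posixpath.split(parts.path)
    host = parts.hostname or ""
    # Naive assumption: the first label is the machine, the rest the domain.
    machine, _dot, domain = host.partition(".")
    return {
        "protocol": parts.scheme,
        "host": host,
        "machine": machine,
        "domain": domain,
        "path": directory,
        "filename": filename,
    }
```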
Dir1 is the first subdirectory under the
directory defined as the web or document root directory. Technically,
the document root could be the root of a Unix file system or
the root directory of a Windows partition, but for security
reasons it never should be. Subdir2 is
the next subdirectory under dir1 and mydoc.htm is the actual
document filename. On a Unix server the full path might actually be
/home/httpd/html/dir1/subdir2/mydoc.htm and on a Windows server
it might be c:\inetpub\wwwroot\dir1\subdir2\mydoc.htm. Note
that on a Windows server, the URL still contains forward slashes
even though Windows uses backslashes to separate
directories. In the Unix example, html is the web or document
root directory and in the Windows example it's wwwroot.
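The mapping from a URL to a file under the document root can be
sketched as follows. The function name to_file_path is invented
for this example, and the document roots are the ones used above.
Rejecting ".." segments is a simplified stand-in for the checks a
real web server performs to keep requests inside the document
root; it is not a complete defense and it ignores details such as
percent-encoded characters.

```python
from pathlib import PurePosixPath, PureWindowsPath
from urllib.parse import urlsplit


def to_file_path(url: str, document_root: str, windows: bool = False) -> str:
    """Map the path part of a URL onto a file under the document root."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if ".." in segments:
        # Simplified check: never let a request climb above the root.
        raise ValueError("path may not climb above the document root")
    # The URL always uses forward slashes; the file system may not.
    root_type = PureWindowsPath if windows else PurePosixPath
    return str(root_type(document_root).joinpath(*segments))
```

With the example URL, a document root of /home/httpd/html gives
the Unix path shown above, and a document root of
c:\inetpub\wwwroot with windows=True gives the Windows path.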
Copyright © 2000 - 2014 by George Shaffer. This material may be
distributed only subject to the terms and conditions set forth in
https://geodsoft.com/terms.htm
(or https://geodsoft.com/cgi-bin/terms.pl).
These terms are subject to change. Distribution is subject to
the current terms, or at the choice of the distributor, those
in an earlier, digitally signed electronic copy of
https://geodsoft.com/terms.htm (or cgi-bin/terms.pl) from the
time of the distribution. Distribution of substantively modified
versions of GeodSoft content is prohibited without the explicit written
permission of George Shaffer. Distribution of the work or derivatives
of the work, in whole or in part, for commercial purposes is prohibited
unless prior written permission is obtained from George Shaffer.
Distribution in accordance with these terms, for unrestricted and
uncompensated public access, non profit, or internal company use is
allowed.