
(Fwd) File: "MORGAN PRV5N6"



The second article from PACS-R.
Ronald Schmidt



------- Forwarded Message Follows -------
Date:          Tue, 20 Sep 1994 10:29:07 -0500
From:          "L-Soft list server at University of Houston (1.8a)" <LISTSERV@UHUPVM1.UH.EDU>
Subject:       File: "MORGAN PRV5N6"
To:            Ronald Schmidt <SCHMIDT@HBZ.HBZ-NRW.DE>


+ Page 5 +

-----------------------------------------------------------------
Morgan, Eric Lease.  "The World-Wide Web and Mosaic: An Overview
for Librarians."  The Public-Access Computer Systems Review 5,
no. 6 (1994): 5-26.  To retrieve this file, send the following e-
mail message to listserv@uhupvm1.uh.edu: GET MORGAN PRV5N6
F=MAIL.  Or, use the following URL: gopher://info.lib.uh.edu:70/
00/articles/e-journals/uhlibrary/pacsreview/v5/n6/morgan.5n6.
-----------------------------------------------------------------

1.0  Introduction

     The WorldWideWeb (W3) is the universe of network-accessible
     information, an embodiment of human knowledge.  It is an
     initiative started at CERN, now with many participants.  It
     has a body of software, and a set of protocols and
     conventions.  W3 uses hypertext and multimedia techniques to
     make the web easy for anyone to roam, browse, and contribute
     to. [1]

This paper presents an overview of the World-Wide Web (frequently
abbreviated as "W3," "WWW," or the "Web") and related systems and
standards. [2]  First, it introduces Web concepts and tools and
describes how they fit together to form a coherent whole, including the
client/server model of computing, the Uniform Resource Locator
(URL), selected Web client and server programs, the HyperText
Transfer Protocol (HTTP), the HyperText Markup Language (HTML),
selected HTML converters and editors, and Common Gateway
Interface (CGI) scripts.  Second, it discusses strategies for
organizing Web information.  Finally, it advocates the direct
involvement of librarians in the development of Web information
resources.

2.0  Background

In 1989, Tim Berners-Lee of CERN (a particle physics laboratory
in Geneva, Switzerland) began work on the World-Wide Web.  The
Web was initially intended as a way to share information between
members of the high-energy physics community. [3]  By 1991, the
Web had become operational.
     The Web is a hypertext system.  The hypertext concept was
originally described by Vannevar Bush, [4] and the term
"hypertext" was coined by Theodor H. Nelson. [5]  In a hypertext
system, a document presented to a reader has "links" to other
documents that relate to the original document and provide
further information about it.

+ Page 6 +

     Scholarly journal articles represent an excellent
application of this technology.  For example, scholarly articles
usually include multiple footnotes.  With an article in hypertext
form, the reader could select a footnote number in the body of
the article and be "transported" to the appropriate citation in
the notes section.  The citation, in turn, could be linked to the
cited article, and the process could go on indefinitely.  The
reader could also backtrack and follow links back to where he or
she started.
     The HyperText Transfer Protocol (HTTP) that allows Web
servers and clients to communicate is older than the Gopher
protocol.  The original CERN Web server ran under the NeXTStep
operating system, and, since few people owned NeXT computers,
HTTP did not become very popular.  Similarly, the client side of
the HTTP equation included a terminal-based system few people
thought was aesthetically appealing. [6]  All this was happening
just as the Gopher protocol was becoming more popular.  Since
Gopher server and client software was available for many
different computing platforms, the Gopher protocol's popularity
grew while HTTP's languished.
     It wasn't until early 1993 that the Web really started to
become popular.  At that time, Rob McCool and Marc Andreessen,
who worked for the National Center for Supercomputing
Applications (NCSA), wrote both Web client and server
applications.  Since the server application (httpd) was available
for many flavors of UNIX, not just NeXTStep, the server could be
easily used by many sites.  Since the client application (NCSA
Mosaic for the X Window System) supported graphics, WAIS, Gopher,
and FTP access, it was head and shoulders above the original CERN
client in terms of aesthetic appeal as well as functionality.
Later, a more functional terminal-based client (Lynx) was
developed by Lou Montulli, who was then at the University of
Kansas.  Lynx made the Web accessible to the lowest common
denominator devices, VT100-based terminals.  When NCSA later
released Macintosh and Microsoft Windows versions of Mosaic, the
Web became even more popular.  Since then, other Web client and
server applications have been developed, but the real momentum
was created by the developers at NCSA. [7]

3.0  The Client/Server Model

To truly understand how much of the Internet operates, including
the Web, it is important to understand the concept of
client/server computing.  The client/server model is a form of
distributed computing where one program (the client) communicates
with another program (the server) for the purpose of exchanging
information. [8]

+ Page 7 +

     The client's responsibility is usually to:

     o    Handle the user interface.

     o    Translate the user's request into the desired protocol.

     o    Send the request to the server.

     o    Wait for the server's response.

     o    Translate the response into "human-readable" results.

     o    Present the results to the user.

The server's functions include:

     o    Listen for a client's query.

     o    Process that query.

     o    Return the results back to the client.

A typical client/server interaction goes like this:

     1.   The user runs client software to create a query.

     2.   The client connects to the server.

     3.   The client sends the query to the server.

     4.   The server analyzes the query.

     5.   The server computes the results of the query.

     6.   The server sends the results to the client.

     7.   The client presents the results to the user.

     8.   Repeat as necessary.

This client/server interaction is a lot like going to a French
restaurant.  At the restaurant, you (the user) are presented with
a menu of choices by the waiter (the client).  After making your
selections, the waiter takes note of your choices, translates
them into French, and presents them to the French chef (the
server) in the kitchen.  After the chef prepares your meal, the
waiter returns with your dinner (the results).  Hopefully, the
waiter returns with the items you selected, but not always;
sometimes things get "lost in the translation."
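
     The interaction described above can be sketched in a few lines
of Python.  This is only an illustration under stated assumptions:
the function names (serve_once, ask, demo) are hypothetical, the
"protocol" is a toy one in which the server merely upper-cases the
query, and both programs run on the same machine over the loopback
address.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server: listen for one query, process it, return the result."""
    conn, _addr = listener.accept()            # listen for a client
    with conn:
        query = conn.recv(1024).decode("ascii")
        result = query.upper()                 # toy "processing" step
        conn.sendall(result.encode("ascii"))   # return the results


def ask(host: str, port: int, query: str) -> str:
    """Client: connect, send the query, wait for the response."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(query.encode("ascii"))
        conn.shutdown(socket.SHUT_WR)          # nothing more to send
        return conn.recv(1024).decode("ascii")


def demo() -> str:
    """Run one complete client/server exchange on the loopback address."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))            # port 0: any free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    try:
        return ask("127.0.0.1", port, "one menu item, please")
    finally:
        worker.join()
        listener.close()
```

Calling demo() plays the part of the user: the client translates
the request into bytes, the server does the work, and the client
hands the answer back.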

+ Page 8 +

     Flexible user interface development is the most obvious
advantage of client/server computing.  It is possible to create
an interface that is independent of the server hosting the data.
Therefore, the user interface of a client/server application can
be written on a Macintosh and the server can be written on a
mainframe.  Clients could also be written for DOS- or UNIX-based
computers.  This allows information to be stored in a central
server and disseminated to different types of remote computers.
     Since the user interface is the responsibility of the
client, the server has more computing resources to spend on
analyzing queries and disseminating information.  This is another
major advantage of client/server computing; it tends to use the
strengths of divergent computing platforms to create more
powerful applications.  Although its computing and storage
capabilities are dwarfed by those of the mainframe, there is no
reason why a Macintosh could not be used as a server for less
demanding applications.
     In short, client/server computing provides a mechanism for
disparate computers to cooperate on a single computing task.

4.0  Uniform Resource Locator

The Uniform Resource Locator (URL) is a fundamental part of the
Web.  It is utilized to concisely describe and identify both the
protocol used by and the location of Internet resources. [9]
     In general, a URL has the following form:
protocol://host/path/file.  "Protocol" denotes the type of
Internet resource.  The most common are: "gopher," "wais," "ftp,"
"telnet," "http," "file," and "mailto" (electronic mail).  "Host"
denotes the name or IP (Internet Protocol) address of the remote
computer (e.g., 152.1.39.42 or www.lib.ncsu.edu).  "Path" is a
directory or subdirectory on a remote computer.  "File" is the
name of the file you want to access.
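
     As a rough illustration of the general form, the following
Python sketch splits a URL into the four parts named above.  It
relies on the standard urllib.parse module; the function name
"describe" and the dictionary keys are my own labels, not part of
any URL standard.

```python
import posixpath
from urllib.parse import urlsplit


def describe(url: str) -> dict:
    """Split a URL of the form protocol://host/path/file into parts."""
    parts = urlsplit(url)
    # Everything before the last "/" is the path, the rest is the file.
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "path": directory,
        "file": filename,
    }


if __name__ == "__main__":
    print(describe("http://www.lib.ncsu.edu/stacks/alawon-index.html"))
```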
     Using variations of this general form, you can use URLs and
Web browsers to access just about any Internet resource.  Here is
an example of a URL for an FTP session:

     ftp://ftp.lib.ncsu.edu/pub/stacks/alawon/alawon-v1n04

This URL results in the following actions: 1. FTP to
ftp.lib.ncsu.edu, 2. log on as anonymous, 3. change the directory
to /pub/stacks/alawon/, and 4. get the file alawon-v1n04.
     Since Web browsers understand and implement the File
Transfer Protocol (FTP), you do not have to remember all the
commands necessary to do FTP.  All you have to remember is how to
create a URL for an FTP session.
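
     To make the four actions concrete, here is a small Python
sketch that derives them from an FTP URL.  It contacts no server;
it only lists the steps a browser would take.  The function name
ftp_steps is hypothetical, and the wording of each step is my own
description rather than the output of any FTP program.

```python
import posixpath
from urllib.parse import urlsplit


def ftp_steps(url: str) -> list:
    """List the actions implied by an FTP URL, without doing them."""
    parts = urlsplit(url)
    if parts.scheme != "ftp":
        raise ValueError("not an FTP URL: " + url)
    directory, filename = posixpath.split(parts.path)
    # With no logon name in the URL, anonymous FTP is assumed.
    user = parts.username or "anonymous"
    return [
        f"connect to {parts.hostname}",
        f"log on as {user}",
        f"change the directory to {directory}/",
        f"get the file {filename}",
    ]


if __name__ == "__main__":
    for step in ftp_steps(
        "ftp://ftp.lib.ncsu.edu/pub/stacks/alawon/alawon-v1n04"
    ):
        print(step)
```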

+ Page 9 +

     Here is an example of a URL for an HTML document:

     http://www.lib.ncsu.edu/stacks/alawon-index.html

This URL opens up an HTTP connection to www.lib.ncsu.edu, changes
the directory to stacks, and retrieves the file
alawon-index.html.  URLs can be more complicated than the general
form illustrated above; they can also provide the means to
present the logon name for Telnet connections, a communications
port, an index/search query, and/or an HTML anchor.  Here is an
example of a URL for a Telnet session:

     telnet://library@library.ncsu.edu:23/

In this example, "library" denotes the logon name and "23"
denotes the communications port.  (Port 23 is the standard Telnet
communications port.)  Thus, a Web browser can initiate a Telnet
session.  This example opens up a Telnet connection to
"library.ncsu.edu," and, depending on the user's browser, the
user may be reminded to log on as "library."  This URL does not
use the "path" or "file" parameters because they are meaningless
for Telnet sessions.
     On the other hand, to manually query the Geographic Name
Server, the URL would be:

     telnet://martini.eecs.umich.edu:3000/

Since the Geographic Name Server requires no password, no
password is specified; however, since the Geographic Name Server
"listens" on port 3000, a nonstandard port number must be
specified.
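
     The logon name and port can be pulled out of a Telnet URL
mechanically.  The Python sketch below is one way to do it, using
the standard urllib.parse module; the function name telnet_target
is hypothetical, and falling back to port 23 when none is given is
my assumption, based on port 23 being the standard Telnet port.

```python
from urllib.parse import urlsplit

TELNET_DEFAULT_PORT = 23  # the standard Telnet port


def telnet_target(url: str) -> tuple:
    """Return (logon name or None, host, port) for a Telnet URL."""
    parts = urlsplit(url)
    if parts.scheme != "telnet":
        raise ValueError("not a Telnet URL: " + url)
    return parts.username, parts.hostname, parts.port or TELNET_DEFAULT_PORT


if __name__ == "__main__":
    print(telnet_target("telnet://library@library.ncsu.edu:23/"))
    print(telnet_target("telnet://martini.eecs.umich.edu:3000/"))
```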
     WAIS searches can be specified using URLs.  Unfortunately,
at the present time, only NCSA Mosaic for the X Window System
directly implements the WAIS protocol.  WAIS URLs have the
following form:

     wais://host:port/database?query

"Port" is assumed to be 210 (the standard WAIS/Z39.50 port),
"database" is the source file to search, "?" delimits the
database from the query, and "query" is your search strategy.
Here is an example of a URL for a WAIS search:

     wais://vega.lib.ncsu.edu/alawon.src?nren
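
The same pieces can be separated programmatically.  The following
Python sketch is an illustration only: parse_wais is a
hypothetical name, and it simply applies the WAIS URL form given
above, supplying port 210 when the URL names no port.

```python
from urllib.parse import urlsplit

WAIS_DEFAULT_PORT = 210  # the standard WAIS/Z39.50 port


def parse_wais(url: str) -> dict:
    """Split wais://host:port/database?query into its components."""
    parts = urlsplit(url)
    if parts.scheme != "wais":
        raise ValueError("not a WAIS URL: " + url)
    return {
        "host": parts.hostname,
        "port": parts.port or WAIS_DEFAULT_PORT,
        "database": parts.path[1:],   # drop the leading "/"
        "query": parts.query,         # text after the "?"
    }


if __name__ == "__main__":
    print(parse_wais("wais://vega.lib.ncsu.edu/alawon.src?nren"))
```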

+ Page 10 +

Gopher servers and files can be specified with URLs as well.
Since Gopher resource specifications require "Type" identifiers
and paths to Gopher resources often include spaces, Gopher URLs
usually deviate from the norm.  Here is an example of a URL for a
Gopher subdirectory:

     gopher://gopher.lib.ncsu.edu/11/library/

Notice the pair of 1's after the Internet name of the computer.
These 1's specify the resource as a directory.  On the other
hand, the following URL specifies a specific text file within
that directory:

     gopher://gopher.lib.ncsu.edu/00/library/about

The "00" denotes a text file.
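
     A short Python sketch shows how a program might read the type
identifier.  It assumes, as the Gopher URL convention has it, that
the first character after the host name is the item type and that
the remainder is the selector passed to the Gopher server (which,
on many servers, begins with the same type character; hence the
doubled digits).  The function name and the two-entry type table
are mine and cover only the types mentioned here.

```python
from urllib.parse import urlsplit

# Only the two item types discussed in the text.
GOPHER_TYPES = {"0": "text file", "1": "directory"}


def gopher_item(url: str) -> tuple:
    """Return (item type description, selector) for a Gopher URL."""
    parts = urlsplit(url)
    if parts.scheme != "gopher":
        raise ValueError("not a Gopher URL: " + url)
    gopher_path = parts.path[1:]       # drop the leading "/"
    item_type = gopher_path[:1]        # first character: the type
    selector = gopher_path[1:]         # the rest goes to the server
    return GOPHER_TYPES.get(item_type, "unknown"), selector


if __name__ == "__main__":
    print(gopher_item("gopher://gopher.lib.ncsu.edu/11/library/"))
    print(gopher_item("gopher://gopher.lib.ncsu.edu/00/library/about"))
```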
     Constructing URLs is more difficult when the path and/or
file names of the Internet resources contain special characters
like spaces or colons.  In these cases, escape codes must be used
to denote the special characters.  For example:

     gopher://gopher.lib.ncsu.edu/0ftp%3amrcnext.cso.uiuc.edu@/
     pub/etext/etext91/aesop11.txt

This long URL first asks a Gopher server (gopher.lib.ncsu.edu) to
FTP a file (aesop11.txt) from an anonymous FTP server
(mrcnext.cso.uiuc.edu).  Notice the "%3a" and "@" in the URL.
They are used to denote a colon (":") and at sign ("@"),
respectively.  Furthermore, notice the zero preceding the "ftp."
This is used to identify the remote file as a text file.
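
     Escape codes need not be worked out by hand.  As a sketch,
Python's standard urllib.parse module can apply and remove them;
the wrapper names below are hypothetical, and leaving "/" and the
at sign unescaped is a choice made only to mirror the example
above.  Note that Python writes the hexadecimal digits in upper
case ("%3A"), which is equivalent to "%3a".

```python
from urllib.parse import quote, unquote


def escape_path(raw: str) -> str:
    """Replace special characters (spaces, colons) with escape codes."""
    return quote(raw, safe="/@")


def unescape_path(escaped: str) -> str:
    """Turn escape codes such as %3a back into plain characters."""
    return unquote(escaped)


if __name__ == "__main__":
    print(escape_path("my file: notes.txt"))
    print(unescape_path(
        "0ftp%3amrcnext.cso.uiuc.edu@/pub/etext/etext91/aesop11.txt"
    ))
```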
     As you can see, Gopher URLs are particularly difficult to
decipher.  The easiest way to construct a URL for a Gopher item
is to access the Gopher server via a Web client, traverse the
Gopher menus until you locate the resource, and then copy the
displayed URL from the appropriate part of your client's screen.
     In summary, URLs unambiguously describe the location of
Internet resources.  Using URLs as a standard, Internet client
programs like Web browsers can interpret URLs and retrieve the
desired information.  URLs describe the protocols and locations
of Internet resources without regard to the particular Internet
client software the user is employing to access them.

+ Page 11 +

5.0  Example Web Client Software

Four examples of Web client software are described here: MacWeb,
NCSA Mosaic for Microsoft Windows, Lynx, and NCSA Mosaic for the
X Window System.  These particular pieces of software are
described because I think they presently represent the best
clients for the most common computing environments (i.e.,
Macintosh, Microsoft Windows, character-terminal-based VMS or
UNIX, and X Window System).
     The real power of these Web clients (usually referred to as
"browsers") is their ability to understand multiple Internet
protocols.  Each of the browsers described understands how to FTP
files, act as Gopher clients, and read and interpret the output
of Web servers.  Additionally, each of these pieces of software
understands "forms," an HTML extension allowing the user to
complete electronic forms similar to Gopher+ ASK blocks.  While
none of these clients can directly understand the Telnet
protocol, each can be configured to load and run Telnet software.

5.1  MacWeb

As the name implies, MacWeb is a Web browser for the Macintosh.
Written at the Microelectronics and Computer Technology
Corporation (MCC), MacWeb is distributed via the Enterprise
Integration Network (EINet). [10]  MacWeb requires System 7 and
at least MacTCP version 2.0.2.  MacTCP is an operating system
extension available from Apple Computer that allows Macintosh
computers to understand the Transmission Control
Protocol/Internet Protocol (TCP/IP) necessary for Internet
communications.  A very important piece of software called
"StuffIt Expander" is strongly recommended when using MacWeb or
NCSA Mosaic for the Macintosh (MacMosaic). [11]  StuffIt Expander
is a utility program used to translate and uncompress files;
compressed files are usually retrieved via FTP archives.
     The advantages of MacWeb are that it is fast, has an elegant
and easily customizable interface, supports the automatic
creation of HTML documents from its hotlists, and indirectly
supports the WAIS protocol by launching MCC's WAIS client,
MacWAIS.
     Its disadvantages are that you cannot select and copy text
directly from the screen and, when the displayed text is saved as
a text file, the displayed text loses all of its formatting.

+ Page 12 +

5.2  NCSA Mosaic for Microsoft Windows

NCSA Mosaic for Microsoft Windows is bound to be one of the more
popular Web browsers since most people have or will have
Microsoft Windows-based computers. [12]  NCSA Mosaic for
Microsoft Windows requires a WINSOCK.DLL.  Like MacTCP, the
WINSOCK.DLL software allows your computer to understand TCP/IP.
Common WinSock packages include LAN WorkPlace for DOS and Trumpet
WinSock.  Additionally, NCSA Mosaic for Microsoft Windows
requires the 32-bit Windows extensions (Win32s).  Win32s runs on
80386, 80486, or Pentium computers.  The Win32s software is
available via anonymous FTP from NCSA.
     One of the nicest features of NCSA Mosaic for Microsoft
Windows is the ability to customize its menu bar.  By editing the
MOSAIC.INI file, you can delete or add menu items to the menu
bar.  Consequently, you can configure the client and have it
display commonly used Internet resources.
     At the present time, you cannot select or copy text from
the screen.  Therefore, if you want to save displayed text, you
must use the application's "Load to Disk" option.

5.3  Lynx

Lynx is a basic Web browser that is intended to be used on DOS
computers or "dumb" terminals running under the UNIX or VMS
operating systems. [13]
     Lynx clients are wonderful when your only Internet
connection is located on a remote computer (i.e., most dial-in
access) or when you need to provide a lowest common denominator
interface (e.g., VT100 terminals).
     Lynx clients don't support image or audio data, but they do
support the "mailto" URL.  Mailto URLs are used for the Simple
Mail Transfer Protocol (SMTP), the Internet mail standard.  When
a Lynx client user selects a mailto URL, the user will be
presented with a "form" to complete and the resulting text from
the form will be delivered via Internet mail to the person or
computer specified in the URL.
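
     What a client does with a mailto URL can be sketched in
Python.  This is not how Lynx itself is written; it is a minimal
illustration using the standard email.message module.  The
function name, the subject line, and the address
reference@library.example.edu are all hypothetical, and the
message is only composed, not sent.

```python
from email.message import EmailMessage
from urllib.parse import unquote, urlsplit


def compose_from_mailto(url: str, subject: str, body: str) -> EmailMessage:
    """Build (but do not send) a message addressed by a mailto URL."""
    parts = urlsplit(url)
    if parts.scheme != "mailto":
        raise ValueError("not a mailto URL: " + url)
    message = EmailMessage()
    message["To"] = unquote(parts.path)   # the address after "mailto:"
    message["Subject"] = subject
    message.set_content(body)             # text the user typed in the form
    return message


if __name__ == "__main__":
    print(compose_from_mailto(
        "mailto:reference@library.example.edu",  # hypothetical address
        "Question",
        "Please renew my loan.",
    ))
```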

+ Page 13 +

5.4  NCSA Mosaic for the X Window System

NCSA Mosaic for the X Window System, coupled with NCSA's Web
server (httpd), really gave the Web the momentum and visibility
it has today. [14]  This full-featured browser supports copy and
paste from the display.  Direct WAIS support is also provided,
and URLs such as wais://wais.lib.ncsu.edu/alawon?nren are valid.
At the present time, just about the only thing it doesn't support
is the mailto URL.
     The disadvantage of NCSA Mosaic for the X Window System is
that it requires a relatively powerful computer.  While a
Macintosh equipped with MacX or a Microsoft Windows computer with
HummingBird Communications' eXceed/W can run X Window terminal
sessions, NCSA Mosaic for the X Window System really requires
direct access to a UNIX or VMS machine running the X Window
System software.

6.0  Example Web Server Software

If you want to become a Web information provider, you need to
utilize Web server software.  This section describes the most
popular Web server software for the most common computing
platforms (i.e., Macintosh, UNIX, VMS, and Microsoft Windows).

6.1  MacHTTP

MacHTTP is a Web server for Macintosh computers. [15]  Written
by Chuck Shotton, MacHTTP is one of the easiest servers to set up
and configure.  In fact, it is so easy it works "straight out of
the box."  MacHTTP requires System 7 to support advanced features
like AppleScript.  MacHTTP runs on Macintosh II-type computers
(e.g., Macintosh IIci, SE/30, LC, Centris, and Quadra computers).
It does not run on low-end Macintoshes based on the Motorola
68000 microprocessor (e.g., Macintosh Plus, SE, and PowerBook 100
computers).  MacHTTP requires MacTCP.

+ Page 14 +

     Because of its simple installation, I recommend the use of
MacHTTP to learn the basics of Web servers.  Since it is so
small, just about anyone can create a server on their desktop
computer and effectively experiment with serving HTML documents.
A Macintosh is not recommended as an institution's primary
server, since the potential user population may be very large.
On the other hand, a group of Macintosh servers that were linked
together via the HTTP protocol to form a single virtual server
could easily distribute the load, with each server supporting a
subset of an institution's HTML documents.
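
     For readers who want to see what "serving an HTML document"
involves before installing any of the servers described here, the
following Python sketch may help.  It is not MacHTTP or any other
product discussed in this paper; it uses Python's standard
http.server module as a stand-in, serves one fixed page, and then
fetches that page from itself over the loopback address.  The
class and function names are hypothetical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = (
    b"<HTML><HEAD><TITLE>My First HTML Document</TITLE></HEAD>"
    b"<BODY>Hello, World!</BODY></HTML>"
)


class OnePageHandler(BaseHTTPRequestHandler):
    """Answer every GET request with the same HTML document."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def serve_and_fetch() -> tuple:
    """Start the server, request "/", and return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), OnePageHandler)  # any free port
    port = server.server_address[1]
    worker = threading.Thread(target=server.serve_forever, daemon=True)
    worker.start()
    try:
        connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        body = response.read()
        connection.close()
        return response.status, body
    finally:
        server.shutdown()
        server.server_close()
```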

6.2  NCSA httpd

Based on the number of postings to comp.infosystems.www
newsgroups, NCSA's httpd seems to be the most popular Web server.
Running under the UNIX operating system, httpd is distributed
both as source code and in binary form for the many "flavors" of
UNIX. [16]  This server is robust and only slightly difficult to
configure.
     If you have a UNIX computer at your disposal and your
server's intended audience is large, then I recommend the use of
NCSA httpd.  I recommend this for several reasons.  First, this
server is widely supported by the Internet community; you can
always find an expert, and it is easier to get help for this
server than for the CERN server.  Second, since it runs under
UNIX, it is intended to coexist with other applications running
on the same computer, like Gopher, WAIS, or a list server.
Finally, many Common Gateway Interface (CGI) scripts are written
in Perl, a programming language most at home on a UNIX computer.
(CGI scripts are described in more detail later.)

6.3  CERN httpd

If you have a VMS computer, you cannot use the NCSA httpd server;
however, there is an appropriate Web server available.  It is a
port of the CERN httpd server by Foteos Macrides of the Worcester
Foundation for Experimental Biology.  Like the servers described
previously, the CERN httpd server for VMS comes in binary form as
well as in source code form. [17]  Configuration is not as easy
as MacHTTP or NCSA httpd for Windows, but it is not any more
difficult than NCSA's httpd server for UNIX.  Presently, the
server does not support the POST method, the preferred method of
transmitting information from forms to CGI scripts, but it works
just the same.  One advantage of VMS is its strong scripting
language, DCL.  DCL works well for CGI scripts.
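
     The difference between the two ways of passing form data to a
CGI script can be sketched as follows.  Under the CGI convention,
a GET request delivers the form data in the QUERY_STRING
environment variable, while a POST request delivers it on standard
input, with its length given in CONTENT_LENGTH.  The Python
function below is an illustration only: its name is hypothetical,
it takes the environment and input stream as arguments so that it
can be tried without a server, and it reads text rather than raw
bytes for simplicity.

```python
import io
from urllib.parse import parse_qs


def read_form(environ: dict, stdin) -> dict:
    """Return form fields sent to a CGI script by GET or POST."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # POST: the data arrives on standard input.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # GET: the data is the part of the URL after the "?".
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)


if __name__ == "__main__":
    get_env = {"REQUEST_METHOD": "GET", "QUERY_STRING": "db=alawon&q=nren"}
    print(read_form(get_env, io.StringIO("")))
    post_env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "6"}
    print(read_form(post_env, io.StringIO("q=nren")))
```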

+ Page 15 +

     If you plan to maintain a server, your intended audience is
large, and you have a VMS computer at your disposal, then I
recommend using this server software.  If you have a UNIX
computer, use the NCSA httpd server instead.

6.4  NCSA httpd for Windows

Robert B. Denny has ported the NCSA httpd server to Microsoft
Windows. [18]  Like MacHTTP, it worked for me "right out of the
box," and it supports all the standard features, such as forms,
CGI scripts, graphics, and access control.
     Its disadvantages are that it is considered slow and it
requires a lot of system resources (memory and CPU power) as well
as a WinSock-compatible TCP/IP driver (just like NCSA Mosaic for
Microsoft Windows).
     This server would make a good platform for PC users to learn
the basics of HTTP and server maintenance.  As with MacHTTP, I
would not recommend this application as the main server of an
institution, such as an academic library.

7.0  Web Servers Versus Gopher Servers

There are several reasons why Web servers should be used instead
of Gopher servers.
     First, in terms of computing resources, Web servers are more
efficient since most of the information processing is distributed
to the client software.  A Gopher client can effectively have
access to FTP and WAIS services, but the Gopher server is doing
all the work.  On the other hand, Web clients (for the most part)
understand these protocols and take the load off the server.
     Second, because Web clients understand HTML, Web servers are
not limited to making their information available via menus.
Thus, more descriptive texts and abstracts can be added to
hypertext links making it easier for the user to evaluate
possible choices.
     Third, Web servers are significantly easier to maintain.
For example, every "study carrel" of the North Carolina State
University Libraries' Web server consists of a single HTML file
created either with a public domain editor or via a report from a
database program.  This is so much easier to maintain and manage
than all the link files and directories of the study carrels in
the Libraries' Gopher server.

+ Page 16 +

8.0  HyperText Markup Language

The HyperText Markup Language (HTML) is used to format documents
delivered by Web servers.  The formal HTML standard can be read
from the CERN server, [19] and a few style guides are available
from the WWW Developer's JumpStation. [20]  HTML is an application
of the Standard Generalized Markup Language (SGML); its strengths
and weaknesses are well documented by Price-Wilkin [21] and Barry.
[22]  Therefore, only a brief overview of HTML will be provided
here.
     HTML files are simple ASCII files containing rudimentary
"tags" describing the format of a document.  Creating an HTML
document is a lot like using the old word processing program
WordStar.  (Remember WordStar?)  For example, to print a word in
boldface type using WordStar, the user would first select text
from the screen.  Then the user would enter a code like "^b."
This code would be inserted before and after the selected text.
When the document was printed, WordStar would interpret the "^b"
and print boldface letters until another "^b" was encountered.
HTML works in a similar fashion.  The author goes through his or
her document surrounding text with special codes denoting format.
Since the Web employs the client/server model, there is little
control over the fonts and styles of formatted text at the client
end.  Therefore, HTML provides logical rather than stylistic
formatting capabilities.
     The basic structure of an HTML document looks like this:

     <HTML>
     <HEAD>
     <TITLE>My First HTML Document</TITLE>
     </HEAD>
     <BODY>
     Hello, World!
     </BODY>
     </HTML>

The <HTML> and </HTML> tags define the document as an HTML
document; the <HEAD> and </HEAD> tags denote the leading matter
of a document; the <TITLE> and </TITLE> tags specify the
document's title; and the <BODY> and </BODY> tags specify the
location of the formatted text.  Notice how the second tag of
each tag pair is identical to the first tag except the second tag
includes a forward slash ("/"); the forward slash denotes the
completion of a logical formatting option.
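
     Because every opening tag in this basic structure needs a
matching closing tag, a mistyped or missing tag is an easy error
to make.  The Python sketch below checks a document for properly
nested tag pairs using the standard html.parser module.  The class
and function names are hypothetical, and the check is deliberately
simple: it suits only documents built entirely from paired tags
such as those above, not tags that stand alone.

```python
from html.parser import HTMLParser


class PairChecker(HTMLParser):
    """Track opening tags and make sure each is closed in order."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # tags opened but not yet closed
        self.errors = []      # closing tags that matched nothing

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.errors.append("unexpected </" + tag + ">")


def is_balanced(document: str) -> bool:
    """True if every opening tag has its closing tag, properly nested."""
    checker = PairChecker()
    checker.feed(document)
    checker.close()
    return not checker.open_tags and not checker.errors


GOOD = """<HTML>
<HEAD>
<TITLE>My First HTML Document</TITLE>
</HEAD>
<BODY>
Hello, World!
</BODY>
</HTML>"""

if __name__ == "__main__":
    print(is_balanced(GOOD))
    print(is_balanced(GOOD.replace("</BODY>\n", "")))
```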

+ Page 17 +

     Within the body of an HTML document there can be many other
formatting constructs.  Examples include the <P> tag for
paragraph marks and the <BR> tag for simple line breaks.  There
are also the ordered list (<OL>) and unordered list (<UL>) tags
that allow the user to create lists of numbered items and
unnumbered items, respectively.  An ordered list results in
formatting something like this:

     1.   apples
     2.   pears
     3.   bananas

An unordered list results in something like this:

     *    red
     *    white
     *    blue
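
The markup behind such lists is repetitive, so it is a natural
candidate for generation by program.  The following Python sketch
produces the tags for either kind of list.  The function name
html_list is hypothetical; the <LI> tag it emits for each entry is
the standard HTML list-item tag, which is not otherwise discussed
in this overview.

```python
import html


def html_list(items, ordered: bool = False) -> str:
    """Return HTML markup for an ordered (OL) or unordered (UL) list."""
    tag = "OL" if ordered else "UL"
    lines = ["<" + tag + ">"]
    for item in items:
        # Escape &, <, and > so item text cannot be mistaken for tags.
        lines.append("<LI>" + html.escape(item) + "</LI>")
    lines.append("</" + tag + ">")
    return "\n".join(lines)


if __name__ == "__main__":
    print(html_list(["apples", "pears", "bananas"], ordered=True))
    print(html_list(["red", "white", "blue"]))
```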

The real utility of HTML is not its ability to format text.
Rather, its real strength lies in its ability to transport a user
from one section of text to another (or to a completely new
document) by clicking on (or selecting) highlighted words.  This
hypertext capability is HTML's greatest asset.
     The hypertext features of HTML are implemented with tags
called "links."  Links are tags containing either an anchor, URL,
or both.  Section headings are usually used as anchors in HTML
documents.  Thus, anchors are used to navigate to another section
of the presently viewed document or, when used in conjunction
with a URL, to navigate to a section of a different document.
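
     To illustrate, the Python sketch below builds the two kinds
of tags involved: one that names a spot in a document (an anchor)
and one that points to an anchor, a URL, or both.  It uses the
standard HTML <A> tag with its NAME and HREF attributes.  The
function names and the anchor name "notes" are hypothetical.

```python
import html


def named_anchor(name: str, text: str) -> str:
    """Mark a spot in a document so that links can point to it."""
    return '<A NAME="' + html.escape(name) + '">' + html.escape(text) + "</A>"


def link(text: str, url: str = "", anchor: str = "") -> str:
    """Build a link to a URL, to an anchor, or to an anchor at a URL."""
    if not url and not anchor:
        raise ValueError("a link needs a URL, an anchor, or both")
    target = url + ("#" + anchor if anchor else "")
    return '<A HREF="' + html.escape(target) + '">' + html.escape(text) + "</A>"


if __name__ == "__main__":
    print(named_anchor("notes", "Notes"))
    print(link("See the notes", anchor="notes"))
    print(link("ALAWON index",
               "http://www.lib.ncsu.edu/stacks/alawon-index.html"))
```

A link with only an anchor moves within the document being viewed;
adding a URL in front of the "#" moves to that section of a
different document.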

9.0  HTML Converters and Editors

Creating HTML documents by hand can be a laborious process; it is
easy to forget all the various tags and formatting rules.
Consequently, there are a growing number of software tools
available to make the HTML document creation process easier.

9.1  Simple HTML Editor (S H E)

Simple HTML Editor (S H E) is an HTML editor in the form of a
HyperCard stack. [23]  It requires a Macintosh and HyperCard
version 2.1 (or HyperCard Player).  Optional editor features
require MacWeb and the AppleScript extensions.

+ Page 18 +

     The creation of a document is a four-step process.  First
you create a new document.  Second, you enter text into the
document.  Third, to enhance your document, you select text from
the screen and choose a markup option from the menu.  Finally,
you save the document.  Specific knowledge of HTML is not
necessary, but it helps.
     Unique features of S H E include Balloon Help, forms
creation, and one-step preview if you have MacWeb.  Like all HTML
editors (with the possible exception of HoTMetaL), S H E is not a
WYSIWYG editor.  In other words, the user is presented with raw
HTML when editing.  Another limitation of S H E is its inability
to create documents longer than 30,000 characters.

9.2  HTML Assistant

HTML Assistant is a Windows-based HTML editor. [24]  It works
like other editors in that you enter text on the screen and make
changes to the text's characteristics by selecting the text and
choosing a markup option.  Like S H E, HTML Assistant is not a
WYSIWYG editor, but it too has the ability to test your work with
a Web browser at the click of a button.

Other features include:

     o    A user-defined toolbox enables you to easily include
          new markup text as more features are added to HTML.

