Credit by Evaluation and Resources for Professional Experience in Information Technology
What follows in these pages is my effort to identify, organize, and describe my prior learning in the broad area of information technology. Between the years 1990 and 2002, I worked as a system administrator at a number of companies.
For the purposes of this paper, I have tried to create a curriculum framework that would allow someone to see all the aspects of my work in, and knowledge of, computer operations: operating systems, software systems, servers, software systems development, and administration. In addition to my own description of what I did in these areas, I have also collected (and linked to) various web sources (such as those found in Wikipedia) and syllabi that I researched in these areas. In this way, my hope was to provide a good view of the entire industry in a syllabus framework, which would also help me to organize and present my Empire State College degree program plan.
| | Credits | Web Resources | Syllabi |
| | 5 | | |
| | 5 | | |
| | | | |
| Total | 15 | | |

| | 2 | | |
| | 2 | | |
| | 1 | | |
| | 1 | | |
| | 2 | | |
| Total | 8 | | |

| | 5 | | |
| | 5 | | |
| Total | 10 | | |

| | 1 | | |
| | 1 | | |
| | 1 | | |
| | 1 | | |
| Total | 4 | | |

| Design | 1 | | |
| Development | 1 | | |
| Total | 2 | | |

| | 4 | | |
| | 1 | | |
| | 4 | | |
| Total | 9 | | |
Credit by Evaluation Topics
Operating System (OS)
SunOS
The SunOS operating system was developed by engineers who had graduated from the
SunOS was an extension of the
When I started working in financial technology, SunOS, running on Sun workstations, provided the platform where most Internet development occurred. Companies such as Goldman Sachs invested billions of dollars a year in contractor fees building trading systems that were based on Internet protocols such as TCP/IP. They, and others, helped propel the Open Systems methods of computer communication by creating sophisticated interconnected operations from the original Berkeley-developed operating system and its sockets networking software.
In the late 1980’s many Unix vendors had embraced the original AT&T Unix system in such a way that it surpassed the operational capabilities of the Sun Microsystems's
Sun Microsystems made the change to the AT&T architecture only after it realized that it needed to enter the big iron market.
Because Sun Microsystems had converted to the AT&T version late, and would therefore go through a period of flux and instability with its new operating system, I decided to migrate my career to a system that had matured with the AT&T version of Unix all along. The system that I chose was Hewlett Packard's Unix, HPUX.
Moving to HPUX was also a lucrative move for me, as it was commonly chosen for well-funded big iron operations, where administrators could expect better salaries. It also presented gratifying challenges. Because it was a corporate-oriented system, almost void of imaginative innovation, my ability to bring cutting-edge applications from the forward-thinking public domain to HPUX made me a valued administrator. Especially important was my adaptation of the free and openly available GNU C language compiler to HPUX for the Merck pharmaceutical company.
Linux
is central to the concepts of freedom in computing; its lead developer has licensed it under the GNU General Public License (GPL), so that Linux code remains free for anyone to use, modify, and redistribute.
In
1992, its
development scheme was presented to the world in such a way that many
talented
software developers joined the Linux effort and dedicated themselves to
its success.
Today it is
technically robust and at least half the world wide web services use
it. Linux’s
success is important in understanding changes in the Information
Society since the operating system was first introduced in the early
1990's. A culture and
community of mutualist programmers and designers now
produce the
highest quality computer products available. They use a
paradigm that goes
beyond traditional economic principles; theirs is a mutualist
effort, where the
majority of the
work is done by volunteers, and a lesser portion is gifted by
corporations.
Today, Linux contributes to science as the operating system of choice for the largest supercomputers, which are actually vast and closely-knit networks of the familiar personal computer. Best known of these huge systems are the Google search engine and the government weather prediction system. A parallel contribution may emerge where the mutualist culture created to build the Linux system (recently termed e-mutualism) will form groups dedicated to the most important of sciences, making medicine.
I was able to accelerate my career by using Linux at home to develop
system
administration applications. Linux
was
naturally applicable as a study project for computer clubs; I
used it as the basic area of study in my mentoring organization, the
Linux Society, where I
gave talented high school students operational hands-on
experience.
Today the Linux Society exists online and members are active
in
developing social
purposes for the Internet.
Software
Systems
Object Oriented
In
the mid-90’s
at the New York Stock Exchange, I had been
given the task of developing a system to take a snapshot of all the
specific
settings of the trading sessions of specialists on the exchange
floor. In
the event of a computer systems outage all these settings could be
applied to
the recovery process so that all the specialists would be brought back
to the
exact same screens they had before the outage.
The amount of detail that was required to achieve this level of recovery accumulated to the point where simple strings of stored data could no longer contain the information. I discovered that the data was grouped into categories, from which I could create subcategories. Using the Perl language I created a system that used backslashes, "\", to show nested categories and subcategories within a single string of characters. The level of nesting (or depth) of the data was determined by the number of backslashes in front of the data.
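The backslash scheme can be illustrated with a minimal sketch. The original was written in Perl and its exact record format is not given here, so the sample record, its field names, and the `parse_nested` function are all assumptions made for illustration; Python is used only to keep the example short.

```python
import re


def parse_nested(record):
    """Rebuild a tree from a string where N leading backslashes mean depth N.

    Each node is a (label, children) pair. The record layout is hypothetical.
    """
    root = ("", [])
    stack = [root]  # stack[i + 1] holds the most recent node seen at depth i
    for slashes, label in re.findall(r"(\\*)([^\\]+)", record):
        depth = len(slashes)
        node = (label, [])
        del stack[depth + 1:]       # drop nodes that are not ancestors
        stack[-1][1].append(node)   # attach to the nearest ancestor
        stack.append(node)
    return root[1]


# Hypothetical snapshot of one specialist's screen settings
record = r"session\window\\title=Orders\\size=80x24\window\\title=Quotes"
tree = parse_nested(record)
```

Here `session` has no leading backslash and so sits at the top, each `window` has one, and the settings inside a window have two. Counting backslashes is enough to recover the whole hierarchy from a single string.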
Shortly thereafter, the Object Oriented (OO) version of the Perl
language
came
along, changing the landscape for me and other systems administrators.
It offered a far better way of containing
information
requiring depth of categories; this concept is called
complex
structures. With this I was able to extend my earlier work
with nested
information.
Complex structures allow data and associated variable names to be stored in highly ordered locations. The Perl OO language provided a simple syntax and gave facilities to retrieve, update, or delete data. The tree-like data stored within a Perl OO program can be printed to the screen and viewed, making debugging far easier. At important points along the computational process, this data can be written to the disk either in readable text format or in more efficient binary format.
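As a rough illustration of a complex structure, the sketch below nests dictionaries and lists, then retrieves, updates, deletes, prints, and serializes the data. It is written in Python rather than Perl, and the field names and values are hypothetical; JSON and pickle stand in for the text and binary formats.

```python
import json
import pickle
import pprint

# Hypothetical snapshot of one specialist's trading session
session = {
    "specialist": "S-017",
    "windows": [
        {"title": "Orders", "position": {"x": 0, "y": 0}},
        {"title": "Quotes", "position": {"x": 640, "y": 0}},
    ],
}

title = session["windows"][0]["title"]         # retrieve
session["windows"][1]["position"]["x"] = 800   # update
del session["windows"][0]["position"]["y"]     # delete

pprint.pprint(session)  # print the whole tree for debugging

as_text = json.dumps(session, indent=2)  # readable text format
as_binary = pickle.dumps(session)        # binary format

# Either form can be written to disk and later restored unchanged
restored_from_text = json.loads(as_text)
restored_from_binary = pickle.loads(as_binary)
```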
Beyond the complex structures, which are data objects, there are the other components of OO programming which use the complex structures to create smallish programs called modules. Modules can be plugged virtually anywhere into other pieces of code. These modules may contain procedural (old fashioned, non-object) code, or modular OO code. Modules can be installed within modules, and again within modules, to a nearly endless degree.
In the initial, procedural, pre-object oriented incarnation of
programming languages, the
module is
called a library. In OO programming a module is much more
alive; it is
not just a repository of code and the modules themselves may very
likely need to take a front
seat
role during the execution of a program. For this to happen,
objects of
code are allowed to communicate with each other by allowing one code
object to
set variables within another. As the code, often an object of
code, loads
modules of different kinds, its abilities and characteristics can
change
depending on the nature of the modules it uses.
These are the components of OO Perl that were most useful in
administering
large systems installations. Because systems management is
more of a
"nuts and bolts" discipline than applied computer science, there is a
natural limit to the complexity of software written to increase systems
integrity. Success often depends on a cognizance of the
abilities of
co-workers. A simplistic approach works best, gleaning from
object
oriented technology only its most useful features;
Perl has been developed with this approach.
Decision
Support in
Information Data Centers
During
the mid-to-late
1990’s, a major buzzword was the term metric, a well-defined
and
descriptive characteristic which is given a value. I saw
metrics used in
economic analysis as well because I was usually working for financial
concerns. In computing, a simple example of a metric can be
the
percentage of use of the basic brain of a computer, the CPU (central
processing unit).
A high
percentage of CPU use would imply delays in the return of information;
a low percentage implies the probability of higher
performance. Many
metrics are available for most computer systems including memory
usage, disk
usage, available space, and operating system configuration
characteristics.
Hewlett Packard made available for its HPUX system a metric sensing
tool that gives amazing detail to the
perception of
performance of a computer system, including the network communication
components. This package included delay metrics: metrics that
indicate the
time taken for a process to occur, while other processes are waiting
for its
completion. These
delays, or waiting
periods, often have a one to one relationship with the experiences of
the people
using the systems for business operations. By creating
generalized
metrics from combinations of these specific metrical measurements, that
is, by creating
meta-metrics, data can be produced that
accurately predicts when the users were feeling frustration from system
delays. Technology
staff using these
metrics can know exactly when these users will make "help desk"
calls. This way, the staff can initiate a timely repair or
enhancement
before the system performance becomes unacceptable; users never have to
suffer
frustration, and the operation remains supportive of the business
effort.
The most useful tool for helping decision making in systems technology
is the
predictive graph. Graphs can have linear regression lines,
fitted curves,
and numerical indicators. A second set of curves and numeric
values,
called correctness indicators, determine the usefulness of the
curves in
predicting systems performance behavior. Metrical information
is
collected from remote computers and devices into a database to be made
available for analysis. There are two ways to process these
collected
metrics. One is to supply the data to an administrator who is
equipped
with a statistical analysis tool. All desktop tools of this
type
have the
ability to
refine the data into accurate predictive curves. Another way
is to
analyze the data at the server where the data has been collected,
and supply information to the administrators in the form of
pre-made
graphs and numerical charts.
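A predictive line and one possible correctness indicator can be sketched as follows. This assumes an ordinary least-squares fit with the coefficient of determination (R squared) as the indicator; the text does not say which indicators were actually used, and the disk-usage samples are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares line through (xs, ys); returns slope, intercept, R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: the share of the variation in ys that the line explains
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical daily samples of disk usage, in percent
days = [1, 2, 3, 4, 5]
disk_used_pct = [52.0, 54.0, 56.0, 58.0, 60.0]

slope, intercept, r_squared = fit_line(days, disk_used_pct)
day_disk_full = (100 - intercept) / slope  # day on which the line reaches 100%
```

An R squared near 1 suggests the line can be trusted for prediction; a value near 0 suggests the trend is too noisy to schedule work against.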
The formulas that create the pre-made graphs often need to be modified as experience improves the techniques for using correctness indicators. Since the Perl language is so widely used in research science, sophisticated statistical formulas are well supported by the Perl e-mutualist community.
When managers seek information to help budget the purchase of new equipment, it is exceedingly unlikely that they will load a statistics package; they are more likely to seek a pre-made picture in their web browser. In cases where higher-ups cannot ignore looming disaster, simplified and highly up-to-date graphic representations are desirable to help them make decisions.
In the technology support scheme, there are always large numbers of
queued
tasks requiring attention and remediation. Knowing exactly
when a
performance problem will affect productivity helps administrative
groups schedule the efforts
at
keeping systems at maximal efficiency. Careful planning
with respect
to prioritization can avert crisis,
especially in
an administration group
suffering from staff shortages.
In many cases, metrical analysis tools are really alert systems that indicate spurious problems. Quickly finding the root cause of the alert, and a solution, is aided by the graphical and numerical information. When problems are well known because they happen often, reactions can be automated through a process common to databases; this is called triggering.
Over the long term, meta studies of metrics can be used to find areas of general weakness in the overall operation.
By far the most common database is the SQL relational model; examples are Oracle and the open-source MySQL. They are built of collections of tables very similar to common spreadsheets. The SQL language queries these tables in many ways to produce or insert specific information values or sets of values. The system protects the tables much as an operating system protects files. Internally it controls users’ access to tables; externally, it controls access from the network with authentication security systems.
The history of the relational database is steeped in extreme math: tuple calculus, domain calculus, first order logic, and relational algebra. While high-level mathematics inspired the first relational databases, the SQL language is deliberately designed to be very simple. It has three basic modes: data manipulation language (DML), data definition language (DDL), and data control language (DCL).
In the DML, there are four "manipulation" commands: Select (find a row), Insert (add a row), Update (modify a row), and Delete (remove a row).
In the DDL, you have: Create to make a database, table, index, or stored code snippet; and Drop to destroy one of those entities.
DCL controls user activity inside the database: Grant gives the ability to access data or create structures, and Revoke takes these privileges away.
As an example, the common command Select is modified with: From (meaning the table); Where (something is true or not true about the data to narrow it down); and Order By (which is a sorting modifier). There are, of course, deeper functionalities to these technical verbs and modifiers.
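The DDL and DML commands described above can be tried with a small sketch. It uses Python's built-in `sqlite3` module with an in-memory database purely for convenience; the `hosts` table and its rows are hypothetical. SQLite has no user accounts, so the DCL commands Grant and Revoke are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# DDL: Create a table and an index
cur.execute("CREATE TABLE hosts (name TEXT PRIMARY KEY, os TEXT, cpu_pct REAL)")
cur.execute("CREATE INDEX hosts_by_os ON hosts (os)")

# DML: Insert, Update, Delete
cur.executemany(
    "INSERT INTO hosts VALUES (?, ?, ?)",
    [("alpha", "HPUX", 35.0), ("bravo", "Linux", 91.5), ("charlie", "Linux", 12.0)],
)
cur.execute("UPDATE hosts SET cpu_pct = 40.0 WHERE name = 'alpha'")
cur.execute("DELETE FROM hosts WHERE name = 'charlie'")

# DML: Select, narrowed with Where and sorted with Order By
rows = cur.execute(
    "SELECT name, cpu_pct FROM hosts WHERE cpu_pct > 30 ORDER BY cpu_pct DESC"
).fetchall()

# DDL: Drop destroys the table (and its index)
cur.execute("DROP TABLE hosts")
```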
Noting the spreadsheet analogy to a relational database, one can instantly see that relational databases are extremely useful in business. Tables are named for well known accounting components; the types of business information that can be inserted into the tables are specified and well-defined. The retrieval of sequential information in relational databases is highly efficient, especially in the production of reports. Somewhat more complicated is the process of inserting, or updating, information in very large relational tables. Database systems create mapping tools called indexes to speed the process; the indexes can be larger than the entire rest of the dataset.
Missing entirely from the relational model is the concept that types of information may be undefined and that desirable data may be distributed across different systems, where the types and locations of the data are unpredictable. My initial work with complex structures grew from a need to discover and store this kind of elusive data. Having only myself to work with (and no budget beyond my salary), I built the techniques I developed from the simplest components: tools available from the Perl language itself.
Had a well-funded corporate team been assigned my data collection tasks, existing products would have been modified and overlays would have been created for relational databases to give them an object oriented appearance. In effect, this is how many OO database products work; they layer object oriented technology over the relational database with extensive mapping. Collecting information from distributed sources probably would also have seen the modification of existing network databases or directory systems.
In developing systems for data collection and analysis, I used all the benefits that Perl had made available with its object oriented version. My Perl programs would collect data, work with it, and, as part of the process, create complex structures. The process of saving data in the OO paradigm is called persistence. The data structures held in memory by the program are serialized into coded data strings and then written to disk in binary or text format. At some point, that data will be collected for analysis.
Alternatively, the same data that has been serialized into a string (just as in creating persistent objects) can be sent across the network to a server process running on a centralized management machine. The server process, technically a dataserver, would take this small complex data structure in serialized format, and place it in a determined place within the much greater body of collected data.
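The hand-off from a collecting program to the dataserver might look roughly like the sketch below. The message layout, the path convention, and both function names are assumptions for illustration, not the original Perl design; the network transport is left out, so the serialized string is simply passed from one function to the other.

```python
import json


def serialize(path, payload):
    """Collector side: pack a small structure and its destination into a string."""
    return json.dumps({"path": path, "payload": payload})


def merge_into(central, message):
    """Dataserver side: place the received structure at its path in the big tree."""
    record = json.loads(message)
    node = central
    for key in record["path"][:-1]:
        node = node.setdefault(key, {})  # create intermediate levels as needed
    node[record["path"][-1]] = record["payload"]
    return central


central = {}  # the much greater body of collected data
merge_into(central, serialize(["hosts", "alpha", "cpu"], {"pct": 35.0}))
merge_into(central, serialize(["hosts", "bravo", "cpu"], {"pct": 91.5}))
```

After both messages arrive, `central` holds one branch per host, and each new message only touches the branch named by its path.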
Entirely using Perl, data was collected and added to the central database representing the entire operation. From that vast complex structure, a central machine was able to create charts and graphs giving historical trends or instantaneous snapshots. Up-to-date information was supplied to the administrators' screens and, in some cases, triggers within the dataserver provided instantaneous solutions to problems. The complex structure on this dataserver was a collection of small persistent object files of about a megabyte in size. The OS’s file systems provided both storage and data-location mapping in their tree-like structure, enhancing the efficiency of this paradigm. Effective operating systems speed access to data files by keeping commonly used data chunks in memory, instead of relegating them only to disk when a file is updated. Memory access is, of course, many times faster than disk access.
Data
mining often refers
to the searching of the huge repositories of sales data that have
accumulated
in corporate database archives over the past few decades.
Sales experts
saw an opportunity to enhance marketing by sifting through these data
heaps
hoping to find patterns in buying trends, often to the granularity of a
specific individual. From
this sifting
they could target people more effectively with advertising.
Similar to data mining is data discovery, which, to me, is much more interesting; discovery seeks to find useful data trends from networks, possibly as large as the whole Internet. In the mid-1990’s data discovery presented an exciting prospect. Of specific interest were large amounts of system data, especially error messages, which were lingering in files distributed across the computers on the network. They indicated problems that were not being reported, and even error information that was being correctly captured by logging systems was not being processed and understood.
By passing these messages through character pattern recognition algorithms, aided by the Perl language, I was suddenly able to reveal troves of previously unknown vital information.
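Character pattern recognition of this kind is usually done with regular expressions. The sketch below is only an illustration in Python: the pattern, the `scan` function, and the sample log lines are invented, and the actual patterns used are not recorded here.

```python
import re
from collections import Counter

# Hypothetical pattern for words that tend to mark trouble in system logs
ERROR_PATTERN = re.compile(r"\b(error|fail(?:ed|ure)?|panic|timeout)\b", re.IGNORECASE)


def scan(lines):
    """Count how often each trouble word appears across the given log lines."""
    counts = Counter()
    for line in lines:
        for match in ERROR_PATTERN.finditer(line):
            counts[match.group(1).lower()] += 1
    return counts


# Invented sample lines in a syslog-like layout
sample = [
    "Jan 12 03:14:07 alpha nfsd: write error on /export/home",
    "Jan 12 03:14:09 alpha cron: job completed",
    "Jan 12 03:15:30 bravo sendmail: connection Timeout after 300s",
    "Jan 12 03:16:02 bravo su: authentication failed for root",
]
found = scan(sample)
```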
Since Perl does not require types of data to be specified in advance (a requirement called data typing), and since Perl complex structures require no specific data size, information of any kind can be stored in Perl persistent objects. The database system that I had created for compiling system wide performance trends now had a name, DepthDB. I created a tool set of modules that culminated in a full featured dataserver along with web-based server code giving users a free form repository of searchable information. From this I built a web based editor that could be used to create and manipulate metrical data structures, as well as record more general information.
In a common business use for data mining, companies have scanned
records for
embedded medical information about employees to determine if employees
are a
health risk and to fire them for that reason. To many
people, this
describes a form of data mining that is an incredibly negative use of
the
Information
Society.
Better uses of data mining benefit drug invention; patterns
of patient reactions to medication, called toxicity, can be recovered
that
would otherwise be lost. One of my accomplishments at Merck
was in
supporting a mathematical group doing just that form of discovery.
For the world wide web (WWW) to work you need a pair of components: a web browser, and a server system that provides it with the information it requested in the HTML format. Apache, an e-mutualist and free project, is the most widely used web service system. An e-mutualist community of developers, through selfless cooperative work checked by a relentless peer review process, has made Apache popular by providing a system that is usable, safe and elegant.
Named
for the
numerous update patches, or fixes, applied by programmers in its early
days, the Apache community
has now
grown into a collection of Java related products that compete
successfully at
the corporate conglomeration level, and provide a reference model for
the design
of protocols and standards.
To provide information for a user from within a database, a web service program has to go through several steps: it has to go to a database; collect information from it; and then convert the information into a browser-friendly format such as HTML, or the related XML (extensible markup language). The user at the web browser requests information by sending a query to the web server. That query has the familiar http://www.something.com prefix and concludes with a form of code that is complex enough to specify the needed information, yet still trivial to read and interpret. The web server has a facility to allow programmers to embed programs within the Apache server. These programs provide the logic necessary for contacting the database on behalf of the user and providing the code from which the browser forms a readable web page. A common facility of this type is used by the DepthDB system: the CGI (common gateway interface). CGI is the original web information service; Perl developers pioneered and advanced this technology, creating web services as we know them today. The CGI system is still widely used; it is so efficient that, in the case of DepthDB, performance limits have never been reached. The use of Perl in DepthDB's dataserver, as well as the dataserver's reliance on efficiencies provided by the operating system and its file systems, gives DepthDB great system scalability.
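The steps a web service program goes through (read the query, consult the database, convert the result to HTML) can be sketched as below. This is a generic illustration in Python and not DepthDB's code: the `handle_request` function, the `metrics` table, and the `/cgi-bin/report` path are all hypothetical, and a real CGI program would read the query from its environment rather than from a function argument.

```python
import html
import sqlite3
from urllib.parse import parse_qs, urlparse


def handle_request(url, conn):
    """Answer one request: parse the query, fetch rows, return an HTML page."""
    query = parse_qs(urlparse(url).query)
    host = query.get("host", [""])[0]
    rows = conn.execute(
        "SELECT metric, value FROM metrics WHERE host = ? ORDER BY metric", (host,)
    ).fetchall()
    items = "".join(f"<li>{html.escape(m)}: {v}</li>" for m, v in rows)
    return f"<html><body><h1>{html.escape(host)}</h1><ul>{items}</ul></body></html>"


# Hypothetical data standing in for the collected metrics
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (host TEXT, metric TEXT, value REAL)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?, ?)",
    [("alpha", "cpu_pct", 35.0), ("alpha", "disk_pct", 60.0)],
)

page = handle_request("http://www.something.com/cgi-bin/report?host=alpha", conn)
```

The part of the address after the question mark is the "form of code" that specifies what is wanted; everything else in the function turns that request into a page the browser can display.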
In very large web services designed to handle vast numbers of users,
the
software systems used to supply response information utilize a lot of
code; they are too complex to develop and maintain from within a
webserver.
To
accommodate these relatively new and quickly growing systems, a new
type of
server had to be created: the application server. An
application server
keeps its code base away from the webserver in an area known as a
container, which is usually
on
another
computer.
The
web server is
only used as
an interface to communicate with the Internet, and all the programming
logic is
provided from within the application server’s container of
code.
Java is the language of these systems; it utilizes all
available
object
technology to enable the new huge web services.
The container system allows for increasingly sophisticated systems that are programmed by large teams of developers through the engineering of object oriented modular technology. An important feature of Java is the high level of control applied to the programmer, not found in other languages; this type of control is almost absent from the Perl language. Code written in Java is highly structured and the data is specifically typed. This means that it lacks the flexibility in data handling found in the Perl language, as well as Perl’s generous freedom in allowing programmers a personal coding style. Java’s coding style is controlled by the compiler for the benefit of management, whereas Perl’s lack of restrictions in style has allowed for the creation of software that borders on anarchy.
An
amazing thing
happened in the early 1970’s with the development of the Unix system by Bell Labs, on
behalf of the phone monopoly.
A language had been developed that allowed a systems operator to
actually
communicate with the core of the system, and get human-readable answers
back
from it. Technically a batch language, it differed
diametrically from the
systems control languages of the same type that had been
developed
by IBM for
operating its
mainframes. This added a
unique warmth to the
previously inaccessible central system of a computer; it was given the
descriptive
name of
Shell and was further developed into an entire programming
environment.
A host of tools were added making Shell a highly
sophisticated
programming environment that was easy, and fun, to use.
In the Open Systems environments, typically Unix and Linux, Shell is the language through which users speak to the very interior of the system: the kernel. The Shell starts most programs on a user's behalf. Even the system's initiation process, called "init", relies on Shell scripts to bring the system up. Init, in turn, creates more Shell instances. Shell provides the glue that binds all of the system, and gives every user access to the system. Small applications can be created out of its text programs, which are called scripts. Shell is always alive somewhere in the system, in at least one instance but usually in dozens.
In its conversational nature, Shell reads and interprets every
character as
soon as the user hits the new-line key, or as soon as an end-of-line
character
is met in a script. Since each character is individually
examined, it is
much less efficient to the computer than a compiled program. By comparison, compiled
programs are fed to
the computer in the language of the central processing unit (CPU),
machine code, and then
they are executed at the computer's top speed.
The Perl language came about as a response to Shell's slowness and a need by systems administrators to be able to quickly create efficient code to help manage applications in the large, and growing, Open Systems environments of the 1990's. Perl is interpreted in the sense that Shell is: Perl programs are called scripts and they are kept in text format, and the Perl program reads the characters just as Shell does. Perl is much faster because the program code is read in one pass, and then effectively pre-digested for the computer into an internal form comparable to bytecode, which the Perl interpreter then executes without re-reading the text. Bytecode of this kind is an unreadable, almost machine code, form that can be made portable across computers, where only the final phase of execution is specific to the native computer. The Java language has a similar arrangement but goes a step further in that Java programs are stored in this pre-digested bytecode and not as text code scripts.
Perl has been described as a language with the simplicity of Shell, the speed of C, and the powers of Shell’s supporting programs, sed and awk. The joke was, "it is too good, get rid of it."
Perl is a wildly successful example of e-mutualism; it
was purely
democratically
created by and for systems
administrators. The OO evolution
of Perl used only the features of object oriented methodology that
would be beneficial to its users,
who were
often its developers.
It gave such freedom of use that I, a systems
administrator and not a programmer,
created OO code whose elegance pioneered several concepts.
Perl is fascinating to the Information Society because of its
popularity and
success. Until the emergence of Java as a corporate language,
virtually
every bit of code to be executed by web servers was written in
Perl.
Java, on close examination, is very similar to Perl and it is probable
that
Java's architecture was inspired by Perl.
There is a single repository of Perl modules called the Comprehensive Perl Archive Network (CPAN); it is the most advanced system of its kind. It is a structured repository containing virtually all the publicly available Perl code. Every officially recognized Perl module can be installed on almost any computer from this network archive. The Perl modules that support the CPAN have the ability to check your Perl installation for completeness before adding functionality. When a module is requested, the CPAN code recursively loads every prerequisite module to assure that the desired module has all the supporting sub-modules. Since much Perl code is really C code glued within the Perl modules, there is code compiling in the process, and a series of regression tests is run against every installed module.
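The recursive loading of prerequisites can be pictured with a short sketch. This is not the CPAN client's actual code; it is a plain depth-first walk written in Python, and the module names and the `install_order` function are made up for the example.

```python
def install_order(module, prerequisites, installed=None, in_progress=None):
    """Return modules in an order where every prerequisite precedes its user."""
    installed = [] if installed is None else installed
    in_progress = set() if in_progress is None else in_progress
    if module in installed:
        return installed
    if module in in_progress:
        raise ValueError(f"circular prerequisite involving {module}")
    in_progress.add(module)
    for dep in prerequisites.get(module, []):
        install_order(dep, prerequisites, installed, in_progress)  # recurse first
    in_progress.discard(module)
    installed.append(module)  # all prerequisites are in place by now
    return installed


# Hypothetical modules and what each one requires
prerequisites = {
    "Example::Report": ["Example::Stats", "Example::Store"],
    "Example::Stats": ["Example::Core"],
    "Example::Store": ["Example::Core"],
    "Example::Core": [],
}
order = install_order("Example::Report", prerequisites)
```

Requesting `Example::Report` pulls in `Example::Core` first, then the two modules that depend on it, and the requested module last. A real installer would also compile and test each module at the point where this sketch appends it to the list.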
CPAN is brilliant as a guide and prototype for future public
domain software support systems. Using code distribution
systems
modeled after CPAN,
virtually
any computer (no matter how small) can be fully supported wherever it
goes.
I discovered this potential capability when the CPAN was
first
developed in the
late 1990’s; yet there is no hint of this probable future for
a CPAN-like
architecture in any description or definition anywhere on the
web.
Presently being developed is a Perl interpreter called the Parrot
virtual
machine (Parrot VM), which
promises to be factors faster than the
commonly known Java
virtual machine (JVM). Because
it is
written in a native Parrot language that is very similar to low-level
computer
assembly language, it can potentially have a closer affinity
with the computer than any other virtual machine.
Also being developed is a model called ubiquity, which
describes
a world wide network of a vast number of tiny computers, all
with
network
capabilities, spread out across humanity. These tiny
computers are
sometimes described as wearables.
These three highly efficient technologies (the CPAN repository of code, the Parrot VM, and ubiquity) could be combined into a single networked paradigm that could completely change the focus of the Information Society. Today, civilization is completely dependent on centralized computer operations that are accessed by bulky computers running wasteful code that is usually corporate owned and proprietary. By combining these three technologies, humanity could be served on a completely personal level; together they could ultimately pull the control of computing out of the hands of proprietary owners and distribute it to the whole of humanity. Computing, and therefore the Information Society, would become user-centric; but since a computer cannot focus directly on a person, this form of computer communication would more accurately be described as user-data-centric.
Perl’s technological progress over the past five years has
been focused
on a new and much different version of the language called Perl6, along
with the
underlying virtual machine, Parrot VM. The new virtual
machine
was named after a deliberate rumor, an April fool's joke. The
design of
Parrot is pure genius, but unfortunately, the arrival date of a
workable version
will be some forty years in the future.
I am not sure why Perl6 is needed; Perl
is so amazingly good that replacement seems unnecessary; work being
done on it
by the e-mutualist community draws volunteer resources away from the
Parrot VM
project. Competition
with Java seems to drive Perl6
development; but there already is a fully developed language directly
evolved
from Perl called Ruby. On
the other hand,
the
Parrot VM’s
technology is so inspired that its value won’t be realized
until the
Information Society experiences it; I hope a way can be found to
accelerate its development.
My curiosity about Ruby prompted me to ask expert Java developers what
they
thought of Ruby in comparison to Java. After looking it over
they all said
unequivocally, "Ruby is the perfect language."
The
"ML" in
HTML (hyper text markup language) stands for markup language and
represents a type of language used
for developing computer based documents. Markup languages
have
existed in computing since the 1970s; the original example
is known as SGML, which introduced the use of angle brackets
(“<”, “>”) for controlling text. A
desire by
the scientific community
for an efficient way of sharing information evolved the Internet into
the World
Wide Web with the introduction of HTML.
The most useful feature of
HTML is its
namesake concept of hyper text, where web page readers can
access relevant information about a word, or phrase, by clicking on it,
if it is
highlighted. The clicking action brings a reader to another
hypertext
document. A computer that served web pages, a web server, was
introduced in 1990;
and web
services were made free for use over the Internet in 1993.
Only
two years
later, in 1995, large portions of the world were experiencing instant
access to
hypertext enabled information.
HTML encourages page coders to use relative concepts. Letters
can be
described as being larger or smaller than previous ones in varying
degrees, and
the widths of columns are designed to be proportioned in percentages of
the width of a page.
When loaded into a browser occupying a largish desktop
window, the
columns will appear wide and short. If the same page is
loaded into a
much smaller browser window, the columns will be narrow and deep.
This
allows a variety of computers to access the same pages, displaying them
in a way appropriate to the specific capabilities of each computer's
hardware. Font types
and sizes can be determined by the user by changing settings
on the
browser. With the use of cascading style sheets, more control
is
granted to
the user. HTML specifications do allow for the types of
controls
typically used in word processors and typesetting.
Unfortunately they are
overused, taking away much of the flexibility originally intended for
web pages.
Cascading style sheets (CSS) are remarkable in that they take the
attributes controlling
the graphic layout out of the HTML code, and place them in the head of
the
document in the form of CSS code. When a browser with CSS support
renders a page, it finds the id and class attributes within the markup
elements in the HTML code and references the CSS code to
see what instructions are needed to lay out the page. The CSS
code can be
kept in a separate file to be accessed by the browser for any number of
related pages,
giving them all similar appearance. The CSS code can then be
easily
modified to give the same set of pages a completely different
appearance.
Larger fonts or brighter colors can be provided for people
who need them
for sight reasons. Users can specify their own CSS code and
keep a
personal CSS file available to enhance pages to their personal
requirements.
XML (extensible markup language) is very common now. It was once
considered a revolution in information engineering, but it is now
considered a successor to HTML; XML is useful mostly
for creating a variety of document
types. The excitement that XML experienced started around the
Millennium;
it was promoted as the interface between all the various data
processing
systems that cannot communicate with each other because of proprietary
restraints in their data coding formats.
Unlike HTML, which has specific markup tags, XML's tags are entirely
user
defined and non-proprietary. Tags can therefore resemble, for
instance,
the same accounting conventions that are used in the relational
database.
The definitions that determine what those tags are
usually live in a separate file called a DTD (document type definition).
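The idea of user-defined tags that resemble accounting conventions can be sketched in a few lines of Python. This is only an illustration: the tag names (ledger, entry, account, debit, credit) and the figures are invented for the example, and no DTD is consulted.

```python
import xml.etree.ElementTree as ET

# Hypothetical ledger document: the tag names mirror accounting column names.
LEDGER_XML = """
<ledger>
  <entry>
    <account>Office Supplies</account>
    <debit>125.50</debit>
    <credit>0.00</credit>
  </entry>
  <entry>
    <account>Cash</account>
    <debit>0.00</debit>
    <credit>125.50</credit>
  </entry>
</ledger>
"""

def read_entries(xml_text):
    """Return each <entry> as a dict keyed by its user-defined tag names."""
    root = ET.fromstring(xml_text)
    return [{child.tag: child.text for child in entry}
            for entry in root.findall("entry")]

entries = read_entries(LEDGER_XML)
total_debits = sum(float(e["debit"]) for e in entries)
total_credits = sum(float(e["credit"]) for e in entries)
```

Because the tags carry the meaning, a program that has never seen this ledger before can still pick out the debit and credit columns by name.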
XML provides some of the benefits of object oriented complex structures
but without significant
support. It was proposed that XML become the storage medium
itself, where
XML pages would replace the data in databases. This created
unrealistic
expectations that XML would become a dominant computer language, rather
than
just a text markup language.
The sophistication that XML has brought to HTML has propelled the
publishing
industry into the contemporary Information Society.
By combining
CSS with XML, browsers can render XML data into web pages.
The ID
tags within the XML structures that link the structures to the
definition reference tables can be associated with the familiar HTML
printing codes for creating pages with the use of CSS code.
Unfortunately, the browsers themselves
have not been able to fully comply with up-to-date CSS standards.
XML is commonly used for formatting Internet tools other than
browsers, such
as chat tools and RSS news readers. Systems already related
to the web,
such as Apache Tomcat, use XML for configuration files.
Presently, the popular public domain office suite Open Office saves
all its
data in a native XML format. Microsoft is promising that by
2006, Microsoft
Office
documents,
including Word's, will be saved in XML. This is significant
because it
will make Microsoft document formats much more available to alternative
office
suites, giving those suites a better chance in the software market.
Computers
typically operate
in the client/server mode. When users require information on
their
personal computers their computer, called a client, makes a connection
to a
server. Technically speaking, the software client running on
the user’s computer connects to a socket provided by the server.
The server software process, often called a daemon, binds the server
socket to a network port and listens on it. The socket
is always available and waiting for a client to connect to it; data
can then be returned to the client software being run on the user's
computer.
This
is the common
network connection model used by the Information Society, and was
designed in
When accessing a web page, a user's computer makes a single connection
and the
server returns a single response. The web server will
probably
handle
many
requests before that particular user makes another
connection. This
single request and response scenario describes stateless
connections.
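A minimal sketch of this single request and response exchange follows, written in Python and run entirely on the local machine. The function names and the message text are assumptions made for the illustration; a real web server speaks HTTP and handles many clients at once.

```python
import socket
import threading

def serve_once(server_sock):
    """Accept one client, answer its single request, then close the connection."""
    conn, _addr = server_sock.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"response to: " + request)

def single_request(port, message):
    """Client side: connect, send one request, read one response, disconnect."""
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        return client.recv(1024)

# The daemon side binds a listening socket; port 0 lets the OS choose a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen()
port = server.getsockname()[1]

worker = threading.Thread(target=serve_once, args=(server,))
worker.start()
reply = single_request(port, b"GET page")
worker.join()
server.close()
```

Nothing about the client is remembered once the connection closes, which is what makes the exchange stateless.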
In
comparison, when administrators communicate with a distant computer,
they use a
remote Shell to create a stateful connection to that computer using the
SSH
encrypted
communication protocol. The SSH server on the remote computer
is waiting for a connection to come to it from an attaching
client which
is on the administrator's computer. It will carry out an
authentication
procedure; it will then read the keys as they are typed on the user's
keyboard.
The information from the keyboard is sent to the computer running the
server where it is given to the kernel. The response from the
kernel is
sent by the server through the socket back to the user's computer where
the information is written on the monitor. The SSH server
then
waits
for
more
characters to come from the user's keyboard. This continuous
cycle of input by the user and response from the server defines the
stateful connection.
A
file server often holds the
personal workspaces for users in a central location, rather than having
these directories spread across a network on personal
computers.
The file server effectively supplies remote
disks for users, keeping their files elsewhere on the network. A
file server can contain any kind of
data, including entire applications. A benefit of keeping
data on a file
server, rather than on a user's computer, is the ease provided for
maintaining data;
backups
of the data as well as system updates can be performed easily if the
data is centrally located. Safety enhancements, such as
disk mirroring,
can ensure the integrity of data sets by protecting them against disk
failure. Network
disk sharing was important in earlier days of computing when disk
space
was very expensive.
The typical file service system in Open Systems (Unix
and Linux) environments is the Network File System (NFS), which was
originally released by Sun Microsystems in 1984. The Microsoft
counterpart service is called
Server Message Block (SMB). The public domain Samba
system provides
the same SMB service on Open Systems so that Microsoft
workstations may rely on Open Systems servers.
NFS is also
available for Microsoft machines, both for desktop clients and as a
file
server.
Technically speaking, three terms define data services. Databases are
simply collections of data, whereas relational database management
systems (RDBMS) are data storage and retrieval products. A typical
RDBMS is the Oracle system, which includes query languages; storage and
indexing systems; and a wide variety of specific functionalities. Data
servers are generally machines that contain databases and run the
querying and updating systems. Described more accurately, a data server
is the instance of data storage and access software, running on the
machine, that comprises the RDBMS.
IT groups have the responsibility of designing the schemas that contain
the data, managing the data present on the systems, and controlling
access to it. A database would presumably run within a single RDBMS, but
may include many machines. A data mining operation, for instance, would
very likely span several data servers, where a separate application
server would be responsible for dredging the data, and it would probably
run on a separate machine.
A
single powerful
computer can operate a very large web service. The Apache web
server is
very sophisticated, yet the
actual process of
serving a page of HTML code is almost trivial. Web servers
have
been implemented in microchips no larger than a matchstick
head. I
found in my
DepthDB
application that the vast majority of the work done during the whole
process,
where data is requested and presented to the user, was actually done by
the browser in formatting the page on the screen. Features
that
I have
implemented in Apache are the encrypted tunnel (SSL), password
protection, CGI
Perl scripts, Mod_Perl (enhances Perl server code), and virtual hosts
(allowing
multiple websites to use a single server).
If one
views the multiple
requests that a user makes to a Web server as a
session, the state of the session (information specific to a
user's progress through the system) is usually kept either in the
database that is linked to the server, or in
a temporary holding unit on a user's machine called a cookie.
Alternatively, data specific to the user's session can be
encapsulated
in the data strings to be sent back and forth as part of the web
communication process between the web
server and the user's browser.
The concept of keeping the state (session) information within the
"back and forth" communication process (rather than in a database or a
cookie)
is the route that I
chose for DepthDB. I created a specific tool using complex
structures to encapsulate
the
user's specific information. The HTML standards
include a "hidden" form field, which is where I stored the complex
structure for the trip from the browser to the server.
For the
return trip, the complex structure is provided to the browser in the
form of
an environmental variable.
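The technique of carrying session state inside the page itself can be sketched as follows. This is not the DepthDB code, which was written in Perl; it is a small Python illustration in which the field name "state", the JSON encoding, and the sample session contents are all assumptions.

```python
import base64
import html
import json
import re

def encode_state(state):
    """Serialize a nested structure into text safe to place in an HTML attribute."""
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")

def decode_state(token):
    """Reverse encode_state when the form comes back from the browser."""
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))

def render_form(state):
    """Build a form whose hidden field carries the session state to the next request."""
    token = html.escape(encode_state(state), quote=True)
    return (
        '<form method="post" action="/next">\n'
        f'  <input type="hidden" name="state" value="{token}">\n'
        '  <input type="submit" value="Continue">\n'
        '</form>'
    )

session = {"user": "alice", "step": 2,
           "filters": {"table": "accounts", "rows": [1, 5, 9]}}
page = render_form(session)

# Simulate the browser posting the hidden field back to the server.
posted = re.search(r'name="state" value="([^"]*)"', page).group(1)
restored = decode_state(html.unescape(posted))
```

State kept this way is visible to, and can be altered by, the user, so a production system would normally sign or encrypt the field before trusting it.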
The Apache server runs a CGI script every time it is accessed; it then
removes the program code from memory, only to have to reload it again
the next
time there is a user request for its services. Having to
reload
the same code many times is
wasteful;
when the server is accessed tens of thousands of times, the constant
reloading becomes a
source of
delays. Perl and Apache solve this problem with the Mod_Perl
facility, which allows the Perl program to live within the Apache
program. As
the
Apache program is always alive,
the commonly used Perl
program within it only has to be loaded once. Other
facilities
like
this exist in
Apache for most languages.
The
chosen language for
very big corporate web programs is Java. The
programs that provide
web services in Java are called servlets and JavaServer Pages
(JSP). The
business logic behind the web services is built into a modular
architecture
called Enterprise JavaBeans (EJB). The Java 2 Enterprise
Edition
(J2EE)
provides standards for containing the web components.
Typical of the containers, which really define the application server, are Tomcat (from the
Apache community) and JOnAS (from
ObjectWeb). Both application servers boast e-mutually
provided
code which is freely and
openly available.
Servlets are Java programs that descended from the web applets that are
often embedded in HTML pages. Applets were
the original use for Java, but Java's design
principles made
it ideal
for wide
scale modular architectures.
JSP is a way to create HTML pages by embedding the server logic within
stored
web page code as references. When the page is designed by an
HTML coder,
the references are written into the pages where the data should appear
when the
page is created for a user. When
the markup code
is created for the user's browser by the server, the values are
injected into the page code in these
locations; the web page and the business data are effectively blended
with the use
of these
references. With
this arrangement, HTML
coders can just develop web pages, while their Java programmer
counterparts can work
entirely on business logic. This is an improvement over the previous
technique, where pages had to be built by the server code
itself,
requiring
Java programmers to be expert HTML coders as well.
The server software had to
be written to
print HTML code while it was printing the output from the business
software. The DepthDB system,
being based on CGI
technology, uses this older, more awkward process, creating less
attractive, more utilitarian web pages.
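The difference between the two techniques can be shown side by side. The sketch below uses Python's standard string templates as a stand-in for JSP references; the page, the placeholder names, and the account figures are invented for the illustration.

```python
from string import Template

# Written by the page designer: plain HTML with named placeholders (hypothetical names).
PAGE_TEMPLATE = Template(
    "<html><body>\n"
    "<h1>Account $account_id</h1>\n"
    "<p>Balance: $balance</p>\n"
    "</body></html>"
)

def lookup_balance(account_id):
    """Stand-in for business logic; a real system would query a database."""
    balances = {"A-100": "1,250.00", "B-200": "87.15"}
    return balances[account_id]

def render_account_page(account_id):
    """Inject business values into the designer's page at the marked locations."""
    return PAGE_TEMPLATE.substitute(account_id=account_id,
                                    balance=lookup_balance(account_id))

def print_style_page(account_id):
    """The older style: the program itself emits every line of markup."""
    lines = ["<html><body>"]
    lines.append("<h1>Account " + account_id + "</h1>")
    lines.append("<p>Balance: " + lookup_balance(account_id) + "</p>")
    lines.append("</body></html>")
    return "\n".join(lines)
```

Both functions produce the same page, but only in the first can the designer change the layout without touching program code.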
JavaBeans are the independent class component units of the Java2
architecture
from Sun Microsystems.
Application servers are referred to as middleware. They
provide systems
transparency for programmers so they don't have to be concerned with:
the
operating system; the specifics of network computing; or the huge array
of
interfaces usually required of a modern web based
application.
The
application
server communicates with the web in the form of HTML and XML; it links
to
various kinds of databases; and often it links to systems and devices
which can
range from huge and irreplaceable legacy applications to home
appliances.
Portals are a very common application server system by which
organizations can
manage information for their users. A portal
provides a
single point of entry for
all users where they
can
access information services transparently
from any device anywhere within the sphere of the organization.
They can work flexibly inside or
outside of the organization's offices, and they can attach themselves
to any part of the
organization.
Managing
all these
systems is daunting; there are dozens of services run in an IT
department. The
most important is the domain name service
(DNS), which maps systems’ names to their addresses;
other support services
include automated directories of users and systems; monitoring and
decision
making systems; and security services. Maintenance
and updating responsibilities are assigned to well-protected machines;
they make
modifications to all the remote machines in a highly secure supervisory
mode, often as an automated process.
All these small but crucial roles take place from within the systems
administration
group,
usually from inside the machine rooms of the data center.
Systems Development
Software Systems Analysis and Design
This
is a
quote from an object oriented programmer. Many people would
see this
approach of systems development as counter intuitive, as putting the
cart
before the horse.
The
traditional approach of engineering a project is called the waterfall
approach;
it is straightforward and resembles the way any team might approach a
project.
First they would state the requirements, analyze them, design a
solution
approach, and then envision a framework for that solution.
They
would then develop
code for the system and test the code as it is being created.
Assuming everything works,
they would
deploy the system to the customer.
While
the
waterfall approach is probably the most common method of system design,
it has
recognized shortcomings. It lacks flexibility for easily correcting
design errors as they are found during development, and it cannot adapt
to changing requirements. The customers requiring the
software system often
discover that they have more needs as the system is being developed.
The
quote at
the beginning shows the opposite of the waterfall approach; it is
typical of the
philosophies of
the newer schools of software engineering called radical
approaches. These new software
development
approaches came into being during the hectic growth of the
1990s; conservative
engineers are leery of them.
The
above
quote refers to the test driven approach; it is from an
individual
programmer, not a manager or an engineer. It shows a design technique
where a
programmer first develops a finite list of outputs that a future code
module is
expected to produce;
and then the programmer continually tests the module as the code is
being written.
The module is written to exactly fit that list of
requirements;
this helps assure that it will
work as specified, and that no unnecessary code is written into the module.
It
also
enhances one of the greatest values of the modular programming world;
it
assures that the code is potentially reusable. Effectively,
the module the programmer is developing serves two masters:
the testing software that he wrote, and the
higher-up
modules within the system that will call this module. Future
users
of the code will
feel comfortable that the module
is usable for their purposes because
the
test software stands as a reference. The potential
reusability of a module is as important as its integrity is.
When
the test is
being used to develop the software, the programmer is not looking for
proof
that it works, but for the parts of the module that do not yet
work. The
test software seeks to fail the module in ways that tell the programmer
what
next needs to be created, or modified.
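The test driven cycle described above can be sketched in Python. The module under development, slugify, and its list of expected outputs are hypothetical; they stand in for whatever small module a programmer might be building.

```python
import unittest

# Step 1: the finite list of outputs the future module is expected to produce.
# slugify and its cases are hypothetical, chosen only for illustration.
EXPECTED = [
    ("Hello World", "hello-world"),
    ("  Parrot VM  ", "parrot-vm"),
    ("Perl6 & Ruby", "perl6-ruby"),
    ("", ""),
]

# Step 2: the module, written only until every listed case passes.
def slugify(title):
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    words, current = [], []
    for ch in title.lower():
        if ch.isalnum():
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return "-".join(words)

class SlugifyTest(unittest.TestCase):
    def test_expected_outputs(self):
        for given, wanted in EXPECTED:
            with self.subTest(given=given):
                self.assertEqual(slugify(given), wanted)

def run_tests():
    """Run the suite and report whether every expectation was met."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    return unittest.TextTestRunner(verbosity=0).run(suite).wasSuccessful()
```

While the module is incomplete, each failing case points at the next piece of code to write; once run_tests() succeeds, the list of cases remains as the reference for anyone who wants to reuse the module.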
The
developer
who wrote the opening statement uses a unit test to develop a small
module for
a large system. In comparison, a test of the whole system is
called a
functional test. Developers sometimes request a set of
requirements
specific enough for a functional test, even before a large project is
designed. While
this approach may
seem absurd, it helps
developers bring home a point
by challenging customers to fully understand
what their system needs really are. Most important to
understanding the
test driven approach is that the development process is only as good as
the tests
are. A
programmer practicing this approach has to be a good test developer as
well as
a systems engineer.
A very effective iterative life cycle approach has been developed which
describes a series of four milestones that a software systems project
must pass before it can be called successful. With
the first milestone,
the
business case of a system has to be developed; expectations of its
benefits have to justify the resources being invested in it.
To
pass the second milestone, the components of the system have
to be
well understood with respect to their activities and the system's
participants; it is here that the
architectural structure is envisioned. The actual
construction of the
system
and its code is done to reach the third milestone. Finally,
to
pass the fourth milestone, the system is
delivered to the user community for an acceptance test; the
system is operated in parallel to existing systems to assure its
integrity. If, in this test,
the
product cannot be used, it goes back to the drawing board, or is
possibly halted and forgotten.
In the development of a sophisticated application, the above scenario
is actually nearly impossible
to achieve in the four step life cycle process. In practice,
the development process is broken down into smaller iterations
of
this process. The project goes through many repetitions of
the
four step life cycle process; each iteration of the project through the
life cycle
process produces a working prototype
of the final product where the body of the project gains size and
sophistication. The lifecycle process is also applied to
each of the subset
modules as well, allowing an organic growth from within the core of the
project. Developers can halt the project anywhere along
the iterative cycle
process, knowing it is
much cheaper to rework design concepts than rewrite the code.
As the entire application iterates through the life cycles, the design
team continually releases increasingly
usable
code to the user community. All
the
while, they gather feedback from the users during the user
acceptance testing that follows each of the life cycles. As
an
added benefit, the interim releases
bring value to the enterprise in that they provide at least
some
of the needed functionality. Now that the user community
has gained
experience with prototype versions of the actual final product, they
will almost certainly
discover new requirements. Their suggestions for new or
improved
features are implemented into the next life cycle, giving the
project value
beyond initial expectations.
Not all projects can be delivered in repeating
life cycles.
Aircraft
software systems, for instance, have to be delivered in whole and in
perfect condition before
the first flight of the aircraft. This type of software
development requires true engineers, not just talented
developers. The testing
is done exclusively by simulation systems; the testing process is
highly
mathematical. Simulation testing is the only
feedback
mechanism available to designers and programmers when building this
type of system.
Even aircraft systems design examples describe precisely what the
developer in the
initial
quote seeks from the outset; testing is a driving force of
development. In the lone developer's case, the coder is his own
software engineer,
and
therefore he is solely responsible for his work.
Some software system development approaches concentrate on the human
side, proposing close
personal bonding in
the coding process. With these approaches, everybody on the
team takes ownership
of
all the code
in the project; each team member improves it for integrity and
tightens
it for
efficiency as soon as problems are noticed. Bonding in the
group
guarantees the equal distribution of knowledge about the system so that
there
is a universal understanding of how the entire system works; all team members
understand how to write effective test programs to further drive
development. Correct knowledge sharing is
important.
My experience using modules from the Perl CPAN was sometimes
frustrating in that the
regression tests obviously didn't test the real substance of the module
because things weren't working. Only by field testing the
modules
was I able to prove their value.
As
can be expected, the
modular
nature of software systems engineering parallels the nature of
object
oriented software paradigms. Surprising,
however, are
similarities between abstracted design concepts and the actual
computers and
their software. The model view controller (MVC) pattern is
an abstracted concept that
seeks to separate each web based transaction into three phases: the
input, or controller, phase (input from a
user's keyboard); the
logic, or model, phase (where the systems do their internal work); and
the
visible results, or view, phase (answers are returned to the
user's screen). This abstraction layer fits directly over
the schematic of
an application server.
These two tables illustrate a progression from the abstracted concept
through
the logic phase to
the hardware layers where the arrows represent the communication
components.
Typical Web Example
| Controller | -> | Model | -> | View |
| Input | -> | Processing | -> | Output |
| Keyboard Typing | -> | Containerized J2EE Code Modules | -> | Web Page Creation and Delivery |
Motorist Lockout Example
| Input | -> | Processing | -> | Output |
| User Computer | -> | Application Server | -> | Remote Device |
| Hand Held Device | -> | Containerized Code | -> | Car Computer |
| User Tells of Lockout | -> | Code Logic Confirms | -> | Car Unlocks Door |
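The motorist lockout example can be traced through the three phases in a short Python sketch. The owner registry, the car identifier, and the input format are all assumptions made for the illustration; a real system would authenticate the user and talk to an actual car computer.

```python
from dataclasses import dataclass, field

@dataclass
class LockoutModel:
    """Model phase: the internal logic and data (hypothetical owner registry)."""
    owners: dict = field(default_factory=lambda: {"CAR-42": "alice"})

    def confirm(self, user, car_id):
        return self.owners.get(car_id) == user

def view(car_id, unlocked):
    """View phase: turn the model's answer into what the user sees."""
    return f"{car_id}: door unlocked" if unlocked else f"{car_id}: request denied"

def controller(raw_input, model):
    """Controller phase: parse the user's input and route it to the model."""
    user, car_id = raw_input.strip().split()
    return view(car_id, model.confirm(user, car_id))

model = LockoutModel()
granted = controller("alice CAR-42", model)
denied = controller("mallory CAR-42", model)
```

Each function could be replaced independently, for instance swapping the text view for a signal sent to the car, without disturbing the other two phases.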
Developments in the design and
creation of large software applications have not been
limited to books, paradigms and
practice.
Actual languages have been created to facilitate
development. The
Unified Modeling Language (UML) extends the tool set concept to a
unique form
of notation; the best-known UML tools, from Rational, are now owned by IBM. The giant successfully sells
management
packages built from it to well-funded critical projects for
huge amounts
of money.
Another important approach to design, aspect-oriented engineering, has resulted
in
languages that specifically add integrity to systems. Aspect-oriented
programming
assures integrity in systems by finding areas of concern that
are common
to every part of a system, yet don't necessarily relate to any of the
specific purposes of the system. These are
called
cross-cutting concerns. Typical examples of these are logging
and
debugging. There is a very strong resistance in most
programming projects
to providing verbose error information about the progress of a system.
Likewise, debugging is considered an afterthought because it
does not
directly add to the benefits expected of the system.
Aspect orientation makes these important, yet often
ignored, functions part of the internal
workings of the modules. Since this functionality is buried
in the
system, it is transparent during the normal programming process.
It does
not impede the coding efficiency of the programmers and only minimal
effort is
needed to enact the capabilities. Programmers insert
triggers, called
join points, in important locations within the code to enact the added
integrity features.
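Python has no aspect language of its own, but a decorator gives a rough analogy to how a cross-cutting concern such as logging can be attached from outside a module's code. The sketch below is that analogy only, not an aspect-oriented language such as AspectJ; the logger name and the transfer function are invented for the example.

```python
import functools
import logging

logger = logging.getLogger("aspects")

def logged(func):
    """Wrap func so every call and result is logged without touching its body."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("enter %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            logger.exception("error in %s", func.__name__)
            raise
        logger.info("exit %s result=%r", func.__name__, result)
        return result
    return wrapper

@logged  # the single line that attaches the cross-cutting concern
def transfer(balance, amount):
    """Business logic only; it contains no logging statements of its own."""
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount
```

The programmer writing transfer never sees the logging code, yet every call, result, and error is recorded once a handler is attached to the logger.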
As part of the design focus on the users, attention is given to assure
good
communication all around. Cards are used to abstract the
design
of the
entire system
to pull the focus away from technical details. This helps
workgroups
subjectively model the basic premises of the system's
existence,
strengthening the project's goals. The cards represent the
components of
the design model; they can be manipulated in various ways to represent
the states
and activities of the system, and the progress of the
project.
When used in one way, in the Object Model, the placement of the cards
represents the structure and substructures of the system's attributes,
operations, and associations. Used
another way, in the
Dynamic model, the cards show the behavior of the system in Sequence
(collaboration) diagrams, Activity (work flow) diagrams, and Statechart
diagrams (a slice of the system's activity at any moment).
The cards are useful at meetings: simply stacking them into
piles
representing levels of progress can assure management that the project
is
succeeding; flipping through the cards and passing them
around can focus
discussion to specific areas of concern or modules; the scope
of
discussion can be widened to the entire project by combining the cards
into
groups.
There is an irony here; software systems engineering as I
have described it, is completely absent in the creation of some of the
most
significant
software there is. I am unaware of any software I am using
right now that
has benefited from simulators, state diagrams, or cards. The
modeling of
these systems is done mostly through email communication, at the group
level,
and a form of meditation is done during the programming
process. Maybe this
sparse planning scheme is detrimental, especially during the
hours of solitude experienced by most e-mutualist
programmers.
Perl, for one, seems to have gone astray with its
Perl6 and
Parrot projects; the arrival dates of usable versions of these projects
are so far
away.
It seems as if the volunteer work force for Perl is spread
too thinly for
its lofty goals; yet I am not sure a greater workforce would
accomplish the necessary development work more quickly. I can
show examples of
violent personality
clashes
over the Usenet between Perl programmers. It is possible that
the
techniques of the
radical developers, including the seeming silliness of using cards,
could
create a
sense of abstraction to reduce energy dissipation.
Perl has, nonetheless, built a model for the future management and
distribution of the world's code in its CPAN
repository. The Linux operating system is waiting for some
legal
governance to push
back the crushing dominance of the Microsoft monopoly; if and
when that happens, the free Linux operating system will be given a
chance to eclipse the XP system of Microsoft. Very likely,
other public
domain software projects will then also replace their proprietary
counterparts.
Hindsight
is a powerful learning tool for
software design and a design approach has been created to utilize it.
Design patterns are the encapsulated experiences of
programming efforts. The experiences are analyzed, documented and then
placed
in an online repository. Programmers can use these design
patterns
as templates, just as seamstresses lay patterns over
cloth. Patterns
can be fitted to any programming challenge or, alternatively, rejected
without any cost.
The
patterns, when viewed as a single repository, represent the sum
experiences of
software development and are an excellent learning resource.
Using these
tested and proven paradigms helps illuminate subtleties in
designs that
can result in major problems later on in projects. With the
use
and reuse of patterns, experience
accumulates, improving the patterns. Programmers
and engineers
who understand design patterns, and participate in their improvement,
benefit from them by
creating code that is improved in structure, integrity, and
readability.
Like all other design approaches, design patterns use abstracted object oriented concepts. They provide generalized solutions that are documented in a format independent of any particular programming language, design paradigm, or problem type.
When
working with common
computer systems, observations fall into four distinct categories: CPU
(central
processing unit), Memory,
Disk and Network.
These four categories create the CMDN
model. Within these
categories, a wide variety of metrics are available to quantify the
usage, status and configuration details of all types of
systems components
and sub-components.
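As a small illustration of the CMDN categories, the Python sketch below gathers one metric for each using only the standard library. The function name and the particular metrics chosen are assumptions; real monitoring tools collect far more, and the memory figure relies on a POSIX call that some platforms do not offer.

```python
import os
import shutil
import socket

def cmdn_snapshot(path=os.sep):
    """Collect one illustrative metric for each CMDN category.

    Values the platform cannot report are returned as None.
    """
    try:
        # Physical memory via POSIX sysconf; not available on every platform.
        memory_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        memory_bytes = None

    disk = shutil.disk_usage(path)
    return {
        "cpu": {"logical_cores": os.cpu_count()},
        "memory": {"total_bytes": memory_bytes},
        "disk": {"total_bytes": disk.total, "free_bytes": disk.free},
        "network": {"hostname": socket.gethostname()},
    }

snapshot = cmdn_snapshot()
```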
The software to run the computer is the operating system (OS).
The most important
system code in the OS sits in the layer just above the CPU; it is called
the kernel. Memory
is shared by
all software, and configuration changes to it are from within the
kernel, or in
code very close to the kernel's code. Kernel code is so
important
to
the system's
operation that it is always kept in memory and is modified by changing
active
memory.
The management of the CPU and the OS is generally known as
performance
tuning. Updates to the OS usually involve
recompiling the
kernel and its supporting code, along with their installation into the
OS. The
boot process
connects the initial code of the OS kernel to the startup code within
the computer
hardware. The term is derived from the phrase "raising
yourself
by
your own bootstraps." In a more generalized sense, the boot
process
also
attaches the kernel to the file systems of the OS. There the
kernel
program finds the Shell executable code as well as modules
that
are loaded directly into the kernel's memory space.
When accessed, the kernel makes the modules an extension of
itself,
usually to
allow user programs access to hardware devices.
Disk systems are technically called storage. Storage
differs from
memory in that storage persists between re-boots, whereas memory is
wiped
clean. The object oriented term persistence derives from the
permanence
of disk stored data. Disks are typically divided into a
series of large
partitions preceded by a tiny slice at
the very
beginning of the disk called the master boot record (MBR). This initial portion is
given to
the boot strap process and contains a wholly independent program.
The
e-mutualist community has developed the GRand Unified Bootloader
(GRUB) to
live in
the MBR. It functions as an exceedingly small operating
system to be able
to give increasing capabilities to the boot process.
Disks on a PC, for instance, are usually allowed only four primary
partitions, with one of them (typically the last) designated an
extended partition that can be split further. For the Open Systems
operating systems (Unix and Linux), the limit of four partitions has
never been adequate because these operating systems keep different
kinds of data on a larger number of file systems in different
partitions. There is a convention for extending the number of
partitions by dividing the extended partition into logical
sub-partitions, but this arrangement adds unnecessary complication to
the process of disk management.
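The four-slot limit comes from the layout of the first disk sector itself. As a rough sketch only, the following Python reads the partition table out of a 512-byte MBR image. The function names and the sample partition values are invented for illustration; the offsets (446 bytes of boot code, four 16-byte entries, a two-byte signature) follow the classic PC layout.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # the partition table follows 446 bytes of boot code
ENTRY_SIZE = 16           # exactly four 16-byte slots fit before the signature
SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR
ENTRY_FORMAT = "<B3xB3xII"  # status, (CHS), type, (CHS), first sector, sector count


def parse_mbr(sector):
    """Return the used primary partition slots of a 512-byte MBR image."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != SIGNATURE:
        raise ValueError("not a valid MBR sector")
    partitions = []
    for index in range(4):
        start = TABLE_OFFSET + index * ENTRY_SIZE
        status, ptype, first, count = struct.unpack(
            ENTRY_FORMAT, sector[start:start + ENTRY_SIZE])
        if ptype != 0:  # type 0 marks an unused slot
            partitions.append({"slot": index + 1, "bootable": status == 0x80,
                               "type": ptype, "first_sector": first,
                               "sectors": count})
    return partitions


def build_demo_mbr():
    """Build an in-memory MBR with two made-up partitions (illustration only)."""
    sector = bytearray(SECTOR_SIZE)
    entries = [(0x80, 0x83, 2048, 204800),    # bootable Linux partition
               (0x00, 0x05, 206848, 819200)]  # extended partition
    for i, fields in enumerate(entries):
        offset = TABLE_OFFSET + i * ENTRY_SIZE
        sector[offset:offset + ENTRY_SIZE] = struct.pack(ENTRY_FORMAT, *fields)
    sector[510:512] = SIGNATURE
    return bytes(sector)
```

Calling parse_mbr(build_demo_mbr()) returns two entries; a real disk image would be read from the first 512 bytes of the device instead.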
Logical volume management (LVM) solves the problem of a shortage of
partitions by allowing the OS to create a
software layer above the disks. LVM software breaks up all of
the data on
the file systems into tiny units and assigns each of them to arbitrary
locations on a disk, or across an array of disks. In this way
the
data is abstracted to allow the
user and the system to
have
complete flexibility in arranging the data across disks.
LVM software uses this flexibility to keep concurrent copies
of the data,
a technique called mirroring, or to provide the ability to resize and
move file
systems at
will.
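The idea of assigning small units of data to arbitrary disk locations can be pictured with a toy model. This is a sketch under simplifying assumptions, not how any real LVM implementation is written: the class name, the disk names, and the first-free allocation rule are all hypothetical.

```python
class VolumeGroup:
    """Toy pool of fixed-size extents spread over several disks."""

    def __init__(self, disks):
        # disks maps a disk name to the number of extents it holds
        self.free = [(name, n) for name, count in disks.items()
                     for n in range(count)]
        self.volumes = {}

    def create(self, name, extents):
        self.volumes[name] = []
        self.extend(name, extents)

    def extend(self, name, extents):
        """Grow a logical volume using whatever extents are free."""
        if extents > len(self.free):
            raise ValueError("not enough free extents")
        for _ in range(extents):
            # Any free extent will do; the volume need not be contiguous
            # or even confined to one disk.
            self.volumes[name].append(self.free.pop(0))

    def locate(self, name, logical_extent):
        """Translate a logical extent number to (disk, physical extent)."""
        return self.volumes[name][logical_extent]
```

With disks {"sda": 2, "sdb": 3}, a three-extent volume spills from the first disk onto the second, and extend() later grows it without moving existing data. Mirroring would amount to keeping two such mappings for the same logical extents.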
Really good file system software has existed for decades in Open
Systems, but high-integrity file system software has only
recently reached the common desktop computer. Today,
computers equipped entirely with freely available disk integrity
suites can withstand repeated power outages without damage to the
data. A technique called journaling is used to provide this
capability; a well-known example is the open source
ReiserFS file system.
By combining it with freely available LVM software and an array
of cheap disks, an average person
can build a storage system that is as
reliable as any found in a well-funded corporate datacenter.
Computer networks, technically speaking, operate outside of
the
computers that they link together;
yet much of the network hardware necessary to
operate networks
is within the computers. Network configuration is likewise
handled
locally on the computer, as is performance monitoring.
Ethernet is the most
common wired
communication protocol and has enjoyed two decades of stable and
efficient
performance.
Sometimes the CMDN categories overlap. Running out of memory
is like
running out of air. To prevent this form of complete failure,
the
operating system has algorithms
to move pieces of
infrequently accessed memory to the disk. Small portions of
data in
memory continuously trade places with data on the disks as memory
becomes
scarce, or stored memory is needed to run programs. From the
perspective of the system's operation, the
data moved to
the disk is effectively still memory, but it is referred to as virtual
memory.
Therefore key memory metrics include references to
disk-related
performance
and capacity
data.
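The trading of places between memory and disk can be sketched in a few lines. The model below is an illustration only: it assumes a least-recently-used eviction rule, which is one common choice rather than what any particular operating system does, and the class and counter names are invented.

```python
from collections import OrderedDict


class PagedMemory:
    """Toy memory with a fixed number of RAM frames and unlimited swap."""

    def __init__(self, ram_frames):
        self.ram_frames = ram_frames
        self.ram = OrderedDict()  # page -> data, least recently used first
        self.swap = {}            # pages that have been moved out to disk
        self.swap_ins = 0
        self.swap_outs = 0

    def access(self, page, data=None):
        """Read a page (or write it when data is given), swapping as needed."""
        if page in self.ram:
            self.ram.move_to_end(page)  # mark as most recently used
        else:
            value = None
            if page in self.swap:       # bring the page back from disk
                value = self.swap.pop(page)
                self.swap_ins += 1
            if len(self.ram) >= self.ram_frames:
                # RAM is full: push the least recently used page to disk
                victim, victim_data = self.ram.popitem(last=False)
                self.swap[victim] = victim_data
                self.swap_outs += 1
            self.ram[page] = value
        if data is not None:
            self.ram[page] = data
        return self.ram[page]
```

The swap_ins and swap_outs counters are the kind of disk-related figures that end up among the memory metrics: a steadily climbing count signals that memory is too small for the workload.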
In another example of CMDN overlap, file systems seen by the user may
not be on the computer;
they may be
on a file server. The file structure programs are
dependent on
network performance since all the data used has to travel
across the
network. Network performance metrics are more relevant in
this case than
disk performance metrics.
User support is handled by administrators. Originally all
Open
Systems
users operated within the Shell environment; many still do as it is
very
efficient and boasts high integrity. The Shell environment
offers
an ideal
programming and
Web administration environment; artifacts of it are visible
today as
components of the web. Shell users are listed in the password
file
and they are
given their own directories in the file system, which is where they
arrive when
they log in. These password files are merged into small
network databases
so that many local computers can recognize users as they traverse
the network.
Users can also find their familiar home directory on every
machine thanks
to the data sharing abilities of file servers. The system
capabilities of users, along with their
login and data storage privileges, are managed by a
systems administration group.
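Each user's entry in the password file is a single line of colon-separated fields. As a small sketch, the Python below splits one such line into its parts; the sample user and the helper names are made up, while the seven-field layout is the traditional Unix one.

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    name: str
    password: str  # the crypt, or a placeholder when crypts are kept elsewhere
    uid: int
    gid: int
    gecos: str     # free-text description, usually the person's full name
    home: str      # directory where the user arrives at login
    shell: str     # program started for the user at login


def parse_passwd_line(line):
    """Split one password-file line into its seven fields."""
    name, password, uid, gid, gecos, home, shell = line.rstrip("\n").split(":")
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)


# Hypothetical entry, for illustration only
SAMPLE = "alice:x:1001:100:Alice Example:/home/alice:/bin/sh"
```

parse_passwd_line(SAMPLE).home gives the home directory, which is the piece of information the login process needs first.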
Web applications, such as portals, have their own login schemes, which
are
handled from databases. This is necessary to insulate the
application
from the operating system so that the
applications can be moved to different types of computers and operating
systems. The users' privileges are managed in tables in
databases that are attached to the Web or application servers.
The
management
of large scale applications tends to fall to systems
administration groups
because of the complexity in joining the operating systems with the
applications. Issues continually arise in accommodating
applications,
especially during their inception phases.
Closely related to the systems administrator is the database
administrator.
The work locations of these two types of administrator
are rarely
separated by much physical distance. The fields of systems
and
database
operations are so integrated that the large systems management paradigm
depends
on their blended knowledge and responsibilities. Database
administrators
manage all the data within the databases; maintain the database
functionality;
and
control access to the stored data. They create and drop
databases
and tables; add or remove users; grant or revoke access
privileges; and manage the
custom
query code stored within the system. They also manage the
transactions made by dataservers, as well as the relationships between
linked dataservers, either as clusters or as widely distributed data
storage systems.
Application servers are sophisticated enough to require specialized
support; their newness and complexity require frequent
proactive
training for the support staff. New business code from
development groups arrives regularly; and
as the
science behind their operation continually evolves, new application
server software often needs to be installed.
The continual installation of new code modules creates a
closeness
between the application server administrators and the software
developers.
Technical issues surrounding application server support
bond the applications administrators with the systems
administration groups.
Application server
dependence on relational database systems assures that database
administrators
are constantly consulted for support by the applications
administrators. The
management of
users who are serviced by the applications servers can fall almost
anywhere; frequently, the business community is responsible
for
user
administration if the services they receive are purely financial.
Increased
specialization will occur as application servers begin to reach out to
the
rapidly increasing networks of intelligent devices connected by
wireless
communications. These small devices are expected to exist in
most
expensive equipment and durable goods in the next few decades.
A class of smaller services handled by
systems administrators
includes mail servers, backup servers and technical support
ticketing
systems. Directory services such as LDAP (which usually
handle people-oriented information), or distributed
authentication systems (which can create a unified login system)
function
as small
databases, but their support falls to systems
administrators. Other
important services include DNS (which relates Internet names
to numeric
system addresses), FTP (for moving files around the Internet), HTTP
(the protocol behind web services), SSH (a secure log-in and file
transfer system
used by administrators), SMTP (email), and SNMP (a systems monitoring
protocol).
Internet
Networks are
the backbone of today's Information Society, yet the networking staff
tends to
be somewhat isolated from the rest of the computing environment.
Their importance, however, is not trivial. Their activities
range from the installation and repair of electrical cables and
connecting devices to intensely difficult tasks, such as designing
and managing the
vast spanning topologies of wire and radio pathways connected
by sophisticated multiplexing gateways.
Networking is abstracted into the layers of Application, Transport,
Network, Datalink, and Physical.
The Application layer is what we encounter personally. It is
the layer
where the web, email, database, and file sharing activities
operate.
Beneath this is the Transport layer. This is the domain of
the packets containing application data that travel back and forth
between computers and devices. TCP (Transmission Control Protocol) is
the most common protocol for managing packets of data. SMB (Server
Message Block), the file system sharing protocol used by Microsoft, is
an example of an application protocol carried over TCP.
The Network layer lies below the transport layer but, in a more
accurate sense,
the network layer supports the transport layer. On this
level, routing
devices utilize the address information in the data packets to guide
them
around the Internet. When the routers send packets
along to their
destinations, each one chooses the next hop toward the destination
address, and replies are guided back in the same way.
The routers use mapping tables to
describe the effective paths around the Internet; the
maintenance of
these tables is the responsibility of senior network
administrators.
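A router's mapping table can be pictured as a list of address ranges, each paired with the neighbor that handles it. The sketch below assumes the usual longest-prefix rule (the most specific matching range wins); the table entries and router names are invented for illustration.

```python
import ipaddress

# Hypothetical routing table: (address range, next hop)
ROUTES = [
    ("0.0.0.0/0", "isp-gateway"),      # default route, matches everything
    ("10.0.0.0/8", "core-router"),
    ("10.1.2.0/24", "branch-router"),
]


def next_hop(destination, routes=ROUTES):
    """Pick the next hop whose range matches the destination most specifically."""
    address = ipaddress.ip_address(destination)
    best = None
    for prefix, hop in routes:
        network = ipaddress.ip_network(prefix)
        if address in network and (best is None
                                   or network.prefixlen > best[0].prefixlen):
            best = (network, hop)
    if best is None:
        raise LookupError("no route to host")
    return best[1]
```

Here next_hop("10.1.2.7") selects the branch router even though the two broader ranges also match, which is why a single wrong table entry can misdirect a whole block of addresses.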
The familiar groupings of four numbers, each ranging from 0 to 255, and
joined
by
dots, are used by the Internet Protocol (IP) to number all the
computers and
devices on the Internet. Its range of about 4.3
billion addresses was mistakenly thought to be adequate
to give unique numbers to all of the world's network nodes.
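The size of the address range follows from the notation: each of the four numbers fits in one byte (0 to 255), so a whole address is a single 32-bit value. A short sketch, with a function name of my own choosing:

```python
def dotted_quad_to_int(address):
    """Convert a dotted IPv4 address such as '192.168.1.1' to its 32-bit value."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not an IPv4 address: {address!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | octet  # shift in one byte at a time
    return value


TOTAL_ADDRESSES = 2 ** 32  # 4,294,967,296 possible values
```

The highest address, 255.255.255.255, converts to 2 ** 32 - 1, which is where the figure of roughly four billion comes from.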
The
original Internet addressing protocol, IPv4, is being
supplanted by the IPv6 protocol. IPv6
will
offer a large enough array of available addresses to expand
the
Internet into other
galaxies. By building support for encryption into the protocol
itself, the IPv6 designers have made it possible to protect data at
the lowest level.
IPv6, unlike IPv4,
has an address scheme of long hexadecimal strings that is hard for
humans to read. In practice, IPv6
computers and devices are referred to by
their associated
system names; the management of their addresses requires
software
tools. IPv4 will
always be desirable for local network administration, such as home
networks, because of the
simplicity of its numbering scheme. Routing technology
supports
IPv4 contained within IPv6 networks with the use of algorithms that
translate addresses
automatically, allowing the two protocols to co-exist comfortably.
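One simple form of automatic translation is the IPv4-mapped IPv6 address, in which the 32 bits of an IPv4 address are carried in the low end of a 128-bit IPv6 address behind a fixed marker. This is only one of several coexistence mechanisms, and the helper names below are my own; the sketch relies on Python's standard ipaddress module.

```python
import ipaddress

# The full IPv6 range: 2 ** 128 addresses
IPV6_ADDRESS_COUNT = ipaddress.ip_network("::/0").num_addresses


def to_mapped_ipv6(ipv4_text):
    """Embed an IPv4 address in the ::ffff:0:0/96 IPv4-mapped range."""
    v4 = ipaddress.IPv4Address(ipv4_text)
    return ipaddress.IPv6Address((0xFFFF << 32) | int(v4))


def from_mapped_ipv6(ipv6_text):
    """Return the embedded IPv4 address, or None if the address is not mapped."""
    return ipaddress.IPv6Address(ipv6_text).ipv4_mapped
```

to_mapped_ipv6("192.0.2.1") yields an IPv6 address whose ipv4_mapped attribute gives back 192.0.2.1, while an ordinary IPv6 address such as 2001:db8::1 returns None.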
On the
original IPv6
interest mailing list, I remember making the initial suggestions for
the
internal information encryption scheme.
This possibly
makes me a significant contributor to the present Internet.
The Datalink layer brings together the abstracted concept of the
Internet protocol with the physical reality of the management of the
groups of electrons that make up digitized packets.
For providing this service, Ethernet is the preferred
protocol. It was invented at the famous
Xerox
Palo Alto Research Center, but the Xerox Corporation
never thought the protocol was worth the effort of keeping.
They allowed
its primary researcher, Bob Metcalfe, to leave the lab with the
technology. He
formed
the 3Com corporation to perfect the use of Ethernet; 3Com built much of
the early network interface equipment.
Beneath it all is the physical layer that simply describes the cables,
connectors and the silicon microchips that channel the electrons of the
packetized
data.
Added to the physical layer today is wireless technology. It
is very
popular, and its contribution is significant because it frees
users from cables when they need to reach the Internet backbone.
Because
wireless exists as waves in open air, its range and capacity are
physically limited.
Of social importance, wireless has the
potential to provide completely free and widely
available networks. At
some point, however, all data packets have to become earthbound to
utilize the
efficiency of traditional cable networks.
In
the
early 1990s, security responsibilities fell to the systems
administration groups I
worked in.
As a profession, we were outstanding at the task.
While lecturing teenage members of the Linux
Society, I sensed that
they enthusiastically absorbed the synergistic concepts for integrity
we developed
over a
decade ago. They seemed to instantaneously benefit from the
administrators' culture of responsibility and teamwork, enabling
them as
responsible members of the Information Society.
The simplest component of security is the password. As a
requirement, it
has to be meaningless and cannot appear in any dictionary, or on
lists of
previously used passwords. Passwords are made up of
characters and, unless they are encrypted, can often be
seen as they pass through networking devices.
The encryption of data traveling across
networks provides the next
level of security protection, by preventing the viewing of passwords,
and other important data.
SSL
(Secure Sockets Layer) is the most common implementation of network
data encryption; you know
you are
using SSL when the URL in your web browser's address bar says HTTPS
rather than
HTTP. In the public-key scheme that SSL relies on, each party
has
two cryptographic
keys: one public and one private. The public
one is provided
to everyone expecting to encrypt data to send to a specific user.
The private one
is held dearly
by that user because it is used to decrypt the data. It is
computationally impractical to
break code protected by recent
versions of encryption technology; the technology
makes users feel secure in
their transactions, yet it gives anxiety to
governments. OpenSSL is
the open source version; it is very popular, and it gives the OpenSSH
implementation of the SSH shell its
encryption abilities.
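The two-key idea can be shown with deliberately tiny numbers. The sketch below uses textbook RSA with two small primes so the arithmetic is visible; it is not secure, it is not how SSL is implemented in practice, and the function names are invented. It needs Python 3.8 or later for the modular inverse.

```python
def make_toy_keypair(p=61, q=53, e=17):
    """Build a toy RSA key pair from two small primes (insecure, demo only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)   # private exponent: the inverse of e modulo phi
    return (n, e), (n, d)  # (public key, private key)


def encrypt(message, public_key):
    """Anyone holding the public key can encrypt a number smaller than n."""
    n, e = public_key
    return pow(message, e, n)


def decrypt(ciphertext, private_key):
    """Only the holder of the private key can recover the message."""
    n, d = private_key
    return pow(ciphertext, d, n)
```

Encrypting the number 65 with the public key and decrypting the result with the private key returns 65. Real keys use primes hundreds of digits long, which is what makes recovering the private key from the public one impractical.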
Security
within the
operating system of the computer is like the
database control language used by relational database systems.
It determines where you can go within the file systems, and
what
you can look
at, change, or move around. In Open Systems
machines, files
have what are called
permission bits set in
their basic digital control
section, the inode. These bits are visible with the "ls -l"
command.
There are three categories: user, group, and world; and three modes:
read, write, and execute. Needless to say, you definitely don't
want
your personal
information to be world writable, and probably not world readable
either. System programs are typically world
executable, which means that anyone with permission to log on
to
the system can run these programs. The average
user probably
does not have access to most
of the system files; therefore the usefulness of available
programs that modify files on a system is limited by the user's access
to files. Typically, users are only allowed to modify or
create
files in their home directories.
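The nine permission bits are easy to decode by hand: three bits for each category, in the order read, write, execute. As an illustrative sketch (the two helper names are mine; stat.filemode and stat.S_IWOTH come from Python's standard library):

```python
import stat


def permission_table(mode):
    """Break a file mode into read/write/execute flags per category."""
    table = {}
    for who, shift in (("user", 6), ("group", 3), ("world", 0)):
        bits = (mode >> shift) & 0o7  # the three bits for this category
        table[who] = {"read": bool(bits & 4),
                      "write": bool(bits & 2),
                      "execute": bool(bits & 1)}
    return table


def world_writable(mode):
    """True when anyone on the system may modify the file."""
    return bool(mode & stat.S_IWOTH)
```

For a regular file with mode 0o100644, stat.filemode gives the string "-rw-r--r--", the same form "ls -l" prints: the owner may read and write, everyone else may only read. On a live system the mode would come from os.stat(path).st_mode.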
Capabilities
are an
alternative form of control over the access to
files on a computer. In a capabilities control system, access
to
files on the system is not granted to a user as a privilege; rather,
access to
the files is granted to programs. The operating system
registers access to specific files with the names of specific programs.
An
unregistered
program brought by a user to the computer would initially
have no
files associated with it; therefore, it could do no harm.
In a present-day Open
Systems
computer, user access to a file is determined
by the access bits set within
the file's inode and the ownership of the file. The
very sensitive password
file
is open to being read by everyone having access to the computer.
Existence on an Open Systems
computer is defined by a user's operating within a Shell
process; Shell, in
turn,
needs access to the password file. Shell needs the
reference to
the
location of the user's home directory in the password file for the
initial process of logging in; for the login, it also needs access to
the user's encrypted password. Shell
might also need access to the password file to allow the user to
switch to a different user name, such as the supervisory
user, root. The passwords in that file are kept in a crypt
format to prevent their being read. Crypts are unreadable, of
course, but access to crypts gives
malicious users the opportunity of trying a variety of techniques to
find weak passwords; any system with
exposed password crypts is potentially vulnerable. In recent
years, this Open Systems vulnerability was resolved by moving
password crypts outside of
the password file, into a separate shadow file, using a complicated
and awkward work-around
arrangement that is difficult to maintain. The added
complications create further potential for the vulnerability known as
the "confused deputy" scenario.
On a system using the
capabilities
scheme of access control, this important and interesting example of
systems vulnerability does not exist. Only the relevant
programs
would have access
to
the password file, rather than every user. Only programs
that have a legitimate need to see the crypts are given that
capability; users will never see the crypts,
and malicious users will never
be able to use crypt breaking software to attack the
computer.
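The difference from permission bits can be sketched as a lookup keyed by program rather than by user. This is a toy model of the scheme as described above, not a real capability system (which would normally hand out unforgeable tokens rather than rely on program names); the class, program, and file names are all hypothetical.

```python
class CapabilityTable:
    """Toy registry recording which program may do what to which file."""

    def __init__(self):
        self._grants = {}  # program name -> set of (path, mode) pairs

    def grant(self, program, path, mode):
        self._grants.setdefault(program, set()).add((path, mode))

    def allowed(self, program, path, mode):
        # A program that was never registered has an empty set of grants,
        # so every request it makes is refused.
        return (path, mode) in self._grants.get(program, set())


table = CapabilityTable()
table.grant("login", "/etc/passwd", "read")
```

table.allowed("login", "/etc/passwd", "read") is True, while the same request from an unregistered program such as "downloaded-tool" is False no matter which user runs it.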
Environments (Professional and Mentoring)
Linux Society
DTCC
Merck
Marsh & McLennan
NY Stock Exchange
Chase
Barings
EJV (Electronic Joint Venture)