This text first appeared as
part of Wozniak, P.A., 1995, Economics of learning (Doctoral Dissertation at the University
of Economics in Wroclaw) and has since become a theoretical
inspiration for the development of SuperMemo 8 for Windows. The current version has been updated
with new observations and puts more emphasis on distributed hypermedia
systems.
Modern hypermedia systems
encompassing the ability to adapt to the properties of human
memory and cognition
Dr Piotr Wozniak
Prof. Witold Abramowicz
Feb 28, 1997
In this text we would like to show
the need for developing knowledge access systems that account
for the imperfections of human perception, information
processing and memory. The implementation of such systems will
result in enormous savings in the process of learning at all
three stages of knowledge acquisition by the mind: (1) knowledge
access, (2) learning and (3) knowledge retention. In particular,
we will try to stress the importance of repetition spacing
algorithms, as well as the application of the
newly introduced concepts of processing, semantic and ordinal
attributes in hypertext documents.
Fusion of the hypertext paradigm
with techniques targeted against human forgetfulness
Historically,
the development of repetition spacing algorithms proceeded from
common-sense paper-and-pencil applications to increasingly
sophisticated computer algorithms that have finally been
implemented in commercial products, which have gained substantial
popularity among students of languages, medicine and many other fields.
This
development process was almost entirely oriented towards the
maintenance of the acquired knowledge in the student's
memory. A similar
development process may now be initiated with reference to the retrieval and
acquisition of knowledge.
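To make the starting point concrete, below is a minimal sketch of a repetition spacing scheduler in the spirit of the published SM-2 algorithm. The numeric parameters follow the published description of SM-2, but the class and function names are illustrative assumptions and do not reproduce the algorithm used in SuperMemo 8.

```python
# A minimal spaced-repetition scheduler in the spirit of the published
# SM-2 algorithm; parameter values follow the 1990 description, but the
# names (ItemState, review) are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class ItemState:
    easiness: float = 2.5  # E-factor: how easy the item is to remember
    repetition: int = 0    # number of successful repetitions in a row
    interval: int = 0      # current inter-repetition interval in days

def review(state: ItemState, grade: int) -> ItemState:
    """Update an item's schedule after a repetition graded 0..5."""
    if grade < 3:
        # Failed recall: restart the repetition cycle from scratch.
        state.repetition = 0
        state.interval = 1
    else:
        state.repetition += 1
        if state.repetition == 1:
            state.interval = 1
        elif state.repetition == 2:
            state.interval = 6
        else:
            state.interval = round(state.interval * state.easiness)
    # Adjust the easiness factor; it never drops below 1.3.
    state.easiness = max(
        1.3,
        state.easiness + 0.1 - (5 - grade) * (0.08 + (5 - grade) * 0.02),
    )
    return state
```

The essential property of any such scheduler is that well-remembered items receive exponentially growing intervals, while forgotten items return to the beginning of the repetition cycle.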
Effective
learning is based not only on being able to retain the learned
material in one's memory. Before that, the to-be-learned
knowledge must be identified, pre-processed with a view to
understanding, and classified and selected with respect to its
relevance and importance. This process can be greatly enhanced by
means of simple techniques, which make excellent material for
computer implementation.
This
implementation becomes more and more urgent with the diminishing role
played by printed materials in the wake of the increasing role of the
World Wide Web and the vast market of CD-ROM titles covering
all possible subject domains. The
straightforward use of a pencil, which is often instrumental in
one's work with printed matter, becomes
impossible with more and more multimedia titles appearing on
CD-ROM and with the rapid growth of hypermedia available via global
computer networks. Some visionaries are even predicting the death
of printed matter as we know it. The gap between the
effectiveness of browsing printed vs hypertext documents seems to
grow by the minute, though still very little attention is paid to
the reader's or user's ability to leave a trace of
their work in the document. Most hypertext systems distributed
on CD-ROM provide the user only with annotation and bookmark
tools, which leave much room for improvement.
Let us
briefly present some exemplary tools and techniques that can be used
in working with printed textbooks, and the inspiration they
might provide for the design of future hypermedia documents.
- The
    first problem with books to read is that there are
    usually too many of them. A good selection of the most
    applicable material is the first step to effective
    acquisition of knowledge. We will, however, leave this
    subject out of consideration, because we would like to
    focus entirely on authoring systems for the
    development of hypertext documents, as well as on the tools
    that would enhance such documents and make them more
    attractive from the student's standpoint. The new
    technologies, most notably CD-ROM, will make the
    author's choice easier in the sense that the vast
    capacity of the medium imposes less stringent constraints
    on what not to include in the final shape of the
    document. When we extend this to the World Wide Web, the
    question becomes irrelevant. With appropriate navigation
    and search tools, the hyperspace might remain virtually
    unlimited.
- After
    selecting the learning material, the important tool to
    use is a bookmark. Apart from reference materials like
    encyclopedias, dictionaries, computer documentation, etc.,
    most printed material allows, and
    often requires, a substantial dose of linear progress
    through its contents. As the time slices allocated for
    reading often break one's work in the middle of a
    linear section, bookmarks are of indispensable value.
    With the advent of hypertext applications, the average
    length of a linearly processed text is likely to drop
    dramatically. However, bookmarks do not only serve as
    pointers to interrupted reading; they also provide the
    means of a customizable table of contents, which can be
    used to quickly access the sections of
    greatest interest. Bookmarks have been an early and
    ubiquitous feature of hypertext documents; therefore, we
    will also not consider them in the reasoning that
    follows.
- After
    picking a book and selecting the relevant bookmark, the
    process of reading or browsing begins. First of all, the
    same bookmark that was used in accessing a particular
    chapter or section may serve as the pointer that helps
    keep the eye focused on the paragraph being processed.
    This is particularly useful in richly illustrated texts,
    or at moments when external interruptions require
    frequently shifting one's sight away from the reading area. In a
    hypertext document, the counterpart of the paper bookmark
    used in reading a textbook should be a cursor that
    highlights the single semantic unit currently being
    processed. The importance of such a cursor may go far
    beyond the sight guidance of a traditional bookmark. Such
    a cursor will later on be called a semantic focus. It is
    not difficult to notice that modern textbooks go further
    and further into making particular semantic units of the
    text less context dependent. In other words, by picking
    up a randomly selected sentence from a modern textbook,
    we are more likely to understand it than would be
    possible with textbooks written in the style of a few
    decades ago. The general trend is to shift from prose to
    more precise forms of expression. This is most
    visible in the proliferation of illustrations,
    formulas, insert boxes, enumerations, underlined text,
    etc. The trend comes from the increasing tendency to
    convert linear textbooks into pick-and-read
    reference materials. This makes the job of a hypertext
    document author much easier. It will also
    make semantic units live a life of their own, to the
    benefit of knowledge retrieval and acquisition.
- The
    most important part of a good textbook processing
    technique is to leave traces of one's work in the
    text. After all, let the book itself record the
    reader's progress, rather than leaving the entire burden
    on the reader's memory. First of all,
    it is useful to prepare a page chart for every carefully
    studied book. The page chart keeps a record of each
    page processed and its current processing status. The
    processing status may assume at least the three following
    values: intact (not yet read), read (at least partly
    processed), and done (fully processed). A simple sketch
    of such a chart follows this list.
    In
    some cases, it may also be worthwhile to separate a few
    degrees of the attribute processed (or read). After all,
    the page might have been read once, twice, or several
    times, with all its semantic units changing their
    processing attributes during each passage.
The rationale behind page charts is to have a constant
opportunity to control the speed and direction of
processing a particular textbook; the greatest advantages
being: (1) no need to refer to fully processed pages
marked with done, and (2) giving priority to new material
(intact) as opposed to the material that has already
been, at least partly, processed (read).
- As
    mentioned earlier, all semantic units are marked with
    processing attributes as reading progresses.
    These are: intact, irrelevant, relevant, to-be-memorized
    and memorized (see the sketch after this list).
    The
    obvious rationale behind marking semantic units with
    processing attributes is never to refer to irrelevant or
    memorized units, to focus the reading attention on
    relevant units, and to use to-be-memorized units only
    during the process of selecting new material for
    memorization.
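The following is a minimal sketch of page charts and processing attributes as described above. The status values (intact, read, done, irrelevant, relevant, to-be-memorized, memorized) come from the text; all class and function names are illustrative assumptions rather than part of any existing system.

```python
# A sketch of page charts and processing attributes; the names
# PageStatus, UnitStatus and PageChart are hypothetical.

from enum import Enum

class PageStatus(Enum):
    INTACT = "intact"  # not yet read
    READ = "read"      # at least partly processed
    DONE = "done"      # fully processed

class UnitStatus(Enum):
    INTACT = "intact"
    IRRELEVANT = "irrelevant"
    RELEVANT = "relevant"
    TO_BE_MEMORIZED = "to-be-memorized"
    MEMORIZED = "memorized"

class PageChart:
    """Records the processing status of every page of a studied book."""

    def __init__(self, page_count: int):
        self.status = {page: PageStatus.INTACT
                       for page in range(1, page_count + 1)}

    def mark(self, page: int, status: PageStatus) -> None:
        self.status[page] = status

    def next_to_process(self) -> list[int]:
        # Intact pages take priority over partly processed ones;
        # pages marked as done are never offered again.
        intact = [p for p, s in self.status.items() if s is PageStatus.INTACT]
        read = [p for p, s in self.status.items() if s is PageStatus.READ]
        return intact + read
```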
In a
majority of presently available hypertext systems, it is
difficult to develop an equivalent of page charts. Such a
document still leaves the impression of straying in a sea of
information with little chance of managing access in a
rational way. The main problems here are: (1) how to make sure
that one does not wade again through once-processed material
(during the reading process, it is easy to have a pleasant
impression of knowing everything beforehand, just to discover that
some of the formulations evoke a déjà vu effect), and (2) how to
make sure that no important section is missed (perhaps the
increasing drive toward large hypertext documents that cannot be
encompassed in any way will eliminate this concern altogether).
Sooner or later, developers of hypertext tools will discover that
there is much more to reading printed books than what has until
now been encapsulated in standard hypertext technologies.
Let us
consider a collection of proposed enhancements to generic
hypertext systems that would provide solutions to the problems
mentioned in the preceding section.
- The first
    of the mentioned problems concerned the selection of the
    material. What generic systems have to offer in that
    respect is: (1) the possibility of choosing a title, (2)
    collapsible tables of contents, (3) searching tools, and
    (4) bookmarks. All that still leaves the reader with the
    entire document to work with.
    The first and easiest step toward customized
    content is an editable table of contents. We will discuss
    the possible add-ons to tables of contents in Point 4 as
    we address the problem of page charts.
A much more complicated, though probably more
    desirable, approach to customizing documents to particular
    needs is document filters. Boolean and fuzzy search
    procedures included as standard in hypertext documents are
    usually armed with the ability to yield the list of
    topics collected in the search. Such a list is usually
    presented in sorted form using one of two
    criteria: (1) semantic order, and (2) number of search
    hits. Indeed, such a newly generated list of topics can
    be viewed as a customized table of contents. However,
    such a table has no attribute of persistence; in other
    words, it is usually destroyed by repeating the search
    procedure. Moreover, if the newly generated table of
    contents was all the reader was interested in, there is,
    as a rule, no way of hiding the remaining contents of the
    document from other browsing procedures.
A document filter might have searching abilities similar
    to the standard search procedures mentioned above; however, the
    output of the search might take the form of a new
    document with a new table of contents (see the sketch
    after this list). Additionally, a
    keyword system or, better yet, semantic attributes
    associated with particular topics or even semantic units
    might be used in the search. In other words, instead of
    looking for words or phrases, the search would look for
    keywords or even semantic content expressed through
    semantic attributes.
    The ultimate solution with respect to document filters is
    to let them collect all relevant semantic units and,
    literally, generate a new document from the collected
    pieces. Before such a solution can be implemented,
    a great deal of progress in natural language
    processing will be required. In contrast, as will be
    demonstrated in Points 4 and 5, some handy solutions
    concerned with processing attributes might be just a few
    steps away.
- As
    mentioned earlier, bookmarks are already a standard
    fixture in all documents that have anything to do with
    hypertext capability. Bookmarks may serve as a way of
    constructing a customized table of contents upon locating
    the most relevant topics used in one's work with the
    document. In the context of document filters, one might
    propose that one of the possible outcomes of a search
    should be an editable bookmark table that would make it
    possible to employ the results of the search long after it
    actually took place.
- The
    important role of the semantic focus will be shown only later,
    when we consider the link between a hypertext document
    and a database with the learned material generated during
    the browsing process. At this point we only note that its
    function can be compared to a selection bar in menus or
    the caret cursor in edit controls and word processors. The
    position of the semantic focus indicates the currently
    processed semantic unit. Much as in the case of
    cursors or selection bars, the actions undertaken by the
    user or reader will affect only the selected unit. These
    actions might be: (1) changing the processing attributes of
    the unit, (2) changing the semantic attributes of the unit
    (e.g. to determine future search outcomes), (3)
    transferring semantic items associated with the unit to a
    database with the learned material, and (4) performing an
    editing action on the unit (delete, print, transfer to
    another document, etc.).
- Page
    charts are most painfully missing upon moving from
    printed matter to hyperspace. The division of books into
    pages seemed quite artificial, but the benefits of
    charting are definitely worth this little inconvenience.
    In the case of hypertext documents, the concept of a page
    ceases to exist, being replaced with the concept of a
    topic. The best link to the entire semantic structure of
    topics from the human standpoint comes via the table of
    contents; hence it is the most obvious implementation target
    for a counterpart of page charts. A flexible table of
    contents that would make paper a commodity of the past
    should meet the conditions outlined below (see also the
    sketch after this list).
    As in
    the case of page charts, the reader should have the
    possibility of marking topics with processing attributes
    (which are initially set to intact). Marking a topic as
    irrelevant or done would be equivalent to erasing it from
    the table of contents or leaving it in an easily
    distinguishable form, e.g. grayed. Marking a topic as
    processed might be enhanced by an indicator of the
    degree of processing, which might also be reflected in
    the appearance of the topic's title in the table
    (e.g. through coloring). Obviously, the process of
    tagging topics with processing attributes should be
    available both at the contents level and the topic level.
- Finally,
    individual semantic units should also be markable with
    processing attributes. Initially, all semantic units
    would be marked as intact. Upon the first reading,
    irrelevant items should be marked as irrelevant and,
    depending on the user's choice, disappear from the text
    or appear grayed in their original place. Semantic units
    of utmost importance might be immediately transferred to
    a database with to-be-memorized items. At the very least,
    this process would allow the user to paste the content of
    the semantic unit, re-edit it and place it in a selected
    database. However, a much more desirable solution is to
    associate all semantic units in a hypertext document with
    ready-made collections of items that might be transferred
    to or linked with the student's database with a
    keystroke (e.g. after optional selection and
    pre-processing). Items marked as memorized could also,
    depending on the set-up, become invisible or be
    distinguished by different coloring. The remaining items
    could be marked with a degree of relevance (or the number of
    reading passes), the highest degree being equivalent to
    the attribute to-be-memorized. The degree of relevance
    might contribute to the application of ordinal attributes
    that could later be used in prioritizing once-accessed
    items for secondary access. Similarly, to-be-memorized
    items might also be tagged with ordinal attributes that, in
    this case, would determine the memorization order. If the
    processing attributes were applied, the user would be
    able to quickly skip the parts once identified as
    irrelevant, as well as pay less attention to those
    sections that have already been entirely mastered by
    means of a module using repetition spacing algorithms.
    The usual situation is that, at the early stages of
    processing the document, the intact topics and units are
    of the highest processing priority. As the work
    progresses, the once-referred-to units may increasingly
    get into the focus of attention (e.g. in the order
    determined by their ordinal attributes). This will, in
    all likelihood, move their processing status to
    increasing degrees of relevance, up to the point where a
    decision is made to memorize a particular semantic unit.
    In an optimum situation, a collection of simple
    techniques should be developed to make sure that the
    flexible table of contents makes it possible to
    quantitatively assess the progress of processing the
    semantic units in a given topic. For example, the topic
    title in the table could be associated with a bar chart
    showing the proportion of semantic units in the intact,
    irrelevant, relevant and memorized categories.
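A minimal sketch of these ideas follows, combining a persistent document filter with a flexible table of contents that tracks processing attributes and can summarize progress per topic. All names and data structures here are illustrative assumptions, not a description of any shipping system.

```python
# A sketch of document filters and a flexible table of contents;
# SemanticUnit, Topic, filter_document and progress_summary are
# hypothetical names.

from dataclasses import dataclass, field

@dataclass
class SemanticUnit:
    text: str
    keywords: set[str] = field(default_factory=set)  # semantic attributes
    status: str = "intact"  # intact / irrelevant / relevant / memorized
    ordinal: int = 0        # priority for secondary access

@dataclass
class Topic:
    title: str
    units: list[SemanticUnit]
    status: str = "intact"

def filter_document(topics: list[Topic], wanted: set[str]) -> list[Topic]:
    """A document filter: keep only topics whose units carry the
    requested semantic attributes, yielding a new, customized table
    of contents that can be kept and reused (persistence)."""
    return [t for t in topics
            if any(u.keywords & wanted for u in t.units)]

def progress_summary(topic: Topic) -> dict[str, float]:
    """Proportions of units per processing attribute, e.g. for the
    bar chart displayed next to a topic's title."""
    total = len(topic.units) or 1
    counts: dict[str, float] = {}
    for unit in topic.units:
        counts[unit.status] = counts.get(unit.status, 0) + 1
    return {status: n / total for status, n in counts.items()}
```

The essential design choice is that the filter's output is itself a document (a list of topics), so every tool described above, from processing attributes to progress charts, applies to the filtered view as well.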
Our experience
shows that there is great potential for an increase in the
effectiveness of using hypertext documents if the proposed
tools are provided both in the software shell and in the document
in question.
Our hope is that
in the future the student will never have to work directly with the
repetition spacing algorithms employed by a dedicated program
like SuperMemo. The optimum situation is that the student will
obtain access to a hypermedia knowledge base (e.g. within the
framework of the World Wide Web) with seamlessly integrated
algorithms for the optimum spacing of repetitions (e.g. as a plug-in
to a Web browser). In other words, the focus should shift from
software and its options to knowledge itself.
Naturally, the development of a hypermedia interface for a
knowledge base associated with a database used in learning will
put a much greater burden on the authors of a particular learning
system. However, the increase in the effectiveness of accessing
and learning knowledge will certainly fully compensate for the higher
development costs.
In the optimum case, all semantic units relevant to learning
should be associated with predefined, well-structured items
(often in the standard question-answer form). A single semantic
unit might generate from one to several individual
to-be-memorized items. As a result, developing a seamless
hypermedia knowledge base integrated with repetition spacing
algorithms might triple or quadruple the author's effort and
costs.
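As an illustration, here is a minimal sketch of such an association between a semantic unit and its ready-made question-answer items; the names, the transfer function and the sample content are hypothetical.

```python
# A sketch of the link between a semantic unit and its predefined
# question-answer items; all names and sample data are hypothetical.

from dataclasses import dataclass

@dataclass
class QAItem:
    question: str
    answer: str

@dataclass
class LearningUnit:
    text: str
    items: list[QAItem]  # one to several to-be-memorized items per unit

def transfer_to_database(unit: LearningUnit, database: list[QAItem]) -> None:
    """Move a unit's ready-made items into the student's learning
    database (the single-keystroke transfer described above)."""
    database.extend(unit.items)

# Example: one semantic unit generating two question-answer items.
unit = LearningUnit(
    text="Repetition spacing increases intervals between reviews.",
    items=[
        QAItem("What does repetition spacing optimize?",
               "The intervals between successive reviews of an item."),
        QAItem("What is the effect of optimally spaced repetitions?",
               "A minimum number of repetitions for a given retention level."),
    ],
)
database: list[QAItem] = []
transfer_to_database(unit, database)
```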
A subset of the aforementioned
technological solutions is currently available in SuperMemo 8 for
Windows.