Project Number: CS-DCB-0205



A Major Qualifying Project Report:

submitted to the faculty of


in partial fulfillment of the requirements for the

Degree of Bachelor of Science


Brent Newey


Anitra Setchell


Michael S. Wicks


Date: April 24, 2003



Professor David Brown

Computer Science Department


This project studied the feasibility of adapting a website using a genetic algorithm, in response to user access patterns.  We created a hierarchical website and allowed a genetic algorithm to modify its structure.  We ran experiments to gather data about how users accessed the site before and after its modification.  We found that user access time generally improved after the site had been modified, and conclude that this is a viable approach for adaptive websites.


We would like to thank:

Dr. David C. Brown for creative consulting, guidance, editing, and other advisory duties, and for allowing us to use his CS3041 students in our experiments.


Professor Glynis Hamel for allowing us to use her CS2005 students in our experiments.


The B02 Techniques of Programming (CS2005) and the B02 and D03 Human-Computer Interaction (CS3041) students for participating in our experiments.


Andrew Smith, James Kent, and Peter Wright for creative consulting and editorial purposes.


Distribution of Effort


Abstract – Anitra Setchell

Introduction – The MQP Team

Problem Statement – The MQP Team

Related Research:

Adaptive Websites – Michael Wicks

Genetic Algorithms – Anitra Setchell


Website Design – Brent Newey

Experiment Design – Michael Wicks

Genetic Algorithm – Anitra Setchell


Website – Brent Newey

Experiment – Brent Newey & Anitra Setchell

Genetic Algorithm – Brent Newey

Results and Evaluation – Anitra Setchell

Conclusions – Michael Wicks

References – Anitra Setchell

Appendices – Anitra Setchell

Editing – The MQP Team


Content – The MQP Team

Assembly – Brent Newey

Experiments: Brent Newey

Genetic Algorithm:

Crossover – Anitra Setchell

Fitness Function – Anitra Setchell

Site Generator – Michael Wicks

Mutation – Michael Wicks

Ordering Function – Brent Newey

Recording Function – Anitra Setchell

Parent Selection – Anitra Setchell

Data Structures – Brent Newey

Website Representation – Michael Wicks

Assembly, Running, and Troubleshooting – Anitra Setchell


Extraction – Brent Newey

Statistical Analysis – Anitra Setchell

Table of Contents

Table of Contents. 1

Table of Figures. 3

1. Introduction. 4

1.1 Experiments. 5

2. Problem Definition. 6

3. Related Research. 7

3.1. Adaptive Websites. 7

3.1.1. Adaptive Hypermedia Domains. 8

3.1.2. Adaptation Methods. 10

3.1.3. Adaptation Usefulness. 12

3.2. Genetic Algorithms. 13

3.2.1. Why Genetic Algorithms?. 13

3.2.2. How genetic algorithms work. 14

3.2.3. Reproduction. 16

3.2.4. Crossover, Mutation, Schemata. 17

3.2.5. Fitness. 19

3.2.6. Other Representations. 20

3.2.7. Generations and population size. 21

4. Solution. 23

4.1. Website. 23

4.1.1. Representing a Website. 23

4.1.2. Using Websites to Construct a Hierarchy. 28

4.2. Experiment 32

4.2.1. Experiment #1. 32

4.2.2. Experiment #2. 35

4.2.3. Experiment #3. 38

4.3. Genetic Algorithm.. 38

4.3.1. Data Structure. 38

4.3.2. Generations. 39

4.3.3. Fitness Function. 39

4.3.4. Reproduction. 42

5. Methodology. 44

5.1. Website Design. 44

5.1.1. Technology News Site (used in Experiment 1) 44

5.1.2. Sports Site (used in Experiments 2 and 3) 45

5.2. Experiment Design. 47

5.2.1. Problems. 47

5.2.2. Solutions. 49

5.2.3. Data Collection. 50

5.2.4. Implementation details – Population and Programming. 51

5.2.5. Example Task and Website. 52

5.3. Genetic Algorithm.. 52

5.3.1. Programming Logistics. 52

5.3.2. Fitness Function. 53

5.3.3. Populations. 54

6. Results and Evaluation. 55

7. Conclusions. 60

References. 63

Appendix A: Sports Site (Experiment 2) 64

Appendix B: Sports Site (Experiment 3) 65

Appendix C: Animals Site (Experiment 3, unused) 66

Appendix D: Experiment Results. 67

Experiment 2. 67

Experiment 3. 69


Table of Figures

Figure 3.1: Basic crossover. 15

Figure 3.2: Genetic algorithm.. 15

Figure 3.3: One-point and two-point crossover. 17

Figure 3.4: Uniform crossover. 18

Figure 3.5: Simple schemata & binary chromosomes that fulfil them. 18

Figure 4.1: A Simple Website. 24

Figure 4.2: Representing a Simple Webpage. 25

Figure 4.3: Adding a Content Unit 25

Figure 4.4: Representing the New Webpage. 26

Figure 4.5: Three Content Units. 27

Figure 4.6: Website for Figure 4.5. 27

Figure 4.7: Changing the Priority. 28

Figure 4.8: Website for Figure 4.7. 28

Figure 4.9: Representing Pages. 29

Figure 4.10: Step One in Constructing the Hierarchy. 29

Figure 4.11: Step Two in Constructing the Hierarchy. 30

Figure 4.12: Step Three in Constructing the Hierarchy. 30

Figure 4.13: Completing the Hierarchy. 31

Figure 4.14: Introduction to Experiment 1. 32

Figure 4.15: Example task for Experiment 1. 33

Figure 4.16: Experiment 1. 34

Figure 4.17: Introduction to Experiments 2 & 3. 35

Figure 4.18: Example task for Experiments 2 & 3. 36

Figure 4.19: Selecting a target topic. 37

Figure 4.20: Fitness function. 40

Figure 4.21: Path to target content 41

Figure 4.22: Website crossover 43

Figure 6.1: Average time taken on experimental site. 55

Figure 6.2: Topic selection frequencies. 56

Figure 6.3: Topics selected at least twice. 56

Figure 6.4: Mean times for experiments 2 & 3 with confidence intervals. 58

Figure 6.5: Significant changes in mean time. 59

Figure 7.1: Word count on a website. 60

1. Introduction

One could reasonably say that the Internet is becoming the world’s electronic library, with information readily available on practically any topic imaginable. However, as the Internet’s bookshelves grow, it becomes harder to find specific information. Once a user finds a website related to an informational query, the user may have trouble navigating through that site in search of the information.

One way to deal with this problem is to make the website adaptive. “An adaptive website is a website that uses information about user access patterns to improve its organization and presentation.” [Perkowitz & Etzioni 1997]. By improving the organization and presentation of the website, a user might be able to find desired information more easily.

A new approach to making a website adaptive is through the use of a genetic algorithm (GA). GAs are used to simulate evolution in a computerized environment. One principle of evolution is sexual reproduction. Just as in natural evolution, a GA produces new websites by breeding two individuals from a set of websites (a population).

Another principle of natural evolution is survival of the fittest: the “best” individuals in a population have the greatest chance of reproducing. Similarly, a GA assigns a fitness value to each member of the population. This fitness value determines which individuals are most likely to reproduce. The fitness value is determined by a fitness function. So, for our applications, this function measures the perceived quality of each website in the population of modified websites.


1.1 Experiments

To obtain the data used in the genetic algorithm and test its effectiveness, we conducted three experiments. In each experiment, subjects had to navigate a website to find a specific target.

The first experiment gathered data on the effect of a metric (i.e. word count) on the time taken to find a target. This data was essential to the formation of an appropriate fitness function.

The second experiment was conducted on the website to be modified by the genetic algorithm. This experiment provided data on the users’ access time specific to the site to be modified. It also provided a basis for comparison with the third experiment.

The third experiment was performed on the best site produced by the GA. When compared with the data from the second experiment, the data shows that the GA produces a website that significantly lowers the search time of users on the website.

2. Problem Definition

The purpose of this project is to improve a website using a genetic algorithm. Improvement in this context means a desirable change in user behavior.

This goal of improvement has three distinct sub goals. A genetic algorithm must be implemented to make changes to the site. The website must be adapted significantly so that improvement can also be significant. User behavior must be observed and analyzed, to determine the amount of improvement.

Each aspect of the genetic algorithm must be considered carefully.  The algorithm must be provided with an initial population to which it will apply the evolutionary process. This population will be based on the website to be modified.  The algorithm must have the necessary tools to simulate evolution on this initial population.  Once this is done, the algorithm will output the result of this evolutionary process.

Adaptations to the website must be significant.  For this to be true there must be some way to define a significant change.  “Significant”, in this instance, means that it will be likely to produce a measurable change in user behavior. Hence, we must gather information about the effects on the users of alterations to the website.  This information can then be used to determine what can be adapted.

Changes in user behavior can be expressed in a variety of metrics. Therefore, we must decide on a metric that represents improvement. This metric must be observed in both the original site and in the output from the genetic algorithm.  Hence, if this metric improves, then a positive change in user behavior has been successfully achieved, and the project will have accomplished its goals.

3. Related Research

In this section, we will explain the purpose of adaptive websites and genetic algorithms, how they work, and what considerations must be taken when designing them.


3.1. Adaptive Websites

Adaptive websites automatically improve their organization and presentation by learning from visitor access patterns [Perkowitz & Etzioni 1997]. By learning from many visitors’ access patterns, these websites can change their structure and presentation in order to make the user’s goals easier to obtain. It is important to consider why this genre of websites is becoming increasingly necessary, the domains for which it is suitable in, the aspects of a website that can be modified, and factors that should be taken into account when designing an adaptive website.

Websites should be designed to help the user accomplish a goal as easily as possible; otherwise the user may become mislead and annoyed. However, the designing of a website is not that straightforward. It is obvious that not all visitors to the same website are going to be looking for the same information, but the information that is most popular should be the most accessible. Adaptive websites can address this problem by having the ability to customize the website for different types, or groups, of users, not just individual users or all users as a whole. They also can address the problem that arises when user perceptions of the website differ from how the site was originally designed.

Another downfall of static websites that adaptive websites rectify is that of keeping up with the user’s needs over time. By learning user's access patterns, they can more or less predict what information a user is going to be looking for, and then decide what information to present when [Perkowitz & Etzioni 1997].

3.1.1. Adaptive Hypermedia Domains

According to Brusilovsky [1998], the most common application areas in which adaptive hypermedia can be useful are educational hypermedia systems, on-line information systems, on-line help systems, information retrieval hypermedia, institutional hypermedia, and personalized views. Table 1 shows some existing adaptive hypermedia systems and in which application area they are being applied.

 Educational Hypermedia Systems

 Anatom-Tutor, C-Book, [Clibbon], ELM-ART, ISIS-Tutor,


 ITEM/PG, HyperTutor, Land Use Tutor, Manuel Excel,



 On-line Information Systems

 Hypadapter, HYPERCASE, KN-AHS, MetaDoc, PUSH,


 HYPERFLEX, CID, Adaptive Hyperman

 On-line Help Systems


 Information Retrieval Hypermedia

 CID, DHS, Adaptive HyperMan, HYPERFLEX, WebWatcher

 Institutional Hypermedia


 Personalized Views

 Basar, Information Islands

Table 1: Adaptive hypermedia systems and their application areas. Adapted from Brusilovsky[1998].


Educational hypermedia systems generally are used in order to teach the user material on a particular subject; accordingly, they present information systematically in a structure most convenient for learning purposes. Adaptive hypermedia can be used in this domain in order to accommodate for the user’s knowledge level of the subject. Novice users may become lost within the website due to their lack of knowledge on the subject and may need navigational help, whereas “advanced users may find the website trivial and boring” [Brusilovsky 1998].

Another domain in which adaptive hypermedia can be useful is online information systems. These systems provide users with more direct access to numerous subjects, rather than systematically displaying information, as with an educational hypermedia system. For this reason, it is difficult to accommodate all types of users. Again, the knowledge gap between novices and advanced users becomes a problem. An advanced user may seek detailed information on a subject, while a novice may desire a broader overview. For the advanced user, it may be difficult to find the detailed information needed, thus requiring the user to spend more time searching for it. By using adaptive hypermedia, the system can adapt to suit each type of user accordingly.

On-line help systems are another domain in which adaptive hypermedia can be helpful. This domain is a subset of online information systems described above, where the knowledge of the user is again a problem. The difference is that with an online-help system the general subject (the relevant application) is already known.

Another domain that adaptive hypermedia has recently been applied to is informational retrieval (IR) systems. “[This] is a new class of IR systems which combine traditional information retrieval techniques with a hypertext-like access from the index terms to documents and provide the possibility of browsing the hyperspace of documents using similarity links between documents.” [Agosti et al., 1995; Helmes et al., 1995]. Adaptive hypermedia is used in this domain to deal with the problem of having a large hyperspace, which is difficult to manually structure. Instead, the structure of the pages is adapted, creating links where appropriate [Brusilovsky 1998].

Institutional information systems provide information within a particular organization, unlike many of the previously mentioned systems, which have a more general audience. These systems are usually the combination of multiple databases. Adaptive hypermedia allows the system to tailor the way it structures and presents the hyperspace in order for different types of users to access their desired information more easily.

Adaptive hypermedia can also be used to make user access to a subset of a large amount of information readily available by using a system to manage personalized views within the information space. Each view can be devoted to a single goal or interest related to the work of the user [Brusilovsky 1998]. These views may also change over time to compensate for new, altered, or out-of-date information.

3.1.2. Adaptation Methods

Many aspects of a website can be modified in order to aid the user in achieving his goal. Some of these aspects are the ideas of promoting and demoting links, highlighting, linking, and clustering.

“Promotion makes a link or page easier to find by placing a reference to it closer to the front page of the site or by moving a link closer to the top of a page.” [Perkowitz & Etzioni 1997].  The idea of promotion and demotion of links is simple: a link is promoted if it is popular, and demoted it if it isn’t. However, although the idea may seem simple, the reasoning behind it can be complicated depending on the method used to decide how popular a link is.

In order to make the point more clear, the method discussed here will be one discussed by Perkowitz & Etzioni [1997]. They determined the popularity of an object, i.e., a page or link, in term of how many times it was accessed. The more accessed a page or link was, the more popular it was considered. Along with popularity, they also rated each page or link in terms of its accessibility. They did this by measuring the distance (number of pages that had to be accessed) between the page or link and the front page. By combining accessibility and popularity, it is clear that a page or link should be promoted if its popularity is high and its accessibility is low, and demoted if its popularity is low and its accessibility is high.

Another method of adapting a website in order to aid a user in obtaining his goal is by linking (unlinked) relevant pages together, as well as unlinking pages that do not need to be linked. The fact that many users visit two pages suggests that they are conceptually related in users’ minds [Perkowitz & Etzioni 1997]. If many users consistently access the same two pages in a single visit to the website, these two pages should be linked together. On the other hand, two pages that are linked together and are seldom accessed within the same visit to the website probably should not be linked.

Clustering can also be used to aid the user in his task. This method, like the one above, presumes web pages are conceptually related in users’ minds if many users visit the same pages during the same visit to a website. The adaptive system can reasonably assume that by combining these pages into one new page and creating a link to that page, the users will find it easier to access the information.

One approach to implementing this concept is used by Perkowitz and Etzioni[1997]: they consider the filenames of the pages, their locations in the website’s hierarchy, and the frequencies of access by users during the same visit.

This can be exemplified by a clothing site organized primarily by brand. For each clothing name brand there exists a web page that displays all the shorts the name brand offers, e.g., nike/clothing/shorts.htm, reebok/clothing/shorts.htm, etc. If multiple “shorts” pages were accessed frequently by users during the same visit, a possible solution would be to put all the information about shorts on a new page and create a link to it at some higher level in the hierarchy based on its popularity.

Another way to help users access the information they are looking for is to highlight the links that are most popular. Highlighting makes links more visible by modifying their fonts and colors. In order for this method to be effective, not all links can be highlighted, since this would defeat the purpose of highlighting. Therefore, the number of links that are highlighted should be a small percentage of the total number of links on a page.

3.1.3. Adaptation Usefulness

In practice, creating an “optimal” website is unrealistic. “Most adaptive methods tend to focus on hill climbing from the initial state to another state that is likely to be better.” [Perkowitz & Etzioni 2000]. Once an adaptive method is applied to a website, its usefulness can be determined by two factors: benefit – how much the site improved for those who use it, and impact – how many users actually benefit from the changes. When judging the usefulness of an adaptive method on a website, one may consider: (1) whether or not the adaptation created more or less work on average for users, (2) did the adaptation make the website easier to use for everyone, both novices and experts, (3) did the adaptation make more or less work for the webmaster, (4) has the site been adapted in a manner that disrupts the original design, and (5) is the human webmaster still in control of the website.


3.2. Genetic Algorithms

Genetic algorithms are used to simulate evolution in a computerized environment. Natural evolution has long been recognized as a superb adaptive mechanism. Genetic algorithms are robust, able to solve many types of problems, but are also relatively efficient.

3.2.1. Why Genetic Algorithms?

Genetic algorithms are one of many optimization methods. So why use a genetic algorithm when there are so many other ways to optimize? Many optimization methods have gaps in their effectiveness, which a genetic algorithm may be able to fill. For example, most traditional methods work best on continuous functions, and might not work at all if the data points were not connected. Most are work poorly in functions with multiple peaks, often finding only a local optimum instead of the best value.

Goldberg [1989] lists several reasons why genetic algorithms work differently from traditional optimization methods. First, genetic algorithms use a representation of the parameters, not the actual parameter set. Second, they search from a population of points, as opposed to a traditional algorithm, which would search from a single point. This is especially important in multiple-peak functions, as the algorithm can be climbing many hills at once. Third, instead of trying to determine the single best answer, genetic algorithms are based on probability. So while the chances of finding the absolute best answer are the same or lower than what they would be in traditional algorithms, it is much more likely that the genetic algorithm will find a solution that is very close to the optimum. Last, but most important for their robustness, is the fact that genetic algorithms only use the information given to them by the previous solutions. They do not need any auxiliary knowledge (example – derivatives of a continuous function) to find a solution. Goldberg [1989] calls the previous solutions “payoff information”, that is, the payoff from the previous attempt. Traditional algorithms always need additional information (unless they are just a random walk); instead of working from previous solutions, they try to determine which solutions are likely to be good before trying them.

3.2.2. How genetic algorithms work

Genetic algorithms are easiest to understand when applied to a black-box problem. The box has a number of switches which can be toggled on or off. The output from the box depends on which switches are toggled, but we don’t know which combination will give the best result. We can represent each switch with a digit – 1 if the switch is on, and 0 if it is off. So a complete representation of the switches is a string of ones and zeros. To use a genetic algorithm to solve this problem, we first generate a few possible combinations for the switches by flipping a coin and recording the result, until we end up with several strings of ones and zeros that represent possible solutions to the black box problem. (We later refer to these strings as chromosomes.)

The genetic algorithm will use only two basic operations on these strings to find an answer: copying the strings and swapping partial strings. First, we evaluate each string and give it a fitness value, which Goldberg [1989] urges us to think of “as some measure of profit, utility, or goodness that we want to maximize.” We then copy over strings according to their fitness values; strings with a higher fitness have a greater probability of contributing one or more offspring to the next generation – an artificial version of Darwin's “survival of the fittest”. Each string copy is entered into a “mating pool” [Goldberg 1989]. The members of the pool are selected for mating at random. Each pair then undergoes crossover (Figure 3.1): a position is chosen on the pair of strings and all the characters in the string after that position are copied over to the other string instead. There is also a small chance that one of the offspring produced may mutate, which would change a single value at random within the string.

Figure 3.1: Basic crossover.

So, in short, there are four main parts to a genetic algorithm: evaluation (or fitness), reproduction, crossover, and mutation. The basic steps in a genetic algorithm are shown in Figure 3.2.

Figure 3.2: Genetic algorithm. Adapted from Davis [1991].

3.2.3. Reproduction

Genetic algorithms are intrinsically parallel [Davis 1991]. They work on many possible solutions at once. Reproduction, which maintains the population of possible solutions, makes this parallelism possible.

The simplest method for choosing parents is often called “roulette wheel selection” [Davis 1991; Goldberg 1989], because it can be thought of as assigning a slice of a wheel to each chromosome, with the size of the slice corresponding to the chromosome’s fitness. If the fitness values for the whole population are summed, we may then calculate each chromosome’s fitness in proportion to the whole. This proportion is the probability that the chromosome will be selected for reproduction.

There are two replacement techniques that can be used in reproduction: generational and steady-state. Replacement refers to what population new children are inserted into. Generational replacement replaces a whole generation with the children it produces. Steady-state replacement [Davis 1991] produces one or two children at a time, and inserts them into the current population. Noisy functions, in which the same chromosome may return a different value each time it is evaluated, are better served by generational reproduction. Steady-state reproduction can be used to reduce or eliminate the chances of having duplicate chromosomes in a population. It can also allow a higher rate of mutation and crossover, without the fear of losing many good chromosomes.

Another method to prevent loss of good chromosomes is elitism. The concept of elitism makes generational replacement more like steady-state: the best individual from the generation is copied over to the next generation. Some approaches extend this concept by encouraging duplicates of good chromosomes in the population in order to increase the chance of these chromosomes being involved in crossover.

3.2.4. Crossover, Mutation, Schemata

Crossover and mutation are the two major operations used in reproduction. [Goldberg 1989; Davis 1991]  Crossover in its simplest form, known as one-point crossover, has been described above. Two-point crossover is similar (Figure 3.3), but sets both starting and ending points for the exchange of genetic material.

Figure 3.3: One-point and two-point crossover.


 The methods for two-point crossover can be extended for longer chromosomes, but they begin to lose their effectiveness. When diversity in the population is greatly desired, a third type of crossover, called uniform crossover (Figure 3.4), is used. In uniform crossover, bits from the parents are randomly divided between the children. This is done by means of a template. The problem with uniform crossover is that it can easily destroy valuable schemata (defined below).


Figure 3.4: Uniform crossover.


Mutation is the other important operation used in genetic algorithms. A unary operation, it changes the value of a single bit within one chromosome. Mutation is used to introduce diversity into a population, finding solutions that would not be possible if only crossover were used. Crossover, on the other hand, can spread favorable mutations quickly, and allows the mutations to be combined (extremely unlikely without crossover).

Schemata are important to understanding how genetic algorithms work. A schema is essentially a template of a chromosome. In a binary representation, the template has three possible values for each position: 1, 0, or “don't care” (represented in Figure 3.5 by the # symbol). It is the strings of 1s and 0s that make up the schema and provide a basis for comparison of genes. The Schema Theorem states that a schema which occurs in chromosomes with above-average fitness will tend to occur more frequently in the next generation. This is especially true with shorter schemata, as they are less likely to be broken by crossover.

Figure 3.5: Simple schemata & binary chromosomes that fulfil them.

3.2.5. Fitness

The last important part of any genetic algorithm is its fitness function. This function is based on an evaluation function. The evaluation function determines how well the chromosome performs in its domain. This evaluation is highly dependent on the problem that the genetic algorithm is trying to solve. The individuals that have the highest evaluation (the answers closest to what is desired) are given the highest fitness. The individuals with the highest fitness are more likely to be chosen to reproduce for the next generation. According to Davis [1991], there are three popular techniques for computing fitness.

The first technique is very simple: the evaluation of the chromosome is its fitness. In this technique, the numerical value given to the chromosome by the evaluation function is simply copied to the fitness. While uncomplicated, this method has its disadvantages. At the beginning of a run, when there have not been many generations, a “super-individual” (one that is much better than the average) can quickly wipe out all its competitors, even if it is not very close to being optimal. When the algorithm has been slowly converging on an optimal solution, individuals in the late generations will be “close competitors” in fitness, and the best of the group will not stand out. This technique must also be modified if it is possible for an individual to have a negative evaluation, because fitness must be positive.

The second technique is called “windowing.” In this technique, the minimum evaluation within the population is found. Each chromosome is then assigned a fitness value equal to the amount it exceeds this minimum.

The third technique is linear normalization, which orders the chromosomes by their evaluation. The best evaluation gets the highest fitness value (equal to the size of the population) and the worst evaluation gets the lowest fitness value (usually one or zero).

Both of these later techniques compute fitness by a relative scale of the current population, so they are less likely to be affected by the “super-individual” or “close competitor” scenarios that can happen at the beginning or end of the algorithm.

Davis [1991] asserts that the concept of “fitness” (in this application, we’ll refer to it as preference) can also be applied to the operations performed on the chromosomes. At the beginning of a run, crossover is most effective at finding good solutions. However, at the end of a run, when all the individuals are relatively close, mutation is the better operator. By assigning the operators a preference value that increases or decreases over time, the algorithm can perform more efficiently. This also allows the operators to work independently of each other; a chromosome may be mutated without using crossover first.

3.2.6. Other Representations

So far, we've only discussed genetic algorithms that use a binary representation for the chromosomes. However, genetic algorithms can be used with other representations as well. In fact, this is desirable if it helps understandability of the problem [Davis 1991]. For a problem where steps must be ordered, each chromosome in the algorithm is a permutation of the list of steps. With this representation, crossover is similar to the uniform crossover described earlier, but incorporates the idea that order matters. Mutation rearranges the steps within a sub-list of the entire chromosome. Among other applications, order-based representations are an important part of genetic programming [Koza 1999].

If a real number representation is used for the chromosomes, the operators are again modified. A simple crossover for real numbers is to average the two parents. A value within the chromosome may be mutated either by replacing it with a randomly selected real number, or by shifting the value up or down a small, random amount.

If some information contained in an individual cannot be changed, or is dependent on other individuals, a more structured representation must be used. For this case, Koza [1999] describes a form of structure-preserving crossover, which restricts which parts of an individual can be changed.

3.2.7. Generations and population size

Population size is important to consider when designing a genetic algorithm. Usually, the algorithm is constrained to a certain number of evaluations, so a smaller population will be able to have more generations, and a larger population must have fewer generations. A rule of thumb is that a larger population will work more slowly (needing more generations), but achieve a better eventual answer than a smaller population [Davis 1991]. This has been tested to some degree [Goldberg 1989], but cannot be proven to always hold true, due to the random nature of genetic algorithms.

The examples Goldberg [1989] and Davis [1991] use generally have a population size of 100, because it is an easy number to deal with. The real world applications they mention use population sizes around 30, because they cannot afford the slower nature of the larger populations. Other constraints can apply to the selection of population size, as well. Oliver et. al. [2002] use a population size of only 12, because their design requires the whole population to be displayed simultaneously on a monitor.

The number of generations also varies depending on the application and resources. The general rule is to run as many generations as time allows, because increased generations should result in a better solution. Since their algorithm was based on human decisions instead of a programmed fitness function, the users in the experiments of Oliver et al. [2002] were able to indicate when an acceptable solution had been found, ending the algorithm.

4. Solution

The solution to our problem has three fundamental parts: the website which is modified by the genetic algorithm, the experiments used to determine the fitness function and assess the GA-generated site, and the GA itself.

4.1. Website

The website that we used in this project was designed to work smoothly with the genetic algorithm and the experiment.  The genetic algorithm must be able to understand a description of the structure of the website so that the genetic algorithm can alter that structure.  For the experiment to be successful, the website must provide the experimental subjects with a clean interface, which they need in order to produce consistent, reliable data.  These requirements dictate some aspects of the design of the website.

4.1.1. Representing a Website

The genetic algorithm must be able to manipulate the representation of the website.  A website can be multiple pages.  To make the representation easier, the website was designed so that its structure could be represented as an array.  It is much easier for the instructions in the genetic algorithm to manipulate an array than the HTML code that is used to represent the website on the Internet.  Therefore, one of the goals of this project is to find an array that captures the relevant information of the website in a form that the genetic algorithm can understand.

Each member of the array is called a content unit.  A content unit represents an amount of web content roughly equivalent to a paragraph.  The term “roughly equivalent” is used because there are often headers, links, or buttons associated with each paragraph, and these are absorbed into the content unit to make things easier.  Consider the simple website in Fig. 4.1 as an example.

Figure 4.1: A Simple Website

There is one content unit on this website, and it contains the string “Welcome to my website.” The entirety of a website can be represented by an array with a length equal to the number of content units.  In this case, the length is one.  Figure 4.2 shows a pictorial representation of this array.  Note that in many computer science applications, arrays are zero indexed.  This means that the first member of the array has an index of zero, the second member an index of one, and so on.


Figure 4.2: Representing a Simple Webpage

Adding another content unit to a website increases its complexity (See fig. 4.3).  There must be multiple content units in the array, and there must be some way to distinguish the order of the content units.

Figure 4.3: Adding a Content Unit

There are now two content units in the array.  A second variable is added to each, representing its “priority” on the webpage. The smaller the priority, the closer the content unit is to the top of the page.

Figure 4.4: Representing the New Webpage

Priorities control how content units are positioned on the page relative to each other. Any content unit with a certain priority will always appear below any units with a lower priority, and above all units with a higher priority. For example, if there are three content units which have priorities of 3, 1, and 9, the website would contain the content unit with priority 1 first, the content unit with priority 3 second, and the content unit with priority 9 third.

When the GA acts on this array, no content information is modified, because all resulting websites must contain the same content.  If content information were to be exchanged through crossover or changed through mutation, the websites would no longer have identical content.  The information presented by each member of the population of websites must be the same.  The GA is changing the presentation of this information, not the information itself.  The priority attribute does not relay any content information.  Because of this independence from content, the GA can alter the priority to produce different ordering results without changing the content.

Consider the following array (Fig. 4.5), which produces the following web site (Fig. 4.6).

Figure 4.5: Three Content Units



Figure 4.6: Website for Figure 4.5

Let us now consider that an arbitrary number of mutations or crossovers have been performed to achieve new priority results, as shown in Figure 4.7 and 4.8

Figure 4.7: Changing the Priority



Figure 4.8: Website for Figure 4.7

This demonstrates that priority can be arbitrarily changed to produce new presentations without modifying the data.

4.1.2. Using Websites to Construct a Hierarchy

The above examples are not hierarchies.  To include the hierarchical structure of an actual website in the algorithm, a limitation is imposed that all pages of a website can only link to a page lower in the hierarchy.  This forms a tree, and the graph of this tree contains no cycles (i.e., there is only one path to each page in the tree).  This limitation allows us to easily represent the site in a fashion acceptable to the Genetic Algorithm.

The GA performs linking implicitly by binding multiple content units to a page.  Consider the array shown in Fig 4.9.  In this representation, the longer vertical dividing lines represent the ends of pages.  The positions of these lines are included in the GA data structure in a way that will be described later.


Figure 4.9: Representing Pages

In this example, content unit A is bound to the first page, content units B-D are bound to the second page, E to the third page, etc.  To create the hierarchy from this setup, we perform the following algorithm.  First, determine the index page, represented by the first group of content units before the first line.  In this case, it is just “A”:

Figure 4.10: Step One in Constructing the Hierarchy

This represents a site containing a single page with a single content unit.  The next group of content units represents page 2 of the hierarchy.  These are content units B, C, and D, located between the first and second divisions in Figure 4.9, above.  When added, the site looks like Fig 4.11.

Figure 4.11: Step Two in Constructing the Hierarchy

The second page contains three content units. Each content unit may contain only one link.  If it is at the bottom of the hierarchy, it has no links. Because there are three more content units to add in this example (as represented in Figure 4.9), this means that the tree will now contain three more branches.  Content unit B will point to the next page.  According to Figure 4.9, that page contains only content unit E.

Figure 4.12: Step Three in Constructing the Hierarchy

The algorithm continues to perform these operations in a breadth-first manner until the entire hierarchy is constructed.  Hence, it will add the next link to content unit C, the link after that to content unit D, and the link after that to content unit E.  The results of the algorithm are shown in Fig 4.13.

Figure 4.13: Completing the Hierarchy

To add additional pages to the hierarchy, the algorithm would continue to function in a breadth-first manner, adding a page first to content unit E, then to content units F, G, and H, respectively. This results in the hierarchy always being represented as a complete tree.

To represent the divisions in Figure 4.9, the algorithm uses “flag-based separation”.  Flag-based separation is a concept that incorporates binary flags into the array.  These flags are placed on the array values that have a page break directly following them.  In this example, the flags would be placed in content units A, D, E, and G.  The final content unit is always assumed to have a flag, because it will always appear at the end of a page.

Many of the website representation issues addressed here are necessary for the genetic algorithm to function properly.   By using an array data structure with a clear division of pages we can represent an entire hierarchical website using a single data structure.  The website representation discussed above was designed with these goals in mind.

4.2. Experiment

In this section, we will explain the design of each of the three experiments. For each of these experiments, the physical collection of our data was done by Perl scripts. The scripts logged the user access times, as well as the paths they took when traversing the websites. We implemented the scripts such that when a user clicked the back button on their web browser, it would not affect the collected data.

4.2.1. Experiment #1

The first experiment allowed us to determine the effect of page length on how long it took for users to find a specific link. This was accomplished by giving users the task of finding the correct link for a given subject. Word count (of the experimental page) was the independent variable.

Figure 4.14: Introduction to Experiment 1


Figure 4.15: Example task for Experiment 1

Once a user completed the training example, they were presented with one of five pages, each of different length. Each page contained similar information about five computer technologies (Figure 4.16, below). The users were asked to look for information on “online form technology”. These pages were presented in the same format as the example page, i.e. paragraphs followed by links. The topics, as well as the order they were presented, were the same on each of the five pages. Each page also contained the same proportion of information for each topic (i.e., each topic took up 20% of each web page, regardless of the page’s total length). The amount of information provided for the topics was the only difference between the five pages. By varying the amount of information for the topics, we were able to test the effect of different page lengths on the time it took the user to find the correct link.

Figure 4.16: Experiment 1

Once the user clicked on a link, the experiment terminated. This occurred whether or not the user selected the correct link. We ran the experiment in this manner to avoid a possible side effect of correctness on reading time.

We recorded the time from when the user started the actual activity to the time when the user to clicked on a link, the length of the web page, and whether or not they selected the correct link. By conducting this experiment we found a relationship, which we used in our fitness function, between the length of a web page (measured by word count) and the amount of time it takes a user to find a desired link.

4.2.2. Experiment #2

The second experiment allowed us to obtain data on how the users traversed a human generated website. For this experiment, we designed a website that contained information on unusual sports.

Figure 4.17: Introduction to Experiments 2 & 3

As with the previous experiment, users were first given an example task (see Figures 4.17 and 4.18). For this task, links were not functional, and were only placed there to familiarize the users with the format of the experiment. Next to each paragraph was a button titled “This is my topic”, which the user clicked when they thought they had found their target topic.

Figure 4.18: Example task for Experiments 2 & 3

Once the example task was successfully completed, each user was asked to select an unusual sport (Figure 4.19). This selection dictated their target topic for the experiment. By allowing this selection, we were able to obtain a frequency distribution of the number of times each sport was searched for.

Figure 4.19: Selecting a target topic

Once a target had been selected, each user navigated through the website (structure shown in Appendix A) until they believed they found the relevant information about their target. They indicated this by clicking the button next to the relevant paragraph. If the users clicked on the wrong button, they were presented with a page indicating this, and were asked to try again. Once the user selected the correct button, they were presented with a page indicating success.

The frequency distribution and the path followed to get to the target information were used in the GA’s fitness function.

4.2.3. Experiment #3

The third experiment allowed us to obtain data on how the users traversed the website modified by the GA (see Appendix H for structure). The website contained exactly the same information as the human generated website in the second experiment. The structure of the website changed, as well as the location of content units within the site. We collected time data in this experiment to compare with the times from experiment 2.


4.3. Genetic Algorithm

We designed a genetic algorithm to adapt the website. In this section, we will discuss some of the details of its design.

4.3.1. Data Structure

The algorithm simulated evolution using a population of websites. The website structure is the one described in section 4.1. The population is an array of pointers to the websites. Each website is represented by an array of content units in a fixed order. Each website also has settable values for fitness and ordering. Each content unit contains the content string (or a marker indicating the content), the word count for that content, a settable priority marker and page-end marker (as well as a number of internal variables used to determine page positioning and linking).  The population of sites was designed in this manner to make crossover between websites possible.

4.3.2. Generations

Our genetic algorithm ran several times, 500 generations each time. We ran it several times in order to get the best possible fitness value from the algorithm. We used standard generational replacement: selections from the parent generation produced a new, “child” generation. When the child generation was completely generated, it became the new parent generation. Each generation had a population size of 30 sites.  The initial population was generated by randomly mutating the original site 29 times.  The 30th site in the initial population was the original site.

When each generation was completed, the algorithm calculated fitness values for all the sites in the child population. The program recorded the best, worst, and average fitness values for each generation, and also recorded the representation of the site with the best fitness in each generation.

4.3.3. Fitness Function

As shown previously, in Figure 4.19, we allowed subjects in our experiments to select which content to search for. Once they made a selection, their task was to find this target. When calculating the fitness of a particular site, we first calculated fitness for each task. The fitness value for a site is the sum of the fitnesses for each task for that site.



Figure 4.20: Fitness function

To calculate task fitness, we looked at four factors (Figure 4.20). Frequency is the percentage of subjects that chose the given task. Frequency was a weighting factor: tasks that were never chosen were not included in the fitness of a site, and tasks that had been chosen frequently got a better fitness value in the results.  Recall that lower fitness values are better.  Thus, high frequency for a task means that other values in the fitness function (wordcount and pagecount) must be lower for this task to produce a desirable fitness.  In this way, tasks with a high frequency are given more help in the genetic algorithm. Wordcount and pagecount estimate the time it would take a subject to read through all the material in the path to the given task. Wordcount was calculated by finding the content unit that linked to the target, then totaling its length and the length of any content before it on the page. Then the process was repeated with the content unit that linked to that page. In the example in Figure 4.21, the wordcount would be the sum of the lengths of all the units in red.

Figure 4.21: Path to target content

Pagecount is simply a count of the number of links required to get from the first page to the page with the target content. In Figure 4.21, the pagecount is 2. We realized that our experiments needed to incorporate error rate. Error rate is a crude measure of how likely a subject is to click on the wrong links in their quest for the target. In each task, error rate is a constant, calculated from the results of Experiment 2. We assume errors would be similar on a new site, so pagecount (the shortest path) is multiplied by error to yield a more likely path length for the task. This new path length is multiplied by a constant to give it an effect on the fitness function comparable to wordcount.

The fitness values obtained from the function above should be proportional to the time it takes a subject to find their goal content on a site. Since we are seeking to minimize this time, sites with lower fitness values are more desirable. A high wordcount would produce a larger fitness value, as well as high error rates and pagecounts. Once fitness values have been calculated for a whole generation, the population is ordered, based on their descending fitness values. The lowest (best) fitness gets the highest order value (30 in our algorithm).

4.3.4. Reproduction

Once an order has been assigned to the members of the current population, the algorithm is ready to start the next generation. Parents are chosen in a weighted technique, based on their ordering. For example, the site with order number 30 is 30 times as likely to be chosen as the worst site of that generation, with order number 1. This technique is called linear normalization. As discussed in Section 2, this technique ensures greater diversity in selection of parents both at the beginning and end of a run. At the beginning of a run, linear normalization can decrease the perception of differences between the site with the top fitness and the next-best site. At the end of a run, linear normalization increases the perception of differences between otherwise close competitors.

Our site representation (discussed in Section 4.1.1.) was explicitly designed to allow mutation of attributes and crossover between sites. If a site was picked for mutation, two types of attributes could be affected. A page break flag could be flipped (turned on if it was off, turned off if it was on). A priority value could be mutated using “real number creep” to add or subtract a small random value from the priority. Standard one-point crossover was used. As seen in Figure 4.22, this crossover transfers many properties of the parent sites to their offspring.

Figure 4.22: Website crossover

5. Methodology

5.1. Website Design

Two primary concerns were taken into account when designing the website.  The first concern was the subject matter of the website.  The subject matter could have an adverse effect on several variables in the experiment.  The best logical choice is to select a subject matter that minimizes these effects.  The second concern is the structure of the website.  While the structure can have adverse effects on the user, a more important concern is the ability to represent the site as a data structure.

Three websites were created for this project, and an experiment was run on each of these.  The first experiment was significantly different from the second experiment, and thus the decisions made for each corresponding site were significantly different.  However, no decisions were directly made regarding the website used in experiment 3 (as it was generated by the algorithm).

5.1.1. Technology News Site (used in Experiment 1)

The content selected for the website used in the initial experiment was “technology”.  Considering that the population of users participating in this experiment was almost entirely Computer Science majors, this seemed likely to hold the interest of the users.

To avoid an incidence of prior knowledge causing the user to skip material or lose interest, material on technology was taken only from recent news, and never from front-page sources.  Topics generally focused on a new and interesting technology, such as wireless networking.

The structure of the website was very simple.  There were a total of five pages, each with a different number of words.  No single user, however, saw more than one of these pages.  The number of words on each page was varied to produce content that would fill an approximate amount of pages.  Thus, one example contained roughly ¼ of a page of information, another ½, another one page, another two pages, and the final one four pages.  This was done because we expected to see a non-linear relationship of word-count and time due to users scrolling the page.  This expected relationship did not occur.

5.1.2. Sports Site (used in Experiments 2 and 3)

The content selected for this website was unusual sports.  Sports as a general topic was selected because a large amount of people are interested in it, and choosing unusual sports ensured that few people would already know the information we were using.

There were originally some concerns about the style of website to be used.  An information page with in-text links was selected, but some of the other options are worth mentioning.  For example, a website for online shopping has the potential to benefit greatly from this technology (possible increased sales), but the experiments in this approach would be contrived.  Personal web pages were considered, but there is generally a lack of interest in the information presented on these pages, and they are usually not complex enough to warrant any restructuring.

The types of content to appear on the webpage were also important.  The initial experiment supplied only information about word count.  Because of this, other web elements, such as color, images, positioning, white space, and form elements were not used.

The news-article-style educational website format was selected because it allowed us to use links of the same type on all pages.  All information on each page is of the same level of specificity, and all links lead to a related topic, but not to further information on the same topic.  Also, the users of the website, students, would be familiar with this style of website.  Its simplicity would allow the experiment to be more precise and ensure that we do not change more than one variable.

Another reason that an emphasis has been placed on simplicity is because of the genetic algorithm.  Complex formatting, navigation bars, and other elements that do not fit neatly into the existing content unit framework would not be easily manipulated by the genetic algorithm.  Consider a navigation bar that is to the left of the content units on a page. We would no longer know the order of words on the page in this case, and could not use the information from experiment 1.

There are two possible structures that were considered in the design of the website for experiment 2.  The first is what is called a “web”, in which there is no clearly defined “top” of the site and no sublevels.  Any page can point to any other page.  This structure was rejected because of the difficulty of representing it and still maintaining genetic algorithm capabilities such as crossover and mutation.

The other structure is called a “hierarchy”, in which there is a clearly defined “top” of the site (the index page), and most links point downward.  This forms what is called a “tree”, which is an easy data structure for an algorithm to manipulate.  Also, it can be represented easily as an array, and manipulated through crossover and mutation without corrupting the data inside.

The content units within the hierarchy also needed to be structured.  Since none of the topics were necessarily related, it was not immediately obvious how to do this.  To solve this problem, we gathered information from other students as to how they thought the site should be structured.  This was accomplished by writing each topic (along with a brief description) on an index card.  The subject of this “mini experiment” was then asked to organize the cards in such a way that all cards would be grouped with related topics.

The results of the above experiment produced some general trends, which aided in our design for the second experiment’s web hierarchy.  Recall that the theme of this experiment was unusual sports.  Ball sports were often placed together, as were sports that involved teamwork, sports that involved animals, and sports that were underwater.


5.2. Experiment Design

Much of the design of the experiment was encompassed by the design of the website.  However, there were some design decisions that did not fall into that category. In particular, many requirements for the website and for the experiment were determined after receiving unsatisfactory results from our first set of experiments.

5.2.1. Problems

After designing the genetic algorithm, we realized our experiment design was flawed. Once the original site had been modified, there were a number of problems due to oversights in the design of experiment and website. The problems we found included poor indication of success, duplication of content, inconsistent linking, and changed word count.

It was difficult for a subject to indicate when they had found the content they were looking for, due to the structure of the experimental site. In the initial design, successful termination was indicated when the subject clicked the link below the first content unit on their assigned topic. This link actually led to more information about the topic, so the natural action was preserved. In the modified site, however, the link associated with specific content often did not lead to more information about its topic, but to information about unrelated topics. Since links were the indication of success as well as linking farther down in the site, clicking on the link to go away from the target topic did not make sense.

Another problem that occurred was duplication of content. This duplication was rooted in the natural hierarchical structure for our information. Each page on the second level of the hierarchy (see Appendix C) contained three content units, each about an animal in that family (reptiles, amphibians, or mammals). Each unit had a link to more information about the same animal. After the site was modified, these links no longer existed; instead, two content units about the same animal would appear in disparate areas of the site. This caused confusion when looking for target information, because it appeared in two locations, but only one of them was considered correct by the scripts recording the user’s performance.

The third problem was inconsistent linking. We realized that links in our first experimental site had three possible meanings: “example-of”, “more-information”, or “is-related-to”. The content units at the highest level were broad categories; their links led to examples of the topic. At the second level, each link led to more information about the topic. The third level contained the most general type of link, which led to information about a different, related topic. The problem with these different types of links was that these types are not preserved once the structure of the site had been changed; instead, all links became is-related-to, the most broad type. This added to the confusion caused by the first two problems described above.

To counteract the effects of the changes in the links, we attempted to write better descriptions in the links. However, this drastically changed the word count for each unit of content. Since word count was a constant in our experiments, this did not follow the scientific method.

5.2.2. Solutions

Our team came up with 3 possible solutions to this set of problems. The first was to continue the experiment as planned. The second was to change the link context. The third alternative was to alter the original experiment and re-run it.

If we simply continued the experiment, we would not alter the experiment in any way. We would then need to discuss and analyze why the results showed a decline in user performance, as well as ways we could have changed the experiment to avoid this decline. Although this approach would be easiest, it would not follow the scientific method, since two variables had changed: the types of links as well as the basic structure of the site. The results we obtain would be likely to show a decline in user performance, if we could even obtain results. The experiment itself was confusing, and many testers would be likely to give up without completing it.

The second option was to continue the experiment after making modifications to the new site. We would have added text to each content unit, adding context to the links so they might make more sense. The benefits to this method were obvious. There would be a greater likelihood of obtaining positive results, while still illustrating the combination of an adaptive website and a genetic algorithm. There would not be a large increase in the time we would need to finish the experiment. However, this method was not desirable. We would be disregarding the scientific method entirely by changing a constant within the experiment. Even with these additions, the experiment might not actually be less confusing. This alternative also required significant creative writing on our part, which might lack consistency.

Altering the original experiment was the third possible solution. In this option, we would not simply continue on with the third experiment, but return to the second. We would change the information presented to users to make it more consistent. We would also differentiate links (to navigate through the structure of the site) from goals. Although this method would require the most time and effort, there would be many benefits. This option follows the scientific method most closely. There would actually be fewer changes needed (than in the second option) to make the website less confusing. The project would use our existing framework for the website and require few modifications to the genetic algorithm. Most importantly, this method was the most likely to produce positive results and support our theory.

After careful consideration, we decided to go with the third option, as it afforded us the greatest chance of success.

5.2.3. Data Collection

The information we gathered consisted of the time it took the user to complete the experiment, the order of links clicked by the user, and some personal information about the user. We used the time data from experiment 1 to make assumptions for the fitness function. In experiments 2 and 3, the time data was used to measure improvement.  The order of links clicked was originally gathered to obtain the number of links clicked by a user within the experiment.  We later found that the order could also be used to determine the rate of errors made by a user, and incorporated this information into the fitness function as well.

Another piece of information we gathered was the topic that the user selected as their target.  The decision to allow users to select their own topic came from a desire to simulate the way a real website works.  In a real website, users are not forced to choose an item within the site to look for, so we felt this website should be no exception.  This also gave the genetic algorithm another piece of data, frequency, to use in the fitness function.  In our original design, the user was not given a choice and was assigned a target.  This was done because we wanted an equal distribution among all topics.  However, when we revised the experiment, we realized that this was an artificial imposition on users of the website.

5.2.4. Implementation details – Population and Programming

The user population for this experiment was drawn from Computer Science students at Worcester Polytechnic Institute (WPI).  We did this because it was relatively easy to acquire participants from the classes of professors.  Also, we assumed that Computer Science students would possess at least rudimentary internet skills.  We did not want a large variety of skill levels to bias the experiment.

The languages used to implement the experiments were Perl and Javascript.  These were selected because all group members had some experience with them or reference material pertaining to them.

5.2.5. Example Task and Website

Another factor of the experiment was the example given to the users before being tested on the real website.  The example was given to familiarize users with the format of the website.  For each experiment, it provided a chance to learn what the links and buttons on each page did.  It was done primarily to avoid having the users spend extra time in the actual experiment trying to learn the interface.  However, we did not want them to learn anything about the topic of the real experiment, and so the information in the example was not related to the experiment.


5.3. Genetic Algorithm

There were a number of things we needed to consider in the design of our genetic algorithm. We had to decide which language(s) to use, how to structure our fitness function, and the number and size of the generations.

5.3.1. Programming Logistics

The genetic algorithm was programmed using C++ because of the large amount of programming experience that all team members had with this language.  Also, this language allowed us to break up the code into header files, so that functions could be tested individually.

The genetic algorithm has a very modular structure.  Each part of the genetic algorithm is contained within its own header.  Each header also contained all of the helper functions for that part of the genetic algorithm.  Crossover, mutation, the ordering function, parent selection, the fitness function, and the site generator each had their own header. Assembling the genetic algorithm, in theory, should have been a simple matter of compiling all of the working header files.  However, problems occurred because the genetic algorithm was not fully compatible before implementation began.  We were forced to add functionality “on-the-fly”, to fix oversights in the original design.

5.3.2. Fitness Function

Many important design decisions went into the fitness function.  In our original set of experiments, the data used were word count, page count, and error rate.  Frequency information was not gathered (see above).  For a particular task, we used a simple formula: (word count + (page count * error rate)).  This formula was based on the idea of giving a weight to both the words the users were looking at and the links the users were clicking.  A constant could be multiplied to either factor to increase its weight.  Several times, we made minor alterations to this weight, ran the genetic algorithm, and recorded the results.  In such a way were we allowed to “tweak” the formula until it produced desirable results.  The users in the original version of the experiment were evenly distributed among the tasks, so all tasks were considered equal in the genetic algorithm.

When the experiment was revised, the element of frequency was added.  When incorporating this new variable into this fitness function, we tried several approaches.  We ultimately selected frequency as a multiplier to the rest of the function: (frequency * (word count + (page count * error rate))).  Frequency in this case was a percentage.  To find it, we took the number of times that the task was selected, and divided by the total number of users.  By adding together the fitness values of all tasks, we obtained an average weighted by the frequency.  This made sense to us, because the fitness function would now consider each task in a percentage directly corresponding with how often the task was considered by the users.

5.3.3. Populations

Initially, we ran the genetic algorithm through 100 generations.  However, we were still dubious as to whether the fitness values of the sites were converging on a best fitness.  We decided to increase the number of generations to 500.  We noticed that in some cases, there would occasionally be a jump in fitness as a mutation produced dramatically better fitness values.  This phenomenon was more likely to occur with more iterations of the algorithm.

Another consideration regarding populations was the size.  Most research on genetic algorithms recommends a population size of 100 or more.  However, our research showed that a size of 30 is more practical for real-world applications [Davis 1991; Oliver 2002].  It is not infeasible, however, to consider larger population sizes given enough raw computing power.


6. Results and Evaluation

After running Experiment 2 we ran the genetic algorithm, which produced the site shown in Appendix B. We used this site in Experiment 3, and compared the results with those from Experiment 2. (See Appendix D for summaries of the data from each experiment.) We found that users tended to find their topics more quickly in the later experiment (which used the GA-generated site), as shown in Figure 6.1.

Figure 6.1: Average time taken on experimental site

Our experimental population was 55 students for the first experiment and 39 for the second. In each experiment, the user could pick one of 18 topics to be their target. The small population size made frequency very important. A few topics were quite popular, and a few were never picked, as shown in Figure 6.2.

Figure 6.2: Topic selection frequencies

Figure 6.3: Topics selected at least twice

Frequency from the human-generated site (Experiment 2) weighed heavily in the fitness function. Topics that were picked often in Experiment 2 should have had the largest improvements in time. Improvement was based on the mean time for a topic in both experiments. Because of this, it was impossible to see if there was a reasonable change in time for topics which were not selected at least twice in both experiments. From now on, we will only talk about the topics which were selected at least twice in both experiments (Figure 6.3).

The change in frequencies between the two experiments made meaningful statistical analysis more difficult. The fitness function depended on the frequency from experiment 2, so results should be the most favorable when the frequency from experiment 3 matched that of experiment 2. In such a case, simple comparison of the values for the two experiments might be enough to indicate whether the results were significant. Since frequencies for many topics diverged, more sophisticated analysis was required.

For each topic represented in Figure 6.3, we calculated a 90% confidence interval for the mean times from the experiments. We also made this calculation for the entirety of each experiment, as seen in Figure 6.1. We used the classical formula [Petruccelli et al. 1999] for these calculations: Y ± σ(Y) tn-1,0.95, where Y is the observed mean, σ(Y) is the estimated variance, and tn-1,0.95 is the 90% critical value from a t distribution with n-1 degrees of freedom. In Figure 6.4, the observed means are represented by the blue dots, and the interval around each dot has a 90% probability of containing the true mean value. The true mean value is the mean value of an infinitely large user population. Since some of the confidence intervals are quite large, it is apparent that some of the improvements we observed are not significant.

Figure 6.4: Mean times for experiments 2 & 3 with confidence intervals

To determine which changes were significant, we calculated confidence intervals for the difference of the two means for each topic. We again used the classical formula, where variances are unknown and not assumed to be equal: Y1 - Y2 ± σ(Y1 - Y2) tv,0.95, which uses the means of the two populations and  a pooled estimate of variance. In this formula, the degrees of freedom v is based on the variances of both populations. The results of these calculations can be observed in Figure 6.5.

Any interval that is completely on one side of the x-axis is significant at a 90% confidence level. In our experiment three topics (canoe tilting, cup stacking, and skittles) improved significantly with 90% confidence. The overall improvement in time, with an interval of (-2.3, 31.7) seconds, is not significant at 90% confidence, but would be significant at any lower confidence level.

Figure 6.5: Significant changes in mean time

With several significant decreases in time taken and no significant increases, the analysis indicates that our experiments were successful.

7. Conclusions

The results and evaluations of our experiments have allowed us to make multiple conclusions. We were able to make assumptions concerning the effect of a web page’s word count on the time it takes a user to find his/her desired link. We were also able to deduce how useful genetic algorithms could be in making a website adaptive.

By analyzing the results from the first experiment, we concluded that the word count of a web page, up to the point of the location of the link the user is looking for, has a linear relationship with the time it takes a user to find that link.  The more words between the top of the web page and the link, the more time it will take the user to find the link.

We also concluded that this applies to websites as well. As demonstrated in Figure 7.1, if the target is link 8, it will be found faster in website (b) than in website (a).
















                                                 (a)                                             (b)

Figure 7.1: Word count on a website

The total number of words a user has to look at before seeing link 8 is 40+45+55+45+50+60+25 = 320 words in (a). In (b), the number is 40+55+60+25 = 180 words. By applying our first conclusion, it is clear that the lower word count of 180 will allow the user to find his/her desired link faster in (b).

By analyzing and comparing the results of our second and third experiments we concluded that a genetic algorithm can be practical for managing an adaptive website. As mentioned above, there were some significant improvements in time due to the changes made by the genetic algorithm. Because of this, we believe that a genetic algorithm is very effective in adapting a website such that frequently accessed items in a website are found quickly.

Since user’s interests change over time, we believe that it would be a good idea to run the genetic algorithm periodically at regular intervals. By doing this, the website would be adapting to users’ changing interests. This could be facilitated by automating the genetic algorithm in such a way that it updates the website’s structure without human interaction.

Other domains could also be looked at to see how useful genetic algorithms are to them. For example, e-commerce websites may be improved by genetic algorithms, which could result in higher sales. Personalized views, as described in section 3.1.1, could make topics of interest per user more readily available.

Different user groups could also be investigated to determine the usefulness of a genetic algorithm on adapting a website. Instead of having a homogenous population, as we did in this project, one might try having a heterogeneous population. Then the genetic algorithm may be able to produce multiple websites, each customized to specific subsets of the population.

In order to further investigate how useful genetic algorithms are in improving a website, many studies could be done to determine the effects of HTML elements (such as images, colors, or spacing) on user behavior. By doing this one could incorporate the results of these experiments into a more complex fitness function that handles multiple types of content. This would allow for a broader application of genetic algorithms to improve websites.

This project provided a structure for us to learn about genetic algorithms, adaptive hypermedia, experimenting with human subjects, and technical writing. We were able to apply existing skills and knowledge in the fields of artificial intelligence, modular programming, and programming for the web. The project was a good vehicle for learning because we needed to apply this knowledge in order to be successful. We were also able to recover from our early mistakes and revise the experiments in a timely manner. We are satisfied with the outcome of this project and would like to see further work done in this field.


Brusilovsky, P., ed., Methods and Techniques of Adaptive Hypermedia. Kluwer Academic Publishers, 1998.

Davis, L., ed., Handbook of Genetic Algorithms. New York: Van Nostrand, 1991.

Goldberg, D. Genetic Algorithms in Search, Optimization, and Machine Learning. Reading: Addison-Wesley, 1989.

Jameson, A. "Designing User-Adaptive Systems." 2001. German Research Center for Artificial Intelligence, Santa Fe, New Mexico. Intelligent User Interfaces (IUI) 2001 <http://www.cs.uni-sb/users/jameson>.

Koza, J., et. al. Genetic Programming III: Darwinian Invention and Problem-Solving. San Francisco: Morgan Kaufmann, 1999.

Langley, P. and Hirsh, H. "User Modeling and Adaptive Interfaces." 2000. American Association for Artificial Intelligence, Austin, Texas.

Oliver, A., Regragui, O., Monmarché, N., Venturini, G. "Genetic and Interactive Optimization of Web Sites." 2002. Eleventh International World Wide Web Conference. 12 September 2002. <>.

Perkowitz, M., and Etzioni O. "Adaptive Sites: Automatically Learning from User Access Patterns." 1997. Sixth International World Wide Web Conference. 27 August 2002 <>.

Perkowitz, M., and Etzioni O. "Towards Adaptive Websites: Conceptual Framework and case study." Reading: Artificial Intelligence, July 2000.

Petruccelli, J.D., Nandram, B., and Chen, M. Applied Statistics for Engineers and Scientists. New Jersey: Prentice-Hall, 1999.

Winston, P., Artificial Intelligence: Third Edition. 1992, Addison-Wesley, Chapter 25: Learning by Simulating Evolution, pp. 505 – 528.

Appendix A: Sports Site (Experiment 2)

Appendix B: Sports Site (Experiment 3)


Appendix C: Animals Site (Experiment 3, unused)

Appendix D: Experiment Results


Experiment 2


Time (ms)

Path Length

Average Time (ms)

Average Path Length

Canoe tilting





Canoe tilting





Canoe tilting





Canoe tilting





Canoe tilting





Canoe tilting





Canoe tilting





Canoe tilting





Canoe tilting





Cheese rolling





Cup stacking





Cup stacking





Cup stacking





Cup stacking





Cup stacking





Cup stacking





Cup stacking





Cup stacking















Horse vaulting





Horse vaulting





Horse vaulting





Horse vaulting





Horse vaulting





Horse vaulting





Horse vaulting





Horse vaulting











































































Pachyderm polo





Pachyderm polo





























































Experiment 3


Time (ms)

Path Length

Average Time (ms)

Average Path Length

Canoe tilting





Canoe tilting





Canoe tilting





Canoe tilting





Canoe tilting





Crane & tortoise relay





Crane & tortoise relay





Cup stacking





Cup stacking





Cup stacking





Cup stacking





Cup stacking





Cup stacking

























Horse vaulting





Horse vaulting





Horse vaulting





Horse vaulting


















































Pachyderm polo