
A primer on neural networks in transportation


Material Information

A primer on neural networks in transportation concepts and applications
Physical Description:
48 p. : ill., ; 28 cm.
Perez, Rafael A
Pietrzyk, Michael C
University of South Florida -- Center for Urban Transportation Research
University of South Florida, Center for Urban Transportation Research
Place of Publication:
Tampa, Fla
Publication Date:


Subjects / Keywords:
Transportation -- Data processing   ( lcsh )
Transportation -- Automation   ( lcsh )
Neural networks (Computer science)   ( lcsh )
local government publication   ( marcgt )
bibliography   ( marcgt )
non-fiction   ( marcgt )


Includes bibliographical references (p. 44-48).
Additional Physical Form:
Also issued online.
Statement of Responsibility:
by Rafael A. Perez and Michael C. Pietrzyk.
General Note:
"November 1995."

Record Information

Source Institution:
University of South Florida Library
Holding Location:
University of South Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 025511366
oclc - 666855866
usfldc doi - C01-00157
usfldc handle - c1.157
System ID:



A Primer on Neural Networks in Transportation: Concepts and Applications

by

Rafael A. Perez, Ph.D.
Professor, Department of Computer Science and Engineering
University of South Florida

and

Michael C. Pietrzyk, P.E.
Senior Research Associate, Center for Urban Transportation Research
University of South Florida

November 1995

The opinions, findings, and conclusions expressed in this publication are those of the authors and not necessarily those of the Florida Department of Transportation or the U.S. Department of Transportation.

This report, serving as Technical Memorandum #2, has been prepared in cooperation with the State of Florida Department of Transportation and the U.S. Department of Transportation, in partial fulfillment of HPR Study No. 0763, WPI No. 0510763, State Job No. 99700-3337-1/9, Contract No. B-9896, CUTR Account No. 21-17-189L.O. (Work Plan Task 1.5), entitled "Neural Network Technology for the Evaluation of Trip Reduction Programs" (Philip L. Winters, CUTR Senior Research Associate, Principal Investigator).


TABLE OF CONTENTS

Preface
Glossary of Terms
1. OVERVIEW OF NEURAL NETWORKS
   1.1 General Characteristics
   1.2 Training a Neural Network
   1.3 Data Preparation
   1.4 Layer Selection
   1.5 Other Types of Neural Networks
   1.6 Comparison of Neural Networks to Other Techniques
       1.6.1 Comparison to Expert Systems
       1.6.2 Comparison to Other Machine Learning Techniques
       1.6.3 Comparison to Multiple Linear Regression Techniques
2. EXISTING APPLICATIONS OF NEURAL NETWORKS
   2.1 Background
   2.2 Classification
   2.3 Decision-Making
   2.4 Pattern Recognition and Prediction
3. OPPORTUNITIES FOR OTHER APPLICATIONS OF NEURAL NETWORKS
   3.1 Background
   3.2 Proposed Applications
   3.3 Evaluation of Employer Trip Reduction Programs
       3.3.1 Data Description
       3.3.2 Neural Network Approach
       3.3.3 Data Preparation
       3.3.4 Neural Network Training
   3.4 Neural Network Application to Trip Reduction Benefits to Florida
4. CONCLUSION
5. BIBLIOGRAPHY


LIST OF FIGURES

Figure 1: Interconnectivity of an Artificial Neural Network
Figure 2: Typical Artificial Neuron
Figure 3: Existing Input-Output Data Relationships
Figure 4: Trained Neural Network
Figure 5: Two-Layer Feed Forward Network
Figure 6: Binary Data Preparation
Figure 7: Continuous Data Preparation
Figure 8: Human Expert
Figure 9: Expert System
Figure 10: Typical Analog Signal Forms for Selected Vehicle Classes
Figure 11: Three-Layer Neural Network Model for Route Choice
Figure 12: Neural Network for S&P Stock Index Futures Trading
Figure 13: Applications for Artificial Neural Networks
Figure 14: Neural Network for the Evaluation of Employee Trip Reduction Programs


PREFACE

Simply stated, neural networks simulate the abilities of the human brain. The brain is composed of hundreds of billions of cells called neurons, which are microscopic biological "computers" sending information back and forth to each other through connections. Neurons communicate with each other and are able to learn from experience (or example), not from programming. Artificial neural networks (ANNs), synonymous with neural networks, represent a form of computer intelligence and work very similarly to the human brain, but on a very reduced scale. By combining their ability to analyze and "learn" with a computer's ability to replicate the human brain's complicated network of neurons and connections (as well as its ability to process large amounts of data easily and quickly), ANNs can be "trained" to mimic the human brain on a computer.

However, just as we are not yet able to understand all the inner workings of the complex human brain, the inner relationships of ANNs are difficult to express in terms of rules. Indeed, one of the biggest drawbacks of ANNs is their inability to explain themselves. What is commonly known is that both the human brain and artificial neural networks are capable of learning through proper training and experience. We cannot determine exactly how an artificial neural network reaches its solution, but we can observe how it reacts to varying inputs in predicting outputs.

Artificial neural networks can enhance our capability of learning by more carefully studying the past. In particular, ANNs can be used to recognize patterns, predict trends, and find hidden relationships in data. The two most distinguishing (and most attractive) features of artificial neural networks are that they do not rely on rules or formulas and that they are tolerant of imperfect or incomplete data.
Artificial neural networks are being used today to make predictions learned from existing input and resulting output data in science, engineering, medicine, banking, management, marketing, manufacturing, and sports wagering. Many industry experts even predict that ANNs will be used in common household items by the year 2000.

The objective of this report is to introduce the lay transportation professional to the concept of artificial neural networks. A general overview identifies, discusses, and compares the key features of an ANN with other types of artificial intelligence systems. Existing and proposed application areas, both inside and outside the transportation industry, are presented. A glossary


has been included (following this preface) to introduce the lay transportation professional to common jargon included in this report regarding artificial neural networks. A comprehensive bibliography is included as a guide for recommended reading. Finally, this paper is intended to lend support for the particular intended application of ANNs -- the evaluation and selection of successful employee trip reduction programs.


GLOSSARY OF TERMS

adaptive gradient learning rule: a neural network training technique to minimize the error in classification type problems.

artificial neural network: a computer model made to simulate a biological neural system.

back propagation: a supervised learning technique in which an error signal is fed back through the network, altering weights as it goes, in order to prevent the same error from recurring.

feedback: a condition that occurs when a neuron's output signals are sent back to its own inputs or to the inputs of neurons in a previous layer or the same layer.

feed forward network: a network in which neurons take their inputs from the previous layer only and send their outputs to the next layer only.

hidden layer: a layer of neurons in an artificial neural network which does not connect to the real world, but only to other layers of neurons.

Kalman learning rule: a neural network training technique for type problems and noisy training data.

lateral inhibition: a network structure that allows only one neuron in a group to respond to an input.

neural network: highly dynamic, parallel system (mimicking the brain's neurons) that carries out information processing by means of its overall response to input.

neuron: a nerve cell in a biological nervous system, and a processing unit in a neural network. Each neuron has a number of inputs and a single output. The inputs are summed and compared against a threshold value which, when reached, generates an output signal equal to the neuron's activation value.


neuron pruning: elimination of neurons during training that are non-contributory.

noisy data: imprecise or irrelevant data that is present within input patterns.

radial basis function: a "bell-curve" (or Gaussian) type function used in hidden layer neurons.

sigmoid function: a transfer (S-shaped) function which has a high and a low saturation limit, and a proportionality range between each limit. This function is "0" when the neuron's activation value is a large negative number, and "1" when the activation value is a large positive number, with a smooth transition in between.

simulated annealing training method: a method of training that introduces randomness in the training process to avoid suboptimal weights.

supervised learning: a method in which an external influence tells the network if its output is correct. Weights are adjusted by using training sets or an actual observer until the network's output compares to the desired (actual) output.

threshold logic function: a binary function that outputs a "1" when its input is larger than a specific threshold; otherwise, the function outputs a "0".

training: a process which uses one of several learning techniques to modify weights in an orderly fashion.

training set (or pair): a list of paired input and desired output patterns used in supervised learning. All of the information the network needs to learn must be in the training set. The inputs can be numbers, symbols, or pictures.

unsupervised learning: a method of learning in which no external influence is present during training to tell the network whether its output is correct. No training sets are used and there is not an e>

1. OVERVIEW OF NEURAL NETWORKS

1.1 General Characteristics

Neural networks are essentially a group of highly interconnected and relatively simple computational units, as illustrated in Figure 1. Each of these computational units processes its inputs to produce a single output. The output of each unit is connected to the inputs of many other units through different weights.

[Figure 1: Interconnectivity of an Artificial Neural Network]

Figure 2 shows a typical artificial neuron (i.e., any one of the individual neurons in Figure 1), which adds all of its weighted inputs and uses a sigmoid output function to generate its output. In some cases, output functions other than sigmoid functions are used. For example, the threshold logic function is often used when binary functions are being implemented with neural networks.
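The neuron just described can be sketched in a few lines of Python. This is a minimal illustration rather than code from the report: the function names, the example weights, and the threshold level of 1.5 are assumptions chosen for the example.

```python
import math


def sigmoid(activation):
    """S-shaped output function: approaches 0 for large negative
    activations and 1 for large positive activations."""
    return 1.0 / (1.0 + math.exp(-activation))


def threshold_logic(activation, level=0.0):
    """Binary output function: 1 when the activation exceeds the level, else 0."""
    return 1 if activation > level else 0


def neuron_output(inputs, weights, output_function=sigmoid):
    """Add up the weighted inputs, then apply the output function."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return output_function(activation)


# A sigmoid neuron whose weighted inputs cancel sits halfway between its limits.
print(neuron_output([1.0, 1.0], [0.5, -0.5]))  # 0.5


# A threshold neuron implementing the binary AND function (hypothetical weights).
def and_gate(a, b):
    return neuron_output([a, b], [1.0, 1.0],
                         lambda s: threshold_logic(s, level=1.5))


print([and_gate(a, b) for a in (0, 1) for b in (0, 1)])  # [0, 0, 0, 1]
```

Swapping the output function is the only change needed to move from the smooth sigmoid response to a binary one, which is why the two cases share a single `neuron_output` helper here.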


[Figure 2: Typical Artificial Neuron (inputs, connection weights, artificial neuron, output)]

There are several reasons why neural networks have been used to solve a variety of problems. As indicated previously, the most significant capability of neural networks is that of learning previously unknown relationships directly from data. Indeed, neural networks are best suited for problems for which data already exist that associate a set of inputs and outputs, but for which no explicit relationships between them have yet been established or accepted. This situation is illustrated in Figure 3.

[Figure 3: Unknown Input-Output Relationship (observed data, conclusions)]


Since each neuron implements a non-linear mapping between its inputs and output, as shown previously in Figure 2, neural networks are capable of learning non-linear relationships that may exist in the data. These learned relationships can then be applied to previously unseen data from that problem domain, as illustrated in Figure 4. This makes neural networks adaptable and especially useful in environments where the relationships between inputs and outputs change over time.

[Figure 4: Making Conclusions from Raw Data (new data, conclusions made from new data)]

Another capability of neural networks that may be important in some applications is that they are fault tolerant. If an individual neuron fails, the performance of the entire network does not fail but instead slowly degrades in proportion to the number of failed neurons. This is in contrast to traditional algorithmic solutions where, if a single instruction fails, the entire solution fails.

Still another advantage associated with neural networks is that their individual units can function in parallel. The corresponding increase in speed that results from this can be used effectively in applications requiring real-time decision-making.


1.2 Training a Neural Network

Multi-layer, fully-connected, feed forward neural networks are the most popular ones. They consist of two or more layers of individual neurons where each neuron in a given layer receives inputs from all the neurons in the previous layer and its output is input to all neurons in the succeeding layer. A two-layer, fully-connected, feed forward network is shown in Figure 5 with an n-dimensional input and m-dimensional output. The middle layer, between the input and output layers, is called a hidden layer since its outputs are not directly accessible.

[Figure 5: Two-Layer Feed Forward Network (inputs, hidden neuron layer, outputs)]

The popularity of these types of neural networks is mostly due to two factors. First, a two-layer feed forward network has been shown to be capable of implementing any association between inputs and outputs. Second, there is a well defined method, called back propagation, to teach the network the relationships that exist in the data. Training a network using back propagation consists of finding the weight values so that the associations between input and output in an existing data set can be duplicated by the network. This, of course, implies that training with this method requires the existence of a data set for which correct outputs are known. This type of learning is called supervised learning.
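As a rough illustration of a two-layer feed forward network and of back propagation, the sketch below trains a small network on the exclusive-OR function, a standard non-linear association. It is not taken from the report: the network size (three hidden neurons), the learning rate, the number of training cycles, the use of bias weights, and the squared-error measure are all assumptions made for the example, and the result depends on the random starting weights.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_inputs, n_neurons, rng):
    # One weight per input plus a trailing bias weight, from a small uniform range.
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
            for _ in range(n_neurons)]


def layer_output(weights, inputs):
    # Each neuron sums its weighted inputs (plus bias) and applies the sigmoid.
    return [sigmoid(sum(w * x for w, x in zip(ws[:-1], inputs)) + ws[-1])
            for ws in weights]


def forward(hidden_w, output_w, inputs):
    hidden = layer_output(hidden_w, inputs)
    return hidden, layer_output(output_w, hidden)


def train_epoch(hidden_w, output_w, training_sets, rate):
    """Present every training set once; return the cumulative squared error."""
    total_error = 0.0
    for inputs, targets in training_sets:
        hidden, outputs = forward(hidden_w, output_w, inputs)
        # Error signal at the output layer (sigmoid slope is o * (1 - o)).
        out_delta = [(t - o) * o * (1 - o) for t, o in zip(targets, outputs)]
        # Feed the error signal back through the output weights.
        hid_delta = [h * (1 - h) * sum(d * output_w[k][j]
                                       for k, d in enumerate(out_delta))
                     for j, h in enumerate(hidden)]
        # Adjust the weights so the same error is reduced next time.
        for k, d in enumerate(out_delta):
            for j, h in enumerate(hidden):
                output_w[k][j] += rate * d * h
            output_w[k][-1] += rate * d
        for j, d in enumerate(hid_delta):
            for i, x in enumerate(inputs):
                hidden_w[j][i] += rate * d * x
            hidden_w[j][-1] += rate * d
        total_error += sum((t - o) ** 2 for t, o in zip(targets, outputs))
    return total_error


rng = random.Random(0)
xor_sets = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
hidden_w = init_layer(2, 3, rng)
output_w = init_layer(3, 1, rng)
errors = [train_epoch(hidden_w, output_w, xor_sets, rate=0.5)
          for _ in range(5000)]
print(f"error after first cycle: {errors[0]:.3f}, after last: {errors[-1]:.3f}")
```

The weights are updated after each training set rather than once per cycle; commercial packages may batch the updates or add refinements such as momentum, so this should be read as one simple variant.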


For back propagation training, the data set is usually divided into two groups -- one group for training the network and another group for testing how well the network has learned. At least 10 percent of the data is normally set aside for testing and is not used for training. There are, however, variations of this method, depending on the size of the data set and the size of the network.

A training set is the name given to one individual set of inputs with the correct (or desired) output. During training, each training set is presented to the network. If the output of the network differs from the correct output, then the weights of the network are changed. The back propagation method specifies what changes to make to the weights so that the difference between the actual and the desired network output is reduced. All of the training sets are presented to the network sequentially, and corrections are made. This constitutes one training cycle or epoch. Training a network requires many training cycles until the cumulative errors of all training sets for one epoch are below an acceptable level (i.e., error criteria as pre-defined by the neural network builder). The lower this number, the better the network is able to duplicate the associations between inputs and outputs in the training data.

In summary, back propagation uses the training sets repeatedly to bring the difference between the desired network output and the actual output to an acceptable level, and it does this by appropriately changing the values of the weights in the network. Normally, back propagation does not create or destroy neurons during training -- although some commercial software provides neuron pruning as an option.

It is expected that once the network is able to duplicate the associations between inputs and outputs in the training data, it will be able to produce correct outputs for input data not specifically included previously as part of the training data.
If this is indeed the case, then the network is said to have "learned" the relationships that exist in that problem domain between inputs and outputs and has stored those relationships in the weights of the network. Thus, we start with a situation as previously illustrated in Figure 3 and end up with the situation in Figure 4.

The purpose of setting aside data for testing is to verify that the network can respond correctly to inputs not previously seen but representative of the problem domain. It is possible to have a


network with a small training error but a large error when the test set is used. This often occurs because of too many neurons in the hidden layer, and the problem can be corrected by reducing their number, using the performance of the network on both the training and the test set as guidance.

1.3 Data Preparation

The number of training examples required to properly train a network depends on the size of the network and the accuracy needed. In addition, the training data should provide a good representation of the problem domain. Guidelines for collecting network training data and for preparing the data for input to the network have been published in several books; Introduction to Neural Networks by Jeannette Lawrence is especially helpful on this topic. When selecting a training set, the following guidelines for data selection and preparation can be used:

- The number of training examples should be greater than the number of hidden neurons divided by the error criteria.
- At least 10 percent of the training data should be set aside for testing the network.
- Noise (or small amounts of artificially induced errors in the data) may be added to the original training data to increase the size of the training set when necessary.
- If implementing a network to classify data into different categories, examples from each category should be included.
- If the network is to approximate a continuous function, data throughout the entire range should be included.
- No contradictions should exist in the training data. Also, data that has been unduly affected by extraneous factors should not be included.

After the data is collected and before it can actually be used to train the network, it must be transformed into a number. If the data can be represented by binary values, then this number will be "0" or "1" (sometimes the binary numbers used are -1 or 1).
Network inputs that represent attributes of a problem that can take on symbolic values are usually represented by binary values, as illustrated in Figure 6.
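Two of the data preparation steps in this section lend themselves to a short sketch: the rule of thumb for the size of the training set and the binary representation of a symbolic attribute. The function names and the three traffic speed categories below are hypothetical and serve only as an illustration; the report does not list the categories it uses.

```python
def min_training_examples(hidden_neurons, error_criterion):
    """Rule of thumb: the number of training examples should exceed
    the number of hidden neurons divided by the error criterion."""
    return hidden_neurons / error_criterion


def encode_symbolic(value, categories):
    """Represent a symbolic value as one binary input per category:
    1 for the matching category and 0 for all others."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == category else 0 for category in categories]


# With 10 hidden neurons and an error criterion of 0.1, the rule of thumb
# calls for more than about 100 training examples.
print(min_training_examples(10, 0.1))

# Hypothetical categories for a symbolic "traffic speed" attribute.
SPEED_CATEGORIES = ["slow", "medium", "fast"]
print(encode_symbolic("medium", SPEED_CATEGORIES))  # [0, 1, 0]
```

Using one input per category avoids implying an ordering or distance between symbolic values, which a single numeric code would do.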


[Figure 6: Binary Data Preparation (example input: traffic speed)]

Network inputs that represent attributes of a problem that can take on numerical values are usually represented by continuous values between "0" and "1", as illustrated in Figure 7. When the value range of a given attribute is very large, using the change recorded for that value instead of the actual value is better. This should be done only if it has been determined beforehand that it is the change in value that is significant for the problem at hand. Along the same lines, some data (e.g., photos, signals, pavement surface images, etc.) would need to be transformed using methods like Fourier transforms to reduce the dimensionality or to eliminate meaningless time variations in the original data.

If the goal of the neural network is to make a prediction based on what has happened in previous time periods, the network inputs should be set up as a separate set of inputs for each of the time periods included in the data.
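The numerical preparation steps described above can be sketched as three small helpers. The names, the assumed household size range of 1 to 8, and the sample series are illustrative assumptions, not part of the report.

```python
def scale_to_unit(value, low, high):
    """Map a value from the range [low, high] onto the range [0, 1]."""
    return (value - low) / (high - low)


def changes(series):
    """Replace actual values by the change from one period to the next."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def lagged_inputs(series, n_periods):
    """Build (inputs, target) pairs in which the values of n_periods
    consecutive time periods are the inputs and the next value is the target."""
    return [(series[i:i + n_periods], series[i + n_periods])
            for i in range(len(series) - n_periods)]


# A household size attribute with an assumed range of 1 to 8 persons.
print(scale_to_unit(1, 1, 8), scale_to_unit(8, 1, 8))  # 0.0 1.0

# A hypothetical series with a large value range, reduced to its changes.
print(changes([1200, 1260, 1245]))  # [60, -15]

# Three previous periods are used to predict the following one.
print(lagged_inputs([1, 2, 3, 4, 5], 3))  # [([1, 2, 3], 4), ([2, 3, 4], 5)]
```

In practice the low and high limits would be fixed from the training data and reused unchanged for test and new data, so that the same raw value always maps to the same network input.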


[Figure 7: Continuous Data Preparation. Example inputs, each scaled to the range 0 to 1: average trip length (1/2 mile to 25 miles), vehicle (1 to 4), household size (1 to 8)]

1.4 Layer Selection

After the data have been prepared, selecting the number of hidden layers and neurons in those layers is the next step in the training process. There are no clear guidelines on the number of hidden layers to select. Some suggest that one hidden layer is preferable because it is theoretically sufficient and improves training speed. Others suggest two hidden layers are preferable because the network architecture would fit the problem more effectively by allowing the first hidden layer to detect local features and the second hidden layer to detect global features from those local features.

The number of neurons in the hidden layers should be small in the beginning because general characteristics are more easily learned. After the network is trained, it should then be tested. A few hidden neurons should then be added and the network retrained and retested. This process should be continued as long as the error on the test set continues to decrease. This should ensure that the network does not memorize the training set, but that instead learns the general


relationships that exist between input and output. Other guidelines that are useful for training include the following:

- The learning rate should be smaller in the last layers than in the front layers.
- Initial weights should be uniformly distributed within a small range.
- The training examples should be presented to the network in a random fashion and not according to categories.

Fortunately, several neural network development packages available today automatically include these features (e.g., increasing the number of neurons in the layers) during training.

1.5 Other Types of Neural Networks

As mentioned previously, feed forward neural networks are the most popular configuration of networks because of the back propagation learning method developed for them. They also have an excellent track record of success in many different problem domains. There are, however, other types of neural networks. Some of the better known types are mentioned here. A detailed description can be found in Neural Networks by S. Haykin.

Hopfield networks differ significantly from feed forward networks in the use of feedback and the high degree of connectivity. In many applications of Hopfield networks, the output of every unit is connected as input to all other units (not just to those in the next layer, as is the case with feed forward networks). Because of feedback, stability is a concern for these networks. A way to calculate the interconnecting weights between units exists such that stability is assured in most cases. The calculation of these weights also serves to store specific patterns in the network. These patterns can be reproduced by the network when presented with a noisy or incomplete version of any of the patterns stored. This represents the main application for Hopfield networks, although they have also been used in optimization problems. The computational units use either the sigmoid or the threshold output function.
Their main disadvantage is that the number of patterns that can be stored in these networks is only about 15 percent of the number of computational units in the network.

1 Haykin, S., Neural Networks, MacMillan Press, New York, New York, 1994.
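The storage and recall behavior described for Hopfield networks can be sketched as follows. The sketch assumes threshold units with outputs of -1 or 1, weights computed as a sum of products of pattern values (a common textbook choice, not necessarily the calculation the report has in mind), and units updated one at a time. The two 16-unit patterns are invented for the example.

```python
def store_patterns(patterns):
    """Calculate the interconnecting weights from the patterns to store.
    Units are not connected to themselves (zero diagonal)."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def recall(weights, state, max_sweeps=10):
    """Update the units one at a time until no unit changes its output."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i, row in enumerate(weights):
            weighted_sum = sum(w * s for w, s in zip(row, state))
            new_value = 1 if weighted_sum >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Two stored patterns in a 16-unit network (hypothetical data).
pattern_a = [1] * 8 + [-1] * 8
pattern_b = [1, -1] * 8
weights = store_patterns([pattern_a, pattern_b])

# Present a noisy version of pattern_a with two units flipped.
noisy = list(pattern_a)
noisy[0], noisy[5] = -noisy[0], -noisy[5]
print(recall(weights, noisy) == pattern_a)  # True
```

Two patterns in sixteen units keeps the example within the storage limit quoted above; with many more patterns, recall would be expected to return mixtures or wrong patterns.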


Boltzmann networks are similar to Hopfield networks in the use of feedback. However, they differ from Hopfield networks in that the computational units are probabilistic threshold functions. These units output a "0" or a "1" with a certain probability that is a function of the weighted sum of their inputs. A simulated annealing training method exists for these networks, which allows them to more accurately learn some relationships in the data. Their main disadvantage is slowness, not only in training the network but also during normal operation of the network.

RCE networks are proprietary networks from Nestor, Inc. (Providence, RI). They are feed forward networks that automatically increase the number of neurons in the hidden layer as needed in order to learn from the training data. The computational units used by RCE networks are somewhat different from the standard sigmoid or threshold units mentioned before. An RCE computational unit responds positively only to data points within its area of influence. These areas shrink or new units are created automatically during training.

Another significant type of network is a Radial Basis Function network. These are feed forward networks that use a radial basis function instead of the sigmoid function in their hidden layer neurons. The functions are of a Gaussian form and they are used only in the hidden layer. The output layer is made up of neurons with a linear output function. These networks are especially suited for non-linear regression types of problems (e.g., stock market predictions, credit ratings, average vehicle ridership).

All of the different types of networks mentioned above can be included under the general category of "supervised" learning networks.
These networks require supervision during training in the sense that the training data collected contain the correct associations between input and output that the networks are supposed to learn. There are other types of networks that do not require that the training data include the responses that the network is to duplicate. For this reason, these networks are called "unsupervised" learning networks. Unsupervised learning networks group together inputs that belong to the same class by adjusting the interconnecting weights in the network. The size of the groups that the network decides upon depends on user-defined parameters. These networks are especially useful in applications where the categories into which the data should be grouped are not known beforehand, since they can detect naturally occurring groups. They are not as broadly used as supervised networks and


consequently, only the two most popular ones, ART and Kohonen networks, are mentioned below.

ART (Adaptive Resonance Theory) groups input data into different sets according to a vigilance parameter that is externally set. It uses lateral inhibition among the neurons in the same layer and automatically creates new nodes for new groups detected in the input data.

Kohonen networks, on the other hand, attempt to create topographic maps of the input data. The output neurons are usually placed in a one- or two-dimensional arrangement. After training, the relative location of the neurons in the network corresponds to specific features present in the input data. Kohonen networks are also called self-organizing feature maps.

1.6 Comparison of Neural Networks to Other Techniques

1.6.1 Comparison to Expert Systems

Expert systems are computer programs that can solve problems in a manner similar to how human experts solve them. They have been used successfully now for more than 20 years on a wide variety of problems. A wide range of expert system applications can be found in The Rise of the Expert Company by E. Feigenbaum, P. McCorduck, and H. Penny Nii.2 Expert systems are most often developed with the help of an expert system development tool. These commercially available tools provide software structures that can capture the knowledge that human experts use to solve a specific problem. Most often the expert's knowledge is in the form of "IF-THEN" rules, but it can also be in terms of object descriptions and procedures that allow interaction between these objects.

There are some significant differences between expert systems and neural networks. Probably the most critical difference between these two techniques is that expert systems require that the relationships between the input data and the conclusions to be derived from that data be established before the expert system is built.
It is usually a human expert who has already established these relationships and who uses them to solve problems in a given problem domain. This is illustrated in Figure 8.

2 E. Feigenbaum, P. McCorduck, and H. Penny Nii, The Rise of the Expert Company (New York: Times Books, 1988).

That human expert knowledge is identified through a series of


interviews conducted by a knowledge engineer and then programmed into the expert system. The neural network, as was mentioned previously, does not need to have the relationships between inputs and outputs already established since it can capture those relationships directly from the data. Thus, a neural network needs the data from which it can uncover the relationships, while the expert system needs the expert who has already "learned" those relationships.

[Figure 8: Human Expert (data, conclusions made by expert)]

Another important difference between the two techniques is the way in which the knowledge of the system is encoded. After training, neural networks encode their knowledge of relationships in terms of weight values and in the interconnection between the neurons. Expert systems encode their knowledge in terms of rules, object descriptions, and procedures, as illustrated in Figure 9. It is very difficult for a developer to understand the relationships learned by a neural network; it is much simpler to identify the knowledge used by an expert system. Because of this, some people call neural networks "black boxes" and expert systems "white boxes." Some expert systems are developed such that they can actually explain how a conclusion was reached. As previously noted, this would be a difficult feat for a neural network to accomplish.
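To make the "white box" point concrete, the sketch below shows how knowledge held as IF-THEN rules can report which rule produced a conclusion. The rules and speed thresholds are invented for illustration and do not come from the report or from any real expert system tool.

```python
# Hypothetical rules for illustration only; the speed thresholds are invented.
# Rules are checked in order, and the first rule whose condition holds fires.
RULES = [
    ("R1", "IF average speed < 20 mph THEN traffic is congested",
     lambda facts: facts["speed_mph"] < 20, "congested"),
    ("R2", "IF average speed < 45 mph THEN traffic is moderate",
     lambda facts: facts["speed_mph"] < 45, "moderate"),
    ("R3", "IF no other rule applies THEN traffic is free-flowing",
     lambda facts: True, "free-flowing"),
]


def conclude(facts):
    """Return the conclusion of the first rule that fires, together
    with the text of that rule as the explanation."""
    for name, text, condition, conclusion in RULES:
        if condition(facts):
            return conclusion, f"{name}: {text}"


conclusion, explanation = conclude({"speed_mph": 32})
print(conclusion)   # moderate
print(explanation)  # R2: IF average speed < 45 mph THEN traffic is moderate
```

A trained neural network performing the same classification would hold this knowledge only as numeric weights, so no comparable rule text would be available to explain its answer.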


Updating the knowledge in the system is another area where neural networks and expert systems differ. If the problem domain changes and new knowledge is required, this knowledge must be obtained from the human expert and carefully crafted into the already existing software knowledge structures of the expert system. A neural network, on the other hand, would need input that reflects the changes in the problem domain, with the corresponding conclusions that can be drawn from that data, in order to retrain itself. Retraining must include not only the new data but all of the old data that are still representative of the problem domain. If the change required of the knowledge within the system is minor (for example, one or two rules), then it is easier to update the expert system than the neural network. However, if more significant changes are needed, then retraining the neural network would require less time and effort than acquiring the knowledge from the expert and changing the expert system. Along the same lines, the updating of a neural network can be automated relatively easily in comparison with an expert system.

Figure 9: Expert System (new data -> expert system [procedures, IF-THEN rules, objects] -> conclusions made from new data)

Among the similarities between expert systems and neural networks are that both techniques are used to reach conclusions using some input data from a wide variety of problem domains. Both techniques require that a set of data be collected to carefully test the systems after they are built, with the neural network requiring an additional set of data for training.


1.6.2 Comparison to Other Machine Learning Techniques

There are other machine learning techniques in addition to neural networks. The differences and similarities between them are discussed below.

Neural networks form a category of learning techniques called "connectionist." This term emphasizes the dependency of neural networks on the connectivity of a large number of computational units. Other machine learning techniques rely on the manipulation of symbols used to create rules similar to the "IF THEN" rules used widely in expert systems and are grouped under the category of "symbolic" learning techniques. Thus, one of the main differences between neural networks and these other symbolic learning techniques is in the form of the knowledge that they learn from the data presented to them.

Another important difference is in the range of problem domains that they can effectively deal with. Symbolic learning methods deal mostly with classification problems, that is, the assignment of a class label to an object or situation based on the specific values of a set of parameters. Neural networks deal with a broader range of problems. They can learn not only to classify data into different categories but to predict the numerical value of outputs (e.g., level-of-service classification based on volume-to-capacity ratios or average travel speeds), serve as a real-time system controller, learn to recognize spoken words, learn to interpret a visual image, etc.

Probably the most important similarity between neural networks and symbolic learning methods is that they both require a set of representative data from the problem domain in order to learn the relationships that exist between inputs and outputs. There is no need for explicit knowledge of these relationships as long as training and testing data exist.
1.6.3 Comparison to Multiple Linear Regression Techniques

Linear regression techniques have been used extensively to predict the value of dependent variables based on a combination of independent variables. Linear regression techniques use a linear combination of independent variables. The term "linear" refers not to the form of the independent variables but to the regression coefficients in the equations below. A detailed


examination of regression can be found in Classical and Modern Regression with Applications by R. Myers.³

A simple linear regression model is of the form:

$$y = a_0 + a_1 x$$

where $a_0$ and $a_1$ are regression coefficients, "x" is the regressor (independent) variable, and "y" the response (dependent) variable. A multiple linear regression model is of the form:

$$y = a_0 + a_1 x_1 + a_2 x_2 + \cdots + a_n x_n$$

Even if the regressor variables appear as a polynomial, the model is considered linear because it is linear in terms of the coefficients.

Given a set of data associating the value of "y" with the values of the "x"s, the objective of these regression techniques is to find the regression coefficients ($a_i$, $b_i$, etc.) that minimize the squared error between the observed values of "y" and those calculated by the equations above.

Therefore, it can be seen that regression techniques and neural networks have in common the need for collecting data that associate inputs and outputs in the problem domain of interest. There is no need for a human expert to establish beforehand the explicit relationships between input and output; the equations or the network will attempt to establish them. Another important similarity is that they both attempt to minimize the difference between the observed values of the output and the computed value of the output.

A significant difference between neural networks and regression techniques is that regression techniques require that the form of the equation be selected beforehand. After this selection is

³ R. Myers, Classical and Modern Regression with Applications, PWS-KENT, 1990.


made, the regression method will calculate the regression coefficients in these equations. Given a specific neural network, the back propagation training method will help determine the form of the equation as well as the coefficients. Because of this basic difference, most neural network practitioners claim that neural networks provide a more general framework than regression methods to establish the relationships between inputs and outputs. However, some disagree with this statement, as noted in the paragraph below.

Some argue that changing the number of neurons in the hidden layer of a neural network in order to obtain the best network for a given set of data is, in some manner, equivalent to changing the form of the regression equation to obtain the best equation that fits the given data. There appear to be no theoretical or empirical results that provide a definitive answer to this argument, although some published results are available. Carpenter⁴ reported obtaining comparable results using two-layer (one hidden) neural networks and polynomial approximations when estimating the minimum volume of a five-bar truss. However, studies by Dutta and Shekhar⁵ claim that neural networks consistently outperformed linear regression models in predicting bond ratings. Coy et al.⁶ claim that neural networks outperformed linear regression models, using both linear and non-linear functions of the independent variables, in predicting returns for Initial Public Offerings.

Neural networks and linear regression techniques are not mathematically equivalent. As can be observed from the two regression equations above, the coefficients in these equations are linear and, therefore, these equations provide a linear combination of some function (linear or non-linear) of the independent variables.
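As a hedged illustration of what "linear in the coefficients" means, the sketch below fits a model whose regressors are a polynomial in x using ordinary least squares. The data, the quadratic form, and the helper name `fit_least_squares` are assumptions made for this example; the form of the equation is chosen by hand beforehand, which is exactly the step the text says regression requires.

```python
def fit_least_squares(rows, y):
    """Solve the normal equations (X^T X) a = X^T y by Gauss-Jordan elimination."""
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(n)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(n):
            if k != col:
                factor = A[k][col] / A[col][col]
                A[k] = [a - factor * b for a, b in zip(A[k], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

# Invented data generated from y = 1 + 2x + 3x^2 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1 + 2 * x + 3 * x * x for x in xs]

# The analyst picks the form in advance: regressors are 1, x, and x^2.
# The model is non-linear in x but linear in the coefficients a0, a1, a2.
design = [[1.0, x, x * x] for x in xs]
coeffs = fit_least_squares(design, ys)
```

Because the squared term is treated as just another regressor, the same least-squares machinery applies unchanged; only the columns of the design matrix differ from the simple linear case.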
Neural networks, on the other hand, provide weights that represent non-linear functions of the input variables. For example, the neuron outputs of the first hidden layer are non-linear functions of the input variables. The outputs of the next layer would then be a non-linear combination of non-linear functions of the input variables. In general then,

the "y" obtained from a linear regression technique is not equivalent to an output taken from a neuron of the second (or higher) layer of a neural network.

2. EXISTING APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS

2.1 Background

Research and development in artificial neural networks has been ongoing since the early 1960s. In their early history, ANNs were used experimentally and gained a reputation as technical solutions in search of a problem. However, ANNs currently have been proven successful in many practical applications. This trend is expected to continually increase due to rapid advancements in specialized ANN software and hardware.

Artificial neural networks are known to be good at classification, evaluation, optimization, decision-making, pattern recognition, behavior trend prediction, image analysis, filtering, and modeling control systems. Even though these applications are diverse and seem to have nothing in common, they all can be learned by comparing known inputs and resulting outputs for a large number of examples.

Hundreds of applications of ANNs exist, many often without public acknowledgment to preserve a competitive advantage. For example, NASA is using ANNs to teach robots to pick up randomly placed objects; General Dynamics has developed an underwater listening system to distinguish different types of vessels by the sound of their engines; the US Air Force is using an ANN to develop a flight simulator for training new pilots on the ground; Ford Motor Company has an ANN that interprets sensor data from engines to diagnose problems; and a bomb detector at the TWA terminal at New York's JFK airport uses an ANN.

Despite being in relative infancy, ANNs are being utilized in many more applications, such as recognizing and classifying cancerous cells, detecting structural flaws in pavements,
recognizing speech and handwriting, analyzing loan applications, predicting interest rates, assessing real

⁷ D. Hammerstrom, "Neural Networks at Work", IEEE Spectrum, June 1993: 26.


estate, and predicting the occurrence of everything from solar flares to Sudden Infant Death Syndrome.⁸

As mentioned previously, neural networks are often used when the rules or relationships of data are unknown or difficult to explain. A neural network can be used to solve a problem with only a general understanding of what the important factors are, without knowing which factors are more important than others or how each factor is related to the others. Basically, if problem-solving is difficult to put into precise calculations and does not require exact answers (only quick, good ones), then ANN applications can be quite suitable. More detailed descriptions of ANN applications in classification, decision-making, and pattern recognition/prediction are provided below.

2.2 Classification

An ANN has been used to classify roadways into "rural" and "urban" categories by being trained to recognize on-road and off-road objects from colored two-dimensional images (photos or videotape). Given a set of preclassified examples, the ANN learned the ideal segmentations of all objects associated with each roadway category. Objects included vegetation, road surface, road markings, type of buildings and their locations, and road signs. Non-ideal segmentations of objects to roadway class (i.e., irregularities or uncommon features) were not examined; however, the ANN was found to correctly classify a road by recognizing 71 percent of the on-road objects and 63 percent of the off-road objects.

Vehicle classification is one of the primary tasks of traffic counting. Using the analog signals (unique for each class of vehicle) from an inductive loop detector, an ANN was trained to successfully classify seven different vehicle types. The analog loop signals were pre-processed to form features that included 20 unique character elements. ANN training was conducted with a sample of about 1,400 field observations.
The vehicle length was included during the ANN

⁸ Jeannette Lawrence, Introduction to Neural Networks: Design, Theory, and Applications, California Scientific Software Press, 1993: 7-9, 21-23.
⁹ W.P. MacKeown, "Engineering Applications of Artificial Intelligence", Volume 7, No. 2, 1994, University of Bristol, United Kingdom: 169-176.


training as a limiting factor. Figure 10 below indicates comparative analog signal forms.

Figure 10: Typical Analog Signal Forms for Selected Vehicle Classes¹⁰ (panels: Car, Truck, Bus, Truck+semitrailer)

As time passes, traffic markings on roadways can become damaged or stained by normal wear and tear, and become unrecognizable (i.e., non-functional). A system using ANN to classify the type of markings and to rank their degree of visibility was examined and compared to conventional human inspection techniques. Scheduling the re-marking activities had been done in a fairly subjective and inconsistent manner. ANNs were found to be a very suitable tool in bridging human-oriented and computer-oriented tasks by automating the management of this activity. Following training of the ANN, recognition was faster and interruption of traffic during recognition was virtually eliminated, with only one pass of the video-recording vehicle being required.¹¹

¹⁰ M. Pursula and P. Pikkarainen, "A Neural Network Approach to Vehicle Classification with Double Induction Loops", Proceedings of the 17th Australian Road Board Conference, Part 4: 33.
¹¹ J. Hua and F. Ardeshir, "Traffic Mark Classification Using Artificial Neural Networks", Department of Civil Engineering, University of Delaware, Newark, Delaware, 1m.


2.3 Decision-Making

As the magnitude and patterns of travel demand increase and change over time, various transportation improvements may be considered for enhancing capacity and improving mobility. The selection and timing of these improvements becomes a particular problem, especially when the costs and benefits are interdependent and improvements have multi-period planning horizons. The Calvert County, Maryland highway network was used as a training example for an ANN used to select improvement projects based on total travel times resulting from combinations of projects and implementation schedules/sequences.¹² Past decision-making in project selection and resulting link travel times were used as training pairs for the ANN learning process. Compared to conventional modeling for traffic distribution and assignment, which tends to contain fixed relationships in estimating tripmaking characteristics, an ANN approach was found to estimate more accurate link travel times more quickly, using fewer input variables for complex network evaluation. ANNs were also found to better account for aggregate effects of minor and major transportation network changes.

The behavior of gap acceptance (i.e., when a driver will choose to turn out into the flow of traffic from a "minor" cross-street) by vehicles at intersections with stop signs involves the complex interaction of geometric, traffic, environmental, and human factors. Acceptance or rejection of gaps by the "minor" cross-street traffic stream affects the capacity and delay analysis for stop-controlled intersections. An ANN was developed to predict driver decision-making in accepting or rejecting gaps at rural, low-volume, two-way, stop-controlled intersections. The results were compared to the decision-making reliability of a binary logit model for the same application. A total sample size of 5,230 field observations was used as ANN training input and output.
Inputs were turning movements in major direction, turning movements in minor direction, queue in minor direction, stop type (complete stop or rolling), presence of vehicle(s) at the opposite approach, speed in major direction, waiting time, and size of gap. Outputs were acceptance or

¹² C. Wei and P. Schonfeld, "An Artificial Neural Network Approach for Evaluating Transportation Network Improvements", Journal of Advanced Transportation, Vol. 17, No. 2, Institute for Transportation at Duke University, Durham, North Carolina, 1994: 129-151.


rejection. The ANN performed at an accuracy rate of 88 percent, compared to 79 percent for the binary logit model.¹³

Route choice is dependent on personal driver characteristics as well as known characteristics of the alternative routes. A driver's decision-making regarding route selection can be reasonably modeled based on data collected from recent experiences. Advanced Traveler Information Systems (ATIS) currently being developed for use under congested freeway conditions (roadside or in-vehicle messaging) that are intended to provide real-time advice to drivers will also have to correlate expected driver behavior under congested conditions in order to optimize or balance mobility and travel demand within a travel corridor. In this application, drivers were subjected to 32 days of simulated advice (e.g., when to remain on the freeway versus exiting at a particular side road) at an average level of 75 percent accuracy. The ANN was trained to predict driver decision-making under various scenarios of speed, delay, and accuracy of advice so as to better estimate the benefits of advanced traveler information systems. Figure 11 illustrates the ANN for this analysis.

It was found that most drivers made route choices based on their recent experiences. Acceptance of advice from ATIS, particularly if recently given inaccurate information, was very low under various prevailing speed-delay conditions. However, over time, drivers began to follow advice from the ATIS if it had provided accurate information over recent consecutive tripmaking events. This type of ANN application can also be used in reverse fashion to optimize ramp metering rates in a freeway system.

¹³ P.D. Pant, "Neural Network for Gap Acceptance at Stop-Controlled Intersections", Journal of Transportation Engineering, American Society of Civil Engineers, Vol. 120, No. 3, May/June 1994: 432.


Figure 11: Neural Network Model for Route Choice¹⁴ (input layer variables: current advice; side road speed and delay; agreement; satisfaction; freeway speed and delay. Hidden layer. Output layer: 1 = freeway, 0 = side road)

2.4 Pattern Recognition and Prediction

In the application of pattern recognition and prediction, several areas have been evaluated. In particular, this paper will discuss: (a) traffic flow forecasting, (b) traffic incident detection, (c) estimating origin-destination traffic patterns, (d) collision avoidance, (e) pavement evaluation, and (f) banking and investment options, one of the original applications of neural networks.

¹⁴ H.Y. Yang et al., "Exploration of Driver Route Choice with Advanced Traveler Information Using Neural Network Concepts", Institute of Transportation Studies, University of California-Davis, presented at the 1993 Annual Meeting of the Transportation Research Board, Washington, D.C.


Given a past series of conditions that have led to specific patterns of traffic flow, an ANN can be trained to predict the occurrence of congested roadway links. This is particularly applicable when large, complex data sets need to be quickly analyzed. Both systemwide and area-specific results can be obtained, using data easily captured by sensors but infrequently combined for analysis. However, great care must be taken to ensure that the training data is not biased in any way and is truly a representative sample. For example, the Virginia Department of Transportation used an ANN model to forecast peak 15-minute traffic volumes. Results were compared to three other (more conventional) modeling techniques, and the ANN was the second most accurate compared to actual counts.¹⁵

With a reasonable capability to predict traffic flows and resulting areas of recurring congestion, an ANN can also be applied to automatically detect traffic incidents resulting from recurring congestion. Efficient freeway management, particularly timely response to incidents, must rely on quickly utilizing all available traffic surveillance data (video and numeric). Real data from previous incidents (speeds, vehicle densities, ramp volumes, etc., by time of day and location) must be used for training to reduce the rate of "false alarms."

The ability to accurately predict origin-destination travel patterns is fundamental to solving any transportation problem. Cause and effect vectors (origin-destination matrix) can be used to train the ANN model in recognizing travel patterns in response to previous changes in population, income, major trip attractors (shopping malls, schools, recreational areas, etc.), transit service, development density, vehicle registration, location of available facility capacity, etc. Resulting traffic distribution patterns can be predicted with ANN because the "weights" of the ANN are assumed to represent the elements of the origin-destination matrix.
Experiments by Nesser, Hoefflinger, and Nijhuis in Germany in 1991¹⁶ showed that collision avoidance problems can be successfully tackled by neural networks. A neural network was trained from seven different patterns for collision avoidance to provide vehicle control and enhance driver decision-making maneuvers to stay on the roadway. The ANN model was also

¹⁵ B. Smith and M. Demetsky, "Traffic Flow Forecasting for Intelligent Transportation Systems", Virginia Transportation Resear

able to correct for adverse conditions (high cross winds and slippery road surface) to maintain vehicle control and avoid hitting fixed objects along the roadside. Vehicles were equipped with 13 sensors, each recording relative position and direction according to changing speed and steering angle. Intelligent cruise control systems now being developed rely on neural networks to recognize significant or sudden vehicle headway changes and automatically adjust throttling to restore conditions to the "desired" headway (separation from lead vehicle).

The severity and extent of pavement surface distress conditions can become a subjective evaluation process. Pavement management systems require consistent decision-making based on accurate evaluation. Artificial neural networks have been utilized to classify severity and distress from pixel pavement images acquired in the field. Distress levels were compiled for alligator cracking, transverse and longitudinal cracking, and utility cuts (patched areas). These relative comparisons are then utilized to rank and schedule pavement restoration activities. In this particular application of ANNs, computer vision technology and expert systems are integrated in the evaluation and assessment process.

Additionally, the application of ANNs in pavement deterioration modeling has been demonstrated to be a reasonable decision-making tool when a large database on pavement conditions is available. The ANN model is trained to predict deterioration based on various samples of pavement condition data (inputs) that correspond to pavement roughness coefficients (outputs). Road roughness is defined as the deviation from a true planar surface with characteristics that affect ride quality, vehicle dynamics, and drainage. Predicting the progression of roughness during the design life of pavements is critical to pavement management decision-making.
This application of ANNs offers advantages over conventional statistical methods because ANNs can learn from historical pavement condition data. However, additional work is needed to identify which pavement condition variables best characterize pavement roughness.

Within the banking and financial forecasting industry (where ANNs have been applied quite rigorously), many applications for ANNs have been tested. ANNs have been used to predict corporate bond ratings where conventional mathematical modeling techniques have yielded poor results and it was difficult to apply other rule-based artificial intelligence systems. A two-layered and three-layered ANN predicted 83 percent of the AA bonds correctly.¹⁸

ANNs have been used as a technique for forecasting currency exchange rates, generally viewed as unforecastable because the relationships between financial markets and the global economic system are not well understood. For example, ANNs were able to reveal that hourly (not daily) changes in exchange rates improved forecasting success.¹⁹

ANNs have been used as a tool to predict corporate bankruptcy. Financial data were collected on 59 firms that failed during a three-year period, and additional data were gathered on 59 corresponding firms that did not fail during the same period. Twelve performance ratios (e.g., net profits on net sales) were developed for the 118 firms as input to the ANN model. Compared to a discriminant analysis, ANN correctly predicted the firm's success or failure 12 percent more accurately in the first year, 50 percent more accurately in the second year, and 33 percent more accurately in the third year.²⁰

ANNs have been applied to Standard and Poor's 500 stock index futures to predict market behavior over different training and testing periods. The best ANN model achieved an average contract gain of $63,308 over the 1994 simulated trading period.²¹ Figure 12 illustrates the 8-node input layer, 8-node hidden layer, and single-node output layer for the aforementioned ANN model. Input variables include opening price, high price, low price, closing price, moving average, relative strength index, rate of change, and market (trading day) breakdown. Output is categorized as long (buy) or short (sell). The trained network was able to predict 158 of the 253 (or 62.5 percent) correct trading decisions.

¹⁷ N. O. Attoh-Okine, "Predicting Roughness Progression in Flexible Pavements Using Artificial Neural Networks", Proceedings of the Third International Conference on Managing Pavements, Vol. 1, May 22-26, 1994: 55.
¹⁸ S. Dutta and S. Shekhar, "Bond Rating: A Non-Conservative Application of Neural Networks", Computer Science Division, University of California-Berkeley, 1988.
¹⁹ E.W. Tyree and J.A. Long, "Forecasting Currency Exchange Rates: Neural Networks and the Random Walk Model", Department of Business Computing, City University of London, United Kingdom, Software Engineering Press, 1995: SJ-61.
²⁰ D.T. Cadden, "Neural Networks and the Mathematics of Chaos-An Investigation of Three Methodologies as Accurate of Corporate Bankruptcy," Management Department, Quinnipiac College, Hamden, Connecticut, 1991: 52.
²¹ J.H. Choi et al., "Trading S&P 500 Stock Index Futures Using a Neural Network," Dankook University, Cheonan, Korea, Software Engineering Press, 1995:

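The 8-8-1 layout described for the S&P 500 futures model can be sketched as a plain forward pass. This is a minimal, untrained sketch under stated assumptions: the sigmoid activation, the random weights, the 0.5 decision threshold, the input scaling, and all function names are hypothetical and are not taken from the cited study; only the eight input names, the layer sizes, and the 158-of-253 figure come from the text.

```python
import math
import random

# Input variables as listed in the text; names are spelled out for readability.
INPUT_NAMES = ["open", "high", "low", "close", "moving_average",
               "relative_strength_index", "rate_of_change", "market_breakdown"]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_layer(n_inputs, n_neurons, rng):
    """Each neuron is (weights, bias); values are random, i.e. untrained."""
    return [([rng.uniform(-1, 1) for _ in range(n_inputs)], rng.uniform(-1, 1))
            for _ in range(n_neurons)]

def forward(layer, inputs):
    """Weighted sum plus bias for every neuron, passed through the sigmoid."""
    return [sigmoid(bias + sum(w * x for w, x in zip(weights, inputs)))
            for weights, bias in layer]

rng = random.Random(0)                 # fixed seed so the sketch is repeatable
hidden_layer = make_layer(8, 8, rng)   # 8 inputs -> 8 hidden nodes
output_layer = make_layer(8, 1, rng)   # 8 hidden nodes -> 1 output node

def predict(sample):
    x = [sample[name] for name in INPUT_NAMES]
    hidden = forward(hidden_layer, x)
    (score,) = forward(output_layer, hidden)
    # Assumed convention: scores above 0.5 mean "long", otherwise "short".
    return ("long" if score > 0.5 else "short"), score

sample = {name: 0.5 for name in INPUT_NAMES}   # invented, pre-scaled inputs
decision, score = predict(sample)

# Hit rate reported in the text: 158 correct out of 253 trading decisions.
hit_rate = 158 / 253
```

With random weights the decision is meaningless; the point is only the shape of the computation, where training would adjust the 72 hidden-layer and 9 output-layer parameters.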

Figure 12: Neural Network for S&P Stock Index Futures Trading (inputs: Open, High, Low, Close, MA, RSI, ROC, MB; output: Long/Short)

3. OPPORTUNITIES FOR OTHER APPLICATIONS OF NEURAL NETWORKS

3.1 Background

The previous section of this paper showed that the best applications for neural networks involve situations where there is a need for classification, decision-making, or pattern recognition and prediction. Within the transportation industry, there are certainly application areas involving these situations that have not yet been fully explored. The report will briefly discuss some of these potential areas and conclude with a detailed description of the intended new application for artificial neural networks: selection and evaluation of the most feasible employee trip reduction program(s).

²² Ibid.


3.2 Proposed Applications

Figure 13 recaps the previously discussed transportation applications for ANNs and includes suggestions for other not-yet-tested transportation applications.

Figure 13: Applications for Artificial Neural Networks

Previously Tested Suggested for T

specific site developments, and predicting the magnitude of through traffic growth are briefly discussed below.

As new developments emerge, their specific traffic generation characteristics are typically treated in a macroscopic sense (i.e., areawide traffic generation). The contribution of new traffic attributed to a specific development is often contested. Traffic impact reports are needed for growth management practice in Florida (see Chapter 163 of the Florida Statutes) in assessing impact fees and access design requirements. Also, the magnitude of newly generated traffic must be distinguished from passer-by traffic, which has traditionally been very difficult to assess. Based on "before" and "after" site development traffic data (counts, driver surveys, visual observations, etc.), ANNs could be utilized to predict this type of traffic.

When considering areawide traffic forecasts, the changing origin-destination patterns for through traffic (traffic with no origin or destination in the study impact area) are too unwieldy to keep current. Computer traffic models often rely on travel patterns established from travel surveys conducted decades ago. The magnitude of through-area traffic becomes even more difficult to predict because the smaller the study area becomes, the larger the external impact area may become. Through traffic estimates tend to rely on straight extrapolation relationships based only on population-to-border crossing traffic ratios. ANNs can take "before" and "after" socio-economic conditions (within the immediate study area and the external impact area) associated with the resulting travel data to better predict these patterns without the use of expensive travel and cordon line surveys.

Two other aspects of transportation modeling could also lend themselves to neural network application: estimation of mode choice and vehicle occupancy.
Both of these modeling variables can better be predicted at the household (disaggregate) level using a wider assortment of existing socio-economic and demographic data that may not be conducive to the pre-determined relationships established by the probabilistic and multinomial logit models typically used for this estimation process (e.g., when a new future mode or vehicle type is introduced).

Finally, tens of thousands of large (100+ employees) companies have attempted to initiate programs to reduce the number of single-occupant vehicle commute trips in response to public air quality regulations. The ability to accurately predict which type of program(s) would be the


most effective in reaching a trip reduction target has been a challenge, to say the least. Linear regression techniques have been utilized in the past, but only with marginal success. The application of ANNs is ideally suited to solving this particular dilemma. The discussion that follows will describe the background and initial details of this application to date.

3.3 Evaluation of Employee Trip Reduction Programs

The leading source of air pollution in many of the more than 100 urbanized areas across the United States with air quality problems is vehicle emissions. In an effort to curb vehicle emissions, the Clean Air Act Amendments (CAAA) of 1990 have targeted large employers in the areas with the worst air quality, requiring them to develop strategies to reduce the number of employees that drive alone to work. The CAAA requires employers in areas designated as severe or extreme non-attainment areas to develop trip reduction plans to increase the average vehicle ridership (AVR) by 25 percent over the area's baseline.

According to the Association for Commuter Transportation, over 30,000 employers at nearly 35,000 worksites employing more than 13,000,000 employees are subject to these trip reduction requirements contained in the CAAA. Employers face substantial penalties for not registering, failing to file a plan, or failing to implement a plan. Some areas even propose to fine employers for failure to reach their designated vehicle ridership target.

Employer policies related to work location, work schedules, and parking strongly influence transportation mode choice decisions made by employees. The challenge for employers is to develop a plan that is responsive to the business needs of the organization as well as to the CAAA regulations. The challenge to the regulating agency is to apply fair and consistent guidance to employers in the development of plans that can be met in a cost-effective manner.
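The 25 percent AVR requirement can be worked through with a short sketch. The report does not define how AVR is computed; the code assumes the commonly used definition (employees arriving at the worksite divided by the vehicles they arrive in), and the worksite counts, the baseline value of 1.2, and both function names are invented for illustration.

```python
def average_vehicle_ridership(employees_arriving, vehicles_arriving):
    """AVR under the assumed definition: employees per arriving vehicle."""
    if vehicles_arriving <= 0:
        raise ValueError("vehicles_arriving must be positive")
    return employees_arriving / vehicles_arriving

def avr_target(baseline_avr, increase=0.25):
    """Target AVR: the area's baseline raised by 25 percent, as the text states."""
    return baseline_avr * (1.0 + increase)

# Hypothetical worksite: 600 employees arriving in 500 vehicles.
current = average_vehicle_ridership(600, 500)   # 1.2 employees per vehicle

# Hypothetical area baseline of 1.2 gives a target of about 1.5.
target = avr_target(1.2)

# Vehicles that would have to come off the road to reach the target
# with the same 600 employees.
vehicles_allowed = 600 / target
vehicles_to_remove = 500 - vehicles_allowed
```

Under these invented numbers the worksite would need roughly 100 fewer arriving vehicles, which gives a sense of the scale of behavior change a trip reduction plan has to produce.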
Presently there is no method that can predict, with any acceptable degree of accuracy, the effect that plans implemented by employers will have on the mode of transportation used by employees. The large number of variables that may affect the average vehicle ridership at a given company site, as well as a lack of experience in dealing with this relatively new situation, have prevented the development of predictive models for this problem. There have been "human experts" called upon to describe how a specific plan for a company of a given size at a specific
location, with a certain number of employees on each shift, will affect the AVR, but their accuracy has been questionable. Consequently, this lack of human expertise has made it impossible to take an expert system approach to build a software model for this problem.

Within the last few years, some organizations have begun collecting data on companies that have implemented plans to attempt to achieve the AVR goals set for them by CAAA. The existence and availability of this data provides an opportunity to develop a model that can predict AVR, given a set of company characteristics and the introduction of one or more incentive plans. As previously mentioned, neural networks and regression analysis are techniques capable of discovering input/output relationships directly from data. Attempts have already been made by some to use regression analysis on this data, but reports indicate that these attempts have failed to build a successful model. No attempts at using neural networks for this task have been reported.

Our objective is to use neural network technology with this data to build a model to predict AVR. If our approach is successful, the model would offer the following advantages:

1. Streamline the development of trip reduction plans for employers
2. Provide a basis for consistent review by the regulating agencies
3. Improve the cost-effectiveness of transportation demand management programs

In addition, the following closely-related applications have been identified that would also benefit from this approach:

- Mobility management plans
- Congestion management systems
- Growth management
- Concurrency management systems

3.3.1 Data Description

Data were collected from several thousand companies from the Los Angeles area, some of which had implemented strategies to increase their AVR based on the regulation that was the basis of the CAAA Employee Commute Options (ECO) requirement. Some companies had
implemented only one strategy, others had a mixture of strategies, and still others had implemented none. Of those companies with implemented strategies, some were in their first year while others were in their subsequent years of strategy implementation.

A large amount of data were collected from these companies. Each record in the database identifies a company site, and there are approximately 27,000 records in the database. Each record has 349 fields. These fields include the following categories of information for each company site:

- Company site characteristics
- Alternative modes of transportation available to the employees
- Incentive and disincentive plans
- Alternative work arrangements
- Company physical location
- Different modes of transportation selected by the employees and the site's measured AVR

There are 24 fields that describe specific characteristics for each company site included in the database. These fields include mainly information on the total number of employees at the site and the number of employees in different job classifications.

There are 12 fields that describe alternative modes of transportation available to the employees. They include information such as the number of buses and routes to the site, the number of company-owned buses and vans, the number of employee-owned buses and vans, and whether there are bike paths to the site.

There are four fields that describe four different time shifts for employees working at each site. They include the number of employees starting work between 6 a.m. and 10 a.m. (the ECO mandate and cost-effectiveness for employer trip reduction pertains only to this period), between 12 noon and 5:30 p.m., between 5:30 p.m. and 6 p.m., and between 10 p.m. and 12 midnight.

There are 62 different types of incentive plans that have been used by these companies to try to improve their AVR. Twenty-eight of these are classified under non-financial incentives and
include flex-time, preferential parking areas, bike racks, lockers, etc. The remaining 34 incentives are classified as financial incentives and include carpool subsidies, auto services, compressed work weeks, etc. Each of these incentives is described in the database by one field indicating whether the plan has been implemented or not, and three other fields indicating the number of current participants in the plan, the number of target participants, and the dollar value of incentive for the plan. Thus a total of 248 fields (or 71 percent of the data fields) are used to describe the company incentive plans in this database.

In addition, there are 28 fields in each record that indicate the frequency with which information related to the incentive plans adopted by the site is communicated to employees. Information such as frequency of articles in company newsletters, frequency of letters to employees, frequency of postings on rideshare bulletin boards, etc., are included in these fields.

Finally, there are 23 fields that describe the mode of transportation used by employees on each site for commuting to work. This includes information such as the number of employees driving alone, or using different types of carpools or vanpools, or the number of employees riding buses or bicycles, etc. The average vehicle ridership is calculated as the total number of employees divided by the total number of vehicles used and is included as one of the fields. All of these are considered to be the result of applying the different incentive plans in each particular site. In other words, these fields are considered to be the dependent variables; all the other fields are the independent variables.

3.3.2 Neural Network Approach

A back propagation neural network is a good candidate to build a model that predicts the effect of incentive plans on AVR using the data described above.
Since it is not known beforehand how any of the fields included in the database may affect the AVR, all the fields representing the independent variables are included as inputs to the neural network. Of the 23 fields representing the dependent variables, only average vehicle ridership (AVR) is used as the output of the neural network, since all other fields within this category are used directly to calculate the AVR. Figure 14 illustrates the different categories of inputs to the neural network and output for this particular application. Since each one of the input categories is made up of a number of individual inputs as described above, the entire network includes many more units than those shown in this figure.
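
As an illustration only, the AVR definition and the input/output split described above can be sketched in a few lines of Python. The field names (employees, vehicles, flex_time, and so on) and the example site are hypothetical and do not come from the Los Angeles database; the 25 percent increase is the CAAA target cited in Section 3.3.

```python
def average_vehicle_ridership(employees, vehicles):
    """AVR as defined in the text: total employees / total vehicles used."""
    if vehicles <= 0:
        raise ValueError("vehicles must be positive")
    return employees / vehicles


def target_avr(baseline_avr, increase=0.25):
    """Target AVR for a given fractional increase over the area's baseline."""
    return baseline_avr * (1.0 + increase)


def split_inputs_and_target(record, dependent_fields, target_field="avr"):
    """Keep every independent field as a network input; AVR is the only output."""
    inputs = {k: v for k, v in record.items() if k not in dependent_fields}
    return inputs, record[target_field]


# Hypothetical site: 400 employees arriving in 320 vehicles.
record = {
    "employees": 400,      # independent: site characteristic
    "flex_time": 1,        # independent: incentive implemented (1) or not (0)
    "drive_alone": 300,    # dependent: mode counts feed the AVR calculation
    "carpool": 100,        # dependent
    "vehicles": 320,       # dependent
    "avr": average_vehicle_ridership(400, 320),
}
dependent = {"drive_alone", "carpool", "vehicles", "avr"}
inputs, target = split_inputs_and_target(record, dependent)
```

Here the mode-choice counts are excluded from the inputs because they determine the AVR directly, which mirrors the reasoning given for using AVR as the single output.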


[Figure 14: Neural Network for the Evaluation of Employee Trip Reduction Programs. Inputs: Company Characteristics, Alternative Modes, Incentives and Disincentives, Alternative Work Arrangements, Locations. Hidden Layers. Output: Average Vehicle Ridership.]

3.3.3 Data Preparation

The first step in the preparation of the data for use with the neural network is to eliminate obvious inconsistencies in the data. These inconsistencies may be the result of faulty reporting by the individual company sites or mistakes in transcription from the original paper source to the electronic database.
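
A minimal sketch of such a consistency pass is shown here, assuming each record is a Python dictionary of numeric fields. It applies the three elimination criteria used in this study (all-zero fields, all-zero records, and invalid AVR or shift values); the field names and sample records are hypothetical.

```python
def drop_all_zero_fields(records):
    """Criterion 1: remove any field whose value is zero in every record."""
    if not records:
        return []
    keep = [f for f in records[0] if any(r[f] != 0 for r in records)]
    return [{f: r[f] for f in keep} for r in records]


def drop_all_zero_records(records):
    """Criterion 2: remove any record whose fields are all zero."""
    return [r for r in records if any(v != 0 for v in r.values())]


def drop_invalid_records(records, avr_field, shift_fields):
    """Criterion 3: remove records with AVR < 1 or a negative shift count."""
    return [
        r for r in records
        if r.get(avr_field, 0) >= 1
        and all(r.get(f, 0) >= 0 for f in shift_fields)
    ]


def clean(records, avr_field="avr", shift_fields=("shift_6_to_10",)):
    """Apply the three criteria in the order they are listed in the text."""
    records = drop_all_zero_fields(records)
    records = drop_all_zero_records(records)
    return drop_invalid_records(records, avr_field, shift_fields)


# Hypothetical sample: only the first record should survive.
sample = [
    {"avr": 1.3, "shift_6_to_10": 120, "vans": 0},
    {"avr": 0.0, "shift_6_to_10": 0, "vans": 0},    # every field is zero
    {"avr": 0.8, "shift_6_to_10": 50, "vans": 0},   # AVR below one
    {"avr": 1.1, "shift_6_to_10": -5, "vans": 0},   # negative employee count
]
cleaned = clean(sample)
```

The "vans" field is dropped because it is zero in every record, and the last three records are dropped for the reasons noted in the comments.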


The following criteria were used on this data as a first step to eliminate inconsistencies:

1. Elimination of a field from the entire database if all of the entries for that field are zero.
2. Elimination of a record if all of the entries in the fields for that record are zero.
3. Elimination of records with invalid data, e.g., if the AVR value is less than one or the number of employees in a given shift is less than zero.

After eliminating the inconsistencies in the data, the original data were reduced to approximately 18,000 records with 250 fields per record. It is this data which is being used in the neural network building process.

As mentioned previously, even after inconsistencies have been removed from the data, a significant amount of data preprocessing is still required before it can actually be used by the neural network. This can be done manually by the neural network programmer or done automatically by the neural network development package. In this case, the neural network development package that was selected can perform all of the following preprocessing of the data:

- Selection of the appropriate representation and scaling function for each field
- Separation of the original data into a training, test, and validation set

3.3.4 Neural Network Training

After preprocessing of the data, the actual building of the neural network model can begin. This process was described in an earlier section of this report. Care must be taken during this phase not to overtrain the network with too many neurons in the hidden layers nor undertrain it with too few. This requires that the performance of the network on both the training and the test sets be monitored carefully as the number of hidden layer neurons is changed. The neural network development package selected for this application facilitates the different aspects of the training
process by including the following features:

1. Allowing the network builder to select from two different learning rules: a Kalman learning rule for regression type problems with noisy data (applicable to the predictive problem described here) and an adaptive gradient learning rule for all other problems.
2. Allowing the network builder to change learning rates, error tolerance, noise susceptibility, and other parameters to improve network tolerance.
3. Automatically adding new hidden layer units to improve network performance.
4. Automatically building a number of possible networks and selecting the best.

As the neural network training process gets underway with approximately 18,000 employer site characteristic data records for the Los Angeles area, answers will be found for the following:

- The accuracy with which neural networks can predict AVR values for these companies
- Which combination of company characteristics can best predict AVR

After a neural network model for AVR prediction is built, it will be field tested at a selected site in Florida. This testing will address the following:

- How accurate the neural network model is in predicting AVR for companies which are not from the Los Angeles area
- How effective the neural network model is as a tool to evaluate prospective plans designed to achieve a target AVR
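
The preprocessing and training steps described in Sections 3.3.3 and 3.3.4 can be sketched as follows. This is a simplified illustration, not the development package used in the study: it assumes min-max scaling, a 60/20/20 split, a single hidden layer, and plain gradient-descent backpropagation in place of the package's Kalman and adaptive gradient learning rules. The data are synthetic stand-ins, and all names are hypothetical.

```python
import math
import random


def min_max_scale(column):
    """Scale one field to [0, 1]; a constant field maps to all zeros."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in column]


def split_rows(rows, train_frac=0.6, test_frac=0.2, seed=0):
    """Shuffle, then split into training, test, and validation sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    a = round(len(rows) * train_frac)
    b = a + round(len(rows) * test_frac)
    return rows[:a], rows[a:b], rows[b:]


class TinyNet:
    """One tanh hidden layer, linear output, trained by backpropagation."""

    def __init__(self, n_inputs, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        xb = list(x) + [1.0]  # append the bias input
        hidden = [math.tanh(sum(w * v for w, v in zip(row, xb)))
                  for row in self.w1]
        output = sum(w * v for w, v in zip(self.w2, hidden + [1.0]))
        return hidden, output

    def train_step(self, x, target, lr):
        xb = list(x) + [1.0]
        hidden, output = self.forward(x)
        err = output - target
        # Back-propagate through the output weights before updating them.
        deltas = [err * self.w2[j] * (1.0 - h * h)
                  for j, h in enumerate(hidden)]
        for j, v in enumerate(hidden + [1.0]):
            self.w2[j] -= lr * err * v
        for row, delta in zip(self.w1, deltas):
            for i, v in enumerate(xb):
                row[i] -= lr * delta * v

    def mse(self, rows):
        return sum((self.forward(x)[1] - t) ** 2 for x, t in rows) / len(rows)


def sweep_hidden_units(train, test, sizes, epochs=100, lr=0.02, seed=0):
    """Train one network per hidden-layer size and record both errors."""
    results = []
    for n_hidden in sizes:
        net = TinyNet(len(train[0][0]), n_hidden, random.Random(seed))
        for _ in range(epochs):
            for x, t in train:
                net.train_step(x, t, lr)
        results.append((n_hidden, net, net.mse(train), net.mse(test)))
    return results


# Synthetic stand-in data (NOT the Los Angeles records): two raw fields and
# an AVR-like target that depends on both.
rng = random.Random(42)
raw = [(rng.uniform(0, 500), rng.uniform(0, 10)) for _ in range(100)]
col_a = min_max_scale([r[0] for r in raw])
col_b = min_max_scale([r[1] for r in raw])
rows = [([a, b], 1.0 + 0.5 * a * b) for a, b in zip(col_a, col_b)]

train, test, valid = split_rows(rows)
results = sweep_hidden_units(train, test, sizes=(1, 2, 4, 8))
best_size, best_net, _, _ = min(results, key=lambda r: r[3])
valid_mse = best_net.mse(valid)  # final check on data unused so far
```

The hidden-layer size with the lowest test-set error is kept, and the validation set gives a final error estimate on data that played no part in that choice.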


3.3.5 Neural Network Application to Trip Reduction - Benefits to Florida

As mentioned previously, the 1991 ISTEA and the Clean Air Act Amendments of 1990 stress the importance of multimodal solutions to transportation problems and elevate the importance of transit and travel demand management (TDM) in the transportation planning process. MPOs are developing and implementing Congestion Management System (CMS) plans that can be used to provide information on transportation system performance and alternative strategies (like TDM) to alleviate congestion and enhance the mobility of people and goods.

The diversity of services and, until recently, the lack of consistent reporting requirements among commuter assistance programs offered few alternatives for measuring TDM performance or predicting impacts in Florida. The neural network model will address this reporting deficiency. The challenge for MPOs is to develop a tool to help quantify the impacts of employer-based strategies on traffic congestion and mobility. The challenge to the FDOT, DCA, and implementing agencies is to apply equitable and consistent guidance to MPOs on what can be expected under various scenarios or what it would take to meet a given goal (e.g., areawide or corridor average vehicle ridership target) in a cost-effective manner.

This project will apply neural network technology to streamline the development of trip reduction plans for employers and developers in response to concurrency requirements, and provide a basis for consistent review by the regulating agencies. The neural network model that is developed will allow FDOT, DCA, MPOs, and commuter assistance programs to estimate the impact of trip reduction plans for a single worksite or multiple employers based on existing site characteristic data. The FDOT Commuter Assistance Program Office will be able to proactively assist local governments in evaluating CMS strategies, and predict the likely success of various mixes of trip reduction programs.
Local commuter assistance programs will be able to use the neural network model software to develop prototype trip reduction plans as guidance for employers in their service area to reduce parking demands, decrease onsite traffic congestion,
and enhance overall mobility. DCA and local governments can assess the traffic impact of strategies to reduce trips for new development proposals. Lastly, developers can benefit from the expected reduction in the regulatory staff time in the review of employer trip reduction plans, and receive more consistent reviews of plan submittals.
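
One way such a model could be used to compare prospective plans against a target AVR is sketched below. The function rank_plans accepts any prediction function; toy_predictor is a placeholder with invented coefficients, included only so the example runs, and is not a fitted model or a result of this study. The plan names and fields are likewise hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Plan = Dict[str, float]


def rank_plans(
    predict_avr: Callable[[Plan], float],
    plans: Dict[str, Plan],
    target_avr: float,
) -> List[Tuple[str, float, bool]]:
    """Return (plan name, predicted AVR, meets target), best plan first."""
    scored = [(name, predict_avr(plan)) for name, plan in plans.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [(name, avr, avr >= target_avr) for name, avr in scored]


def toy_predictor(plan: Plan) -> float:
    """Placeholder only: invented coefficients standing in for a trained network."""
    return (1.10
            + 0.10 * plan.get("carpool_subsidy", 0)
            + 0.05 * plan.get("flex_time", 0))


# Hypothetical candidate plans for one worksite.
plans = {
    "no_plan": {},
    "flex_time_only": {"flex_time": 1},
    "subsidy_and_flex": {"carpool_subsidy": 1, "flex_time": 1},
}
ranking = rank_plans(toy_predictor, plans, target_avr=1.2)
```

In practice the placeholder would be replaced by the trained network's prediction function, and the plan dictionaries by full site records with the candidate incentives switched on.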


4. Conclusion

Neural networks simulate the abilities of the brain. The brain is composed of tens of billions of cells called neurons, which are in themselves microscopic biological computers sending information back and forth to each other. Neurons communicate with each other and are able to learn from experience (or example), not from programming. Artificial neural networks (ANNs), synonymous with neural networks, represent a form of computer intelligence and work similarly to how the human brain works, but on a very reduced scale. ANNs can enhance our capability of learning by more carefully studying the past, are tolerant of imperfect or incomplete data, and do not rely on rules or formulas. ANNs have proven successful in many practical applications. ANNs are known to be good at classification, evaluation, optimization, decision-making, pattern recognition, behavior trend prediction, image analysis, filtering, and modeling control systems. Even though these applications are diverse and seem to have very little in common, they all can be learned by comparing known inputs and resulting outputs for a large number of examples.

The transportation industry has been a popular application domain for ANNs. Existing transportation applications of ANNs include roadway classification from visual images, vehicle classification using signals from inductive loop detectors in the pavement, prioritizing pavement markings replacement from deterioration rates taken from visual surveys, selection of transportation improvement projects, "gap acceptance" behavior of cross-street vehicles waiting at stop signs, predicting driver route choice, traffic flow forecasting, traffic incident detection, advanced collision avoidance systems, estimating origin-destination patterns, and pavement surface distress evaluation.
Several as yet untried applications of ANNs in transportation discussed in this report include improving the estimation of "passer-by" traffic for new site developments and refinement of forecasting the magnitude of through traffic growth for major urban areas.


Following a detailed introduction of ANNs, this report summarizes the specific intended application to date of ANNs in the evaluation and selection of employee trip reduction programs. Approximately 18,000 data records of employer site characteristics for the Los Angeles area (input), each with 250 fields per record, are being utilized to build, train, and evaluate various ANNs. Performance and selection of the best ANN model will be based on neural network output compared to actual average vehicle ridership (AVR) observations. Finally, and most importantly, the application of neural network technology will streamline the development of trip reduction plans for employers and developers, and provide a basis for equitable, consistent review by the regulatory agencies.

