Development of models for understanding causal relationships among activity and travel variables

Material Information

Title:
Development of models for understanding causal relationships among activity and travel variables
Physical Description:
Book
Language:
English
Creator:
Ye, Xin
Publisher:
University of South Florida
Place of Publication:
Tampa, Fla
Publication Date:
2006

Subjects

Subjects / Keywords:
Travel behavior
Discrete choice model
Econometric modeling
Endogenous variable
Mixed logit model
Discrete-continuous model
Dissertations, Academic -- Civil Engineering -- Doctoral -- USF   ( lcsh )
Genre:
bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )

Notes

Abstract:
ABSTRACT: Understanding joint and causal relationships among multiple endogenous variables has been of much interest to researchers in the field of activity and travel behavior modeling. Structural equation models have been widely developed for modeling and analyzing the causal relationships among travel time, activity duration, car ownership, trip frequency and activity frequency. In these models, travel time and activity duration are treated as continuous variables, while car ownership, trip frequency and activity frequency are treated as ordered discrete variables. However, many endogenous variables of interest in travel behavior are neither continuous nor ordered discrete but unordered discrete in nature, such as mode choice, destination choice, trip chaining pattern and time-of-day choice (which can be classified into a few categories such as AM peak, midday, PM peak and off-peak). A modeling methodology that accommodates unordered discrete variables is highly desirable for better understanding the causal relationships among these variables. Against this background, this dissertation is dedicated to seeking an appropriate modeling methodology that aids in identifying the causal relationships among activity and travel variables, including unordered discrete variables. In this dissertation, the proposed modeling methodologies are applied to model the causal relationship between three pairs of endogenous variables: trip chaining pattern vs. mode choice, activity timing vs. duration, and trip departure time vs. mode choice. The data used for the modeling analysis are extracted from the Swiss Travel Microcensus 2000. Such models provide rigorous criteria for selecting a reasonable application sequence of sub-models in an activity-based travel demand model system.
Thesis:
Dissertation (Ph.D.)--University of South Florida, 2006.
Bibliography:
Includes bibliographical references.
System Details:
System requirements: World Wide Web browser and PDF reader.
System Details:
Mode of access: World Wide Web.
Statement of Responsibility:
by Xin Ye.
General Note:
Title from PDF of title page.
General Note:
Document formatted into pages; contains 203 pages.
General Note:
Includes vita.

Record Information

Source Institution:
University of South Florida Library
Holding Location:
University of South Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 001920199
oclc - 187912549
usfldc doi - E14-SFE0001842
usfldc handle - e14.1842
System ID:
SFS0026160:00001


General Note:
Text (Electronic dissertation) in PDF format.
General Note:
Adviser: Ram M. Pendyala, Ph.D.
Host Item:
USF Electronic Theses and Dissertations.
Electronic Access:
http://digital.lib.usf.edu/?e14.1842



Development of Models for Understanding Causal Relationships Among Activity and Travel Variables

by

Xin Ye

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Department of Civil and Environmental Engineering
College of Engineering
University of South Florida

Major Professor: Ram M. Pendyala, Ph.D.
John J. Lu, Ph.D., P.E.
Manjriker Gunaratne, Ph.D., P.E.
Xuehao Chu, Ph.D.
Gabriel Picone, Ph.D.

Date of Approval: May 30, 2006

Keywords: travel behavior, discrete choice model, econometric modeling, endogenous variable, mixed logit model, discrete-continuous model

Copyright 2006, Xin Ye

Acknowledgments

I am indebted to my academic advisor, Dr. Ram M. Pendyala, for his guidance, patience, encouragement and support throughout my five-year Ph.D. study. I thank Dr. John Lu, Dr. Elaine Chang, Dr. Manjriker Gunaratne, Dr. Xuehao Chu, Dr. Gabriel Picone and Dr. Steve Polzin for serving on my committee and providing their valuable suggestions. I am grateful to Dr. Chandra R. Bhat and Mr. Abdul Pinjari in the Department of Civil, Architectural & Environmental Engineering at the University of Texas at Austin. Abdul used to be one of my colleagues and close friends at USF. After he joined the Ph.D. program at UT Austin, we kept in touch and continued our friendship. At the early stage of this study, I benefited greatly from his help and from discussions with him. I also want to express my gratitude to my colleague, Amlan Banerjee, and my schoolmate, Pan Liu. Pan gave me an important suggestion at the final stage of this study. In addition, I would like to acknowledge Dr. Giovanni Gottardi of Jenni+Gottardi AG, Zurich, Switzerland, for providing me with the comprehensive data used in this study. Finally, I dedicate my dissertation effort to my parents, Bangjie Ye and Rong Chen, and my wife, Wenjia Guan. Without their support it would never have happened.

Table of Contents

List of Tables
List of Figures
Abstract
Chapter One: Introduction
  1.1 Background
  1.2 Objectives
  1.3 Scope
  1.4 Outline
Chapter Two: Modeling Methodology
  2.1 Background
    2.1.1 Review of Linear Regression Model
    2.1.2 Review of Simultaneous Equations Model (Joint Relationship Among Continuous Variables)
    2.1.3 Review of Structural Equations Model (Causal Relationship Among Continuous Variables)
    2.1.4 Review of Discrete Choice Model
    2.1.5 Endogenous Variable in Discrete Choice Model
  2.2 Modeling Methodology for Causal Analysis in Discrete Choices (Discrete Choice Modeling Methodology with Endogenous Variable)
    2.2.1 Recursive Bivariate Probit Model (Causal Relationship Between Two Binary Choice Variables)
    2.2.2 Simultaneous Equations Model Using Lee Transformation (Causal Relationship Between One Continuous Variable and One Multinomial Choice)
    2.2.3 Mixed Simultaneous Equations Model Using Flexible Error Structure
      2.2.3.1 Mixed Discrete-continuous Model (Causal Relationship Between One Continuous Variable and One Multinomial Choice Variable)
      2.2.3.2 Mixed Binary-multinomial Choice Model (Causal Relationship Between One Binary Choice Variable and One Multinomial Choice Variable)
  2.3 Non-nested Test for Choosing Alternative Causal Structure
    2.3.1 Cox Test for Separate Families of Hypothesis
    2.3.2 Non-nested Test in Discrete Choice Model
    2.3.3 Extension to Discrete-continuous Model System
  2.4 Monte Carlo Study for Bivariate Probit Model and Lee Model
    2.4.1 Introduction
    2.4.2 Monte Carlo Studies for Bivariate Probit Model
    2.4.3 Monte Carlo Studies for Recursive Bivariate Probit Model
    2.4.4 Monte Carlo Studies for Lee Model
    2.4.5 Summary
Chapter Three: Dataset Preparation and Description
  3.1 Introduction to Swiss Travel Survey
  3.2 Dataset Description at Household Level
  3.3 Dataset Description at Person Level
  3.4 Dataset Description at Trip Level
Chapter Four: Empirical Estimation Results
  4.1 Causal Models Between Trip Chaining and Mode Choice (Recursive Bivariate Probit Model)
    4.1.1 Background
    4.1.2 Dataset Preparation and Description for Modeling Analysis
    4.1.3 Model Estimation Results
      4.1.3.1 Estimation Results for Non-work Tours
      4.1.3.2 Estimation Results for Work Tours
    4.1.4 Model Performance Comparisons Based on Non-nested Test
    4.1.5 Discussions and Conclusions
  4.2 Causal Models Between Activity Timing and Activity Duration (Mixed Discrete-continuous Model and Lee Model)
    4.2.1 Background
    4.2.2 Data Preparation and Description for Model Analysis
    4.2.3 Model Estimation Results
      4.2.3.1 Estimation Results for Non-commuters
      4.2.3.2 Estimation Results for Commuters
    4.2.4 Model Performance Comparisons Based on Non-nested Test
    4.2.5 Discussions and Conclusions
  4.3 Causal Models Between Trip Timing and Mode Choice (Mixed Binary-multinomial Choice Model)
    4.3.1 Background
    4.3.2 Dataset Preparation and Description for Modeling Analysis
    4.3.3 Model Estimation Results
      4.3.3.1 Estimation Results for Non-commuters
      4.3.3.2 Estimation Results for Commuters
    4.3.4 Model Performance Comparison Based on Non-nested Test
    4.3.5 Discussions and Conclusions
Chapter Five: Conclusions and Discussions
  5.1 Contribution to the Field
    5.1.1 Methodological Contribution
    5.1.2 Behavioral Contribution
    5.1.3 Practical Contribution
    5.1.4 Empirical Contribution
  5.2 Future Research Direction
References
Bibliography
Appendices
  Appendix A: Gauss Code for Generating and Storing Halton Sequences
  Appendix B: Gauss Code of Mixed Discrete-continuous Model (Exemplified by Non-commuter Model Where Time-of-day Choice Affects Activity Duration)
  Appendix C: Gauss Code of Discrete-continuous Model Based on Lee Transformation (Exemplified by Non-commuter Model Where Time-of-day Choice Affects Activity Duration)
  Appendix D: Gauss Code of Mixed Binary-multinomial Choice Model (g_i Fixed at 1, Exemplified by Non-commuter Model Where Binary Time-of-day Choice Affects Multinomial Mode Choice)
  Appendix E: Gauss Code of Mixed Binary-multinomial Choice Model (|f_i| = |g_i|, Exemplified by Non-commuter Model Where Binary Time-of-day Choice Affects Multinomial Mode Choice)
About the Author

List of Tables

Table 1.1 Summary of Bias in the Estimate of β1 When x2 is Omitted
Table 2.1 Statistics of Estimators for Bivariate Probit Model
Table 2.2 Statistics of Estimators for Recursive Bivariate Probit Model
Table 2.3 Statistics for Non-nested Test Application
Table 2.4 True Values of Parameters in the Model
Table 2.5 Statistics of Estimators from Joint Estimation Procedure (without Endogenous Variables)
Table 2.6 Statistics of Estimators from Recursive Estimation Procedure (without Endogenous Variables)
Table 2.7 Statistics of Estimators from Joint Estimation Procedure (with Endogenous Variables)
Table 2.8 Statistics of Estimators from Recursive Estimation Procedure (with Endogenous Variables)
Table 3.1 Household Characteristics of Swiss Travel Microcensus 2000
Table 3.2 Person Characteristics of Swiss Travel Microcensus 2000
Table 3.3 Trip Characteristics of Swiss Travel Microcensus 2000 (Trip Purpose Distribution by Trip Mode)
Table 3.4 Trip Characteristics of Swiss Travel Microcensus 2000 (Trip Mode Distribution by Trip Purpose)
Table 4.1 Household Characteristics of Swiss Travel Microcensus 2000 and Zurich Subsamples
Table 4.2 Person Characteristics of Swiss Travel Microcensus 2000 and Zurich Subsamples
Table 4.3 Crosstabulation of Mode Choice and Tour Type for Non-work Tours
Table 4.4 Crosstabulation of Mode Choice and Tour Type for Work Tours
Table 4.5 Non-work-tour Model Variable Description and Statistics (N = 4901)
Table 4.6 Non-work-tour Model
Table 4.7 Work-tour Model Variable Description and Statistics (N = 1711)
Table 4.8 Work-tour Model
Table 4.9 Comparisons of Goodness-of-fit of Recursive Bivariate Probit Models
Table 4.10 Household Characteristics of Swiss Travel Microcensus 2000 and Sample for Model of Maintenance Activity Duration and Time-of-day Choice
Table 4.11 Person Characteristics of Swiss Travel Microcensus 2000 and Sample for Model of Activity Duration and Time-of-day Choice
Table 4.12 Description of Endogenous Variables in Non-commuter Sample
Table 4.13 Description of Endogenous Variables in Commuter Sample
Table 4.14 Description and Definition of Variables in Timing-duration Model
Table 4.15 Non-commuter Model (Duration → Time-of-day)
Table 4.16 Non-commuter Model (Time-of-day → Duration)
Table 4.17 Commuter Model (Duration → Time-of-day)
Table 4.18 Commuter Model (Time-of-day → Duration)
Table 4.19 Simulation-based Hypothesis Test for Error Covariance of Identified Mixed Discrete-continuous Models
Table 4.20 Comparison of Goodness-of-fit of Timing-duration Models
Table 4.21 Household Characteristics of Swiss Travel Microcensus 2000 and Sample for Model of Mode Choice and Time-of-day Choice
Table 4.22 Person Characteristics of Swiss Travel Microcensus 2000 and Sample for Model of Mode Choice and Time-of-day Choice
Table 4.23 Crosstabulation of Mode Choice and Time-of-day Choices for Non-commuters
Table 4.24 Crosstabulation of Mode Choice and Time-of-day Choices for Commuters
Table 4.25 Variable Description in Timing-mode Choice Model
Table 4.26 Non-commuter Model (Mode → Time-of-day)
Table 4.27 Non-commuter Model (Time-of-day → Mode)
Table 4.28 Commuter Model (Mode → Time-of-day)
Table 4.29 Commuter Model (Time-of-day → Mode)
Table 4.30 Comparison of Goodness-of-fit of Timing-mode Choice Models

List of Figures

Figure 1.1 Joint Relationship Between Mode Choice and Destination Choice
Figure 1.2 Causal Relationship Between Mode Choice and Trip Chain Type Choice
Figure 2.1 Distribution of z (N = 930 and N1 = 1000)
Figure 4.1 Diagram of Consistent Causal Relationship Identified by Joint Timing-duration Model for Non-commuters
Figure 4.2 Diagram of Consistent Causal Relationship Identified by Joint Timing-duration Model for Commuters
Figure 4.3 Diagram of Causal Relationship of Mixed Binary-multinomial Choice Models for Non-commuters
Figure 4.4 Diagram of Causal Relationship of Mixed Binary-multinomial Choice Models for Commuters

Development of Models for Understanding Causal Relationships Among Activity and Travel Variables

Xin Ye

ABSTRACT

Understanding joint and causal relationships among multiple endogenous variables has been of much interest to researchers in the field of activity and travel behavior modeling. Structural equation models have been widely developed for modeling and analyzing the causal relationships among travel time, activity duration, car ownership, trip frequency and activity frequency. In these models, travel time and activity duration are treated as continuous variables, while car ownership, trip frequency and activity frequency are treated as ordered discrete variables. However, many endogenous variables of interest in travel behavior are neither continuous nor ordered discrete but unordered discrete in nature, such as mode choice, destination choice, trip chaining pattern and time-of-day choice (which can be classified into a few categories such as AM peak, midday, PM peak and off-peak). A modeling methodology that accommodates unordered discrete variables is highly desirable for better understanding the causal relationships among these variables. Against this background, this dissertation is dedicated to seeking an appropriate modeling methodology that aids in identifying the causal relationships among activity and travel variables, including unordered discrete variables.

In this dissertation, the proposed modeling methodologies are applied to model the causal relationship between three pairs of endogenous variables: trip chaining pattern vs. mode choice, activity timing vs. duration, and trip departure time vs. mode choice. The data used for the modeling analysis are extracted from the Swiss Travel Microcensus 2000. Such models provide rigorous criteria for selecting a reasonable application sequence of sub-models in an activity-based travel demand model system.

Chapter One: Introduction

1.1 Background

Understanding joint and causal relationships among multiple dependent variables has been of much interest to researchers in the field of activity and travel behavior modeling (Fujii and Kitamura, 2000; Golob, 2003). An important reason for this interest is the transition of travel demand models from the trip-based approach to the activity-based approach and from the aggregate level to the disaggregate level.

The trip-based approach, namely the four-step travel demand model consisting of trip generation, trip distribution, mode split and network assignment, has played a key role in transportation planning over the past decades. Nowadays, as the planning emphasis shifts from infrastructure construction to transportation system management, travel demand management (TDM) and transportation control measures (TCM), the trip-based model, which focuses solely on trip frequencies, is insufficient for evaluating emerging transportation planning policies. The performance of such policies is unlikely to be measured adequately by the number of trips suppressed or induced, which is what the trip-based modeling system estimates. In contrast, the activity-based model is considered a powerful tool for evaluating these policies within a reasonable framework. Built upon the theory that travel is a demand derived from activity, activity-based approaches directly model and estimate the activity variables (e.g., activity timing and duration) and then derive the travel variables.

Thus, the activity-based model reflects changes in activity patterns as well as in travel patterns in response to planning policies. People's activity patterns are considered a better measure for evaluating policies than trip frequency because of their potential to reflect people's lifestyle and quality of life.

The other trend in the development of travel demand models is the move from the aggregate level to the disaggregate level in developed countries over the past decades. The most conventional travel demand models were developed at the aggregate level, in which the traffic analysis zone (TAZ) is treated as the basic analytical unit. In such models, the independent variables generally incorporate demographic and socio-economic characteristics and geographical information at the zone level, while the dependent variable is the total number of trips originating from the TAZ. The general argument against aggregate-level models is that zones do not really make trips; only the people living in the zones do. A model that takes the total number of trips made by the people in a zone as its dependent variable lacks a sound behavioral basis because it disregards the interactions and constraints among the trips made by the same individual or by individuals in the same household. In contrast, the disaggregate-level travel demand model directly treats each household or individual traveler as the basic analytical unit instead of the TAZ, so the rules governing people's travel behavior can be reflected in the model. In combination with the activity-based approach, disaggregate-level models can be developed at the individual level to realize a micro-simulation of travelers' activity and travel patterns with adequate consideration of the interactions and constraints among both activity and travel decisions. Albatross by Timmermans and Arentze (2000), CEMDAP by Bhat et al. (2004) and FAMOS by Pendyala et al. (2005) are typical examples of daily activity-travel pattern micro-simulators that have been developed recently.

Similar to the traditional trip-based model, a practical activity-based model incorporates a series of sub-models, such as an activity generation model, an activity timing and duration model, a mode split model and a destination choice model. The sequence in which the sub-models are applied is a critical issue in the micro-simulation process, because only a model that truly reflects travelers' decision-making process is capable of providing accurate estimates of their activity and travel patterns and of their mode and destination choice behavior. However, in most cases there is no theory that can explicitly identify the ambiguous decision-making process in travel behavior. For example, if the timing and duration of a discretionary activity are estimated in separate sub-models, it is not straightforward to determine the sequence of model implementation. It can be conjectured that people may first time the activity and then decide the activity duration conditional on the timing, or that they decide the activity duration first and then time the activity given the predetermined duration. Extensive data could be collected to answer these questions, but conventional trip diary data are usually the only resource available to travel behavior analysts and travel demand modelers. Based solely on conventional trip diary data, this dissertation is dedicated to a better understanding of the ambiguous decision-making sequence in travel behavior by analyzing activity and travel variables within an advanced econometric modeling framework.

There are dependent variables and independent variables in statistical or econometric models. Both are random variables. Independent variables are predetermined and influence the mathematical expectation or expected occurrence probability (e.g., in a discrete model or count data model) of dependent variables. Hereinafter, "expectation" refers to the mathematical expectation or expected occurrence probability in the interest of brevity. In an activity-based travel demand model, the dependent variables include activity and travel variables such as activity timing and duration, activity location choice, trip timing, mode choice and route choice. People's demographic characteristics (e.g., age and gender), socio-economic characteristics (e.g., income and employment) and geographical characteristics (e.g., land use pattern and transportation system) serve as independent variables. Given different values of the independent variables, the expectation of the dependent variables will differ. For example, females are expected to allocate more time to shopping activities than males, where "female" is an independent variable and shopping activity duration is a dependent variable.

In most econometric models, a random disturbance needs to be specified to account for unobserved or unspecified independent variables. To estimate the impact of a specified independent variable on the dependent variable accurately, it is necessary to assume that this variable is not correlated with the random disturbance. When this assumption is violated, the independent variable is called an endogenous variable; otherwise, it is called an exogenous variable. In the econometric literature, instrumental variables are usually adopted to estimate the impact of an endogenous variable accurately. The details of this approach are reviewed in Chapter 2.

In many cases, dependent variables are not mutually independent, particularly in the context of travel behavior modeling. In this dissertation, two types of relationships among dependent variables are defined: the joint relationship and the causal relationship. A joint relationship between two dependent variables indicates the existence of common unobservable variables that simultaneously influence the expectation of both dependent variables. For example, as shown in Figure 1.1, car ownership is an exogenous variable for shopping mode choice, and the distance between home and shopping center is an exogenous variable for both shopping mode choice and shopping destination choice. However, there are probably unobserved variables that simultaneously influence these two dependent variables. For instance, in US household travel surveys, household income data collection suffers from a rather low response rate (usually lower than 70%) because of considerable privacy concerns. Thus, income is rarely specified in an applicable travel demand model, yet income may simultaneously influence shopping mode choice and shopping destination choice. In that case, the two variables indicating shopping mode choice and shopping destination choice have a joint relationship according to the definition above.

A causal relationship between two dependent variables indicates that the expectation of one dependent variable is predetermined and then influences the expectation of the other dependent variable. Here, the causal relationship does not refer to a deterministic cause-effect relationship, in which the cause must lead to the effect. Instead, a causal relationship in this dissertation specifically indicates that one endogenous variable exerts an impact on the expectation of the other endogenous variable. Figure 1.2 illustrates the causal relationship, where car ownership and household size serve as exogenous variables for mode choice and household size serves as an exogenous variable for stop-making within a home-based trip chain (tour). There are probably some unobservable variables, such as income, habit and preference, which are not specified in the model but simultaneously influence the expectation of both dependent variables. Meanwhile, one may conjecture that there is a causal relationship between the two dependent variables: mode choice and stop-making in a home-based trip chain. On one hand, making a multi-stop trip chain is more dependent on the auto mode given its convenience and flexibility (this causal direction is described by the arrow in Figure 1.2). This means that if people first decide to pursue a multi-stop trip chain, the expectation of auto mode selection becomes higher. On the other hand, auto usage may stimulate the desire to make a multi-stop trip chain. Travelers using an auto have the potential to serve passengers, thereby making more stops within a trip chain (this causal direction is described by the dashed arrow in Figure 1.2). This means that if people first decide to use the auto mode, the expectation of making a multi-stop trip chain becomes higher.

Joint and causal relationships exist ubiquitously among travel and activity variables because of the complexity of travel behavior. Similar relationships exist between travel time and activity duration (Kitamura et al., 1996; Golob et al., 2000), activity timing and duration (Pendyala and Bhat, 2004), trip chaining and mode choice (Bhat and Sardesai, 2006; Ye et al., 2006), travel timing and mode choice (Tringides et al., 2004), and the activity-travel patterns and time use patterns of household members (Meka and Pendyala, 2002).

From the viewpoint of econometrics, the impact of exogenous variables on continuous endogenous variables can be consistently estimated using the Ordinary Least Squares (OLS) method in a linear regression model. However, the impact of one dependent variable on another dependent variable cannot be consistently estimated using OLS because of common unobservable variables (see Chapter 2 for details).
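The inconsistency of OLS in the presence of common unobservable variables can be illustrated with a small simulation. The sketch below is not taken from the dissertation: the variable names (`y1`, `y2`, `z`, `u`), the coefficient values and the function name `ols_and_iv_slopes` are all assumptions made for illustration. Two continuous dependent variables share an unobserved factor `u`, `y1` has a true effect of 0.5 on `y2`, and `z` is a hypothetical exogenous variable that affects only `y1` and therefore can serve as an instrument.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def ols_and_iv_slopes(n=100_000, true_effect=0.5, seed=0):
    """Simulate y1 = z + u + e1 and y2 = true_effect * y1 + u + e2.

    All coefficients are hypothetical. u is the common unobservable,
    z is an exogenous variable that enters only the y1 equation.
    Returns the OLS slope of y2 on y1 and the instrumental-variable slope.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]
    y1 = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
    y2 = [true_effect * a + ui + rng.gauss(0, 1) for a, ui in zip(y1, u)]
    ols = cov(y1, y2) / cov(y1, y1)   # ignores that y1 is correlated with u
    iv = cov(z, y2) / cov(z, y1)      # uses only the variation in y1 caused by z
    return ols, iv


ols_slope, iv_slope = ols_and_iv_slopes()
print(f"OLS slope: {ols_slope:.3f}, IV slope: {iv_slope:.3f}, true effect: 0.5")
```

Under these assumed values, the population OLS slope is (0.5 × 3 + 1) / 3 ≈ 0.83 rather than 0.5, because `y1` and the disturbance of the `y2` equation both contain `u`. The instrumental-variable ratio stays close to the true effect, which is the idea behind the instrumental-variable approach mentioned earlier in this section.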

The Structural Equations Model (SEM), within a simultaneous equation modeling system, is a full-fledged approach for consistently and efficiently estimating the coefficients and modeling the causal relationships among continuous variables and ordered discrete variables indicated by continuous latent variables. In travel behavior studies, SEM has been widely developed for modeling and analyzing the causal relationships among auto ownership, travel time, activity duration, trip frequency and activity frequency. Golob (2003) gives a comprehensive review of SEM applications in travel behavior studies. In those models, travel time and activity duration are treated as continuous variables, whereas auto ownership, trip frequency and activity frequency are usually treated as ordered discrete variables. The mechanism of SEM is reviewed in detail in Chapter 2.

[Figure 1.1 Joint Relationship Between Mode Choice and Destination Choice. Diagram elements: Car Ownership; Distance from Home; Shopping Mode (Drive, Carpool or Transit); Shopping Center A vs. Shopping Center B; Unobservable Variables]

However, many variables of interest in travel behavior studies are neither continuous nor ordinal but unordered discrete in nature, such as mode choice, destination choice, trip chaining pattern and time-of-day choice (if the timing is classified into a few categories such as AM peak, midday, PM peak and off-peak). A modeling methodology that accommodates unordered discrete variables is highly desirable for better understanding the ambiguous decision sequence and causal relationships among travel and activity variables.

1.2 Objectives

This dissertation intends to propose a modeling methodology that integrates unordered multinomial discrete endogenous variables into the framework of the structural equation model, which previously allowed causal analysis only among continuous endogenous variables and ordered discrete endogenous variables.

There is more than one approach to realizing this objective. Based on different assumptions, one may obtain different model structures and estimation results. This dissertation focuses on two types of models: the causal model based on Lee's transformation (Lee, 1992) and the mixed causal model. The performance of these two types of models is compared in the context of activity-travel behavior analysis.

[Figure 1.2 Causal Relationship Between Mode Choice and Trip Chain Type Choice. Diagram elements: Car Ownership; Household Size; Mode Choice (Drive, Carpool or Transit); Number of Stops in Home-based Trip Chain (One Stop or Multiple Stops); Unobservable Variables]
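The structure in Figure 1.2 can be sketched as a data-generating process with two binary outcomes. This is a hypothetical illustration only, not the dissertation's model or its estimates: the function name `auto_share_gap`, every coefficient, and the reduction of mode choice to a binary auto/non-auto outcome are assumptions. Correlated normal errors stand in for the common unobservable variables (the joint relationship), and `causal_effect` is the impact of choosing a multi-stop chain on the latent propensity to use auto (the causal relationship).

```python
import math
import random


def auto_share_gap(n, causal_effect, error_corr, seed=0):
    """Return P(auto | multi-stop chain) - P(auto | single-stop chain).

    Hypothetical coefficients throughout. Household size enters both
    equations, car ownership enters only the mode equation, and the two
    error terms are standard normal with correlation error_corr.
    """
    rng = random.Random(seed)
    auto_count = [0, 0]   # indexed by multi_stop (0 or 1)
    tour_count = [0, 0]
    for _ in range(n):
        hh_size = rng.randint(1, 4)
        has_car = 1 if rng.random() < 0.7 else 0
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        e_chain = g1
        e_mode = error_corr * g1 + math.sqrt(1.0 - error_corr ** 2) * g2
        # Trip chain type is decided first ...
        multi_stop = 1 if -0.5 + 0.2 * hh_size + e_chain > 0 else 0
        # ... and then enters the latent propensity of the mode choice.
        auto = 1 if (-0.3 + 0.8 * has_car - 0.1 * hh_size
                     + causal_effect * multi_stop + e_mode) > 0 else 0
        tour_count[multi_stop] += 1
        auto_count[multi_stop] += auto
    return auto_count[1] / tour_count[1] - auto_count[0] / tour_count[0]


gap_none = auto_share_gap(50_000, causal_effect=0.0, error_corr=0.0)
gap_joint = auto_share_gap(50_000, causal_effect=0.0, error_corr=0.6)
gap_causal = auto_share_gap(50_000, causal_effect=0.8, error_corr=0.0)
print(gap_none, gap_joint, gap_causal)
```

In this sketch, a purely joint relationship (correlated errors, no causal effect) and a purely causal relationship (causal effect, uncorrelated errors) both make the auto share noticeably higher for multi-stop chains than for single-stop chains. A raw cross-tabulation therefore cannot separate the two, which is why a model that represents both the error correlation and the causal term is needed.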


1.3 Scope

The proposed modeling methodology will be applied to help understand the causal relationships between the following pairs of travel and activity variables:

- Timing and duration of maintenance activities
- Trip chaining pattern and mode choice
- Trip departure timing and mode choice

Causal analysis between the timing and duration of maintenance activities is conducted with a joint modeling methodology based on Lee's transformation and with a mixed discrete-continuous model. A simulation-based hypothesis test is proposed to examine the significance of error correlation in mixed discrete-continuous models. A recursive bivariate probit model is adopted to analyze the causal relationship between trip chaining pattern and mode choice. The causal relationship between trip departure timing and mode choice is analyzed with the proposed mixed binary-multinomial choice model, in which trip departure timing is treated as a binary choice (peak vs. non-peak) and mode choice as a multinomial choice (single-occupancy vehicle, high-occupancy vehicle, transit and non-motorized mode).

In addition, appropriate statistical tests are applied to compare rigorously the performance of competing models under alternative causal structures. The non-nested test for discrete choice models (Ben-Akiva and Lerman, 1983) can be applied directly to compare recursive bivariate probit models and mixed binary-multinomial choice models under alternative causal structures. However, it is inappropriate to apply it directly to joint discrete-continuous models. In this dissertation, an extension of the non-nested test is proposed for rigorously comparing non-nested discrete-continuous models.

1.4 Outline

The remainder of the dissertation is organized as follows. Chapter 2 first reviews the existing modeling methodology for causal analysis among continuous and ordered discrete endogenous variables, and then proposes the modeling methodology for causal analysis involving unordered discrete variables. The non-nested test and its extension are then formulated, followed by a series of Monte Carlo studies of the modeling methodology and the statistical test. Chapter 3 briefly introduces the Swiss Travel Microcensus 2000, from which the datasets for the modeling analysis are extracted, and provides a brief description of the datasets. Chapter 4 presents the empirical results of model estimation for the causal analysis between trip chaining pattern and mode choice, between activity timing and duration, and between trip departure timing and mode choice. Chapter 5 summarizes conclusions and contributions and provides some recommendations for future research.


Chapter Two: Modeling Methodology

2.1 Background

2.1.1 Review of Linear Regression Model

In econometrics, the linear regression model is a standard model for quantifying the impacts of exogenous variables on the expectation of a continuous endogenous variable. Assume the random continuous variable $y$ can be expressed as

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + u, \tag{2.1.1}$$

where $x_1, x_2, \ldots, x_n$ are random exogenous variables of interest, $\beta_0, \beta_1, \ldots, \beta_n$ are constant coefficients associated with the exogenous variables, and $u$ is a random variable that accounts for all other unspecified or unobserved factors influencing $y$. A critical assumption of the linear regression model is that the expectation of $u$ given all the exogenous variables is zero, denoted as $E(u \mid x_1, x_2, \ldots, x_n) = 0$. The impact of exogenous variable $x_i$ on the expectation of $y$ is measured by $\partial E(y \mid x_i)/\partial x_i = \beta_i$. In other words, estimating the impact of an exogenous variable on the endogenous variable comes down to accurately estimating $\beta_i$.

Based on a random sample from the population with respect to the random variables $y$ and $x_1, x_2, \ldots, x_n$, the Ordinary Least Squares (OLS) method offers statistically consistent estimators of $\beta_i$ by minimizing the sum of the observed $[y - (\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n)]^2$. The estimators can be derived and expressed simply in matrix algebra. Let $Y = X\beta + u$, where $Y$ is a column vector $(y_1, y_2, \ldots, y_N)'$, $N$ is the sample size, $X$ is the $N \times n$ matrix

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & & \vdots \\ x_{N1} & x_{N2} & \cdots & x_{Nn} \end{bmatrix},$$

and $u$ is a column vector $(u_1, u_2, \ldots, u_N)'$. Define

$$Z = \sum_{i=1}^{N} \left[y_i - (\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_n x_{in})\right]^2 = (Y - X\beta)'(Y - X\beta). \tag{2.1.2}$$

To minimize $Z$, let $\partial Z/\partial \beta = 0$. It is easy to show that $-2X'Y + 2(X'X)\beta = 0$, so that $\hat{\beta} = (X'X)^{-1}X'Y$, and that $\partial^2 Z/\partial \beta^2 = 2(X'X) > 0$, which ensures that $Z$ is minimized. Since $\hat{\beta} = (X'X)^{-1}X'Y$ and $Y = X\beta + u$, it can be shown that

$$\hat{\beta} = (X'X)^{-1}X'(X\beta + u) = \beta + (X'X)^{-1}X'u.$$

Taking expectations on both sides of the equation, one obtains

$$E(\hat{\beta} \mid X) = \beta + (X'X)^{-1}X'E(u \mid X) = \beta,$$

which indicates that $\hat{\beta}$ is an unbiased estimator of $\beta$. If $\mathrm{Var}(u \mid x)$ is denoted as $\sigma^2$, then

$$\mathrm{Var}(\hat{\beta} \mid X) = E[(\hat{\beta} - \beta)(\hat{\beta} - \beta)'] = (X'X)^{-1}X'E[uu' \mid X]X(X'X)^{-1} = \sigma^2 (X'X)^{-1}.$$

In the following simple example, the properties of the estimators are discussed for the situation where one exogenous variable is unobservable. Suppose the population model is $y = X_1\beta_1 + X_2\beta_2 + u$ but $X_2$ is unobservable. One can only specify $X_1$ in the model, as $y = X_1\beta_1 + u^*$ where $u^* = X_2\beta_2 + u$. Applying OLS to estimate the misspecified model, one obtains

$$\hat{\beta}_1 = (X_1'X_1)^{-1}X_1'y. \tag{2.1.3}$$

Since the true population model is $y = X_1\beta_1 + X_2\beta_2 + u$, $y$ in Equation (2.1.3) can be substituted to give

$$\hat{\beta}_1 = (X_1'X_1)^{-1}X_1'(X_1\beta_1 + X_2\beta_2 + u) = \beta_1 + (X_1'X_1)^{-1}X_1'X_2\beta_2 + (X_1'X_1)^{-1}X_1'u. \tag{2.1.4}$$
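The omitted-variable setup above can be illustrated with a small simulation. The following is a minimal sketch and not part of the original analysis: the function names, the coefficient values and the way $x_2$ is generated from $x_1$ (through the hypothetical `link` parameter) are all illustrative assumptions. It fits the misspecified one-regressor model by OLS and compares the estimate with the true $\beta_1$ plus the extra term $(X_1'X_1)^{-1}X_1'X_2\beta_2$ that appears in Equation (2.1.4).

```python
import random


def ols_slope(x, y):
    """No-intercept OLS slope for one regressor: (x'x)^{-1} x'y."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def omitted_variable_demo(beta1=1.0, beta2=0.5, link=0.6, n=20000, seed=0):
    """Simulate y = beta1*x1 + beta2*x2 + u, then regress y on x1 only.

    `link` sets the sign and strength of the correlation between x1 and x2
    (an illustrative assumption, not taken from the source).
    Returns (short-regression estimate, bias term implied by Equation 2.1.4).
    """
    rng = random.Random(seed)
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [link * a + rng.gauss(0.0, 1.0) for a in x1]
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta1 * a + beta2 * b + e for a, b, e in zip(x1, x2, u)]
    estimate = ols_slope(x1, y)
    # (X1'X1)^{-1} X1'X2 * beta2: the second term of Equation (2.1.4)
    bias_term = ols_slope(x1, x2) * beta2
    return estimate, bias_term


if __name__ == "__main__":
    for beta2, link in [(0.5, 0.6), (0.5, -0.6), (-0.5, 0.6), (-0.5, -0.6)]:
        est, bias = omitted_variable_demo(beta2=beta2, link=link)
        print(f"beta2={beta2:+.1f} link={link:+.1f} "
              f"estimate={est:.3f} bias term={bias:+.3f}")
```

Running the four sign combinations shows the estimate of $\beta_1$ moving above or below its true value of 1 according to the signs of $\beta_2$ and of the correlation between $x_1$ and $x_2$.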


Taking expectations on both sides of the equation gives $E(\hat{\beta}_1) = \beta_1 + E[(X_1'X_1)^{-1}X_1'X_2]\beta_2$, given the assumption that $E(u \mid x_1) = 0$. The extent to which $\hat{\beta}_1$ is biased depends on the second term, which involves $\beta_2$ and the correlation between $x_1$ and $x_2$. Table 1.1 offers a summary of the bias in $\hat{\beta}_1$ when $x_2$ is omitted from the model specification.

Table 1.1 Summary of Bias in $\hat{\beta}_1$ When $x_2$ is Omitted

                  Corr($x_1$, $x_2$) > 0    Corr($x_1$, $x_2$) < 0
$\beta_2$ > 0     Positive bias             Negative bias
$\beta_2$ < 0     Negative bias             Positive bias

The bias in estimators caused by the omission of exogenous variables is called omission bias. Such bias is ubiquitous when a linear regression model is applied to travel behavior analysis, particularly in models aimed at capturing the causal relationships among activity and travel variables. More advanced estimation techniques have been developed for consistently estimating the impact of endogenous variables. These approaches are reviewed in Section 2.1.3.

2.1.2 Review of Simultaneous Equations Model (Joint Relationship Among Continuous Variables)

A linear regression model can accommodate only one dependent variable. In activity-based travel demand models, there is usually more than one continuous dependent variable of interest (e.g. travel time and activity duration). Exogenous variables that are unspecified or unobservable but influence these dependent variables simultaneously lead to correlations among the random disturbances of the individual linear regression models. As defined previously, these dependent variables have joint


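relationships. The Seemingly Unrelated Regression (SUR) model is an appropriate modeling framework for multiple continuous dependent variables, since it accommodates their joint relationships through correlations among the random errors. The SUR model takes the form $Y_i = X_i\beta_i + u_i$, $i = 1, 2, \ldots, M$, where $Y_i$ is the $i$th continuous dependent variable, $X_i$ is a vector of exogenous explanatory variables for $Y_i$, $\beta_i$ is a vector of model parameters for $X_i$, and $u_i$ is the random disturbance in the $i$th model. The set of equations may be written as

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix} = \begin{bmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_M \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_M \end{bmatrix} + \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_M \end{bmatrix}. \tag{2.1.5}$$

Let $u = [u_1', u_2', \ldots, u_M']'$ and assume $E[u \mid X_1, X_2, \ldots, X_M] = 0$. Assume

$$E[uu' \mid X_1, X_2, \ldots, X_M] = V = \begin{bmatrix} \sigma_{11}I & \sigma_{12}I & \cdots & \sigma_{1M}I \\ \sigma_{21}I & \sigma_{22}I & \cdots & \sigma_{2M}I \\ \vdots & \vdots & & \vdots \\ \sigma_{M1}I & \sigma_{M2}I & \cdots & \sigma_{MM}I \end{bmatrix}, \tag{2.1.6}$$

where $I$ represents the identity matrix.

Since each equation is a classical linear regression model, Ordinary Least Squares (OLS) estimators of the parameters in each equation are consistent but inefficient. OLS applied equation by equation does not use the information contained in the cross-equation error correlations and therefore yields less precise estimators. Instead, the Generalized Least Squares (GLS) method provides both consistent and efficient estimators for all the parameters in the model system, where

$$\hat{\beta}_{GLS} = [X'V^{-1}X]^{-1}[X'V^{-1}Y]. \tag{2.1.7}$$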


Let

$$\Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1M} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2M} \\ \vdots & \vdots & & \vdots \\ \sigma_{M1} & \sigma_{M2} & \cdots & \sigma_{MM} \end{bmatrix},$$

then $V = \Sigma \otimes I$ ($\otimes$ denotes the Kronecker product) and $V^{-1} = \Sigma^{-1} \otimes I$. Thus,

$$\hat{\beta}_{SUR} = [X'(\Sigma^{-1} \otimes I)X]^{-1}[X'(\Sigma^{-1} \otimes I)Y] \quad \text{and} \quad \mathrm{Var}(\hat{\beta}_{SUR}) = [X'(\Sigma^{-1} \otimes I)X]^{-1}. \tag{2.1.8}$$

A feasible estimator of the matrix $\Sigma$ can be obtained by estimating each of the $M$ equations separately by OLS and using the residuals to estimate $\sigma_{ij}$, i.e. $\hat{\sigma}_{ij} = e_i'e_j / T$. Here, $T$ is the sample size and $e_i$, $e_j$ represent the OLS residuals from equations $i$ and $j$, respectively.

In summary, the GLS method, which accommodates the joint relationships among the continuous variables in the variance-covariance matrix of the random error terms, needs to be applied to obtain both consistent and efficient estimators in the SUR model. Equation-by-equation estimation using OLS provides consistent but inefficient estimators.

2.1.3 Review of Structural Equations Model (Causal Relationship Among Continuous Variables)

Similar to the SUR model, the Structural Equations Model (SEM) is also a set of linear regression models, but it differs from the SUR model in that it includes dependent variables as explanatory variables in the modeling system. This characteristic makes SEM a powerful modeling methodology for analyzing the causal relationships among continuous endogenous variables. A typical structural equations model (with $G$ continuous endogenous variables) is defined by a matrix equation system as shown in


Equation (2.1.9).

$$\begin{bmatrix} Y_1 \\ \vdots \\ Y_G \end{bmatrix} = B \begin{bmatrix} Y_1 \\ \vdots \\ Y_G \end{bmatrix} + \Gamma X + \begin{bmatrix} u_1 \\ \vdots \\ u_G \end{bmatrix} \tag{2.1.9}$$

This equation can be rewritten as

$$Y = BY + \Gamma X + u \tag{2.1.10}$$

or

$$Y = (I - B)^{-1}(\Gamma X + u), \tag{2.1.11}$$

where
Y: a column vector of dependent variables,
B: a matrix of parameters associated with right-hand-side endogenous variables,
X: a column vector of exogenous variables,
$\Gamma$: a matrix of parameters associated with exogenous variables, and
u: a column vector of error terms associated with the dependent variables.

SEM specifies the dependent variables $Y$ as explanatory variables alongside the exogenous explanatory variables and estimates the parameter matrix $B$ to capture the inherent causal relationships among the dependent variables $Y$. The non-zero correlation between $Y$ and $u$ induced by the simultaneous equations violates the OLS assumption that the error term has zero mean conditional on the regressors, $E(u \mid Y, X) = 0$. Thus, applying OLS to each single linear regression model will not yield a consistent estimator of the parameter matrix $B$. Based on the Instrumental Variable (IV) approach, econometricians developed the Two-Stage Least Squares (2SLS) and Three-Stage Least Squares (3SLS) methods to obtain consistent estimators of the parameters associated with endogenous variables. The 3SLS estimator is more efficient than the 2SLS estimator, since the former accommodates the unequal variances of $u$ across equations and the correlations among them.
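The inconsistency of OLS and the instrumental-variable remedy described above can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the estimation code used in this study: it simulates a hypothetical two-equation system in which $y_2$ appears on the right-hand side of the equation for $y_1$ and the two error terms share a common unobserved factor. All names (`simulate`, `ols_and_iv`, `solve2`) and all parameter values are illustrative.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))


def solve2(m, c):
    """Solve the 2x2 linear system m @ b = c by Cramer's rule."""
    (a11, a12), (a21, a22) = m
    det = a11 * a22 - a12 * a21
    return [(c[0] * a22 - a12 * c[1]) / det,
            (a11 * c[1] - a21 * c[0]) / det]


def simulate(n=20000, beta1=1.0, gamma=0.5, beta2=1.0, seed=1):
    """Draw a sample from y1 = beta1*x1 + gamma*y2 + u1, y2 = beta2*x2 + u2."""
    rng = random.Random(seed)
    x1, x2, y1, y2 = [], [], [], []
    for _ in range(n):
        a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        common = rng.gauss(0.0, 1.0)  # shared unobserved factor
        u1 = 0.8 * common + 0.6 * rng.gauss(0.0, 1.0)
        u2 = 0.8 * common + 0.6 * rng.gauss(0.0, 1.0)
        v2 = beta2 * b + u2
        x1.append(a)
        x2.append(b)
        y2.append(v2)
        y1.append(beta1 * a + gamma * v2 + u1)
    return x1, x2, y1, y2


def ols_and_iv(x1, x2, y1, y2):
    """Return ([beta1, gamma] by OLS, [beta1, gamma] by IV)."""
    # OLS: (X'X)^{-1} X'y1 with regressors X = [x1, y2]
    ols = solve2([[dot(x1, x1), dot(x1, y2)], [dot(y2, x1), dot(y2, y2)]],
                 [dot(x1, y1), dot(y2, y1)])
    # IV: (Z'X)^{-1} Z'y1 with instruments Z = [x1, x2]
    iv = solve2([[dot(x1, x1), dot(x1, y2)], [dot(x2, x1), dot(x2, y2)]],
                [dot(x1, y1), dot(x2, y1)])
    return ols, iv


if __name__ == "__main__":
    ols, iv = ols_and_iv(*simulate())
    print("OLS estimate of gamma:", round(ols[1], 3))
    print("IV  estimate of gamma:", round(iv[1], 3))
```

In this simulated setting the OLS estimate of the coefficient on $y_2$ drifts away from its true value of 0.5 because $y_2$ is correlated with $u_1$, whereas the IV estimate, which uses the exogenous $x_2$ as an instrument, stays close to it.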


Here, the mechanics of 2SLS are illustrated with a simple example. Suppose one has the following structural equations model with two dependent variables:

$$\begin{aligned} y_1 &= \beta_1 x_1 + \gamma y_2 + u_1 \\ y_2 &= \beta_2 x_2 + u_2 \end{aligned} \tag{2.1.12}$$

where $y_2$ is specified in the model for $y_1$, and $x_1$ and $x_2$ are exogenous variables; thus $x_1$ and $x_2$ are uncorrelated with $u_1$ and $u_2$. The errors $u_1$ and $u_2$ are correlated due to common unspecified variables. Because $y_2$ must be correlated with $u_2$, $y_2$ must also be correlated with $u_1$. Thus, $y_2$ is an endogenous variable in the model for $y_1$, and the OLS method cannot yield a consistent estimator of $\gamma$. One may obtain a consistent estimator in the following way. Let $Z = [x_1\ x_2]$, $X = [x_1\ y_2]$ and $b = [\beta_1\ \gamma]'$. Let $\hat{b} = (Z'X)^{-1}Z'y_1$, where $Z$ is called the matrix of instrumental variables. Then

$$\hat{b} = (Z'X)^{-1}Z'(Xb + u_1) = b + (Z'X)^{-1}Z'u_1,$$

$$\operatorname*{plim}_{N \to \infty} \hat{b} = b + \left[\operatorname*{plim}_{N \to \infty} \frac{Z'X}{N}\right]^{-1} \operatorname*{plim}_{N \to \infty} \frac{Z'u_1}{N}.$$

Since $x_1$ and $x_2$ are uncorrelated with $u_1$ and $y_2$ is correlated with $x_2$, $\operatorname{plim}_{N \to \infty}(Z'u_1/N) = 0$ and $\operatorname{plim}_{N \to \infty}(Z'X/N) \neq 0$. Thus, $\operatorname{plim}_{N \to \infty} \hat{b} = b$, which indicates that $(Z'X)^{-1}Z'y_1$ is a consistent estimator of both $\beta_1$ and $\gamma$.

Alternatively, it can be shown that a consistent estimator can be achieved by the two-stage least squares (2SLS) method. In the first stage, regress $y_2$ on $x_2$ to get $\hat{\beta}_2 = (x_2'x_2)^{-1}x_2'y_2$ and $\hat{y}_2 = x_2\hat{\beta}_2 = x_2(x_2'x_2)^{-1}x_2'y_2$. In the second stage, regress $y_1$ on $x_1$ and $\hat{y}_2$ to obtain consistent estimators of $\beta_1$ and $\gamma$.

In addition to the Least Squares (LS) approach, the Maximum Likelihood (ML) method can also be applied to estimate the parameters in SEM consistently. The Limited-Information Maximum Likelihood (LIML) estimate and the Full-Information Maximum Likelihood


(FIML) estimate in the ML method are the counterparts of 2SLS and 3SLS in the Least Squares (LS) method. With normally distributed disturbances, FIML is not only consistent but also efficient among all the estimators. Other advanced estimation approaches, such as Asymptotically Distribution-Free Weighted Least Squares (ADF or ADF-WLS), which does not require a distributional assumption on the random terms, have been developed and applied in the literature (see Golob, 2003 for a more detailed review of estimation methods for SEM).

The endogenous variables in SEM can be either continuous or ordered in nature. For an ordered discrete variable $x$, it is assumed that there is a latent continuous variable $u$ which is normally distributed with zero mean and unit variance. The connection between $x$ and $u$ is that $x = i$ is equivalent to $\tau_{i-1} < u < \tau_i$, where $\tau_0 = -\infty < \tau_1 < \tau_2 < \cdots < \tau_{k-1}$ and $\tau_k = +\infty$. Here, all the $\tau_i$ are called threshold values. If there are $k$ categories, there are $k-1$ unknown thresholds. Essentially, the procedure is exactly the same as developing an ordered probit model without any explanatory variables. A series of conditions can be established as:

If $\tau_0 < u < \tau_1$, $x = 1$, then $\mathrm{Prob}(x = 1) = \Phi(\tau_1) - \Phi(-\infty) = \Phi(\tau_1)$;
If $\tau_1 < u < \tau_2$, $x = 2$, then $\mathrm{Prob}(x = 2) = \Phi(\tau_2) - \Phi(\tau_1)$;
...
If $\tau_{k-1} < u < \tau_k$, $x = k$, then $\mathrm{Prob}(x = k) = \Phi(+\infty) - \Phi(\tau_{k-1}) = 1 - \Phi(\tau_{k-1})$.

Then one can formulate a likelihood function for all the observations as

$$L = \prod_{i=1}^{N} \Phi(\tau_1)^{I(x_i = 1)} \left[\Phi(\tau_2) - \Phi(\tau_1)\right]^{I(x_i = 2)} \cdots \left[1 - \Phi(\tau_{k-1})\right]^{I(x_i = k)}, \tag{2.1.13}$$

where $I(\cdot)$ equals one if the condition in parentheses holds and zero otherwise.
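Because the likelihood in Equation (2.1.13) contains no explanatory variables, its maximizer has a simple closed form: the fitted category probabilities equal the observed category shares, so each threshold is the standard normal quantile of the cumulative share. The sketch below illustrates this under that assumption; the function names and the example counts are hypothetical, and every category is assumed to be non-empty.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def ordered_thresholds(counts):
    """Maximum likelihood thresholds for an ordered variable with no covariates.

    counts[j] is the number of observations in category j+1. With k categories
    the function returns the k-1 finite thresholds.
    """
    total = sum(counts)
    cumulative = 0
    thresholds = []
    for c in counts[:-1]:
        cumulative += c
        thresholds.append(_STD_NORMAL.inv_cdf(cumulative / total))
    return thresholds


def category_probs(thresholds):
    """Category probabilities implied by the thresholds (the terms of 2.1.13)."""
    cdf = [0.0] + [_STD_NORMAL.cdf(t) for t in thresholds] + [1.0]
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]


if __name__ == "__main__":
    # Hypothetical counts for a three-category variable (e.g. 0, 1, 2+ cars)
    t = ordered_thresholds([50, 30, 20])
    print("thresholds:", [round(v, 4) for v in t])
    print("implied shares:", [round(p, 4) for p in category_probs(t)])
```

The implied shares reproduce the observed shares, which is a quick check that the thresholds maximize the likelihood in this covariate-free case.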


The thresholds are estimated by maximizing the log-likelihood function. Using the estimated thresholds, this procedure transforms ordered discrete variables into continuous variables, which serve as endogenous variables in SEM in place of the original ordinal variables.

SEM has been widely applied in the context of travel behavior analysis. The literature is briefly reviewed as follows.

Kitamura et al. (1992) and Golob et al. (1994) are the first known applications of SEM to joint activity duration and travel time data. Kitamura (1996) and Pas (1996) are two overviews that include discussions of the role of SEM in activity and time-use modeling.

Lu and Pas (1997) present an SEM of in-home activities, out-of-home activities (by type), and travel (measured in various ways), conditional on socioeconomic variables. Estimation is by normal maximum likelihood, and the emphasis is on interpretation of the direct and indirect effects. The data are derived from the Greater Portland, Oregon metropolitan area.

Golob and McNally (1997) present an SEM of the interaction of household heads in activity and travel demand, with data from Portland. Activities are divided into three types, and SEM results are compared using maximum likelihood (ML) and generalized least squares (GLS) estimation methods. They conclude that GLS methods should be used to estimate SEM when it is applied to activity participation data.

Fujii and Kitamura (2000) studied the latent demand effects of the opening of new freeways. The authors used an SEM to determine the effects of commute duration and scheduling variables on after-work discretionary activities and their trips. Data were collected from the Osaka-Kobe region of Japan.


Kuppam and Pendyala (2000) presented three SEMs estimated by GLS using data from Washington, DC. The models focused on relationships between: (1) activity duration and trip generation, (2) durations of in-home and out-of-home activities, and (3) activity frequency and trip chain generation.

Simma and Axhausen (2001) developed an SEM that captured relationships between male and female heads of household with regard to activity and travel demands. The dependent variables included car ownership, distances traveled by males and females, and male and female trips by two types of activities, using data from Upper Austria.

Meka and Pendyala (2002) used SEM with Southeast Florida data to investigate the interaction between two adults in one household in terms of their travel and activity time allocation. An interesting trade-off between the two adults in non-work travel time and non-work activity time was quantified, and the interaction of travel decisions between household members was verified by SEM.

Golob (2003) pointed out that a current limitation is that SEM estimation methods support only dichotomous and ordered polychotomous categorical variables. This implies that a multinomial discrete choice variable must be represented in terms of a multivariate choice model by breaking it down into component dichotomous variables linked by free error covariances (Muthen, 1979). However, in that case, the discrete choice model is inconsistent with utility maximization theory when it is embedded into the current SEM system.


2.1.4 Review of Discrete Choice Model

In travel demand models, unordered discrete dependent variables, typified by mode choice, are usually modeled in a discrete choice modeling framework. McFadden (1973) first derived the multinomial logit model based on random utility and utility maximization theory. This modeling methodology is briefly reviewed as follows.

Given that each individual $n$ has a feasible choice set denoted by $C_n$, define $J_n \le J$ to be the number of feasible choices. Alternative $i$ in $C_n$ is chosen by decision maker $n$ if and only if the random utility $U_{in}$ corresponding to alternative $i$ is the greatest among all $U_{jn}$, where $j \in C_n$, $j \neq i$:

$$P_n(i) = \Pr(U_{in} \ge U_{jn},\ \forall j \in C_n,\ j \neq i)$$

and

$$P_n(i) = \Pr[U_{in} \ge \max(U_{jn}),\ \forall j \in C_n,\ j \neq i].$$

Let $U_{in} = V_{in} + \varepsilon_{in}$, where $V_{in}$ is the systematic component of random utility. In practice, $V_{in}$ is usually parameterized as a linear combination of explanatory variables, $(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n)$. This linear specification is the same as the specification in the linear regression model. $\varepsilon_{in}$ is the random component of random utility, which accounts for unobserved factors that influence the random utility value. In the linear regression model, the random component is assumed to be normally distributed, whereas in the multinomial logit model, the random components $\varepsilon_{in}$ are assumed to be i.i.d. standard Gumbel distributed. The reason for selecting this distribution in place of the normal distribution is to derive a simple probability function for the observations by taking advantage of two properties of the Gumbel distribution:

1. The maximum of a number of independent Gumbel random variables with identical scale parameter is still Gumbel distributed.


2. The difference between two independent Gumbel random variables with identical scale parameter is logistically distributed.

It can be shown that

$$P_n(i) = \frac{\exp(V_{in})}{\sum_{j=1}^{I} \exp(V_{jn})},$$

where $I$ is the number of alternatives in the choice set. The parameters $\beta_i$ in the systematic components can be easily estimated by the Maximum Likelihood Estimation (MLE) method.

One serious problem in the multinomial logit model is its IIA (Independence of Irrelevant Alternatives) property. The IIA property holds if, for a specific individual, the ratio of the choice probabilities of any two alternatives is entirely unaffected by the systematic utilities of any other alternatives. The IIA problem can be expressed in terms of the cross-elasticity of logit probabilities. The multinomial logit model has uniform cross-elasticities: the cross-elasticities with respect to a change in an attribute affecting only the utility of alternative $j$ are equal for all alternatives other than $j$. This is unrealistic in many real cases.

For deriving the probability function, it is critical to assume that the random components are identically and independently distributed, and the IIA problem is rooted in this assumption. If this assumption is violated, one cannot obtain such a simple probability function. In real cases, one is usually unable to specify all the explanatory variables in the systematic components because quite a few variables are unobservable or unquantifiable. Omitted influential variables are absorbed into the random components. If two random components account for common omitted variables, they will be correlated rather than independent. This situation is analogous to that in the SUR model discussed in Section 2.1.2. However, the correlation among the random error terms in a discrete choice model is more harmful than the correlation in multiple linear regression models. In multiple linear regression models, even if the correlation is not


accommodated, the estimators of the coefficients for exogenous variables are still consistent. However, in the multinomial logit model, ignored correlations or unequal variances among random error terms lead to inconsistent estimators of the coefficients, because the probability function used for estimating the coefficients is not correctly formulated under the incorrect assumption that the random error terms are mutually independent and identically distributed. Since the unobserved factors influencing such discrete choices are mutually correlated, the choices should be considered to have joint relationships according to the definition used in this dissertation.

The nested logit model is widely adopted in travel demand modeling and can overcome the IIA problem by accommodating the joint relationships among discrete choices. It is assumed that a routine of sequential choice behavior exists, in which decision makers first select a choice combination (nest) composed of two or more alternatives with correlated random error terms and then select an alternative within the nest. The utility function for alternative $i$ in one nest can be formulated as $U_i = V_i + \nu_n + \varepsilon_i$, where the random component of the utility function consists of two parts, $\nu_n$ and $\varepsilon_i$. $\nu_n$ is a common random component appearing in all the utility functions in the nest, and $\varepsilon_i$ is i.i.d. distributed. With the presence of $\nu_n$, the correlation among the alternatives in one nest can be accommodated in the model. Assume $\varepsilon_i$ is standard Gumbel distributed with scale parameter 1 and $\nu_n$ is distributed so that $\nu_n + \varepsilon_i$ is Gumbel distributed with a positive scale parameter $\mu$. Since the variance of $(\nu_n + \varepsilon_i)$ must be greater than that of $\varepsilon_i$, and the variance of a Gumbel distribution is equal to $\pi^2/(6\mu^2)$, $\mu$ must fall into the range from 0 to 1. Checking this condition has become a standard way to examine whether the selected alternatives belong to the same nest when the nested logit model is applied.
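The contrast between the IIA property of the multinomial logit model and the nested structure just described can be illustrated with a few lines of code. This is a minimal sketch, not the specification estimated in this study: it assumes the common two-level nested logit form in which the within-nest scale is normalized to 1 and a single upper-level scale `mu` (between 0 and 1) multiplies each nest's log-sum. The function names, the alternatives and the utility values are hypothetical.

```python
import math


def mnl_probs(v):
    """Multinomial logit: P(i) = exp(V_i) / sum_j exp(V_j)."""
    top = max(v.values())  # subtract the maximum for numerical stability
    expv = {k: math.exp(x - top) for k, x in v.items()}
    total = sum(expv.values())
    return {k: e / total for k, e in expv.items()}


def nested_logit_probs(v, nests, mu):
    """Two-level nested logit with within-nest scale 1 and upper-level scale mu.

    P(i) = P(nest) * P(i | nest), where P(nest) is a logit over mu * logsum.
    This parameterization is an assumption made for illustration.
    """
    logsum = {n: math.log(sum(math.exp(v[a]) for a in alts))
              for n, alts in nests.items()}
    nest_p = mnl_probs({n: mu * ls for n, ls in logsum.items()})
    probs = {}
    for n, alts in nests.items():
        within = mnl_probs({a: v[a] for a in alts})
        for a in alts:
            probs[a] = nest_p[n] * within[a]
    return probs


if __name__ == "__main__":
    nests = {"auto": ["drive", "carpool"], "transit": ["transit"]}
    for carpool_v in (0.5, 2.0):
        v = {"drive": 1.0, "carpool": carpool_v, "transit": 0.0}
        m = mnl_probs(v)
        n = nested_logit_probs(v, nests, mu=0.5)
        print(f"V_carpool={carpool_v}: "
              f"MNL drive/transit={m['drive'] / m['transit']:.3f}, "
              f"nested drive/transit={n['drive'] / n['transit']:.3f}")
```

Under the multinomial logit model the drive-to-transit probability ratio is unaffected by the utility of carpool, which is the IIA property. Under the nested form with `mu` below 1 the ratio changes, because drive and carpool share a nest; setting `mu` to 1 collapses the nested model back to the multinomial logit model.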


A disadvantage of the nested logit model is that the correlation between two choices in different nests is completely ignored. No substitution pattern is allowed between two choices in two different nests, which is sometimes not the case in reality. To accommodate such a substitution pattern, the cross-nested logit model can be adopted by taking a formula belonging to the GEV (Generalized Extreme Value) family proposed by McFadden (1974); the mathematical form of the cross-nested logit model is therefore not unique. Applications of the cross-nested logit model can be found, although not widely, in the literature on travel behavior analysis and travel demand modeling (Vovsha, 1995).

A multinomial probit model can be formulated if the random error terms in the utility functions are assumed to be multivariate normally distributed instead of Gumbel (extreme value) distributed. Owing to the desirable properties of the normal distribution, the variances and covariances associated with the utility functions can be accommodated in multivariate normally distributed random error terms. In the past decade, considerable advances have been made in the estimation technique for the multinomial probit model. The GHK simulator (2000) was developed for computing the likelihood value of the multinomial probit model, particularly for the case where the number of alternatives in the choice set is greater than three. However, in the literature on travel demand analysis, applications of the multinomial probit model are rarely found, presumably because of the difficulty of its computation and estimation.

Bhat (1995) proposed the heteroskedastic logit model, which assumes that the random error terms in the utility functions are still independently Gumbel distributed but with unequal variances. Besides accommodating correlations among random error terms, assuming unequal variances is an alternative way to relax the IIA problem


in the multinomial logit model. The likelihood function of the heteroskedastic logit model does not have a closed form, so a numerical method, such as the Laguerre-Gauss quadrature method adopted there, is required to approximate the log-likelihood function and to estimate the model parameters.

The mixed logit model is a generalization of the multinomial logit model. It involves the integration of the multinomial logit formula over the distribution of random parameters. The typical probability function of the mixed logit model is

$$P_{qi}(\theta) = \int \frac{\exp(\beta_i' x_{qi})}{\sum_{j=1}^{I} \exp(\beta_j' x_{qj})} f(\beta \mid \theta)\, d\beta, \tag{2.1.14}$$

where $\beta$ represents the random parameters, whose density function is $f(\beta \mid \theta)$, and $\theta$ is the vector of constant parameters associated with the density function. In practice, only a subvector of $\beta$ is randomized and the rest remain constant. If the $\beta_0$ term (alternative-specific constant) in the utility function is randomized, the mixed logit model allows a flexible random error structure for a comprehensive accommodation of the substitution pattern among the alternatives. The probability function (2.1.14) does not have a closed form, so a numerical method is required to approximate the probability values. Train and McFadden (2000) use the Maximum Simulated Likelihood Estimation (MSLE) method to estimate a mixed logit model, where the Monte Carlo method with a pseudo-random sequence is applied to approximate the likelihood function without closed form. Bhat (1996) first adopted the Halton sequence, a quasi-random sequence that covers the distributional domain more evenly, to approximate the likelihood function of the mixed logit model and estimate the model parameters. The Halton sequence used in MSLE takes much less time than a pseudo-random sequence to reach the same level of estimation


accuracy. With the advance of simulation techniques in estimation, the mixed logit model has become prevalent in the area of travel behavior modeling and is considered a new and promising generation of discrete choice models (Walker 2002).

2.1.5 Endogenous Variable in Discrete Choice Model

A discrete choice model may also contain endogenous variables, that is, variables specified in the utility function that are correlated with the random error terms. As in the linear regression model, the coefficient of an endogenous variable cannot be consistently estimated in a conventional discrete choice model. Also as in the linear regression model, instrumental variables can be used for consistently estimating the coefficient of an endogenous variable. There are two types of endogenous variables, continuous and discrete (endogenous dummy variable), and correspondingly two different approaches may be needed. The joint model system used for linear regression models can be adopted to overcome the endogeneity problem in discrete choice models. As long as the error correlations are accommodated in the model system, the coefficient of the endogenous variable can be consistently estimated. Such a modeling methodology is highly desirable for causal analysis among discrete choices, which entails the specification of endogenous variables in utility functions. This modeling methodology is proposed in the next section (Section 2.2).
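The simulation idea behind the mixed logit estimation reviewed in Section 2.1.4 can be sketched as follows. This is an illustrative sketch only, assuming a single normally distributed random coefficient and one attribute per alternative; the function names and numerical values are hypothetical and do not come from the models estimated in this study. Halton points are generated by the radical-inverse construction, mapped to normal draws through the inverse normal distribution function, and the logit formula is averaged over the draws to approximate the integral in Equation (2.1.14).

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def halton(index, base):
    """Radical-inverse value of a positive integer index in the given base."""
    f, r = 1.0, 0.0
    while index > 0:
        f /= base
        r += f * (index % base)
        index //= base
    return r


def simulated_mixed_logit_prob(x, chosen, mean, sd, draws=200, base=2):
    """Approximate a mixed logit probability by averaging over Halton draws.

    x      : attribute value of each alternative (one attribute, illustrative)
    chosen : index of the alternative whose probability is wanted
    mean,sd: parameters of the normal density of the random coefficient
    """
    total = 0.0
    for k in range(1, draws + 1):
        beta = mean + sd * _STD_NORMAL.inv_cdf(halton(k, base))
        expv = [math.exp(beta * xj) for xj in x]
        total += expv[chosen] / sum(expv)
    return total / draws


if __name__ == "__main__":
    print([halton(k, 2) for k in range(1, 6)])
    x = [1.0, 0.0]
    print("fixed coefficient :", simulated_mixed_logit_prob(x, 0, 0.5, 0.0))
    print("random coefficient:", simulated_mixed_logit_prob(x, 0, 0.5, 1.0))
```

With the standard deviation set to zero the simulated value reduces to the ordinary logit probability, which gives a simple check of the averaging step.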


2.2 Modeling Methodology for Causal Analysis in Discrete Choices (Discrete Choice Modeling Methodology with Endogenous Variable)

2.2.1 Recursive Bivariate Probit Model (Causal Relationship Between Two Binary Choice Variables)

The causal relationship between two binary choices can be modeled in a bivariate probit modeling framework. The modeling methodology is presented in the context of causal analysis between trip chaining pattern and mode choice. The term "trip chain" refers to a sequence of trips that begins at home, involves visits to one or more other places, and ends at home. Depending on the number of places visited within the tour or chain, the tour may be classified into two patterns: simple and complex. A tour or chain with a single stop or activity outside the home location is defined as a simple tour, whereas a tour or chain with more than one stop outside the home location is defined as a complex tour. Depending on the usage of the auto in the tour, the tour is classified into two modes: auto and non-auto. If the tour's complexity/simplicity and the auto/non-auto mode choice are treated as two binary choices, the bivariate probit model can be formulated at the tour level to analyze their probabilities simultaneously while accommodating random error correlation. It is very important to allow correlation between the random error terms in the model system. Analogous to FIML estimation in a linear regression model system, the coefficient of the endogenous variable can be consistently estimated in two binary probit models as long as the correlation between the two error terms is accommodated.


The general formulation is as follows:

$$\begin{aligned} M_q^* &= \gamma' z_q + \alpha T_q + \varepsilon_q \\ T_q^* &= \beta' x_q + \eta M_q + \omega_q \end{aligned} \tag{2.2.1}$$

where
$q$ is an index for observations of tours ($q = 1, 2, \ldots, Q$);
$M_q^*$ is a latent variable representing the mode choice for tour $q$;
$T_q^*$ is a latent variable representing the complexity of tour $q$;
$M_q = 1$ if $M_q^* > 0$, $= 0$ otherwise; i.e., $M_q$ is a dummy variable indicating whether tour $q$ uses the auto mode;
$T_q = 1$ if $T_q^* > 0$, $= 0$ otherwise; i.e., $T_q$ is a dummy variable indicating whether tour $q$ is complex;
$z_q$ is a vector of explanatory variables for $M_q^*$;
$x_q$ is a vector of explanatory variables for $T_q^*$;
$\gamma$, $\beta$ are two vectors of model coefficients associated with the explanatory variables $z_q$ and $x_q$, respectively;
$\alpha$ is a scalar coefficient for $T_q$ to measure the impact of the tour's complexity on mode choice;
$\eta$ is a scalar coefficient for $M_q$ to measure the impact of mode choice on the choice of tour complexity;
$\varepsilon_q$ and $\omega_q$ are random error terms, which are standard bivariate normally distributed with zero means, unit variances, and correlation $\rho$, i.e. $\varepsilon_q, \omega_q \sim \phi_2(0, 0, 1, 1, \rho)$.

Based on this normality assumption, one can derive the probability of each possible combination of binary choices for tour $q$:

$$\mathrm{prob}(M_q = 0, T_q = 0) = \Phi_2[-\gamma' z_q,\ -\beta' x_q,\ \rho] \tag{2.2.2}$$

$$\mathrm{prob}(M_q = 1, T_q = 0) = \Phi_1[-(\beta' x_q + \eta)] - \Phi_2[-\gamma' z_q,\ -(\beta' x_q + \eta),\ \rho] \tag{2.2.3}$$


$$\mathrm{prob}(M_q = 0, T_q = 1) = \Phi_1[-(\gamma' z_q + \alpha)] - \Phi_2[-(\gamma' z_q + \alpha),\ -\beta' x_q,\ \rho] \tag{2.2.4}$$

$$\mathrm{prob}(M_q = 1, T_q = 1) = 1 - \Phi_1[-(\gamma' z_q + \alpha)] - \Phi_1[-(\beta' x_q + \eta)] + \Phi_2[-(\gamma' z_q + \alpha),\ -(\beta' x_q + \eta),\ \rho] \tag{2.2.5}$$

where $\Phi_1[\cdot]$ is the cumulative distribution function of the standard univariate normal distribution and $\Phi_2[\cdot]$ is the cumulative distribution function of the standard bivariate normal distribution.

The sum of the probabilities for the four combinations of two binary choices should be equal to one, i.e.,

$$\mathrm{prob}(M_q = 0, T_q = 0) + \mathrm{prob}(M_q = 1, T_q = 0) + \mathrm{prob}(M_q = 0, T_q = 1) + \mathrm{prob}(M_q = 1, T_q = 1) = 1. \tag{2.2.6}$$

Substituting equations (2.2.2) through (2.2.5) into equation (2.2.6), it can be shown that

$$\Phi_2[-\gamma' z_q,\ -\beta' x_q,\ \rho] + \Phi_2[-(\gamma' z_q + \alpha),\ -(\beta' x_q + \eta),\ \rho] = \Phi_2[-\gamma' z_q,\ -(\beta' x_q + \eta),\ \rho] + \Phi_2[-(\gamma' z_q + \alpha),\ -\beta' x_q,\ \rho]. \tag{2.2.7}$$

This equation does not hold unless either $\alpha$ or $\eta$ is equal to zero. This requirement, known as the logical consistency condition (Maddala, 1983), leads to two different recursive simultaneous modeling structures suggesting two different causal relationships:

1. $\alpha = 0$, $\eta \neq 0$ (Mode Choice → Tour Complexity)

$$\begin{aligned} M_q^* &= \gamma' z_q + \varepsilon_q \\ T_q^* &= \beta' x_q + \eta M_q + \omega_q \end{aligned} \tag{2.2.8}$$

In this structure, mode choice is predetermined as per the first functional relationship. Then, the choice of mode is specified as a dummy variable in the second functional relationship for tour complexity to directly measure the impact of mode choice on the complexity of the trip chain or tour.


2. $\alpha \neq 0$, $\eta = 0$ (Tour Complexity → Mode Choice)

$$\begin{aligned} M_q^* &= \gamma' z_q + \alpha T_q + \varepsilon_q \\ T_q^* &= \beta' x_q + \omega_q \end{aligned} \tag{2.2.9}$$

Conversely, one may consider the alternative structure in which tour complexity is predetermined as per the second functional relationship. The complexity of the tour is specified as an explanatory variable influencing mode choice as per the first functional relationship.

Thus, the desirable feature of the bivariate probit model, in which the coefficients of the two endogenous dummy variables do not coexist in both functional relationships, provides an appropriate modeling framework for analyzing the unidirectional causality between tour complexity and mode choice.

To facilitate formulating likelihood functions, equations (2.2.2) through (2.2.5) can be rewritten in a format including only the cumulative distribution function of the standard bivariate normal distribution:

$$\mathrm{prob}(M_q = 0, T_q = 0) = \Phi_2[-\gamma' z_q,\ -\beta' x_q,\ \rho] \tag{2.2.10}$$

$$\mathrm{prob}(M_q = 1, T_q = 0) = \Phi_2[\gamma' z_q,\ -(\beta' x_q + \eta),\ -\rho] \tag{2.2.11}$$

$$\mathrm{prob}(M_q = 0, T_q = 1) = \Phi_2[-(\gamma' z_q + \alpha),\ \beta' x_q,\ -\rho] \tag{2.2.12}$$

$$\mathrm{prob}(M_q = 1, T_q = 1) = \Phi_2[\gamma' z_q + \alpha,\ \beta' x_q + \eta,\ \rho] \tag{2.2.13}$$

Equations (2.2.10) through (2.2.13) and the corresponding likelihood functions can be summarized by the following general formulations for the two different unidirectional causal structures (Greene, 2003):

1. $\alpha = 0$, $\eta \neq 0$ (Mode Choice → Tour Complexity)

$$\mathrm{prob}_q = \Phi_2[\mu_q \gamma' z_q,\ \tau_q(\beta' x_q + \eta M_q),\ \mu_q \tau_q \rho] \tag{2.2.14}$$


$$L = \prod_{q=1}^{Q} \Phi_2[\mu_q\gamma'z_q,\;\tau_q(\beta'x_q+\eta M_q),\;\mu_q\tau_q\rho] \tag{2.2.15}$$

2. $\alpha \neq 0$, $\eta = 0$ (Tour Complexity → Mode Choice)

$$\text{prob}_q = \Phi_2[\mu_q(\gamma'z_q+\alpha T_q),\;\tau_q\beta'x_q,\;\mu_q\tau_q\rho] \tag{2.2.16}$$

$$L = \prod_{q=1}^{Q} \Phi_2[\mu_q(\gamma'z_q+\alpha T_q),\;\tau_q\beta'x_q,\;\mu_q\tau_q\rho] \tag{2.2.17}$$

where $\mu_q = 2M_q - 1$ and $\tau_q = 2T_q - 1$.

As the likelihood functions of the recursive bivariate probit model and the common bivariate probit model are virtually identical, parameter estimation can be accomplished using readily available software such as LIMDEP 8.0 (Greene, 2002). The endogenous nature of one of the dependent variables in the simultaneous equation system can be ignored in formulating the likelihood function. As in a multiple linear regression model system for continuous variables, the exogenous variables in the utility function that does not contain the endogenous dummy variable serve as instrumental variables for the endogenous dummy variable. This modeling methodology is also suitable for better estimating the impact of an endogenous dummy variable in a binary choice model, as long as good instrumental variables are available.

This modeling method has been frequently adopted in the economics literature. For example, Greene (1998) applied it to quantify the impact of the inclusion of a women's studies program on the offering of gender economics courses in liberal arts colleges. Rhine et al. (2006) used it to estimate the influence of being unbanked (not having a checking and/or savings account) on the probability of obtaining financial services from currency exchanges.
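The logical consistency condition can be checked numerically. The following is a minimal sketch, not the estimation code used in this study: it evaluates the four cell probabilities in the form of equations (2.2.10) through (2.2.13) and sums them. The helper names `phi2` and `cell_probs`, the Simpson-rule approximation of the bivariate normal CDF, and the numerical values in the usage note are all illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard univariate normal


def phi2(a, b, rho, n=2000, lo=-8.0):
    """Standard bivariate normal CDF P(X <= a, Y <= b) with correlation rho.

    Approximated by Simpson's rule on
    integral_{lo}^{a} pdf(x) * cdf((b - rho * x) / sqrt(1 - rho^2)) dx.
    Hypothetical helper for illustration; n must be even.
    """
    if a <= lo:
        return 0.0
    s = sqrt(1.0 - rho * rho)
    h = (a - lo) / n

    def f(x):
        return STD.pdf(x) * STD.cdf((b - rho * x) / s)

    total = f(lo) + f(a)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3.0


def cell_probs(gz, bx, alpha, eta, rho):
    """Cell probabilities keyed by (M, T); gz = gamma'z_q, bx = beta'x_q."""
    return {
        (0, 0): phi2(-gz, -bx, rho),
        (1, 0): phi2(gz, -(bx + eta), -rho),
        (0, 1): phi2(-(gz + alpha), bx, -rho),
        (1, 1): phi2(gz + alpha, bx + eta, rho),
    }
```

With assumed values `gz = 0.2`, `bx = -0.3` and `rho = 0.3`, the four probabilities sum to one when `alpha = 0` or `eta = 0`, but not when both coefficients are nonzero, which is the behavior that equation (2.2.7) describes.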


2.2.2 Simultaneous Equations Model Using Lee Transformation (Causal Relationship between One Continuous Variable and One Multinomial Choice)

In the context of travel behavior analysis, we often meet the situation where there are two dependent variables: one is continuous and the other is multinomial and unordered discrete in nature. For example, out-of-home activity type choice and activity duration form such a pair of dependent variables. An out-of-home activity can be shopping, recreation, or service (taking children to school, giving friends a ride to the airport, etc.), so the variable indicating activity type choice is unordered discrete in nature, while the activity duration for each activity type can be treated as a continuous dependent variable. People jointly make decisions on activity type and activity duration, but one usually cannot observe all the factors that influence both decisions. Thus, a modeling methodology is required to accommodate this kind of joint relationship. Analogous to the SUR model for continuous dependent variables, one may introduce correlation between the random errors into the joint model system.

However, discrete choice is usually modeled in a logit-based framework, where the random error terms must be Gumbel distributed. Unlike the normal distribution, the Gumbel distribution does not offer a natural way to accommodate correlation between two Gumbel error terms or between one Gumbel and one normal error term. From the perspective of multivariate statistics, there is an infinite number of possible joint distributions given two Gumbel marginal distributions and a constant correlation between them, or given one Gumbel marginal distribution, one normal marginal distribution and their constant correlation. In other words, one cannot derive a unique joint distribution for an identifiable likelihood function that allows correlation between the Gumbel random error terms in a logit-based


discrete choice model and the normal error terms in a linear regression model for continuous dependent variables.

As the bivariate normal distribution allows a constant correlation between its two marginal univariate normal distributions, Lee (1983) proposed a transformation that converts Gumbel error terms into normal error terms so as to establish a bivariate normal distribution between discrete choices and a continuous variable. Bhat (1998) applied this discrete-continuous modeling methodology based on the Lee transformation to jointly model travelers' activity-type choice for participation, home-stay duration before participation in an out-of-home activity, and out-of-home activity duration. In that study, activity-type choice is modeled as an unordered discrete variable using a multinomial logit model, while home-stay duration and out-of-home activity duration are modeled as continuous variables in two log-linear regression models. In addition, Bhat (2001) jointly modeled commuters' activity type choice, activity duration, and travel time deviation to the activity location relative to the direct travel time from work to home using the same modeling methodology.

Pendyala and Bhat (2004) extended this modeling framework by specifying endogenous unordered discrete variables and endogenous continuous variables as explanatory variables in each other's model functions so as to quantify the causal relationship between them. If the model developed in Bhat (1998) is considered an extension of the SUR model system that integrates an unordered discrete variable with continuous variables, then, by analogy, the model with endogenous variables as explanatory variables in Pendyala and Bhat (2004) is an extension of the SEM model that involves both continuous variables and an unordered discrete variable within a causal modeling system. Analogous to SUR


model, the joint estimation technique adopted in Bhat (1998) improves the efficiency of the parameter estimators but does not affect their consistency, which can be obtained from either a recursive or a joint estimation approach (see Section 2.4.2 for an examination). However, as in the SEM model, joint estimation is necessary for consistently estimating the parameters of endogenous variables. The modeling formulation and estimation method for the discrete-continuous methodology adopted in Pendyala and Bhat (2005) are as follows.

Let $i$ be an index for alternatives in the discrete choice set ($i = 1, 2, \ldots, I$) and let $q$ be an index for observations ($q = 1, 2, \ldots, Q$). Consider the following equation system:

$$u_{qi}^* = \beta_i'z_{qi} + \gamma_i a_q + \varepsilon_{qi}, \qquad a_q = \theta'x_q + \delta'D_q + \omega_q \tag{2.3.1}$$

$$\varepsilon_{qi} \sim \text{i.i.d. Gumbel}(0,1), \qquad \omega_q \sim N(0, \sigma^2).$$

$u_{qi}^*$ is the indirect (latent) utility associated with the $i$th choice for the $q$th observation; $z_{qi}$ and $x_q$ are vectors of explanatory variables with coefficient vectors $\beta_i$ and $\theta$; $D_q$ is a vector of dummy variables of length $I$ representing the discrete choice; $\delta$ is a column vector of coefficients, i.e. $(\delta_1, \delta_2, \ldots, \delta_I)'$, representing the effects of the different discrete choices on activity duration; $\varepsilon_{qi}$ is a standard extreme-value (Gumbel) distributed error term assumed to be independently and identically distributed (i.i.d.) across alternatives and observations; $a_q$ is a continuous variable and $\gamma_i$ is its coefficient. The error term $\omega_q$ is assumed to be i.i.d. normally distributed across observations with a mean of zero and a variance of $\sigma^2$. In Equation (2.3.1), alternative $i$ is chosen (i.e., $D_{qi} = 1$) if the utility of that alternative is the maximum over the $I$ alternatives. Defining

$$v_{qi} = \Big\{\max_{j=1,2,\ldots,I,\; j\neq i} u_{qj}^*\Big\} - \varepsilon_{qi} \tag{2.3.2}$$


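the utility maximizing condition for the choice of the $i$th alternative may be written as: $D_{qi} = 1$ if and only if $\beta_i'z_{qi} > v_{qi}$ (with $\gamma_i a_q$ added to the systematic utility when the continuous variable enters the utility function, as in Equation (2.3.4)). Let $F_i(v_{qi})$ represent the marginal distribution function of $v_{qi}$ implied by the assumed i.i.d. extreme value distribution for the error terms $\varepsilon_{qi}$ ($i = 1, 2, \ldots, I$). Using the properties that the maximum over identically distributed extreme value random terms is extreme value distributed, and that the difference of two identically distributed extreme value terms is logistically distributed, the implied distribution for $v_{qi}$ may be derived as:

$$F_i(y) = \Pr(v_{qi} < y) = \frac{\exp(y)}{\exp(y) + \sum_{j\neq i}\exp(\beta_j'z_{qj})} \tag{2.3.3}$$

Therefore,

$$\Pr(D_{qi}=1) = F_i(\beta_i'z_{qi} + \gamma_i a_q) \tag{2.3.4}$$

$$\Pr(D_{qi}=0) = 1 - F_i(\beta_i'z_{qi} + \gamma_i a_q) \tag{2.3.5}$$

Both $F_i(y)$ and $\Phi^{-1}(y)$ (the inverse of the standard normal cumulative distribution function) are monotone increasing functions, so

$$\Pr(D_{qi}=1) = \Pr[\beta_i'z_{qi} + \gamma_i a_q > v_{qi}] = \Pr\{\Phi^{-1}[F_i(\beta_i'z_{qi} + \gamma_i a_q)] > \Phi^{-1}[F_i(v_{qi})]\} \tag{2.3.6}$$

Let $v_{qi}^* = \Phi^{-1}[F_i(v_{qi})]$; then

$$\Pr(D_{qi}=1) = \Pr\{\Phi^{-1}[F_i(\beta_i'z_{qi} + \gamma_i a_q)] > v_{qi}^*\} \tag{2.3.7}$$

It can be easily shown that $v_{qi}^*$ is standard normally distributed. One can introduce a new latent variable:

$$D_{qi}^* = \Phi^{-1}[F_i(\beta_i'z_{qi} + \gamma_i a_q)] - v_{qi}^* \tag{2.3.8}$$

which indicates the binary response of $D_{qi}$, since

$$\Pr(D_{qi}^* > 0) = \Pr(\Phi^{-1}[F_i(\beta_i'z_{qi} + \gamma_i a_q)] - v_{qi}^* > 0) = \Pr(D_{qi} = 1) \tag{2.3.9}$$

$$\Pr(D_{qi}^* < 0) = \Pr(\Phi^{-1}[F_i(\beta_i'z_{qi} + \gamma_i a_q)] - v_{qi}^* < 0) = \Pr(D_{qi} = 0) \tag{2.3.10}$$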
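The claim that the transformed term is standard normally distributed can be illustrated by simulation. The sketch below is an illustration under stated assumptions rather than part of the estimation procedure: the systematic utilities `V` are arbitrary assumed values standing in for the systematic part of each utility function, and `gumbel` and `lee_transform_sample` are hypothetical helper names.

```python
import math
import random
from statistics import NormalDist, fmean, pvariance


def gumbel(rng):
    """One standard Gumbel(0, 1) draw by inverting the CDF exp(-exp(-x))."""
    u = rng.uniform(1e-12, 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return -math.log(-math.log(u))


def lee_transform_sample(V, i, n_draws, seed=0):
    """Simulate v = max_{j != i}(V_j + eps_j) - eps_i and return the
    transformed values Phi^{-1}[F_i(v)], with F_i as in equation (2.3.3)."""
    rng = random.Random(seed)
    inv_cdf = NormalDist().inv_cdf
    others = [v for j, v in enumerate(V) if j != i]
    rest = sum(math.exp(v) for v in others)  # sum over j != i of exp(V_j)
    out = []
    for _ in range(n_draws):
        eps_i = gumbel(rng)
        v = max(vj + gumbel(rng) for vj in others) - eps_i
        F = math.exp(v) / (math.exp(v) + rest)
        out.append(inv_cdf(F))
    return out


sample = lee_transform_sample([0.5, -0.2, 0.1], i=0, n_draws=20000, seed=1)
print(round(fmean(sample), 3), round(pvariance(sample), 3))
```

The printed sample mean and variance should be close to 0 and 1 respectively, whatever systematic utilities are assumed, because $F_i(v_{qi})$ is uniformly distributed on (0, 1).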


Equation system (2.3.1) may now be rewritten as:

$$D_{qi}^* = \Phi^{-1}[F_i(\beta_i'z_{qi} + \gamma_i a_q)] - v_{qi}^*, \quad D_{qi} = 1 \text{ if } D_{qi}^* > 0, \quad D_{qi} = 0 \text{ if } D_{qi}^* < 0; \qquad a_q = \theta'x_q + \delta'D_q + \omega_q \tag{2.3.11}$$

A correlation $\rho_i$ between the error terms $v_{qi}^*$ and $\omega_q$ is allowed in order to accommodate common unobserved factors influencing the discrete choice and the continuous variable. Since $a_q$ is partially determined by $\omega_q$, and $v_{qi}^*$ is correlated with $\omega_q$ if $\rho_i$ is not equal to zero, $a_q$ is correlated with the random error term $v_{qi}^*$ in the first equation. Similarly, $D_q$ is correlated with the random error term $\omega_q$ in the second equation. The endogenous nature of the dependent variables $D_q$ and $a_q$ requires the full-information maximum likelihood method to jointly estimate their corresponding parameters; limited-information maximum likelihood estimation (sequential estimation) does not provide consistent estimators for the coefficients of endogenous variables.

In Equation (2.3.4), replacing $a_q$ with the second equation of (2.3.1), one obtains:

$$\Pr(D_{qi}=1) = F_i(\beta_i'z_{qi} + \gamma_i\theta'x_q + \gamma_i\delta_i + \gamma_i\omega_q) \tag{2.3.12}$$

Similarly, it can be shown that

$$\Pr(D_{qi}=0) = 1 - F_i(\beta_i'z_{qi} + \gamma_i\theta'x_q + \gamma_i\delta_j + \gamma_i\omega_q) \quad \text{if } D_{qj} = 1 \tag{2.3.13}$$

Since $\Pr(D_{qi}=1) + \Pr(D_{qi}=0) = 1$, then

$$\gamma_i\delta_i = \gamma_i\delta_j \tag{2.3.14}$$

Three possible restrictions may be imposed on the model coefficients to satisfy Equation (2.3.14), known as the logical consistency condition:

1. $\gamma_i \neq 0$ and $\delta_i = \delta_j \neq 0$, which implies that the continuous variable appears on the right-hand side of the equation for the discrete choice, and a vector of dummy variables corresponding to the discrete choice also appears in the model for the continuous variable.


However, the coefficients on the dummy variables must be mutually identical. The model specification constrained by this condition is practically meaningless, since the discrete alternatives ought to have varied impacts on the continuous variable and thus unequal coefficients.

2. $\gamma_i \neq 0$ and $\delta_i = \delta_j = 0$, which implies that the continuous variable appears in the utility function of the discrete choice variable but the discrete choice variable does not appear in the model for the continuous variable. This restriction leads to a recursive structure for the endogenous variables, where the continuous variable is predetermined and then influences the discrete variable.

3. $\gamma_i = 0$, in which case Equation (2.3.14) is always satisfied; $\delta_i$ and $\delta_j$ can then take any unequal values. This restriction leads to the other recursive structure, where the discrete variable is predetermined and then influences the continuous variable.

Accordingly, the condition of logical consistency allows only two alternative recursive structures. The first is the case where $\gamma \neq 0$ and $\delta = 0$ (continuous variable → discrete variable), in which the continuous endogenous variable $a_q$ is predetermined by the linear model and appears in the utility functions $u_{qi}^*$ as an explanatory variable for the discrete variable. The full-information likelihood function for estimating the parameters in this case is:

$$L = \prod_{q=1}^{Q}\left\{\frac{1}{\sigma}\,\phi(l_q)\prod_{i=1}^{I}\left[\Phi(b_{qi})\right]^{D_{qi}}\right\} \tag{2.3.15}$$

where $\phi(\cdot)$ is the standard normal density function, and $l_q$ and $b_{qi}$ are defined as follows:

$$l_q = \frac{a_q - \theta'x_q}{\sigma}, \qquad b_{qi} = \frac{\Phi^{-1}[F_i(\beta_i'z_{qi} + \gamma_i a_q)] - \rho_i l_q}{\sqrt{1-\rho_i^2}} \tag{2.3.16}$$


The second case is when $\gamma = 0$ and $\delta \neq 0$ (discrete variable → continuous variable), in which the vector of discrete variables $D_q$ is predetermined by the utility functions $u_{qi}^*$ and then serves as explanatory variables in the linear model for the continuous variable $a_q$. The full-information likelihood function is the same as Equation (2.3.15), but here

$$l_q = \frac{a_q - \theta'x_q - \delta'D_q}{\sigma}, \qquad b_{qi} = \frac{\Phi^{-1}[F_i(\beta_i'z_{qi})] - \rho_i l_q}{\sqrt{1-\rho_i^2}} \tag{2.3.17}$$

A statistical test is required to identify the dominant causal relationship between the discrete variable and the continuous variable. Such a test is proposed in Chapter 3 to select the causal model indicating the dominant causal relationship in the population.

It is necessary to further discuss the underlying problem of the discrete-continuous model system based on Lee's transformation. The modeling system is derived as:

$$D_{qi}^* = \Phi^{-1}[F_i(\beta_i'z_{qi} + \gamma_i a_q)] - v_{qi}^*, \quad D_{qi} = 1 \text{ if } D_{qi}^* > 0, \quad D_{qi} = 0 \text{ if } D_{qi}^* < 0; \qquad a_q = \theta'x_q + \delta'D_q + \omega_q \tag{2.3.18}$$

$$v_{qi}^* = \Phi^{-1}[F_i(v_{qi})] \tag{2.3.19}$$

and

$$v_{qi} = \Big\{\max_{j=1,2,\ldots,I,\; j\neq i} u_{qj}^*\Big\} - \varepsilon_{qi} \tag{2.3.20}$$

The correlation $\rho_i$ between $v_{qi}^*$ and $\omega_q$ is caused by common unobservable variables in the random error terms $\varepsilon_{qi}$ and $\omega_q$, but $\rho_i$ is not equal to the correlation between the random error term $\varepsilon_i$ in utility function $i$ and the random error term $\omega$ in the linear regression model. $\rho_i$ is a non-linear function of not only $\text{corr}(\varepsilon_i, \omega)$ but also $\text{corr}(\varepsilon_j, \omega)$, as can be seen by plugging Equation (2.3.20) into Equation (2.3.19):

$$v_{qi}^* = \Phi^{-1}\Big\{F_i\Big[\Big(\max_{j=1,2,\ldots,I,\; j\neq i} u_{qj}^*\Big) - \varepsilon_{qi}\Big]\Big\}$$

$\rho_i$ does not represent the correlation between $\varepsilon_i$ and $\omega$, and therefore $\rho_i$ does not have a straightforward behavioral interpretation. Indeed, Schmertmann (1994) shows


that the Lee model places substantial restrictions on the covariance between the continuous variable and the discrete choice model. Using a Monte Carlo study, he further found that the Lee model is significantly biased when this assumption is violated. In the following section (Section 2.2.3.1), we propose an alternative modeling methodology, called the mixed discrete-continuous model, which directly accommodates the correlation between the random error term in each utility function and the random error term in the continuous model without a nonlinear transformation.

2.2.3 Mixed Simultaneous Equations Model Using Flexible Error Structure

2.2.3.1 Mixed Discrete-continuous Model (Causal Relationship Between One Continuous Variable and One Multinomial Choice Variable)

The Gumbel random error term adopted in the utility function of a discrete choice model does not allow correlation with the random error term in the continuous model or in the other utility functions. One alternative for accommodating such correlations between discrete choices and a continuous variable is to employ a multinomial probit model for the discrete choice, where the error terms are multivariate normally distributed instead of Gumbel distributed. However, logit-based discrete choice models are applied much more widely than the multinomial probit model because they are easier to apply, so the logit model is retained for modeling discrete choice throughout this dissertation.

Similar to the nested logit model (see Section 2.1.4), one may assume that the random error term in the utility functions consists of two independent random components: one represents heterogeneity and is normally distributed, and the other is still standard


Gumbel distributed as usual. Such a modeling methodology for discrete choice is called the mixed logit model (see Section 2.1.4). If the variance of the heterogeneity is unequal across the utility functions, one obtains a heteroskedastic logit model, which avoids the pitfall of IIA (Bhat, 1995). However, Bhat (1995) uses Gumbel-distributed random error terms with unequal variances rather than the mixed normal and Gumbel error terms of a mixed heteroskedastic logit model. Meanwhile, one may assume that the random error term of the continuous model consists of $I$ random components, all of which are normally distributed. The modeling system can be formulated as:

$$u_{qi}^* = \beta_i'z_{qi} + \gamma_i a_q + f_i n_{qi} + \varepsilon_{qi}, \qquad a_q = \theta'x_q + \delta'D_q + k\,m_q \tag{2.4.1}$$

where $\varepsilon_{qi} \sim$ i.i.d. Gumbel(0,1), and $m_q$ and $n_{qi}$ are multivariate normally distributed with zero expectations and unit variances. The correlations among the $n_{qi}$ are zero and the correlation between $n_{qi}$ and $m_q$ is $\rho_i$. $f_i$ and $k$ represent the standard deviations of the normal random components in the utility functions and the linear regression model, respectively. In this study, we emphasize the correlation between the discrete choices and the continuous variable but ignore the correlation among discrete choices. Under the multivariate normality assumption, one may rewrite

$$m_q = \sum_{j=1}^{I}(\rho_j n_{qj}) + \sqrt{1 - \sum_{j=1}^{I}\rho_j^2}\;\omega_q \tag{2.4.2}$$

where $\omega_q$ is a new random variable which is standard normally distributed and independent of $n_{qi}$ and $\varepsilon_{qi}$. Then the model system can be rewritten as:

$$u_{qi}^* = \beta_i'z_{qi} + \gamma_i a_q + f_i n_{qi} + \varepsilon_{qi}, \qquad a_q = \theta'x_q + \delta'D_q + k\sum_{j=1}^{I}\rho_j n_{qj} + k\sqrt{1 - \sum_{j=1}^{I}\rho_j^2}\;\omega_q \tag{2.4.3}$$
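The decomposition in equation (2.4.2) can be verified by simulation. The following is a minimal sketch under assumed correlation values; `simulate_m` and `corr` are hypothetical helper names, and the values in `rho` are chosen only so that their squares sum to less than one.

```python
import math
import random
from statistics import fmean, pvariance


def simulate_m(rho, n_draws, seed=0):
    """Draw independent standard normal n_j and omega, then build
    m = sum_j rho_j * n_j + sqrt(1 - sum_j rho_j^2) * omega.

    Returns one list of draws per n_j, plus the list of m values."""
    rng = random.Random(seed)
    resid = math.sqrt(1.0 - sum(r * r for r in rho))
    n_cols = [[] for _ in rho]
    m = []
    for _ in range(n_draws):
        n = [rng.gauss(0.0, 1.0) for _ in rho]
        omega = rng.gauss(0.0, 1.0)
        m.append(sum(r * x for r, x in zip(rho, n)) + resid * omega)
        for col, x in zip(n_cols, n):
            col.append(x)
    return n_cols, m


def corr(x, y):
    """Sample Pearson correlation coefficient."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rho = [0.3, -0.5, 0.2]  # assumed illustrative correlations
n_cols, m = simulate_m(rho, n_draws=50000, seed=42)
print(round(pvariance(m), 2), [round(corr(col, m), 2) for col in n_cols])
```

The simulated $m_q$ should have a variance close to one and a correlation with each $n_{qj}$ close to the assumed $\rho_j$, which is what the rewriting relies on.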


By replacing $k\rho_j$ with $g_j$ and $k\sqrt{1 - \sum_{j=1}^{I}\rho_j^2}$ with $\sigma$, the mixed joint modeling system can be reduced to

$$u_{qi}^* = \beta_i'z_{qi} + \gamma_i a_q + f_i n_{qi} + \varepsilon_{qi}, \qquad a_q = \theta'x_q + \delta'D_q + \sum_{j=1}^{I} g_j n_{qj} + \sigma\omega_q \tag{2.4.4}$$

where $n_{qi} \sim$ Normal(0,1), $\omega_q \sim$ Normal(0,1) and $\varepsilon_{qi} \sim$ i.i.d. Gumbel(0,1). This implies that a univariate normal heterogeneity term appearing simultaneously in both the latent utility function and the continuous model, with unequal standard deviations, performs as well as multivariate normal heterogeneities for consistently estimating the coefficient $\gamma$ or $\delta$ of the endogenous variables. Similar to Section 2.2.2, either $\gamma$ or $\delta$ needs to be zero, which leads to two alternative causal structures: 1) $\gamma = 0$ and $\delta \neq 0$, discrete choice → continuous variable; and 2) $\gamma \neq 0$ and $\delta = 0$, continuous variable → discrete choice.

In this joint model system, the correlation between the random components of the latent utility function and the continuous model can be calculated as

$$\text{Corr}(u_{qi}^*, a_q) = \frac{f_i g_i}{\sqrt{\left(f_i^2 + \frac{\pi^2}{6}\right)\left(\sum_{j=1}^{I} g_j^2 + \sigma^2\right)}} \tag{2.4.5}$$

As $f_i$ and $g_i$ both approach positive infinity or both approach negative infinity, $\lim[\text{Corr}(u_{qi}^*, a_q)]$ is equal to 1; as $f_i$ approaches positive (or negative) infinity and $g_i$ approaches negative (or positive) infinity, $\lim[\text{Corr}(u_{qi}^*, a_q)]$ is equal to -1. Thus, theoretically speaking, this specification of heterogeneity can accommodate any degree of correlation between the latent utility function and the continuous model. The correlation also has a reasonable behavioral interpretation: a positive or negative correlation explicitly indicates that unobserved or unspecified common variables have the same or the opposite impact on the latent utility


function and the continuous dependent variable. In this respect the mixed discrete-continuous model is better than the Lee model. In addition, the mixed discrete-continuous model specifies a heteroskedastic logit model for the discrete choice, which avoids the IIA problem of the multinomial logit model. On the other hand, it might be conjectured that the coefficient estimates for endogenous variables will be very sensitive to the covariance structure of the random error terms. An appropriate specification of the random error terms is critical to accurately estimating the impact of an endogenous variable, which helps us better understand the underlying causal relationships in people's activity and travel behavior.

Based on the derived joint model system, we need to derive the probability function for each observation and use maximum likelihood estimation to estimate the parameters. Conditional on $n_{qi}$, the probability of each observation is equal to the product of the probability of the discrete choice observation and the probability density of the continuous observation:

$$\text{Prob}(D_q, a_q \mid n_{qi}) = \prod_{i=1}^{I}\left[L_{qi}(n_{q1}, n_{q2}, \ldots, n_{qI})\right]^{D_{qi}} F(n_{q1}, n_{q2}, \ldots, n_{qI}) \tag{2.4.6}$$

where

$$L_{qi}(n_{q1}, n_{q2}, \ldots, n_{qI}) = \frac{\exp(\beta_i'z_{qi} + \gamma_i a_q + f_i n_{qi})}{\sum_{j=1}^{I}\exp(\beta_j'z_{qj} + \gamma_j a_q + f_j n_{qj})} \tag{2.4.7}$$

and

$$F(n_{q1}, n_{q2}, \ldots, n_{qI}) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{1}{2}\left[\frac{a_q - \theta'x_q - \delta'D_q - \sum_{i=1}^{I} g_i n_{qi}}{\sigma}\right]^2\right\} \tag{2.4.8}$$

To obtain the unconditional probability, one needs to integrate over the distributional domain of $n_{qi}$:

$$\text{Prob}(D_q, a_q) = \int_{n_{q1}}\int_{n_{q2}}\cdots\int_{n_{qI}} \text{Prob}(D_q, a_q \mid n_{qi})\, d\Phi(n_{q1})\, d\Phi(n_{q2}) \cdots d\Phi(n_{qI}) \tag{2.4.9}$$


Here, $\Phi(\cdot)$ represents the cumulative distribution function of the standard normal distribution. The likelihood function can be formed as

$$L = \prod_{q=1}^{Q}\text{Prob}(D_q, a_q) \tag{2.4.10}$$

Because the likelihood function does not have a closed form, we need to apply the Maximum Simulated Likelihood Estimation (MSLE) method to estimate the model parameters. The idea is to draw a set of random seeds from a known distribution and input these random values into the probability function to approximate the value of the integral. Bhat (2000) found that a quasi-random sequence, the Halton sequence, covers the distributional domain better than a conventional random sequence (called a pseudo-random sequence). It was found that, for a one-dimensional integral, the error measures from as few as 50 Halton draws are smaller than those from 1000 pseudo-random draws, and those from 75 Halton draws are much smaller than those from 2000 pseudo-random draws. To save computational time, we employ the Halton sequence to generate random seeds that are uniformly distributed from 0 to 1 and use $\Phi^{-1}(\cdot)$ to convert these seeds to standard normally distributed values. The generation of Halton draws is explicitly presented in Train (1999), so the procedure is not repeated in this dissertation. Appendix A contains the code, written in the Gauss programming language (Aptech, 2005), for generating the Halton sequence; it is the same as the standard code for the mixed logit model by Train (1996).

The standard normal seeds in the $r$th iteration, denoted $n^r$, are input into $\text{Prob}(D_q, a_q \mid n_{qi})$ to calculate $P^r$. After repeating this procedure $R$ times and accumulating the $P^r$ values, one can approximate the value of $\text{Prob}(D_q, a_q)$ using $\left(\sum_{r=1}^{R} P^r\right)/R$. The routine maximum likelihood procedure can then be followed to consistently estimate the parameters, including $\beta_i$, $f_i$, $g_i$, and $\gamma$ or $\delta$. In this dissertation, $R$ is set to 100. The Gauss programming language


(Aptech, 2005) is used to code the likelihood function and its first-order derivative for the likelihood maximization procedure (see Appendix B for details).

As stated by Walker (2002), a small number of quasi-random draws can mask an under-identification issue and yield erroneous estimators. Therefore, we have to carefully specify the heterogeneity in the following mixed joint model system:

$$u_{qi}^* = \beta_i'z_{qi} + \gamma_i a_q + f_i n_{qi} + \varepsilon_{qi}, \qquad a_q = \theta'x_q + \delta'D_q + \sum_{i=1}^{I} g_i n_{qi} + u_q \tag{2.4.11}$$

Due to the slight difference between the normal and Gumbel distributions, the standard deviation $f_i$ of the normal heterogeneity can be identified from the differences between each pair of latent utility functions. However, in the continuous model, the random error term $u_q$ and the heterogeneity $n_{qi}$ are both normally distributed, with no such difference. A linear combination of normal random variables is still normal, with expectation equal to the sum of the expectations and variance equal to the sum of the squared standard deviations of these normal random variables. Thus, estimation of $g_i$ depends on the identification of $f_i$ in the latent utility function. Without identification of $f_i$, $g_i$ will be absorbed into $u_q$ and become unidentifiable. The reason why $g_i$ is identifiable is straightforward. The procedure by which the $f_i$ are identified through the latent utility functions does not depend on information from the continuous model, and the continuous model itself can yield an estimator for the standard deviation of its random error term, since the dependent variable in the continuous model is directly observable. Finally, in the joint model system, the covariance between each latent utility function and the continuous model provides additional information for estimating $g_i$.
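The simulation step of MSLE described earlier in this section can be sketched in a few lines. The code below is an illustration only and is not the Gauss code of Appendix A or B: it generates one-dimensional Halton draws, converts them to standard normal values with the inverse normal CDF, and averages a conditional probability over $R$ draws. The conditional probability used here is a simple binary logit with one normal heterogeneity term, an assumption made to keep the example one-dimensional; `halton`, `simulated_prob` and `quadrature_prob` are hypothetical names, and the values of `v` and `f` in the usage note are arbitrary.

```python
import math
from statistics import NormalDist


def halton(index, base):
    """Radical-inverse (Halton) point in (0, 1) for a 1-based index."""
    value, frac = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * frac
        frac /= base
    return value


def simulated_prob(v, f, n_draws=100, base=2):
    """Approximate E_n[1 / (1 + exp(-(v + f * n)))], n ~ N(0, 1),
    by the average (1/R) * sum_r P_r over R Halton-based normal draws."""
    inv_cdf = NormalDist().inv_cdf
    total = 0.0
    for r in range(1, n_draws + 1):
        n_r = inv_cdf(halton(r, base))  # uniform draw -> standard normal draw
        total += 1.0 / (1.0 + math.exp(-(v + f * n_r)))
    return total / n_draws


def quadrature_prob(v, f, lo=-8.0, hi=8.0, steps=4000):
    """Reference value of the same integral by Simpson's rule."""
    pdf = NormalDist().pdf
    h = (hi - lo) / steps

    def g(x):
        return pdf(x) / (1.0 + math.exp(-(v + f * x)))

    total = g(lo) + g(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0
```

Calling `simulated_prob(0.5, 1.0, n_draws=100)` and `quadrature_prob(0.5, 1.0)` gives two approximations of the same integral that can be compared directly. For the $I$-dimensional integral in equation (2.4.9), each dimension would use a Halton sequence with a different prime base, and the average would be taken over the product of the conditional choice probability and the conditional density.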


By examining the variance-covariance of utility differences, Walker (2002) established criteria for specifying a flexible error structure in the mixed logit model with respect to identification and normalization. She found that a mixed heteroskedastic logit model with M (M > 2) alternatives allows at most (M-1) heterogeneities to be identified, and that the valid normalization is to impose zero on the smallest variance of heterogeneity. In practice, one may use a small number of quasi-random draws to estimate an unidentified mixed logit model and obtain preliminary estimation results. A zero restriction then needs to be imposed on the smallest of the estimated variances of heterogeneity. In our mixed discrete-continuous model system, once one of the $f_i$ is fixed at zero for identification and normalization, the corresponding $g_i$ becomes unidentifiable. Thus, the corresponding $g_i$ needs to be fixed at zero as well.

In Maximum Simulated Likelihood Estimation (MSLE), a t-test can be obtained for the estimator of each single parameter based on the estimator itself and its standard deviation, taken from the diagonal elements of the estimated variance-covariance matrix. However, in this study, modelers are more concerned with the significance of the product of $f_i$ and $g_i$ than with the single parameters $f_i$ and $g_i$, since $f_i g_i$ represents the covariance between the two random components, which indicates the sign (+ or -) and magnitude of the correlation. One may need to test the following null hypothesis ($H_0$) and alternative hypothesis ($H_1$):

$H_0$: $\text{Cov}(u_i^*, a) = f_i g_i > 0$ (positive covariance); $H_1$: $\text{Cov}(u_i^*, a) = f_i g_i \le 0$ (negative covariance).

As the MSLE estimators $\hat{f}_i$ and $\hat{g}_i$ are essentially maximum likelihood estimators (MLE), they should be asymptotically normally distributed. The correlation between $\hat{f}_i$


and $\hat{g}_i$ can be calculated from the corresponding off-diagonal element of the estimated variance-covariance matrix. Thus, $\hat{f}_i$ and $\hat{g}_i$ should be bivariate normally distributed. One approach to calculating the probability of making a type I error, i.e. rejecting the null hypothesis when it is correct, is to first derive the cumulative distribution function of $\hat{f}_i\hat{g}_i$ and then calculate the probability directly. However, it is rather challenging to derive a tractable cumulative distribution function for calculating this probability. In this dissertation, a simulation approach, called the simulation-based hypothesis test, is adopted to approximate the probability and to determine the significance level of the estimated error covariance.

Since the expectations and variances of the estimators of $f_i$ and $g_i$ and the correlation between them have been estimated in the MSLE procedure, the Monte Carlo method can be applied to generate a large number of pairs of random seeds that are bivariate normally distributed with the estimated expectations, variances and correlation. $U_1$ and $U_2$ are two sets of pseudo-random seeds which are independently and uniformly distributed between 0 and 1. Let $x = \Phi^{-1}(U_1)$ and $y = \rho x + \sqrt{1-\rho^2}\,\Phi^{-1}(U_2)$; then $(x, y) \sim \phi_2(0, 0, 1, 1, \rho)$. Let $f = \text{std}(\hat{f}_i)\,x + E(\hat{f}_i)$ and $g = \text{std}(\hat{g}_i)\,y + E(\hat{g}_i)$; then $(f, g) \sim \phi_2[E(\hat{f}_i), E(\hat{g}_i), \text{std}^2(\hat{f}_i), \text{std}^2(\hat{g}_i), \rho]$, where $\Phi^{-1}(\cdot)$ is the inverse of the cumulative distribution function of the standard normal distribution; $\rho$ is the estimated correlation between $\hat{f}_i$ and $\hat{g}_i$, calculated from the corresponding off-diagonal element of the estimated variance-covariance matrix; $\phi_2$ is


the probability density function of the bivariate normal distribution; $E(\cdot)$ is the expectation of a random variable; and $\text{std}(\cdot)$ is the standard deviation of a random variable.

One may calculate the product of each pair of $f$ and $g$ and then count the frequency of positive products, denoted $N_+$. The probability of making a type I error, i.e. the significance level, can be approximated by $(1 - N_+/N)$, where $N$ is the total number of random seeds. Similarly, if $E(\hat{f}_i)E(\hat{g}_i)$ is initially negative, the null hypothesis that $\text{Cov}(u_i^*, a)$ is negative needs to be tested. One may approximate the significance level by $(1 - N_-/N)$, where $N_-$ represents the count of negative products from each pair of $f$ and $g$. In this dissertation study, we use 5,000,000 pseudo-random seeds (i.e. $N$ = 5,000,000) to accurately estimate the significance level of the error covariance estimator represented by $\hat{f}_i\hat{g}_i$.

2.2.3.2 Mixed Binary-multinomial Choice Model (Causal Relationship Between One Binary Choice Variable and One Multinomial Choice Variable)

We have presented the modeling methodology for the causal relationship between two binary choices in Section 2.2.1. Binary choices can be modeled by a binary probit model, in which the random error term of the latent utility function is normally distributed. The correlation between the random error terms in two latent utility functions can be easily accommodated under the assumption that the two random error terms are bivariate normally distributed. However, in travel behavior analysis, choices are usually multinomial in nature. Travel mode choice in an urban area is a typical example, in which people need to choose the most appropriate travel mode from origin to destination among all the available alternatives, possibly including auto, transit, bicycle or walk. In this case,


the recursive bivariate probit model cannot be used to model the causal relationship among discrete variables indicating multinomial unordered choices. In this section, a mixed binary-multinomial choice model is proposed to allow causal analysis among multinomial unordered choices. Similar to Equation (2.4.2), one may have the following model system for causal modeling analysis between two discrete choices:

$$u_{qi}^* = \beta_i'z_{qi} + \gamma_i A_q + f_i n_{qi} + \varepsilon_{qi}, \qquad v_q^* = \theta'x_q + \delta'D_q + \sum_{i=1}^{I} g_i n_{qi} + u_q \quad (g_i \text{ needs to be fixed at 1}) \tag{2.4.12}$$

Here $q$ is the index of observations and $i$ is the index of alternatives in the choice set $C_I$ consisting of $I$ alternatives. $u_{qi}^*$ is the latent utility associated with the $i$th choice in choice set $C_I$; $v_q^*$ is the latent utility associated with a binary choice in the other choice set $C_K$, which consists of two alternatives; $u_q$ is an idiosyncratic random error that is i.i.d. standard normally distributed. $n_{qj}$ represents the heterogeneity in each utility function $u_{qi}^*$ and in $v_q^*$. $D_q$ is a vector of dummy variables indicating the multinomial choice and $A_q$ is a dummy variable indicating the binary choice.

As in the mixed discrete-continuous model, it is unnecessary, and not identifiable, to specify bivariate normally distributed heterogeneity in the model system. Instead, common univariate normally distributed heterogeneities are sufficient to accommodate the correlation between each latent utility function for the multinomial choice and the latent utility function for the binary choice.

In the mixed discrete-continuous model, $g_i$ is specified to allow unequal standard deviations of heterogeneity. However, in the current binary-multinomial choice model, the normal heterogeneity is assumed to have an identical standard deviation, which needs to be


fixed at one. In the preliminary study, $g_i$ was specified in the joint model system. Unfortunately, the maximum likelihood estimation procedure never reached convergence on a real dataset when $g_i$ was involved: the $g_i$ values grew implausibly large and the procedure did not converge even after 1000 iterations. A plausible explanation is that the second model in the joint model system is basically a binary probit model, in which the dependent variable is an unobservable latent variable rather than the observable continuous dependent variable of a linear regression model. In a binary probit model, the standard deviation of the random error term is not identifiable. Walker (2002) found that the standard deviation of heterogeneity in a binary mixed heteroskedastic logit model is unidentifiable. Analogously, the standard deviation of heterogeneity in a binary mixed probit model is not identifiable either.

One alternative for dealing with this problem is adopted in Eluru and Bhat (2005), where seat belt usage and accident severity are modeled in a joint model consisting of a binary logit model for seat belt usage and an ordered logit model for accident severity. In that work, the common heterogeneity in both latent utility functions is assumed to be normally distributed with an identical standard deviation ($\sigma$) but with an alternative sign (+/-) in front of $\sigma$ in the random component of the model. In a joint modeling system consisting of a binary logit model and an ordered logit model, $\sigma$ can be identified in both utility functions. However, a positive or negative correlation between the random error terms cannot be naturally accommodated in this specification. The investigators need to empirically test the models with both + and - signs to judge whether a positive or negative correlation is more appropriate according to goodness-of-fit measures.
Due to the symmetric property of the normal distribution, one only needs to try


twice to obtain an appropriate estimator for $\sigma$ in the mixed binary-ordered model by comparing the fit of the two models. (The combinations $+\sigma/-\sigma$ and $+\sigma/+\sigma$ are the same as the combinations $-\sigma/+\sigma$ and $-\sigma/-\sigma$ because the normal distribution is symmetric.) However, it would be very cumbersome to explore the possible sign combinations if normal heterogeneities with equal standard deviations were specified in the mixed binary-multinomial choice model to accommodate the correlation between each utility function for the multinomial choice and that for the binary choice. Suppose there are 4 alternatives for the multinomial choice; one then has to try $2^4 = 16$ times to cover all the possible combinations and select the best fit among these specifications. This approach is inconvenient in practice, so the specification in Equation (2.4.12) is adopted first in this study. If all the $g_i$ are fixed at 1, the sign of the correlation is attributable to the sign of $f_i$ in the latent utility function for the multinomial choice. The correlation can be expressed as

$$\text{Corr}(u_{qi}^*, v_q^*) = \frac{f_i}{\sqrt{\left(f_i^2 + \frac{\pi^2}{6}\right)(I + 1)}} \tag{2.4.13}$$

As $f_i$ approaches positive or negative infinity, $\lim[\text{Corr}(u_{qi}^*, v_q^*)]$ is equal to $\frac{1}{\sqrt{I+1}}$ or $-\frac{1}{\sqrt{I+1}}$. If $I = 4$, then $-0.447 < \text{Corr}(u_{qi}^*, v_q^*) < 0.447$. In other words, the error structure in specification (2.4.12) cannot accommodate a correlation greater than 0.447 or less than -0.447. This is a disadvantage of specification (2.4.12), but it aids in identifying the sign of the correlation between each utility function for the multinomial choice and that for the binary choice.

According to the sign of the correlation estimated from specification (2.4.12), we specify (2.4.14), in which the standard deviations of the common normal heterogeneity are identical.


\[ u_{qi}^* = \beta_i' z_{qi} + \gamma_i A_q \pm f_i n_{qi} + \varepsilon_{qi}, \qquad v_q^* = \theta' x_q + \delta' D_q + \sum_{j=1}^{I} f_j n_{qj} + \omega_q \tag{2.4.14} \]

where the + or - sign in front of $f_i$ is imposed if $\mathrm{Corr}(u_{qi}^*, v_q^*)$ is estimated to be positive or negative, respectively, in the first step. In the current specification,

\[ \mathrm{Corr}(u_{qi}^*, v_q^*) = \frac{\pm f_i^2}{\sqrt{(f_i^2 + \pi^2/6)\left(\sum_{j=1}^{I} f_j^2 + 1\right)}} \tag{2.4.15} \]

As $f_i$ approaches positive or negative infinity, the limit of $|\mathrm{Corr}(u_{qi}^*, v_q^*)|$ is equal to 1. Together with the imposed sign for $f_i$, specification (2.4.14) can theoretically accommodate any degree of correlation from -1 to 1. It is believed that this specification yields more accurate estimates of the impact of endogenous variables. Similarly, either $\gamma_i$ or $\delta$ needs to be zero, which leads to two alternative causal structures: 1) $\gamma_i = 0$ and $\delta \neq 0$, multinomial choice → binary choice, and 2) $\gamma_i \neq 0$ and $\delta = 0$, binary choice → multinomial choice.

The probability function for each observation and the likelihood function can be formulated following the procedure in Section 2.2.3. Conditional on the $n_{qi}$, the probability of each observation is equal to the product of the probability of the observed multinomial choice and the probability of the observed binary choice:

\[ \mathrm{Prob}(D_q, A_q \mid n_{q1}, n_{q2}, \ldots, n_{qI}) = \left[\prod_{i=1}^{I} L_{qi}(n_{q1}, \ldots, n_{qI})^{D_{qi}}\right] F_q(n_{q1}, \ldots, n_{qI}) \tag{2.4.16} \]

where

\[ L_{qi}(n_{q1}, \ldots, n_{qI}) = \frac{\exp(\beta_i' z_{qi} + \gamma_i A_q \pm f_i n_{qi})}{\sum_{j=1}^{I} \exp(\beta_j' z_{qj} + \gamma_j A_q \pm f_j n_{qj})} \tag{2.4.17} \]

and

\[ F_q(n_{q1}, \ldots, n_{qI}) = \left\{\Phi\left[\theta' x_q + \delta' D_q + \sum_{i=1}^{I} f_i n_{qi}\right]\right\}^{A_q} \left\{\Phi\left[-\left(\theta' x_q + \delta' D_q + \sum_{i=1}^{I} f_i n_{qi}\right)\right]\right\}^{1 - A_q} \tag{2.4.18} \]

To obtain the unconditional probability, one integrates the $n_{qi}$ over their distributional domains:

\[ \mathrm{Prob}(D_q, A_q) = \int_{n_{q1}} \int_{n_{q2}} \cdots \int_{n_{qI}} \mathrm{Prob}(D_q, A_q \mid n_{q1}, \ldots, n_{qI}) \, d\Phi(n_{q1}) \, d\Phi(n_{q2}) \cdots d\Phi(n_{qI}) \tag{2.4.19} \]

$\Phi(\cdot)$ represents the cumulative distribution function of the standard normal distribution. The likelihood function can be formulated as

\[ L = \prod_{q=1}^{Q} \mathrm{Prob}(D_q, A_q) \tag{2.4.20} \]

Because the likelihood function does not have a closed form, we again apply the Maximum Simulated Likelihood Estimation (MSLE) method to estimate the model parameters. The Halton sequence is again adopted to generate quasi-random sequences that are uniformly distributed from 0 to 1. These random seeds are then converted to standard normal draws using the function $\Phi^{-1}(\cdot)$ (the inverse of the standard normal CDF). The standard normal seeds in the $r$th iteration, denoted $n^r$, are input into $\mathrm{Prob}(D_q, A_q \mid n_{qi})$ to calculate $P_r$. After repeating this procedure $R$ times and accumulating the $P_r$ values, $\mathrm{Prob}(D_q, A_q)$ can be approximated as $\sum_{r=1}^{R} P_r / R$. As mentioned before, $R$ is selected as 100. The usual maximum likelihood routine can then be followed to estimate the parameters, including $\beta_i$, $f_i$, $\theta$, and $\gamma_i$ or $\delta$. The Gauss programming language (Aptech, 2005) is used to code the likelihood function and its first-order derivatives for maximization (see Appendix D for details).
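The simulation step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Gauss code of Appendix D. The function names, the argument `util_fixed` (standing for the deterministic part of each multinomial utility), `v_fixed` (the deterministic part of the binary latent function) and `signs` (the imposed +1/-1 sign on each $f_i$) are hypothetical, and the plain unscrambled Halton sequence without burn-in is an assumption.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()
_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19)  # one Halton base per heterogeneity term


def halton(index: int, base: int) -> float:
    """Radical-inverse (Halton) value in (0, 1) for a 1-based index."""
    value, weight = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * weight
        weight /= base
    return value


def conditional_prob(n, chosen, a_q, util_fixed, f, signs, v_fixed):
    """Prob(D_q, A_q | n): logit probability of the chosen alternative
    times the probit probability of the observed binary outcome."""
    expo = [u + s * fi * ni for u, s, fi, ni in zip(util_fixed, signs, f, n)]
    top = max(expo)  # subtract the maximum for numerical stability
    logit = math.exp(expo[chosen] - top) / sum(math.exp(e - top) for e in expo)
    index = v_fixed + sum(fi * ni for fi, ni in zip(f, n))
    probit = _STD_NORMAL.cdf(index if a_q == 1 else -index)
    return logit * probit


def simulated_prob(n_draws, chosen, a_q, util_fixed, f, signs, v_fixed):
    """Average the conditional probability over Halton-based normal draws."""
    dim = len(f)
    total = 0.0
    for r in range(1, n_draws + 1):
        # Uniform Halton points are mapped to standard normal draws.
        n = [_STD_NORMAL.inv_cdf(halton(r, _PRIMES[d])) for d in range(dim)]
        total += conditional_prob(n, chosen, a_q, util_fixed, f, signs, v_fixed)
    return total / n_draws
```

A call such as `simulated_prob(100, chosen=0, a_q=1, util_fixed=[0.2, -0.1, 0.0], f=[0.5, 0.3, 0.4], signs=[1, -1, 1], v_fixed=0.3)` returns the simulated probability for one observation with R = 100. The log of this quantity, summed over observations, would be the simulated log-likelihood passed to an optimizer.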


2.3 Non-nested Test for Choosing Alternative Causal Structure

2.3.1 Cox Test for Separate Families of Hypothesis

A rigorous statistical test is required for comparing and selecting among models under alternative causal structures, so that the dominant causal relationship in the travel behavior of the population can be identified. The causal models under alternative causal structures are non-nested; therefore classical statistical tests for nested structures, such as the likelihood ratio test, cannot be applied for this purpose. Two models are nested if and only if one model can be reduced to the other by imposing restrictions on the parameters. Cox (1961, 1962) initially proposed a statistical test for comparing models of separate families of hypotheses. Horowitz (1982) simplified this test in the context of discrete choice models by deriving a more compact and more applicable form for comparing non-nested discrete choice models. Ben-Akiva and Swait (1986) converted the Horowitz test into a form based on the Akaike Information Criterion (AIC), and the test is also presented in the book by Ben-Akiva and Lerman (1985). Pendyala and Bhat (2004) drew their conclusion on the basis of this non-nested test. However, after carefully reviewing the original paper (Horowitz, 1982), we consider it inappropriate to apply this test directly to non-nested discrete-continuous models. An appropriate test is required for a rigorous comparison between non-nested discrete-continuous models.

2.3.2 Non-nested Test in Discrete Choice Model

It is necessary to review the original paper by Horowitz (1983) that proposed the non-nested test for discrete choice models. In the original paper, the following goodness-of-fit measures are adopted instead of the standard adjusted $\bar\rho^2$:

\[ \rho_g^2 = 1 - \frac{L_g - K_g/2}{L^*} \tag{2.5.1} \]

\[ \rho_f^2 = 1 - \frac{L_f - K_f/2}{L^*} \tag{2.5.2} \]

$L_g$ and $L_f$ are the log-likelihood function values for model g and model f, respectively, which are non-nested; $K_g$ and $K_f$ are the numbers of estimated parameters in model g and model f, respectively; $L^*$ is the log-likelihood function value of the model without any explanatory variables or parameters ($L^*$ must be negative since a probability ranges from 0 to 1). Then

\[ \rho_g^2 - \rho_f^2 = -\frac{(L_g - L_f) - (K_g - K_f)/2}{L^*} \tag{2.5.3} \]

\[ (L_g - L_f) = -L^*(\rho_g^2 - \rho_f^2) + (K_g - K_f)/2 \tag{2.5.4} \]

According to the test for separate families of hypotheses (Cox, 1961),

\[ \frac{L_g - L_f}{\sqrt{N}} \sim \mathrm{Normal}\left(\frac{\sqrt{N}\,u}{2} + \frac{K_g - K_f}{2\sqrt{N}},\; u\right) \tag{2.5.5} \]

where $u$ is a variance, which is always positive. Thus,

\[ \left[\frac{L_g - L_f}{\sqrt{N}} - \frac{\sqrt{N}\,u}{2} - \frac{K_g - K_f}{2\sqrt{N}}\right] \Big/ \sqrt{u} \sim \mathrm{Normal}(0, 1) \tag{2.5.6} \]

Then, by plugging Equation (2.5.4) into Equation (2.5.6),

\[ \left[\frac{-L^*(\rho_g^2 - \rho_f^2) + (K_g - K_f)/2}{\sqrt{N}} - \frac{\sqrt{N}\,u}{2} - \frac{K_g - K_f}{2\sqrt{N}}\right] \Big/ \sqrt{u} \sim \mathrm{Normal}(0, 1) \tag{2.5.7} \]

\[ \frac{-L^*(\rho_g^2 - \rho_f^2)}{\sqrt{Nu}} - \frac{\sqrt{Nu}}{2} \sim \mathrm{Normal}(0, 1) \tag{2.5.8} \]


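Therefore, for $z > 0$,

\[ \Pr[\rho_f^2 - \rho_g^2 > z] = \Phi\left[\frac{zL^*}{\sqrt{Nu}} - \frac{\sqrt{Nu}}{2}\right] = \Phi\left[-\left(\frac{-zL^*}{\sqrt{Nu}} + \frac{\sqrt{Nu}}{2}\right)\right] \tag{2.5.9} \]

Since $-zL^* > 0$ and $Nu > 0$, the inequality of arithmetic and geometric means gives $\frac{-zL^*}{\sqrt{Nu}} + \frac{\sqrt{Nu}}{2} \geq \sqrt{-2zL^*}$, so that

\[ \Pr[\rho_f^2 - \rho_g^2 > z] \leq \Phi\left[-(-2zL^*)^{1/2}\right] \tag{2.5.10} \]

with equality when $\frac{-zL^*}{\sqrt{Nu}} = \frac{\sqrt{Nu}}{2}$, i.e. $u = -2zL^*/N$. In this procedure, the term $u$, which is intractable in empirical work, has been eliminated from the expression.

Without any explanatory variables or parameters in the discrete choice model, the log-likelihood function value is $L^* = N\ln(1/J)$, where $J$ represents the number of alternatives in the choice set. Plugging this into Equation (2.5.10), one obtains

\[ \Pr[\rho_f^2 - \rho_g^2 > z] \leq \Phi\left[-(2Nz\ln J)^{1/2}\right] \tag{2.5.11} \]

Since $\rho_f^2$ and $\rho_g^2$ are not standard output of statistical or econometric software, it is inconvenient to apply the Horowitz test directly. Ben-Akiva and Swait (1986) replaced $\rho_f^2$ and $\rho_g^2$ with the standard adjusted likelihood ratio indices by slightly adjusting the Horowitz test as follows:

\[ \Pr(\bar\rho_2^2 - \bar\rho_1^2 > z) \leq \Phi\left\{-\left[2Nz\ln(J) + (K_2 - K_1)\right]^{1/2}\right\}, \quad z > 0 \tag{2.5.12} \]

where

\[ \bar\rho_1^2 = 1 - \frac{L(\hat\beta_1) - K_1}{L(0)} \tag{2.5.13} \]

\[ \bar\rho_2^2 = 1 - \frac{L(\hat\beta_2) - K_2}{L(0)} \tag{2.5.14} \]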


$\bar\rho_1^2$: adjusted likelihood ratio index for model g;
$\bar\rho_2^2$: adjusted likelihood ratio index for model f;
$L(\hat\beta_1)$: log-likelihood value at convergence in model g;
$L(\hat\beta_2)$: log-likelihood value at convergence in model f;
$L(0)$: log-likelihood value at zero [= $N\ln(1/J)$];
$K_1$ and $K_2$: the numbers of parameters in model g and model f, respectively.

The probability that the adjusted likelihood ratio index of model f is greater by some $z > 0$ than that of model g, given that model g is the true model, is asymptotically bounded by the right-hand side of Equation (2.5.12). If the model with the greater $\bar\rho^2$ is selected, this bounds the probability of erroneously choosing the incorrect model over the true specification. With this test, joint discrete choice models under alternative causal structures can be compared against one another.

2.3.3 Extension to Discrete-continuous Model System

The mathematical derivation shows that the non-nested test originates from the Cox test for separate families of hypotheses without any additional assumptions. The Cox test can be applied not only to discrete choice models but also to any model estimated by the maximum likelihood method. The discrete-continuous model adopted in this dissertation is no exception. Suppose we have a linear regression model $y = \beta_0 + \beta x + \sigma u$. A basic model with the minimum number of parameters is required to provide the $L^*$ value in Equation (2.5.10). Unlike the discrete choice model, the linear regression model needs to contain at least two parameters: the constant $\beta_0$ and the standard deviation $\sigma$ of the normal error


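term. Then one may have $y_i = \beta_0 + \sigma n_i$, $n_i \sim \mathrm{Normal}(0, 1)$. For the linear regression model, it is easy to show that the MLE of $\beta_0$ equals the OLS estimator (the sample mean); the standard deviation is estimated here with the usual $N - 1$ divisor, which is asymptotically equivalent to the MLE. Therefore

\[ \hat\beta_0 = \frac{1}{N}\sum_{i=1}^{N} y_i = \bar{y} \tag{2.5.15} \]

\[ \hat\sigma = \sqrt{\frac{\sum_{i=1}^{N}(y_i - \bar{y})^2}{N - 1}} \tag{2.5.16} \]

Under the normality assumption on the random error term, the probability density and log-probability density functions for each observation $i$ can be expressed as

\[ f_i = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(y_i - \bar{y})^2}{2\sigma^2}\right] \tag{2.5.17} \]

and

\[ \ln(f_i) = -\frac{(y_i - \bar{y})^2}{2\sigma^2} - \ln(\sqrt{2\pi}\,\sigma) \tag{2.5.18} \]

By replacing the parameters with the estimators above and summing the log-probability density values over the sample, one obtains

\[ L^*(\text{continuous observations}) = \sum_{i=1}^{N}\ln(f_i) = \sum_{i=1}^{N}\left[-\frac{(y_i - \bar{y})^2}{2\hat\sigma^2} - \ln(\sqrt{2\pi}\,\hat\sigma)\right] = -\frac{1}{2\hat\sigma^2}\sum_{i=1}^{N}(y_i - \bar{y})^2 - N\ln(\sqrt{2\pi}\,\hat\sigma) = -\frac{N - 1}{2} - N\ln(\sqrt{2\pi}\,\hat\sigma) \tag{2.5.19} \]

The log-likelihood function value for the naive discrete choice model is the same as before:

\[ L^*(\text{discrete observations}) = N\ln(1/J) \tag{2.5.20} \]

\[ L^*(\text{total}) = L^*(\text{continuous observations}) + L^*(\text{discrete observations}) = -\frac{N - 1}{2} - N\ln(\sqrt{2\pi}\,\hat\sigma J) \tag{2.5.21} \]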


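By plugging $L^*(\text{total})$ into (2.5.10), we obtain

\[ \Pr[\rho_f^2 - \rho_g^2 > z] \leq \Phi\left\{-\left[z\left(N - 1 + 2N\ln(\sqrt{2\pi}\,\hat\sigma J)\right)\right]^{1/2}\right\} \tag{2.5.22} \]

By replacing $\rho_f^2$ and $\rho_g^2$ with the standard adjusted likelihood ratio indices, we have

\[ \Pr[\bar\rho_2^2 - \bar\rho_1^2 > z] \leq \Phi\left\{-\left[z\left(N - 1 + 2N\ln(\sqrt{2\pi}\,\hat\sigma J)\right) + (K_2 - K_1)\right]^{1/2}\right\} \tag{2.5.23} \]

where

$\hat\sigma = \sqrt{\sum_{i=1}^{N}(y_i - \bar{y})^2/(N - 1)}$;
$N$ = sample size;
$J$ = number of alternatives in the discrete choice set;
$y_i$ = $i$th observation on the continuous dependent variable;
$\bar{y}$ = sample mean of $y_i$;

\[ \bar\rho_1^2 = 1 - \frac{L(\hat\beta_1) - K_1}{L(0)} \tag{2.5.24} \]

\[ \bar\rho_2^2 = 1 - \frac{L(\hat\beta_2) - K_2}{L(0)} \tag{2.5.25} \]

$\bar\rho_1^2$: adjusted likelihood ratio index for model g;
$\bar\rho_2^2$: adjusted likelihood ratio index for model f;
$L(\hat\beta_1)$: log-likelihood value at convergence in model g;
$L(\hat\beta_2)$: log-likelihood value at convergence in model f;
$L(0)$: log-likelihood value at zero (no parameters for the discrete choice model and two parameters, $\beta_0$ and $\sigma$, for the linear regression model);
$K_1$ and $K_2$: the numbers of parameters in model g and model f, respectively.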
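As a rough illustration of how the bounds in Equations (2.5.12) and (2.5.23) can be evaluated, the following Python sketch computes the naive log-likelihood values and the bounding probability. The function names are hypothetical, and the guard against a non-negative argument (which can occur for the continuous part when $\hat\sigma$ is very small) is an added assumption rather than part of the derivation.

```python
import math
from statistics import NormalDist, mean


def naive_loglik_discrete(n_obs, n_alts):
    """L* for a discrete choice model without parameters: N ln(1/J)."""
    return n_obs * math.log(1.0 / n_alts)


def naive_loglik_total(y, n_alts):
    """L*(total) of Equation (2.5.21): constant-only regression plus the
    parameter-free discrete choice model."""
    n = len(y)
    y_bar = mean(y)
    sigma_hat = math.sqrt(sum((v - y_bar) ** 2 for v in y) / (n - 1))
    return -(n - 1) / 2.0 - n * math.log(math.sqrt(2.0 * math.pi) * sigma_hat * n_alts)


def bounding_probability(z, loglik_zero, k_true, k_other):
    """Upper bound on Pr(rho2_other - rho2_true > z) for z > 0."""
    if z <= 0:
        raise ValueError("z must be positive")
    arg = -2.0 * z * loglik_zero + (k_other - k_true)
    if arg < 0:
        raise ValueError("bound undefined: argument of the square root is negative")
    return NormalDist().cdf(-math.sqrt(arg))
```

For a discrete choice comparison one would call `bounding_probability(z, naive_loglik_discrete(N, J), K1, K2)`. For a discrete-continuous comparison, `naive_loglik_total(y, J)` takes the place of the second argument.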


The probability that the adjusted likelihood ratio index of model f is greater by some $z > 0$ than that of model g, given that model g is the true model, is asymptotically bounded by the right-hand side of Equation (2.5.23) above. If the model with the greater $\bar\rho^2$ is selected, this bounds the probability of erroneously choosing the incorrect model over the true specification. With this procedure, discrete-continuous models under alternative causal structures can be compared against one another.

2.4 Monte Carlo Study for Bivariate Probit Model and Lee Model

2.4.1 Introduction

In the statistical and econometric literature, Monte Carlo studies are widely applied to illustrate the properties of estimators and to compare estimators obtained from different estimation methods. A synthetic random dataset is generated from pseudo-random sequences, given parameters and model formulations. The proposed estimation method is then applied to estimate the parameters from this synthetic dataset. One may compare the estimators with the true parameter values, which are given in advance, and examine the statistical properties of the estimators over a large number of simulation experiments.

In this dissertation, the consistency property of both the recursive and the joint estimation procedures and the efficiency property of the joint estimation procedure are of interest. There are two questions to be addressed. One is whether joint estimation of the causal model yields a consistent estimator for the endogenous variable. The other is whether the non-nested test is valid for comparing competing causal structures. Monte Carlo studies will be conducted in the context of the bivariate probit model to illustrate the


consistency of joint estimation results and to validate the bounding probability given by the non-nested test for comparing recursive bivariate probit models under alternative causal structures. The proposed mixed model is not selected for Monte Carlo studies because its estimation is very time-consuming (one successful estimation of a mixed model using 100 Halton random seeds takes 3 to 4 hours on a personal computer with a 3.0-GHz Pentium IV CPU). In addition, the estimators from Lee's discrete-continuous model are examined on a synthetic dataset whose covariance structure of random error terms is not consistent with the Lee model's assumption. It is found that, except for the coefficients of exogenous variables, all the other parameters are seriously biased when the assumption on the covariance structure of the random error terms is violated.

2.4.2 Monte Carlo Studies for Bivariate Probit Model

This section compares the performance of recursive estimation and joint estimation of the bivariate probit model. Assume the bivariate probit model takes the following form:

\[ M_q^* = \gamma_1 + \gamma_2 z_q + \varepsilon_q, \qquad T_q^* = \beta_1 + \beta_2 x_q + \omega_q \tag{2.6.1} \]

There is one constant and only one explanatory variable in either utility function. Let the parameters be $\gamma_1 = 0.1$, $\gamma_2 = 0.2$, $\beta_1 = -0.3$ and $\beta_2 = 0.15$; the explanatory variables $z_q$ and $x_q$ are uniformly distributed as R(0,3); the random error terms $\varepsilon_q$ and $\omega_q$ are standard bivariate normally distributed with zero means, unit variances, and correlation $\rho$, i.e. $(\varepsilon_q, \omega_q) \sim \phi_2(0, 0, 1, 1, \rho)$. Let $\rho$ = -0.4 and


-0.8, respectively, for an examination of the effect of error correlations of various magnitudes on the estimators.

The following procedure is employed to generate bivariate normal random seeds. Generate $t_1$ and $t_2$, which are independently and uniformly distributed as R(0,1). Let $\varepsilon_q = \Phi^{-1}(t_1)$ and $u_q = \Phi^{-1}(t_2)$; then $\varepsilon_q$ and $u_q$ are both standard univariate normally distributed as N(0, 1) and mutually independent. If we let $\omega_q = \rho\,\varepsilon_q + \sqrt{1 - \rho^2}\,u_q$, it is easy to show that $\varepsilon_q$ and $\omega_q$ are standard bivariate normally distributed with zero means, unit variances, and correlation $\rho$, i.e. $\phi_2(0, 0, 1, 1, \rho)$.

After $\gamma$, $\beta$, $z$, $x$, $\varepsilon$ and $\omega$ are determined, the latent variables $M_q^*$ and $T_q^*$ can be calculated directly. Let $M_q = 1$ if $M_q^* > 0$ and $M_q = 0$ otherwise, and let $T_q = 1$ if $T_q^* > 0$ and $T_q = 0$ otherwise. A dataset is thus simulated with four variables: $M_q$, $T_q$, $z_q$ and $x_q$. Both the recursive and the joint estimation method are applied to estimate the parameters, whose values have been determined in advance, thereby offering an opportunity to compare the true parameters and the parameter estimators directly. By running this procedure thousands of times, we may explicitly illustrate the statistical properties of the parameter estimators. In this study, the sample size is 1000 and the procedure is repeated 1000 times. The statistics of the estimators are listed in Table 2.1.

The upper block in the table gives the statistics of the estimators for $\rho$ = -0.4. Within it, the left-hand-side block shows the results from recursive estimation, i.e. the parameters estimated as two separate binary probit models, whereas the right-hand-side block provides those from the joint estimation process using the full-information likelihood method. The lower block gives the estimators for $\rho$ = -0.8. Similarly, the left-hand-side block and the right-hand-side block in the lower position show the statistics from recursive


and joint estimation, respectively, for the situation with higher error correlation. The sample size of the estimators from joint estimation under the higher correlation is 949, while the other three sample sizes are all 1000. This is because the likelihood maximization procedure failed to converge 51 times (about 5%) among the 1000 repetitions; excluding these 51 cases from the analysis is unlikely to influence the statistical distribution of the estimators.

In the table, the Min, Max, Mean and SD columns give the minimum value, maximum value, mean value and standard deviation for the sample of estimators. Tr. Par indicates the true value of the parameter given in advance. The ESD(.) rows give the estimated standard deviation of the estimators from the maximum likelihood procedure. For convenience, the standard deviation is obtained from the outer product of the estimated first-order derivative vectors with respect to the parameters instead of from the estimated Hessian matrix, since in some cases the Hessian matrix is not invertible, whereas the outer product of first-order derivatives can be used to approximate the Hessian matrix at convergence.

Let R1 = (Mean - Tr. Par)/Tr. Par, which measures the relative bias in the parameter estimators. Let R2 = [Mean(ESD) - SD]/SD, which measures the relative bias in the estimate of the estimator's standard deviation. Let R3 = (SD_r - SD_j)/SD_j, which measures the relative difference between the standard deviations of the estimators from recursive estimation (SD_r) and joint estimation (SD_j).

For $\rho$ = -0.4, the R1 values for all the parameters in both recursive estimation and joint estimation are less than 5% in absolute value, which supports the consistency of the estimators from both estimation methods. In addition, there are no absolute values in


R2 and R3 being greater than 5% on both sides, which implies that the joint estimation procedure does not substantially improve the efficiency of the estimators compared with the recursive estimation procedure when the magnitude of the correlation is as low as 0.4.

For $\rho$ = -0.8, the R1 values still suggest consistency of the estimators on both sides. However, the R2 values are 6.2% and 9.3% for the recursive estimators associated with the two constants in the model. The asymptotic estimators of the standard deviations do not fit the observed standard deviations very well; instead, there is considerable bias in these estimators. The R3 values for all the recursive estimators are greater than 14%, indicating that joint estimation substantially improves the efficiency of the parameter estimators when the magnitude of the correlation is as high as 0.8.

The Monte Carlo studies illustrate the defining feature of the bivariate probit model, namely its accommodation of the error correlation. The greater the absolute value of the correlation, the more efficiency is gained from the bivariate probit model relative to recursive binary probit models. However, both estimation methods yield consistent estimators of the model parameters, as illustrated by the Monte Carlo studies.

2.4.3 Monte Carlo Studies for Recursive Bivariate Probit Model

Suppose the recursive bivariate probit model represents two binary choices made in sequence. Let two dummy variables $M_q$ and $T_q$ indicate such two binary choices


Table 2.1 Statistics of Estimators for Bivariate Probit Model

$\rho$ = -0.4, Recursive Estimation (Sample Size = 1000)
Estimator             Min.    Max.    Mean      SD Tr. Par      R1      R2      R3
$\gamma_1$          -0.113   0.359   0.097   0.082   0.100  -0.029   0.000   0.023
$\gamma_2$           0.078   0.339   0.202   0.047   0.200   0.012   0.022   0.031
$\beta_1$           -0.581  -0.078  -0.302   0.078  -0.300   0.006   0.034   0.023
$\beta_2$            0.009   0.313   0.151   0.045   0.150   0.004   0.018   0.031
$\rho$                  --      --      --      --      --      --      --      --
ESD($\gamma_1$)      0.077   0.086   0.082   0.002      --      --      --      --
ESD($\gamma_2$)      0.045   0.051   0.048   0.001      --      --      --      --
ESD($\beta_1$)       0.077   0.086   0.081   0.001      --      --      --      --
ESD($\beta_2$)       0.044   0.049   0.046   0.001      --      --      --      --
ESD($\rho$)             --      --      --      --      --      --      --      --

$\rho$ = -0.4, Joint Estimation (Sample Size = 1000)
Estimator             Min.    Max.    Mean      SD Tr. Par      R1      R2
$\gamma_1$          -0.128   0.341   0.098   0.080   0.100  -0.024  -0.010
$\gamma_2$           0.078   0.353   0.202   0.045   0.200   0.011   0.020
$\beta_1$           -0.542  -0.082  -0.302   0.076  -0.300   0.006   0.024
$\beta_2$            0.025   0.310   0.151   0.044   0.150   0.004   0.016
$\rho$              -0.541  -0.242  -0.399   0.044  -0.400  -0.002   0.036
ESD($\gamma_1$)      0.073   0.084   0.079   0.002      --      --      --
ESD($\gamma_2$)      0.043   0.050   0.046   0.001      --      --      --
ESD($\beta_1$)       0.074   0.084   0.078   0.002      --      --      --
ESD($\beta_2$)       0.042   0.048   0.045   0.001      --      --      --
ESD($\rho$)          0.041   0.050   0.046   0.001      --      --      --

$\rho$ = -0.8, Recursive Estimation (Sample Size = 1000)
Estimator             Min.    Max.    Mean      SD Tr. Par      R1      R2      R3
$\gamma_1$          -0.199   0.353   0.100   0.081   0.100  -0.005   0.062   0.155
$\gamma_2$           0.063   0.373   0.201   0.048   0.200   0.007  -0.014   0.205
$\beta_1$           -0.563  -0.051  -0.302   0.078  -0.300   0.006   0.093   0.148
$\beta_2$            0.028   0.307   0.150   0.046   0.150   0.001   0.006   0.202
$\rho$                  --      --      --      --      --      --      --      --
ESD($\gamma_1$)      0.081   0.094   0.086   0.002      --      --      --      --
ESD($\gamma_2$)      0.045   0.052   0.048   0.001      --      --      --      --
ESD($\beta_1$)       0.080   0.091   0.085   0.002      --      --      --      --
ESD($\beta_2$)       0.044   0.049   0.046   0.001      --      --      --      --
ESD($\rho$)             --      --      --      --      --      --      --      --

$\rho$ = -0.8, Joint Estimation (Sample Size = 949)
Estimator             Min.    Max.    Mean      SD Tr. Par      R1      R2
$\gamma_1$          -0.082   0.295   0.102   0.070   0.100   0.024   0.007
$\gamma_2$           0.080   0.306   0.201   0.040   0.200   0.004  -0.011
$\beta_1$           -0.560  -0.097  -0.303   0.068  -0.300   0.011   0.035
$\beta_2$            0.016   0.253   0.150   0.038   0.150   0.003   0.007
$\rho$              -0.868  -0.716  -0.798   0.025  -0.800  -0.002   0.025
ESD($\gamma_1$)      0.065   0.077   0.070   0.002      --      --      --
ESD($\gamma_2$)      0.036   0.044   0.040   0.001      --      --      --
ESD($\beta_1$)       0.064   0.077   0.070   0.002      --      --      --
ESD($\beta_2$)       0.035   0.042   0.038   0.001      --      --      --
ESD($\rho$)          0.020   0.032   0.026   0.002      --      --      --


for person q, i.e. $M_q = 1$ if M is selected by person q and $M_q = 0$ otherwise; $T_q = 1$ if T is selected by person q and $T_q = 0$ otherwise. Person q first makes the choice decision on $M_q$ and then on $T_q$. The sequential manner carries two implications:
1. The choice decision on M is made before the choice decision on T;
2. The predetermined choice on M exerts an impact on the choice decision on T.

Assume there exists a latent continuous variable $M_q^*$ underlying the dummy variable $M_q$: $M_q = 1$ if $M_q^* > 0$ and $M_q = 0$ otherwise. Let $M_q^* = \gamma_1 + \gamma_2 z_q + \varepsilon_q$, where $\gamma_1 = 0.1$ and $\gamma_2 = 0.2$; the explanatory variable $z_q$ is a uniformly distributed random variable as R(0,3). Let $\varepsilon_q = \Phi^{-1}(t_1)$, where $t_1 \sim$ R(0,1), so that $\varepsilon_q \sim N(0,1)$.

After $M_q$ is determined, person q makes the decision on $T_q$ conditional on $M_q$ according to the other latent continuous variable $T_q^*$: $T_q^* = \beta_1 + \beta_2 x_q + \alpha M_q + \omega_q$, where $\beta_1 = -0.3$, $\beta_2 = 0.15$ and $\alpha = 0.9$. The explanatory variable $x_q$ is a uniformly distributed random variable as R(0,3). $\omega_q \sim N(0,1)$ and is correlated with $\varepsilon_q$ because $\omega_q$ and $\varepsilon_q$ contain common unobserved variables. Let the correlation $\rho = -0.4$ and $\omega_q = \rho\,\varepsilon_q + \sqrt{1 - \rho^2}\,\Phi^{-1}(t_2)$, $t_2 \sim$ R(0,1); then $(\varepsilon_q, \omega_q) \sim \phi_2(0, 0, 1, 1, \rho)$. Now $T_q^*$ can be calculated to determine the choice decision $T_q$: $T_q = 1$ if $T_q^* > 0$ and $T_q = 0$ otherwise. Finally, one obtains a simulated dataset with four variables: $M_q$, $T_q$, $z_q$ and $x_q$.

The two alternative recursive bivariate probit models of Section 2.2.1 are both applied to estimate the parameters. Clearly, the causal structure (T → M) is a wrong model specification for the simulated dataset, whereas the causal structure (M → T) is the correct one. In addition to the parameter estimators, adjusted $\bar\rho^2$ (adjusted likelihood ratio


index) values are recorded from both causal structures in each estimation process to examine the performance of the non-nested test. The procedure of simulating the dataset and estimating the parameters is repeated 1000 times for each of several sample sizes (1000, 2000, 3000 and 5000) in the interest of finding an appropriate sample size for applying the non-nested test.

Table 2.2 reports the estimation results for joint estimation under the true causal structure, for the recursive estimation procedure (in which the error correlation is restricted to zero), and for joint estimation under the wrong causal direction; its first part covers synthetic datasets with sample size (N1) 1000 and 2000. R1 = (Mean - Tr. Par)/Tr. Par is again used to measure the relative bias in the parameter estimators. A t-test can be conducted to compare the mean value of the estimators with the true parameters, relying on the asymptotic normality of the MLE (Maximum Likelihood Estimator). N stands for the sample size of the estimators' statistics. For example, in the joint estimation procedure, N = 930 means that 930 estimations successfully reached convergence among 1000 simulation experiments. The remaining part of Table 2.2 offers the same statistics for synthetic datasets with sample sizes of 3000 and 5000.

In all the tables, none of the joint estimators is rejected by the t-test, whereas the recursive estimators for the endogenous variable and for the constant in the same latent function are rejected as consistent by the t-test. This implies that the joint estimation procedure, rather than the recursive estimation procedure, is necessary when an endogenous variable is present.

It is noticeable that the relative bias in the recursive bivariate probit model is substantially greater than that in the bivariate probit model. For example, the R1 value for the joint estimator of $\alpha$ is as high as -13.9% when N1 = 1000. In bivariate probit model, as N1


= 1000, most relative biases are less than 5%. As N1 increases from 1000 to 5000, the absolute value of R1 for $\alpha$ decreases from 0.139 to 0.029. The statistical results suggest that a large sample is required to estimate the endogenous coefficient as accurately as the coefficients in the bivariate probit model. The joint estimation results for the wrong causal structure generally provide inconsistent estimators of the constants. For N1 = 3000 or 5000, the estimators of the coefficients of the exogenous variables appear consistent even if the causal structure is wrong.

For the row of adjusted $\bar\rho_0^2$, R1 = $(\bar\rho_{0,\mathrm{wrong}}^2 - \bar\rho_{0,\mathrm{true}}^2)/\bar\rho_{0,\mathrm{true}}^2$, where $\bar\rho_{0,\mathrm{true}}^2$ represents the adjusted likelihood ratio index at zero for the model under the true causal structure, whereas $\bar\rho_{0,\mathrm{wrong}}^2$ represents that under the wrong causal structure. Regardless of the random dataset's size (N1), the relative difference in the adjusted likelihood ratio index between the true model and the wrong model is as small as -0.007 or -0.008. The statistical results indicate that this seemingly slight difference in the goodness-of-fit measures under alternative model structures is informative enough to identify the model under the true causal structure.

The non-nested test has been introduced to identify the true causal structure between two alternatives. To examine the power of the non-nested test, we applied it to each simulation experiment. For each simulation experiment, let $z = \bar\rho_2^2 - \bar\rho_1^2$, where $\bar\rho_2^2$ is the adjusted likelihood ratio index at zero for the wrong model (denoted model 2) and $\bar\rho_1^2$ is that for the true model (denoted model 1). If z < 0, model 1 performs seemingly better than model 2; we then establish the null hypothesis that model 1 is true and calculate the bounding probability (denoted BP) given by $\Phi\{-[-2|z|L(0) + (K_2 - K_1)]^{1/2}\}$ (see Equation 2.5.12 in Section


Figure 2.1 Distribution of z (N = 930 and N1 = 1000)
[Histogram. Horizontal axis: z in unit-width bins from -9 to 5 (unit: 10E-3). Vertical axis: percent (%), 0 to 45.]


Table 2.2 Statistics of Estimators for Recursive Bivariate Probit Model

Joint Estimation
                      N1 = 1000 and N = 930                            | N1 = 2000 and N = 940
Estimator               Min.    Max.    Mean      SD      R1  t-test |    Min.    Max.    Mean      SD      R1  t-test
$\gamma_1$ (0.1)      -0.154   0.377   0.091   0.077  -0.087   -0.12 |  -0.105   0.307   0.094   0.060  -0.061   -0.10
$\gamma_2$ (0.2)       0.071   0.362   0.205   0.045   0.025    0.11 |   0.061   0.347   0.203   0.036   0.016    0.08
$\beta_1$ (-0.3)      -0.964   0.946  -0.225   0.362  -0.251    0.21 |  -0.880   0.855  -0.261   0.288  -0.129    0.14
$\beta_2$ (0.15)      -0.010   0.329   0.147   0.048  -0.019   -0.06 |   0.054   0.288   0.150   0.037   0.000    0.00
$\alpha$ (0.9)        -0.812   1.709   0.775   0.514  -0.139   -0.24 |  -0.751   1.650   0.832   0.407  -0.075   -0.17
$T_q$ in $M_q^*$          --      --      --      --      --      -- |      --      --      --      --      --      --
$\rho$ (-0.4)         -0.977   0.620  -0.326   0.328  -0.185    0.23 |  -0.947   0.532  -0.362   0.261  -0.094    0.15
Adj. $\bar\rho_0^2$    0.067   0.146  0.1064   0.012   -0.53      -- |   0.079   0.140  0.1076   0.010   0.079      --

Recursive Estimation
                      N1 = 1000 and N = 1000                           | N1 = 2000 and N = 1000
Estimator               Min.    Max.    Mean      SD      R1  t-test |    Min.    Max.    Mean      SD      R1  t-test
$\gamma_1$ (0.1)      -0.156   0.403   0.097   0.079  -0.025   -0.04 |  -0.105   0.341   0.098   0.062  -0.018   -0.03
$\gamma_2$ (0.2)       0.050   0.363   0.201   0.047   0.005    0.02 |   0.063   0.347   0.201   0.037   0.004    0.03
$\beta_1$ (-0.3)      -0.249   0.476   0.125   0.100  -1.417    4.25 |  -0.145   0.434   0.128   0.075  -1.426    5.71
$\beta_2$ (0.15)      -0.011   0.336   0.158   0.049   0.050    0.16 |   0.054   0.288   0.159   0.038   0.062    0.24
$\alpha$ (0.9)         0.002   0.575   0.271   0.087  -0.699   -7.23 |   0.029   0.477   0.266   0.070  -0.704   -9.06

Joint Estimation in wrong causal direction
                      N1 = 1000 and N = 908                            | N1 = 2000 and N = 950
Estimator               Min.    Max.    Mean      SD      R1  t-test |    Min.    Max.    Mean      SD      R1  t-test
$\gamma_1$ (0.1)      -1.192   1.030   0.034   0.499  -0.658   -0.13 |  -1.128   0.989   0.075   0.395  -0.247   -0.06
$\gamma_2$ (0.2)       0.035   0.342   0.184   0.047  -0.080    1.79 |   0.054   0.343   0.188   0.037  -0.061    2.38
$\beta_1$ (-0.3)      -0.044   0.509   0.291   0.080  -1.969    2.39 |   0.060   0.464   0.296   0.060  -1.988    3.27
$\beta_2$ (0.15)       0.019   0.335   0.162   0.046   0.079    1.35 |   0.064   0.276   0.160   0.037   0.066    1.62
$\alpha$ (0.9)            --      --      --      --      --      -- |      --      --      --      --      --      --
$T_q$ in $M_q^*$      -1.211   1.761   0.100   0.696      --      -- |  -1.231   1.647   0.043   0.556      --    0.10
$\rho$ (-0.4)         -0.952   0.994   0.095   0.426  -1.237   -0.01 |  -0.909   0.938   0.128   0.340  -1.319    0.08
Adj. $\bar\rho_0^2$    0.067   0.145  0.1056   0.012  -0.008      -- |   0.059   0.139  0.1068   0.010  -0.007      --


Table 2.2 (Continued)

Joint Estimation
                      N1 = 3000 and N = 950                            | N1 = 5000 and N = 939
Estimator               Min.    Max.    Mean      SD      R1  t-test |    Min.    Max.    Mean      SD      R1  t-test
$\gamma_1$ (0.1)      -0.059   0.272   0.096   0.045  -0.039   -0.09 |  -0.004   0.269   0.095   0.035  -0.054   -0.14
$\gamma_2$ (0.2)       0.110   0.317   0.203   0.026   0.013    0.12 |   0.121   0.263   0.203   0.021   0.017    0.14
$\beta_1$ (-0.3)      -0.862   0.523  -0.266   0.226  -0.112    0.15 |  -0.751   0.327  -0.287   0.169  -0.045    0.08
$\beta_2$ (0.15)       0.060   0.228   0.148   0.027  -0.015   -0.07 |   0.091   0.219   0.151   0.021   0.006    0.05
$\alpha$ (0.9)        -0.314   1.602   0.851   0.316  -0.055   -0.16 |  -0.055   1.503   0.874   0.238  -0.029   -0.11
$T_q$ in $M_q^*$          --      --      --      --      --      -- |      --      --      --      --      --      --
$\rho$ (-0.4)         -0.910   0.393  -0.373   0.200  -0.066    0.14 |  -0.825   0.158  -0.386   0.153  -0.035    0.09
Adj. $\bar\rho_0^2$    0.087   0.137  0.1084   0.007      --      -- |   0.089   0.131  0.1086   0.006      --      --

Recursive Estimation
                      N1 = 3000 and N = 1000                           | N1 = 5000 and N = 1000
Estimator               Min.    Max.    Mean      SD      R1  t-test |    Min.    Max.    Mean      SD      R1  t-test
$\gamma_1$ (0.1)      -0.058   0.256   0.099   0.046  -0.013   -0.02 |  -0.005   0.268   0.097   0.036  -0.026   -0.08
$\gamma_2$ (0.2)       0.110   0.318   0.201   0.027   0.005    0.04 |   0.122   0.263   0.202   0.022   0.009    0.09
$\beta_1$ (-0.3)      -0.046   0.304   0.133   0.057  -1.445    7.60 |  -0.029   0.272   0.127   0.044  -1.423    9.70
$\beta_2$ (0.15)       0.062   0.234   0.156   0.028   0.040    0.21 |   0.095   0.229   0.159   0.021   0.057    0.43
$\alpha$ (0.9)         0.065   0.428   0.265   0.051  -0.706  -12.45 |   0.158   0.373   0.267   0.039  -0.703  -16.23

Joint Estimation in wrong causal direction
                      N1 = 3000 and N = 984                            | N1 = 5000 and N = 998
Estimator               Min.    Max.    Mean      SD      R1  t-test |    Min.    Max.    Mean      SD      R1  t-test
$\gamma_1$ (0.1)      -1.089   0.932   0.077   0.342  -0.232   -0.07 |  -0.876   0.786   0.088   0.267  -0.118   -0.04
$\gamma_2$ (0.2)       0.104   0.303   0.190   0.028  -0.050   -0.36 |   0.107   0.255   0.193   0.022  -0.036   -0.32
$\beta_1$ (-0.3)       0.171   0.443   0.304   0.046  -2.012   13.13 |   0.164   0.418   0.299   0.036  -1.996   16.64
$\beta_2$ (0.15)       0.075   0.233   0.155   0.027   0.033    0.19 |   0.095   0.230   0.157   0.021   0.048    0.33
$\alpha$ (0.9)            --      --      --      --      --      -- |      --      --      --      --      --      --
$T_q$ in $M_q^*$      -1.135   1.618   0.043   0.478      --      -- |  -1.017   1.370   0.025   0.376      --      --
$\rho$ (-0.4)         -0.913   0.859   0.127   0.293  -1.318    1.80 |  -0.665   0.775   0.140   0.229  -1.349    2.36
Adj. $\bar\rho_0^2$    0.087   0.136  0.1076   0.007  -0.007      -- |   0.088   0.130  0.1078   0.006  -0.007      --


2.3.2). If BP < 0.05, model 1 is selected; otherwise the test result is recorded as inconclusive. In other words, the test is conducted at the 0.05 significance level.

If z > 0, model 2 performs seemingly better than model 1; the null hypothesis that model 2 is true is established and the BP value is calculated. Similarly, if BP < 0.05, model 2 is selected; otherwise the result is recorded as inconclusive.

Figure 2.1 shows the distribution of z from the simulation experiments for N1 = 1000. The distribution is strongly skewed toward the negative side, which is consistent with the expectation that the true model fits the data better in most simulation experiments.

Table 2.3 shows the statistical results for the application of the non-nested test under various sample sizes. For N1 = 1000, among the 847 effective experiments, in which convergence is reached under both causal structures, 237 experiments offer a better goodness-of-fit from the wrong model than from the true model. Among these 237 experiments, only 29 obtain a BP value less than 0.05, in which case the wrong model 2 is judged to be the true model. This is a Type II error, in which the wrong model is incorrectly accepted. In this study, the probability of making a Type II error can be estimated as 29/847 ≈ 3.4% < 5%. (5% is the significance level selected for the non-nested test.) This result supports the validity of the non-nested test for identifying alternative causal structures. In addition, there are 265 (31.3%) experiments with a conclusively correct judgement and 553 (65.3%) experiments with an inconclusive judgement. As N1 increases, more experiments can be conclusively identified. For example, if the sample size of the dataset for model estimation increases to 5000, 89.9% of the experiments are correctly identified by the non-nested test (as shown in Table 2.3). Thus, the simulation study strongly recommends a large sample size for estimating the recursive bivariate probit model.
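The data-generating process of Section 2.4.3 and the decision rule of the non-nested test can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used for the reported experiments. The function names are hypothetical, the pseudo-random generator and seed are arbitrary choices, and `k_diff` (the difference in the number of parameters, set to zero by default) is an assumption for the case where both causal structures have the same number of parameters. The parameter values are those stated in the text.

```python
import math
import random
from statistics import NormalDist

_INV_CDF = NormalDist().inv_cdf


def _open_uniform(rng):
    """Uniform draw on the open interval (0, 1), so inv_cdf is defined."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def simulate_dataset(n_obs, rho=-0.4, alpha=0.9, seed=0):
    """Simulate (M_q, T_q, z_q, x_q, eps_q, omega_q) under the M -> T structure."""
    gamma1, gamma2 = 0.1, 0.2
    beta1, beta2 = -0.3, 0.15
    rng = random.Random(seed)
    rows = []
    for _ in range(n_obs):
        z_q = rng.uniform(0.0, 3.0)
        x_q = rng.uniform(0.0, 3.0)
        eps = _INV_CDF(_open_uniform(rng))
        # omega is built from eps so that Corr(eps, omega) = rho
        omega = rho * eps + math.sqrt(1.0 - rho ** 2) * _INV_CDF(_open_uniform(rng))
        m_q = 1 if gamma1 + gamma2 * z_q + eps > 0 else 0
        t_q = 1 if beta1 + beta2 * x_q + alpha * m_q + omega > 0 else 0
        rows.append((m_q, t_q, z_q, x_q, eps, omega))
    return rows


def nonnested_decision(rho2_model1, rho2_model2, loglik_zero, k_diff=0, level=0.05):
    """Return 'model 1', 'model 2' or 'inconclusive' from the bounding probability."""
    z = rho2_model2 - rho2_model1
    if z == 0:
        return "inconclusive"
    bp = NormalDist().cdf(-math.sqrt(-2.0 * abs(z) * loglik_zero + k_diff))
    if bp >= level:
        return "inconclusive"
    return "model 1" if z < 0 else "model 2"
```

In a full experiment, each simulated dataset would be estimated under both causal structures, and `nonnested_decision` would be applied to the two adjusted likelihood ratio indices. Tallying the outcomes over the repetitions gives a table of the same layout as Table 2.3.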


Table 2.3 Statistics for Non-nested Test Application

N1 = 1000                          Inconclusive    Conclusive      Total
True model is seemingly better     345 (40.7%)     265 (31.3%)     610 (72.0%)
Wrong model is seemingly better    208 (24.6%)      29 (3.4%)      237 (28.0%)
Total                              553 (65.3%)     294 (34.7%)     847 (100%)

N1 = 2000                          Inconclusive    Conclusive      Total
True model is seemingly better     287 (32.1%)     465 (52.0%)     752 (84.0%)
Wrong model is seemingly better    118 (13.2%)      25 (2.8%)      143 (16.0%)
Total                              405 (45.3%)     490 (54.8%)     895 (100%)

N1 = 3000                          Inconclusive    Conclusive      Total
True model is seemingly better     112 (12.0%)     746 (79.9%)     858 (91.9%)
Wrong model is seemingly better     48 (5.1%)       28 (3.0%)       76 (8.1%)
Total                              160 (17.1%)     774 (82.9%)     934 (100%)

N1 = 5000                          Inconclusive    Conclusive      Total
True model is seemingly better      67 (7.1%)      843 (89.9%)     910 (97.0%)
Wrong model is seemingly better     22 (2.3%)        6 (0.6%)       28 (3.0%)
Total                               89 (9.4%)      849 (90.5%)     938 (100%)

2.4.4 Monte Carlo Studies for Lee Model

In this section, a Monte Carlo study is conducted to examine the robustness of the discrete-continuous simultaneous equations model based on the Lee transformation. As long as the distributional function is known and all the parameters of this function are identified, maximum likelihood estimators are consistent and asymptotically efficient. In real cases, however, the distributional assumption can easily be violated. Robustness refers to the consistency of the estimators when the distributional assumption is violated. For example, the linear regression model is considered robust because the consistency of the OLS estimators does not depend on the distributional assumption on its random error term. The procedure to generate the synthetic datasets is presented as follows.

1. Generation of Error Terms

Suppose there are 3 alternatives (I = 3) in the discrete choice set. Let $z_1$, $z_2$ and $z_3$ be independently and uniformly distributed as R(0,1). Let $\varepsilon_1 = -\ln[-\ln(z_1)]$, $\varepsilon_2 = -\ln[-\ln(z_2)]$ and $\varepsilon_3 = -\ln[-\ln(z_3)]$. Then $\varepsilon_1$, $\varepsilon_2$ and $\varepsilon_3$ are i.i.d. Gumbel(0, 1). To realize the correlation between the normal seeds and each Gumbel variate $\varepsilon_i$, let $\xi_1 = \Phi^{-1}(z_1)$, $\xi_2 = \Phi^{-1}(z_2)$ and $\xi_3 = \Phi^{-1}(z_3)$; thus $\xi_i \sim N(0,1)$ and $\mathrm{corr}(\varepsilon_i, \xi_i) \approx 1$ (because both are non-linear transformations of the same random seed, the correlation is not exactly equal to 1; however, it is a constant approximately equal to 0.97). Generate $z_4 \sim$ R(0,1), independent of $z_i$, and let $\xi_4 = \Phi^{-1}(z_4)$. Then $\xi_4 \sim N(0,1)$ is independent of $\varepsilon_i$ (i = 1, 2, 3). Let

$$\eta = \frac{f_1\xi_1 + f_2\xi_2 + f_3\xi_3 + f_4\xi_4}{\sqrt{f_1^2 + f_2^2 + f_3^2 + f_4^2}} \tag{2.6.2}$$

where $f_i$ are arbitrary constant coefficients that control the correlation between $\eta$ and $\varepsilon_i$. It can be shown that $\eta \sim N(0, 1)$ and

$$\mathrm{corr}(\eta, \varepsilon_i) \approx \frac{f_i}{\sqrt{f_1^2 + f_2^2 + f_3^2 + f_4^2}} \tag{2.6.3}$$

This process results in the generation of three i.i.d. standard Gumbel random seeds $\varepsilon_i$ and one standard normal random seed $\eta$, with $\mathrm{corr}(\eta, \varepsilon_i) \approx f_i / \sqrt{f_1^2 + f_2^2 + f_3^2 + f_4^2} = c_i$. Note that $c_i$ is constant, which meets the requirement noted in the preceding discussions.
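
The error-term construction in step 1 can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the code used in the study: the function names, the seed and the number of draws are hypothetical, and the symbols `eps`, `xi` and `eta` simply stand for the Gumbel seeds, the normal seeds and their weighted combination.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used only for its inverse CDF


def open_uniform(rng):
    """Draw from R(0,1), rejecting an exact 0.0 so the logarithms stay finite."""
    z = rng.random()
    while z == 0.0:
        z = rng.random()
    return z


def draw_errors(f, rng):
    """One draw of ([eps_1, eps_2, eps_3], eta) following step 1."""
    z = [open_uniform(rng) for _ in range(4)]
    eps = [-math.log(-math.log(zi)) for zi in z[:3]]    # Gumbel(0, 1) seeds
    xi = [STD_NORMAL.inv_cdf(zi) for zi in z]           # N(0, 1) seeds
    scale = math.sqrt(sum(fi * fi for fi in f))
    eta = sum(fi * x for fi, x in zip(f, xi)) / scale   # weighted combination
    return eps, eta


def pearson(xs, ys):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulated_correlations(f, n_draws=20000, seed=0):
    """Estimate corr(eta, eps_i) for i = 1, 2, 3 by simulation."""
    rng = random.Random(seed)
    draws = [draw_errors(f, rng) for _ in range(n_draws)]
    etas = [eta for _, eta in draws]
    return [pearson([eps[i] for eps, _ in draws], etas) for i in range(3)]
```

Calling `simulated_correlations((-0.5, 0.4, 0.0, 0.5))` should return values close to the constants c_i implied by the chosen weights, up to simulation noise.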

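2. Generation of Parameters and Explanatory Variables

Given the model equations (2.6.4) with no endogenous variable and the model equations (2.6.5) with an endogenous continuous variable in the utility function,

$$u_{qi} = \beta_{0i} + \beta_{1i}\, y_{qi} + \varepsilon_{qi}, \qquad a_q = \gamma_0 + \gamma_1 x_q + \sigma\,\eta_q \tag{2.6.4}$$

$$u_{qi} = \beta_{0i} + \beta_{1i}\, y_{qi} + \lambda_i a_q + \varepsilon_{qi}, \qquad a_q = \gamma_0 + \gamma_1 x_q + \sigma\,\eta_q \tag{2.6.5}$$

let $f_1 = -0.5$, $f_2 = 0.4$, $f_3 = 0.0$ and $f_4 = 0.5$; then $c_1 \approx -0.60$ and $c_2 \approx 0.48$.

Table 2.4 True Values of Parameters in the Model

| Utility constants | Utility coefficients | Endogenous-variable coefficients | Continuous model |
|---|---|---|---|
| $\beta_{01} = -0.15$ | $\beta_{11} = -0.1$ | $\lambda_1 = 0.2$ | $\gamma_0 = 1.0$ |
| $\beta_{02} = 0.25$ | $\beta_{12} = 0.3$ | $\lambda_2 = -0.2$ | $\gamma_1 = 1.5$ |
| $\beta_{03} = 0$ | $\beta_{13} = 0$ | - | $\sigma = 2$ |

Let the explanatory variables be $y_1 \sim$ R(0,3), $y_2 \sim$ R(0,2) and $x \sim$ R(0,4), where R() represents a uniform distribution.

3. Generation of Dependent Variables

Let $a = 1.0 + 1.5x + 2\eta$. Then calculate $u_i$ based on the model formulated as Equation (2.6.5):

$u_1 = -0.15 - 0.1 y_1 + 0.2a + \varepsilon_1$, $u_2 = 0.25 + 0.3 y_2 - 0.2a + \varepsilon_2$, $u_3 = \varepsilon_3$.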
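
Steps 1 through 3 can be combined into a single generator for one synthetic dataset. The sketch below is an illustration under assumptions rather than the original implementation: all function and variable names are hypothetical, the uniform seeds are clipped slightly away from 0 and 1 to keep the logarithms finite, and the chosen alternative is taken to be the one with the largest utility, as the text goes on to define.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()
F = (-0.5, 0.4, 0.0, 0.5)   # weights f_1..f_4 from step 2
COEF_A = (0.2, -0.2)        # coefficients of a in u_1 and u_2 (Table 2.4)


def draw_observation(rng, endogenous=True):
    """Generate one synthetic observation following steps 1 to 3."""
    # Step 1: error terms (seeds clipped away from 0 and 1; an assumption).
    z = [rng.uniform(1e-12, 1.0 - 1e-12) for _ in range(4)]
    eps = [-math.log(-math.log(zi)) for zi in z[:3]]
    xi = [STD_NORMAL.inv_cdf(zi) for zi in z]
    eta = sum(f * x for f, x in zip(F, xi)) / math.sqrt(sum(f * f for f in F))

    # Step 2: explanatory variables.
    y1, y2, x = rng.uniform(0, 3), rng.uniform(0, 2), rng.uniform(0, 4)

    # Step 3: continuous variable, utilities and choice indicators.
    a = 1.0 + 1.5 * x + 2.0 * eta
    coef1, coef2 = COEF_A if endogenous else (0.0, 0.0)
    u = (-0.15 - 0.1 * y1 + coef1 * a + eps[0],
         0.25 + 0.3 * y2 + coef2 * a + eps[1],
         eps[2])
    chosen = max(range(3), key=lambda i: u[i])
    d = tuple(int(i == chosen) for i in range(3))
    return {"y1": y1, "y2": y2, "x": x, "a": a, "D": d}


def generate_dataset(n=3000, seed=0, endogenous=True):
    """Generate n observations; n = 3000 matches the sample size in the text."""
    rng = random.Random(seed)
    return [draw_observation(rng, endogenous) for _ in range(n)]
```

Setting `endogenous=False` drops the continuous variable from the utilities, which corresponds to the model of Equation (2.6.4).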

For the model formulated by Equation (2.6.4), we only calculate $u_1 = -0.15 - 0.1 y_1 + \varepsilon_1$, $u_2 = 0.25 + 0.3 y_2 + \varepsilon_2$ and $u_3 = \varepsilon_3$. Then, calculate the dummy variables $D_1$, $D_2$ and $D_3$ indicating the discrete choices as:

$D_1 = (u_1 > u_2$ and $u_1 > u_3)$;
$D_2 = (u_2 > u_1$ and $u_2 > u_3)$;
$D_3 = (u_3 > u_1$ and $u_3 > u_2)$.

In each case, $D_i = 1$ if the conditions are satisfied and $D_i = 0$ otherwise. This completes the development of a synthetic random dataset consisting of the discrete choice indicators $D_1$, $D_2$ and $D_3$ as well as the explanatory variables $y_1$, $y_2$ and $x$. The sample size is set to 3000.

4. Simulation Results

Following the previous procedure, a synthetic random dataset is generated 500 times and the parameters are estimated each time by maximizing the log-likelihood function. The statistical results are shown in Table 2.5 and Table 2.6. Table 2.5 offers the statistical results of the estimators from joint estimation for the model without endogenous variables, whereas Table 2.6 offers the results from recursive estimation (multinomial logit model for the discrete choice and linear regression for the continuous variable). In both tables, the t-test fails to reject the null hypothesis that the estimators for the exogenous variables are consistent. However, the t-test rejects the consistency of the estimator of $\sigma$ from the joint estimation method. The mean value of the estimator of $\rho_i$ is far from

$c_i$, which indicates that $\rho_i$ cannot truly represent the correlation $c_i$. This can lead to an inconsistent estimate of $\sigma$. It is noticeable that the estimators in Table 2.5 are slightly more efficient than those in Table 2.6, as evidenced by the smaller standard deviations. This is because joint estimation that accommodates error correlations, despite being misspecified, still improves the efficiency of the estimators for exogenous variables.

Table 2.5 Statistics of Estimators from Joint Estimation Procedure (without Endogenous Variables)
Discrete-continuous Simultaneous Equation System: Joint Estimation (N = 500)

| Parameter | Minimum | Maximum | Mean | True Parameter | Std Dev | Difference | t-stat |
|---|---|---|---|---|---|---|---|
| $\beta_{01}$ | -0.452 | 0.078 | -0.159 | -0.150 | 0.088 | -0.009 | -0.104 |
| $\beta_{11}$ ($y_1$) | -0.286 | 0.052 | -0.102 | -0.100 | 0.047 | -0.002 | -0.048 |
| $\beta_{02}$ | -0.023 | 0.461 | 0.239 | 0.250 | 0.071 | -0.011 | -0.152 |
| $\beta_{12}$ ($y_2$) | 0.119 | 0.494 | 0.312 | 0.300 | 0.057 | 0.012 | 0.212 |
| $\gamma_0$ | 0.712 | 1.258 | 0.985 | 1.000 | 0.085 | -0.015 | -0.182 |
| $\gamma_1$ ($x$) | 1.415 | 1.585 | 1.500 | 1.500 | 0.026 | 0.000 | -0.006 |
| $\sigma$ | 1.817 | 1.956 | 1.888 | 2.000 | 0.025 | -0.112 | -4.548 |
| $c_1$ and $\rho_1$ | 0.664 | 0.796 | $\rho_1$ = 0.726 | $c_1$ = -0.600 | 0.022 | - | - |
| $c_2$ and $\rho_2$ | -0.708 | -0.496 | $\rho_2$ = -0.619 | $c_2$ = -0.597 | 0.034 | - | - |
| $c_3$ and $\rho_3$ | -0.051 | 0.199 | $\rho_3$ = 0.077 | $c_3$ = 0.477 | 0.044 | - | - |

Table 2.7 offers the statistical results of the estimators obtained through the joint estimation method for the model with endogenous continuous variables in the utility functions, whereas Table 2.8 offers the results from the recursive estimation method. Since $\rho_i$ does not truly represent $c_i$, consistent estimators for the coefficients of the endogenous variable $a$ were not obtained, as evidenced by the t-tests, which strongly reject the null hypothesis that the expectation of the estimator is equal to the true parameter value. Inconsistency in the estimator for the endogenous variable leads to inconsistency in all of the estimators of the constant terms in the model. The t-test fails to reject consistency only for the estimators of the exogenous variables; this is reasonable because the coefficients for exogenous variables can be

consistently estimated even without accommodating random error correlations, similar to those in Table 2.6.

Table 2.6 Statistics of Estimators from Recursive Estimation Procedure (without Endogenous Variables)
Discrete-continuous Simultaneous Equation System: Recursive System Estimation (N = 500)

| Parameter | Minimum | Maximum | Mean | True Parameter | Std Dev | Difference | t-stat |
|---|---|---|---|---|---|---|---|
| $\beta_{01}$ | -0.488 | 0.130 | -0.156 | -0.150 | 0.096 | -0.006 | -0.061 |
| $\beta_{11}$ ($y_1$) | -0.271 | 0.053 | -0.097 | -0.100 | 0.053 | 0.003 | 0.054 |
| $\beta_{02}$ | 0.002 | 0.522 | 0.252 | 0.250 | 0.077 | 0.002 | 0.029 |
| $\beta_{12}$ ($y_2$) | 0.113 | 0.485 | 0.300 | 0.300 | 0.064 | 0.000 | 0.004 |
| $\gamma_0$ | 0.799 | 1.236 | 1.001 | 1.000 | 0.073 | 0.001 | 0.010 |
| $\gamma_1$ ($x$) | 1.415 | 1.586 | 1.501 | 1.500 | 0.031 | 0.001 | 0.031 |
| $\sigma$ | 1.921 | 2.068 | 2.000 | 2.000 | 0.026 | 0.000 | -0.018 |
| $c_1$ and $\rho_1$ | 0.000 | 0.000 | $\rho_1$ = 0.000 | $c_1$ = -0.600 | - | - | - |
| $c_2$ and $\rho_2$ | 0.000 | 0.000 | $\rho_2$ = 0.000 | $c_2$ = -0.597 | - | - | - |
| $c_3$ and $\rho_3$ | 0.000 | 0.000 | $\rho_3$ = 0.000 | $c_3$ = 0.477 | - | - | - |

Table 2.7 Statistics of Estimators from Joint Estimation Procedure (with Endogenous Variables)
Discrete-continuous Simultaneous Equation System: Joint Estimation (N = 500)

| Parameter | Minimum | Maximum | Mean | True Parameter | Std Dev | Difference | t-stat |
|---|---|---|---|---|---|---|---|
| $\beta_{01}$ | -0.875 | 0.075 | -0.492 | -0.150 | 0.154 | -0.342 | -2.223 |
| $\beta_{11}$ ($y_1$) | -0.185 | 0.058 | -0.082 | -0.100 | 0.034 | 0.018 | 0.526 |
| $\lambda_1$ ($a$) | 0.146 | 0.359 | 0.271 | 0.200 | 0.030 | 0.071 | 2.329 |
| $\beta_{02}$ | -0.876 | -0.171 | -0.502 | 0.250 | 0.117 | -0.752 | -6.420 |
| $\beta_{12}$ ($y_2$) | 0.152 | 0.562 | 0.325 | 0.300 | 0.074 | 0.025 | 0.341 |
| $\lambda_2$ ($a$) | -0.106 | 0.030 | -0.047 | -0.200 | 0.017 | 0.153 | 9.044 |
| $\gamma_0$ | 1.214 | 2.743 | 2.158 | 1.000 | 0.264 | 1.158 | 4.382 |
| $\gamma_1$ ($x$) | 1.216 | 1.673 | 1.411 | 1.500 | 0.077 | -0.089 | -1.150 |
| $\sigma$ | 2.059 | 2.455 | 2.273 | 2.000 | 0.071 | 0.273 | 3.830 |
| $c_1$ and $\rho_1$ | 0.738 | 0.967 | $\rho_1$ = 0.912 | $c_1$ = -0.600 | 0.033 | - | - |
| $c_2$ and $\rho_2$ | -0.292 | 0.073 | $\rho_2$ = -0.069 | $c_2$ = 0.480 | 0.060 | - | - |
| $c_3$ and $\rho_3$ | 0.076 | 0.373 | $\rho_3$ = 0.224 | $c_3$ = 0.000 | 0.051 | - | - |
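
The Difference and t-stat columns in these tables appear to be computed as the mean estimate minus the true value, and that difference divided by the standard deviation of the estimates across the 500 replications. This reading is an inference from the tabulated numbers rather than a definition stated in the text; the sketch below, with a hypothetical function name, checks it against two rows.

```python
def consistency_check(mean_estimate, true_value, std_dev):
    """Return (difference, t-stat) under the assumed definition:
    difference = mean - true value, t-stat = difference / std dev."""
    difference = mean_estimate - true_value
    return difference, difference / std_dev


# Row for the coefficient of a in u_2, Table 2.7 (joint estimation).
diff_a, t_a = consistency_check(-0.047, -0.200, 0.017)

# Row for the scale parameter of the continuous model, Table 2.5.
diff_s, t_s = consistency_check(1.888, 2.000, 0.025)

print(f"coefficient of a: difference {diff_a:.3f}, t {t_a:.2f}")
print(f"scale parameter:  difference {diff_s:.3f}, t {t_s:.2f}")
```

The recomputed t-statistics differ slightly from the tabulated ones, presumably because the published means and standard deviations are rounded to three decimals.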

Table 2.8 provides the estimation results for a recursive system, which is obtained by imposing zero values on the correlations $\rho_i$. The results indicate that the constants and the parameters for the endogenous variable $a$ in the utility functions are not consistently estimated, but all of the parameters in the continuous model appear consistent with the true parameter values. This is because there are no endogenous variables in the continuous model, so any parameter in this model equation can be consistently estimated in a recursive system. In the joint estimation procedure, the constant in the continuous model is inconsistently estimated because of the inconsistent estimator on [...]. However, this is not a problem in the recursive estimation procedure.

2.4.5 Summary

The Monte Carlo studies in this section disclose the following facts:

1. Joint estimators for the bivariate probit model are more efficient than recursive estimators, particularly when the error correlation is high. However, the estimators are consistent under both recursive estimation and joint estimation.

2. The joint estimator of the coefficient for the endogenous dummy variable in the recursive bivariate probit model is consistent, whereas the recursive estimator of that coefficient is inconsistent. Regardless of whether recursive or joint estimation is used, the estimators of the coefficients for exogenous variables are consistent. Regardless of whether the causal structure is correct or wrong, the estimators of the coefficients for exogenous variables are consistent.

3. The non-nested test offers a valid upper bound on the probability that the model under the wrong causal structure will be conclusively accepted when the test is applied to compare recursive bivariate probit models under alternative causal structures.

4. The estimation of the coefficient for the endogenous variable in the discrete-continuous model is highly sensitive to the covariance structure of the random error terms. An arbitrary specification of error correlations does not help to estimate the coefficient for endogenous variables consistently. However, the coefficients for exogenous variables can be consistently estimated under both recursive estimation and joint estimation. Joint estimators for exogenous variables are more efficient than recursive estimators even if the covariance structure of the random error terms is misspecified.

Table 2.8 Statistics of Estimators from Recursive Estimation Procedure (with Endogenous Variables)
Discrete-continuous Simultaneous Equation System: Recursive System Estimation (N = 500)

| Parameter | Minimum | Maximum | Mean | True Parameter | Std Dev | Difference | t-stat |
|---|---|---|---|---|---|---|---|
| $\beta_{01}$ | 0.317 | 0.907 | 0.654 | -0.150 | 0.106 | 0.804 | 7.623 |
| $\beta_{11}$ ($y_1$) | -0.243 | 0.005 | -0.115 | -0.100 | 0.043 | -0.015 | -0.355 |
| $\lambda_1$ ($a$) | -0.054 | 0.072 | 0.003 | 0.200 | 0.016 | -0.197 | -12.014 |
| $\beta_{02}$ | -1.178 | -0.364 | -0.760 | 0.250 | 0.130 | -1.010 | -7.752 |
| $\beta_{12}$ ($y_2$) | 0.160 | 0.608 | 0.354 | 0.300 | 0.080 | 0.054 | 0.671 |
| $\lambda_2$ ($a$) | -0.057 | 0.096 | 0.001 | -0.200 | 0.020 | 0.201 | 9.942 |
| $\gamma_0$ | 0.724 | 1.211 | 1.003 | 1.000 | 0.078 | 0.003 | 0.037 |
| $\gamma_1$ ($x$) | 1.386 | 1.621 | 1.499 | 1.500 | 0.034 | -0.001 | -0.018 |
| $\sigma$ | 1.910 | 2.068 | 1.998 | 2.000 | 0.027 | -0.002 | -0.072 |
| $c_1$ and $\rho_1$ | 0.000 | 0.000 | $\rho_1$ = 0.000 | $c_1$ = -0.600 | - | - | - |
| $c_2$ and $\rho_2$ | 0.000 | 0.000 | $\rho_2$ = 0.000 | $c_2$ = 0.480 | - | - | - |
| $c_3$ and $\rho_3$ | 0.000 | 0.000 | $\rho_3$ = 0.000 | $c_3$ = 0.000 | - | - | - |

Chapter Three: Dataset Preparation and Description

3.1 Introduction to Swiss Travel Survey

The data set used for analysis and model estimation is extracted from the Swiss Travel Microcensus 2000. A very detailed description of the survey and the survey sample can be found in Ye and Pendyala (2003). The survey respondent sample consists of 27,918 households from 26 cantons in Switzerland. The person sample was formed by randomly selecting one person over six years old from each household with fewer than four household members and two persons over six years old from each household with four or more members. As a result of this sampling scheme, the person respondent sample consisted of 29,407 persons. All of the persons in the person sample were asked to report their travel in a one-day trip diary. The resulting trip data set includes 103,376 trips reported by the 29,407 interviewed persons (some of whom may have made zero trips on the survey day). The household, person and trip characteristics of these samples are shown in Table 3.1 through Table 3.4.

3.2 Dataset Description at Household Level

Table 3.1 shows the household characteristics of Switzerland. The sample shows that the average household size in Switzerland is 2.43. Single-person households constitute 27.5% of all households, whereas the group of large households with more than 3 (≥ 4) household

members constitutes 23.4%.

Table 3.1 Household Characteristics of Swiss Travel Microcensus 2000

| Characteristic | Swiss Sample |
|---|---|
| Sample Size | 27918 |
| Household Size (mean) | 2.43 |
| 1 person | 27.5% |
| 2 persons | 35.1% |
| 3 persons | 14.0% |
| ≥ 4 persons | 23.4% |
| Monthly Income: Low (Fr 8K) | 18.4% |
| Monthly Income: Missing | 24.9% |
| Vehicle Ownership (mean) | 1.17 |
| 0 auto | 19.8% |
| 1 auto | 50.5% |
| 2 autos | 24.5% |
| ≥ 3 autos | 5.2% |
| Family Type: Single | 27.2% |
| Family Type: Partner (unmarried and no child) | 27.9% |
| Family Type: Married | 43.6% |
| Family Type: Other | 1.3% |
| Presence of Children: Child < 6 years old | 10.6% |
| Presence of Children: Child 6~17 years old | 22.5% |
| Household Location: Major city | 42.4% |
| Household Location: Surrounding areas of city | 30.4% |
| Household Location: Isolated city | 1.1% |
| Household Location: Rural | 26.1% |

Household monthly income has a low response rate, as indicated by the 24.9% of missing values. When missing values are excluded, around 27.7% of households are categorized as low-income households (20.8% in the raw data). However, the proportion of low-income households should be higher than 27.7% due to

the potentially positive correlation between low response rate and low income. The average number of household cars is 1.17 in Switzerland. As expected, the proportion of households without automobiles (19.8%) in this Swiss sample is substantially higher than in a typical sample from the United States. This may reflect the higher level of public transport service in Switzerland, which enables mobility and accessibility without the same level of auto dependence. As a result, one might expect the automobile to play a smaller role in the Swiss travel environment than in the US environment. 43.6% of households are composed of married couples with or without children. 10.6% of households have children who are less than 6 years old and 22.5% of households have children who are 6~17 years old. In Switzerland, 42.4% of households are located in major cities and 30.4% in the surrounding areas of cities, while 26.1% are located in rural areas and only 1.1% in isolated cities.

3.3 Dataset Description at Person Level

Table 3.2 presents the person characteristics of the Swiss respondent sample. The average age in the person sample is 43.9 years, and the proportion of people aged 60 or over is as high as 25.5%. It is well known that Switzerland has a serious problem with an aging population. 46.3% of respondents in the sample are male, while 53.7% are female. This is unlikely to be a true reflection of the gender distribution in the population; the higher female proportion in the sample is probably caused by a higher response rate among females. 48.4% of persons are not employed, 37.3% are employed full time and 14.3% are employed part time. Only 67.4% of respondents in the Swiss sample are licensed, which is much lower than in a typical US sample. On average,

Swiss people make 3.51 trips per day, of which 0.46 trips are for work purposes and 3.06 trips are for non-work purposes.

Table 3.2 Person Characteristics of Swiss Travel Microcensus 2000

| Characteristic | Swiss Sample |
|---|---|
| Sample Size | 29407 |
| Age in years (mean) | 43.9 |
| Young (6~29) | 26.8% |
| Middle (30~59) | 47.6% |
| Old (≥ 60) | 25.5% |
| Sex: Male | 46.3% |
| Sex: Female | 53.7% |
| Employment Status: Full time | 37.3% |
| Employment Status: Part time | 14.3% |
| Employment Status: Not employed | 48.4% |
| Licensed | 67.4% |
| #Trips/day | 3.51 |
| Work trips | 0.46 |
| Non-work trips | 3.06 |

3.4 Dataset Description at Trip Level

Table 3.3 and Table 3.4 illustrate the trip characteristics of the Swiss sample by analyzing the trip distribution by purpose and mode. Among the total of 103,376 trips, 101,783 trips are selected for analysis after excluding the trips with missing values for purpose or mode. Table 3.3 offers the trip purpose distribution by trip mode. In general, as many as 25.1% of trips are made for leisure purposes, 13.2% for work purposes and 39.4% for the back-home purpose. To some degree, the percentage of back-home trips reflects the prevalence of trip chaining behavior. In the absence of trip chaining behavior (home → single destination → home), the percentage of back-home trips should be around 50%. That 39.4% in the sample is considerably lower

than 50% indicates that trip chaining behavior cannot be ignored in the context of Switzerland. The distribution of trip purpose across modes is generally consistent with expectation. For example, 18.0% of auto drivers' trips are for work purposes, whereas only 5.4% of auto passengers' trips are for work purposes. It is intuitive that workers do not tend to commute as auto passengers. Only 0.8% of trips made by transit riders are for service, compared with 6.3% of trips made by auto drivers. Intuitively, auto drivers are more likely to serve other people than transit riders are.

Table 3.3 Trip Characteristics of Swiss Travel Microcensus 2000 (Trip Purpose Distribution by Trip Mode)

| Mode | All Purposes (trips) | Work (%) | School (%) | Shopping (%) | Business (%) | Leisure (%) | Service (%) | Back Home (%) | All (%) |
|---|---|---|---|---|---|---|---|---|---|
| Auto Driver | 39059 | 18.0 | 0.5 | 11.9 | 4.3 | 19.8 | 6.3 | 39.1 | 100 |
| Auto Passenger | 11671 | 5.4 | 2.5 | 10.4 | 1.5 | 32.9 | 4.1 | 43.3 | 100 |
| Bicycle/Motorcycle | 9297 | 14.6 | 8.2 | 9.9 | 1.3 | 21.4 | 1.3 | 43.4 | 100 |
| Pedestrian | 29052 | 8.1 | 6.5 | 14.1 | 1.1 | 32.0 | 1.8 | 36.3 | 100 |
| Transit | 12704 | 16.2 | 8.0 | 10.5 | 3.0 | 20.8 | 0.8 | 40.6 | 100 |
| Trip Numbers (all modes) | 101783 | 13.2 | 4.1 | 12.0 | 2.6 | 25.1 | 3.6 | 39.4 | 100 |

Table 3.4 Trip Characteristics of Swiss Travel Microcensus 2000 (Trip Mode Distribution by Trip Purpose)

| Mode | All Purposes | Work | School | Shopping | Business | Leisure | Service | Back Home |
|---|---|---|---|---|---|---|---|---|
| Auto Driver (%) | 38.4 | 52.3 | 5.1 | 38.0 | 62.4 | 30.4 | 66.8 | 38.1 |
| Auto Passenger (%) | 11.5 | 4.7 | 6.9 | 9.9 | 6.6 | 15.1 | 12.9 | 12.6 |
| Bicycle/Motorcycle (%) | 9.1 | 10.1 | 18.1 | 7.5 | 4.4 | 7.8 | 3.3 | 10.1 |
| Pedestrian (%) | 28.5 | 17.6 | 45.5 | 33.6 | 12.4 | 36.4 | 14.1 | 26.3 |
| Transit (%) | 12.5 | 15.3 | 24.3 | 11.0 | 14.1 | 10.3 | 2.9 | 12.9 |
| All (%) | 100 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| Trip Numbers | 101783 | 13436 | 4181 | 12217 | 2671 | 25509 | 3704 | 40065 |

Table 3.4 offers the trip mode distribution by trip purpose. Generally, auto-dependent trips, including those made as drivers and as passengers, constitute around half of the total number of trips. 12.5% of trips use public transit, 28.5% are made on foot and 9.1% of

trips use bicycle or motorcycle. These statistics illustrate a multimodal travel environment in the Swiss context. Further, the mode distribution for each trip purpose is also consistent with common sense. For example, 52.3% of work trips are made by auto drivers, but only 4.7% by auto passengers. Among all trip types, school trips have the lowest proportion of auto use but the highest proportions of transit, cycle and walk modes.
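
The back-home share argument in Section 3.4 can be made concrete with a small calculation. The sketch below rests on an assumption that is not stated in the text: every trip belongs to a complete home-based tour, and each tour contains exactly one back-home trip. Under that assumption the back-home share equals one over the mean number of trips per tour, so the mean number of out-of-home stops per tour is 1/share - 1. The function name is illustrative.

```python
def mean_stops_per_tour(back_home_share):
    """Mean out-of-home stops per tour implied by the back-home trip share,
    assuming every tour ends with exactly one back-home trip."""
    return 1.0 / back_home_share - 1.0


no_chaining = mean_stops_per_tour(0.50)    # benchmark without trip chaining
swiss_sample = mean_stops_per_tour(0.394)  # back-home share from Table 3.3

print(f"no chaining:  {no_chaining:.2f} stops per tour")
print(f"Swiss sample: {swiss_sample:.2f} stops per tour")
```

A share of 50% corresponds to exactly one stop per tour, while the observed 39.4% implies roughly one and a half stops per tour under this simplification.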

Chapter Four: Empirical Estimation Results

4.1 Causal Models Between Trip Chaining and Mode Choice (Recursive Bivariate Probit Model)

4.1.1 Background

Over the past few decades, there has been considerable research on people's trip chaining patterns, i.e., the propensity to link a series of activities into a multi-stop tour or journey (Shiftan, 1998; Dissanayake and Morikawa, 2002). The analysis of trip chaining activity may lead to a better understanding of travel behavior and provide a more appropriate framework for examining various transportation policy issues (Strathman and Dueker, 1995). Indeed, the profession has seen tour-based models being developed and increasingly applied in the travel demand forecasting arena in place of the more traditional trip-based models that do not reflect trip chaining behavior and tour formation.

The terms trip chain and tour are used synonymously to refer to a sequence of trips that begins at home, involves visits to one or more other places, and ends at home. Depending on the number of places visited within the tour or chain, the tour may be classified into one of two categories: simple and complex. A tour or chain with a single stop or activity outside the home location is defined as a simple tour, whereas a tour or chain with more than one stop outside the home location is defined as a complex tour. Thus

a tour or chain of the form home → shop → home is considered a simple tour, while a tour of the form home → work → shop → home is considered a complex tour.

As people's activity patterns become increasingly complex and involve interactions with other household and non-household members, and as time is a finite resource, it may be conjectured that trip chains are likely to become increasingly complex over time. The ability to chain multiple activities together in a single tour or chain may provide greater efficiency and convenience than a series of single-stop simple tours (Hensher and Reyes, 2000). There are at least two reasons why this has significant traffic and policy implications. First, complex tours or chains may lead to an increase in automobile usage. If one needs to pursue complex tours or chains, then the flexibility afforded by the private automobile is desirable. The ability to pursue multiple activities in a single journey is rather limited when constrained by the schedules, routes, and uncertainty associated with public transportation. Thus, complex trip chaining may contribute to increased auto dependency and, consequently, automobile traffic. Second, in the case of workers (commuters), the formation of complex trip chains may entail the linking of non-work activities with the work trip (commute). Non-work trips that could have taken place outside the peak periods then occur in the peak periods simply because they are tied to the commute. Thus, complex trip chaining patterns may contribute to an increase in peak period travel demand.

The above discussion clearly points to the possible interdependency between trip chaining, auto usage, and trip timing. Strathman and Dueker (1995), in an analysis of the 1990 NPTS, found that complex trip chains may tend to be more auto-oriented. However, the nature of the causal relationship is not unilaterally evident because the availability of

an automobile may provide the flexibility and convenience that contribute to the formation of complex trip chains. The flexibility of the automobile may stimulate the desire to undertake additional activities in one tour. For example, the lower travel times typically associated with the auto mode may relax time constraints and lead to more stop-making (Bhat, 1997). Moreover, shared rides, which constitute a portion of the total auto mode share, are more likely to involve complex tours due to the variety of trip purposes and destinations among the driver and passengers. The central question being addressed is: does mode choice influence the complexity of trip chaining patterns, or does the complexity of trip chaining patterns influence mode choice?

Previously, Strathman and Dueker (1994) analyzed the probability of an individual engaging in a complex work tour using a binary logit model formulation, where the complexity/simplicity of a tour was modeled as a binary choice. One may also adopt a binary choice formulation to model mode choice at the tour level, i.e., auto vs. non-auto mode choice. Thus, the investigation of the mutual influence and causal relationship between tour complexity and mode choice may be reduced to a problem involving two binary discrete choice variables. The nested logit model is often applied in dealing with problems of this nature. Based on the assumption of a conditional choice mechanism, nested logit models representing two alternative tree structures can be formed. By checking the reasonableness of the estimated inclusive value parameter coefficients and/or comparing measures of goodness-of-fit between models of the two different structures, the more plausible structure that is supported by the data may be identified. Hensher and Reyes (2000) used the nested logit model formulation to understand the role of trip chaining in serving as a barrier to the use of public transport

modes. This section is intended to further clarify the relationship between mode choice and tour complexity using the recursive bivariate probit model (see Section 2.2.1), which explicitly allows the quantification of the impact of one choice dimension on another. In other words, it models the causal relationship between the complexity of trip chains and mode choice.

4.1.2 Dataset Preparation and Description for Modeling Analysis

In this study, the unit of analysis is the tour or trip chain. A trip chain is defined as a complete home-to-home journey where the origin of the first trip is home and the destination of the last trip is home. No intermediate home stop is present within the trip chain. Whenever the home location is reached, a chain is formed. A tour-level data set was formed by aggregating the trip data set to the tour level. All person and household characteristics were merged into the tour-level data set. In most cases, a single mode was prevalent for the trip chain. In cases where multiple modes were prevalent within the same trip chain or tour, a single mode was assigned based on whether or not the auto mode was used in the chain. If the auto mode was used for any segment in the trip chain, then the chain was assigned the auto mode; otherwise it was assigned the non-auto mode. One may argue that the main mode should be used to represent mode choice at the tour level, but the definition of the mode for a chain is a complex issue. The definition in this study is made in this way because the major concern is not the main mode of the chain, but whether the auto mode was used for any part of the chain, thus potentially contributing to the formation of a multi-stop complex chain. Each tour was classified as a simple or

complex tour depending on whether it had one intermediate stop or more than one intermediate stop within the chain.

Data corresponding to respondents from the Canton of Zurich were extracted to reduce the data to a more manageable size and to control for possible area-specific effects. Tables 4.1 and 4.2 include summary statistics for the Zurich subsample in addition to those of the overall Swiss sample. There are 3293 persons from 2998 households who report at least one non-work tour in the Zurich sample and 1466 persons from 1438 households who report at least one work tour. It is to be noted that these two samples are not mutually exclusive, as some individuals may report both a work tour and a non-work tour. As expected, households in which there are work tour makers report higher income levels than households in which there are non-work tour makers, presumably because the work tour maker households consistently include workers earning wages. The average household size is a little over two persons per household, while vehicle ownership is a little over one vehicle per household. As expected, a very small percentage of households in the Zurich subsample report their residence as being in a rural location, presumably due to the urban nature of Zurich and its immediate surrounding areas.

Person characteristics also show similarities between the overall Swiss sample and the Zurich subsamples. As expected, the non-work tour maker sample consists of a greater proportion of elderly (retired) and young persons than the work tour maker sample. On average, work tour makers make about 1.17 trip chains per day, where a trip chain is defined as a complete home-to-home tour. Non-work tour makers report, on average, about 1.49 trip chains per day. Work tour makers make 4.46 trips per day, while non-work tour makers report fewer trips at 4.11 trips per day. The trip rates are

substantially higher than the trip rate for the overall Swiss sample, which is partially caused by the exclusion of zero-trip making persons from the Zurich subsample. As the model estimation was performed only on the Zurich subsample, all further analysis presented in this section pertains only to this subsample. The Zurich subsample included 4,901 non-work tours and 1,711 work tours.

Table 4.1 Household Characteristics of Swiss Travel Microcensus 2000 and Zurich Subsamples

| Characteristic | Swiss Sample | Zurich Non-work Tour Makers | Zurich Work Tour Makers |
|---|---|---|---|
| Sample Size | 27918 | 2998 | 1438 |
| Household Size (mean) | 2.43 | 2.42 | 2.33 |
| 1 person | 27.5% | 29.9% | 31.5% |
| 2 persons | 35.1% | 33.2% | 33.5% |
| 3 persons | 14.0% | 11.6% | 12.0% |
| ≥ 4 persons | 23.4% | 25.4% | 22.9% |
| Monthly Income: Low (Fr 8K) | 18.4% | 21.5% | 35.0% |
| Monthly Income: Missing | 24.9% | 20.1% | 15.7% |
| Vehicle Ownership (mean) | 1.17 | 1.07 | 1.25 |
| 0 auto | 19.8% | 24.2% | 16.5% |
| 1 auto | 50.5% | 49.8% | 51.0% |
| 2 autos | 24.5% | 21.8% | 26.1% |
| ≥ 3 autos | 5.2% | 4.2% | 6.4% |
| Family Type: Single | 27.2% | 29.5% | 31.0% |
| Family Type: Partner (unmarried and no child) | 27.9% | 26.4% | 26.4% |
| Family Type: Married | 43.6% | 42.0% | 40.0% |
| Family Type: Other | 1.3% | 2.1% | 2.6% |
| Presence of Children: Child < 6 years old | 10.6% | 9.9% | 9.0% |
| Presence of Children: Child 6~17 years old | 22.5% | 24.7% | 19.9% |
| Household Location: Major city | 42.4% | 54.4% | 54.2% |
| Household Location: Surrounding areas of city | 30.4% | 35.0% | 35.7% |
| Household Location: Isolated city | 1.1% | 0.8% | 0.8% |
| Household Location: Rural | 26.1% | 9.7% | 9.4% |
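
The tour construction rules of Section 4.1.2 (a chain closes whenever home is reached, a chain is an auto chain if any segment uses the auto mode, and a chain is complex if it has more than one out-of-home stop) can be sketched as follows. This is an illustration only: the `Trip` record, the mode labels and the function names are hypothetical and do not reflect the actual layout of the Microcensus files. The sketch also assumes that a person's first trip starts at home and drops trailing trips that never return home.

```python
from dataclasses import dataclass

# Hypothetical mode labels treated as "auto" for the tour-level mode.
AUTO_MODES = {"auto_driver", "auto_passenger"}


@dataclass
class Trip:
    destination: str   # e.g. "home", "work", "shop" (illustrative labels)
    mode: str          # e.g. "auto_driver", "transit", "walk"


def build_tours(trips):
    """Split one person's ordered trips into home-to-home chains."""
    tours, current = [], []
    for trip in trips:
        current.append(trip)
        if trip.destination == "home":   # reaching home closes the chain
            tours.append(current)
            current = []
    return tours                         # an unfinished trailing chain is dropped


def classify_tour(tour):
    """Return the tour-level auto and complexity indicators."""
    stops = len(tour) - 1                # the last trip returns home
    return {
        "AUTO": int(any(t.mode in AUTO_MODES for t in tour)),
        "COMPLEX": int(stops > 1),
    }
```

For example, the trip sequence work (transit), shop (auto passenger), home (auto passenger), shop (walk), home (walk) yields two tours: a complex auto tour followed by a simple non-auto tour.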

Table 4.2 Person Characteristics of Swiss Travel Microcensus 2000 and Zurich Subsamples

| Characteristic | Swiss Sample | Zurich Non-work Tour Makers | Zurich Work Tour Makers |
|---|---|---|---|
| Sample Size | 29407 | 3293 | 1466 |
| Age in years (mean) | 43.9 | 44.2 | 41.8 |
| Young (6~29) | 26.83% | 28.1% | 18.3% |
| Middle (30~59) | 47.64% | 42.1% | 74.6% |
| Old (≥ 60) | 25.48% | 29.8% | 7.2% |
| Sex: Male | 46.31% | 45.6% | 60.8% |
| Sex: Female | 53.69% | 54.4% | 39.2% |
| Employment Status: Full time | 37.34% | 29.4% | 76.9% |
| Employment Status: Part time | 14.27% | 14.5% | 20.2% |
| Employment Status: Not employed | 48.39% | 56.0% | 2.9% |
| Licensed | 67.43% | 64.1% | 87.9% |
| #Chains/day | 1.33 | 1.49 | 1.17 |
| #Trips/day | 3.51 | 4.11 | 4.46 |
| Work trips | 0.46 | 0.21 | 1.54 |
| Non-work trips | 3.06 | 3.90 | 2.92 |
| Work Trip Mode Share: Auto | 55.84% | 49.6% | 52.0% |
| Work Trip Mode Share: Non-Auto | 44.16% | 50.4% | 48.0% |
| Non-Work Trip Mode Share: Auto | 48.92% | 43.5% | 54.1% |
| Non-Work Trip Mode Share: Non-Auto | 51.08% | 56.5% | 45.9% |

Tables 4.3 and 4.4 offer simple cross-tabulations of tour complexity against mode choice. Table 4.3 examines the distribution of tour complexity by mode choice for non-work tours, while Table 4.4 examines the distribution for work tours. An examination of the column-based percentages in Table 4.3 indicates that about 28 percent of simple non-work tours involve the use of the automobile as the primary mode of transportation. This value is considerably higher, at 44 percent, for complex non-work tours. Thus it appears

that there is a correlation (at least) between mode choice and tour complexity. Clearly, the auto mode is utilized to a greater degree in the context of complex multi-stop trip chains. Similarly, examining the row-based percentages shows that 80 percent of non-work non-auto tours are simple in nature (involve only one stop). On the other hand, only 66 percent of non-work auto tours are simple in nature. Thus it appears that non-auto tours tend to be simpler than auto-based tours.

Table 4.3 Crosstabulation of Mode Choice and Tour Type for Non-work Tours

| Statistic | Mode Choice | Simple | Complex | Total |
|---|---|---|---|---|
| Frequency | Non-auto | 2685 | 661 | 3346 |
| Frequency | Auto | 1030 | 525 | 1555 |
| Frequency | Total | 3715 | 1186 | 4901 |
| Column Percent | Non-auto | 72.3% | 55.7% | 68.3% |
| Column Percent | Auto | 27.7% | 44.3% | 31.7% |
| Column Percent | Total | 100.0% | 100.0% | 100.0% |
| Row Percent | Non-auto | 80.2% | 19.8% | 100.0% |
| Row Percent | Auto | 66.2% | 33.8% | 100.0% |
| Row Percent | Total | 75.8% | 24.2% | 100.0% |

Table 4.4 Crosstabulation of Mode Choice and Tour Type for Work Tours

| Statistic | Mode Choice | Simple | Complex | Total |
|---|---|---|---|---|
| Frequency | Non-auto | 436 | 355 | 791 |
| Frequency | Auto | 397 | 523 | 920 |
| Frequency | Total | 833 | 878 | 1711 |
| Column Percent | Non-auto | 52.3% | 40.4% | 46.2% |
| Column Percent | Auto | 47.7% | 59.6% | 53.8% |
| Column Percent | Total | 100.0% | 100.0% | 100.0% |
| Row Percent | Non-auto | 55.1% | 44.9% | 100.0% |
| Row Percent | Auto | 43.2% | 56.8% | 100.0% |
| Row Percent | Total | 48.7% | 51.3% | 100.0% |

Table 4.4 offers similar indications, although the tendencies are not as strong as those seen in Table 4.3. In the case of work tours, it is found that a majority of simple tours are non-auto-based (52 percent) while a majority of complex tours are auto-based

(60 percent). Similarly, a majority of non-auto-based work tours tend to be simple in nature (55 percent), while a majority of auto-based tours tend to be complex in nature (57 percent). Once again, a clear correlation between auto use and trip chain complexity is seen in these cross-tabulations. Given the difference in the percent distributions between work and non-work tours, it was considered prudent to examine the causal relationship between tour complexity and mode choice for work and non-work tours separately.

4.1.3 Model Estimation Results

This section presents estimation results of the recursive bivariate probit model for causal analysis between trip chaining formation and mode choice. All the tables showing model estimation results consist of four blocks. The first two blocks provide the model estimation results for the causal structure where tour complexity affects mode choice. Of these two blocks, the left block provides estimation results from the recursive estimation method and the right block provides estimation results from the joint estimation method. The next two blocks provide estimation results for the causal structure where mode choice affects tour complexity. Similarly, of these two blocks, the left block is obtained from the recursive estimation method and the right block from the joint estimation method.

4.1.3.1 Estimation Results for Non-Work Tours

Table 4.5 offers the definition and description of the variables in the non-work tour model, and the estimation results for the non-work tour models are provided in Table 4.6. In the causal structure where tour complexity affects mode choice, the

coefficient for tour complexity is statistically significant and positive in the mode choice model, regardless of whether the recursive or the joint estimation method is used. This lends credence to the hypothesis that the need to make a complex tour is likely to increase dependency on the auto mode. The coefficient of COMPLEX from joint estimation is more positive than that from recursive estimation (1.409 vs. 0.456) because the negative error correlation (-0.622) is accommodated in the joint estimation procedure. In addition, it was found that demographic and socio-economic characteristics, the tour's primary purpose, and time-of-day significantly influence mode choice and tour complexity. The coefficients of these variables are rather close between recursive estimation and joint estimation.

In the auto mode choice model, the negative coefficient of CAR_0 is consistent with the expectation that tour makers with zero autos are less likely to use the auto mode, while those with more than one auto are more likely to use the auto mode, as evidenced by the positive coefficient of CAR_GE2. As expected, those with a driver license are more prone to using the auto mode while those with seasonal transit ticket subscriptions are less likely to use the auto mode. Transit ticket subscribers are likely to be more transit-oriented and have better access to transit services than non-subscribers. Tours made by persons living in rural areas are likely to be auto-oriented, presumably due to their limited transit accessibility.

In the tour complexity model, it is found that individuals in larger households tend to make less complex tours than individuals in smaller households. One may conjecture that the possibility of task allocation present in a multi-person household may reduce the need to perform multi-stop trip chains (Strathman and Dueker, 1994). The young and the elderly are less likely to pursue complex non-work tours, possibly

because they have fewer household obligations than those in the middle age groups.

Table 4.5 Non-work-tour Model Variable Description and Statistics (N = 4901)

Variable    Variable Description                                 Mean   Std. Dev.
CAR_0       Number of autos in household = 0                     0.21   0.41
AUTOLIC     Person has auto license                              0.62   0.48
H_SUB       Person subscribes half-price seasonal ticket         0.43   0.49
O_SUB       Person subscribes other type of seasonal ticket      0.26   0.44
CAR_GE2     Number of autos in household ≥ 2                     0.29   0.45
RURAL       Person lives in rural area                           0.11   0.31
COMPLEX     Tour is complex (multi-stop)                         0.24   0.43
HHSIZE      Number of household members                          2.72   1.46
YOUNG       Person < 18 years old                                0.21   0.40
OLD         Person > 60 years old                                0.27   0.45
SERVICE     Primary purpose of the tour is service               0.06   0.23
SHOPPING    Primary purpose of the tour is shopping              0.30   0.46
AMPEAK      Tour starts in AM peak period (7:00~8:59)            0.17   0.37
PMPEAK      Tour starts in PM peak period (16:00~17:59)          0.10   0.30
AUTO        Tour uses auto                                       0.32   0.46

Table 4.6 Non-work-tour Model

Causal Structure                  Complex → Auto                          Auto → Complex
                         Recursive Est.      Joint Est.        Recursive Est.      Joint Est.
Variable                 Coef.    t-test     Coef.    t-test   Coef.    t-test     Coef.    t-test
Auto Mode Choice Model
  Constant              -2.097   -24.24     -2.100   -24.98   -1.991   -23.65     -1.988   -23.86
  CAR_0                 -1.303   -12.59     -1.185   -12.84   -1.290   -12.55     -1.290   -13.32
  AUTOLIC                2.211    25.66      1.933    19.76    2.224    26.04      2.222    26.86
  O_SUB                 -0.428    -7.01     -0.391    -6.95   -0.424    -6.98     -0.423    -6.94
  H_SUB                 -0.206    -4.33     -0.178    -4.24   -0.201    -4.26     -0.203    -4.39
  CAR_GE2                0.200     4.00      0.201     4.36    0.202     4.08      0.197     3.90
  RURAL                  0.139     1.94      0.121     1.86    0.117     1.64      0.125     1.75
  COMPLEX                0.456     8.71      1.409    11.42       --       --         --       --
Complex Tour Choice Model
  Constant              -0.285    -5.07     -0.282    -5.31   -0.427    -7.21     -0.384    -6.12
  HHSIZE                -0.120    -6.74     -0.130    -7.92   -0.128    -7.14     -0.121    -6.74
  YOUNG                 -0.243    -3.82     -0.210    -3.46   -0.069    -1.02     -0.137    -1.82
  OLD                   -0.136    -2.71     -0.150    -3.17   -0.099    -1.97     -0.108    -2.12
  SERVICE                0.582     7.05      0.676     8.49    0.495     5.92      0.495     5.95
  SHOPPING              -0.266    -5.73     -0.206    -4.73   -0.260    -5.57     -0.267    -5.73
  AMPEAK                 0.285     5.28      0.258     5.10    0.286     5.27      0.285     5.21
  PMPEAK                -0.315    -4.24     -0.258    -3.58   -0.326    -4.36     -0.325    -4.36
  AUTO                      --       --         --       --    0.363     8.03      0.227     2.60
ρ (error correlation)    0.000       --     -0.622    -7.65    0.000       --      0.111     1.81
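
As a rough illustration of what the COMPLEX coefficient in Table 4.6 implies, the sketch below converts the joint-estimation coefficients of the auto mode choice equation (Complex → Auto structure) into probit choice probabilities. The traveler profile is hypothetical (a licensed driver in a one-car, non-rural household with no transit subscription), the function names are illustrative, and the calculation treats tour complexity as given, so it ignores the error correlation. It is not the estimation code used in this study.

```python
import math

# Joint-estimation coefficients of the auto mode choice equation, Table 4.6
# (causal structure: tour complexity -> auto mode choice).
COEF = {
    "Constant": -2.100, "CAR_0": -1.185, "AUTOLIC": 1.933, "O_SUB": -0.391,
    "H_SUB": -0.178, "CAR_GE2": 0.201, "RURAL": 0.121, "COMPLEX": 1.409,
}


def std_normal_cdf(x):
    """Standard normal CDF, written with erfc so no third-party package is needed."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def auto_probability(traveler):
    """Probit probability of choosing auto; dummies missing from `traveler` are 0."""
    index = COEF["Constant"] + sum(COEF[name] * value for name, value in traveler.items())
    return std_normal_cdf(index)


# Hypothetical profile (an assumption for illustration, not a sample record).
profile = {"AUTOLIC": 1}

p_simple = auto_probability({**profile, "COMPLEX": 0})
p_complex = auto_probability({**profile, "COMPLEX": 1})
print(f"P(auto | simple tour)  = {p_simple:.3f}")
print(f"P(auto | complex tour) = {p_complex:.3f}")
```

For this profile the auto probability roughly doubles when the tour is complex, which is one way to read the size of the 1.409 coefficient. Other profiles will give different gaps because the probit link is nonlinear.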

It is rather interesting that tours undertaken in the AM peak show a greater propensity to involve multiple stops than those undertaken in the PM peak period. However, in the context of non-work tours, this may be a plausible result in that people combine a series of errands and school activities in the morning and complete their activities by mid-day. Another possible explanation is that time constraints towards the end of the day (PM period) limit the number of activities that an individual can pursue at that time. Another interesting finding is that gender does not significantly influence tour complexity in the case of non-work tours. Other studies have suggested that females tend to make more complex trip chains than males (McGuckin and Murakami, 1999). The analysis in this section does not support that finding in the Swiss travel context.

The tour's primary purpose appears to affect tour complexity. While service (serve passenger) tours tend to be complex in nature, shopping tours do not. Thus it appears that the shopping activity may be more prone to being a stand-alone activity within a tour.

The error correlation is found to be statistically significant, which supports the assumption that non-work tour complexity and mode choice should be modeled in a simultaneous equations framework. The negative sign associated with the error correlation indicates that the unobserved factors influencing these two variables are negatively correlated. It is not straightforward to interpret the negative sign of the error correlation, since the unobserved variables associated with complex tour choice and auto mode choice would be expected to be positively correlated. For example, the unobserved personal preference to be more efficient may stimulate more auto mode selection as well as more multi-stop tours. Indeed, error correlations were found to be positive in the preliminary analysis in

which bivariate probit models were estimated without endogenous dummy variables. The inclusion of the endogenous dummy variable, which is likely to be positively correlated with unobserved variables, may be contributing to the negative error correlation. The negative error correlation may also be due to the exclusion of unobserved factors from the model, which often happens when data are analyzed at a higher aggregation level. For example, no distinction is made between drive alone and drive/ride with others, both of which entail the use of the auto mode. As a result, person and household correlations are absorbed in the unobserved part of the models, which, in turn, leads to negative correlations among the error terms used in the model formulations. Further analysis is warranted to fully understand the source and interpretation of the negative error correlations.

The right two blocks of Table 4.6 provide estimation results for the causal structure where mode choice affects tour complexity for non-work tours, using the recursive and joint estimation methods. It is found that mode choice significantly affects tour complexity and that the choice of auto is positively associated with the formation of complex tours. There is no substantial difference between the coefficients of the endogenous dummy variable AUTO (0.363 vs. 0.227) from the two estimation methods, since the error correlation is estimated to be as low as 0.111. Thus it appears from this model that the choice of the automobile mode for a tour contributes positively to the formation of multi-stop trip chains. In addition, the error correlation is positive for this model structure, consistent with expectation. All of the other indications provided by the model system are similar to those seen in the left blocks.
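
For reference, the two estimation methods compared in this section can be summarized with the generic form of a recursive bivariate probit. The notation below is a sketch assumed for this summary (the symbols $y_1$, $y_2$, $\gamma$ and $\rho$ are not taken from the model formulation chapter): $y_1$ is the choice treated as the cause and $y_2$ the choice it affects, so that $y_1$ is COMPLEX and $y_2$ is AUTO in the left blocks of Table 4.6, and the roles are reversed in the right blocks.

```latex
\begin{aligned}
y_1^{*} &= \beta_1' x_1 + \varepsilon_1, & y_1 &= \mathbf{1}[\,y_1^{*} > 0\,],\\
y_2^{*} &= \beta_2' x_2 + \gamma\, y_1 + \varepsilon_2, & y_2 &= \mathbf{1}[\,y_2^{*} > 0\,],\\
(\varepsilon_1, \varepsilon_2) &\sim N_2\!\left(
  \begin{bmatrix} 0 \\ 0 \end{bmatrix},
  \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}\right)
\end{aligned}
```

Under this reading, recursive estimation corresponds to fixing $\rho = 0$ (the 0.000 entries in the error correlation row), which amounts to two separate probit models, while joint estimation treats $\rho$ as a free parameter. When $\rho$ is negative, as in the left blocks, a recursive estimate of $\gamma$ would be expected to absorb part of the error correlation and be pulled toward zero, which is in line with the smaller COMPLEX coefficient obtained from recursive estimation (0.456 vs. 1.409).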

As all of the estimation results in Table 4.6 offer plausible and similar interpretations, a more rigorous performance comparison must be conducted among the models to potentially identify the causal structure underlying the data set. This performance comparison is presented in Section 4.1.4 following the discussion of the estimation results for the work tour models.

4.1.3.2 Estimation Results for Work Tours

Table 4.7 offers the definition and description of the variables used in the work tour models. Estimation results for the work tour models are provided in Table 4.8. As in Table 4.6, the four blocks in Table 4.8 represent two different causal structures, each estimated with the recursive and joint estimation methods. In the causal structure where tour complexity affects mode choice, it is found that tour complexity has a positive impact on auto mode choice, regardless of whether the recursive or joint estimation method is used, as indicated by the coefficients of COMPLEX (0.486 vs. 0.915). The joint bivariate probit model estimates a negative error correlation (-0.293), and the COMPLEX coefficient in that model is more positive than in the recursive model. This finding is consistent with expectations, trends in the data, and the models of non-work tours. The model supports the notion that a complex tour or trip chaining pattern contributes to the choice of auto as the mode for the tour. In addition, the error correlation is statistically significant, once again supporting the simultaneous equations formulation of the relationship between tour complexity and mode choice. Similar to the non-work tour model estimation results, auto ownership and the possession of a driver license contribute positively to auto mode selection, whereas seasonal transit ticket subscription contributes negatively to auto mode choice. With respect to work-related variables, it is found that the availability of

free parking at the work place and longer commutes are both positively associated with the choice of auto for work tours.

Table 4.7 Work-tour Model Variable Description and Statistics (N = 1711)

Variable    Variable Description                                 Mean    Std. Dev.
HIGH_INC    Monthly household income > Fr10000                   0.21    0.41
OWN_BUS     Person owns enterprise/business                      0.14    0.35
SWISS       Person is of Swiss nationality                       0.85    0.36
BEG6_8      Tour starts in time period from 6:00 to 8:59         0.67    0.47
BEG13_14    Tour starts in time period from 13:00 to 14:59       0.11    0.31
END12       Tour ends in time period from 12:00 to 12:59         0.12    0.32
AUTO        Tour uses auto                                       0.54    0.50
CAR_0       Number of autos in household = 0                     0.16    0.36
CAR_GE2     Number of autos in household ≥ 2                     0.32    0.47
AUTOLIC     Person has auto license                              0.88    0.33
FREEPARK    Reserved parking lot at the work place is free       0.33    0.47
H_SUB       Person subscribes half-price seasonal ticket         0.47    0.50
O_SUB       Person subscribes other type of seasonal ticket      0.25    0.43
DIS_WORK    Distance between residence and work place (km)      11.02   15.02
COMPLEX     Tour is complex (multi-stop)                         0.51    0.50

All of these findings are consistent with expectations. In the tour complexity model, it is found that persons of higher income are prone to making complex work tours. In addition, individuals owning their own business enterprise are more likely to engage in multi-stop trip chains. It is possible that these individuals have occupational characteristics that lead to the formation of complex trip chains. Individuals of Swiss nationality are more likely to engage in complex work tours, possibly because they have a denser network of social contacts and a larger set of activity options. Another interesting finding is that time-of-day indicators play an important role in influencing tour complexity. Tours ending within the lunch hour are less prone to be complex, possibly due to time constraints and the presence of a single lunch stop/destination. However, those beginning in the morning period of 6 to 9 AM are more prone to being multi-stop trip chains, possibly due to the linking of a non-work activity with the work

activity in the overall tour. A more detailed time-of-day based analysis of trip chain formation is warranted to fully understand the relationship between trip chain complexity and time-of-day choice behavior. Within the context of this study, time-of-day choice is assumed exogenous to the model system. However, one may argue that time-of-day choice is endogenous to trip chain complexity and mode choice. The study of the simultaneous causal relationships among trip chain formation, mode choice, and time-of-day choice (three endogenous entities) remains a future research effort. The causal relationship between mode choice and time-of-day choice alone is analyzed with a simultaneous equations model in Section 4.3 of this dissertation, using a mixed binary-multinomial choice model.

The right two blocks of Table 4.8 give estimation results for the model where the choice of mode affects work tour complexity, using both the recursive and joint estimation methods. In the recursive model, the coefficient indicating auto mode choice is positive and significant in the tour complexity equation. However, in the joint model, the coefficient associated with the auto mode choice variable is not statistically significant in the tour complexity equation, while the error correlation is positive and statistically significant. This result does not support the hypothesis that auto mode choice positively affects the formation of a complex work tour. The model supports the notion that these choices should be modeled in a simultaneous equations framework, because the recursive model gives the misleading inference that auto mode choice positively affects the formation of complex work tours.
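
The contrast between the recursive and joint estimates of the AUTO coefficient can be checked with a quick significance calculation. The sketch below converts the t-statistics reported for AUTO in Table 4.8 (4.36 under recursive estimation, 0.60 under joint estimation) into two-sided p-values, assuming the usual large-sample normal approximation. The helper name is illustrative and is not part of the original analysis.

```python
import math


def two_sided_p_value(t_stat):
    """Two-sided p-value under a standard normal approximation: P(|Z| > |t|)."""
    return math.erfc(abs(t_stat) / math.sqrt(2.0))


# t-statistics of the AUTO coefficient in the tour complexity equation (Table 4.8).
p_recursive = two_sided_p_value(4.36)
p_joint = two_sided_p_value(0.60)

print(f"recursive estimation: p = {p_recursive:.6f}")
print(f"joint estimation:     p = {p_joint:.3f}")
```

Under this approximation the recursive estimate is significant at any conventional level, while the joint estimate is far from significant, which is the reversal discussed above.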

4.1.4 Model Performance Comparisons Based on Non-nested Test

The model estimation results presented in Section 4.1.3 generally offer plausible statistical indications for alternative causal paradigms. The only model that may be rejected on qualitative grounds is the work-tour model where the mode choice decision precedes the tour complexity decision. The statistically insignificant coefficient associated with the endogenous auto mode choice variable in the tour complexity model implies that the choice of the auto mode does not significantly influence the complexity of work tours. Although this is possible, it is not consistent with the trends noted in the descriptive cross-tabulations or with any of the other models, where the endogenous dummy variables have been consistently statistically significant. Given the preponderance of evidence to the contrary, it is difficult to explain and defend this statistically insignificant coefficient. For all other models, however, the statistical indications are plausible. This section presents a rigorous comparison across models to see if it is possible to identify the most likely causal structure governing the relationship between mode choice and trip chaining.

The non-nested test, mentioned in Section 2.3.2, is adopted to compare the models under alternative causal structures. For the non-work tour models using the joint estimation method, the difference in adjusted likelihood ratio indices between the two alternative causal structures is 0.0023. According to Equation (2.5.12), the calculated bounding probability on the right hand side of the expression for the comparison between the two causal structures is almost zero.
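
The non-work comparison above can be reproduced approximately from the log-likelihood values reported in Table 4.9. The sketch below is illustrative only. Equation (2.5.12) is not reproduced in this section, so the code assumes it has the commonly used form of the bound for comparing adjusted likelihood ratio indices of non-nested models, Φ(-sqrt(-2 z LL(0) + (K2 - K1))), where z is the difference in adjusted indices and K1, K2 are the numbers of parameters. The function names are hypothetical.

```python
import math


def adjusted_rho_squared(ll_model, ll_zero, n_params):
    """Adjusted likelihood ratio index: 1 - (LL - K) / LL(0)."""
    return 1.0 - (ll_model - n_params) / ll_zero


def bounding_probability(z, ll_zero, k_diff=0):
    """Assumed form of the bound: Phi(-sqrt(-2 * z * LL(0) + (K2 - K1)))."""
    root = math.sqrt(-2.0 * z * ll_zero + k_diff)
    return 0.5 * math.erfc(root / math.sqrt(2.0))  # equals Phi(-root)


# Non-work tour models under joint estimation (values from Table 4.9).
LL_ZERO = -6794.229
adj_complex_to_auto = adjusted_rho_squared(-4573.906, LL_ZERO, 17)
adj_auto_to_complex = adjusted_rho_squared(-4589.533, LL_ZERO, 17)

z = adj_complex_to_auto - adj_auto_to_complex
bound = bounding_probability(z, LL_ZERO, k_diff=0)

print(f"difference in adjusted indices: {z:.4f}")
print(f"bounding probability: {bound:.2e}")
```

With equal numbers of parameters in the two joint models, the bound depends only on z and LL(0). Because LL(0) is large in magnitude for a sample of 4901 tours, even a difference of 0.0023 yields a bounding probability that is effectively zero.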

Table 4.8 Work-tour Model

Causal Structure                  Complex → Auto                          Auto → Complex
                         Recursive Est.      Joint Est.        Recursive Est.      Joint Est.
Variable                 Coef.    t-test     Coef.    t-test   Coef.    t-test     Coef.    t-test
Auto Mode Choice Model
  Constant              -2.010    -7.40     -2.149    -8.32   -1.804    -6.80     -1.768    -7.06
  CAR_0                 -1.287    -7.17     -1.247    -6.77   -1.254    -7.07     -1.259    -7.06
  CAR_GE2                0.459     5.35      0.430     4.89    0.471     5.55      0.467     5.58
  AUTOLIC                2.067     7.77      1.996     7.83    2.052     7.81      2.040     8.24
  FREEPARK               0.812     9.32      0.797     9.00    0.803     9.36      0.790     9.32
  O_SUB                 -1.465   -13.00     -1.440   -12.41   -1.400   -12.65     -1.413   -13.24
  H_SUB                 -0.422    -5.30     -0.421    -5.30   -0.382    -4.89     -0.397    -5.05
  DIS_WORK               0.017     4.99      0.015     5.00    0.018     5.48      0.017     6.06
  COMPLEX                0.486     6.06      0.915     3.42       --       --         --       --
Complex Tour Choice Model
  Constant              -0.420    -4.31     -0.414    -4.24   -0.553    -5.40     -0.439    -4.06
  HIGH_INC               0.292     3.77      0.295     3.80    0.262     3.36      0.277     3.54
  OWN_BUS                0.282     3.09      0.304     3.30    0.249     2.70      0.244     2.64
  SWISS                  0.315     3.57      0.299     3.38    0.316     3.58      0.321     3.62
  BEG6_8                 0.324     4.25      0.327     4.39    0.320     4.18      0.310     4.11
  BEG13_14              -0.413    -3.55     -0.389    -3.39   -0.430    -3.67     -0.422    -3.59
  END12                 -0.753    -7.49     -0.760    -7.64   -0.751    -7.43     -0.725    -7.24
  AUTO                      --       --         --       --    0.275     4.36      0.053     0.60
ρ (error correlation)    0.000       --     -0.293    -1.63    0.000       --      0.246     3.49

Even the corresponding recursive models under the two causal structures offer consistent results: the difference in adjusted likelihood ratio indices is 0.0009 and the bounding probability is almost zero. Thus, it may be concluded that the model in the left block more closely captures the causal structure underlying the relationship between mode choice and tour complexity. The significantly better goodness-of-fit of the model in the left block suggests that the causal structure where the complexity of the tour affects mode choice (tour complexity → auto mode choice) is statistically, and possibly behaviorally, dominant in the population for non-work tours.

For the work tour models, the situation is very similar. In comparing the joint models, the seemingly better model (tour complexity → auto mode choice) in Table 4.9 has an adjusted likelihood ratio index that is 0.0018 greater than that of the model under the other causal structure (auto mode choice → tour complexity). The bounding probability, as per the right hand side of Equation (2.5.12), is calculated to be almost zero. Even for the recursive models, the non-nested test rejects the causal structure in which auto mode choice affects tour complexity, with a difference of 0.0039 and a negligible bounding probability. Also, as mentioned earlier, the statistically insignificant coefficient associated with the endogenous dummy variable appears to suggest that the causal structure where auto mode choice drives complex work tour formation is not capturing the trends in the data set. Once again, it may be concluded that the causal structure where the complexity of the tour affects mode choice (tour complexity → auto mode choice) is statistically, and possibly behaviorally, dominant in the population for work tours.

From the viewpoint of activity-based travel behavior theory, where travel choices are considered to be derived from activity patterns (and activity needs that are distributed

in time and space), one may consider the findings of this section to be quite consistent with expectations. For both non-work tours and work tours, the statistical model estimation results show that tour complexity (which is reflective of the activity pattern) drives mode choice. This finding is also consistent with and confirms previous results regarding the nature of the relationship between trip chaining and mode choice reported by Hensher and Reyes (2000).

Table 4.9 Comparisons of Goodness-of-fit of Recursive Bivariate Probit Models

                                   Recursive Estimation                  Joint Estimation
                              Complex → Auto   Auto → Complex     Complex → Auto   Auto → Complex
Non-work Tour Model
  Sample size                                               4901
  LL at zero: LL(0)                                    -6794.229
  LL at constant: LL(c)                                -5719.416
  # of Parameters                    16               16                 17               17
  LL at convergence           -4585.147        -4591.132          -4573.906        -4589.533
  ρ² at zero                     0.3251           0.3243             0.3268           0.3245
  Adj. ρ² at zero                0.3228           0.3219             0.3243           0.3220
  ρ² at constant                 0.1983           0.1973             0.2003           0.1976
  Adj. ρ² at constant            0.1955           0.1945             0.1973           0.1946
  Non-nested Test (Prob.)            0.0009 (0.000)                      0.0023 (0.000)
Work Tour Model
  Sample size                                               1711
  LL at zero: LL(0)                                    -2371.950
  LL at constant: LL(c)                                -2354.272
  LL at convergence           -1780.601        -1789.797          -1779.440        -1783.900
  # of Parameters                    16               16                 17               17
  ρ² at zero                     0.2493           0.2454             0.2498           0.2479
  Adj. ρ² at zero                0.2426           0.2387             0.2426           0.2408
  ρ² at constant                 0.2437           0.2398             0.2442           0.2423
  Adj. ρ² at constant            0.2369           0.2330             0.2369           0.2351
  Non-nested Test (Prob.)            0.0039 (0.000)                      0.0018 (0.000)

4.1.5 Discussions and Conclusions

Mode choice behavior is a fundamental element of travel behavior that has significant implications for transportation planning. Estimates of public transit ridership and the use of alternative modes of transportation are largely based on studies of mode choice behavior and modal split models. Public transport agencies face increasing

competition from the automobile as automobiles become increasingly affordable and the road infrastructure becomes increasingly ubiquitous. Undoubtedly, the automobile is considered to provide greater flexibility and convenience when compared with public transport modes, which are generally constrained with respect to schedules and routes/destinations.

This study examines the inter-relationship between the complexity of people's activity-travel patterns and their mode choice. In order to conduct the analysis, this section examines mode choice behavior in the context of multi-stop (complex) vs. single-stop (simple) trip chains. Using the recursive bivariate probit model, this section presents a rigorous analysis of the most likely causal relationship between these two phenomena at the level of the individual trip chain or tour. It should be emphasized that the analysis in this section does not attempt to replicate causality at the level of the individual traveler, but rather at the macroscopic level, to identify the causal tendency that appears to be dominant in the population.

This section estimates recursive bivariate probit models that provide a rigorous analytical framework for analyzing and testing alternative causal structures. For both non-work tours (i.e., tours that do not involve any work stops) and work tours (i.e., tours that involve at least one work stop), the analysis suggests that the causal structure where the complexity of the trip chaining pattern drives mode choice is the dominant behavioral trend in the population.

These findings have important implications for public transport service providers who are interested in attracting choice riders. If mode choice decisions precede activity pattern/agenda decisions, then it may be possible for public transport service providers to

simply attract choice riders by improving amenities, schedule, route coverage, safety and security, and comfort. On the other hand, if the formation of the activity agenda precedes or drives mode choice decisions, then the public transport industry has a greater challenge before it. Trip chaining and tour complexity serve as impediments to public transport usage, as it is generally more burdensome to undertake multi-stop tours using public transportation, where travelers are constrained by routes, schedules, and access/egress issues. The analysis in this section suggests that the dominant relationship in the data set is the one in which tour complexity drives mode choice, both for work and non-work tours. In that case, not only do public transport service providers have to improve service amenities, but they also have to cater to a multi-stop oriented complex activity agenda. This is extremely difficult to do with a fixed route, fixed schedule system.

As activity-travel patterns and tours become increasingly complex, it is likely that public transport agencies will have to develop new types of services to try to retain existing riders in addition to attracting new riders. Fixed route bus and rail services may continue to be useful in serving longer line-haul portions of multi-stop tours. However, serving shorter multi-stop trips calls for the provision of more flexible circulator and paratransit-type services that may involve the use of buses and vans smaller than conventional vehicles. Also, attention needs to be paid to land use developments around transit stops/stations. Concerted efforts need to be made to promote mixed use land developments and multi-purpose activity centers so that travelers are able to fulfill a variety of activity needs at a single location (without the need for undertaking additional trips).

The analysis and findings of this section are also useful and important in the specification and development of activity-based and tour-based models. Most activity-based and tour-based travel demand model systems consist of hierarchical structures involving, at a minimum, activity agenda or tour formation, mode choice, destination choice, and time-of-day choice. Although many of the model systems utilize simultaneous equations systems to represent joint choice processes and recognize endogeneity, there is invariably a causal hierarchy that is implied in the specification of the model system. Knowledge about the nature of the relationships among key choice dimensions can aid in the specification of activity-based model structures that reflect the dominant behavioral trends in the population. For example, consider the findings of this section, in which it is found that the activity agenda or tour formation drives mode choice for both non-work and work tours. Clearly, this finding suggests that activity-based models should be formulated such that individual activity agendas and tours are formed first and then mode choice is determined based on the nature of the activity agenda or tour complexity. Such a model would more accurately reflect behavioral changes that might result from a system change, say, the improvement of transit service in a corridor or region. If, for example, one developed an activity-based model system assuming a different causal structure, i.e., one in which mode choice precedes and drives tour formation, then the model would be prone to erroneously over-estimating the potential benefits or impacts of the transit service improvement and might alter the nature of the individual tour patterns in response to the mode shift. According to the results obtained in this section, the dominant relationship is one in which people make decisions regarding their activity agendas or tour complexity first and this decision drives the mode choice decision.

Many individuals with complex tour patterns will not be able to shift modes in response to improvements in transit service and thus, in reality, the impacts of the improved transit service may be substantially lower than those that would be obtained had the reverse causal structure, where mode choice drives tour complexity, been assumed.

Mode choice can be expanded to consider multinomial modes including SOV, shared ride, public transit, and non-motorized modes. Similarly, tour complexity can be expanded to consider different levels of tour complexity or different tour types such as those presented in Strathman and Dueker (1995). Another consideration that merits further investigation is the extent to which findings such as those presented here are sensitive to model specification. It is possible that statistical indicators of model performance will change depending on the model specification chosen. One of the limitations of this study is that detailed level-of-service and price variables were not available for all trips, as many trips had either an origin or a destination outside the Zurich region. While level-of-service variables are available for trips with known origins and destinations within the Zurich region, they are not available when one of the trip ends is outside the region. This problem is exacerbated when one is conducting analyses and modeling efforts at the tour level. Limiting the analysis to the subset of trips with known origins and destinations within the Zurich region would have resulted in a very restrictive sample of tours. It is unclear whether the inclusion of such variables would significantly alter the findings reported in this section, and therefore the sensitivity of findings to model specification merits further investigation.

4.2 Causal Models Between Activity Timing and Activity Duration (Mixed Discrete-continuous Model and Lee Model)

4.2.1 Background

Activity-based approaches to travel demand analysis explicitly recognize the important role played by time in shaping activity and travel patterns (Axhausen and Garling 1992). One of the key advantages of the activity-based approach is that it is capable of explicitly incorporating the time dimension into the travel modeling process (Pas and Harvey 1997). In the new planning context, where travel demand management (TDM) strategies and transportation control measures (TCM) are inherently linked to the time dimension, activity-based approaches that recognize the time dimension offer a stronger behavioral framework for conducting policy analyses and impact studies (Bhat and Koppelman 1999; Harvey and Taylor 2000; Kitamura et al. 1996; Pendyala et al. 1997, 1998; Yamamoto and Kitamura 1999).

Telecommuting is a good example for illustrating the importance of the time dimension. The commute trip to and from the work place is not made when a worker telecommutes, so he or she has additional time available for pursuing other activities. The elimination of the commute trip influences the duration of travel and/or activity engagement. Besides influencing duration, telecommuting may influence the timing of activity engagement. A worker who used to pursue non-work activities on the way to or from work may now choose to engage in non-work activities at different times of day. Without the commute trip, the worker no longer has the need or opportunity to pursue non-work activities in combination with the commute.

Analyzing these temporal changes in activity engagement patterns is important for accurately assessing the impacts of telecommuting on travel demand.

As illustrated by the telecommuting example, there are two key aspects of the temporal dimension that play an important role in activity-travel demand modeling (Goulias 1997). They are the timing of an activity episode and the duration (time allocation) of an activity episode (Mahmassani and Chang 1985; Mahmassani and Stephan 1988; Abkowitz 1981). In other words, activity-based analysis allows one to answer two critical questions: When is an activity pursued? For how long is the activity pursued? In recent years, activity-based research has focused on the analysis of individual activity episodes so that both of these aspects may be studied in detail (Bhat 1996, 1998; Bhat and Misra 1999; Bhat and Singh 2000). Studies that focused on daily time allocations to various activity types were not able to address the time-of-day choice in activity engagement (Kasturirangan et al. 2002). Thus, conducting activity-based analysis at the individual activity episode level is crucial to gaining an understanding of the relationships between activity timing and duration (Hamed and Mannering 1993; Hunt and Patterson 1996; Levinson and Kumar 1995; Steed and Bhat 2000).

The causal relationship between activity timing and duration is an important component of activity-based travel demand modeling systems that aim to explicitly capture the temporal dimension (Kitamura et al. 2000; Mannering et al. 1994; Pendyala et al. 2002; Wang 1996; Wen and Koppelman 2000). On the one hand, one may hypothesize that the timing of an activity affects its duration. Perhaps activity episodes

PAGE 124

pursued during peak periods are of short duration while those pursued in off-peak periods are longer in duration. On the other hand, the duration of an activity may affect its timing. Perhaps activities of longer duration are scheduled during the off-peak periods while activities of shorter duration are scheduled during peak periods. This section attempts to shed light on this relationship by exploring both causal structures in a simultaneous equations framework. By identifying the causal structure that is most appropriate in different circumstances, one may be able to design activity-based model systems that accurately capture the relationship between activity timing and duration.

4.2.2 Data Preparation and Description for Model Analysis

The data set is derived from the Swiss Travel Microcensus 2000, which was introduced in Chapter 3. The trip file was used to create an out-of-home activity file where individual activity records were created from the trip records. This activity file included information about activity type, activity timing, activity duration, and other variables pertinent to each activity episode. This section focuses on the relationship between activity timing and duration for maintenance activities. Maintenance activities included the following two activity (trip) types: shopping and service (passenger or child). These activity records were extracted from the original file to create two maintenance activity record files, one for commuters and one for non-commuters. Commuters were defined as individuals who commuted to a work place on the travel diary day, while non-commuters were defined as those who did not commute to a work place (made zero work trips) on the travel diary day. Note that a worker (employed person) who did not
commute on the travel diary day would still be classified as a non-commuter for the purpose of this study.

Maintenance activities were pursued by 10833 individuals residing in 10554 households. Of these individuals, 2617 were commuters and they reported 3394 maintenance activities. The remaining 8216 individuals were non-commuters and they reported 11293 maintenance activities. The commuter and non-commuter maintenance activity episode data sets included complete socio-economic and activity information for the respective samples.

Table 4.10 provides a summary of the household characteristics of these two samples in comparison with those of the whole Swiss sample. The average household size for the non-commuter and commuter household samples is 2.44 and 2.51 persons, respectively, which is close to the average household size of 2.43 in the whole Swiss sample. Households in the commuter sample report higher income levels than households in the non-commuter sample, presumably because commuter households consistently include workers earning wages. Similarly, households in the commuter sample report higher car ownership levels than households in the non-commuter sample because commuter households are more likely to own cars. The distributions of the other characteristics are rather consistent across household samples.

Table 4.11 compares the person characteristics of the samples with those of the whole Swiss sample. The major differences between commuters and non-commuters are consistent with expectations. Commuters are predominantly in the age group of 30-59 years, while 37.8% of non-commuters are 60 years of age or older. 67.4% of commuters are employed full time while only 21.0% of non-commuters are employed
full time. 88.7% of commuters hold a driver license, compared with 67.6% of non-commuters. Finally, commuters make 1.62 work trips per day and 4.23 non-work trips per day; the latter is almost equal to the non-work trip frequency of non-commuters (4.30 trips per day).

Table 4.10 Household Characteristics of Swiss Travel Microcensus 2000 and Sample for Model of Maintenance Activity Duration and Time-of-day Choice

Characteristic                       Swiss Sample   Non-commuters Sample   Commuters Sample
Sample Size                                 27918                   7957               2597
Household Size                               2.43                   2.44               2.51
  1 person                                  27.5%                  29.4%              28.7%
  2 persons                                 35.1%                  33.1%              29.4%
  3 persons                                 14.0%                  11.5%              13.1%
  ≥ 4 persons                               23.4%                  26.0%              28.8%
Monthly Income
  Low (Fr 8K)                               18.4%                  16.8%              29.8%
  Missing                                   24.9%                  18.6%              13.7%
Vehicle Ownership                            1.17                   1.10               1.29
  0 auto                                    19.8%                  22.6%              14.1%
  1 auto                                    50.5%                  50.7%              51.1%
  2 autos                                   24.5%                  22.2%              28.5%
  ≥ 3 autos                                  5.2%                   4.5%               6.4%
Family Type
  Single                                    27.2%                  29.2%              28.3%
  Partner (unmarried and no child)          27.9%                  27.0%              23.3%
  Married                                   43.6%                  42.9%              46.7%
  Other                                      1.3%                   0.9%               1.8%
Presence of Children
  Child < 6 years old                       10.6%                  11.1%              11.5%
  Child 6~17 years old                      22.5%                  23.7%              24.8%
Household Location
  Major city                                42.4%                  43.5%              42.7%
  Surrounding areas of city                 30.4%                  30.1%              31.8%
  Isolated city                              1.1%                   1.3%               1.3%
  Rural                                     26.1%                  25.1%              24.2%

Prior to commencing the model development effort, a descriptive analysis of the potential relationship between activity duration and timing was undertaken. The results are presented in Table 4.12 and Table 4.13.

Table 4.11 Person Characteristics of Swiss Travel Microcensus 2000 and Sample for Model of Activity Duration and Time-of-day Choice

Characteristic          Swiss Sample   Non-commuters Sample   Commuters Sample
Sample Size                    29407                   8216               2617
Age (in years)           43.9 (Mean)            49.6 (Mean)        41.0 (Mean)
  Young (6~29)                 26.8%                  18.9%              19.3%
  Middle (30~59)               47.6%                  43.3%              75.7%
  Old (≥ 60)                   25.5%                  37.8%               5.1%
Sex
  Male                         46.3%                  36.1%              47.0%
  Female                       53.7%                  63.9%              53.0%
Employment Status
  Full time                    37.3%                  21.0%              67.4%
  Part time                    14.3%                  14.9%              30.0%
  Not employed                 48.4%                  64.1%               2.6%
Licensed                       67.4%                  67.6%              88.7%
#Trips/day                      3.51                   4.30               5.85
  Work trips                    0.46                   0.00               1.62
  Non-work trips                3.05                   4.30               4.23

Based on the time-of-day distribution of all trips in the data set, four distinct time periods in which an activity begins are identified. They are:

AM peak:  6:00 AM - 8:59 AM
Midday:   9:00 AM - 3:59 PM
PM peak:  4:00 PM - 6:59 PM
Off peak: 7:00 PM - 5:59 AM
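The four time periods listed above can be expressed as a small classification rule. The following is only an illustrative sketch: the function name `time_of_day_period` and the use of Python's `datetime.time` are assumptions made for the example and are not part of the original data processing; the period labels follow the variable names used later in the text.

```python
from datetime import time


def time_of_day_period(start: time) -> str:
    """Map an activity start time to one of the four periods defined in the text.

    Hypothetical helper for illustration; boundaries follow the list above.
    """
    minutes = start.hour * 60 + start.minute  # minutes since midnight
    if 6 * 60 <= minutes < 9 * 60:       # 6:00 AM - 8:59 AM
        return "AMPEAK"
    if 9 * 60 <= minutes < 16 * 60:      # 9:00 AM - 3:59 PM
        return "MIDDAY"
    if 16 * 60 <= minutes < 19 * 60:     # 4:00 PM - 6:59 PM
        return "PMPEAK"
    return "OFFPEAK"                     # 7:00 PM - 5:59 AM (wraps past midnight)
```

Because the off-peak period wraps around midnight, it is treated as the default case rather than as an explicit interval.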

Table 4.12 and Table 4.13 compare the mean and standard deviation of activity duration across time-of-day periods within the non-commuter sample and the commuter sample. To reduce the variance of the dependent variable in the linear regression model, Ln(1 + activity duration in minutes) is specified as the dependent variable in the joint model system, denoted LN_DUR. In terms of the mean value of LN_DUR, the four time-of-day choices of non-commuters can be ranked in the following sequence: MIDDAY > AMPEAK > PMPEAK > OFFPEAK. However, the corresponding sequence for commuters is: MIDDAY > PMPEAK > AMPEAK > OFFPEAK, presumably because commuters are less likely than non-commuters to pursue maintenance activities in the AM peak period due to the work schedule constraint. Generally speaking, commuters' maintenance activities are of shorter duration than those of non-commuters.

Table 4.12 Description of Endogenous Variables in Non-commuter Sample

Time-of-Day Choices     Mean of   Std. Dev. of   Mean of    Std. Dev. of       N
                        LN_DUR    LN_DUR         Duration   Duration
AMPEAK (6:00-8:59)        2.90        1.38         41.24        73.13       1142
PMPEAK (16:00-18:59)      2.68        1.40         37.22        80.59       1775
MIDDAY (9:00-15:59)       3.21        1.29         50.70        90.62       7911
OFFPEAK (19:00-5:59)      1.87        1.53         22.09        48.84        465
Total                     3.04        1.36         46.45        86.40      11293

Table 4.13 Description of Endogenous Variables in Commuter Sample

Time-of-Day Choices     Mean of   Std. Dev. of   Mean of    Std. Dev. of       N
                        LN_DUR    LN_DUR         Duration   Duration
AMPEAK (6:00-8:59)        1.95        1.33         17.38        46.52        359
PMPEAK (16:00-18:59)      2.77        1.19         31.58        66.47       1222
MIDDAY (9:00-15:59)       2.89        1.25         35.09        55.93       1467
OFFPEAK (19:00-5:59)      1.84        1.53         21.38        47.43        346
Total                     2.64        1.33         30.55        58.61       3394
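As a consistency check on Tables 4.12 and 4.13, the "Total" rows can be recovered as sample-size-weighted averages of the period-specific means. The sketch below only re-uses the figures printed in the two tables; the dictionary layout and the function name `pooled_means` are illustrative assumptions.

```python
# (mean of LN_DUR, mean duration in minutes, N) by period, copied from Table 4.12
NON_COMMUTER = {
    "AMPEAK": (2.90, 41.24, 1142),
    "PMPEAK": (2.68, 37.22, 1775),
    "MIDDAY": (3.21, 50.70, 7911),
    "OFFPEAK": (1.87, 22.09, 465),
}

# Same layout, copied from Table 4.13
COMMUTER = {
    "AMPEAK": (1.95, 17.38, 359),
    "PMPEAK": (2.77, 31.58, 1222),
    "MIDDAY": (2.89, 35.09, 1467),
    "OFFPEAK": (1.84, 21.38, 346),
}


def pooled_means(groups):
    """Return (total N, pooled mean LN_DUR, pooled mean duration)."""
    n_total = sum(n for _, _, n in groups.values())
    ln_dur = sum(m * n for m, _, n in groups.values()) / n_total
    duration = sum(d * n for _, d, n in groups.values()) / n_total
    return n_total, ln_dur, duration


nc_n, nc_ln, nc_dur = pooled_means(NON_COMMUTER)
c_n, c_ln, c_dur = pooled_means(COMMUTER)
```

The pooled values agree with the reported totals up to rounding of the published figures. Note that the mean of LN_DUR is not the logarithm of the mean duration, since the logarithm is a concave transformation.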

4.2.3 Model Estimation Results

4.2.3.1 Estimation Results for Non-commuters

Table 4.14 offers a description of the explanatory variables used in all the causal models for time-of-day choice and duration of maintenance activities. Among these variables, LN_DUR is an endogenous continuous variable and AMPEAK, PMPEAK and MIDDAY are three endogenous dummy variables indicating time-of-day choices.

Table 4.15 provides the estimation results of the non-commuter model under the causal structure where activity duration is predetermined and affects the time-of-day allocation of the activity. For comparison, the table is composed of four blocks. The first block offers estimation results from recursive estimation, i.e. a multinomial logit model for time-of-day choice among four time periods and a linear regression model for the logarithm of activity duration. The second block offers the estimation results of a recursive unidentified mixed logit model and linear regression model. Here, f2 in the PM peak choice model has the smallest absolute value among all the fi, thus f2 is fixed at zero and thereby g2 also needs to be fixed at zero. Then an identified mixed discrete-continuous model can be estimated, and the estimation results are shown in the third block. The final block shows the estimation results of the discrete-continuous model based on the Lee transformation. In this block, the gi/ri values represent the correlations between the transformed utility function for discrete choice and the error term in the linear regression model, while in the second and third blocks, gi/ri represents gi.

Except for the unidentified mixed model, all three identified models provide similar estimators for exogenous variables. Particularly, the estimators in Lee models are
almost identical to those in recursive models. That is because joint estimation merely improves the efficiency of estimators for exogenous variables. The absolute values of the coefficients in the mixed time-of-day choice model are generally somewhat greater than those in the recursive model and Lee model, possibly because of the additional heterogeneity terms in the latent utility functions.

All the coefficients of exogenous variables have reasonable behavioral interpretations. In all the models, AGE takes its greatest positive coefficient in the AM peak utility function, which indicates that older non-commuters are most likely to schedule maintenance activities in the AM peak period and least likely in the off-peak period. Older non-commuters may undertake more responsibility for taking children to school or shopping for groceries in the AM peak period. Non-commuters living with more household members tend to pursue maintenance activities in the AM peak period, as evidenced by the positive coefficient of HHSIZE, presumably because they have to undertake more responsibility for serving children in the AM peak period. The positive coefficient of LOW_INC indicates that low-income non-commuters prefer to engage in maintenance activities in the AM peak period, possibly because their travel is more transit-oriented or more dependent on non-motorized modes and thereby less sensitive to AM peak-period traffic congestion. Non-commuters without a household car are most likely to pursue maintenance activities in the midday period and least likely in the off-peak period, as evidenced by the greatest positive coefficient in the Midday utility function and smaller positive coefficients in the AM peak and PM peak utility functions. Dependency on transit might be a plausible explanation: public transit may be least congested in the midday period and unavailable or least secure in the off-peak period. The negative coefficients of PMALE in the PM peak utility and
Midday utility indicate that male non-commuters are less inclined than females to schedule maintenance activities in the PM peak and midday periods, possibly due to the fact that female non-commuters tend to stay at home in the morning and at night for household obligations. High-income non-commuters prefer to schedule maintenance activities in the PM peak period, as evidenced by the positive coefficient.

Age and the square of age appear significant in the log-linear regression model for maintenance activity duration. The negative coefficient on age and the positive coefficient on age squared imply a non-linear effect of age on activity duration. That is probably because middle-aged non-commuters are more sensitive to time expenditure and less willing to spend much time on maintenance activities than younger and older non-commuters. Relative to female non-commuters, male non-commuters allocate less time to daily maintenance activities. The negative coefficient of HHSIZE indicates that people living with more household members spend less time on maintenance activities than those living in small families, presumably because shopping obligations can be shared among more family members in a big household. High-income non-commuters are expected to spend less time on maintenance activities, as indicated by the negative coefficient, possibly due to their greater concern about their time budget. Car ownership appears significant in the recursive model but insignificant in the other three types of models; therefore it has been excluded from those models.

The coefficients for the endogenous variable are the most important outputs of the model estimation. In recursive models, LN_DUR, the logarithm of activity duration, appears positively significant in all three utility functions for time-of-day choice. The coefficient in the utility function for Midday choice is the greatest. The
estimation results in the Lee model are rather close to those in recursive models in spite of the accommodation of error correlations, possibly because the correlations r1 (0.088) and r2 (0.171) are rather small, albeit statistically significant.

In the unidentified mixed logit model, f2 appears smallest among all the fi. Thus f2 and g2 are fixed at zero in the identified mixed model, where different estimation results are found for the endogenous variable. LN_DUR does not appear significant in the utility function for AM peak choice. Table 4.19 presents all the simulation-based hypothesis test results for error covariance estimators in all the identified mixed models. In the current model, the significance level of the positive covariance f1g1 is 0.209, which is not high but is considerable for a mixed model; a very large sample size is required to obtain accurate estimators for the standard deviations of heterogeneities in a mixed logit model. The positive correlation is calculated as 0.231 according to Equation (2.4.5). The statistical result indicates that activity duration does not have a significant impact on the utility of AM peak choice. Without accommodating the positive correlation between AM peak utility and activity duration, the coefficient of LN_DUR in the AM peak utility function is overestimated as 0.495 in the recursive model. Corr(u3, a) and Corr(u4, a) are estimated as -0.057 and 0.004, which are almost negligible. That is the reason why the coefficients for the endogenous variable LN_DUR in the PM peak and Midday utility functions of the mixed model do not differ substantially from those in the recursive model. Finally, the current mixed model supports the hypothesis that non-commuters' maintenance activities of longer duration are more likely to be pursued in the PM peak and midday periods.
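To make the role of LN_DUR in the recursive time-of-day model concrete, the sketch below evaluates multinomial logit choice probabilities using the recursive coefficients reported in Table 4.15, with off-peak as the base alternative (utility zero). The person profile, the function name `choice_probabilities` and the chosen durations are hypothetical and serve only as an illustration; this is not the estimation code used in the study.

```python
import math

# Recursive multinomial logit coefficients from Table 4.15 (non-commuters,
# duration -> time-of-day). Off-peak is the base alternative with utility 0.
COEF = {
    "AMPEAK": {"const": -2.502, "AGE": 3.760, "HHSIZE": 0.129,
               "LOW_INC": 0.186, "CAR_0": 0.678, "LN_DUR": 0.495},
    "PMPEAK": {"const": -0.023, "AGE": 1.206, "PMALE": -0.234,
               "HIGH_INC": 0.146, "CAR_0": 0.565, "LN_DUR": 0.384},
    "MIDDAY": {"const": 0.003, "AGE": 2.474, "PMALE": -0.199,
               "CAR_0": 0.792, "LN_DUR": 0.665},
}


def choice_probabilities(person, duration_min):
    """Logit probabilities of the four periods for a given activity duration."""
    x = dict(person, const=1.0, LN_DUR=math.log(1.0 + duration_min))
    utility = {
        alt: sum(b * x.get(name, 0.0) for name, b in coefs.items())
        for alt, coefs in COEF.items()
    }
    utility["OFFPEAK"] = 0.0
    denom = sum(math.exp(u) for u in utility.values())
    return {alt: math.exp(u) / denom for alt, u in utility.items()}


# Hypothetical non-commuter: 45 years old (AGE is measured in 100 years),
# female, two-person household, medium income, household owns a car.
person = {"AGE": 0.45, "HHSIZE": 2, "PMALE": 0,
          "LOW_INC": 0, "HIGH_INC": 0, "CAR_0": 0}

short = choice_probabilities(person, 10)    # 10-minute activity
long_ = choice_probabilities(person, 120)   # 120-minute activity
```

Because LN_DUR has its largest coefficient in the Midday utility and a zero coefficient in the off-peak base, a longer duration raises the midday probability and lowers the off-peak probability under the recursive specification.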

In the mixed continuous model without endogenous variables, the standard deviations (σ) of the normal random disturbance estimated from the recursive model and the Lee model are rather consistent (1.357 vs. 1.348). In the identified mixed model, one may have

$$\sqrt{g_1^2 + g_3^2 + g_4^2 + \sigma^2} = \sqrt{0.364^2 + 0.240^2 + 0.284^2 + 1.254^2} = 1.358 \approx 1.357.$$

It can be seen that, in the mixed linear regression model, the random component has been divided into a linear combination of four parts: the first three parts are each correlated with one of the utility functions and the last part is an idiosyncratic random error which is uncorrelated with the utility functions. This result coincides with the a priori assumption for the mixed discrete-continuous modeling system.

Table 4.16 offers model estimation results of the non-commuter model under the causal structure in which time-of-day choices affect activity duration. All the exogenous variables take coefficients similar to those in Table 4.15. The coefficients for exogenous variables in the Lee model are fairly close to those in the recursive model. Except for the coefficient of the endogenous dummy variable MIDDAY, which is reduced substantially (1.308 vs. 1.044), there is no considerable change for AMPEAK and PMPEAK. This may be explained by the fact that r3 is estimated as -0.307. Note that ri represents the correlation between the transformed variable $v_{qi}^{*}$ and the random component of the continuous model, where

$$v_{qi}^{*} = \Phi^{-1}\left[ F_i\left( \max_{j=1,2,\ldots,I;\; j \neq i} u_{qj} - \varepsilon_{qi} \right) \right].$$

Here $\varepsilon_{qi}$ is not normally but Gumbel distributed, which is asymmetric. Thus ri is basically negatively related to the correlation between $\varepsilon_{qi}$ and the random component of the continuous model. Therefore the negative r3 value implies a positive correlation between $\varepsilon_{q3}$ and that random component. This explains why the positive coefficient for the third endogenous dummy variable MIDDAY is reduced in the Lee model, which, to some degree, accommodates the positive correlation between the third utility function and the continuous model.
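The total standard deviation of the random component in the identified mixed model can be checked numerically from the estimates in Tables 4.15 and 4.16. The sketch below assumes, as the decomposition described above suggests, that the heterogeneity terms and the idiosyncratic error are mutually independent, so that their variances add; the function name `total_error_sd` is hypothetical.

```python
import math


def total_error_sd(g_terms, sigma):
    """Standard deviation of a sum of independent error parts.

    g_terms: loadings g_i on the heterogeneity terms shared with the utilities
    sigma:   standard deviation of the idiosyncratic normal error
    """
    return math.sqrt(sum(g * g for g in g_terms) + sigma ** 2)


# Identified mixed model, Table 4.15 (g2 fixed at zero): g1, g3, g4 and sigma
sd_table_4_15 = total_error_sd([0.364, 0.240, -0.284], 1.254)

# Identified mixed model, Table 4.16 (g1 fixed at zero): g2, g3, g4 and sigma
sd_table_4_16 = total_error_sd([-0.876, 0.504, -0.305], 0.805)
```

Under this assumption the two values come out close to the recursive-model estimates of sigma (1.357 and 1.323), which is the comparison made in the text.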

In the unidentified mixed logit model, f1 takes the smallest absolute value among all the fi, thus f1 and g1 are fixed at zero. The coefficients for the endogenous dummy variables are close to those in the recursive model and Lee model. Corr(u2*, a) is calculated as -0.073, Corr(u3*, a) as -0.104 and Corr(u4*, a) as -0.091 according to Equation (2.4.5). These error correlations are too slight to substantially influence the coefficients of the endogenous dummy variables, though the simulation-based hypothesis test indicates that the covariances f3g3 and f4g4 have relatively high significance levels of 0.089 and 0.072.

In the continuous linear regression model, the standard deviations (σ) of the normal random disturbance estimated from the recursive model and the Lee model are rather consistent (1.323 vs. 1.346). In the identified mixed model, the standard deviation of the whole random component is 1.328, which is close to 1.323.

All three dummy variables indicating AM peak choice, PM peak choice and Midday choice appear positively significant in all types of models. These statistical results strongly support the hypothesis that time-of-day choices of maintenance activities affect activity duration for non-commuters. Except for the Lee model, both the recursive model and the mixed model take the greatest positive coefficient on MIDDAY, a smaller positive coefficient on AMPEAK and the smallest positive coefficient on PMPEAK. This result is consistent with the descriptive analysis in Table 4.12, where the ranking of time-of-day categories is MIDDAY > AMPEAK > PMPEAK > OFFPEAK in terms of average activity duration.

In summary, for non-commuters, AM peak, midday and PM peak choice of maintenance activities positively affects activity duration. In these time periods, non-commuters have sufficient time available for maintenance activities without institutional
constraints such as the closing time of shopping centers. On the other hand, maintenance activity duration positively affects midday choice. In other words, maintenance activities of longer duration tend to be scheduled in the midday period. Intuitively, non-commuters who intend to undertake longer maintenance activities probably prefer to start them in midday in order to have sufficient time and to avoid peak-period congestion and institutional constraints.

4.2.3.2 Estimation Results for Commuters

Table 4.17 offers model estimation results for the commuter model where activity duration affects time-of-day choice. The exogenous variables take almost identical coefficients in all the models. AGE takes a positive coefficient in all three utility functions, among which the one in the PM peak utility is the greatest. This indicates that older commuters prefer to schedule maintenance activities in the PM peak and tend not to schedule them in the off-peak period, presumably because older commuters are used to pursuing their maintenance activities on the commute from the work place back home. Compared with female commuters, male commuters are more likely to undertake maintenance activities in the off-peak period, as evidenced by the negative coefficients of PMALE in the utility functions for the other three time periods. That is probably because females are unwilling to leave home at night or in the early morning for security reasons. Commuters living alone are less likely to undertake maintenance activities in the AM peak and midday, as evidenced by the negative coefficients on HHSIZE1 in both utility functions. That is possibly because they do not have the obligation of taking children to school in the AM peak period. In addition, they do not have to undertake maintenance activities in midday without urgent household
obligations. Commuters with no cars in the household are more likely to undertake maintenance activities in the AM peak, PM peak and midday, as indicated by the positive coefficients of CAR_0. That is possibly because commuters with cars are more likely to pursue activities in the off-peak period, since their schedule is not constrained by transit availability in that time period. Low-income commuters are more likely to undertake maintenance activities in midday, as shown by the positive coefficient, which is marginally significant. As expected, low-income commuters have more spare time for pursuing maintenance activities in the middle of the work day. Commute distance negatively affects AM peak engagement in maintenance activities. Uncertainty in commute time increases as commute distance increases, thus commuters are unwilling to undertake additional activities in the AM peak period on their commute. Total daily work time negatively affects midday engagement in maintenance activities, which is consistent with expectation: the more time commuters spend on work, the less time is available for maintenance activities in midday.

Table 4.14 Description and Definition of Variables in Timing-duration Model

                                                          Commuters           Non-Commuters
                                                          Activity Sample     Activity Sample
Sample Size                                               3394                11293
Variable Name  Variable Description                       Mean   Std. Dev.    Mean   Std. Dev.
AGE            Age in 100 years                           0.41   0.12         0.49   0.20
AGE_SQ         The square of age (100^2 years^2)          0.18   0.10         0.28   0.20
PMALE          Person is male                             0.46   0.50         0.35   0.48
HHSIZE1        Single-member family                       0.27   0.44         -      -
HIGH_INC       Monthly household income > Fr 10000        -      -            0.10   0.30
LOW_INC        Monthly household income < Fr 4000         0.10   0.31         0.24   0.43
CAR_0          Household does not own car                 0.13   0.34         0.20   0.40
LN_DISW        Ln(1 + commute distance in kilometers)     1.91   1.02         -      -
WORKDUR        Daily total work time (100 mins)           4.15   1.66         -      -
HHSIZE         Total number of household members          2.56   1.33         2.55   1.39
LN_DUR         Ln(1 + activity duration in minutes)       2.64   1.33         3.04   1.36
AMPEAK         Activity is scheduled in AM peak           0.11   0.31         0.10   0.30
PMPEAK         Activity is scheduled in PM peak           0.36   0.48         0.16   0.36
MIDDAY         Activity is scheduled in Midday            0.43   0.50         0.70   0.46

Table 4.15 Non-commuter Model (Duration → Time-of-day)

                Recursive        Unidentified        Identified            Lee
                 Models          Mixed Models       Mixed Models          Models
Variable      Coeff.   t-test    Coeff.   t-test    Coeff.   t-test    Coeff.   t-test

Activity Time-of-Day Choice Model

AM Peak Choice Model
Constant      -2.502   -11.66  -12.635    -1.93   -2.873    -1.33   -2.226   -10.15
AGE            3.760    11.84    8.984     2.48    4.557     2.34    3.768    11.88
HHSIZE         0.129     4.43    0.485     1.87    0.155     1.30    0.127     4.36
LOW_INC        0.186     2.36    0.695     1.53    0.271     1.05    0.185     2.35
CAR_0          0.678     3.35    0.447     1.02    0.677     2.80    0.647     3.24
LN_DUR         0.495    12.20    0.310     1.76    0.044     0.04    0.377     8.92
f1                 -        -    7.369     1.90    2.167     0.60        -        -

PM Peak Choice Model
Constant      -0.023    -0.16    0.030     0.18   -0.009    -0.02    0.144     0.94
AGE            1.206     4.35    1.263     4.35    1.206     4.13    1.192     4.30
PMALE         -0.234    -3.25   -0.383    -3.65   -0.260    -1.96   -0.229    -3.18
HIGH_INC       0.146     1.77    0.000        -    0.000        -    0.156     1.89
CAR_0          0.565     2.87    0.545     2.75    0.557     2.79    0.535     2.75
LN_DUR         0.384    10.22    0.393     9.67    0.390     1.93    0.305     7.73
f2                 -        -   -0.018    -0.03    0.000        -        -        -

Midday Choice Model
Constant       0.003     0.02   -0.105    -0.34   -0.193    -0.17    0.380     2.70
AGE            2.474     9.52    2.976     3.79    2.514     6.43    2.475     9.56
PMALE         -0.199    -3.46   -0.380    -3.33   -0.224    -1.39   -0.205    -3.56
CAR_0          0.792     4.22    0.882     3.57    0.803     4.03    0.762     4.13
LN_DUR         0.665    18.81    0.792     4.29    0.741     1.83    0.514    14.02
f3                 -        -   -1.460    -1.12   -0.440    -0.25        -        -

Off-Peak Choice Model
f4                 -        -   -0.179    -0.22   -0.029    -0.02        -        -

Activity Duration Model
Constant       3.404    56.48    3.407    41.66    3.439    40.96    3.524    43.38
AGE           -0.760    -3.53   -0.793    -2.73   -0.959    -3.08   -0.782    -2.71
AGE_SQ         0.821     3.75    0.863     2.92    1.038     3.27    0.764     2.61
PMALE         -0.147    -7.42   -0.147    -5.46   -0.146    -5.40   -0.132    -4.94
HHSIZE        -0.056    -6.46   -0.060    -5.29   -0.060    -5.30   -0.057    -5.09
HIGH_INC      -0.136    -4.14   -0.148    -3.38   -0.141    -3.02   -0.144    -3.34
CAR_GE2       -0.045    -1.96    0.000        -    0.000        -    0.000        -
g1/r1              -        -    0.000        -    0.364     1.18    0.088     4.66
g2/r2              -        -    0.000        -    0.000        -    0.171     9.53
g3/r3              -        -    0.000        -    0.240     0.29    0.000        -
g4/r4              -        -    0.000        -   -0.284    -1.08    0.326    14.92
sigma          1.357        -    1.358   150.29    1.254    10.18    1.348   151.10
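The non-linear age effect in the activity duration model of Table 4.15 can be illustrated by locating the age at which predicted LN_DUR is lowest. With AGE measured in units of 100 years and AGE_SQ equal to its square, the derivative of LN_DUR with respect to AGE is zero at AGE = -b_AGE / (2 * b_AGE_SQ). The sketch below applies this to the four sets of estimates in the table; the function and dictionary names are illustrative assumptions.

```python
def age_turning_point(beta_age, beta_age_sq):
    """Age in years at which d(LN_DUR)/d(AGE) = 0, with AGE in units of 100 years."""
    return 100.0 * (-beta_age / (2.0 * beta_age_sq))


# (AGE, AGE_SQ) coefficients of the activity duration model, from Table 4.15
DURATION_AGE_COEFS = {
    "recursive": (-0.760, 0.821),
    "unidentified_mixed": (-0.793, 0.863),
    "identified_mixed": (-0.959, 1.038),
    "lee": (-0.782, 0.764),
}

turning_points = {
    model: age_turning_point(b_age, b_age_sq)
    for model, (b_age, b_age_sq) in DURATION_AGE_COEFS.items()
}
```

Since the coefficient on AGE_SQ is positive in every model, the turning point is a minimum: predicted duration is lowest for middle-aged non-commuters (roughly 46 years in the recursive model) and higher for younger and older ones, in line with the interpretation given in the text.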

Table 4.16 Non-commuter Model (Time-of-day → Duration)

                Recursive        Unidentified        Identified            Lee
                 Models          Mixed Models       Mixed Models          Models
Variable      Coeff.   t-test    Coeff.   t-test    Coeff.   t-test    Coeff.   t-test

Activity Time-of-Day Choice Model

AM Peak Choice Model
Constant      -1.360    -7.13   -2.650    -2.74   -1.267    -5.19   -1.360    -7.14
AGE            3.805    12.29   11.200     3.33    3.916    12.05    3.800    12.27
HHSIZE         0.132     4.56    0.413     2.85    0.130     4.44    0.133     4.54
LOW_INC        0.185     2.34    0.467     1.76    0.171     2.15    0.178     2.27
CAR_0          0.795     3.95    1.937     3.02    0.801     3.92    0.814     4.04
f1                 -        -    2.437     2.98    0.000        -        -        -

PM Peak Choice Model
Constant       0.826     6.74   -2.293    -1.38    0.826     4.32    0.831     6.77
AGE            1.237     4.59   -2.955    -1.33    1.313     4.63    1.220     4.53
PMALE         -0.239    -3.33    0.000        -    0.000        -   -0.238    -3.29
HIGH_INC       0.180     2.20    1.312     1.69    0.181     2.18    0.185     2.26
CAR_0          0.667     3.40    0.957     1.25    0.706     3.56    0.678     3.45
f2                 -        -  -14.032    -3.00    0.143     0.73        -        -

Midday Choice Model
Constant       1.686    15.22    6.612     3.48    1.708     8.84    1.694    15.28
AGE            2.521    10.14    8.258     3.14    2.617     9.68    2.497    10.05
PMALE         -0.238    -4.20    0.000        -    0.000        -   -0.238    -4.19
CAR_0          0.915     4.93    2.623     3.42    0.963     5.08    0.926     4.98
f3                 -        -    5.939     3.25   -0.366    -1.35        -        -

Off-Peak Choice Model
f4                 -        -   -4.684    -3.01    0.552     1.46        -        -

Activity Duration Model
Constant       2.313    31.03    2.314    23.46    2.331    15.09    2.357     8.82
AGE           -0.897    -4.17   -0.915    -3.22   -0.935    -3.28   -0.822    -2.88
AGE_SQ         0.785     3.59    0.808     2.80    0.853     2.96    0.772     2.68
PMALE         -0.121    -6.08   -0.121    -4.59   -0.117    -4.49   -0.134    -5.01
HHSIZE        -0.052    -6.06   -0.055    -4.93   -0.053    -4.76   -0.059    -5.33
HIGH_INC      -0.126    -3.84   -0.132    -3.10   -0.134    -3.14   -0.142    -3.31
CAR_GE2       -0.023    -1.01    0.000        -    0.000        -    0.000        -
AMPEAK         1.016    18.30    1.017    13.85    0.840     5.96    1.099     2.59
PMPEAK         0.794    15.20    0.795    11.50    0.740     3.79    0.802     2.56
MIDDAY         1.308    27.19    1.309    20.58    1.313     8.71    1.044     3.96
g1/r1              -        -    0.000        -    0.000        -    0.063     0.44
g2/r2              -        -    0.000        -   -0.876    10.57    0.031     0.36
g3/r3              -        -    0.000        -    0.504     5.26   -0.307    -5.80
g4/r4              -        -    0.000        -   -0.305    -3.32    0.018     0.20
sigma          1.323        -    1.323   150.29    0.805    10.13    1.346   109.44
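The effect of the time-of-day dummies on duration in Table 4.16 can be illustrated by predicting LN_DUR for one person under each of the four periods, using the recursive duration model. The person profile and the function name `predicted_ln_dur` are hypothetical; the coefficients are copied from the first block of the activity duration model in the table.

```python
import math

# Recursive log-linear duration model, Table 4.16 (time-of-day -> duration)
BETA = {
    "const": 2.313, "AGE": -0.897, "AGE_SQ": 0.785, "PMALE": -0.121,
    "HHSIZE": -0.052, "HIGH_INC": -0.126, "CAR_GE2": -0.023,
    "AMPEAK": 1.016, "PMPEAK": 0.794, "MIDDAY": 1.308,
}
PERIODS = ("AMPEAK", "PMPEAK", "MIDDAY", "OFFPEAK")


def predicted_ln_dur(person, period):
    """Predicted Ln(1 + duration) for a person whose activity starts in `period`."""
    x = dict(person, const=1.0)
    x["AGE_SQ"] = x["AGE"] ** 2
    for p in ("AMPEAK", "PMPEAK", "MIDDAY"):  # off-peak is the base category
        x[p] = 1.0 if p == period else 0.0
    return sum(b * x.get(name, 0.0) for name, b in BETA.items())


# Hypothetical non-commuter: 45 years old, female, two-person household.
person = {"AGE": 0.45, "PMALE": 0, "HHSIZE": 2, "HIGH_INC": 0, "CAR_GE2": 0}

pred = {p: predicted_ln_dur(person, p) for p in PERIODS}
ranking = sorted(pred, key=pred.get, reverse=True)

# exp(.) - 1 back-transforms the predicted log value; it is not the expected
# duration in minutes, because the mean of a log is not the log of a mean.
back_transformed = {p: math.exp(v) - 1.0 for p, v in pred.items()}
```

For any person profile the ranking of periods depends only on the three dummy coefficients, so the recursive model reproduces the ordering MIDDAY > AMPEAK > PMPEAK > OFFPEAK observed in Table 4.12.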

In the log-linear regression model for maintenance activity duration, age, gender, household size, total daily work time and car ownership are found to be significant contributing factors. A quadratic term of age is specified to capture the non-linear impact of age on activity duration. As opposed to the non-commuter model, AGE takes a positive coefficient and AGE_SQ takes a negative coefficient. These results indicate that middle-aged commuters are expected to undertake longer maintenance activities than younger and older commuters, since middle-aged commuters have to undertake many more responsibilities. Similar to non-commuters, male commuters' activity durations are expected to be shorter than female commuters', as evidenced by the negative coefficient of PMALE. Commuters living with more household members are less likely to engage in longer activities, as evidenced by the negative coefficient of HHSIZE, again similar to non-commuters. Total daily work time negatively affects maintenance activity duration, as indicated by the negative coefficient for WORKDUR. As expected, the more time commuters spend on work, the less time is available for maintenance activities. The negative coefficient for CAR_0 indicates that commuters without cars in household tend to allocate more time on maintenance activity than those with household cars. These commuters should heavily depend on public transit, thus the fixed schedule of transit service may lengthen their activity duration.

As for the estimation results for the endogenous variable LN_DUR, no substantial differences are found among the types of models. LN_DUR takes a positive coefficient in the utility functions for PM peak choice and Midday choice and takes an insignificant coefficient in the utility function for AM peak choice. It implies that activity duration
negatively affects AM peak or off-peak choice of maintenance activities, probably because activities of longer duration cannot be pursued in those periods due to work schedules (e.g. work starts in the morning) or institutional constraints (e.g. shopping centers are closed at night and in the early morning).

The minor difference in the coefficients for the endogenous variable among the recursive model, mixed model and Lee model is caused by slight correlations between random error terms. The unidentified mixed logit model indicates that f4 takes the smallest value, thus f4 and g4 are fixed at zero in the identified mixed discrete-continuous model. In the identified mixed model, the correlations are calculated as 0.030, -0.129 and -0.013 according to Equation (2.4.5). In the Lee model, r1 (0.285) and r4 (0.261) appear statistically significant and rather considerable, thus the coefficient of LN_DUR is a bit smaller than that from recursive estimation. The standard deviations fi of the normal heterogeneity terms appear small and insignificant in the mixed discrete choice model. This reflects that commuters' daily activity patterns are constrained by their rigid work schedules. The flexibility of maintenance activities is limited for commuters, thus there are not many unspecified factors contributing to time-of-day choices of maintenance activities. In addition, Table 4.19 shows that the significance levels of the error covariances are 0.387, 0.288 and 0.429, which do not provide strong evidence for the existence of error correlations. This result indicates that there are not many unspecified factors simultaneously affecting commuters' maintenance activity timing and duration, which is consistent with the finding in Pendyala and Bhat (2004). The standard deviations of random error components in the continuous model are reasonable and fairly consistent across all types of models. The standard deviation of the total random component in the identified mixed
model can be calculated as 1.306, which is almost equal to its counterpart (1.307) in the recursive model and close to its counterpart (1.298) in the Lee model.

Table 4.18 offers model estimation results for the commuter model where time-of-day choices affect activity duration. The exogenous variables in all the models take coefficients with the same sign and slight variations in magnitude. The coefficient of AGE in the PM peak utility function of the Lee model is greatly different from the others due to the exclusion of the insignificant AGE_SQ variable. All these coefficients for exogenous variables take the same sign as those in Table 4.17.

The unidentified mixed logit model indicates that f4 takes the smallest value, thus f4 and g4 are fixed at zero in the identified mixed model. The estimation results for endogenous variables differ greatly among the various types of models. The coefficient of AMPEAK is not significant in the recursive model. However, the coefficient of AMPEAK appears significantly negative (-1.422) in the Lee model, as the error correlation r1 is significantly negative (-0.526), which indicates a strongly positive correlation between the corresponding random error terms. This estimation result is consistent with that in Pendyala and Bhat (2004), who also found that AMPEAK takes a negative coefficient in the commuters' activity duration model. However, the estimated coefficient for AMPEAK from the mixed model is insignificant. This indicates that the coefficient estimates for endogenous dummy variables are highly sensitive to the specification of the error structure in the joint modeling system.

PMPEAK takes a significantly positive coefficient (0.867) in the recursive model, but takes insignificant coefficients in the identified mixed model and Lee model. The corresponding Corr(u2, a) is calculated as 0.243 and takes a significance level of 0.187 in
the mixed model, while the corresponding error correlation r2 is -0.648 in the Lee model, implying a strongly positive correlation between the utility function for PM peak choice and the continuous activity duration model. At this point, both the mixed model and the Lee model yield consistent results: unspecified factors associated with PM peak choice positively affect the duration of maintenance activities, but PM peak choice itself does not exert a significant impact on the activity duration.

MIDDAY takes significantly positive coefficients of 0.960 in the recursive model and 0.813 in the identified mixed model. Corr(u3,a) is calculated as 0.080 and is insignificant (significance level: 0.313). That is the reason why the MIDDAY coefficient in the mixed model does not differ greatly from that in the recursive model. However, in the Lee model, MIDDAY takes a much greater positive coefficient of 1.791, while r3 is estimated as 0.726, indicating a highly negative correlation between the utility of midday choice and activity duration.

The standard deviation of the total random component in the mixed continuous model can be calculated as 1.267, which is close to the standard deviation of 1.258 in the recursive model and much less than the standard deviation of 1.477 in the Lee model. The greater standard deviation of the Lee model is probably caused by the positive coefficient of WORKDUR in the log-linear activity duration model. In the recursive model, WORKDUR takes a significantly negative coefficient of -0.042. In the identified mixed model, WORKDUR still takes a negative coefficient of -0.034, although it is insignificant. From a behavioral perspective, total daily work time is expected to negatively affect commuters' maintenance activity duration. The positive coefficient of WORKDUR in the Lee model is counterintuitive. The distributional assumption on the whole latent utility
including both the systematic component and the random component may be contributing to this problem in the Lee model. Note that WORKDUR has been specified and appears significant in the Midday utility function. The highly negative correlation between the Midday utility function and the activity duration model, both of which include the WORKDUR variable, may result in such a counterintuitive estimate for WORKDUR in the continuous model.

In general, the model estimation results are rather consistent with the descriptive analysis in Table 4.13. On average, commuters' maintenance activities starting in the Midday and PM peak periods are longer than those in the AM peak and Off-peak periods. This result implies that activity duration is positively correlated with Midday and PM peak activity beginning times.

In summary, only the midday choice of maintenance activity positively affects activity duration for commuters. Due to the constraint of a fixed work schedule, commuters usually do not have much time for maintenance activities. Since the midday period includes lunch time, commuters may like to undertake a longer maintenance activity in this time period. On the other side, maintenance activity duration positively affects PM peak choice and midday choice. Intuitively, if commuters plan to make maintenance activities of longer duration, they would like to schedule them in the midday or PM peak period. That is because midday includes lunch time at noon and the PM peak period is flexible after work, during which longer maintenance activities can be undertaken on the way back home.
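
Because the duration equation is log-linear, the dummy-variable coefficients discussed above can also be read as multiplicative effects on duration. The following is a minimal Python sketch of that conversion. It assumes that the dependent variable is the natural logarithm of activity duration and that all other covariates and the error term are held fixed; the function name is illustrative and is not part of the original analysis.

```python
import math


def duration_multiplier(coefficient: float) -> float:
    """Multiplicative change in duration implied by a dummy-variable
    coefficient in a log-linear model, holding everything else fixed
    (an assumption of this sketch)."""
    return math.exp(coefficient)


# MIDDAY coefficients quoted in the text for the commuter models in which
# time-of-day choice affects activity duration.
midday_coefficients = {
    "recursive": 0.960,
    "identified mixed": 0.813,
    "Lee": 1.791,
}

for model, beta in midday_coefficients.items():
    factor = duration_multiplier(beta)
    print(f"{model}: duration multiplied by about {factor:.2f}")
```

Read this way, the spread of the MIDDAY effect across the three specifications gives a concrete sense of the sensitivity to error structure noted above.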

Table 4.17 Commuter Model (Duration → Time-of-day)

Model                Recursive  Unidentified Mixed    Identified Mixed                 Lee
Variable      Coeff.    t-test    Coeff.    t-test    Coeff.    t-test    Coeff.    t-test

Activity Time-of-Day Choice Model

AM Peak Choice Model
Constant      -1.050     -3.26    -5.280     -1.15    -1.098     -2.61    -0.900     -2.75
AGE            3.610      5.56     7.137      1.66     3.713      5.28     3.592      5.50
PMALE         -0.400     -2.59    -0.371     -0.91    -0.413     -2.57    -0.471     -3.03
HHSIZE1       -0.405     -2.72    -1.149     -1.23    -0.403     -2.47    -0.419     -2.81
CAR_0          0.974      3.50     1.681      1.64     0.987      3.44     0.940      3.42
LN_DISW       -0.105     -1.83    -0.328     -1.09    -0.112     -1.81    -0.118     -2.06
LN_DUR         0.035      0.58    -0.988     -0.84    -0.019     -0.16     0.001      0.02
f1                --        --     6.088      1.18     0.665      0.85        --        --

PM Peak Choice Model
Constant      -1.347     -3.08    -1.484     -2.82    -1.632     -2.44    -1.012     -2.31
AGE            5.590      2.75     6.174      2.61     5.352      2.52     5.419      2.68
AGE_SQ        -4.290     -1.76    -4.978     -1.75    -4.069     -1.62    -4.117     -1.70
PMALE         -0.311     -2.44    -0.324     -2.42    -0.285     -2.07    -0.363     -2.84
CAR_0          0.812      3.41     0.824      3.32     0.787      3.23     0.793      3.38
LN_DUR         0.530     10.72     0.538     10.02     0.657      2.80     0.410      7.41
f2                --        --    -0.461     -0.63    -0.210     -0.56        --        --

Midday Choice Model
Constant       0.483      1.81     0.585      1.83     0.476      1.45     0.916      3.35
AGE            2.522      4.68     2.569      4.40     2.514      4.65     2.515      4.68
PMALE         -0.630     -4.94    -0.673     -4.38    -0.629     -4.91    -0.685     -5.37
HHSIZE1       -0.182     -1.98    -0.205     -1.81    -0.166     -1.66    -0.165     -1.79
LOW_INC        0.210      1.68     0.241      1.55     0.213      1.68     0.217      1.74
CAR_0          0.640      2.63     0.642      2.58     0.632      2.59     0.619      2.58
WORKDUR       -0.274    -12.07    -0.307     -4.37    -0.280    -10.72    -0.278    -12.25
LN_DUR         0.582     11.79     0.601      9.14     0.595      7.01     0.418      7.98
f3                --        --    -0.477     -0.50     0.078      0.19        --        --

Off-Peak Choice Model
f4                --        --    -0.120     -0.25     0.000        --        --        --

Activity Duration Model
Constant       2.713     13.97     2.720     10.72     2.722     10.70     2.834     11.36
AGE            2.601      2.85     2.565      2.15     2.552      2.13     2.587      2.20
AGE_SQ        -2.853     -2.60    -2.809     -1.96    -2.808     -1.96    -2.871     -2.03
PMALE         -0.196     -5.64    -0.196     -4.32    -0.187     -4.13    -0.176     -3.92
HHSIZE        -0.097     -7.14    -0.097     -5.47    -0.097     -5.50    -0.097     -5.74
WORKDUR       -0.071     -6.77    -0.071     -5.18    -0.071     -5.21    -0.058     -4.22
CAR_0          0.121      2.31     0.122      1.76     0.127      1.83     0.000        --
g1/r1             --        --     0.000        --     0.086      0.54     0.285      8.38
g2/r2             --        --     0.000        --     1.040      8.64     0.094      1.76
g3/r3             --        --     0.000        --    -0.276     -1.90        --        --
g4/r4             --        --     0.000        --     0.000        --     0.261      7.67
sigma          1.307        --     1.307     82.39     0.736      4.98     1.298     82.45

Table 4.18 Commuter Model (Time-of-day → Duration)

Model                Recursive  Unidentified Mixed    Identified Mixed                 Lee
Variable      Coeff.    t-test    Coeff.    t-test    Coeff.    t-test    Coeff.    t-test

Activity Time-of-Day Choice Model

AM Peak Choice Model
Constant      -0.973     -3.17    -4.407     -1.95    -1.161     -0.99    -0.954     -3.17
AGE            3.623      5.57     6.611      2.89     3.699      4.65     3.345      5.19
PMALE         -0.387     -2.52    -0.450     -1.16    -0.385     -2.36    -0.277     -1.82
HHSIZE1       -0.479     -3.26    -1.284     -2.12    -0.519     -2.28    -0.387     -2.87
CAR_0          1.013      3.67     1.699      2.45     1.036      3.44     0.944      3.44
LN_DISW       -0.108     -1.91    -0.279     -1.55    -0.115     -1.68    -0.091     -1.70
f1                --        --     4.515      2.34     0.704      0.33        --        --

PM Peak Choice Model
Constant      -0.224     -0.53    -0.778     -0.84    -0.352     -0.69     0.404      1.85
AGE            6.036      3.00    11.085      2.13     6.459      2.73     2.396      4.61
AGE_SQ        -4.551     -1.88    -9.600     -1.65    -5.096     -1.78     0.000        --
PMALE         -0.388     -3.12    -0.515     -2.02    -0.373     -2.82    -0.327     -2.70
CAR_0          0.965      4.10     1.486      2.79     0.991      4.05     0.929      3.97
f2                --        --    -2.218     -2.00     0.640      1.05        --        --

Midday Choice Model
Constant       1.809      7.45     3.696      2.94     1.893      5.83     1.789      7.53
AGE            2.768      5.25     4.419      2.74     2.791      5.16     2.696      5.23
PMALE         -0.716     -5.76    -1.323     -2.62    -0.728     -5.39    -0.631     -5.21
HHSIZE1       -0.150     -1.64    -0.316     -1.37    -0.167     -1.62     0.000        --
LOW_INC        0.217      1.74     0.532      1.48     0.236      1.67     0.184      1.78
CAR_0          0.784      3.26     1.092      2.29     0.784      3.22     0.643      2.71
WORKDUR       -0.280    -12.49    -0.663     -2.53    -0.301     -5.19    -0.282    -12.99
f3                --        --    -2.918     -1.77    -0.403     -0.56        --        --

Off-Peak Choice Model
f4                --        --    -1.436     -1.56     0.000        --        --        --

Activity Duration Model
Constant       1.863      9.28     1.863      7.38     1.997      6.27     1.571      3.05
AGE            2.350      2.57     2.350      2.04     2.658      2.23     2.339      2.04
AGE_SQ        -2.648     -2.41    -2.648     -1.92    -3.021     -2.10    -2.722     -1.98
PMALE         -0.140     -4.02    -0.140     -3.19    -0.136     -2.68     0.000        --
HHSIZE        -0.083     -6.08    -0.083     -4.84    -0.086     -4.93    -0.075     -4.39
WORKDUR       -0.042     -3.86    -0.042     -3.07    -0.034     -1.49     0.045      2.66
CAR_0          0.088      1.67     0.088      1.33     0.113      1.63     0.158      2.11
AMPEAK         0.071      0.94     0.071      0.75    -0.010     -0.04    -1.422     -2.51
PMPEAK         0.867     14.13     0.867     11.24     0.450      1.50    -0.279     -0.56
MIDDAY         0.960     15.66     0.960     12.45     0.813      3.01     1.791      3.82
g1/r1             --        --     0.000        --     0.094      0.24    -0.526     -6.10
g2/r2             --        --     0.000        --     0.689      1.73    -0.648    -14.15
g3/r3             --        --     0.000        --    -0.337     -1.57     0.726     15.96
g4/r4             --        --     0.000        --     0.000        --     0.083      0.45
sigma          1.258        --     1.258     82.39     1.004      3.57     1.477     34.81
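
The total standard deviations and error correlations quoted in the discussion of Tables 4.17 and 4.18 can be reproduced from the identified mixed model estimates. The Python sketch below is a check under stated assumptions, not the dissertation's own code. It assumes that the heterogeneity terms and the residual of the duration equation are mutually independent, that each utility function carries a standard Gumbel error with variance pi^2/6, and that Equation (2.4.5), which is not reproduced in this section, takes the form coded in corr_u_a. These assumptions are adopted because they reproduce the reported values to three decimals; all function and variable names are illustrative.

```python
import math

# Variance of a standard Gumbel error (assumed kernel of each utility function).
GUMBEL_VAR = math.pi ** 2 / 6


def total_sd(sigma, g):
    """Std. dev. of the total random component of the duration equation,
    assuming the residual (sigma) and the heterogeneity terms g_i are independent."""
    return math.sqrt(sigma ** 2 + sum(g_i ** 2 for g_i in g))


def corr_u_a(i, f, g, sigma):
    """Assumed form of Corr(u_i, a): the shared-heterogeneity covariance f_i * g_i
    divided by the std. devs. of the utility error and of the duration error."""
    sd_u = math.sqrt(f[i] ** 2 + GUMBEL_VAR)
    return f[i] * g[i] / (sd_u * total_sd(sigma, g))


# Identified mixed model estimates; positions 0-3 are AM peak, PM peak, midday, off-peak.
table_4_17 = {"f": [0.665, -0.210, 0.078, 0.0], "g": [0.086, 1.040, -0.276, 0.0], "sigma": 0.736}
table_4_18 = {"f": [0.704, 0.640, -0.403, 0.0], "g": [0.094, 0.689, -0.337, 0.0], "sigma": 1.004}

for name, est in (("Table 4.17", table_4_17), ("Table 4.18", table_4_18)):
    sd = total_sd(est["sigma"], est["g"])
    corrs = [corr_u_a(i, est["f"], est["g"], est["sigma"]) for i in range(3)]
    print(name, "total sd:", round(sd, 3), "correlations:", [round(c, 3) for c in corrs])
```

If the assumed form is right, the small correlations follow directly from the small products f_i * g_i relative to the Gumbel and residual variances.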

Table 4.19 Simulation-based Hypothesis Test for Error Covariance of Identified Mixed Discrete-continuous Models

i                    1         2         3         4

Non-commuter Model (Duration → Time-of-Day)
E(fi)            2.167        --    -0.440    -0.029
Std(fi)          3.612        --     1.760     1.450
E(gi)            0.364        --     0.240    -0.284
Std(gi)          0.308        --     0.828     0.263
(fi,gi)          0.709        --    -0.915    -0.314
Sign.(figi)      0.209        --     0.128     0.549

Non-commuter Model (Time-of-Day → Duration)
E(fi)               --     0.143    -0.366     0.552
Std(fi)             --     0.196     0.271     0.378
E(gi)               --    -0.876     0.504    -0.305
Std(gi)             --     0.083     0.096     0.092
(fi,gi)             --    -0.086     0.196    -0.026
Sign.(figi)         --     0.232     0.089     0.072

Commuter Model (Duration → Time-of-Day)
E(fi)            0.665    -0.210     0.078        --
Std(fi)          0.782     0.375     0.411        --
E(gi)            0.086     1.040    -0.276        --
Std(gi)          0.159     0.120     0.145        --
(fi,gi)         -0.065    -0.279    -0.009        --
Sign.(figi)      0.387     0.288     0.429        --

Commuter Model (Time-of-Day → Duration)
E(fi)            0.704     0.640    -0.403        --
Std(fi)          2.127     0.689     0.719        --
E(gi)            0.094     0.806    -0.337        --
Std(gi)          0.387     0.399     0.215        --
(fi,gi)         -0.522    -0.548    -0.004        --
Sign.(figi)      0.630     0.187     0.313        --

4.2.4 Model Performance Comparisons Based on Non-nested Test

The model estimation results presented in Section 4.2.3 generally offer plausible statistical indications for alternative causal paradigms. Table 4.20 compares goodness-of-fit measurements of models under alternative causal structures. As derived in Section 2.3.3, the extension of the non-nested test for discrete-continuous models is adopted for
identifying the dominant causal structure among the population. For non-commuters, both the recursive model and the mixed model indicate that the model in which time-of-day choices affect activity duration provides better goodness-of-fit in terms of a greater adjusted likelihood index. Also, the non-nested test rejects the model in which activity duration affects time-of-day choice. For commuters, the non-nested test fails to reject the model in which activity duration affects time-of-day choice; therefore the causal relationship between time-of-day choices and activity duration is still inconclusive for commuters. On the other side, the Lee model supports the opposite conclusion that the causal relationship of duration affecting time-of-day choices is dominant among the population. This finding is consistent with that in Pendyala and Bhat (2005), who applied the Lee model for identifying the causal relationship between time-of-day choices and maintenance activity duration based on survey data from Florida, USA. The Lee model also identifies that the dominant causal relationship for commuters is duration → time-of-day, but Pendyala and Bhat (2005) did not draw conclusive results for commuters. It is rather surprising to see that not only the coefficient estimation of endogenous variables but also the dominant causal structure is sensitive to the specification of error structure. Finally, the Lee model provides better overall goodness-of-fit to the data than the mixed model does.

4.2.5 Discussions and Conclusions

Figures 4.1 and 4.2 summarize the causal relationships between activity duration and time-of-day choices based on the mixed discrete-continuous model and the Lee model. There are some contradictory results associated with the impact of AM peak choice on the
activity duration between different error structures. The diagrams only show the causal relationships that are consistently indicated by both the mixed model and the Lee model. A causal relationship rejected by the non-nested test is not dominant among the population but probably still exists among the population; thus both dominant and non-dominant causal relationships are illustrated in the figures for comparison purposes. For non-commuters, Midday and PM peak choices positively affect activity duration. As expected, non-commuters have sufficient time available for maintenance activities without institutional constraints such as the closing time of shopping centers. On the other side, maintenance activity duration positively affects Midday choice. In other words, maintenance activities of longer duration tend to be scheduled in the midday period. Intuitively, non-commuters who intend to make longer maintenance activities probably prefer to start them in midday for the sake of having sufficient time and avoiding peak-period congestion and institutional constraints. After long maintenance activities in midday, non-commuters may have to get back home earlier, probably in the PM peak period, to undertake necessary household obligations.

Figure 4.1 Diagram of Consistent Causal Relationship Identified by Joint Timing-duration Model for Non-commuters (diagram nodes: AM Peak, Midday, PM Peak, Activity Duration. Note: Solid arrow represents positive impact)

Figure 4.2 Diagram of Consistent Causal Relationship Identified by Joint Timing-duration Model for Commuters (diagram nodes: AM Peak, Midday, PM Peak, Activity Duration. Note: Solid arrow represents positive impact)

Table 4.20 Comparison of Goodness-of-fit of Timing-duration Models

                            Non-Commuter Models    Commuter Models
Sample Size                               11293               3394
LL at zero: LL(0)                      -35185.0           -10476.7
LL at constant: LL(c)                  -29729.8           -9846.93
Estimated sigma                          1.3640             1.3254

                                      Non-Commuter Models                   Commuter Models
                              Duration → Time  Time → Duration  Duration → Time  Time → Duration

Recursive Model
# of Parameters                            25               25               29               30
LL at convergence (LL)               -29212.9         -29193.2         -9514.41         -9513.36
ρ² at zero                             0.1697           0.1703           0.0919           0.0920
Adj. ρ² at zero                        0.1690           0.1696         0.089082         0.089087
ρ² at constant                         0.0174           0.0180           0.0338           0.0339
Adj. ρ² at constant                    0.0165           0.0172           0.0308           0.0308
Non-nested Test (Prob.)     Non-commuters: 0.0006 (0.000); Commuters: 0.000005 (0.147)

Unidentified Mixed Model
# of Parameters                            27               27               33               33
LL at convergence                    -29211.0         -29193.2         -9512.67         -9511.38
ρ² at zero                             0.1698           0.1703           0.0920           0.0921
Adj. ρ² at zero                        0.1690           0.1695         0.088867         0.088990
ρ² at constant                         0.0175           0.0180           0.0339           0.0341
Adj. ρ² at constant                    0.0165           0.0171           0.0306           0.0307
Non-nested Test (Prob.)     Non-commuters: 0.0005 (0.001); Commuters: 0.00012 (0.056)

Identified Mixed Model
# of Parameters                            29               28               35               35
LL at convergence                    -29213.1         -29196.6         -9511.55         -9511.69
ρ² at zero                             0.1697           0.1702           0.0921           0.0921
Adj. ρ² at zero                        0.1689           0.1694         0.088783         0.088770
ρ² at constant                         0.0174           0.0179           0.0341           0.0340
Adj. ρ² at constant                    0.0164           0.0170           0.0305           0.0305
Non-nested Test (Prob.)     Non-commuters: 0.0005 (0.000); Commuters: 0.000013 (0.300)

Lee Model
# of Parameters                            27               28               31               31
LL at convergence                    -29078.9         -29187.8         -9463.49         -9497.50
ρ² at zero                             0.1735           0.1704           0.0967           0.0935
Adj. ρ² at zero                        0.1728           0.1697           0.0938           0.0905
ρ² at constant                         0.0219           0.0182           0.0389           0.0355
Adj. ρ² at constant                    0.0210           0.0173           0.0358           0.0323
Non-nested Test (Prob.)     Non-commuters: 0.0031 (0.000); Commuters: 0.0032 (0.000)
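
The goodness-of-fit rows of Table 4.20 can be reproduced from the reported log-likelihood values. The following Python sketch assumes the usual definitions of the likelihood ratio index and its adjusted form, and assumes that the extended non-nested test of Section 2.3.3 reduces to the familiar normal-tail bound on the difference between adjusted indices. Both assumptions are consistent with the recursive-model figures in the table, but the function names are illustrative and the derivation in Section 2.3.3 is not reproduced here.

```python
import math


def rho_squared(ll_model, ll_base, n_params=0):
    """Likelihood ratio index relative to a base log-likelihood.
    Pass n_params > 0 to obtain the adjusted index."""
    return 1.0 - (ll_model - n_params) / ll_base


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def non_nested_bound(z, ll_zero, k_better, k_worse):
    """Assumed bound on the probability that the adjusted index of the
    better-fitting model exceeds that of the other model by more than z
    if the other model were the true one."""
    return normal_cdf(-math.sqrt(-2.0 * z * ll_zero + (k_better - k_worse)))


# Recursive commuter models from Table 4.20.
LL_ZERO = -10476.7
adj_duration_to_time = rho_squared(-9514.41, LL_ZERO, n_params=29)
adj_time_to_duration = rho_squared(-9513.36, LL_ZERO, n_params=30)

z = adj_time_to_duration - adj_duration_to_time
prob = non_nested_bound(z, LL_ZERO, k_better=30, k_worse=29)
print(round(adj_duration_to_time, 6), round(adj_time_to_duration, 6), round(prob, 3))
```

Under these assumptions the bound for the recursive commuter models stays well above conventional significance levels, which is why the test fails to reject the Duration → Time structure for commuters, whereas the much larger sample and larger difference for non-commuters drive the bound toward zero.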

For commuters, only the midday choice of maintenance activity positively affects activity duration. Due to the constraint of a fixed work schedule, commuters usually do not have much time for maintenance activities. Since the midday period includes lunch time, commuters may like to enjoy a longer maintenance activity in this time period. On the other side, maintenance activity duration positively affects PM peak choice and midday choice. Intuitively, if commuters plan to make maintenance activities of longer duration, they would like to schedule them in the midday or PM peak period. That is because midday includes lunch time at noon and the PM peak period is commute time after work, when longer maintenance activities can be made on the way back home.

In summary, this section has presented an exploration of the relationship between activity timing (time of day choice) and activity episode duration for maintenance activities such as shopping and service. The analysis involved the estimation of joint models of activity timing and duration separately for commuters and non-commuters while allowing two types of error correlations between the timing and duration model equations. Time of day choice was modeled as a discrete choice variable involving four alternative periods of the day while duration was modeled using a log-linear formulation. Two different causal structures were considered:
  - Activity timing (time of day choice) affects activity duration
  - Activity episode duration affects activity timing (time of day choice)
Both of these causal structures were estimated on the non-commuter and commuter sample activity episodes to identify the appropriate causal structure for each sample group.

The identification of such causal relationships between activity engagement phenomena is very important from several key perspectives. First, the
identification of appropriate causal structures will help in the development of accurate activity-based travel demand model systems that intend to capture such relationships at the level of the individual traveler and activity episode. Second, knowledge of the true causal relationships underlying decision processes will help in the accurate assessment and impact analysis of alternative transportation policies such as variable pricing, parking pricing, and telecommuting.

Unfortunately, the dominant causal relationship between timing and duration has not been consistently identified by the two types of models. For both commuters and non-commuters, the Lee model supports the causal relationship in which activity duration is determined first and then influences time-of-day choice. However, the mixed discrete-continuous model supports the alternative causal relationship for both commuters and non-commuters: time-of-day choices are determined first and then influence activity duration.

Both the mixed model and the Lee model adopt the Full Information Maximum Likelihood (FIML) method based on a distributional assumption on the error structure in the simultaneous model system, so as to consistently estimate the coefficient of an endogenous dummy variable in the continuous model or an endogenous continuous variable in the discrete choice model. The error structure in the mixed model is more behaviorally interpretable than the one in the Lee model. However, the likelihood function of the mixed model does not have a closed form and Monte Carlo integration is required to approximate it. Maximum Simulated Likelihood Estimation (MSLE) based on Monte Carlo integration is time-consuming under the current level of computational technology. Further, simulation bias cannot be avoided in MSLE. More quasi-random seeds for simulation can alleviate
the simulation bias, but accuracy is traded against time consumption in the estimation procedure. The Lee model has a closed form based on the Lee transformation. The estimation procedure of the Lee model takes much less time than that of the mixed model (a few minutes vs. a few hours). Moreover, the Lee model fits the data better, though herein the coefficient of the endogenous variable is of more concern than the fit.

The dependency on strong distributional assumptions is a common disadvantage of the mixed model and the Lee model. Maximum likelihood estimation is always consistent and efficient as long as the distributional assumption is true and all the parameters are identifiable. However, the distributional assumption is vulnerable in many cases, particularly when it is made to take account of unobserved heterogeneities. Since the coefficient of the endogenous variable is highly sensitive to the distributional assumption, a robust specification of the error structure turns out to be extremely important. For obtaining more robust estimation results, there might be two directions to further explore this topic in the area of travel behavior analysis. One is to introduce non-parametric heterogeneities into the joint model system. In the econometric literature, Mroz (1987 and 1999) applied discrete factor approximation to estimate an endogenous dummy variable in a continuous model. Sometimes, this method is called the mass point method, in which heterogeneity is non-parametric and discretely distributed, in place of parametric and continuously distributed (e.g. normal distribution, Gumbel distribution). However, it is not easy to apply this method in practice since the derived log-likelihood function is not globally concave and has multiple peaks. A large number of starting values need to be explored to avoid the pitfall of local maxima. The other way is to apply Limited Information Likelihood Estimation (LILE), which is more robust, albeit less efficient,
than FIML. Dubin and McFadden (1984) developed a two-stage estimation procedure for a joint discrete-continuous model system, where a multinomial logit model is initially estimated and then a non-linear function with respect to the predictors is specified in the continuous model along with the other explanatory variables. This approach may be used for consistently estimating the endogenous dummy variables in the continuous model but cannot be directly used for estimating the endogenous continuous variables in the latent utility function for discrete choice. That is because the multinomial logit model needs to be estimated in advance without any endogenous variables. But the idea of two-stage estimation merits our reference for developing a more robust modeling framework in which both endogenous continuous variables and endogenous dummy variables can be consistently estimated. It remains a challenging but interesting research effort for the future.

4.3 Causal Models Between Trip Timing and Mode Choice (Mixed Binary-multinomial Choice Model)

4.3.1 Background

Departure time choice and mode choice are important constituents of traveler behavior (Bhat, 1998). Travel demand models designed to estimate travel not only for the average weekday, but for different periods within the day (referred to as time-of-day models), are increasingly required to analyze a broad range of transportation policies and initiatives (Cambridge Systematics, 1997). In addition to the temporal dimension of trip making, mode choice is another facet of trip making that has important implications in the transportation policy context. Understanding the relationships underlying these two facets of travel behavior will, in turn, assist planners in examining the potential
effectiveness of policy measures aimed at alleviating traffic congestion and reducing auto vehicle emissions. Such policies, motivated by recent legislation, call for the deployment of travel demand models capable of assessing a range of transportation control measures (TCMs) (Stopher, 1993 and Weiner and Ducca, 1996).

Early studies involving departure time choice have focused mainly on work or commuting trips. Indeed, commuting directly contributes to morning and afternoon peak period congestion. The direct link between work trips and peak travel has provided researchers (Noland and Small 1994, Kumar and Levinson 1994, Lockwood and Demetsky, 1994) the necessary impetus to undertake studies that aim at modeling departure time choice of commuters and understanding the relationship between commuter departure time choice and traffic congestion levels.

The interest in modeling non-work trips also lies in their inherent nature of being more flexible than work trips in terms of the individual's time-of-day choice and mode choice. For certain types of non-work activities, such as shopping, the departure time flexibility is evident and therefore travelers may have a greater tendency to shift departure times than to shift modes in response to transportation control measures (Bhat, 1998). Similarly, social-recreation trips may be pursued at various times of the day unless the activity involves rigid time and space constraints such as those associated with concerts, sporting events, and movies. With respect to mode choice, non-work activities and trips tend to be undertaken jointly with other household members or friends (Steed and Bhat, 2000). Such joint coupling constraints may make mode switching quite difficult; on the other hand, departure time shifts may still be feasible, particularly in
today's context of real-time activity scheduling using cellular communications technology.

The causality between departure time choice and mode choice is quite important from a transportation planning and policy analysis perspective. If mode choice precedes departure time choice, then strategies aimed at reducing peak period travel should also focus significantly on people's mode choice behavior (because the departure time choice is influenced by mode choice). On the other hand, if departure time choice affects (and therefore precedes) mode choice, then strategies aimed at reducing peak period travel demand can focus primarily on departure time aspects of behavior. Besides, strategies aimed at reducing SOV use would have to focus significantly on departure time choice aspects as well because mode choice is affected by departure time choice. In addition to the causal relationship between these two aspects of behavior, attention must be paid to the potential simultaneity in their nature, in that unobserved factors affecting each of these may be correlated with one another. Thus, when modeling the relationship between departure time choice and mode choice, one needs to consider a rigorous simultaneous equations modeling framework. Treating mode choice as a multinomial choice variable and departure time choice as a binary choice variable, the proposed mixed binary-multinomial choice modeling methodology provides a rigorous modeling framework in which the causal relationship between them can be analyzed.

The central question addressed is: what is the causal relationship between departure time choice and mode choice for non-work trips? One may conjecture that people engaging in activities in the non-peak period may choose to travel by automobile because of the reduced traffic congestion and possibly better transit levels of service
during such periods. Conversely, people choosing to travel by automobile may arrange their activities such that they can do so in the non-peak periods to avoid congestion. Similar causal relationships may be considered in the context of peak period travel and/or non-auto travel. Thus, one may hypothesize causal relationships between departure time choice and mode choice that are opposite to one another. This section attempts to shed light on this issue by identifying the causal structure using the proposed mixed binary-multinomial choice model, where peak-period departure is modeled as a binary choice (peak vs. non-peak) and mode choice as a 4-alternative multinomial choice: SOV (Single Occupancy Vehicle), HOV (High Occupancy Vehicle), Transit and Non-motorized Mode (Bicycle and Walk).

4.3.2 Dataset Preparation and Description for Modeling Analysis

The data set for modeling analysis is derived from the Swiss Travel Microcensus 2000. Level-of-Service (LOS) variables associated with travel modes are the most important variables influencing mode choice behavior. These data are only available for the model area of Canton Aargau, thus all the non-work trips made in this area are selected to form a trip file including LOS variables for each trip's origin-destination pair, trip departure time, revealed mode choice, trip purpose and socio-economic and demographic variables of trip makers. Two market segments, commuters and non-commuters, are classified and separately modeled in consideration of the influence of the work schedule constraint on commuters. Commuters were defined as individuals who commuted to a work place on the travel diary day, while non-commuters were defined as those who did not commute to a work place (made zero work trips) on
the travel diary day. Note that a worker (employed person) who did not commute on the travel diary day would still be classified as a non-commuter for the purpose of this study.

Table 4.21 Household Characteristics of Swiss Travel Microcensus 2000 and Sample for Model of Mode Choice and Time-of-day Choice

                                           Swiss  Non-commuters      Commuters
Characteristic                            Sample         Sample         Sample
Sample Size                                27918           2273           1753
Household Size                              2.43           2.63           2.47
  1 person                                 27.5%          26.7%          26.7%
  2 persons                                35.1%          29.3%          33.9%
  3 persons                                14.0%          12.1%          13.2%
  4 persons                                23.4%          31.8%          26.1%
Monthly Income
  Low (Fr 8K)                              18.4%          19.1%          33.0%
  Missing                                  24.9%          22.6%          16.5%
Vehicle Ownership                           1.17           1.10           1.33
  0 auto                                   19.8%          23.1%          13.1%
  1 auto                                   50.5%          50.5%          50.3%
  2 autos                                  24.5%          21.5%          29.8%
  3 autos                                   5.2%           4.9%           6.7%
Family Type
  Single                                   27.2%          26.7%          26.4%
  Partner (unmarried and no child)         27.9%          23.2%          28.0%
  Married                                  43.6%          48.7%          43.4%
  Other                                     1.3%           1.5%           2.3%
Presence of Children
  Child < 6 years old                      10.6%          10.5%           9.4%
  Child 6~17 years old                     22.5%          31.6%          21.7%
Household Location
  Major city                               42.4%          47.2%          46.3%
  Surrounding areas of city                30.4%          29.1%          31.9%
  Isolated city                             1.1%           1.0%           0.5%
  Rural                                    26.1%          22.7%          21.3%
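
The segmentation rule described above depends on observed work travel on the diary day rather than on employment status. A minimal Python sketch of that rule follows; the record layout and field names are hypothetical and are not taken from the Microcensus data files.

```python
from dataclasses import dataclass


@dataclass
class PersonDay:
    # Hypothetical record layout, for illustration only.
    person_id: int
    employed: bool
    work_trips: int  # number of work trips reported on the travel diary day


def market_segment(person: PersonDay) -> str:
    """Return 'commuter' if at least one work trip was made on the diary day."""
    return "commuter" if person.work_trips > 0 else "non-commuter"


people = [
    PersonDay(person_id=1, employed=True, work_trips=2),
    PersonDay(person_id=2, employed=True, work_trips=0),   # employed, but did not commute
    PersonDay(person_id=3, employed=False, work_trips=0),
]
segments = {p.person_id: market_segment(p) for p in people}
print(segments)
```

The second record shows the case highlighted in the text: an employed person with no work trip on the diary day falls into the non-commuter segment.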

In the sample, non-work trips were pursued by 4260 individuals residing in 4026 households. Among these individuals, 1805 were commuters reporting 4619 non-work trips and the remaining 2455 individuals were non-commuters reporting 7984 non-work trips. For these specific datasets, Table 4.21 provides a summary of the household characteristics of these two samples and compares the characteristics with those of the whole Swiss sample. The average household size for the non-commuters and commuters household samples is 2.63 and 2.47 persons, respectively. As expected, households of the commuter sample report higher income levels than households of the non-commuter sample, presumably because commuters' households consistently include workers earning wages.

Table 4.22 Person Characteristics of Swiss Travel Microcensus 2000 and Sample for Model of Mode Choice and Time-of-day Choice

                                           Swiss  Non-commuters      Commuters
Characteristic                            Sample         Sample         Sample
Sample Size                                29407           2455           1805
Age (in years)                       43.9 (Mean)    44.1 (Mean)    41.1 (Mean)
  Young (6~29)                             26.8%          32.0%          21.1%
  Middle (30~59)                           47.6%          32.9%          72.0%
  Old (≥ 60)                               25.5%          35.1%           7.0%
Sex
  Male                                     46.3%          40.9%          59.2%
  Female                                   53.7%          59.1%          40.8%
Employment Status
  Full time                                37.3%          16.3%          75.5%
  Part time                                14.3%          11.9%          19.3%
  Not employed                             48.4%          71.7%           5.2%
Licensed                                   67.4%          56.0%          88.1%
#Trips/day                                  3.51           4.18           4.74
  Work trips                                0.46           0.07           1.63
  Non-work trips                            3.05           4.11           3.11

Similarly, households of the commuter sample report higher car
ownership levels than households of the non-commuter sample because commuters' households are more likely to own cars. Non-commuters are more likely to live with children who are 6 ~ 17 years old, as shown by a higher percentage than in the whole Swiss sample. The percentages of households located in rural areas in the current samples are lower than that of the whole Swiss sample, probably because there are fewer rural areas in Aargau Canton.

Table 4.23 Crosstabulation of Mode Choice and Time-of-day Choices for Non-commuters

                     Time-of-Day Choices
Mode Choice       Non-Peak      Peak     Total
Frequency
  SOV                 1070       491      1561
  HOV                 1096       542      1638
  Transit              647       324       971
  Non-motorized       2692      1122      3814
  Total               5505      2479      7984
Column Percent
  SOV                68.5%     31.5%    100.0%
  HOV                66.9%     33.1%    100.0%
  Transit            66.6%     33.4%    100.0%
  Non-motorized      70.6%     29.4%    100.0%
  Total              69.0%     31.0%    100.0%
Row Percent
  SOV                19.4%     19.8%     19.6%
  HOV                19.9%     21.9%     20.5%
  Transit            11.8%     13.1%     12.2%
  Non-motorized      48.9%     45.3%     47.8%
  Total             100.0%    100.0%    100.0%

Table 4.22 compares the person characteristics of the samples with those of the whole Swiss sample. The major differences between commuters and non-commuters are consistent with expectations. Commuters are predominantly in the age group of 30 ~ 59 years while 35.1% of non-commuters are 60 years of age or older. 75.5% of commuters are employed full time while only 16.3% of non-commuters are employed full time. 88.1% of commuters hold a driver license whereas 56.0% of non-commuters
hold a driver license. Finally, commuters make 1.63 work trips and 3.11 non-work trips per day, while non-commuters make 4.11 non-work trips per day.

Prior to commencing the model development effort, a descriptive analysis of the potential relationship between mode choice and peak-period departure of trips was undertaken. Trips departing in the time periods of 6:00 AM to 8:59 AM and 4:00 PM to 6:59 PM are defined as peak-period trips. Tables 4.23 and 4.24 offer simple crosstabulations of time-of-day choice against mode choice for non-commuters and commuters, respectively.

Table 4.24 Crosstabulation of Mode Choice and Time-of-day Choices for Commuters

                     Time-of-Day Choices
Mode Choice       Non-Peak      Peak     Total
Frequency
  SOV                 1107       872      1979
  HOV                  472       238       710
  Transit              217       318       535
  Non-motorized        977       418      1395
  Total               2773      1846      4619
Column Percent
  SOV                55.9%     44.1%    100.0%
  HOV                66.5%     33.5%    100.0%
  Transit            40.6%     59.4%    100.0%
  Non-motorized      70.0%     30.0%    100.0%
  Total              60.0%     40.0%    100.0%
Row Percent
  SOV                39.9%     47.2%     42.8%
  HOV                17.0%     12.9%     15.4%
  Transit             7.8%     17.2%     11.6%
  Non-motorized      35.2%     22.6%     30.2%
  Total             100.0%    100.0%    100.0%

For non-commuters, there are only slight differences in distribution across time-of-day choices and mode choices. Seemingly, trips using HOV mode (33.1%) and Transit mode (33.4%) are more likely to be scheduled in the peak period, as compared with the average peak period share of 31.0%. On the other side, peak-period trips are
more likely to use the HOV mode (21.9%) and the Transit mode (13.1%), as compared with the average mode shares of 20.5% for the HOV mode and 12.2% for the Transit mode. For commuters, the differences in distribution across time-of-day choices and mode choices are more pronounced than those for non-commuters. Trips using the SOV mode (44.1%) and the Transit mode (59.4%) appear more likely to be scheduled in the peak period than the average peak-period share of 40.0%. Conversely, peak-period trips are more likely to use the SOV mode (47.2%) and the Transit mode (17.2%), as compared with the average mode shares of 42.8% for the SOV mode and 11.6% for the Transit mode.

4.3.3 Model Estimation Results

4.3.3.1 Estimation Results for Non-commuters

Table 4.25 offers the definition and description of the variables adopted in the models. Table 4.26 provides the non-commuter model estimation results under the causal structure where multinomial mode choice affects binary time-of-day choice. The first block offers the estimation results of the recursive models, i.e., a multinomial logit model for mode choice and a binary probit model for the peak-period departure choice of trips. The second block offers the estimation results of the mixed binary-multinomial choice model in which the standard deviations $g_i$ of the heterogeneity terms in the binary probit model are fixed at 1. The third block offers the estimation results of the mixed binary-multinomial choice model in which $f_i$ is equal to $g_i$ in absolute value and the sign of $f_i g_i$ is forced to be consistent with the sign of $f_i$ estimated in the second block.

Level-of-service (LOS) variables, car ownership and transit seasonal ticket subscription are specified in the multinomial mode choice model. In all the three types
of models, travel time by each mode is significantly negative in the corresponding utility function, as expected. TERMTIME and PKLOT_SH take negative coefficients, consistent with the expectation that longer terminal time and a shortage of parking spaces at the destination reduce the likelihood of choosing an auto mode. Persons in households without cars can hardly drive and use the SOV mode, so the coefficient of CAR_0 is highly negative (around -3.5) in the SOV utility function. The coefficient is only modestly negative (around -1.3) in the HOV utility function, since these persons may use the HOV mode as passengers even though they are unlikely to be drivers. Persons in households with more than one car have better access to cars and are therefore more likely to use the SOV and HOV modes, as evidenced by the positive coefficients of CAR_GE2 in both utility functions. As for transit mode choice, riders appear equally sensitive to in-vehicle time and waiting time at the initial station, as indicated by the almost identical negative coefficients of IVEH and OWT. The positive coefficient of FREQ indicates that more frequent service increases the likelihood of choosing transit, which is consistent with expectation. People who subscribe to transit seasonal tickets are much more likely to use transit than those without a subscription, as indicated by the highly positive coefficient of TRST_SUB (around 2.0) in the transit utility function.

In the binary probit model for peak-period departure, persons older than 60 years are less likely to make their non-work trips in the peak period, presumably because older people are more sensitive to traffic congestion and more inclined to avoid it than younger people. The positive coefficient of HHSIZE2 indicates that non-commuters living in two-member households are more likely to make peak-period non-work trips, possibly because they share cars with the other household member, who commutes in the peak period. Shopping trips are less likely to be scheduled in the peak period than trips for other purposes, as evidenced by the negative coefficient of SHOPPING. The positive coefficient of SERVICE indicates that non-commuters prefer to make service trips in the peak period, because trips taking children to school make up the bulk of service trips and most of them are undertaken in the morning peak period.

Table 4.25 Variable Description in Timing-mode Choice Model

                                                                              Commuters           Non-Commuters
                                                                              Activity Sample     Activity Sample
Sample Size                                                                   4619                7984
Variable Name  Variable Description                                           Mean   Std. Dev.    Mean   Std. Dev.
CAR_TIME       Car in-vehicle time (100 mins)                                 0.11     0.10       0.08     0.09
TERMTIME       Car terminal time (min)                                        5.48     2.12       5.17     2.25
PKLOT_SH       Measurement of parking lot shortage                            2.64     5.53       2.27     4.91
CAR_0          Household does not own car                                     0.12     0.32       0.21     0.41
CAR_GE2        Household owns more than one car (≥ 2)                         0.38     0.49       0.28     0.45
IVEH           Transit in-vehicle time (100 mins)                             0.12     0.13       0.09     0.11
OWT            Waiting time at 1st transit station (min)                      0.14     0.16       0.11     0.13
FREQ           Transit frequency within 2 hours                               7.63     7.88       7.71     8.10
TRST_SUB       Transit seasonal ticket is subscribed                          0.21     0.41       0.22     0.42
NM_TIME        Average travel time by bicycle and on foot (100 mins)          0.67     0.84       0.47     0.70
OLD            Person is over 60 years old                                    0.04     0.21       0.35     0.48
HHSIZE2        Household has two members                                      0.32     0.47       0.28     0.45
NSWISS         Household is not located in a permanent address of Switzerland 0.01     0.12       0.00     0.07
SHOPPING       Trip purpose is shopping                                       0.13     0.33       0.20     0.40
SERVICE        Trip purpose is service                                        0.03     0.18       0.04     0.19
SOV            Trip mode is SOV                                               0.43     0.49       0.20     0.40
HOV            Trip mode is HOV                                               0.15     0.36       0.21     0.40
TRANSIT        Trip mode is Transit                                           0.12     0.32       0.12     0.33
NMOTOR         Trip mode is bicycle or walk                                   0.30     0.46       0.48     0.50
PEAK           Trip is departed in peak period                                0.40     0.49       0.31     0.46

In the binary probit model, theoretically speaking, the coefficients of the variables should be proportional to the standard deviation of the normal random error term in the utility function, because the dependent variable is an unobservable latent variable. This latent variable can be arbitrarily rescaled by scaling up the coefficients and the standard deviation
of the normal random term without changing the probability of the observed binary choices. Thus, in the standard binary probit model, the standard deviation of the normal random error term is normalized to 1 so that a unique set of coefficients can be estimated. In the mixed binary-multinomial choice model, the normal heterogeneity terms with standard deviations $g_i$ enter the random component of the binary probit model, so the standard deviation of the random component must be greater than 1. The coefficients in the mixed model are therefore enlarged in response to the increase in the standard deviation of the random component. The ratio between the coefficients in the mixed model and those in the recursive model can be calculated as

$$\sqrt{g_1^2 + g_2^2 + g_3^2 + g_4^2 + 1}$$

In the mixed model where $g_i$ is fixed at 1, the ratio is a constant ($\sqrt{5} \approx 2.236$). In the second block, almost all the coefficients of the other exogenous variables have been scaled up by 2 to 3 times. However, the coefficient of the endogenous dummy variable TRANSIT is scaled up by around 7 times (0.133 vs. 0.900), much more than 2.236, which is caused by the correlation between the error terms in the transit utility function and the peak-period departure utility function. In the mixed model where $g_i$ is fixed at 1, this correlation can be estimated by the equation

$$\mathrm{Corr}(u_{qi}, v_q) = \frac{f_i}{\sqrt{(f_i^2 + \pi^2/6)(I + 1)}}$$

Accordingly, the correlation between the error terms in the transit utility function and the peak-period departure utility function is calculated as -0.305. This explains why the coefficient of TRANSIT is much greater than the calculated theoretical value (0.900 > 0.133 × 2.236 ≈ 0.297). The correlations between the utility functions of SOV, HOV and Non-motorized and that of peak-period departure are 0.147, 0.092 and -0.017, respectively. SOV and HOV appear insignificant in all the models.
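The scaling and correlation expressions for the case where $g_i$ is fixed at 1 can be checked numerically. The following is a minimal sketch, assuming the correlation formula has the form given in the text with a standard Gumbel error (variance $\pi^2/6$) in each mode utility; the function names are hypothetical. With the second-block estimates of Table 4.26 it reproduces the reported values for SOV, HOV and Non-motorized; for Transit ($f_3 = -1.061$) the same expression gives about -0.285 rather than the reported -0.305, so the formula should be read as a reconstruction, not a confirmed derivation.

```python
import math

# Assumption: each mode utility carries a standard Gumbel error, variance pi^2 / 6.
GUMBEL_VAR = math.pi ** 2 / 6


def corr_fixed_g(f_i, n_modes=4):
    """Error correlation between mode utility i and the peak-period utility
    when every g_i is fixed at 1 (hypothetical helper)."""
    return f_i / math.sqrt((f_i ** 2 + GUMBEL_VAR) * (n_modes + 1))


def scale_ratio_fixed_g(n_modes=4):
    """Standard deviation of the probit random component: n_modes unit-variance
    heterogeneity terms plus the unit-variance normal error."""
    return math.sqrt(n_modes + 1)


# f_i estimates from the second block of Table 4.26
for name, f in [("SOV", 0.448), ("HOV", 0.269), ("Non-motorized", -0.049)]:
    print(name, round(corr_fixed_g(f), 3))

print(round(scale_ratio_fixed_g(), 3))          # sqrt(5)
print(round(0.133 * scale_ratio_fixed_g(), 3))  # rescaled recursive TRANSIT coefficient
print(round(1 / math.sqrt(5), 3))               # limit of |correlation| as |f_i| grows
```

The last line shows why a correlation stronger than about 0.447 in absolute value cannot be represented when all $g_i$ equal 1: the expression approaches $1/\sqrt{I+1}$ as $|f_i|$ becomes large.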


The third block provides the mixed model with the restriction that $|f_i| = |g_i|$, through which it is believed that the correlations can be better accommodated in the simultaneous equations model. In the current model, the correlations between the utility functions of SOV, HOV, Transit and Non-motorized mode and that of peak-period departure are 0.051, 0.013, -0.442 and -0.091, respectively. In the current mixed model, the absolute values of the correlations for SOV and HOV are smaller, and those for Transit and Non-motorized are greater, than in the mixed model where $g_i$ is fixed at 1. The most useful contribution of the mixed model with the restriction $|f_i| = |g_i|$ is that it accommodates error correlations greater in absolute value than $1/\sqrt{I + 1} \approx 0.447$ ($I = 4$). The correlation of -0.442 between the transit utility and the peak-period departure utility can hardly be accommodated in the mixed model where $g_i$ is fixed at 1. As before, the coefficients of the variables in the binary choice model are scaled up by the additional heterogeneity terms. The ratio is calculated as

$$\sqrt{0.327^2 + 0.168^2 + 1.083^2 + 0.442^2 + 1} \approx 1.582$$

Correspondingly, the estimated coefficients in the current mixed model are scaled up by around 1 to 2 times compared with those in the recursive model. The two types of mixed models yield rather similar estimation results for the endogenous dummy variables indicating mode choice. TRANSIT takes a positive coefficient of 0.789, which is a bit less than the 0.900 in the second block. Both types of models support the hypothesis that transit trips are more likely to be scheduled in the peak period.
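The same check can be made for the restricted model with $|f_i| = |g_i|$. The sketch below assumes the general form $\mathrm{Corr}(u_{qi}, v_q) = f_i g_i / \sqrt{(f_i^2 + \pi^2/6)(\sum_j g_j^2 + 1)}$, which reduces to the fixed-$g_i$ expression when every $g_i = 1$; this general form is inferred from the reported numbers rather than quoted from the text, and the helper names are hypothetical. The inputs are the third-block estimates of Table 4.26.

```python
import math

# Assumption: each mode utility carries a standard Gumbel error, variance pi^2 / 6.
GUMBEL_VAR = math.pi ** 2 / 6


def probit_scale(g):
    """Standard deviation of the probit random component with heterogeneity
    terms g_1..g_I plus a unit-variance normal error (hypothetical helper)."""
    return math.sqrt(sum(x * x for x in g) + 1.0)


def error_corr(f, g):
    """Correlation between each mode utility and the peak-period utility."""
    scale = probit_scale(g)
    return [fi * gi / (math.sqrt(fi * fi + GUMBEL_VAR) * scale)
            for fi, gi in zip(f, g)]


# Third block of Table 4.26: SOV, HOV, Transit, Non-motorized
f = [0.327, 0.168, -1.083, -0.442]
g = [0.327, 0.168, 1.083, 0.442]

print(round(probit_scale(g), 3))  # ratio by which probit coefficients are scaled
for name, c in zip(["SOV", "HOV", "Transit", "Non-motorized"], error_corr(f, g)):
    print(name, round(c, 3))
```

Under these assumptions the computed correlations agree with the reported 0.051, 0.013, -0.442 and -0.091 to within rounding.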


Table 4.26 Non-commuter Model (Mode → Time-of-day)

               Recursive Models      Mixed Models          Mixed Models
                                     (gi is fixed at 1)    (|fi| = |gi|)
Variable       Coeff.    t-test      Coeff.    t-test      Coeff.    t-test
Mode Choice Model
SOV Mode Choice Model
Constant       -1.616    -16.96      -1.777    -15.72      -1.785    -15.27
CAR_TIME       -5.063     -6.27      -8.345     -7.62      -8.781     -7.77
TERMTIME       -0.037     -2.03      -0.009     -0.48      -0.008     -0.42
PKLOT_SH       -0.028     -3.15      -0.027     -2.89      -0.027     -2.87
CAR_0          -3.495    -12.76      -3.571    -12.85      -3.578    -12.88
CAR_GE2         0.587      8.41       0.609      8.32       0.618      8.34
f1                 --        --       0.448      2.35       0.327      1.04
HOV Mode Choice Model
Constant       -1.456    -15.67      -1.585    -15.11      -1.609    -15.08
CAR_TIME       -3.929     -4.95      -7.222     -6.68      -7.662     -6.89
TERMTIME       -0.088     -4.93      -0.061     -3.21      -0.059     -3.07
PKLOT_SH       -0.023     -2.76      -0.022     -2.55      -0.022     -2.50
CAR_0          -1.260    -11.75      -1.302    -11.61      -1.326    -11.48
CAR_GE2         0.397      5.53       0.413      5.59       0.426      5.64
f2                 --        --       0.269      1.24       0.168      0.56
Transit Mode Choice Model
Constant       -3.914    -36.66      -4.426    -19.73      -4.448    -23.63
IVEH           -1.577     -2.41      -2.898     -3.73      -3.103     -3.93
OWT            -1.627     -3.37      -2.889     -4.73      -3.126     -4.96
FREQ            0.044     10.13       0.051      8.70       0.050      9.26
TRST_SUB        1.884     22.58       2.148     16.27       2.148     18.77
f3                 --        --      -1.061     -5.70      -1.083     -7.90
Non-motorized Mode Choice Model
NM_TIME        -4.448    -26.58      -5.031    -23.17      -5.192    -21.04
f4                 --        --      -0.049     -0.30      -0.442     -2.87
Time-of-Day (Peak-Period Departure Choice) Model
Constant       -0.482    -19.27      -1.031    -13.16      -0.753    -11.84
OLD            -0.116     -3.48      -0.358     -4.72      -0.259     -4.26
HHSIZE2         0.072      2.06       0.183      2.34       0.132      2.33
NSWISS         -0.607     -2.31      -1.257     -2.16      -0.934     -2.23
SHOPPING       -0.147     -3.86      -0.329     -3.88      -0.233     -3.69
SERVICE         0.240      3.22       0.521      3.15       0.365      3.05
SOV             0.036      0.90      -0.252     -1.39      -0.128     -0.83
HOV             0.069      1.76      -0.045     -0.23      -0.034     -0.25
TRANSIT         0.133      2.84       0.900      5.89       0.789      4.84
g1                 --        --       1.000        --       0.327      1.04
g2                 --        --       1.000        --       0.168      0.56
g3                 --        --       1.000        --       1.083      7.90
g4                 --        --       1.000        --       0.442      2.87
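As a rough illustration of how the recursive block of Table 4.26 is read, the sketch below evaluates the multinomial logit mode probabilities and the binary probit peak-period probability from the recursive coefficients. Evaluating the utilities at the non-commuter sample means of Table 4.25 is purely an illustrative assumption (the dissertation does not report such a calculation), and treating Non-motorized as the reference alternative with no constant and no mode dummy is inferred from the layout of the table. All function names are hypothetical.

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities from systematic utilities."""
    top = max(utilities.values())  # subtract the maximum for numerical stability
    expu = {k: math.exp(v - top) for k, v in utilities.items()}
    total = sum(expu.values())
    return {k: e / total for k, e in expu.items()}


def normal_cdf(x):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


# Illustrative inputs: non-commuter sample means from Table 4.25
x = dict(CAR_TIME=0.08, TERMTIME=5.17, PKLOT_SH=2.27, CAR_0=0.21, CAR_GE2=0.28,
         IVEH=0.09, OWT=0.11, FREQ=7.71, TRST_SUB=0.22, NM_TIME=0.47)

# Recursive-model coefficients from Table 4.26
utilities = {
    "SOV": (-1.616 - 5.063 * x["CAR_TIME"] - 0.037 * x["TERMTIME"]
            - 0.028 * x["PKLOT_SH"] - 3.495 * x["CAR_0"] + 0.587 * x["CAR_GE2"]),
    "HOV": (-1.456 - 3.929 * x["CAR_TIME"] - 0.088 * x["TERMTIME"]
            - 0.023 * x["PKLOT_SH"] - 1.260 * x["CAR_0"] + 0.397 * x["CAR_GE2"]),
    "Transit": (-3.914 - 1.577 * x["IVEH"] - 1.627 * x["OWT"]
                + 0.044 * x["FREQ"] + 1.884 * x["TRST_SUB"]),
    "Non-motorized": -4.448 * x["NM_TIME"],  # assumed reference alternative
}
probs = mnl_probabilities(utilities)


def peak_probability(mode, old=0, hhsize2=0, nswiss=0, shopping=0, service=0):
    """Recursive binary probit: probability of peak-period departure given mode."""
    index = (-0.482 - 0.116 * old + 0.072 * hhsize2 - 0.607 * nswiss
             - 0.147 * shopping + 0.240 * service)
    index += {"SOV": 0.036, "HOV": 0.069, "Transit": 0.133}.get(mode, 0.0)
    return normal_cdf(index)


for mode, p in probs.items():
    print(mode, round(p, 3), round(peak_probability(mode), 3))
```

The recursive structure is visible in `peak_probability`: the chosen mode enters the probit index as a dummy variable, so the peak-period probability is conditional on the mode predicted by the logit model.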


Table 4.27 provides the non-commuter model estimation results under the causal structure where binary time-of-day choice affects multinomial mode choice. All the coefficients of exogenous variables take reasonable signs with sound behavioral interpretations. The magnitude of the coefficients varies across the model types in a reasonable manner. In the recursive model, the endogenous dummy variable PEAK is positive and significant in the HOV utility function but insignificant in the SOV and Transit utility functions. PEAK is insignificant in all the utility functions in the mixed model where $g_i$ is fixed at 1, in which the correlations between the utility functions of SOV, HOV, Transit and Non-motorized and that of peak-period departure are 0.146, -0.070, -0.196 and -0.124, respectively. In the mixed model where $|f_i|$ is equal to $|g_i|$, PEAK is insignificant in the SOV and HOV utility functions and positive and significant in the Transit utility function. The correlations between the utility functions of SOV, HOV, Transit and Non-motorized and that of peak-period departure are 0.000, -0.005, -0.160 and -0.082, respectively. Surprisingly, the absolute values of these correlations are smaller than those in the mixed model with $g_i$ fixed at 1. This indicates that the restrictions imposed on $f_i$ and $g_i$ have a great impact on the estimated coefficients of the endogenous dummy variables.

4.3.3.2 Estimation Results for Commuters

Table 4.28 offers the commuter model estimation results under the causal structure where multinomial mode choice affects binary time-of-day choice. All the exogenous variables take reasonable coefficients in the mode choice model. In the binary peak-period departure model, commuters with children in the household are more likely to schedule their non-work trips in the peak period, as evidenced by the positive coefficient of WITH_KID.


Table 4.27 Non-commuter Model (Time-of-day → Mode)

               Recursive Models      Mixed Models          Mixed Models
                                     (gi is fixed at 1)    (|fi| = |gi|)
Variable       Coeff.    t-test      Coeff.    t-test      Coeff.    t-test
Mode Choice Model
SOV Mode Choice Model
Constant       -1.662    -16.76      -1.551    -12.80      -1.646    -16.69
CAR_TIME       -5.604     -6.58      -5.960     -6.40      -5.835     -6.51
TERMTIME       -0.032     -1.76      -0.033     -1.68      -0.032     -1.72
PKLOT_SH       -0.028     -3.12      -0.029     -3.10      -0.028     -3.09
CAR_0          -3.492    -12.75      -3.566    -12.59      -3.532    -12.86
CAR_GE2         0.586      8.40       0.615      7.89       0.600      8.42
PEAK            0.103      1.38      -0.455     -1.20      -0.062     -0.51
f1                 --        --       0.442      1.64       0.023      0.10
HOV Mode Choice Model
Constant       -1.521    -15.66      -1.537     -9.43      -1.494    -15.38
CAR_TIME       -4.497     -5.37      -4.896     -5.23      -4.713     -5.34
TERMTIME       -0.083     -4.59      -0.082     -4.33      -0.083     -4.51
PKLOT_SH       -0.023     -2.72      -0.023     -2.70      -0.023     -2.71
CAR_0          -1.258    -11.73      -1.302    -11.02      -1.301    -11.84
CAR_GE2         0.395      5.50       0.414      5.38       0.410      5.59
PEAK            0.158      2.16       0.062      0.15      -0.018     -0.15
f2                 --        --      -0.203     -0.60      -0.091     -0.53
Transit Mode Choice Model
Constant       -3.891    -34.14      -4.172    -14.13      -4.132    -31.44
IVEH           -1.898     -2.81      -1.960     -2.77      -1.925     -2.77
OWT            -1.786     -3.56      -1.879     -3.57      -1.831     -3.54
FREQ            0.044     10.01       0.047      7.97       0.047     10.04
TRST_SUB        1.885     22.57       1.988     14.85       1.969     21.42
PEAK           -0.034     -0.35       0.110      0.34       0.317      2.04
f3                 --        --      -0.585     -1.83      -0.511     -5.16
Non-motorized Mode Choice Model
NM_TIME        -4.527    -26.26      -4.698    -19.19      -4.649    -24.51
f4                 --        --      -0.370     -1.02      -0.360     -4.15
Time-of-Day (Peak-Period Departure Choice) Model
Constant       -0.446    -21.98      -0.987    -18.39      -0.532    -14.76
OLD            -0.114     -3.44      -0.308     -3.98      -0.151     -3.73
HHSIZE2         0.075      2.15       0.130      1.55       0.081      1.97
NSWISS         -0.595     -2.27      -1.103     -1.93      -0.688     -2.20
SHOPPING       -0.145     -3.79      -0.340     -3.74      -0.171     -3.64
SERVICE         0.253      3.44       0.513      2.43       0.256      2.92
g1                 --        --       1.000        --       0.023      0.10
g2                 --        --       1.000        --       0.091      0.53
g3                 --        --       1.000        --       0.511      5.16
g4                 --        --       1.000        --       0.360      4.15


A plausible reason is that these persons have to take children to school or kindergarten on their commute. Similar to non-commuters, commuters living in two-member households are more inclined to schedule non-work trips in the peak period, possibly because they need to serve the other household member on their commute. For a similar reason, service trips are more likely to be scheduled in the peak period by commuters, as evidenced by the positive coefficient of SERVICE. Unlike non-commuters, commuters tend to schedule shopping trips in the peak period, probably to pursue shopping activities on their commute, as indicated by the positive coefficient of SHOPPING. Since most commuters return home from the workplace in the PM peak period, HOME takes a positive coefficient in the model.

In all types of models, the endogenous dummy variables SOV and TRANSIT take significantly positive coefficients in the binary departure time choice model, but HOV is insignificant in all the models. In the mixed model where $g_i$ is fixed at 1, the error correlations between the utility functions of SOV, HOV, Transit and Non-motorized and that of peak-period departure are 0.093, 0.304, -0.309 and 0.283, respectively. In the mixed model where $|f_i|$ is equal to $|g_i|$, the error correlations between the utility functions of SOV, HOV, Transit and Non-motorized and that of peak-period departure are 0.001, 0.385, -0.498 and 0.095, respectively. The high error correlation (-0.498) cannot be accommodated in the mixed model where $g_i$ is fixed at 1. As mentioned before, -0.447 is the most negative correlation that can be accommodated in that type of model. Under the influence of the error correlations, the coefficients of SOV and TRANSIT are much more positive than those in the recursive model. However, no considerable differences are found in the coefficients of SOV and TRANSIT between the two types of mixed models.


Table 4.28 Commuter Model (Mode → Time-of-day)

               Recursive Models      Mixed Models          Mixed Models
                                     (gi is fixed at 1)    (|fi| = |gi|)
Variable       Coeff.    t-test      Coeff.    t-test      Coeff.    t-test
Mode Choice Model
SOV Mode Choice Model
Constant       -0.618     -4.81      -0.747     -4.74      -0.831     -5.74
CAR_TIME       -5.130     -5.64      -7.245     -5.85      -9.350     -6.47
TERMTIME       -0.092     -4.09      -0.091     -3.11      -0.057     -1.97
PKLOT_SH       -0.036     -4.19      -0.043     -4.24      -0.041     -4.32
CAR_0          -2.850    -11.54      -3.234    -10.90      -3.126    -11.08
CAR_GE2         0.683      8.23       0.805      7.07       0.734      7.63
f1                 --        --       0.274      1.37       0.057      0.29
HOV Mode Choice Model
Constant       -1.490     -9.55      -1.956     -7.24      -2.090     -8.09
CAR_TIME       -6.718     -6.97      -9.189     -7.04     -11.536     -7.84
TERMTIME       -0.076     -2.76      -0.074     -2.03      -0.034     -0.94
PKLOT_SH       -0.021     -1.96      -0.028     -2.11      -0.025     -1.96
CAR_0          -1.693     -7.37      -2.033     -7.09      -1.892     -6.87
CAR_GE2         0.499      4.86       0.631      4.62       0.584      4.56
f2                 --        --       1.192      3.36       1.242      3.67
Transit Mode Choice Model
Constant       -3.960    -24.14      -4.730    -14.98      -4.822    -17.55
IVEH           -2.514     -2.85      -2.950     -2.73      -3.891     -3.50
OWT            -1.766     -2.80      -3.365     -3.97      -4.488     -4.53
FREQ            0.014      2.21       0.019      2.40       0.019      2.38
TRST_SUB        2.560     20.72       2.982     14.11       3.050     16.59
f3                 --        --      -1.223     -5.03      -1.477     -7.24
Non-motorized Mode Choice Model
NM_TIME        -5.017    -21.48      -6.066    -13.66      -6.052    -19.37
f4                 --        --       1.051      3.28       0.545      1.85
Time-of-Day (Peak-Period Departure Choice) Model
Constant       -0.726    -15.58      -1.940    -14.94      -1.749     -7.69
WITH_KID        0.148      2.11       0.285      1.85       0.304      1.97
HHSIZE2         0.102      2.47       0.217      2.40       0.234      2.55
SHOPPING        0.303      4.76       0.694      5.00       0.668      4.40
SERVICE         0.439      4.05       0.999      4.21       0.959      3.84
HOME            0.245      5.52       0.554      5.66       0.557      5.12
SOV             0.336      7.30       1.268      7.01       1.062      3.90
HOV             0.080      1.29      -0.056     -0.21      -0.753     -1.59
TRANSIT         0.719     10.88       2.759     13.92       2.789      7.20
g1                 --        --       1.000        --       0.057      0.29
g2                 --        --       1.000        --       1.242      3.67
g3                 --        --       1.000        --       1.477      7.24
g4                 --        --       1.000        --       0.545      1.85


Table 4.29 Commuter Model (Time-of-day → Mode)

               Recursive Models      Mixed Models          Mixed Models
                                     (gi is fixed at 1)    (|fi| = |gi|)
Variable       Coeff.    t-test      Coeff.    t-test      Coeff.    t-test
Mode Choice Model
SOV Mode Choice Model
Constant       -0.748     -5.55      -0.535     -3.71      -0.547     -3.87
CAR_TIME       -4.729     -4.78      -4.931     -4.57      -5.087     -4.75
TERMTIME       -0.099     -4.31      -0.110     -4.27      -0.106     -4.30
PKLOT_SH       -0.035     -3.98      -0.037     -3.97      -0.036     -3.99
CAR_0          -2.860    -11.58      -2.949    -11.29      -2.955    -11.50
CAR_GE2         0.688      8.28       0.739      7.64       0.728      8.05
PEAK            0.495      5.41      -0.039     -0.26      -0.047     -0.32
f1                 --        --      -0.061     -0.35      -0.111     -0.45
HOV Mode Choice Model
Constant       -1.479     -9.13      -1.388     -7.84      -1.432     -8.14
CAR_TIME       -5.943     -5.68      -6.156     -5.43      -6.281     -5.56
TERMTIME       -0.088     -3.15      -0.099     -3.23      -0.096     -3.23
PKLOT_SH       -0.020     -1.87      -0.021     -1.93      -0.021     -1.92
CAR_0          -1.708     -7.42      -1.810     -7.36      -1.812     -7.52
CAR_GE2         0.492      4.78       0.547      4.78       0.535      4.86
PEAK            0.098      0.88      -0.397     -2.13      -0.384     -2.09
f2                 --        --      -0.292     -1.20      -0.434     -2.91
Transit Mode Choice Model
Constant       -4.366    -23.88      -4.296    -18.17      -4.343    -20.28
IVEH           -2.621     -2.87      -2.738     -2.86      -2.771     -2.89
OWT            -0.949     -1.50      -1.061     -1.61      -1.139     -1.72
FREQ            1.422      2.21       0.015      2.23       0.016      2.31
TRST_SUB        2.600     20.78       2.649     17.67       2.680     18.69
PEAK            0.872      6.32       0.735      3.57       0.760      3.79
f3                 --        --      -0.298     -0.98      -0.449     -2.48
Non-motorized Mode Choice Model
NM_TIME        -4.888    -20.42      -5.243    -14.98      -5.207    -18.21
f4                 --        --      -0.784     -3.08      -0.680     -5.63
Time-of-Day (Peak-Period Departure Choice) Model
Constant       -0.606    -11.07      -1.335    -10.84      -0.818     -7.01
WITH_KID        0.135      1.96       0.276      1.79       0.171      1.77
HHSIZE2         0.092      2.25       0.225      2.48       0.137      2.36
SHOPPING        0.356      5.68       0.759      5.37       0.464      4.74
SERVICE         0.444      4.17       0.911      3.73       0.558      3.45
HOME            0.349      8.10       0.752      7.64       0.457      5.93
LN_DISW         0.033      1.72       0.068      1.59       0.043      1.63
g1                 --        --       1.000        --       0.111      0.45
g2                 --        --       1.000        --       0.434      2.91
g3                 --        --       1.000        --       0.449      2.48
g4                 --        --       1.000        --       0.680      5.63


Table 4.29 offers the commuter model estimation results under the causal structure where binary departure time choice affects multinomial mode choice. All the exogenous variables take reasonable coefficients. In the recursive model, the endogenous dummy variable PEAK is significantly positive in both the SOV utility function and the Transit utility function. In the mixed model where $g_i$ is fixed at 1, the error correlations between the utility functions of SOV, HOV, Transit and Non-motorized and that of peak-period departure are -0.021, -0.099, -0.101 and -0.233, respectively. In the mixed model where $|f_i|$ is equal to $|g_i|$, the error correlations between the utility functions of SOV, HOV, Transit and Non-motorized and that of peak-period departure are -0.007, -0.102, -0.109 and -0.233, respectively. In both types of mixed models, the coefficient of PEAK becomes insignificant in the SOV utility function and significantly negative in the HOV utility function, while in the Transit utility function it remains significantly positive but somewhat smaller than in the recursive model. No considerable differences are found in the PEAK coefficients between the two types of mixed models.

4.3.4 Model Performance Comparisons Based on Non-nested Test

Table 4.30 compares the goodness-of-fit measures across the various types of model and causal structure. The non-nested test is employed to identify the dominant causal structure between mode choice and time-of-day choice. The causal structure in which time-of-day choice affects mode choice is rejected by the non-nested test in all types of models. Thus, it is relatively safe to conclude that both commuters and non-commuters are more likely to decide on mode first and then select the trip departure time conditional on the predetermined mode. This finding is consistent with that in Tringides
et al. (2004), where a recursive bivariate probit model is adopted. In addition, it is found that the mixed models generally fit the data better than the recursive models, and that the mixed models in which $|f_i| = |g_i|$ fit the data better than the mixed models in which $g_i$ is fixed at 1.

Table 4.30 Comparison of Goodness-of-fit of Timing-mode Choice Models

                                Non-Commuter Models           Commuter Models
                                Mode → Time   Time → Mode     Mode → Time   Time → Mode
Sample Size                              7984                          4619
LL at zero: LL(0)                    -16602.3                      -9604.94
LL at constant: LL(c)                -15539.8                      -9032.11
Recursive Model
  # of Parameters                     27            27              27            28
  LL at convergence (LL)        -12526.0      -12527.7        -7293.66      -7329.65
  ρ² at zero                      0.2455        0.2454          0.2406        0.2369
  Adj. ρ² at zero                 0.2439        0.2438          0.2378        0.2340
  ρ² at constant                  0.1939        0.1938          0.1925        0.1885
  Adj. ρ² at constant             0.1922        0.1921          0.1895        0.1854
  Non-nested Test (Prob.)            0.0001 (0.034)                0.0039 (0.000)
Mixed Models (gi is fixed at 1)
  # of Parameters                     31            31              31            32
  LL at convergence             -12486.1      -12504.2        -7264.35      -7321.25
  ρ² at zero                      0.2479        0.2468          0.2437        0.2378
  Adj. ρ² at zero                 0.2461        0.2450          0.2405        0.2344
  ρ² at constant                  0.1965        0.1953          0.1957        0.1894
  Adj. ρ² at constant             0.1945        0.1933          0.1923        0.1859
  Non-nested Test (Prob.)            0.0011 (0.000)                0.0060 (0.000)
Mixed Models (|fi| = |gi|)
  # of Parameters                     31            31              31            32
  LL at convergence             -12480.5      -12498.4        -7254.83      -7317.93
  ρ² at zero                      0.2483        0.2472          0.2447        0.2381
  Adj. ρ² at zero                 0.2464        0.2453          0.2414        0.2348
  ρ² at constant                  0.1969        0.1957          0.1968        0.1898
  Adj. ρ² at constant             0.1949        0.1937          0.1933        0.1862
  Non-nested Test (Prob.)            0.0011 (0.000)                0.0067 (0.000)

4.3.5 Discussions and Conclusions

Figures 4.3 and 4.4 summarize and illustrate the causal relationships between binary time-of-day choice and multinomial mode choice according to the mixed binary-multinomial choice model. The causal relationship rejected by the non-nested test is not dominant in the population but probably still exists in the population, thus both dominant
and non-dominant causal relationships are illustrated in the figures for comparison purposes. For non-commuters, transit riders are more likely to undertake trips in the peak period than users of other modes. One plausible explanation is that transit riders are not as sensitive to peak-period congestion as travelers using other modes. Switzerland provides excellent transit service in the peak period, which enables transit riders to pursue more non-work trips in the peak period than auto and non-motorized travelers. Conversely, non-commuters' peak-period non-work trips are more dependent on the transit mode. As expected, travel time by auto or non-motorized mode is highly sensitive to traffic congestion, so travelers prefer to use public transit for their peak-period non-work trips.

Similar to non-commuters, commuters' non-work trips by SOV and transit are more likely to be scheduled in the peak period. Commuters may prefer to schedule their non-work activities while driving on their commute, which serves as a reasonable explanation of why their non-work SOV trips are more likely to occur in the peak period. Similar to non-commuter transit riders, commuter transit riders are less sensitive to peak-period traffic congestion than commuters using alternative modes. Conversely, commuters' peak-period non-work trips are more dependent on the transit mode but less dependent on the HOV mode. Similar to non-commuters, commuters prefer to use public transit for their non-work trips in the peak period so as to avoid traffic congestion. In addition, commuters can rarely make non-work trips with passengers in the car if they undertake those trips on their commute, which explains why commuters' non-work trips are less likely to depend on HOV in the peak period. Tringides et al. (2004) found, using a recursive bivariate probit model, that SOV mode choice negatively affects peak-period trip departure and that peak-period trip departure negatively affects SOV mode choice
bivariate probit model. The mixed binary-multinomial choice model further explores this problem and finds that such negative effects are attributable to the positive dependency between transit usage and peak-period departure.

[Diagram nodes: Transit Mode Choice, HOV Mode Choice, SOV Mode Choice, Peak Period]
Note: Solid arrow represents positive impact
Figure 4.3 Diagram of Causal Relationship of Mixed Binary-multinomial Choice Models for Non-commuters

[Diagram nodes: Transit Mode Choice, HOV Mode Choice, SOV Mode Choice, Peak Period]
Note: Solid arrow represents positive impact and dashed arrow represents negative impact
Figure 4.4 Diagram of Causal Relationship of Mixed Binary-multinomial Choice Models for Commuters

In summary, this section points to a possible behavioral mechanism in which people tend to first make choices that are subject to constraints and then make choices that are less constrained. For both commuters and non-commuters, mode choice is
determined first because of possible modal availability constraints and greater departure time flexibility. People first decide on the mode and then determine the most suitable time for pursuing the non-work activity. These conclusions are reasonable and consistent with previous findings (Bhat, 1997).

New microsimulation models of travel and activity behavior attempt to predict travel and activity patterns at the level of the individual decision-maker or traveler. The development of such models calls for a deeper understanding of the causal decision mechanisms that govern travel and activity participation decisions. Two major elements of travel and activity behavior are departure time choice and mode choice, as planners would undoubtedly expect such advanced model systems to offer information about travel demand by mode and time of day. This study attempts to shed light on the relationship between these two elements of behavior by considering alternative formulations of joint model systems of departure time choice and mode choice for non-work trips. As departure time choice for work trips tends to be governed largely by work schedules and constraints, studies of work trip departure time choice have largely examined the issue with respect to traveler sensitivity to congestion, travel time reliability, and arrival/departure time window sizes. On the other hand, less attention has been paid to departure time choice for non-work trips, a growing segment of trip making that accounts for a larger share of trips at all times of day.

This section considers two alternative formulations of joint model systems, representing two possible causal relationships between departure time choice and mode choice for non-work trips. The analysis employs the Swiss household travel survey data collected in 2000. The model estimation effort was conducted separately for
commuters and non-commuters due to the different scheduling and time constraints under which these demographic groups make activity and travel decisions. Mode choice was treated as a multinomial choice among SOV, HOV, Transit and Non-motorized modes, and time-of-day choice as a binary choice between the peak period and the non-peak period. Under this scheme, the mixed binary-multinomial choice modeling framework was applied to estimate the model systems and clarify the direction of the causal relationships between these dimensions of behavior.

It is believed that people generally decide first on the choice variables that are more constrained. For both the commuter and non-commuter samples, it was found that the data better support the causal relationship where mode choice precedes departure time choice. These findings are consistent with the notion that choices on constrained dimensions are made first. Swiss people may be more mode constrained than time-of-day constrained due to modal availability issues, the need to engage in non-work activities that serve household members, and other household obligations (leading to more shared-ride trips). Models of activity and travel behavior should incorporate relationships such as those identified in this section to more accurately portray the decision mechanisms that may be driving travel patterns.

As with most research efforts of this type, limitations apply to this study and additional research is warranted. First and foremost, it must be recognized that identifying true causal relationships from a statistical analysis of revealed behavior data is extremely difficult. This study provides a framework by which alternative hypotheses regarding causal relationships can be tested, but true causal relationships may be best identified by collecting and analyzing behavioral process data
that capture information about the thought process behind a given decision or behavioral choice. Also, despite the best efforts of the authors, the research results may be sensitive to model specification and the choice of explanatory variables. Finally, additional research should examine whether the relationships found to be more suitable in this section extend to other data sets and geographical contexts.
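The non-nested comparison reported in Table 4.30 can be sketched numerically. The test formula is not restated in this section, so the code below assumes it is the adjusted likelihood ratio index bound of Ben-Akiva and Swait, $\Pr(\bar{\rho}_2^2 - \bar{\rho}_1^2 > z) \le \Phi\{-[-2 z\,LL(0) + (K_2 - K_1)]^{1/2}\}$, where $K_2$ and $K_1$ are the parameter counts of the better-fitting and the rejected model. Under that assumption, the recursive non-commuter entry of 0.0001 (0.034) is reproduced. The function names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal CDF via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def nonnested_bound(rho_diff, ll_zero, k_better, k_rejected):
    """Upper bound on the probability that the rejected model is the true one,
    given that the other model's adjusted rho-squared is higher by rho_diff.
    Assumed form: Ben-Akiva and Swait adjusted likelihood ratio index test."""
    arg = -2.0 * rho_diff * ll_zero + (k_better - k_rejected)
    if arg < 0:
        raise ValueError("bound undefined: negative argument under the square root")
    return normal_cdf(-math.sqrt(arg))


# Recursive models, Table 4.30: (difference in adj. rho^2 at zero, LL(0), K_better, K_rejected)
non_commuters = nonnested_bound(0.0001, -16602.3, 27, 27)
commuters = nonnested_bound(0.0039, -9604.94, 27, 28)

print(round(non_commuters, 3))
print(round(commuters, 3))
```

With the large sample sizes used here, even a difference of 0.0001 in the adjusted likelihood ratio index is enough to reject the Time → Mode structure at roughly the 3% level, which is why the differences in Table 4.30 are decisive despite looking small.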


Chapter Five: Conclusions and Discussions

5.1 Contributions to the Field

5.1.1 Methodological Contribution

On the methodological side, this dissertation proposes a simultaneous equations modeling methodology that integrates unordered discrete choices into the framework of structural equations model systems, which allows causal analysis between an unordered discrete variable and a continuous variable, or between two unordered discrete variables. Such a modeling methodology is highly desirable in travel behavior research, where many dependent variables of interest are unordered discrete in nature. The non-nested statistical test, previously applied to discrete choice models, has been extended to the joint discrete-continuous model system, so that alternative causal structures in discrete-continuous models can be compared and selected in a rigorous way.

In addition, this dissertation contributes to addressing the endogeneity problem in discrete choice models for the travel behavior research community. The IIA problem has been of much concern for many years, but the endogeneity problem has not received as much attention from transportation professionals as it deserves. Econometricians have invested great effort and made considerable advances in this research topic, exemplified by the semiparametric
method for robust estimation (Lewbel, 2000). Compared with these approaches, the modeling methodology proposed in this dissertation is more practical but less robust. 5.1.2 Behavioral Contribution The plausible causal relati onship among the activity and travel variables, regardless of continuous variab les or discrete choices, can be quantified in the proposed econometric modeling framework. By co mparing the goodnessof-fit measure of competing causal models, travel behavior rese archer may virtually identify the dominant causal relationship between activity timing a nd duration, trip chaining pattern and mode choice, trip departure time a nd mode choice. This modeli ng methodology allows travel behavior researcher better understand mechan ism of travelers decision-making process only through analyzing the reveal ed data which is available in most cases. In addition, the endogenous nature of activity and travel variables has been completely recognized in this dissertation, which corresponds to the co mprehensive correlations among travelers behavior. 5.1.3 Practical Contribution The causal relationship identified by th e proposed models can aid in the development of activity-based travel demand mode l system. It will guide the modelers to specify activity-based sub-models and to decide the application sequence of these submodels, such as activity timing model, activity duration model and mode choice model. Section 4.1 identifies the dominant causal relationship between trip chaining and mode choice is Tour Type Mode Choice, Section 4.3 id entifies the dominant causal 169

PAGE 181

relationship between mode choice and tim e-of-day choice is Mode Choice Time-ofDay Choice and the mixed model in Section 4.2 identifies that the dominant causal relationship between timing and dur ation is Time-of-Day Choice Duration. It is interesting and surprising to depict a uniquely dominant decision process that most Swiss people follow for pursuing non-work activities: Activity Sequence Mode Choice Timing Duration. This sequence of model a pplication is recommended according to the analytical results of this dissertation. 5.1.4 Empirical Contribution With the consideration of endogenous problem, the coefficient of endogenous variables can be more accurately estimated in the proposed simultaneous equations modeling system. In the proposed model, we may specify endogenous variables of interest, regardless of being continuous or discrete, into the mode l equation which can be linear regression model for continuous dependent variable or latent utility function for discrete choices. For example, variable pr icing policy may change time-of-day choices of freeway users or transit riders and policy makers ar e concerned about how these people change travel mode in re sponse to the change in time-of-day choices. In that case, endogenous variables indicating ti me-of-day choices are required to be specified into the mode choice model. The impact of endoge nous variables can be accurately estimated with the proposed modeling methodology, 170
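The comparison of competing causal structures described in Section 5.1 can be illustrated with a small sketch. It is not the extended discrete-continuous test developed in this dissertation; it assumes the classical discrete-choice form of the non-nested test (Ben-Akiva and Swait, 1984), in which the probability that the model with the lower adjusted likelihood ratio index is nonetheless the true one is bounded above by Φ(-[-2 z L(0) + (K2 - K1)]^(1/2)), where z is the difference in adjusted indices, L(0) is the log-likelihood with all coefficients at zero, and K1, K2 are the numbers of parameters. The function names and the example log-likelihood values are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def adjusted_rho_squared(ll_model, ll_zero, n_params):
    """Adjusted likelihood ratio index: 1 - (L(beta) - K) / L(0)."""
    return 1.0 - (ll_model - n_params) / ll_zero


def nonnested_bound(ll_a, k_a, ll_b, k_b, ll_zero):
    """Return (z, bound) for two competing non-nested models.

    z is the absolute difference in adjusted rho-squared; bound is the
    upper limit on the probability that the lower-index model is the
    true one (classical discrete-choice form, assumed here).
    """
    rho_a = adjusted_rho_squared(ll_a, ll_zero, k_a)
    rho_b = adjusted_rho_squared(ll_b, ll_zero, k_b)
    # Treat the model with the higher adjusted index as "model 2".
    if rho_b >= rho_a:
        z, k_diff = rho_b - rho_a, k_b - k_a
    else:
        z, k_diff = rho_a - rho_b, k_a - k_b
    arg = -2.0 * z * ll_zero + k_diff
    if arg < 0.0:
        raise ValueError("bound undefined: square-root argument is negative")
    return z, normal_cdf(-math.sqrt(arg))


# Hypothetical log-likelihoods for two causal structures, same K.
z, bound = nonnested_bound(-800.0, 10, -790.0, 10, -1000.0)
print(z, bound)
```

A very small bound suggests that the structure with the lower adjusted index can be rejected in favor of the other; a bound near 0.5 means the data do not discriminate between the two structures.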


5.2 Future Research Direction

The search for a robust and practical modeling methodology that completely solves the endogeneity problem in discrete choice models is far from over. As found in this dissertation, the estimated coefficient of an endogenous variable is highly sensitive to the assumed error structure. In the future, more robust modeling methodologies (semi-parametric or limited-information methods) need to be introduced into the travel behavior context, where the performance of those modeling approaches requires further exploration. In addition, various distributional assumptions on the error structures can be further explored in the mixed model framework. For example, the normal distribution assumption can be replaced with a log-normal distribution. The heterogeneity can also be non-parametrically and discretely distributed, similar to the assumption adopted by Mroz (1999). Considerable room remains for future research.

On the other hand, we remain enthusiastic about seeking a modeling methodology that allows the coexistence of two unidirectional causal relationships in a single model. The population is not homogeneous, and people behave according to different decision-making processes. The dominant causal structure identified by the proposed model must be unidirectional, and so cannot describe the behavior of everyone in the population. Ye et al. (2006) attempted to use a modeling approach called the simultaneous logit model (Schmidt and Strauss, 1975) to accommodate such a bidirectional causal relationship. However, this modeling methodology assumes a simultaneous causal relationship at the macroscopic level and cannot allow the coexistence of alternative unidirectional causal structures in one model. A desirable modeling methodology is expected not only to allow the coexistence of alternative unidirectional causal structures in a single model but also to identify the latent market segments that follow each causal structure. This remains an interesting but challenging topic for future research.

It must be noted that the causal relationships in this dissertation are extracted and examined from statistical relationships estimated on revealed outcome data. While such data provide insights into what people have done, they do not provide true insights into the decision mechanisms and behavioral processes underlying the revealed outcomes. One must exercise care when drawing inferences regarding behavioral causality from statistical indicators. In order to truly understand and identify causal relationships, data regarding the underlying behavioral processes and decision mechanisms are needed. Activity-travel scheduling surveys that collect data on underlying behavioral processes make it possible to study travel decisions in a robust framework. Such data would greatly help further explore the causal linkages among activity-travel variables, as well as the decision processes that govern activity-travel engagement patterns. Future research into the development of microsimulation models of activity and travel behavior should include attempts to collect and analyze such data.


References

Abkowitz, M. D. (1981). An analysis of the commuter departure time decision. Transportation 10, 283-297.

Aptech (2005). GAUSS 7.0. Aptech Systems, Maple Valley, Washington.

Arentze, T. and Timmermans, H. (2000). Albatross: A learning-based transportation oriented simulation system. EIRASS, Eindhoven University of Technology, The Netherlands.

Axhausen, K. and Garling, T. (1992). Activity-based approaches to travel analysis: conceptual frameworks, models and research problems. Transport Reviews 12, 493-517.

Ben-Akiva, M. E. and Swait, J. D. (1984). The Akaike likelihood ratio index. Transportation Science 20(2), 133-136.

Ben-Akiva, M. and Lerman, S. R. (1985). Discrete Choice Analysis: Theory and Application to Travel Demand. The MIT Press, Cambridge.

Bhat, C. R. (1995). A heteroscedastic extreme value model of intercity mode choice. Transportation Research B 29(6), 471-483.

Bhat, C. R. (1996). A hazard-based duration model of shopping activity with nonparametric baseline specification and nonparametric control for unobserved heterogeneity. Transportation Research B 30(1), 189-207.

Bhat, C. R. (1997). Work travel mode choice and number of non-work commute stops. Transportation Research B 31(1), 41-54.

Bhat, C. R. (1998a). Analysis of travel mode and departure time choice for urban shopping trips. Transportation Research B 32(6), 361-371.

Bhat, C. R. (1998b). A model of post-home arrival activity participation behavior. Transportation Research B 32(6), 387-400.

Bhat, C. R. (1998c). Accommodating flexible substitution patterns in multidimensional choice modeling: formulation and application to travel mode and departure time choice. Transportation Research B 32(7), 455-466.

Bhat, C. R. and Koppelman, F. S. (1999). A retrospective and prospective survey of time-use research. Transportation 26(2), 119-139.

Bhat, C. R. and Misra, R. (1999). Discretionary activity time allocation of individuals between in-home and out-of-home and between weekdays and weekends. Transportation 26(2), 193-229.

Bhat, C. R. and Singh, S. K. (2000). A comprehensive daily activity-travel generation model system for workers. Transportation Research A 34(1), 1-22.

Bhat, C. R. (2001a). Modeling the commute activity-travel pattern of workers: formulation and empirical analysis. Transportation Science 35(1), 61-79.

Bhat, C. R. (2001b). Quasi-random maximum simulated likelihood estimation of the mixed multinomial logit model. Transportation Research B 35, 677-693.

Bhat, C. R. (2004). A comprehensive econometric micro-simulator for daily activity-travel patterns (CEMDAP). CD-ROM of the 83rd Annual Meeting of the Transportation Research Board, National Research Council, Washington, D.C.

Bhat, C. R. and Sardesai, R. (2006). The impact of stop-making and travel time reliability on commute mode choice. Transportation Research B 40(9), 709-730.

Cambridge Systematics, Inc. (1997). Time-of-day modeling procedures: state-of-the-practice, state-of-the-art. Final Report, Travel Model Improvement Program, U.S. Department of Transportation, Washington, D.C.

Cox, D. (1961). Tests of separate families of hypotheses. Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, 1. University of California Press, Berkeley.

Cox, D. (1962). Further results on tests of separate families of hypotheses. Journal of the Royal Statistical Society B 24, 406-424.

Dubin, J. and McFadden, D. (1984). An econometric analysis of residential electric appliance holdings and consumption. Econometrica 52(2), 345-362.

Eluru, N. and Bhat, C. R. (2005). A joint econometric analysis of seat belt use and crash-related injury severity. Technical paper, Department of Civil Engineering, The University of Texas at Austin.

Fujii, S. and Kitamura, R. (2000). Evaluation of trip-inducing effects of new freeways using a structural equations model system of commuters' time use and travel. Transportation Research B 34(5), 339-354.

Goulias, K. G. (1997). Activity-based travel forecasting: what are some issues? In Texas Transportation Institute (ed.), Activity-Based Travel Forecasting Conference, June 2-5, 1996: Summary, Recommendations, and Compendium of Papers. Travel Model Improvement Program, U.S. Department of Transportation, Washington, D.C., 37-49.

Golob, T. F. (2000). A simultaneous model of household activity participation and trip chain generation. Transportation Research B 34(5), 355-376.

Golob, T. F. (2003). Structural equation modeling for travel behavior research. Transportation Research B 37(1), 1-25.

Greene, W. H. (1998). Gender economics courses in liberal arts colleges: further results. Journal of Economic Education 29(4), 291-300.

Greene, W. H. (2002). LIMDEP Version 8.0: User's Manual. Econometric Software, Inc., Plainview, NY.

Greene, W. H. (2003). Econometric Analysis, Fifth Edition. Pearson Education, Inc., NJ.

Hamed, M. and Mannering, F. L. (1993). Modeling travelers' postwork activity involvement: toward a new methodology. Transportation Science 27(4), 381-394.

Harvey, A. S. and Taylor, M. E. (2000). Activity settings and travel behavior: a social contact perspective. Transportation 32(1), 53-73.

Hensher, D. A. and Reyes, A. J. (2000). Trip chaining as a barrier to the propensity to use public transport. Transportation 27(4), 341-361.

Horowitz, J. L. (1983). Statistical comparison of non-nested probabilistic discrete choice models. Transportation Science 17(3), 319-350.

Hunt, J. D. and Patterson, D. M. (1996). A stated preference examination of time of travel choice for a recreational trip. Journal of Advanced Transportation 30(3), 17-44.

Kasturirangan, K., Pendyala, R. M. and Koppelman, F. S. (2002). Role of history in modeling daily activity frequency and duration for commuters. Transportation Research Record 1807, Journal of Transportation Research Board, National Research Council, Washington, D.C., 129-136.

Kitamura, R., Yamamoto, T., Fujii, S. and Sampath, S. (1996). A discrete-continuous analysis of time allocation to two types of discretionary activities which accounts for unobserved heterogeneity. In Lesort, J. B. (ed.), Transportation and Traffic Theory. Elsevier, Oxford, 431-453.

Kitamura, R., Chen, C., Pendyala, R. M. and Narayanan, R. (2000). Microsimulation of activity-travel patterns for travel demand forecasting. Transportation 27(1), 25-51.

Kumar, A. and Levinson, D. (1994). Temporal variations on allocation of time. Transportation Research Record 1439, Transportation Research Board, National Research Council, Washington, D.C., 118-127.

Lee, L. F. (1983). Generalized econometric models with selectivity. Econometrica 51(2), 507-512.

Levinson, D. and Kumar, A. (1995). Activity, travel, and the allocation of time. Journal of the American Planning Association 61(4), 458-470.

Lewbel, A. (2000). Semiparametric qualitative response model estimation with unknown heteroscedasticity or instrumental variables. Journal of Econometrics 97, 145-177.

Lockwood, P. B. and Demetsky, M. J. (1994). Non-work travel - a study of changing behavior. Presented at the 73rd Annual Meeting of the Transportation Research Board, January 9-13, Washington, D.C.

McFadden, D. L. (1973). Conditional logit analysis of qualitative choice behavior. In P. Zarembka (ed.), Frontiers in Econometrics. Academic Press.

McFadden, D. and Train, K. (2000). Mixed MNL models for discrete response. Journal of Applied Econometrics 15(5), 447-470.

Maddala, G. S. (1983). Limited dependent and qualitative variables in econometrics. Cambridge University Press, Cambridge.

Mahmassani, H. and Chang, G. L. (1985). Dynamic aspects of departure-time choice behavior in a commuting system: theoretical framework and experimental analysis. Transportation Research Record 1037, National Research Council, Washington, D.C., 88-101.

Mahmassani, H. and Stephan, D. (1988). Experimental investigation of route and departure time dynamics of urban commuters. Transportation Research Record 1203, National Research Council, Washington, D.C., 69-84.

Mannering, F. L., Murakami, E. and Kim, S. G. (1994). Models of travelers' activity choice and home-stay duration: analysis of functional form and temporal stability. Transportation 21(4), 371-392.

McGuckin, N. and Murakami, E. (1999). Examining trip-chaining behavior: a comparison of travel by men and women. Transportation Research Record 1693, National Research Council, Washington, D.C., 79-85.

Mroz, T. A. (1987). The sensitivity of an empirical model of married women's hours of work to economic and statistical assumptions. Econometrica 55, 765-799.

Mroz, T. A. (1999). Discrete factor approximations in simultaneous equation models: estimating the impact of a dummy endogenous variable on a continuous outcome. Journal of Econometrics 92, 233-274.

Muthen, B. (1979). A structural probit model with latent variables. Journal of the American Statistical Association 74(368), 807-811.

Noland, R. B. and Small, K. A. (1994). Travel-time uncertainty, departure time choice, and the cost of morning commutes. Transportation Research Record 1439, National Research Council, Washington, D.C., 150-158.

Pas, E. I. and Harvey, A. S. (1997). Time use research and travel demand analysis and modeling. In Stopher, P. R. and Lee-Gosselin, M. (eds.), Understanding Travel Behavior in an Era of Change. Elsevier, Oxford, 315-338.

Pendyala, R. M., Kitamura, R., Chen, C. and Pas, E. I. (1997). An activity-based micro-simulation analysis of transportation control measures. Transport Policy 4(3), 183-192.

Pendyala, R. M., Kitamura, R. and Reddy, D. V. G. P. (1998). Application of an activity-based travel demand model incorporating a rule-based algorithm. Environment and Planning B: Planning and Design 25, 753-772.

Pendyala, R. M., Yamamoto, T. and Kitamura, R. (2002). On the formulation of time space prisms to model constraints on personal activity-travel engagement. Transportation 29(1), 73-94.

Pendyala, R. M. and Bhat, C. R. (2004). An exploration of the relationship between timing and duration of maintenance activities. Transportation 31(4), 429-456.

Pendyala, R. M., Kitamura, R., Kikuchi, A., Yamamoto, T. and Fujii, S. (2005). FAMOS: the Florida activity mobility simulator. CD-ROM of the 84th Annual Meeting of the Transportation Research Board, National Research Council, Washington, D.C.

Rhine, S. L. W., Greene, W. H. and Toussaint-Comeau, M. (2006). The importance of check-cashing businesses to the unbanked: racial/ethnic differences. The Review of Economics and Statistics 88(1), 146-157.

Schmertmann, C. P. (1994). Selectivity bias correction methods in polychotomous sample selection models. Journal of Econometrics 60, 101-132.

Schmidt, P. and Strauss, R. P. (1975). Estimation of models with jointly dependent qualitative variables: a simultaneous logit approach. Econometrica 43(4), 745-755.

Shiftan, Y. (1998). Practical approach to model trip chaining. Transportation Research Record 1645, National Research Council, Washington, D.C., 17-23.

Steed, J. L. and Bhat, C. R. (2000). On modeling departure time choice for home-based social/recreational and shopping trips. Transportation Research Record 1706, National Research Council, Washington, D.C., 152-159.

Stopher, P. R. (1993). Deficiencies of travel forecasting methods relative to mobile emissions. ASCE Journal of Transportation Engineering 119(5), 723-741.

Strathman, J. G. and Dueker, K. J. (1994). Effect of household structure and selected characteristics on trip chaining. Transportation 21, 23-45.

Strathman, J. G. and Dueker, K. J. (1995). Understanding trip chaining, special reports on trip and vehicle attributes. 1990 NPTS Reports Series, Publication No. FHWA-PL-95-033. U.S. Department of Transportation, 1-1 ~ 1-27.

Train, K. (2002). Discrete choice methods with simulation. Cambridge University Press.

Tringides, C. A., Ye, X. and Pendyala, R. M. (2004). Departure-time choice and mode choice for non-work trips: alternative formulations of joint model systems. Transportation Research Record 1898, National Research Council, Washington, D.C., 1-9.

Walker, J. (2002). The mixed logit (or logit kernel) model: dispelling misconceptions of identification. Transportation Research Record 1805, National Research Council, Washington, D.C., 86-98.

Wang, J. J. (1996). Timing utility of daily activities and its impact on travel. Transportation Research A 30, 189-206.

Weiner, E. and Ducca, F. (1996). Upgrading travel demand forecasting capabilities: USDOT Travel Model Improvement Program. TR News 186, Transportation Research Board, National Research Council, Washington, D.C., 2-6.

Wen, C-H. and Koppelman, F. S. (2000). A conceptual and methodological framework for the generation of activity travel patterns. Transportation Research B 32(1), 5-23.

Vovsha, P. (1995). Application of cross-nested logit model to mode choice in Tel Aviv, Israel, Metropolitan Area. Transportation Research Record 1607, National Research Council, Washington, D.C., 6-15.

Yamamoto, T. and Kitamura, R. (1999). An analysis of time allocation to in-home and out-of-home discretionary activities across working days and non-working days. Transportation 26(2), 211-230.

Ye, X. and Pendyala, R. M. (2003). Description of the Switzerland Microcensus 2000 travel survey sample. Research Report prepared for Jenni + Gottardi AG, Department of Civil and Environmental Engineering, University of South Florida, Tampa, FL.

Ye, X., Pendyala, R. M. and Gottardi, G. (2004). An exploration of the relationship between auto mode choice and complexity of trip chaining patterns. Transportation Research B 41(1), 96-113.


Bibliography

Barnard, P. O. and Hensher, D. A. (1992). Joint estimation of a polychotomous discrete-continuous choice system: an analysis of the spatial distribution of retail expenditures. Journal of Transport Economics and Policy XXVI(3), 299-312.

Bhat, C. R. (2003). Simulation estimation of mixed discrete choice models using randomized and scrambled Halton sequences. Transportation Research B 37(9), 837-855.

Dissanayake, D. and Morikawa, T. (2002). Household travel behavior in developing countries: nested logit model of vehicle ownership, mode choice, and trip chaining. Transportation Research Record 1805, Journal of the Transportation Research Board, National Research Council, Washington, D.C., 45-52.

Doherty, S. T. and Miller, E. J. (2000). A computerized household activity scheduling survey. Transportation 27(1), 75-97.

Feller, W. (1971). An introduction to probability theory and its applications. Wiley, New York.

Geweke, J., Keane, M. and Runkle, D. (1994). Alternative computational approaches to inference in the multinomial probit model. Review of Economics and Statistics 76, 609-632.

Hajivassiliou, V., McFadden, D. and Ruud, P. (1996). Simulation of multivariate normal orthant probabilities: methods and programs. Journal of Econometrics 72, 85-134.

Hanemann, W. M. (1984). Discrete/continuous models of consumer demand. Econometrica 52, 541-561.

Keane, M. (1992). A note on identification in the multinomial probit model. Journal of Business and Economic Statistics 10, 193-200.

Keane, M. (1994). A computationally practical simulation estimator for panel data. Econometrica 62(1), 95-116.

Koppelman, F. S. and Pas, E. I. (1984). Estimation of disaggregate regression models of person trip generation with multiday data. Proceedings of the Ninth International Symposium on Transportation and Traffic Theory, Delft, the Netherlands.

Mannering, F. L. and Hensher, D. A. (1987). Discrete-continuous econometric models and their application to transport analysis. Transport Reviews 7(3), 227-244.

Ouyang, Y., Shankar, V. and Yamamoto, T. (2002). Modeling the simultaneity in injury causation in multi-vehicle collisions. Transportation Research Record 1784, National Research Council, Washington, D.C., 143-152.

Pendyala, R. M. and Ye, X. (2005). Contributions to understanding joint relationships among activity and travel variables. In H. Timmermans (ed.), Progress in Activity-Based Analysis. Pergamon, Elsevier Science Ltd., Oxford, UK, 1-24.


Appendices


Appendix A: Gauss Code for Generating and Storing Halton Sequences

/* The procedure is extracted from the codes for mixed logit model by Kenneth Train, Professor in Department of Economics at University of California, Berkeley */
/* halton(n,s) returns the first n Halton draws for prime base s as an n x 1 vector */
proc halton(n,s);
local phi,i,j,y,x,k;
k=floor(ln(n+1) ./ ln(s));
phi={0};
i=1;
do while i .le k;
  x=phi;
  j=1;
  do while j .lt s;
    y=phi+(j/s^i);
    x=x|y;
    j=j+1;
  endo;
  phi=x;
  i=i+1;
endo;
x=phi;
j=1;
do while j .lt s .and rows(x) .lt (n+1);
  y=phi+(j/s^i);
  x=x|y;
  j=j+1;
endo;
phi=x[2:(n+1),1];   /* drop the leading zero */
retp(phi);
endp;

n = 3.3e7;

/* Base 2: convert the Halton draws to standard normal draws and store them */
h1 = halton(n,2);
n1 = cdfni(h1);
outhalt = "c:\\gauss\\data\\n1";
let vnames = h1;
create f1 = ^outhalt with ^vnames, 0, 8;
if writer(f1, n1) /= n;
  print "Disk Full";
  end;
endif;
closeall f1;

/* Base 3 */
h2 = halton(n,3);
n2 = cdfni(h2);
outhalt = "c:\\gauss\\data\\n2";
let vnames = h2;
create f1 = ^outhalt with ^vnames, 0, 8;
if writer(f1, n2) /= n;
  print "Disk Full";
  end;
endif;
closeall f1;

/* Base 5 */
h3 = halton(n,5);
n3 = cdfni(h3);
outhalt = "c:\\gauss\\data\\n3";
let vnames = h3;
create f1 = ^outhalt with ^vnames, 0, 8;
if writer(f1, n3) /= n;
  print "Disk Full";
  end;
endif;
closeall f1;

/* Base 7 */
h4 = halton(n,7);
n4 = cdfni(h4);
outhalt = "c:\\gauss\\data\\n4";
let vnames = h4;
create f1 = ^outhalt with ^vnames, 0, 8;
if writer(f1, n4) /= n;
  print "Disk Full";
  end;
endif;
closeall f1;
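For readers without access to GAUSS, the Halton procedure above can be approximated with a short Python sketch. This is not a line-by-line port: it assumes the standard radical-inverse definition of the Halton sequence, which yields the same ordering as the procedure above (for base 2: 1/2, 1/4, 3/4, ...). The function names `halton` and `halton_normal` are illustrative, and the inverse normal transform from Python's `statistics.NormalDist` stands in for `cdfni`.

```python
from statistics import NormalDist


def halton(n, base):
    """Return the first n Halton draws in (0, 1) for a prime base."""
    seq = []
    for index in range(1, n + 1):
        value = 0.0
        frac = 1.0 / base
        i = index
        # Radical inverse: mirror the base-`base` digits of index
        # around the radix point.
        while i > 0:
            i, digit = divmod(i, base)
            value += digit * frac
            frac /= base
        seq.append(value)
    return seq


def halton_normal(n, base):
    """Map Halton draws to standard normal draws (analogue of cdfni)."""
    inv = NormalDist().inv_cdf
    return [inv(u) for u in halton(n, base)]


print(halton(3, 2))
print(halton_normal(3, 2))
```

Using a different prime base for each error component (2, 3, 5, and 7 in the appendix) keeps the draws for different components from coinciding.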


Appendix B: Gauss Code of Mixed Discrete-continuous Model (Exemplified by Non-commuter Model Where Time-of-day Choice Affects Activity Duration)

library maxlik;
N = 14970;
load data[N,119] = "C:\\PHD Dissertation\\Swiss\\timing_duration\\main_act_file_gauss.dat";
commuter = data[., 100];
filter_x = (commuter .== 0);
filter_x = miss(filter_x,0);
data = packr(data~filter_x);   /* keep non-commuter records only */
N = rows(data);
one = ones(rows(data),1);

/* define variables */
intnr = data[., 1];
hhnr = data[., 2];
tripnum = data[., 3];
/* Definition of variables from the dataset is tedious and excluded. */
ln_dur = data[., 115];
ampeak = data[., 116];
pmpeak = data[., 117];
midday = data[., 118];
offpeak = data[., 119];
age = age/100;
pmale = (sex .== 1);
age_sq = age.*age;
car_0 = (n_auto .== 0);
car_ge2 = (n_auto .>= 2);
hhsize1 = (hhsize .== 1);
low_inc = (hhincome .< 3 .and hhincome .> 0);
high_inc = (hhincome .>= 6);
y1 = ampeak;
y2 = pmpeak;
y3 = midday;
y4 = offpeak;
y = ln_dur;
xx1 = one~age~hhsize~low_inc~car_0;
xx2 = one~age~pmale~high_inc~car_0;
xx3 = one~age~pmale~car_0;
xx = one~age~age_sq~pmale~hhsize~high_inc~car_ge2~y1~y2~y3;
data = xx1~xx2~xx3~xx~y1~y2~y3~y4~y;

s_n = 100; /* number of the random seeds */

/* read the stored normal draws generated in Appendix A */
outhalt = "c:\\gauss\\data\\n1";
open fin = ^outhalt for read;
call seekr(fin,(1000));
as1 = readr(fin,s_n*N);
fin = close(fin);
outhalt = "c:\\gauss\\data\\n2";
open fin = ^outhalt for read;
call seekr(fin,(1000));
as2 = readr(fin,s_n*N);
fin = close(fin);
outhalt = "c:\\gauss\\data\\n3";
open fin = ^outhalt for read;
call seekr(fin,(1000));
as3 = readr(fin,s_n*N);
fin = close(fin);
outhalt = "c:\\gauss\\data\\n4";
open fin = ^outhalt for read;
call seekr(fin,(1000));
as4 = readr(fin,s_n*N);
fin = close(fin);
as1 = (reshape(as1,N,s_n));
as2 = (reshape(as2,N,s_n));
as3 = (reshape(as3,N,s_n));
as4 = (reshape(as4,N,s_n));

/* simulated log-likelihood */
proc lpr(b,z);
local xxx1,xxx2,xxx3,xxx4,xxx,u1,u2,u3,u4,p,p1,p2,p3,p4,ln_p,i,pd_sum,sigma,pd,d,u;
pd_sum = 0;
i = 1;
do while (i<=s_n);
  xxx1 = xx1~as1[.,i];
  xxx2 = xx2~as2[.,i];
  xxx3 = xx3~as3[.,i];
  xxx4 = as4[.,i];
  xxx = xx~as1[.,i]~as2[.,i]~as3[.,i]~as4[.,i];
  u1 = xxx1*b[1:cols(xxx1)];
  u2 = xxx2*b[cols(xxx1)+1:cols(xxx1~xxx2)];
  u3 = xxx3*b[cols(xxx1~xxx2)+1:cols(xxx1~xxx2~xxx3)];
  u4 = xxx4*b[cols(xxx1~xxx2~xxx3)+1:cols(xxx1~xxx2~xxx3~xxx4)];
  p1 = exp(u1)./(exp(u1) + exp(u2) + exp(u3) + exp(u4));
  p2 = exp(u2)./(exp(u1) + exp(u2) + exp(u3) + exp(u4));
  p3 = exp(u3)./(exp(u1) + exp(u2) + exp(u3) + exp(u4));
  p4 = exp(u4)./(exp(u1) + exp(u2) + exp(u3) + exp(u4));
  p = (p1.^y1).*(p2.^y2).*(p3.^y3).*(p4.^y4);
  u = y - xxx*b[cols(xxx1~xxx2~xxx3~xxx4)+1:cols(xxx1~xxx2~xxx3~xxx4~xxx)];
  sigma = b[cols(xxx1~xxx2~xxx3~xxx4~xxx)+1];
  d = (1/sigma)*pdfn(u/sigma);
  pd = p.*d;
  pd_sum = pd_sum + pd;
  i = i + 1;
endo;
retp(ln(pd_sum/s_n));
endp;

/* analytical gradient of the simulated log-likelihood */
proc lgd(b,z);
local xxx1,xxx2,xxx3,xxx4,xxx,u1,u2,u3,u4,p1,p2,p3,p4,g1,g2,g3,g4,p,g,i,p_sum,g_sum,
      gp_sum,pd_sum,gds_sum,gp,d,gds,pd,gd,d_gamma,d_sigma,gd_sum,sigma,u;
p_sum = 0;
g_sum = 0;
gd_sum = 0;
pd_sum = 0;
gds_sum = 0;
i = 1;
do while (i<=s_n);
  xxx1 = xx1~as1[.,i];
  xxx2 = xx2~as2[.,i];
  xxx3 = xx3~as3[.,i];
  xxx4 = as4[.,i];
  xxx = xx~as1[.,i]~as2[.,i]~as3[.,i]~as4[.,i];
  u1 = xxx1*b[1:cols(xxx1)];
  u2 = xxx2*b[cols(xxx1)+1:cols(xxx1~xxx2)];
  u3 = xxx3*b[cols(xxx1~xxx2)+1:cols(xxx1~xxx2~xxx3)];
  u4 = xxx4*b[cols(xxx1~xxx2~xxx3)+1:cols(xxx1~xxx2~xxx3~xxx4)];
  p1 = exp(u1)./(exp(u1) + exp(u2) + exp(u3) + exp(u4));
  p2 = exp(u2)./(exp(u1) + exp(u2) + exp(u3) + exp(u4));
  p3 = exp(u3)./(exp(u1) + exp(u2) + exp(u3) + exp(u4));
  p4 = exp(u4)./(exp(u1) + exp(u2) + exp(u3) + exp(u4));
  p = (p1.^y1).*(p2.^y2).*(p3.^y3).*(p4.^y4);
  g1 = (y1.*p1.*(1-p1) + y2.*p2.*(-p1) + y3.*p3.*(-p1) + y4.*p4.*(-p1)).*xxx1;
  g2 = (y1.*p1.*(-p2) + y2.*p2.*(1-p2) + y3.*p3.*(-p2) + y4.*p4.*(-p2)).*xxx2;
  g3 = (y1.*p1.*(-p3) + y2.*p2.*(-p3) + y3.*p3.*(1-p3) + y4.*p4.*(-p3)).*xxx3;
  g4 = (y1.*p1.*(-p4) + y2.*p2.*(-p4) + y3.*p3.*(-p4) + y4.*p4.*(1-p4)).*xxx4;
  g = g1~g2~g3~g4;
  u = y - xxx*b[cols(xxx1~xxx2~xxx3~xxx4)+1:cols(xxx1~xxx2~xxx3~xxx4~xxx)];
  sigma = b[cols(xxx1~xxx2~xxx3~xxx4~xxx)+1];
  d = (1/sigma)*pdfn(u/sigma);
  pd = p.*d;
  pd_sum = pd_sum + pd;
  gd = g.*d;
  gd_sum = gd_sum + gd;
  d_gamma = u/(sigma^2).*d.*xxx;
  d_sigma = d.*(u.^2/sigma^3 - 1/sigma);
  gds = (d_gamma~d_sigma).*p;
  gds_sum = gds_sum + gds;
  i = i + 1;
endo;
gp = gd_sum./pd_sum;
gds = gds_sum./pd_sum;
retp(gp~gds);
endp;

_max_GradProc = &lgd;
_max_CovPar = 1;
b0 = zeros(cols(xx1~xx2~xx3~xx)+8,1)|1;
_max_parnames = "one"|"age"|"hhsize"|"low_inc"|"car_0"|"f1"|
                "one"|"age"|"pmale"|"high_inc"|"car_0"|"f2"|
                "one"|"age"|"pmale"|"car_0"|"f3"|"f4"|
                "one"|"age"|"age_sq"|"pmale"|"hhsize"|"high_inc"|"car_ge2"|"ampeak"|"pmpeak"|"midday"|"g1"|"g2"|"g3"|"g4"|"sigma";
_max_Active = 1|1|1|1|1|1|
              1|1|0|1|1|1|
              1|1|0|1|1|0|
              1|1|1|1|1|1|0|1|1|1|1|1|1|0|1;
{b,f,g,cov,ret} = maxlik(data,0,&lpr,b0);
call maxprt(b,f,g,cov,ret);
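The core of the `lpr` procedure above, the simulated likelihood contribution of a single observation, can be sketched in Python as follows. This is an illustrative reading of the GAUSS code under stated assumptions, not a replacement for it: each alternative's utility receives one normal error component scaled by a loading (`f1` to `f4` in the parameter names), the same components enter the continuous equation with their own loadings (`g1` to `g4`), and the likelihood is the average over draws of the chosen-alternative logit probability times the normal density of the continuous residual. The function and argument names are hypothetical.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def simulated_loglik_obs(v, chosen, y, mu, f, g, sigma, draws):
    """Simulated log-likelihood of one observation.

    v      : systematic utilities of the J alternatives
    chosen : index of the chosen alternative
    y, mu  : observed continuous outcome and its systematic part
    f, g   : loadings of the J error components in the choice and
             continuous equations, respectively
    sigma  : standard deviation of the remaining continuous error
    draws  : list of R draws, each a list of J standard normal values
    """
    total = 0.0
    for a in draws:
        utils = [v[j] + f[j] * a[j] for j in range(len(v))]
        top = max(utils)  # subtract the maximum to avoid overflow
        expu = [math.exp(u - top) for u in utils]
        p = expu[chosen] / sum(expu)
        resid = y - mu - sum(gj * aj for gj, aj in zip(g, a))
        d = normal_pdf(resid / sigma) / sigma
        total += p * d
    return math.log(total / len(draws))


# Two hypothetical draws for four alternatives.
draws = [[0.3, -0.5, 1.1, 0.0], [-0.3, 0.5, -1.1, 0.0]]
print(simulated_loglik_obs([0.2, 0.0, -0.1, 0.0], 0, 1.5, 1.0,
                           [1.0, 1.0, 1.0, 0.0], [0.2, 0.1, 0.0, 0.0],
                           1.0, draws))
```

When all loadings are zero the expression collapses to an independent multinomial logit probability times a normal regression density, which is a convenient check on any implementation.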


Appendix C: Gauss Code of Discrete-continuous Model Based on Lee Transformation (Exemplified by Non-commuter Model Where Time-of-day Choice Affects Activity Duration)

library maxlik;
N = 14970;
load data[N,119] = "C:\\PHD Dissertation\\Swiss\\timing_duration\\main_act_file_gauss.dat";
commuter = data[., 100];
filter_x = (commuter .== 0);
filter_x = miss(filter_x,0);
data = packr(data~filter_x);   /* keep non-commuter records only */
N = rows(data);
one = ones(rows(data),1);

/* define variables */
intnr = data[., 1];
hhnr = data[., 2];
tripnum = data[., 3];
/* Definition of variables from the dataset is tedious and excluded. */
ln_dur = data[., 115];
ampeak = data[., 116];
pmpeak = data[., 117];
midday = data[., 118];
offpeak = data[., 119];
age = age/100;
pmale = (sex .== 1);
age_sq = age.*age;
car_0 = (n_auto .== 0);
car_ge2 = (n_auto .>= 2);
hhsize1 = (hhsize .== 1);
low_inc = (hhincome .< 3 .and hhincome .> 0);
high_inc = (hhincome .>= 6);
y1 = ampeak;
y2 = pmpeak;
y3 = midday;
y4 = offpeak;
y = ln_dur;
xx1 = one~age~hhsize~low_inc~car_0;
xx2 = one~age~pmale~high_inc~car_0;
xx3 = one~age~pmale~car_0;
xx = one~age~age_sq~pmale~hhsize~high_inc~car_ge2~y1~y2~y3;
data = xx1~xx2~xx3~xx~y1~y2~y3~y4~y;

/* log-likelihood based on the Lee transformation */
proc lpr(b,z);
local ut1,ut2,ut3,p1,p2,p3,p4,p,r1,r2,r3,r4,sigma,l,bb1,bb2,bb3,bb4,pp1,pp2,pp3,pp4;
ut1 = xx1*b[1:cols(xx1)];
ut2 = xx2*b[cols(xx1)+1:cols(xx1~xx2)];
ut3 = xx3*b[cols(xx1~xx2)+1:cols(xx1~xx2~xx3)];
p1 = exp(ut1)./(exp(ut1) + exp(ut2) + exp(ut3) + 1);
p2 = exp(ut2)./(exp(ut1) + exp(ut2) + exp(ut3) + 1);
p3 = exp(ut3)./(exp(ut1) + exp(ut2) + exp(ut3) + 1);
p4 = 1 - p1 - p2 - p3;
r1 = b[cols(xx1~xx2~xx3~xx)+1];
r2 = b[cols(xx1~xx2~xx3~xx)+2];
r3 = b[cols(xx1~xx2~xx3~xx)+3];
r4 = b[cols(xx1~xx2~xx3~xx)+4];
sigma = b[cols(xx1~xx2~xx3~xx)+5];
l = (y - xx*b[cols(xx1~xx2~xx3)+1:cols(xx1~xx2~xx3~xx)])/sigma;
bb1 = (cdfni(p1) - r1*l)./sqrt(1-r1^2);
bb2 = (cdfni(p2) - r2*l)./sqrt(1-r2^2);
bb3 = (cdfni(p3) - r3*l)./sqrt(1-r3^2);
bb4 = (cdfni(p4) - r4*l)./sqrt(1-r4^2);
pp1 = (1/sigma/sqrt(2*pi))*exp(-l.^2/2).*cdfn(bb1);
pp2 = (1/sigma/sqrt(2*pi))*exp(-l.^2/2).*cdfn(bb2);
pp3 = (1/sigma/sqrt(2*pi))*exp(-l.^2/2).*cdfn(bb3);
pp4 = (1/sigma/sqrt(2*pi))*exp(-l.^2/2).*cdfn(bb4);
p = y1.*pp1 + y2.*pp2 + y3.*pp3 + y4.*pp4;
retp(ln(p));
endp;

_max_CovPar = 1;
b0 = zeros(cols(xx1~xx2~xx3~xx)+4,1)|1;
_max_parnames = "one"|"age"|"hhsize"|"low_inc"|"car_0"|
                "one"|"age"|"pmale"|"high_inc"|"car_0"|
                "one"|"age"|"pmale"|"car_0"|
                "one"|"age"|"age_sq"|"pmale"|"hhsize"|"high_inc"|"car_ge2"|
                "ampeak"|"pmpeak"|"midday"|"r1"|"r2"|"r3"|"r4"|"sigma";
_max_Active = 1|1|1|1|1|
              1|1|1|1|1|
              1|1|1|1|
              1|1|1|1|1|1|0|1|1|1|1|1|1|1|1;
{b,f,g,cov,ret} = maxlik(data,0,&lpr,b0);
call maxprt(b,f,g,cov,ret);
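The per-observation likelihood computed in the `lpr` procedure above can be sketched in Python as follows. The sketch assumes the form suggested by the GAUSS code: the logit probability of the chosen alternative is mapped to a standard normal quantile, correlated with the standardized continuous residual through a coefficient `rho` (`r1` to `r4` above), and the joint likelihood is the residual density times the conditional normal probability. The sign on `rho` follows the reconstruction used in the listing and should be treated as an assumption; the function name is hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def lee_loglik_obs(p_chosen, y, mu, sigma, rho):
    """Log-likelihood of one observation under the Lee transformation.

    p_chosen : marginal logit probability of the chosen alternative
    y, mu    : observed continuous outcome and its systematic part
    sigma    : standard deviation of the continuous error
    rho      : correlation between the transformed choice error and
               the continuous error, with -1 < rho < 1
    """
    l = (y - mu) / sigma
    b = (_STD.inv_cdf(p_chosen) - rho * l) / math.sqrt(1.0 - rho * rho)
    return math.log(_STD.pdf(l) / sigma * _STD.cdf(b))


print(lee_loglik_obs(0.25, 2.0, 1.0, 1.0, 0.5))
```

With `rho` equal to zero the conditional probability reduces to the marginal logit probability, so the likelihood factors into independent choice and regression parts; a nonzero `rho` is what captures the shared unobserved factors between the two equations.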


Appendix D: Gauss Code of Mixed Binary-multinomial Choice Model (g i Fixed at 1, Exemplified by Non-commuter Model Where Binary Time-of-day Choice Affects Multinomial Mode Choice) library maxlik; N= 12939; load data[N,139] = "C:\\PHD Dissert ation\\Swiss\\tod_mode\\mode_tod.dat"; commuter = data[., 105 ]; filter_x = (commuter .== 0 ); filter_x = miss(filter_x,0); data = packr(data~filter_x); N = rows(data); one = ones(rows(data),1); /* define variables */ intnr = data[., 1 ]; hhnr = data[., 2 ]; tripnum = data[., 3 ]; /* Definition of variables from th e dataset is tedious and excluded. */ ampeak = data[., 136 ]; pmpeak = data[., 137 ]; midday= data[., 138 ]; offpeak = data[., 139 ]; old = (age .> 60); y1 = sov; y2 = hov; y3 = transit; y4 = nmotor; z1 = (ampeak .or pmpeak) ; z2 = (midday .or offpeak ); peak = z1; nm_time = (time_bic+time_wk)/2 ; age = age/100; pmale = (sex.==1); age_sq = age.*age; car_0 =(n_auto.==0) ; car_ge2 = (n_auto .>=2); hhsize1 = (hhsize .==1 ); trst_sub = ((o_sub .== 1) .or (g_sub.==1)); shopping = (purpose .== 3); leisure = (purpose .== 4); 193

service = (purpose .== 5);
home = (purpose .== 7);
swiss = (add_swit .== 1);
hhsize2 = (hhsize .== 2);

s_n = 100; /* number of the random seeds */

outhalt = "c:\\gauss\\data\\n1";
open fin = ^outhalt for read;
call seekr(fin,(1000));
as1 = readr(fin,s_n*N);
fin = close(fin);

outhalt = "c:\\gauss\\data\\n2";
open fin = ^outhalt for read;
call seekr(fin,(1000));
as2 = readr(fin,s_n*N);
fin = close(fin);

outhalt = "c:\\gauss\\data\\n3";
open fin = ^outhalt for read;
call seekr(fin,(1000));
as3 = readr(fin,s_n*N);
fin = close(fin);

outhalt = "c:\\gauss\\data\\n4";
open fin = ^outhalt for read;
call seekr(fin,(1000));
as4 = readr(fin,s_n*N);
fin = close(fin);

as1 = (reshape(as1,N,s_n));
as2 = (reshape(as2,N,s_n));
as3 = (reshape(as3,N,s_n));
as4 = (reshape(as4,N,s_n));

xx1 = one~(car_time/100)~termtime~pklot_d~car_0~car_ge2~z1;
xx2 = one~(car_time/100)~termtime~pklot_d~car_0~car_ge2~z1;
xx3 = one~(iveh/100)~(owt/100)~freq~trst_sub~z1;
xx4 = (nm_time/100);
zz1 = one~old~hhsize2~swiss~shopping~service;

proc lpr(b,z);

    local u1,u2,u3,u4,v1,v2,v3,v4,xxx1,xxx2,xxx3,xxx4,zzz1,zzz2,zzz3,zzz4,
          p,p_u,p_v,pu1,pu2,pu3,pu4,pv1,pv2,pv3,pv4,ln_p,i,puv_sum,p_uv;

    puv_sum = 0;
    i = 1;
    do while (i<=s_n);
        /* append the i-th draw of each error component as an extra regressor */
        xxx1 = xx1~as1[.,i];
        xxx2 = xx2~as2[.,i];
        xxx3 = xx3~as3[.,i];
        xxx4 = xx4~as4[.,i];
        zzz1 = zz1~as1[.,i]~as2[.,i]~as3[.,i]~as4[.,i];

        u1 = xxx1*b[1:cols(xxx1)];
        u2 = xxx2*b[cols(xxx1)+1:cols(xxx1~xxx2)];
        u3 = xxx3*b[cols(xxx1~xxx2)+1:cols(xxx1~xxx2~xxx3)];
        u4 = xxx4*b[cols(xxx1~xxx2~xxx3)+1:cols(xxx1~xxx2~xxx3~xxx4)];

        /* multinomial logit probabilities of the four modes */
        pu1 = exp(u1)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        pu2 = exp(u2)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        pu3 = exp(u3)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        pu4 = exp(u4)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        p_u = (pu1.^y1).*(pu2.^y2).*(pu3.^y3).*(pu4.^y4);

        /* binary probit probability of the time-of-day choice */
        v1 = zzz1*b[cols(xxx1~xxx2~xxx3~xxx4)+1:cols(xxx1~xxx2~xxx3~xxx4~zzz1)];
        pv1 = cdfn(v1);
        pv2 = cdfn(-v1);
        p_v = (pv1.^z1).*(pv2.^z2);

        p_uv = p_u.*p_v;
        puv_sum = puv_sum + p_uv;
        i = i + 1;
    endo;

    /* simulated log-likelihood: average of the joint probability over the s_n draws */
    retp(ln(puv_sum/s_n));
endp;

proc lgd(b,z);
    local xxx1,xxx2,xxx3,xxx4,zzz1,zzz2,zzz3,zzz4,
          u1,u2,u3,u4,v1,v2,v3,v4,pu1,pu2,pu3,pu4,pv1,pv2,pv3,pv4,p,g,i,g_u,g_v,gu_sum,
          gv_sum,puv_sum,p_u,p_v,p_uv,gu_1,gu_2,gu_3,gu_4,gv_1,gv_2,gv_3,gv_4,gu,gv;

    gu_sum = 0;
    gv_sum = 0;
    puv_sum = 0;
    i = 1;
    do while (i<=s_n);
        xxx1 = xx1~as1[.,i];
        xxx2 = xx2~as2[.,i];
        xxx3 = xx3~as3[.,i];
        xxx4 = xx4~as4[.,i];

        u1 = xxx1*b[1:cols(xxx1)];
        u2 = xxx2*b[cols(xxx1)+1:cols(xxx1~xxx2)];
        u3 = xxx3*b[cols(xxx1~xxx2)+1:cols(xxx1~xxx2~xxx3)];
        u4 = xxx4*b[cols(xxx1~xxx2~xxx3)+1:cols(xxx1~xxx2~xxx3~xxx4)];

        pu1 = exp(u1)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        pu2 = exp(u2)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        pu3 = exp(u3)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        pu4 = exp(u4)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        p_u = (pu1.^y1).*(pu2.^y2).*(pu3.^y3).*(pu4.^y4);

        zzz1 = zz1~as1[.,i]~as2[.,i]~as3[.,i]~as4[.,i];
        v1 = zzz1*b[cols(xxx1~xxx2~xxx3~xxx4)+1:cols(xxx1~xxx2~xxx3~xxx4~zzz1)];
        pv1 = cdfn(v1);
        pv2 = cdfn(-v1);
        p_v = (pv1.^z1).*(pv2.^z2);
        p_uv = p_u.*p_v;

        /* derivatives of the chosen-mode logit probability w.r.t. each utility's coefficients */
        gu_1 = (y1.*pu1.*(1-pu1)+y2.*pu2.*(-pu1)+y3.*pu3.*(-pu1)+y4.*pu4.*(-pu1)).*xxx1;
        gu_2 = (y1.*pu1.*(-pu2)+y2.*pu2.*(1-pu2)+y3.*pu3.*(-pu2)+y4.*pu4.*(-pu2)).*xxx2;
        gu_3 = (y1.*pu1.*(-pu3)+y2.*pu2.*(-pu3)+y3.*pu3.*(1-pu3)+y4.*pu4.*(-pu3)).*xxx3;
        gu_4 = (y1.*pu1.*(-pu4)+y2.*pu2.*(-pu4)+y3.*pu3.*(-pu4)+y4.*pu4.*(1-pu4)).*xxx4;
        g_u = (gu_1~gu_2~gu_3~gu_4).*p_v;

        /* derivative of the probit probability w.r.t. the probit coefficients */
        gv_1 = (z1.*pdfn(v1) - z2.*pdfn(v1)).*zzz1;

        g_v = (gv_1).*p_u;

        gu_sum = gu_sum + g_u;
        gv_sum = gv_sum + g_v;
        puv_sum = puv_sum + p_uv;
        i = i + 1;
    endo;

    gu = gu_sum./puv_sum;
    gv = gv_sum./puv_sum;
    retp(gu~gv);
endp;

b0 = zeros(cols(xx1~xx2~xx3~xx4~zz1)+4,1)|1|1|1|1;
_max_GradProc = &lgd;
_max_Active = 1|1|1|1|1|1|1|1|
              1|1|1|1|1|1|1|0|
              1|1|1|1|1|1|1|
              1|1|
              1|1|0|1|1|1|
              0|0|0|0;
_max_CovPar = 1;
_max_parnames = "CONS_1"|"CARTIME1"|"TERMTIM1"|"PKLOT_D1"|"CAR_01"|"CAR_GE1"|"peak"|"f1"|
                "CONS_2"|"CARTIME2"|"TERMTIM2"|"PKLOT_D2"|"CAR_02"|"CAR_GE2"|"peak"|"f2"|
                "CONS_3"|"IVEH3"|"OWT3"|"FREQ3"|"TRST_SUB"|"peak"|"f3"|
                "nm_time"|"f4"|
                "CONS_1"|"old"|"hhsize2"|"swiss"|"shopping"|"service"|
                "g11"|"g12"|"g13"|"g14";
{b,f,g,cov,ret} = maxlik(data,0,&lpr,b0);
call maxprt(b,f,g,cov,ret);
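The simulated log-likelihood that lpr evaluates in this appendix can be illustrated for a single observation with a short Python sketch. It is a simplified restatement under stated assumptions, not the estimation code: the function name simulated_loglik and its arguments are hypothetical, the systematic utilities are passed in directly, and the normal draws are supplied by the caller, whereas the GAUSS listing reads them from the files n1 to n4.

```python
import math
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal CDF, the counterpart of GAUSS cdfn


def simulated_loglik(u_fixed, v_fixed, f, g, chosen_mode, chose_peak, draws):
    """Simulated log-likelihood of one observation.

    u_fixed     : systematic utilities of the four modes (hypothetical inputs)
    v_fixed     : systematic part of the probit index
    f, g        : loadings of the four error components in the mode
                  utilities and in the probit index, respectively
    chosen_mode : index 0..3 of the observed mode
    chose_peak  : True when z1 = 1 (peak period chosen)
    draws       : list of 4-tuples of standard normal draws, one per replication
    """
    total = 0.0
    for eta in draws:
        # Mode utilities: each mode has its own error component.
        u = [u_fixed[j] + f[j] * eta[j] for j in range(4)]
        expu = [math.exp(x) for x in u]
        p_u = expu[chosen_mode] / sum(expu)
        # Probit index shares the same four error components.
        v = v_fixed + sum(g[j] * eta[j] for j in range(4))
        p_v = _PHI(v) if chose_peak else _PHI(-v)
        total += p_u * p_v
    # Average the joint probability over the draws, then take the log.
    return math.log(total / len(draws))
```

Setting all loadings in f and g to zero removes the shared error components, so the result reduces to the log of an ordinary logit probability times an ordinary probit probability, whatever draws are supplied.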

Appendix E: Gauss Code of Mixed Binary-multinomial Choice Model (|f_i| = |g_i|, Exemplified by Non-commuter Model Where Binary Time-of-day Choice Affects Multinomial Mode Choice)

library maxlik;

N = 12939;
load data[N,139] = "C:\\PHD Dissertation\\Swiss\\tod_mode\\mode_tod.dat";
commuter = data[.,105];
filter_x = (commuter .== 0);
filter_x = miss(filter_x,0);
data = packr(data~filter_x);
N = rows(data);
one = ones(rows(data),1);

/* define variables */
intnr = data[.,1];
hhnr = data[.,2];
tripnum = data[.,3];
/* Definition of variables from the dataset is tedious and excluded. */
sov = data[.,132];
hov = data[.,133];
transit = data[.,134];
nmotor = data[.,135];
ampeak = data[.,136];
pmpeak = data[.,137];
midday = data[.,138];
offpeak = data[.,139];

old = (age .> 60);
y1 = sov;
y2 = hov;
y3 = transit;
y4 = nmotor;
z1 = (ampeak .or pmpeak);
z2 = (midday .or offpeak);
with_kid = ((hhsize - hh6plus) .> 0);
nm_time = (time_bic+time_wk)/2;
age = age/100;
pmale = (sex .== 1);
age_sq = age.*age;
car_0 = (n_auto .== 0);
car_ge2 = (n_auto .>= 2);
hhsize1 = (hhsize .== 1);

trst_sub = ((o_sub .== 1) .or (g_sub .== 1));
shopping = (purpose .== 3);
leisure = (purpose .== 4);
service = (purpose .== 5);
home = (purpose .== 7);
swiss = (add_swit .== 1);
hhsize2 = (hhsize .== 2);

s_n = 100; /* number of the random seeds */

outhalt = "c:\\gauss\\data\\n1";
open fin = ^outhalt for read;
call seekr(fin,(1000));
as1 = readr(fin,s_n*N);
fin = close(fin);

outhalt = "c:\\gauss\\data\\n2";
open fin = ^outhalt for read;
call seekr(fin,(1000));
as2 = readr(fin,s_n*N);
fin = close(fin);

outhalt = "c:\\gauss\\data\\n3";
open fin = ^outhalt for read;
call seekr(fin,(1000));
as3 = readr(fin,s_n*N);
fin = close(fin);

outhalt = "c:\\gauss\\data\\n4";
open fin = ^outhalt for read;
call seekr(fin,(1000));
as4 = readr(fin,s_n*N);
fin = close(fin);

as1 = (reshape(as1,N,s_n));
as2 = (reshape(as2,N,s_n));
as3 = (reshape(as3,N,s_n));
as4 = (reshape(as4,N,s_n));

xx1 = one~(car_time/100)~termtime~pklot_d~car_0~car_ge2~z1;
xx2 = one~(car_time/100)~termtime~pklot_d~car_0~car_ge2~z1;
xx3 = one~(iveh/100)~(owt/100)~freq~trst_sub~z1;
xx4 = (nm_time/100);
zz1 = one~old~hhsize2~swiss~shopping~service;

proc lpr(b,z);
    local u1,u2,u3,u4,v1,v2,v3,v4,xxx1,xxx2,xxx3,xxx4,zzz1,zzz2,zzz3,zzz4,
          p,p_u,p_v,pu1,pu2,pu3,pu4,pv1,pv2,pv3,pv4,ln_p,i,puv_sum,p_uv,
          b_eta1,b_eta2,b_eta3,b_eta4,b_eta;

    puv_sum = 0;
    i = 1;
    do while (i<=s_n);
        /* mode utilities: the last four elements of b load the error components */
        u1 = xx1*b[1:cols(xx1)] + as1[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+1];
        u2 = xx2*b[cols(xx1)+1:cols(xx1~xx2)] + as2[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+2];
        u3 = xx3*b[cols(xx1~xx2)+1:cols(xx1~xx2~xx3)] + as3[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+3];
        u4 = xx4*b[cols(xx1~xx2~xx3)+1:cols(xx1~xx2~xx3~xx4)] + as4[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+4];

        pu1 = exp(u1)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        pu2 = exp(u2)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        pu3 = exp(u3)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        pu4 = exp(u4)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        p_u = (pu1.^y1).*(pu2.^y2).*(pu3.^y3).*(pu4.^y4);

        /* probit index: same loadings, entering with signs +, -, -, - */
        v1 = zz1*(b[cols(xx1~xx2~xx3~xx4)+1:cols(xx1~xx2~xx3~xx4~zz1)])
             + as1[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+1]
             - as2[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+2]
             - as3[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+3]
             - as4[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+4];
        pv1 = cdfn(v1);
        pv2 = cdfn(-v1);
        p_v = (pv1.^z1).*(pv2.^z2);

        p_uv = p_u.*p_v;
        puv_sum = puv_sum + p_uv;
        i = i + 1;
    endo;

    retp(ln(puv_sum/s_n));
endp;

proc lgd(b,z);
    local xxx1,xxx2,xxx3,xxx4,zzz1,zzz2,zzz3,zzz4,
          u1,u2,u3,u4,v1,v2,v3,v4,pu1,pu2,pu3,pu4,pv1,pv2,pv3,pv4,p,g,i,g_u,g_v,

          gu_sum,gv_sum,puv_sum,p_u,p_v,p_uv,gu_1,gu_2,gu_3,gu_4,gv_1,gv_2,gv_3,gv_4,gu,
          gv,b_eta1,b_eta2,b_eta3,b_eta4,b_eta,g_cdf,g_et,guet,gvet,guet_1,guet_2,
          guet_3,guet_4,gvet_1,gvet_2,gvet_3,gvet_4,get_sum,get,g1,g2,g3,g4;

    gu_sum = 0;
    gv_sum = 0;
    puv_sum = 0;
    get_sum = 0;
    i = 1;
    do while (i<=s_n);
        u1 = xx1*b[1:cols(xx1)] + as1[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+1];
        u2 = xx2*b[cols(xx1)+1:cols(xx1~xx2)] + as2[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+2];
        u3 = xx3*b[cols(xx1~xx2)+1:cols(xx1~xx2~xx3)] + as3[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+3];
        u4 = xx4*b[cols(xx1~xx2~xx3)+1:cols(xx1~xx2~xx3~xx4)] + as4[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+4];

        pu1 = exp(u1)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        pu2 = exp(u2)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        pu3 = exp(u3)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        pu4 = exp(u4)./(exp(u1)+exp(u2)+exp(u3)+exp(u4));
        p_u = (pu1.^y1).*(pu2.^y2).*(pu3.^y3).*(pu4.^y4);

        v1 = zz1*(b[cols(xx1~xx2~xx3~xx4)+1:cols(xx1~xx2~xx3~xx4~zz1)])
             + as1[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+1]
             - as2[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+2]
             - as3[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+3]
             - as4[.,i]*b[cols(xx1~xx2~xx3~xx4~zz1)+4];
        pv1 = cdfn(v1);
        pv2 = cdfn(-v1);
        p_v = (pv1.^z1).*(pv2.^z2);
        p_uv = p_u.*p_v;

        /* derivative of the probit probability w.r.t. the index v1 */
        g_cdf = (z1.*pdfn(v1) - z2.*pdfn(v1));

        /* derivatives of the chosen-mode logit probability w.r.t. each utility's coefficients */
        gu_1 = (y1.*pu1.*(1-pu1)+y2.*pu2.*(-pu1)+y3.*pu3.*(-pu1)+y4.*pu4.*(-pu1)).*xx1;
        gu_2 = (y1.*pu1.*(-pu2)+y2.*pu2.*(1-pu2)+y3.*pu3.*(-pu2)+y4.*pu4.*(-pu2)).*xx2;
        gu_3 = (y1.*pu1.*(-pu3)+y2.*pu2.*(-pu3)+y3.*pu3.*(1-pu3)+y4.*pu4.*(-pu3)).*xx3;
        gu_4 = (y1.*pu1.*(-pu4)+y2.*pu2.*(-pu4)+y3.*pu3.*(-pu4)+y4.*pu4.*(1-pu4)).*xx4;

        /* derivatives w.r.t. the error-component loadings, via the mode utilities */
        guet_1 = (y1.*pu1.*(1-pu1)+y2.*pu2.*(-pu1)+y3.*pu3.*(-pu1)+y4.*pu4.*(-pu1)).*as1[.,i];
        guet_2 = (y1.*pu1.*(-pu2)+y2.*pu2.*(1-pu2)+y3.*pu3.*(-pu2)+y4.*pu4.*(-pu2)).*as2[.,i];
        guet_3 = (y1.*pu1.*(-pu3)+y2.*pu2.*(-pu3)+y3.*pu3.*(1-pu3)+y4.*pu4.*(-pu3)).*as3[.,i];
        guet_4 = (y1.*pu1.*(-pu4)+y2.*pu2.*(-pu4)+y3.*pu3.*(-pu4)+y4.*pu4.*(1-pu4)).*as4[.,i];
        guet = guet_1~guet_2~guet_3~guet_4;

        /* derivatives w.r.t. the same loadings, via the probit index */
        gvet_1 = g_cdf.*as1[.,i];
        gvet_2 = -g_cdf.*as2[.,i];
        gvet_3 = -g_cdf.*as3[.,i];
        gvet_4 = -g_cdf.*as4[.,i];
        gvet = gvet_1~gvet_2~gvet_3~gvet_4;

        g_et = guet.*p_v + gvet.*p_u;
        g_u = (gu_1~gu_2~gu_3~gu_4).*p_v;
        gv_1 = g_cdf.*zz1;
        g_v = (gv_1).*p_u;

        gu_sum = gu_sum + g_u;
        gv_sum = gv_sum + g_v;
        get_sum = get_sum + g_et;
        puv_sum = puv_sum + p_uv;
        i = i + 1;
    endo;

    gu = gu_sum./puv_sum;
    gv = gv_sum./puv_sum;
    get = get_sum./puv_sum;
    retp(gu~gv~get);
endp;

b0 = zeros(cols(xx1~xx2~xx3~xx4~zz1)+4,1);
_max_GradProc = &lgd;
_max_Active = 1|1|1|1|1|1|1|
              1|1|1|1|1|1|1|

              1|1|1|1|1|1|
              1|
              1|1|1|1|1|1|
              1|1|1|1;
_max_CovPar = 1;
_max_parnames = "CONS_1"|"CARTIME1"|"TERMTIM1"|"PKLOT_D1"|"CAR_01"|"CAR_GE1"|"peak"|
                "CONS_2"|"CARTIME2"|"TERMTIM2"|"PKLOT_D2"|"CAR_02"|"CAR_GE2"|"peak"|
                "CONS_3"|"IVEH3"|"OWT3"|"FREQ3"|"TRST_SUB"|"peak"|
                "nm_time"|
                "CONS_1"|"old"|"hhsize2"|"swiss"|"shopping"|"service"|
                "f1"|"f2"|"f3"|"f4";
{b,f,g,cov,ret} = maxlik(data,0,&lpr,b0);
call maxprt(b,f,g,cov,ret);
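The analytical gradient procedures lgd in Appendices D and E rely on the logit derivative identity d p_j / d u_k = p_j (1[j = k] - p_k), which is what the terms pu1.*(1-pu1) and pu2.*(-pu1) encode. The Python sketch below compares that identity with a central finite difference. It is an independent illustration, not part of the estimation code: the function names are hypothetical, and the subtraction of the maximum utility inside softmax is a numerical-stability step that the GAUSS listings do not use.

```python
import math


def softmax(u):
    """Logit probabilities; subtracting max(u) guards against overflow."""
    m = max(u)
    expu = [math.exp(x - m) for x in u]
    s = sum(expu)
    return [e / s for e in expu]


def analytic_derivative(u, j, k):
    """d p_j / d u_k = p_j * (1[j == k] - p_k)."""
    p = softmax(u)
    return p[j] * ((1.0 if j == k else 0.0) - p[k])


def numeric_derivative(u, j, k, h=1e-6):
    """Central finite difference of p_j with respect to u_k."""
    up = list(u)
    dn = list(u)
    up[k] += h
    dn[k] -= h
    return (softmax(up)[j] - softmax(dn)[j]) / (2.0 * h)


def max_gradient_error(u):
    """Largest absolute gap between analytic and numeric derivatives."""
    n = len(u)
    return max(
        abs(analytic_derivative(u, j, k) - numeric_derivative(u, j, k))
        for j in range(n)
        for k in range(n)
    )
```

The same comparison, applied to the full simulated log-likelihood instead of a single set of logit probabilities, is a common way to validate a hand-coded gradient before passing it to an optimizer.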

About the Author

Xin Ye received a Bachelor's degree in Traffic Civil Engineering from Tongji University, Shanghai, China, in 2000 and an M.S. degree in Civil Engineering from the University of South Florida in 2004. He continued to study for a Ph.D. degree in Transportation Systems of Civil Engineering at the University of South Florida in 2004. While in the Ph.D. program at the University of South Florida, Mr. Ye was very active in transportation research. He has coauthored three journal publications, in Transportation Research, Transportation, and Transportation Research Record, respectively, and has presented papers at important conferences such as the International Symposium on Transportation and Traffic Theory (ISTTT) and the Annual Meeting of the Transportation Research Board.