leader nam Ka
controlfield tag 001 001709525
003 fts
005 20060614112159.0
006 med
007 cr mnuuuuuu
008 060516s2005 flua sbm s000 0 eng d
datafield ind1 8 ind2 024
subfield code a E14SFE0001379
035
(OCoLC)68905526
SFE0001379
040
FHM
c FHM
d FHM
049
FHMM
090
T56 (Online)
1 100
Kababji, Hani.
0 245
Multichannel functional data decomposition and monitoring
h [electronic resource] /
by Hani Kababji.
260
[Tampa, Fla.] :
b University of South Florida,
2005.
502
Thesis (M.S.I.E.)--University of South Florida, 2005.
504
Includes bibliographical references.
516
Text (Electronic thesis) in PDF format.
538
System requirements: World Wide Web browser and PDF reader.
Mode of access: World Wide Web.
500
Title from PDF of title page.
Document formatted into pages; contains 65 pages.
520
ABSTRACT: With current advances in sensors and information technology, online measurements of process variables become increasingly accessible for process control and monitoring. Such measurements may take the shape of curves rather than scalar values. The term Multichannel Functional Data (MFD) is used to represent the observations of multiple process variables in the shape of curves. Generally, MFD contains rich information about processes. The challenge of process control in MFD is that Statistical Process Control (SPC) is not directly applicable. Furthermore, there is no systematic approach to interpret the complex variation in MFD. In this research, our objective is to develop an approach to systematically analyze the complex variation in MFD for process change detection and process faulty condition discrimination. The main contributions of this thesis are: MFD decomposition, process change detection, and process faulty condition discrimination. We decomposed MFD into global and local components. The approach reveals global and local variations that are due to global signal shifts and local variations. Global variation was extracted using a weighted spline smoothing technique, whereas local variation was obtained by subtracting the global variation from the original signals. Weights were obtained using the local moving average of the generalized residuals. The proposed approach helps in process change detection and process faulty condition discrimination based on further MFD analysis using the Principal Curve Regression (PCuR) test. For process change detection, the global variation component was used in the PCuR test. In-control global data sets were used as training data to detect process change that is due to global and local variation.
On the other hand, for faulty condition discrimination, the local variation component was used in the PCuR test. In-control local variation data sets were used as training data in the PCuR test; therefore, a process faulty condition that is due to local variations remains in control, whereas a process faulty condition that is due to global shifts appears as random out-of-control points in the PCuR test. We applied our approach to real-life forging data sets. A simulation study was also conducted to verify the approach, and the results are promising for wide applications.
590
Adviser: Qiang Huang, Ph.D.
653
Variation decomposition.
Change detection.
Faulty condition.
Discrimination.
False alarm.
690
Dissertations, Academic
z USF
x Industrial Engineering
Masters.
773
t USF Electronic Theses and Dissertations.
4 856
u http://digital.lib.usf.edu/?e14.1379
Multichannel Functional Data Decomposition and Monitoring

by

Hani Kababji

A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Industrial Engineering
Department of Industrial and Management Systems Engineering
College of Engineering
University of South Florida

Major Professor: Qiang Huang, Ph.D.
Jose L. Zayas-Castro, Ph.D.
Tapas Das, Ph.D.

Date of Approval: November 10, 2005

Keywords: Variation decomposition, Change detection, Faulty condition, Discrimination, False alarm

Copyright 2005, Hani Kababji
Table of Contents

List of Figures
Abstract
Chapter 1. Introduction
    1.1 Multichannel Functional Data and Statistical Process Control
    1.2 Objective and Contribution
    1.3 Overview of the Research
Chapter 2. Literature Review
    2.1 Multivariate Process Monitoring and Control
    2.2 Sensor Fusion
    2.3 Process Monitoring With MFD
    2.4 Principal Curve and Principal Curve Regression Test in Process Monitoring
    2.5 Spline Smoothing
    2.6 Summary
Chapter 3. Problem Statement
Chapter 4. Global and Local Variations Decomposition
    4.1 Global and Local Variations
    4.2 Global and Local Variations Extractions Using Weighted Spline Smoothing
        4.2.1 Global and Local Variations Extraction
    4.3 Summary
Chapter 5. Monitoring Global and Local Variations in MFD
    5.1 Review of Principal Curve Regression Model
    5.2 Monitoring Global and Local Variations Using PCuR
    5.3 Exploratory Study
    5.4 Summary
Chapter 6. Simulation Study
    6.1 Simulation Data
    6.2 Simulation Result
    6.3 Summary
Chapter 7. Conclusion and Future Research Suggestions
    7.1 Conclusion
    7.2 Future Research Suggestions
References
List of Figures

Figure 1.1 Forging Press, Die, and Installed Sensors
Figure 1.2 MFD of One Part (Sample) in Forging Process
Figure 1.3 Thesis Flow Chart
Figure 4.1 Different Process Conditions
Figure 4.2 Absolute Generalized Residuals of Reweighted Tonnage Signal
Figure 4.3 Sample of Global and Local Variations Extraction
Figure 5.1 Results of PCuR Test on Global Pattern Under Normal Condition
Figure 5.2 RSS and Mean of Residuals
Figure 5.3 Results of PCuR Test on Global Pattern Under Faulty Condition One
Figure 5.4 Results of PCuR Test on Global Pattern Under Faulty Condition Two
Figure 5.5 Results of PCuR Test on Local Pattern Under Normal Condition
Figure 5.6 Results of PCuR Test on Local Pattern Under Faulty Condition One
Figure 5.7 Results of PCuR Test on Local Pattern Under Faulty Condition Two
Figure 6.1 Simulation Channels and Their Corresponding Principal Curve
Figure 6.2 Example of Global and Local Variations Extraction: Simulation Data
Figure 6.3 Sample Result of Applying PCuR Test on Simulation Data
Multichannel Functional Data Decomposition and Monitoring

Hani Kababji

Abstract

With current advances in sensors and information technology, online measurements of process variables become increasingly accessible for process control and monitoring. Such measurements may take the shape of curves rather than scalar values. The term Multichannel Functional Data (MFD) is used to represent the observations of multiple process variables in the shape of curves. Generally, MFD contains rich information about processes. The challenge of process control in MFD is that Statistical Process Control (SPC) is not directly applicable. Furthermore, there is no systematic approach to interpret the complex variation in MFD. In this research, our objective is to develop an approach to systematically analyze the complex variation in MFD for process change detection and process faulty condition discrimination. The main contributions of this thesis are: MFD decomposition, process change detection, and process faulty condition discrimination.

We decomposed MFD into global and local components. The approach reveals global and local variations that are due to global signal shifts and local variations. Global variation was extracted using a weighted spline smoothing technique, whereas local variation was obtained by subtracting the
global variation from the original signals. Weights were obtained using the local moving average of the generalized residuals. The proposed approach helps in process change detection and process faulty condition discrimination based on further MFD analysis using the Principal Curve Regression (PCuR) test. For process change detection, the global variation component was used in the PCuR test. In-control global data sets were used as training data to detect process change that is due to global and local variation. On the other hand, for faulty condition discrimination, the local variation component was used in the PCuR test. In-control local variation data sets were used as training data in the PCuR test; therefore, a process faulty condition that is due to local variations remains in control, whereas a process faulty condition that is due to global shifts appears as random out-of-control points in the PCuR test. We applied our approach to real-life forging data sets. A simulation study was also conducted to verify the approach, and the results are promising for wide applications.
Chapter 1. Introduction

This chapter presents a brief introduction to this research and its motivation. The research objectives and an overview of the thesis work are also introduced.

1.1 Multichannel Functional Data (MFD) and Statistical Process Control

This section briefly describes MFD and the challenges it poses for statistical process control. An example is also provided to illustrate the complexity of analyzing MFD.

With current advances in sensors and information technology, online measurements of process variables become increasingly usable for industrial process monitoring and control. Process characteristics may take the shape of multiple curves, since process performance might be affected by multiple functional process variables. For instance, many industrial processes are time dependent; therefore, process variables that are monitored over time take the shape of functional data. Generally, functional process variables contain rich information about monitored processes. Process characteristics track and describe the behavior of a process response with respect to a desired predictor, such as process time or change in distance.
Therefore, multiple sensors are commonly installed to collect data on functional process variables. The term Multichannel Functional Data (MFD) is used to represent observations of multiple functional process variables. For example, a forging process is equipped with sensors that measure the tonnage signal as a process variable. Since process faults may occur at different locations in the workpiece, multiple sensors are installed at different workpiece locations. Consequently, multiple channels are extracted from the process sensors, each measuring the tonnage signal as a function of time for every sample. See Figures 1.1 and 1.2 [1]. In this example, four channels were extracted measuring the same process characteristic, i.e., the tonnage signal. However, the tonnage signal in each channel is considered an independent variable, since the tonnage signals are extracted from different sensors. Moreover, the tonnage values vary among different locations in the workpiece due to process physics. On the other hand, other processes and applications may contain extracted channels with completely different process characteristics.

[Figure 1.1 Forging Press, Die, and Installed Sensors. Labeled parts: bolster plate, tonnage sensors (cross-section view), flywheel/crankshaft, slide, die.]
Monitoring MFD in such processes is important because it provides an effective way of controlling the process: it monitors different process variables, or multiple realizations of one process variable, rather than only the average or the sum of the sensed process variables. Therefore, more valuable information about a process can be obtained. The process cycle, which is the time interval needed for a part to be completely manufactured, is repeatable among samples in such processes, under the assumption that the samples of a particular channel are independent and identically distributed. Although this assumption eases controlling MFD, many challenges still exist in monitoring sensed MFD.

[Figure 1.2 MFD of One Part (Sample) in Forging Process. Axes: Time (s) versus Tonnage (kN); one curve each for Channels 1 to 4.]
MFD is complex by nature because it contains more than one variable in the shape of functional data, which makes it hard to interpret. Besides, no systematic approach exists for analyzing and interpreting MFD. Statistical Process Control (SPC) is not directly applicable to this type of process characteristic because of the complexity of the variation in MFD [2, 3]. For example, the mean and the variance are not sufficient to represent process variations, unlike vector process characteristics, where it is easier to extract variations using process parameters.

A further issue in analyzing MFD is late process change detection. Since some manufacturing processes have long throughput times, late process change detection may result in a large number of defective parts. Therefore, an online monitoring scheme for such processes is crucially needed. Online detection allows corrective actions to be taken before parts are completely manufactured. This leads to better process performance at a lower quality cost.

Process faulty condition discrimination is a challenge that arises together with process change detection. For instance, in a forging process, different types of process faulty conditions may occur due to different causes, such as deformity in the counterweight and incorrect die setup. Discriminating among different process conditions requires a deep understanding of process physics and further analysis of MFD to pinpoint a key feature that helps in telling the conditions apart.
1.2 Objective and Contribution

The objective of this research is to systematically analyze MFD for process change detection and process faulty condition discrimination. In this study, we propose a new scheme for interpreting and monitoring MFD. The approach provides a new way of interpreting variations in MFD. We decompose the original sensed MFD into global and local components to capture global and local variations in the sensed data based on previous process knowledge. Global patterns can reveal global and local process shifts. However, rapid local variations and the times at which they occur can be noticed by analyzing the local variation profile, taking into account the correlation over time. Global variation is extracted using weighted spline smoothing applied to the original data, whereas local variation is obtained by subtracting the global pattern from the original data. Consequently, two data sets are obtained that provide rich information about the variation patterns in the original signals.

For process monitoring purposes, the data sets obtained are further used as input to the Principal Curve Regression (PCuR) test [2]. Using the global component, process faulty conditions can be detected. On the other hand, process faulty condition discrimination can be achieved by using the local variation as input to the PCuR test.
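The decomposition described in this section can be illustrated with a minimal sketch. It is not the thesis implementation: the signal is synthetic, SciPy's `UnivariateSpline` stands in for the weighted spline smoother, and the helper names `moving_average_weights` and `decompose_channel`, the window length, the smoothing level, and the rule that weights are the inverse of a moving average of absolute residuals are all assumptions made for illustration.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline


def moving_average_weights(residuals, window=9, eps=1e-8):
    """Hypothetical weighting rule (an assumption, not the thesis formula):
    down-weight points whose neighbourhood has large absolute residuals."""
    kernel = np.ones(window) / window
    local_scale = np.convolve(np.abs(residuals), kernel, mode="same")
    weights = 1.0 / (local_scale + eps)
    return weights / weights.mean()  # normalise so the average weight is 1


def decompose_channel(t, y, smoothing, weights=None):
    """Return (global_part, local_part) for one channel y sampled at times t."""
    spline = UnivariateSpline(t, y, w=weights, k=3, s=smoothing)
    global_part = spline(t)        # smooth global pattern
    local_part = y - global_part   # remainder after removing the global pattern
    return global_part, local_part


# Synthetic single-channel signal (not the forging data used in the thesis).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
y = np.sin(2.0 * np.pi * t) + 0.05 * rng.standard_normal(t.size)
y[120:125] += 0.5  # short local disturbance

smoothing = t.size * 0.05 ** 2  # assumed smoothing level tied to the noise variance

# Pass 1: unweighted fit. Pass 2: refit using residual-based weights.
_, first_residuals = decompose_channel(t, y, smoothing)
weights = moving_average_weights(first_residuals)
global_part, local_part = decompose_channel(t, y, smoothing, weights)
```

By construction the two components add back to the original signal, so no information is lost; the choice of smoothing level and weights only decides how the variation is split between the global and local parts.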
1.3 Overview of the Research

The thesis is organized as follows. Chapter 2 reviews the literature related to multivariate process control, sensor fusion, the spline smoothing technique, and approaches to MFD processing and analysis. Chapter 3 summarizes the problem analyzed in this research. Chapter 4 focuses on the extraction of global and local variations. Chapter 5 presents the grouping technique used to choose training data sets and the application of the PCuR model. Chapter 6 presents a simulation study that verifies the proposed methodology. Finally, Chapter 7 contains the summary, the conclusion of this research, and some suggestions for future research. The figure below illustrates the flow chart of the thesis contents.

[Figure 1.3 Thesis Flow Chart. Flow: original data sets and original training data sets, with previous process knowledge, pass through the spline smoothing approach (smooth curves) to give the global pattern and the local pattern; these are the input data to the PCuR model, which supports faulty condition detection and faulty condition discrimination. Chapter labels: Ch. 1 Introduction, Ch. 2 Literature Review, Ch. 3 Problem Statement, Ch. 4, Ch. 5, Ch. 6 Simulation Study, Ch. 7 Conclusions and Future Work.]
Chapter 2. Literature Review

This chapter summarizes previous work on multivariate process monitoring, sensor fusion, process monitoring with MFD, and the spline smoothing technique.

2.1 Multivariate Process Monitoring and Control

Multivariate quality control provides procedures to monitor multiple variables simultaneously. It is inherently more complex than univariate SPC, but it may be a more realistic representation of the data, since real-world processes rarely have only one variable that is measured independently of all other variables in a system. However, multivariate control charts are designed to handle multiple variables in the shape of scalar vectors.

Several articles in the multivariate control chart literature present different models and control charts to monitor manufacturing processes. Lowry and Montgomery [4] and Alt [5] presented reviews of multivariate control charts. They briefly discussed Hotelling charts, Cumulative Sum (CUSUM) charts, and Exponentially Weighted Moving Average (EWMA) charts.
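As a concrete reference point for the Hotelling chart named in the reviews above, the following sketch computes the Hotelling T^2 statistic for individual observations against an in-control reference sample. The function name `hotelling_t2`, the synthetic data, and the omission of a control limit are choices made for this illustration; they do not come from the cited works.

```python
import numpy as np


def hotelling_t2(reference, observations):
    """T^2 distance of each observation from the reference mean,
    scaled by the inverse of the reference covariance matrix."""
    mean = reference.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(reference, rowvar=False))
    centered = observations - mean
    # Row-wise quadratic form: centered[i] @ cov_inv @ centered[i]
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)


# Synthetic in-control reference sample with three scalar variables.
rng = np.random.default_rng(1)
reference = rng.standard_normal((500, 3))

# One observation at the reference mean and one shifted in every variable.
observations = np.vstack([
    reference.mean(axis=0),
    reference.mean(axis=0) + 3.0,
])
t2 = hotelling_t2(reference, observations)
```

The statistic treats every observation as a vector of scalars, which is exactly the limitation the chapter points out: it has no notion of a curve or of correlation over time within a sample.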
For instance, Alloway and Raghavachari [6] explored Hotelling T^2 charts to represent multivariate data. Hotelling charts were used to detect shifts in the mean. James Williams et al. [7] extended the application of Hotelling T^2 charts to monitor the coefficients of nonlinear regression applied to data profiles. They identified outlying observations and step or ramp shifts in the mean vector over time.

For the purpose of process variation detection, Hawkins [8] proposed a multivariate control scheme based on the linear regression of each variable on the remaining variables. He also extended his regression adjustment for variables in multivariate quality control to a cascade process [9]. The regression-adjusted variables were plotted on CUSUM charts for both location and scale variations. The performance of his approach depends on the nature of the process being monitored.

Pignatello and Kasunic [10] introduced a multivariate CUSUM control chart (MCUSUM) to detect small deviations and shifts; they used simulation to compare the performance of this chart with the multivariate Shewhart chart and with multiple univariate Shewhart and CUSUM charts. This work cannot be applied to time-related processes or functional data, since further analysis would be needed to account for correlation over time.

Fuchs and Benjamini [11] proposed multivariate profile charts (PM) for Statistical Quality Control (SQC), which provide higher accuracy in retrieving quantitative information from data. A profile chart is a symbolic scatterplot in which summaries of data for individual variables are presented by a symbol, and global information about the
group is displayed by the location of the symbol in the scatterplot. It is a display for both univariate and multivariate statistics.

There has been an increase in multivariate quality control studies that use principal components (PCs) to aid the monitoring of multiple variables, i.e., to reduce the number of variables to be studied and monitored. Jackson and Morris [12] introduced the use of PCs instead of the original variables in multivariate quality control charts. The main advantage of PCs is that they are uncorrelated and that a few of them explain most of the variability of the original variables.

Sparks et al. [13] used the Gabriel biplot in multivariate process monitoring, which allows accurate detection of changes in location, variation, and correlation structure. The biplot is a way to simplify the interpretation of the Principal Component Analysis of a data matrix. It is based on the singular value decomposition of that matrix.

For multivariate process change detection, Apley and Shi [14] proposed a method based on Principal Component Analysis (PCA) to detect faults in a manufacturing process. They presented an example from the automotive industry in which they determined which PCs were significant and analyzed them to find the causes of the faults.

The previous work on control charts was applied to scalar data sets rather than to functional nonstationary data or to data vectors that are correlated over
time. With current information technology and advanced data collection devices, manufacturing processes are monitored by sensors that produce functional data in the shape of multiple curves representing samples of the process variable(s). This research studies the analysis and monitoring of MFD, taking time correlation into consideration.

2.2 Sensor Fusion

With current advances in technology and computational systems, more data can be easily collected from any controllable system, because sensors ease the process of data collection and extraction. However, many challenges still exist when using sensors in industrial processes, such as sensor allocation and sensor fusion problems.

Previous research in sensor fusion focused mainly on determining optimal locations for sensors in a controllable system. Besides, other researchers proposed different approaches to extract and collect data from installed sensors [14-16]. In this research, our sensor-related objective is to develop an approach to analyze the MFD collected from several allocated sensors, to detect process changes, and to discriminate among different faulty conditions.

Process variation has lately been used in the sensor allocation problem, which has been studied in depth recently. For example, Y. Ding et al. [15] grouped traditional approaches to sensor optimization into two categories: multistation sensor allocation for the purpose of product inspection, and allocation of sensors for the purpose of variation
diagnosis but at a single measurement station. In their approach, based on an understanding of the mechanism of variation propagation, they developed a backward-propagation strategy to determine the locations of measurement stations and the minimum number of sensors needed to achieve full diagnosability by minimizing a cost objective function.

In industrial processes, sensors may measure multiple realizations of the same process variable or different process variables in a given system. When sensors read more than one process variable, further analysis might be needed to combine the different outputs into a multivariate characterization [15-16]. Many approaches could be used to analyze combinations of multiple variables; for example, principal component analysis reveals the variability among different variables, and correlation coefficients might reflect the relation among different monitored variables.

A common data fusion estimation approach for a one-variable system is to take the weighted average of the data from the various sensor realizations to generate a composite fused value. Such an approach may not lead to reliable measurements, especially if one or more of the sensors are faulty [18].

Kalman filtering and extended Kalman filtering are other estimation approaches used in this field. They are linear systems techniques that cannot be used if the process model is not available; however, they work well if the data is corrupted only by noise. Such approaches are very sensitive to process outliers. Moreover,
artificial intelligence approaches such as adaptive neural networks and fuzzy logic require extensive training of the system prior to performing the actual experiments.

Considering nonlinearity in sensor functions, Suranthiran et al. [18] developed a new framework to optimally fuse data from multiple nonlinear sensors, based on the assumption that at any particular time instant only one measurement from the multiple sensor readings can be used and that partial blending of signals is not possible. This was achieved by scheduling the sensors so that one sensor is read at each time instant. This approach shows good results in reducing the number of sensors to be used in processes; however, it does not consider the correlation over time in the functional data. Therefore, there is a need for an analytical approach that can combine MFD measurements from several sensors for process control purposes.

2.3 Process Monitoring With MFD

Different approaches have been developed for process control with MFD. Some were developed for process variation detection, others for process condition detection. However, no systematic approach exists that can interpret MFD and decompose it by analyzing the full process cycle for the purpose of process change detection and process faulty condition discrimination. This section presents some recent work on MFD and process monitoring.

For process variation detection, Mahmoud et al. [19] proposed a change point approach based on the segmented regression technique for testing the constancy of the
regression parameters in a linear profile data set. The advantages of their approach include improved detection of sustained step changes in the process parameters and improved diagnostic tools to determine the sources of profile variation and the location(s) of the change point(s).

Using a simulation study on functional data, Jeong and Lu [3] proposed a methodology based on monitoring wavelet coefficients to detect local shifts in processes. Their approach performed effectively against many types of process changes, especially in detecting small shifts. This work was applied to single-channel functional data.

Jin and Shi [20] presented a statistical approach for feature-preserving data compression of tonnage information using wavelets. Moreover, Jin and Shi [21] developed a feature extraction methodology using PCA to represent variation patterns of tonnage signals, and they considered the interaction among the variables using fractional factorial design of experiments. They also used waveform signals [22], without prior fault knowledge, to develop a diagnostic system that automatically detects faulty conditions.

Some efforts have been made to study sensed MFD and investigate different process conditions. Kim et al. [1] studied faulty condition detection and faulty condition discrimination in a forging process based on a segmentation approach: the forging cycle was divided into segments, and for each segment the distances of the original data points from their principal curve were used to produce an empirical distribution which
leads to faulty condition discrimination. They also proposed an approach that uses the empirical distributions of the squared distances of data points from their principal curve to detect global and local profile changes in multichannel tonnage signals and to classify faulty patterns.

Koh et al. [23] developed an approach to detect faulty conditions based on collected data and engineering knowledge of the sensed tonnage signals of a stamping machine. The approach consists of partitioning signals into segments that represent process phases, for which they identified a set of signal attributes to describe the faulty conditions. Also, Koh et al. [24] used the Haar transformation to detect and isolate multiple fault conditions. By partitioning signals into disjoint segments, mutually exclusive Haar coefficients were used to isolate faults at each stage of the process.

Jin [25] proposed a new method that uses the partitioned monitoring segments of press tonnage signals to monitor individual station conditions in multi-operation stamping processes. For this purpose, she developed a generic signal segmentation principle. She also used Hotelling T^2 control charts with consideration of the interactions among stations. She demonstrated the analysis procedure with a real case study of a doorknob stamping process.

Zhou et al. [26] proposed a new approach to utilize sensed data and fault pattern information in process monitoring. They developed a directionally variant control chart through an effective combination of a multivariate chart and univariate projection charts. They
also showed that adding univariate projection charts can improve the detection power for pre-known process faults. They demonstrated their work with a case study of cycle-based tonnage monitoring of a forging process. Their approach showed capability in process faulty condition detection whether or not the faulty conditions are pre-known; however, the approach could not provide faulty condition discrimination or signal variation analysis.

2.4 Principal Curve and Principal Curve Regression Test in Process Monitoring

The principal curve was first introduced by Hastie and Stuetzle [27] in 1989 as a smooth, self-consistent curve, meaning that each point of the curve is the average of all data points that project onto it. This curve has several advantages. It requires no distributional assumption about the data. The data can be readily analyzed without transformation, as opposed to Nonlinear Principal Component Analysis (NLPCA). Besides, applying the curve requires no constraint on the dimension of the data, as the curve maintains the shape of the original data set. Global and local changes in the curve indicate changes in the data.

Many authors studied further properties of the principal curve and proved its existence. For example, Duchamp and Stuetzle [28] proved the existence of principal curves and showed that principal curves are critical points of the expected squared distance from the data. Kégl et al. [29] introduced principal curves of fixed length, with proof of their existence and uniqueness.
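The self-consistency property and the expected squared distance mentioned in this section can be stated compactly. The notation below follows the common presentation of Hastie and Stuetzle's definition; the symbols f (the curve), lambda (its parameter), lambda_f (the projection index), X (the random data vector), and D^2 are introduced here for illustration and are not defined elsewhere in this chapter.

```latex
% Projection index: the parameter value of the point on f closest to x
\lambda_f(x) = \sup\left\{ \lambda : \lVert x - f(\lambda) \rVert
               = \inf_{\mu} \lVert x - f(\mu) \rVert \right\}

% Self-consistency: each point of the curve is the mean of the data projecting onto it
f(\lambda) = \mathbb{E}\left[ X \mid \lambda_f(X) = \lambda \right]

% Expected squared distance from the data to the curve
D^2(f) = \mathbb{E}\, \lVert X - f(\lambda_f(X)) \rVert^2
```

Read this way, the critical-point result says that small smooth perturbations of a principal curve do not change D^2(f) to first order.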
PAGE 22
The definition of the principal curve opened the way for other researchers to apply principal curves. For example, Banfield and Raftery [30] applied the principal curve to identify ice floes in satellite images and reduced the estimation bias of the procedure introduced by Hastie and Stuetzle [27]. In addition, LeBlanc and Tibshirani [31] extended the principal curve concept to principal surfaces, which they used in a study of multivariate regression splines. Stanford and Raftery [32] applied principal curves in clustering. Chang and Ghosh [33] applied principal curves to feature extraction and pattern classification problems. In most of their experiments they noticed that closed principal curves give better results than open principal curves in pattern classification. More recently, Huang [2] introduced the concept of Principal Curve Regression (PCuR), a model based on the principal curve of data signals in which the generated principal curve is related to the data through a multivariate regression model. He applied this concept to tonnage signals. Trained on in-control data, the model can predict deviations in future observations.

2.5 Spline Smoothing

Before surveying the literature on spline smoothing, we give a brief review of the technique. Splines are drafting aids that draw smooth curves across data points by interpolating data points that are called knots. In this research we use splines for data interpolation, since splines are piecewise functions whose individual curves meet at the knots. This occurs because usually in splines the first and the second
PAGE 23
derivatives at the end of a polynomial fitted to one interval are equal to the corresponding derivatives at the beginning of the following polynomial. This feature guarantees continuity across the knots [34]. Wegman and Wright [35] provided a detailed bibliographic review of previous work on spline smoothing. They emphasized that splines are interpolatory in nature and showed that the interpolation problem in splines is to fit a curve through points in a plane. They also introduced the cubic interpolating spline as a function with continuous derivatives up to and including order 2, and grouped spline smoothing methods into three categories according to how they deal with noise: least squares methods, 100 percent confidence interval methods, and regression spline methods. Spline smoothing is widely used because of its natural and flexible features in fitting data [36]. In spline regression, the smoothing parameter can be chosen according to the needs of the study and of data interpretation, so that it represents the trade-off between two competing aims: a good fit and the avoidance of rapid fluctuation. Moreover, the amount of smoothing does not depend on the response values; it depends on the design points, and it copes well whether or not the design points are regularly spaced. Eilers and Marx [36] presented a short review of B-splines. They showed connections to the familiar spline penalty on the integral of the squared second derivative.
PAGE 24
They also used nonparametric logistic regression, density estimation, and scatterplot smoothing as examples. In spline smoothing, the main fitting objective is to minimize both the Residual Sum of Squares (RSS) and the local variation. A measure of the rapid local variability of a curve can be given by a roughness penalty such as the integrated squared second derivative. The modified sum of squares is defined as follows [37]:

$$S(G) = \sum_{i=1}^{N} [y_i - G(x_i)]^2 + \lambda \int [G''(t)]^2 \, dt \qquad (2.1)$$

where $\lambda$ is the smoothing parameter, $y$ is the original data, and $G(t)$ is the fitted value. Two extreme cases can occur:

1) As $\lambda \to \infty$, the highest amount of smoothing occurs: the fit is the linear least squares fit, with degrees of freedom equal to 2.
2) As $\lambda \to 0$, the lowest amount of smoothing occurs: the curve is similar to the original data, since no basis functions are estimated and no degrees of freedom are taken out of the system, so the degrees of freedom equal the number of design points.

The amount of smoothing can be controlled either through the smoothing parameter or by specifying the degrees of freedom. The smoothing parameter can be chosen subjectively, by selecting the curve that appears to fit the data best and taking the corresponding parameter value. Alternatively, a well-known approach for choosing the smoothing parameter automatically is the cross validation method.
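The two limiting cases can be checked numerically. The sketch below is only an illustration under stated assumptions: it replaces the cubic smoothing spline with a discrete roughness-penalty smoother (a second-difference penalty on equally spaced points), takes the effective degrees of freedom as the trace of the smoother matrix, and uses a hypothetical helper name `smoother_matrix` that does not come from the thesis.

```python
import numpy as np

def smoother_matrix(n, lam):
    """Smoother matrix S = (I + lam * D'D)^(-1) of a discrete
    second-difference penalty smoother (illustrative stand-in,
    not the cubic smoothing spline itself)."""
    D = np.diff(np.eye(n), n=2, axis=0)   # (n-2) x n second-difference operator
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, np.eye(n))

n = 90  # number of design points per cycle, as in the forging data
for lam in (1e-8, 1.0, 1e2, 1e8):
    df = np.trace(smoother_matrix(n, lam))
    print(f"lambda={lam:g}  effective df={df:.2f}")
```

For very small penalties the trace approaches the number of design points, and for very large penalties it approaches 2, the degrees of freedom of a straight-line fit.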
PAGE 25
Like any other linear operator, the smoothed fitted curve can be represented as a linear smoother:

$$\hat{G} = S y \qquad (2.2)$$

where $\hat{G}$ is the fit, $y$ is the observed data, and $S$ is the linear operator known as the smoother matrix. The smoother matrix $S$ is a symmetric positive-definite $n \times n$ matrix, where $n$ is the number of design points. The expression trace($S$) gives the number of basis functions, or the number of parameters involved in fitting; thus the effective degrees of freedom can be written as follows [38]:

$$df = \operatorname{trace}(S) \qquad (2.3)$$

The diagonal values of the smoother matrix can be determined from the design knots using the following equation [38]:

$$S_{ii} = 2^{-3/2} \, n^{-3/4} \, \lambda^{-1/4} \, f(t_i)^{-3/4} \qquad (2.4)$$

where $f(t_i)$ is the standard normal density function of the design points. Choosing the smoothing parameter in the least squares spline smoothing problem has been advocated by Wahba and Wold [39, 40] and Wahba [41-43]. Wahba introduced a measure for the goodness of fit, the average squared error, which is called the cross validation function. She also demonstrated that the smoothing parameter can be chosen by minimizing the cross validation function. The basic principle behind the cross validation approach is to leave the data points out one at a time and to choose the value of
PAGE 26
the smoothing parameter for which the missing data points are best predicted by the remainder of the data. For example, if $\hat{G}^{-i}$ is the smoothing spline calculated from all the data pairs except $(t_i, y_i)$ using smoothing parameter $\lambda$, the cross validation choice of $\lambda$ is the value that minimizes the cross validation score XVS given by the following equation:

$$XVS(\lambda) = n^{-1} \sum_{i=1}^{n} \{ y_i - \hat{G}^{-i}(t_i) \}^2 \qquad (2.5)$$

An easier computation of the cross validation score follows from a standard argument in regression theory given by Cook and Weisberg [44] and Craven and Wahba [45]. The following equation represents their simpler form for evaluating XVS, where $S_{ii}$ is the $i$-th diagonal element of the smoother matrix:

$$XVS(\lambda) = n^{-1} \sum_{i=1}^{n} \frac{\{ y_i - \hat{G}(t_i) \}^2}{\{ 1 - S_{ii} \}^2} \qquad (2.6)$$

Craven and Wahba also introduced a weighting technique for the cross validation function to reflect unequally spaced data and called the result the generalized cross validation function. This function has a simple matrix representation in terms of the smoother matrix of the spline smoothing. Fleisher [46], Merz [47], and PaihuaMontes [48] provided computer code for the penalized least squares smoothing spline using the generalized cross validation function. Splines have also been introduced in the time series field. Wahba [43] developed the theory of periodic smoothing splines with application to spectral density estimation.
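The equivalence between the leave-one-out definition of the cross validation score and its one-fit shortcut can be illustrated numerically. The following is a minimal sketch under assumptions that are not part of the thesis: the smoothing spline is replaced by a discrete second-difference penalty smoother, the data are synthetic, a point is "left out" by giving it zero weight, and the shortcut divides each residual by one minus the corresponding diagonal element of the smoother matrix.

```python
import numpy as np

def fit(y, lam, w):
    """Minimize sum_i w_i (y_i - g_i)^2 + lam * ||D g||^2
    (discrete stand-in for a smoothing spline)."""
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)
    return np.linalg.solve(np.diag(w) + lam * D.T @ D, w * y)

rng = np.random.default_rng(0)
n, lam = 30, 5.0
t = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal(n)  # synthetic data

# Brute force: refit n times, each time giving point i zero weight.
loo = np.empty(n)
for i in range(n):
    w = np.ones(n)
    w[i] = 0.0
    loo[i] = y[i] - fit(y, lam, w)[i]
xvs_brute = np.mean(loo ** 2)

# Shortcut: a single fit plus the diagonal of the smoother matrix S.
D = np.diff(np.eye(n), n=2, axis=0)
S = np.linalg.solve(np.eye(n) + lam * D.T @ D, np.eye(n))
xvs_short = np.mean(((y - S @ y) / (1.0 - np.diag(S))) ** 2)

print(xvs_brute, xvs_short)
```

Both scores agree up to rounding, which is why the shortcut form is preferred when the score has to be evaluated for many candidate values of the smoothing parameter.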
PAGE 27
More sophisticated techniques for the smoothing spline smoother matrix and its role in curve fitting were discussed by Cook and Weisberg [44]. In their book they discuss approaches to calculating the smoother matrix and the cross validation function. Many spline smoothing applications were studied by Silverman [38]. In his paper he promoted the applicability of nonparametric regression using cubic splines. He also provided an inference region for curves, as a confidence interval based on a Bayesian formulation, and proposed a formulation for the smoother matrix of the weighted smoothing spline. Motivated by the results of Craven and Wahba, Silverman [49] also developed an approximation to the smoother matrix that reduces the computational burden considerably, and provided a detailed mathematical justification for approximating the generalized cross validation score.

2.6 Summary

The most commonly used multivariate control charts are based on the Hotelling $T^2$ chart. Such control charts deal with scalar data rather than multiple-curve functional data. This implies the need for an approach that analyzes MFD for the purpose of process change detection and faulty condition discrimination. Two major objectives have been studied in the sensor fusion problem: determining the optimal sensor locations and the optimal extraction of sensed data. In this research, however, we analyze the MFD sensed by
PAGE 28
allocated sensors for the purpose of process change detection and faulty condition discrimination. Most of the work that has been done using principal curves shows the capability of the principal curve to summarize data, since it makes no distributional assumption and the data can be analyzed directly without transformation, as opposed to NLPCA approaches. Using the principal curve as the response in a multivariate linear regression model, as in the PCuR model, makes it possible to detect deviations in multichannel samples. Since the principal curve maintains the shape of the original data, global and local changes in the data are reflected in the curve. Most of the work on faulty condition diagnosis and monitoring has been applied to segments of channel signals. Some of the approaches depended on extracting empirical distributions from the data; others depended on subjective judgment in distinguishing patterns. The need for a robust online approach to the diagnosis of faulty process conditions therefore offers opportunities for further research in this area. All the previous issues lead to the problem statement for this research, presented in the following chapter.
PAGE 29
Chapter 3. Problem Statement

This chapter presents the problem definition for this research. It also gives a brief description of the approach used to solve the problem and a summary of the main contributions of this research. This research addresses the following issues: MFD interpretation, process change detection, and process faulty condition discrimination. SPC concepts are not directly applicable to MFD [2, 3], since SPC focuses on the analysis of multivariate scalar data, and no systematic approach exists to analyze the complexity of variations in MFD. Also, late detection of a process change may result in high quality costs and a large number of defectives, especially if more than one faulty condition exists. Motivated by this fact, we developed an approach to interpret the variations of MFD for process change detection and process faulty condition discrimination. The approach works on both multivariate processes and univariate processes with multiple realizations. We decomposed MFD into global and local components to reveal global and local variations in sensed signals. Global patterns were extracted by applying weighted spline smoothing to the original data. Weights were determined using the local moving average of the generalized residuals. This method provides weights which correspond to homogeneous
PAGE 30
variance across the process cycle. Therefore, any variation in the process cycle can be captured. Local variation profiles were obtained by subtracting the global component from the original signals. Controlling and monitoring processes that include MFD then becomes the subsequent challenge. For process change detection, we used the global variation component to detect changes in process variation. In-control global variations were used as training data sets in the PCuR test, so that faulty process conditions can be detected. For process faulty condition discrimination, we used in-control local variation as the input to the PCuR test. A faulty condition that is due to local shifts appears out of control, whereas a faulty condition that is due to global shifts remains in control, since it is generated by shifting the whole cycle. We also conducted an exploratory study to extract the appropriate data sets needed to increase the prediction performance of the PCuR model and to reduce the number of false alarms. The proposed approach was applied to real-life forging data. A simulation study was also conducted to verify the developed approach. We simulated bivariate normal in-control data sets, together with faulty conditions due to global shifts and faulty conditions due to multiple local problematic segments. Using the developed approach, no false alarms were observed, and the results show that the approach can be extended to further applications.
PAGE 31
Chapter 4. Global and Local Variations Decomposition

This chapter presents the methodology developed to extract global and local variations in MFD. It describes the global and local components and the decomposition of the signals into global and local variations.

4.1 Global and Local Variations

This section describes the global and local components. The MFD analyzed in this work are assumed to satisfy the following: 1) the data are all cyclic; 2) all cycles have the same length; and 3) in each channel, the observations across cycles are independent and identically distributed. Sensed MFD are hard to interpret, unlike vector characteristics, where the mean and variance can describe process variability. We therefore decomposed the original signals into global and local patterns to extract process variation. The idea behind this decomposition is to separate global variations from local variations, so that global signal shifts and local variation patterns can be captured. Global variation was extracted by developing smoothed curves using weighted spline smoothing, in the sense that we increase local variations in problematic cycle intervals and decrease them elsewhere in the process cycle, based on previous process and engineering knowledge. Local variation was extracted by
PAGE 32
subtracting global variation from the original signals. Using this signal decomposition we can separate defective samples that were caused by a global shift in the process cycle from others that were caused by local variation in problematic process segments. Eq. 4.1 presents the decomposed components of the original signal:

$$Y(t) = G(t) + L(t) \qquad (4.1)$$

where $Y(t)$ is the original signal, $G(t)$ is the global variation pattern, and $L(t)$ is the local variation pattern. Our approach was applied to real forging data, and the decomposition can be illustrated with the faulty conditions of a real forging process. In Figure 4.1, three data sets were collected under different process conditions: normal condition, faulty condition one, which corresponds to deformity in the counterweight, and faulty condition two, which occurs due to incorrect die setup. Each data set, under each process condition, includes 41 samples of four channels representing tonnage signals as a function of 90 design points of process time. We noticed that the curves of faulty condition two behave almost the same as the normal condition curves. The faulty condition one occurs
PAGE 33
most likely in the problematic interval [20, 30] of the cycle, whereas faulty condition two takes place over the whole cycle as a shift in the tonnage signal. Faulty condition one behaves differently from the other two conditions in the problematic interval (see Figure 4.1) [1]. We can therefore approach the problem by extracting from the curves the data that distinguish the faulty conditions. The following section presents the methodology for decomposing signals into global and local patterns based on previous knowledge.

[Figure 4.1: two panels of Tonnage (KN) versus Time (s), each showing curves for Normal Condition, Faulty Condition One, and Faulty Condition Two; left panel over the full cycle, right panel over the interval from 20 to 30.]

Figure 4.1 Different Process Conditions. Left: global pattern; Right: local pattern in the Problematic Time Interval
PAGE 34
4.2 Global and Local Variations Extraction Using Weighted Spline Smoothing

4.2.1 Global and Local Variations Extraction

Using weighted spline smoothing for data sets under normal condition, we define the process weights vector ($w$) as a combination of different weights assigned based on the local moving average of squared generalized residuals [38]. This weight vector is then used to produce global patterns for other process conditions, in order to detect local and global variations for each process channel. With the weights vector, the spline smoothing curves are determined by three main factors: the local ordinary residuals, the roughness penalty, and the weight values. Eq. 4.2 gives the modified residual sum of squares, which shows the trade-off among the three factors:

$$S(G) = \sum_{i=1}^{n} w_i [y(t_i) - G(t_i)]^2 + \lambda \int [G''(t)]^2 \, dt \qquad (4.2)$$

The generalized residuals were used because they do not depend on the residual variance or on the corresponding value of the smoother matrix; in addition, they are useful in diagnostic procedures for detecting outliers. We notice in (4.3) that the generalized residuals are a scaled form of the ordinary residuals. The generalized residuals can be defined as follows [38]:

$$r_i = \frac{w_i^{1/2} \{ Y(t_i) - G(t_i) \}}{\hat{\sigma} \, \{ 1 - n^{-1} \operatorname{tr}(S) \}^{1/2}} \qquad (4.3)$$

where $\hat{\sigma}$ is an estimate of the residual standard deviation factor:

$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} w_i \{ Y(t_i) - G(t_i) \}^2}{n - \operatorname{tr}(S)} \qquad (4.4)$$
PAGE 35
Our approach starts by assigning a uniform vector of weights, $w_i = 1$. The initial values of the residual standard deviation and of the generalized residuals are used to find the first combination of weights, based on the local moving average of the squared generalized residuals. Eq. 4.5 shows the formula for finding the first weights vector at each design knot [38]:

$$w_i = \frac{d_i - m_i + 1}{\sum_{j=m_i}^{d_i} r_j^2} \qquad (4.5)$$

where $d_i = \min(n, i + k)$ and $m_i = \max(1, i - k)$. Since our data sets are of moderate size, $k$ was set to 5 as in [38]. The first weights combination can then be fed back into the weight estimation model to produce a new updated weight estimate $\hat{w}$ as follows:

$$\hat{w}_i = w_i \, \frac{d_i - m_i + 1}{\sum_{j=m_i}^{d_i} r_j^2} \qquad (4.6)$$

These steps are repeated until we reach an iteration at which the resulting combination of weights generates a generalized residuals plot with homogeneous variance, i.e., a random generalized residual plot, or until some sort of convergence occurs (see Figure 4.2). As a result, the weights vectors are approximately similar for data variables extracted from the same sensor among different samples under normal process condition,
PAGE 36
and there is no great need to have accurate weight values [38]. Therefore, an average weight value can be used to produce the global pattern for future samples.

[Figure 4.2: two panels of absolute generalized Residual versus time (s).]

Figure 4.2 Absolute generalized residuals for reweighted Tonnage Signal. Left: first reweighting iteration; Right: second reweighting iteration

The basic idea of finding the cubic polynomial for a given interval can be shown in the following equation [48]:

$$P_i(x) = a_i (x - x_i)^3 + b_i (x - x_i)^2 + c_i (x - x_i) + d_i \qquad (4.7)$$

for $i = 1, 2, \ldots, n-1$, where $a_i$, $b_i$, $c_i$, and $d_i$ are coefficients, and $(x - x_i)^3$, $(x - x_i)^2$, and $(x - x_i)$ are called the basis. Then the polynomial fit $G(t)$ over multiple intervals is represented as follows:

$$G(t) = \sum_{i=1}^{n} \theta_i N_i(t) \qquad (4.8)$$
PAGE 37
where $N_i$ is an $n$-dimensional set of basis functions for representing the natural splines, and $\theta_i$ stands for the basis coefficients. Based on the spline features of continuity and the existence of the first and second derivatives, the polynomial coefficients can then be determined. Since the solution for the chosen global function is a cubic spline, the function can be represented as follows:

$$G(t_i) = \sum_{j=1}^{n} w_j N_j(t_i) \qquad (4.9)$$

The new smoothing parameter ($\lambda_G$) is then determined using the extension of the cross validation method to the weighted case developed by Silverman in [38]. Consequently, the effective degrees of freedom generated using this approach, $Df_G$, can be determined using the following equation:

$$Df_G = \operatorname{trace}\!\left( 2^{-3/2} \, w_i \left( \sum_k w_k \right)^{-3/4} \lambda_G^{-1/4} \, f(t_i)^{-3/4} \right) \qquad (4.10)$$

where $\sum_k w_k$ is the sum of the weights of the design knots and $f(t_i)$ is the standard normal density function of the design points. Using the weighted spline smoothing or the corresponding degrees of freedom, the generated spline smoothing has the features required for process condition detection and discrimination. In Figure 4.3, we notice that this approach reveals that faulty condition one occurs due to high local variations in the problematic segment,
PAGE 38
whereas faulty condition two occurs due to a global shift in the tonnage values. This result agrees with the conclusions of the study conducted in [1].

[Figure 4.3: six panels of Tonnage (KN) versus Time (s). Top row: global pattern under normal condition, faulty condition one, and faulty condition two. Bottom row: local variability profile under normal condition, faulty condition one, and faulty condition two.]

Figure 4.3 Sample of Global and Local Variation Extraction
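The reweighting step of Section 4.2.1 can be sketched in a few lines. This is an illustration under an assumption about the form of the update, namely that each weight is divided by the local moving average of the squared generalized residuals over a window of half-width k; the function name `update_weights` and the toy residuals are hypothetical.

```python
def update_weights(weights, residuals, k=5):
    """One reweighting pass (assumed form): divide each weight by the
    local moving average of squared generalized residuals over the
    window from max(1, i-k) to min(n, i+k), written here 0-based."""
    n = len(residuals)
    new_weights = []
    for i in range(n):
        lo = max(0, i - k)          # plays the role of m_i
        hi = min(n - 1, i + k)      # plays the role of d_i
        window = residuals[lo:hi + 1]
        local_var = sum(r * r for r in window) / (hi - lo + 1)
        new_weights.append(weights[i] / local_var)  # assumes local_var > 0
    return new_weights

# Toy residuals: unit spread in the first half, three times larger afterwards.
residuals = [1.0] * 10 + [3.0] * 10
weights = update_weights([1.0] * 20, residuals, k=2)
print(weights[0], weights[19])
```

Knots in the noisier half receive smaller weights, so the smoothed global pattern follows them less closely and more of their variation is left in the local profile.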
PAGE 39
4.3 Summary

In this chapter we discussed the spline smoothing technique used to develop global patterns of the original data in order to discriminate between different faulty conditions. The weighted spline smoothed curves were used to produce the global pattern of the original signals. Weights at each data knot were determined using the local moving average of the generalized residuals. Local variability profiles were extracted by subtracting the smoothed global patterns from the original data curves. The next chapter presents the PCuR model and test used to detect faulty conditions and to distinguish between them. The selection of training data sets based on a clustering technique is also discussed.
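The effect of the decomposition on the two kinds of fault can be sketched numerically. The example below is a simplified stand-in, not the procedure used in the thesis: a discrete second-difference penalty smoother replaces the weighted cubic smoothing spline, the cycle shape, the weights, the fault magnitudes, and the penalty value are all hypothetical, and only the interval [20, 30] and the 90 design points are taken from the text.

```python
import numpy as np

def global_pattern(y, w, lam):
    """Weighted discrete roughness-penalty smoother
    (stand-in for the weighted smoothing spline)."""
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)
    return np.linalg.solve(np.diag(w) + lam * D.T @ D, w * y)

n = 90
t = np.arange(1, n + 1)
base = 1000.0 * np.sin(np.pi * t / n)            # hypothetical tonnage-like cycle
w = np.ones(n)
w[19:30] = 0.05                                  # down-weight the interval [20, 30]

shifted = base + 50.0                            # fault of the global-shift type
bumpy = base.copy()
bumpy[19:30] += 40.0 * (-1.0) ** np.arange(11)   # fault of the local-variation type

local = {}
for name, y in (("normal", base), ("global shift", shifted), ("local fault", bumpy)):
    G = global_pattern(y, w, lam=10.0)
    local[name] = y - G                          # L(t) = Y(t) - G(t)
    print(f"{name:13s} max |L| = {np.abs(local[name]).max():.2f}")
```

In this sketch the constant shift is absorbed entirely by the global pattern, so the local profile of the shifted signal matches the normal one, while the disturbance inside the problematic interval remains in the local profile.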
PAGE 40
Chapter 5. Monitoring Global and Local Variations of MFD

The global and local variations obtained from the original signals form a key feature of signal variation. For process condition detection and discrimination, we used the global and local patterns as inputs to the PCuR model.

5.1 Review of Principal Curve Regression Model

Recalling PCuR from Chapter 2, Huang [2] introduced the concept of Principal Curve Regression for tonnage signals as follows. Denote by $x_k$ the $p$-channel tonnage signal observed at the $k$-th crank angle of a cycle, $k = 1, 2, \ldots, m$, where $x_k = (x_{k1}, \ldots, x_{kp})^T$ and the signal is of the form $\{x_k\}_{k=1}^{m}$. If tonnage signals of $n$ parts are observed, denote by $\{x_{k,i}\}_{k=1}^{m}$ the $i$-th observation of $\{x_k\}_{k=1}^{m}$, where $x_{k,i} = (x_{k1,i}, \ldots, x_{kp,i})^T$ is the $i$-th observation of $x_k$, $i = 1, \ldots, n$ and $k = 1, \ldots, m$. The $n$ samples are of the form $\{X_k\}_{k=1}^{m}$, where the $n \times p$ matrix $X_k = [x_{k,1}, x_{k,2}, \ldots, x_{k,n}]^T$. The idea of PCuR is to extract the principal curve from $\{x_k\}_{k=1}^{m}$ and to build a regression model between the principal curve and $\{x_k\}_{k=1}^{m}$.
PAGE 41
Once the PCuR model is established from in-control tonnage signals, it can be used to determine whether a process change occurs in future observations. The principal curve of a new observation $\{z_k\}_{k=1}^{m}$ is treated as a new response $\{y_k\}_{k=1}^{m}$. A multivariate linear regression model is assumed to adequately model the "response" $y_k$ and the "predictors" $x_k$ at the $k$-th crank angle, $k = 1, \ldots, m$, i.e.,

$$y_k^T = [1 \;\; x_k^T] \, B_k + \varepsilon_k^T \qquad (5.1)$$

with

$$E(\varepsilon_k) = 0, \quad \operatorname{Cov}(\varepsilon_k) = \Sigma_k, \quad k = 1, \ldots, m \qquad (5.2)$$

By the results of Johnson and Wichern [50] on multivariate regression, the prediction ellipsoids for a new response $Y_k$ at time $k$ are

$$\left( Y_k - \hat{B}_k^T Z_k \right)^T \left( \frac{n}{n - p - 1} \hat{\Sigma}_k \right)^{-1} \left( Y_k - \hat{B}_k^T Z_k \right) \le \left( 1 + Z_k^T \left( [1 \;\; X_k]^T [1 \;\; X_k] \right)^{-1} Z_k \right) \frac{p (n - p - 1)}{n - 2p} \, F_{p,\, n-2p}(\alpha) \qquad (5.3)$$

where $Z_k = [1 \;\; z_k^T]^T$. We used the Bonferroni correction procedure to decrease the $\alpha$ value, so that fewer false alarms occur in the PCuR test. The primary $\alpha$ value for each data knot is 10%; therefore, with $M = 90$ time knots in the cycle, the actual testing value is $\alpha / M = 0.0011$ for every time knot. Huang [2] applied the PCuR test to the problematic interval, which resulted in a large number of out-of-control points under faulty condition one compared to the samples of faulty condition two.
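The per-knot testing level quoted above follows directly from the Bonferroni correction. The short sketch below reproduces that arithmetic and adds one check, under the extra assumption (made only for this illustration) that the knot-wise tests are independent.

```python
alpha = 0.10   # primary significance level, as stated in the text
M = 90         # number of time knots in one cycle

alpha_knot = alpha / M                       # Bonferroni-corrected level per knot
family_wise = 1.0 - (1.0 - alpha_knot) ** M  # cycle-wide rate if tests were independent

print(round(alpha_knot, 4))    # 0.0011
print(family_wise <= alpha)    # True
```

Testing each of the 90 knots at the corrected level keeps the chance of at least one false alarm per cycle at or below the primary 10% level.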
PAGE 42
5.2 Monitoring Global and Local Variations Using PCuR

In this research we applied the PCuR model to the forging samples without segmentation, i.e., a sample consists of 90 time knots, and the training data set includes the first thirty samples of the normal process condition. For process change detection, we applied the PCuR model to the global variation data sets, which were extracted by applying spline smoothing to the original signals. The normal condition data sets remain in control, whereas data points for faulty conditions one and two exceed the threshold at scattered knots, which indicates an out-of-control status for those data points. For faulty condition discrimination, we applied the PCuR model to the local variation profiles. As a result, all the samples under normal condition remain in control. However, all samples of faulty condition one are out of control, and 34 samples of faulty condition two remain in control. The following figure presents a sample result of applying PCuR to the extracted global variation using in-control data sets. Results of applying the PCuR test to global and local data sets under the other process conditions are shown at the end of this chapter.
PAGE 43
[Figure 5.1: twelve panels, InControlSample1 to InControlSample12; x-axis: Time k; y-axis: Test Statistics.]

Figure 5.1 Result of PCuR Test on Global Pattern Under Normal Condition

Using global variation data sets for process change detection, both global and local variations were detected. Local variation data sets, on the other hand, were used for faulty condition discrimination. Since faulty condition two is due to a global shift in the cycle, the corresponding local profile bears a resemblance to the local profile of the
PAGE 44
training data sets; therefore it appears in control. However, the local profiles for faulty condition one appear out of control due to local variations. The result indicates the existence of some false alarms when the PCuR model is applied to local variations for faulty condition discrimination. To identify the main cause of the false alarms, we conducted an exploratory study to capture the variability among samples.

5.3 Exploratory Study

According to Statistical Quality Control (SQC) concepts, an out-of-control status can be caused by variability and/or a shift in the sample mean. This concept motivated the study of a) the RSS of the local variability data sets under normal condition for each sample composed of four channels, $p = 4$:

$$RSS_s = \sum_{c=1}^{p} \sum_{i=1}^{n} \left( Y_c(t_i) - G_c(t_i) \right)^2 \qquad (5.4)$$

and b) the mean of the residuals coming from the four channels for each sample under normal condition (see Figure 5.2(a, b)):

$$\bar{R}_s = p^{-1} \sum_{c=1}^{p} \sum_{i=1}^{n} \left( Y_c(t_i) - G_c(t_i) \right) \qquad (5.5)$$

where $RSS_s$ is the residual sum of squares for sample $s$, $s = 1, \ldots, 41$; $\bar{R}_s$ is the residuals mean for sample $s$; and $Y_c(t_i)$ and $G_c(t_i)$ are the original and the spline smoothed data, respectively, at each knot $i$, $i = 1, \ldots, n$, for channel $c$, $c = 1, \ldots, p$.
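The two per-sample summaries of the exploratory study can be computed as in the sketch below. The function name `sample_summaries` and the toy numbers (two channels, three knots) are hypothetical; the computation follows the residual sum of squares over all channels and knots, and the sum of residuals divided by the number of channels p.

```python
def sample_summaries(Y, G):
    """Y and G hold one sample: a list of p channels, each a list of n values
    (original signal and global pattern). Returns (RSS_s, Rbar_s)."""
    p = len(Y)
    resid = [y - g for yc, gc in zip(Y, G) for y, g in zip(yc, gc)]
    rss = sum(r * r for r in resid)   # residual sum of squares over channels and knots
    rbar = sum(resid) / p             # residual total averaged over the p channels
    return rss, rbar

# Toy sample with p = 2 channels and n = 3 knots.
Y = [[1.0, 2.0, 4.0], [0.0, 1.0, 1.0]]
G = [[1.0, 2.5, 3.0], [0.5, 1.0, 0.0]]
print(sample_summaries(Y, G))   # (2.5, 0.5)
```

Applying the function to every in-control sample gives the two series plotted against the sample index in the exploratory study.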
PAGE 45
The following figure presents the RSS magnitudes and the mean of residuals for all the samples collected under normal condition.

[Figure 5.2: two panels plotted against Samples. (a) RSS Under Normal Condition; (b) Mean of Residuals (Mean of Local Variability) Under Normal Condition.]

Figure 5.2 RSS and Mean of Residuals
PAGE 46
From each study we chose the twenty samples with the lowest variation as a training data set, so that the PCuR model would predict faulty condition two as in control, since that condition is caused by a global signal shift. Eighteen samples were common to the two groups $N_{RSS}$ and $N_R$, as follows:

$$N_{RSS} \cup N_R = \{1, 2, 3, 5, 7, 8, 9, 10, 13, 15, 16, 17, 19, 20, 24, 27, 30, 31, 36, 38, 40, 41\}$$

$$N_{RSS} \cap N_R = \{1, 5, 9, 10, 13, 15, 16, 17, 19, 20, 24, 27, 30, 31, 36, 38, 40, 41\}$$

Using the obtained samples as the training data set for the PCuR test, the results improved: the number of false alarms in detecting faulty condition two fell from 7 to 5. The five false alarm samples of faulty condition two were out of control at a few separate, scattered knots, i.e., not in the shape of a cluster. Logically, this does not reflect the occurrence of a defective condition in the workpiece, since for a specimen to be defective, thermoplastic deformation may last for a time interval, which would appear as a cluster of out-of-control points in the PCuR model. To interpret the result obtained by the PCuR test for faulty condition two, we conducted the same exploratory study to capture the variability among the samples collected under faulty condition two. The samples with high variation and a shift in the mean, $F_{RSS}$ and $F_R$, are listed below:

$$F_{RSS} = \{1, 2, 3, 5, 6, 7, 16, 22, 26, 27, 32, 33, 34, 35, 36, 38, 39, 41\}$$

$$F_R = \{1, 2, 3, 25, 27, 33, 35\}$$
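The selection of training samples can be sketched with plain set operations. The helper `lowest` is hypothetical, and reading the two index lists above as the union and the intersection of the two twenty-sample groups is an interpretation of the text rather than something it states explicitly; the check at the end only confirms that the counts are mutually consistent.

```python
def lowest(values, count):
    """1-based sample indices of the `count` smallest values."""
    order = sorted(range(len(values)), key=lambda s: values[s])
    return {s + 1 for s in order[:count]}

# Index sets as listed in the text (interpreted as union and intersection).
union = {1, 2, 3, 5, 7, 8, 9, 10, 13, 15, 16, 17, 19, 20,
         24, 27, 30, 31, 36, 38, 40, 41}
common = {1, 5, 9, 10, 13, 15, 16, 17, 19, 20,
          24, 27, 30, 31, 36, 38, 40, 41}

# Two groups of 20 samples sharing 18 must cover 20 + 20 - 18 = 22 samples.
print(len(common), len(union), common <= union)

# Toy use of the helper: the two smallest of four hypothetical RSS values.
print(lowest([3.0, 1.0, 2.0, 0.5], 2))   # {2, 4}
```

With per-sample RSS and residual-mean values available, calling `lowest` on each series with a count of 20 and intersecting the two results would give the candidate training set.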
PAGE 47
The false alarm samples for faulty condition two ($F$) are:

$$F = \{1, 2, 3, 33, 35\}$$

We notice that faulty condition two has two groups of variability in its data set. The existence of some samples with high local variation caused the false alarms in discriminating faulty condition two.

5.4 Summary

In this chapter, global and local variations under the normal process condition were used as training data sets for monitoring with the PCuR model and test. Global variation was used to detect process change, whereas local variation was used to discriminate among faulty conditions. However, some false alarms occurred; therefore, we conducted an exploratory study to capture the variability among the in-control samples. The study was based on grouping and cluster analysis. The next chapter presents a simulation study performed to verify the proposed approach for decomposing MFD into global and local components. Results of applying the PCuR test to different process conditions using global and local variations are shown below:
PAGE 48
[Figure 5.3: twelve panels, one per faulty-condition-one sample; x-axis: Time k; y-axis: Test Statistics; out-of-control time knots are labeled in each panel.]

Figure 5.3 Result of PCuR Test on Global Pattern Under Faulty Condition One
Figure 5.4 Result of PCuR Test on Global Pattern Under Faulty Condition Two (twelve panels, outF21 through outF212; x-axis: Time k; y-axis: Test Statistics)
Figure 5.5 Result of PCuR Test on Local Pattern Under Normal Condition (twelve panels, InControlSample1 through InControlSample12; x-axis: Time k; y-axis: Test Statistics)
Figure 5.6 Result of PCuR Test on Local Pattern Under Faulty Condition One (twelve panels, one per faulty condition one sample; x-axis: Time k; y-axis: Test Statistics)
Figure 5.7 Result of PCuR Test on Local Pattern Under Faulty Condition Two (twelve panels, outF21 through outF212; x-axis: Time k; y-axis: Test Statistics)
Chapter 6. Simulation Study

6.1 Simulation Data

We conducted the simulation study to verify the applicability of our approach to other applications and to different types of process variables. We simulated a bivariate process with three process conditions: normal condition, a faulty condition due to two problematic segments in the cycle (faulty condition one), and a faulty condition due to a global shift in the signal (faulty condition two). Thirty samples were simulated for every process variable, with a cycle length equal to 90 knots. The signal functions under normal condition are given by the following equation:

$$Y_k = [\,1 + \cos(x),\; 1 + \sin(x)\,]^T + \varepsilon_k \qquad (6.1)$$

where $Y_k$ denotes the channel values at time $k$, $k = 1, \ldots, 90$. The noise $\varepsilon_k$ is generated from a normal distribution. The covariance structure of the noise is

$$\sigma^2 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

Different values of the variance $\sigma^2$ were used to verify the approach under different variability structures in the signals. The approach was able to detect and discriminate faulty conditions under different combinations of covariance structures. Faulty condition one was generated by adding two deviations that shift every in-control sample, for both variables, in two problematic time intervals. The length of each problematic segment is 10 knots.
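The simulation design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study: the mapping from knot index to x, the noise level `SIGMA`, and the shift sizes `LOCAL_SHIFT` and `GLOBAL_SHIFT` are hypothetical values chosen for the example, and the two problematic intervals are taken from the [20:30] and [50:60] segments reported in Section 6.2.

```python
import math
import random

N_KNOTS = 90                      # cycle length stated in the text
N_SAMPLES = 30                    # samples per condition stated in the text
SEGMENTS = [(20, 30), (50, 60)]   # intervals reported in Section 6.2, 10 knots each


def normal_cycle(sigma, rng):
    """One in-control bivariate cycle following Eq. (6.1)."""
    cycle = []
    for k in range(1, N_KNOTS + 1):
        x = 2.0 * k / N_KNOTS     # ASSUMED mapping from knot index to x
        cycle.append((1.0 + math.cos(x) + rng.gauss(0.0, sigma),
                      1.0 + math.sin(x) + rng.gauss(0.0, sigma)))
    return cycle


def in_problem_segment(k):
    """True when knot k (numbered from 1) lies in a problematic interval."""
    return any(lo <= k < hi for lo, hi in SEGMENTS)


def add_local_fault(cycle, shift):
    """Faulty condition one: shift both channels inside the two segments only."""
    return [(a + shift, b + shift) if in_problem_segment(k) else (a, b)
            for k, (a, b) in enumerate(cycle, start=1)]


def add_global_shift(cycle, shift):
    """Faulty condition two: shift the whole cycle."""
    return [(a + shift, b + shift) for a, b in cycle]


rng = random.Random(0)
SIGMA, LOCAL_SHIFT, GLOBAL_SHIFT = 0.05, 0.3, 0.5   # illustrative values only
normal = [normal_cycle(SIGMA, rng) for _ in range(N_SAMPLES)]
fault_one = [add_local_fault(c, LOCAL_SHIFT) for c in normal]
fault_two = [add_global_shift(c, GLOBAL_SHIFT) for c in normal]
```

Drawing the two noise terms independently with the same standard deviation corresponds to the diagonal covariance structure given above; changing `SIGMA` mimics the different variance settings mentioned in the text.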
The whole simulated signal was shifted to generate the channels for faulty condition two. Figure 6.1 presents a sample of the simulated channels with their principal curve under normal condition.

Figure 6.1 Simulated Channels and their Corresponding Principal Curve (x-axis: CH1; y-axis: CH2)

6.2 Simulation Result

Global and local patterns were extracted by applying weighted spline smoothing to the in-control data sets. Weights were determined using the local moving average of the generalized residuals. Two iterations were needed before convergence occurred. Since the two channels represent two different variables, two average weight vectors were produced. The weight vectors were used in fitting the two channels of future samples under different process conditions. Figure 6.2 shows the extraction of global and local variations for one channel of the simulated data.
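The decomposition step can be illustrated with a small, dependency-free sketch. It is not the thesis implementation: a discrete penalized smoother (a Whittaker-type smoother with a second-difference penalty) is swapped in for the weighted smoothing spline, absolute residuals stand in for the generalized residuals, and the rule that down-weights knots with a large local moving average is an assumption. The names `whittaker_smooth`, `decompose`, `lam`, and `half_window` are hypothetical.

```python
def whittaker_smooth(y, weights, lam):
    """Minimize sum w_i (y_i - g_i)^2 + lam * sum (second differences of g)^2."""
    n = len(y)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = weights[i]
    d = (1.0, -2.0, 1.0)                      # second-difference stencil
    for i in range(n - 2):
        for p in range(3):
            for q in range(3):
                a[i + p][i + q] += lam * d[p] * d[q]
    b = [w * v for w, v in zip(weights, y)]
    # Banded Gaussian elimination; the system is symmetric positive definite
    # for positive weights, so no pivoting is used.
    for col in range(n):
        for row in range(col + 1, min(col + 3, n)):
            f = a[row][col] / a[col][col]
            for j in range(col, min(col + 3, n)):
                a[row][j] -= f * a[col][j]
            b[row] -= f * b[col]
    g = [0.0] * n
    for row in range(n - 1, -1, -1):          # back substitution
        s = b[row]
        for j in range(row + 1, min(row + 3, n)):
            s -= a[row][j] * g[j]
        g[row] = s / a[row][row]
    return g


def moving_average(values, half_window):
    """Centered moving average, truncated at the ends of the cycle."""
    n = len(values)
    return [sum(values[max(0, i - half_window):min(n, i + half_window + 1)])
            / (min(n, i + half_window + 1) - max(0, i - half_window))
            for i in range(n)]


def decompose(y, lam=50.0, half_window=3, n_iter=2):
    """Return (global, local, weights) for one channel of one cycle."""
    weights = [1.0] * len(y)
    for _ in range(n_iter):                   # the text reports two iterations
        fit = whittaker_smooth(y, weights, lam)
        resid = [abs(v - g) for v, g in zip(y, fit)]
        local_avg = moving_average(resid, half_window)
        scale = sum(local_avg) / len(local_avg) or 1.0
        # ASSUMPTION: knots with large local residuals get smaller weights.
        weights = [1.0 / (1.0 + m / scale) for m in local_avg]
    global_part = whittaker_smooth(y, weights, lam)
    local_part = [v - g for v, g in zip(y, global_part)]
    return global_part, local_part, weights
```

Calling `decompose` on each channel of each in-control cycle and averaging the returned weights per channel would give one weight vector per variable, in the spirit of the two average weight vectors mentioned above.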
Figure 6.2 Example of Global and Local Variation Extraction: Simulated Data (top row: Simulated Variable 1 versus Time; bottom row: Local Variation versus Time; panels: Under Normal Condition, Under Faulty Condition One, Under Faulty Condition Two)

The local variation under normal condition resembles white noise. Under faulty condition one, however, the variation magnitude in the problematic segments is relatively higher than elsewhere in the simulated cycle. This indicates that faulty condition one is due to two problematic segments in the time intervals [20:30] and [50:60]. On the other hand, local variation under faulty condition two is similar to the
variation under normal condition, which shows that the cause of faulty condition two is a global shift in the signal.

Figure 6.3 Sample Results of Applying PCuR on Simulated Data. The first row shows results for global data; the second row shows results for local variations (panels: Under Normal Condition, Under Faulty Condition One, Under Faulty Condition Two; x-axis: Time k; y-axis: Test Statistics)

For the purpose of process change detection, we used global variation as input to the PCuR test. As a result, all samples under normal condition remained in control, whereas all samples for faulty conditions one and two were out of control. On the other hand, using local variation as input to the PCuR test, we could discriminate among process faulty conditions. All training data sets under normal condition remained in
control, and all samples of faulty condition one were out of control, whereas all the samples of faulty condition two remained in control. See Figure 6.3.

6.3 Summary

In this chapter, a simulation study was conducted to verify the developed approach. Three data sets were simulated, representing bivariate data under three different process conditions: normal condition; faulty condition one, which is due to two local variations in the simulated cycles; and faulty condition two, which appears as a global shift in the whole simulated cycle. Results were satisfactory. Using the PCuR test, global data appeared out of control for the faulty conditions and in control for the training data. Local variation data, however, showed out-of-control status for faulty condition one and remained in control for the in-control data sets and faulty condition two.
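The two-stage use of the global and local patterns summarized above amounts to a simple decision rule, sketched below. The PCuR test statistics themselves are not computed here; the per-knot statistics and the control limits are assumed to be supplied by the test, and the names `is_out_of_control`, `classify_sample`, `global_limit`, and `local_limit` are hypothetical.

```python
def is_out_of_control(test_statistics, control_limit):
    """Flag a sample when any per-knot test statistic exceeds the limit."""
    return any(t > control_limit for t in test_statistics)


def classify_sample(global_stats, local_stats, global_limit, local_limit):
    """Two-stage rule: detect a change on the global pattern, then use the
    local pattern to tell the faulty conditions apart."""
    if not is_out_of_control(global_stats, global_limit):
        return "normal"
    if is_out_of_control(local_stats, local_limit):
        return "local fault"      # behaves like faulty condition one
    return "global shift"         # behaves like faulty condition two
```

For example, a sample whose global statistics exceed the limit while its local statistics stay below it would be labeled "global shift", which matches the behavior reported for faulty condition two.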
Chapter 7. Conclusion and Future Research Suggestions

This chapter presents the conclusions of this research, summarizes its main contributions, and suggests directions for future research.

7.1 Conclusion

This research presents a monitoring scheme that is able to handle and analyze multichannel functional data by decomposing the data into global and local variations. In our methodology, we used a weighted spline smoothing technique to generate global patterns for the purpose of detecting global shifts and variations that cause faulty conditions. Weights were determined using the local moving average of the generalized residuals for diagnostic procedures. Local variations were obtained by subtracting global variations from the original data. Using local variations as a key feature to identify samples with local faults, we could discriminate among different faulty conditions. The data extracted from both the global and local approaches were tested using the PCuR test. Our example and results were illustrated using forging data sets from industry. Furthermore, we simulated data to verify the proposed approach, and the results were satisfactory since no false alarms were observed. Therefore, our main contributions and achievements in this research can be summarized as follows:
• MFD interpretation by decomposing MFD into global and local components.
• Process change detection using the global variations extracted from MFD, by analyzing the whole cycle in a given industrial process.
• Discrimination among different process conditions using local variation as input to the PCuR test.

7.2 Future Research Suggestions

A further extension of this study could focus on determining which process variable contributes significantly to the occurrence of faulty conditions. This work can start by decomposing the Hotelling $T^2$ statistic into independent components for each variable, as in [51]. However, correlation over time should be considered in MFD. Another desirable research direction is to estimate a confidence region for MFD or curve data using multiple samples. Such a region has been derived for a single sample, based on the finite-dimensional Bayesian formulation of curve estimation [38]. Finding a confidence region for multi-sample data would provide a solid structure for detecting faulty process conditions with MFD. A final research suggestion is to formulate an automatic approach to determine the amount of smoothness for the purpose of process change detection. This can be achieved by determining an optimal value of the effective degrees of freedom by which local variations can be revealed. The optimal value can be controlled by assigning an objective function that maximizes the ratio of local variation in the problematic
segment to the variation in the rest of the process cycle. Other spline smoothing parameters can be determined using the variation and the smoother matrix, as illustrated in [38].
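The objective function suggested above can be sketched as follows. This is only one possible reading, under stated assumptions: "variation" is measured as the mean square of the local component, the problematic segments are assumed known, and a centered moving average stands in for the spline smoother, with its half-window playing the role of the smoothing level. The names `segment_ratio`, `local_component`, and `pick_half_window` are hypothetical.

```python
def mean_square(values):
    """Mean of squared values, used here as the measure of variation."""
    return sum(v * v for v in values) / len(values)


def segment_ratio(local, segments):
    """Variation of the local component inside the segments over that outside.

    Knots are numbered from 1; every segment is assumed to lie inside the cycle.
    """
    inside, outside = [], []
    for k, v in enumerate(local, start=1):
        in_segment = any(lo <= k < hi for lo, hi in segments)
        (inside if in_segment else outside).append(v)
    denom = mean_square(outside)
    return mean_square(inside) / denom if denom > 0 else float("inf")


def local_component(y, half_window):
    """Stand-in smoother: the signal minus its centered moving average."""
    n = len(y)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        out.append(y[i] - sum(y[lo:hi]) / (hi - lo))
    return out


def pick_half_window(y, candidates, segments):
    """Return the candidate smoothing level with the largest segment ratio."""
    return max(candidates,
               key=lambda h: segment_ratio(local_component(y, h), segments))
```

With a spline smoother, the candidates would be values of the effective degrees of freedom rather than window widths; the selection logic would stay the same.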
References

[1] J. Kim, Q. Huang, J. Shi, 2005, Online Multi-Channel Forging Tonnage Monitoring and Fault Pattern Classification Using Principal Curve, ASME Transactions, Journal of Manufacturing Science and Engineering, in press.
[2] Q. Huang, 2005, Principal Curve Regression and Analysis of Multiple Curve Data, submitted to IIE Transactions.
[3] M. Jeong, J. Lu, 2003, Wavelet-Based SPC Procedure for Complicated Functional, conference paper, INFORMS Annual Meeting.
[4] C. A. Lowry, D. C. Montgomery, 1995, A Review of Multivariate Control Charts, IIE Transactions, 27(6), pp. 800-811.
[5] F. B. Alt, 1985, Multivariate Quality Control, in S. Kotz & N. L. Johnson (Eds.), Encyclopedia of Statistical Sciences, New York: John Wiley & Sons, 6, pp. 110-122.
[6] J. A. Alloway, Jr., M. Raghavachari, 1990, Multivariate Control Charts Based on Trimmed Means, ASQC Quality Congress Transactions, San Francisco.
[7] J. William, W. Woodall, J. Birch, 2003, Phase I Analysis of Nonlinear Product and Process Quality Profiles, submitted to Journal of Quality Technology.
[8] D. M. Hawkins, 1991, Multivariate Quality Control Based on Regression-Adjusted Variables, Technometrics, 33(1), pp. 61-75.
[9] D. M. Hawkins, 1993, Regression Adjustment for Variables in Multivariate Quality Control, Journal of Quality Technology, 25(3), pp. 170-181.
[10] J. J. Pignatiello, Jr., 1985, Development of Multivariate CUSUM Chart, Computers in Engineering, Proceedings of the International Computers in Engineering Conference, 2, pp. 427-432.
[11] C. Fuchs, Y. Benjamini, 1994, Multivariate Profile Charts for Statistical Process Control, Technometrics, 36(2), pp. 182-195.
[12] J. E. Jackson, R. H. Morris, 1957, An Application of Multivariate Quality Control to Photographic Processing, American Statistical Association Journal (June), 2, pp. 186-199.
[13] R. Sparks, A. Adolphson, A. Phatak, 1997, Multivariate Process Monitoring Using the Dynamic Biplot, International Statistical Review / Revue Internationale de Statistique, 65(3), pp. 325-349.
[14] D. W. Apley, J. Shi, 2001, A Factor-Analysis Method for Diagnosing Variability in Multivariate Manufacturing Processes, Technometrics, 43(1), pp. 84-95.
[15] Y. Ding, P. Kim, D. Ceglarek, J. Jin, 2003, Optimal Sensor Distribution for Variation Diagnosis in Multistation Assembly Processes, IEEE Transactions on Robotics and Automation, 19(4), pp. 543-556.
[16] M. Rodríguez-Méndez, A. Arrieta, V. Parra, A. Bernal, A. Vegas, S. Villanueva, R. Gutiérrez-Osuna, J. de Saja, 2004, Fusion of Three Sensory Modalities for the Multimodal Characterization of Red Wines, IEEE Sensors Journal, 4(3), pp. 348-355.
[17] R. Luo, C. Yih, K. Su, 2002, Multisensor Fusion and Integration: Approaches, Applications, and Future Research Directions, IEEE Sensors Journal, 2(2), pp. 107-119.
[18] S. Suranthiran, S. Jayasuriya, 2004, Optimal Fusion of Multiple Nonlinear Sensor Data, IEEE Sensors Journal, 4(5), pp. 651-663.
[19] M. Mahmoud, W. Woodall, P. Paker, D. Hawkins, 2004, A Change Point Method for Linear Profile Data, submitted to Journal of Quality Technology.
[20] J. Jin, J. Shi, 2001, Automatic Feature Extraction of Waveform Signals for In-process Diagnostic Performance Improvement, Journal of Intelligent Manufacturing, 12, pp. 267-268.
[21] J. Jin, J. Shi, 2000, Diagnostic Feature Extraction from Stamping Tonnage Signals Based on Design of Experiment, ASME Transactions, Journal of Manufacturing Science and Engineering, 122(2), pp. 360-369.
[22] J. Jin, J. Shi, 1999, Feature-Preserving Data Compression of Stamping Tonnage Information Using Wavelets, Technometrics, 41, pp. 327-339.
[23] C. K. H. Koh, J. Shi, J. Black, 1996, Tonnage Signature Attribute Analysis for Stamping Process, NAMRI/SME Transactions, 23, pp. 193-198.
[24] C. K. H. Koh, J. Shi, W. Williams, J. Ni, 1999, Multiple Fault Detection and Isolation Using the Haar Transform Part 1: Theory, ASME Transactions, Journal of Manufacturing Science and Engineering, 121(2), pp. 290-294.
[25] J. Jin, 2004, Individual Station Monitoring Using Press Tonnage Sensors for Multiple Operation Stamping Processes, ASME Transactions, Journal of Manufacturing Science and Engineering, 126(1), pp. 83-90.
[26] S. Zhou, N. Jin, J. Jin, 2004, Cycle-based Signal Monitoring Using a Directionally Variant Control Chart System, accepted by IIE Transactions on Quality and Reliability.
[27] T. Hastie, W. Stuetzle, 1989, Principal Curves, Journal of the American Statistical Association, 84, pp. 502-516.
[28] T. Duchamp, W. Stuetzle, 1996, Extremal Properties of Principal Curves in the Plane, Annals of Statistics, 24, pp. 1511-1520.
[29] B. Kégl, A. Krzyzak, T. Linder, K. Zeger, 2000, Learning and Design of Principal Curves, IEEE Transactions on Pattern Analysis and Machine Intelligence, 22, pp. 281-297.
[30] J. D. Banfield, A. E. Raftery, 1992, Ice Floe Identification in Satellite Images Using Mathematical Morphology and Clustering about Principal Curves, Journal of the American Statistical Association, 87, pp. 7-16.
[31] M. LeBlanc, R. J. Tibshirani, 1994, Adaptive Principal Surfaces, Journal of the American Statistical Association, 89, pp. 53-64.
[32] D. Stanford, A. E. Raftery, 1997, Principal Curve Clustering with Noise, Technical Report 317, Department of Statistics, University of Washington.
[33] K. Chang, J. Ghosh, 1998, Principal Curves for Nonlinear Feature Extraction and Classification, Applications of Artificial Neural Networks in Image Processing III, SPIE Photonics West '98 Electronic Image Conference, SPIE, 3307, pp. 120-129.
[34] O. L. Mangasarian, L. L. Schumaker, 1969, Splines via Optimal Control, in Approximations with Special Emphasis on Spline Functions (Ed. I. J. Schoenberg), New York: Academic Press, pp. 119-156.
[35] E. Wegman, I. Wright, 1982, Splines in Statistics, Journal of the American Statistical Association, 78(382), pp. 351-365.
[36] P. Eilers, B. Marx, 1996, Flexible Smoothing with B-Spline and Penalties, Statistical Science, 11(2), pp. 112-114.
[37] T. Hastie, R. Tibshirani, J. Friedman, 2001, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, New York: Springer-Verlag, ch. 5.
[38] B. W. Silverman, 1985, Some Aspects of the Spline Smoothing Approach to Non-Parametric Regression Curve Fitting, Journal of the Royal Statistical Society, 47(1), pp. 1-52.
[39] G. Wahba, S. Wold, 1975a, A Completely Automatic French Curve: Fitting Spline Functions by Cross-Validation, Communications in Statistics, 4, pp. 1-17.
[40] G. Wahba, S. Wold, 1975b, Periodic Splines for Spectral Density Estimation: The Use of Cross-Validation for Determining the Degree of Smoothing, Communications in Statistics, 4, pp. 125-141.
[41] G. Wahba, 1976, A Survey of Some Smoothing Problems and the Methods of Generalized Cross-Validation for Solving Them, Applications of Statistics (Ed. P. R. Krishnaiah), 1, pp. 507-534.
[42] G. Wahba, 1979, Automatic Smoothing of the Log Periodogram, Journal of the American Statistical Association, 75, pp. 122-132.
[43] G. Wahba, 1980, Spline Bases, Regularization, and Generalized Cross-Validation for Solving Approximation Problems with Large Quantities of Noisy Data, Proceedings of the International Conference on Approximation Theory in Honor of George Lorentz (Ed. Ward Cheney), Academic Press.
[44] R. D. Cook, S. Weisberg, 1982, Residuals and Influence in Regression, London: Chapman and Hall.
[45] P. Craven, G. Wahba, 1979, Smoothing Noisy Data with Spline Functions, Numer. Math., 31, pp. 377-403.
[46] J. Fleisher, 1979, Spline Smoothing Routines, Reference Manual for the 1110, Academic Computer Center, The University of Wisconsin, Madison.
[47] P. Merz, 1978, Spline Smoothing by Generalized Cross-Validation: A Technique for Data Smoothing, Chevron Research Company, Richmond, CA.
[48] L. Paihua-Montes, 1979, Quelques Methodes Numeriques pour le Calcul de Fonctions Splines a Une et Plusieurs Variables, Thesis, Universite Scientifique et Medicale de Grenoble.
[49] B. W. Silverman, 1984, A Fast and Efficient Cross Validation Method for Smoothing Parameter Choice in Spline Regression, Journal of the American Statistical Association, 79, pp. 584-589.
[50] R. Johnson, D. Wichern, 2002, Applied Multivariate Statistical Analysis (5th ed.), Prentice Hall, pp. 383-398.
[51] R. Mason, N. Tracy, J. Young, 1995, Decomposition of $T^2$ for Multivariate Control Chart Interpretation, Journal of Quality Technology, 27(2), pp. 99-108.
