Treating fuzziness in subjective evaluation data


Information Sciences 176 (2006) 3610–3644 www.elsevier.com/locate/ins

Yoshiteru Nakamori a,*, Mina Ryoke b

a School of Knowledge Science, Japan Advanced Institute of Science and Technology, 1-1 Asahidai, Tatsunokuchi, Ishikawa 923-1292, Japan
b Graduate School of Business Sciences, University of Tsukuba, Tsukuba, Japan

Received 23 January 2005; received in revised form 1 February 2006; accepted 3 February 2006

Abstract

This paper proposes a technique to deal with fuzziness in subjective evaluation data, and applies it to principal component analysis and correspondence analysis. In the existing method, or techniques developed directly from it, fuzzy sets are defined from some standpoint on a data space, and the fuzzy parameters of the statistical model are identified with linear programming or the method of least squares. In this paper, we try to map the variation in evaluation data into the parameter space while preserving information as much as possible, and thereby define fuzzy sets in the parameter space. Clearly, it is possible to use the obtained fuzzy model to derive things like the principal component scores from the extension principle. However, with a fuzzy model which uses the extension principle, the possibility distribution spreads out as the explanatory variable values increase. This does not necessarily make sense for subjective evaluations, such as a 5-level evaluation, for instance. Instead of doing so, we propose a method for explicitly expressing the vagueness of evaluation, using certain quantities related to the eigenvalues of a matrix which specifies the fuzzy parameter spread. As a numerical example, we present an analysis of subjective evaluation data on local environments. © 2006 Elsevier Inc. All rights reserved.

* Corresponding author. Tel.: +81 761 51 1755; fax: +81 761 51 1149. E-mail address: [email protected] (Y. Nakamori).

0020-0255/$ - see front matter © 2006 Elsevier Inc. All rights reserved. doi:10.1016/j.ins.2006.02.015

Y. Nakamori, M. Ryoke / Information Sciences 176 (2006) 3610–3644


Keywords: Subjective evaluation data; Fuzzy principal component analysis; Fuzzy correspondence analysis; Local environment evaluation

1. Introduction

Two approaches have been proposed for extending fuzzy logic to principal component analysis. The first method introduces the concept of a fuzzy group, assigns membership values of data vectors in the fuzzy group, and then performs the principal component analysis in which the membership values are used as the weights of data vectors [18]. For example, when given attributes such as operating profit margin, size and growth rate for many information industry companies, a principal component analysis can be formulated by taking the sales ratio of the information industry department of each company as the data weight which corresponds to the membership value in the fuzzy group called “information industry”.

The second technique is a principal component analysis in which the data are given as fuzzy numbers [19]. For example, when given 5 years' worth of the above attribute data on information industry companies, the possibility distribution of the attribute values is expressed, from that data, as fuzzy numbers. To reflect that possibility in the principal component, a linear programming problem is formulated, which includes an ordinary eigenvalue problem in principal component analysis, and the fuzzy principal component scores are derived by solving it.

The data covered in this paper are 3-mode data, in which multiple objects were subjectively evaluated using multiple evaluation criteria. If the multiple evaluators are identified with “5 years of data”, the multiple objects with “multiple information industry companies”, and the multiple evaluation criteria with “a number of attributes such as operating margin, scale and growth rate”, it is theoretically possible to apply the second technique above. Incidentally, traditional techniques for handling 3-mode data are roughly grouped into two types [1]. One is the PARAFAC model [4], and the other is the Tucker model [15].
Research on both of these models is ongoing, but in the most general terms, these methods are being employed to discover models shared across all modes. Therefore, they are not suitable for use in expressing variation to understand each evaluation object or evaluation item, like that being emphasized in this paper. The data handled here are values obtained by subjectively evaluating various aspects of the objects with five levels. While they are crisp values, they are also ‘‘volatile’’ values, depending on the values of the evaluator and the situation. For example, in response to the question ‘‘Can fish caught in the rivers and ponds near the region where you live be eaten?’’, the data is comprised of


five levels of answers, ranging from ‘‘No, they cannot be eaten at all’’ to ‘‘Yes, they can be eaten without any problems’’. Because they are not being asked their preferences, evaluators attempt to reply as objectively as possible, but the result is a data set with large variance due to factors such as sensitivity of evaluators to the environment. Here, this sort of data is called sensibility data, or ‘‘kansei data’’ in Japanese. When we attempt to extract the features of a sensibility data set using the conventional multivariate analysis technique, it is easiest to use average data relating to evaluators, and this contains a certain degree of information. However, to more effectively use information inherent in the evaluation of objects using human sensibility, it is important to develop techniques for modeling individual differences in evaluation and in the spread of vagueness. It is of interest to note that subjective evaluation data represented by linguistic terms has been extensively used in linguistic decision analysis, e.g., [2,17,6]. Here, we attempt modeling through a concept that differs from the technique of Yabuuchi and Watada [18,19]. This is one attempt to numerically quantify the vagueness inherent in data, but it is difficult to compare the superiority/inferiority of models based on different concepts or ideas when there is no specific external standard to be predicted. Here, we are attempting to solve problems where weights cannot be assigned to the objects of analysis or the evaluators, and therefore the first technique of Yabuuchi and Watada cannot be applied. As mentioned earlier, their second technique can be applied; in this research, however, we focused primarily on discovering the principal components and weight parameters which seem to preserve the differences of opinion among evaluators in the data space. 
Although Yabuuchi and Watada’s second method allows us to adopt certain evaluation criteria and uniquely determine the fuzzy weights, it is based on linear programming, and thus the approach does not entail preserving differences of opinion. On the other hand, the main topic in this research is comparing the objects in principal component analysis, and we maintain that it should be sufficient to derive relative fuzziness.

The mathematical model of vague concepts was first introduced by Zadeh [20], using the notion of partial degrees of membership. Since then, the problem of efficiently constructing membership functions of fuzzy sets in a given application has been studied by many distinguished fuzzy scholars including Turksen [16], Kruse et al. [7], and Pedrycz [10], among others. The specific meaning of a vague concept in a proposition is usually evaluated in different ways for different assessments of an entity by different agents, contexts, etc. [11]. Huynh et al. [5] show that the context model [3] provides a practical framework for constructing membership functions of fuzzy concepts. This paper also tries to construct membership functions, but ones that express the fuzziness of principal component scores, in a particular situation where we have to treat a set of subjective evaluation data. Unlike regression analysis, principal component analysis does not assume the existence of external variables which


can act as criteria to justify the membership functions. Therefore, in this paper, we try to construct membership functions that express the relative fuzziness between principal component scores. In the next section, we describe the background of this research, and the data structures handled. Then, we consider two methods for finding the fuzzy principal component scores: fuzzifying the sensibility data, and fuzzifying the weight parameters of the principal component model. The former method is used in analyzing the objects of evaluation. The latter method, on the other hand, provides a model for performing overall evaluations when new evaluation data has been obtained. Based on this model, we can obtain the fuzzy principal component scores from the extension principle. However, a possibility model which uses the extension principle has the additional feature of a possibility distribution which spreads out as the value of the explanatory variable increases. Accordingly, in this paper, we propose a method for explicitly expressing the evaluation vagueness, using certain quantities related to the eigenvalues of a matrix which specifies the fuzzy parameter spread. As a numerical example, we carry out an analysis of data obtained by having residents conduct a sensory evaluation of their local environment. This paper is primarily concerned with fuzzy principal component analysis; fuzzy correspondence analysis is described briefly as its direct application.

2. Purpose of research and model structure

2.1. Purpose of research

We indicate the objects of evaluation with m = 1, 2, ..., M, the evaluation items with n = 1, 2, ..., N, and the evaluators with k = 1, 2, ..., K. For example, we handle 3-mode structure data as follows:

• Evaluation objects: (m = 1) Examination student 1, (m = 2) Examination student 2, ...
• Evaluation items: (n = 1) Scholastic record, (n = 2) Human qualities, (n = 3) Future potential, ...
• Evaluators: (k = 1) Examiner A, (k = 2) Examiner B, ...

Here, the evaluation value $z_{mnk}$ of examiner k regarding evaluation object m from the standpoint of evaluation item n is often given as a 5-level value, but this is an extremely vague value, and in this paper is referred to as sensibility data. The data vector for evaluator k is written as

\[ z_{mk} = (z_{m1k}, z_{m2k}, \ldots, z_{mNk})^t, \qquad z_{mnk} \in \{1, 2, 3, 4, 5\} \tag{1} \]


If we assume that the overall evaluation of examination student m by examiner k will be found by a linear weighted sum as follows:

\[ x_{mk} = a_{1k} z_{m1k} + a_{2k} z_{m2k} + \cdots + a_{Nk} z_{mNk} \tag{2} \]

then the weight vector itself

\[ a_k = (a_{1k}, a_{2k}, \ldots, a_{Nk})^t \tag{3} \]

is not absolute, and in many cases is ‘‘tacit’’. In fact, in pass/fail determination, for instance, the order of the examined students is determined by averaging the ‘‘tacit’’ overall evaluation values {xmk} with respect to the examiners, without factoring the individual evaluation of each examiner. Determination methods like that given above are often used in a variety of settings, but for applications such as university entrance examinations which carry a strong requirement for objectivity, the weight vector between examiners is often made uniform. One technique which can be used in this situation is principal component analysis. A weight vector can be determined from data averaged with respect to the examiners using principal component analysis, but the purpose here is to find fuzzy principal component scores which take into account the evaluation’s vagueness and fluctuation. Two possible methods of doing this are: • Find the membership function for a fuzzy weight vector which ‘‘in some sense’’ preserves the differences in the dispersion of the evaluator’s evaluation. • By calculating the fuzzy principal components of the object using the extension principle in fuzzy set theory, find the membership function for the principal component scores by all evaluators. Based on this logic, fuzzy regression analysis has been studied [8,9], but because an external criterion exists there the method of least squares is applicable, and a mapping from the data space to the parameter space can be found. This can be used to achieve the above objective ‘‘to some extent’’. However, a different approach is needed because this research does not assume the existence of external criteria (overall evaluation values). Since a criterion for measuring the absolute amount of ‘‘vagueness’’ does not exist, our objective is to find principal component scores which involve relative values of fuzziness. 2.2. 
Model structure There are three possible methods of finding fuzzy principal component scores:


(1) Fuzzifying sensibility data.
(2) Fuzzifying weight parameters.
(3) Fuzzifying both sensibility data and weight parameters.

The last method is mathematically difficult to handle, and, as will become clear later, the same information must be duplicated to use in this method, so we shall investigate only the first two.

• Average model: Both methods begin by building a principal component model using an evaluation data matrix with the evaluation objects and evaluation items as the indices of the rows and columns, which is obtained by averaging sensibility data with respect to the evaluators. The weight parameters of the average principal component model are identified by solving an eigenvalue problem for the variance–covariance matrix between evaluation items. This will be explained in Section 3.
• Analysis of evaluation objects: For the first method, the average model described above is the final model, and fuzzified sensibility data is input separately into the average model to find the fuzzy principal component. This is an extremely easy method and makes it possible to discover which evaluation for which object has dispersed to what extent. However, when fuzzifying the sensibility data, it is necessary to determine the perspective, such as whether to stress the possibilities which the data can assume or to stress the variance of data. This will be treated in Section 4.
• Overall evaluation model: In the second method, a fuzzy principal component model is built by fuzzifying the parameters of the average model. In the course of fuzzifying, the focus is on the parameters of the average model; identification of parameter vectors for individual evaluators is done so that differences of opinion are preserved as much as possible and the fuzzy parameters are found using these types of information. It is possible to calculate the fuzzy principal component (overall evaluation) if crisp evaluation data is given as input to this model. In this paper, the model based on this method is called the fuzzy principal component model. This will be considered in Section 5.

Here, when fuzzifying data or parameters, the fuzzy component spread differs depending on whether variance–covariance information is used or a possibility distribution is implemented. In this paper, fuzzification is done using a method like that of Tanaka and Ishibuchi [12,13] which emphasizes the possibilities of data. This is one idea, but in the environment evaluation which is shown as an example, differences of opinion may appear between the standpoints of the government and residents. Sometimes considering data possibilities is important, but in other cases that is actually not helpful, so it is necessary to decide the method on a case by case basis.


3. Data structure and average model

3.1. Data structure

Generally speaking, in many cases the sensibility evaluation data which can be obtained does not have the complete set of three modes as described above. For example, due to time constraints, systems in which all examiners examine all examination students are unusual. Therefore, we assume a data structure like that shown below as a realistic set-up. Let E = {1, 2, ..., K} be the set of evaluators, and O = {1, 2, ..., M} be the set of objects of evaluation. Letting $E_m$ be the set of evaluators who evaluated object m, and $O_k$ be the set of objects evaluated by evaluator k, we get

\[ E = \bigcup_{m=1}^{M} E_m; \qquad E_m \neq \emptyset, \ \forall m \tag{4} \]

\[ O = \bigcup_{k=1}^{K} O_k; \qquad O_k \neq \emptyset, \ \forall k \tag{5} \]

This includes special cases like the following:

Case 1: The case where all objects are evaluated by all evaluators (complete 3-mode data):

\[ E_m = E, \ \forall m; \qquad |E_m| = |E| = K; \qquad |O_k| = |O| = M \tag{6} \]

Case 2: The case where only one object is evaluated by each evaluator:

\[ |O_k| = 1, \ \forall k; \qquad E_m \cap E_{m'} = \emptyset, \ m \neq m'; \qquad \sum_{m=1}^{M} |E_m| = K \tag{7} \]

Here, $|\cdot|$ indicates the number of elements in a set. The data treated in this paper as a sample application corresponds to Case 2 above, so the theory corresponding to Case 2 is developed in this paper.

3.2. Environment evaluation data

In this paper we analyze questionnaire data relating to waterside spaces adjacent to the residence locality. The questionnaire survey was conducted in November 2001 using the direct distribution method, and was targeted at residents of Komatsu City and the town of Tsurugi, both located in Ishikawa Prefecture, Japan. Here we use data excerpted from that survey. The following is the list of the evaluation items used for numerical experiments in this paper:

(n = 1) Vegetation like reeds and water plants can be found.
(n = 2) Water recreation (swimming, boating, etc.) is possible.
(n = 3) Waterside barbecuing and camping are possible.
(n = 4) River embankments are established.
(n = 5) The water is clear.

The data used for analysis is indicated in Table 1. For reference, the table also gives sensory evaluation values relating to water quality and the pleasantness of the waterside:

(Water quality) Water quality is good.
(Pleasantness) Waterside space is pleasant.

The evaluation objects were watersides near the following geographical sites. Figures in parentheses after the place name are the biochemical oxygen demand (BOD; mg/l, average for 1999), an indicator of water quality. Larger BOD values indicate a greater degree of contamination.

(m = 1) Hakusan Gokuchi Dike (BOD = 0.5).
(m = 2) Mikawa Bridge (BOD = 0.6).
(m = 3) Tatsunokuchi Bridge (BOD = 0.7).
(m = 4) Nomi Bridge (BOD = 0.8).
(m = 5) Tsurugashima Bridge (BOD = 0.9).
(m = 6) Miyuki Bridge (BOD = 4.5).
(m = 7) Ukiyanagi New Bridge (BOD = 5.2).

Each evaluator evaluated only one geographical site, corresponding to Case 2.

3.3. Average principal component scores

Let the evaluator average vector for the object m be

\[ z_m = (z_{m1}, z_{m2}, \ldots, z_{mN})^t \tag{8} \]

where we define

\[ z_{mn} = \frac{1}{|E_m|} \sum_{k \in E_m} z_{mnk}, \qquad |E_m| > 0 \tag{9} \]

Note that the following holds:

\[ \sum_{k \in E_m} (z_{mk} - z_m) = \sum_{k \in E_m} z_{mk} - |E_m| z_m = 0 \tag{10} \]
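As a small check of Eqs. (9) and (10), the following sketch averages the four evaluator vectors recorded for object m = 4 (evaluators k = 39 to 42 in Table 1). The array layout and variable names are illustrative assumptions, not part of the original analysis.

```python
import numpy as np

# Ratings z_{4nk} for object m = 4, one row per evaluator k = 39..42,
# one column per evaluation item n = 1..5 (values copied from Table 1).
z_4k = np.array([
    [5, 5, 5, 5, 2],
    [3, 1, 1, 5, 3],
    [2, 5, 1, 4, 1],
    [5, 4, 1, 5, 3],
], dtype=float)

# Eq. (9): evaluator average for each evaluation item.
z_4 = z_4k.mean(axis=0)

# Eq. (10): deviations from the average sum to the zero vector.
residual = (z_4k - z_4).sum(axis=0)

print(z_4)       # compare with the m = 4 row of Table 2
print(residual)
```

The averages obtained this way agree with the m = 4 row of Table 2.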


Table 1
Waterside subjective evaluation data

m  k    n=1  n=2  n=3  n=4  n=5  Water quality  Pleasantness
1  1    1    5    5    2    2    3              3
1  2    1    1    1    5    2    3              4
1  3    1    4    1    3    1    2              3
1  4    1    2    5    5    5    4              4
1  5    5    2    2    2    1    2              2
1  6    3    1    1    5    1    3              3
1  7    4    5    5    5    3    2              4
1  8    5    5    5    3    5    5              5
1  9    5    1    5    5    1    2              3
1  10   1    3    1    5    3    3              4
1  11   1    4    4    4    1    2              1
1  12   5    5    5    1    1    2              3
1  13   3    2    2    4    3    3              3
1  14   2    1    1    5    1    1              4
1  15   2    5    5    2    1    1              1
1  16   3    2    2    5    1    2              3
2  17   2    2    5    4    1    3              3
2  18   2    2    4    4    4    3              3
2  19   5    1    5    5    2    3              3
2  20   4    5    4    3    3    3              3
2  21   3    2    2    3    3    3              3
2  22   4    2    2    2    3    4              3
2  23   5    3    5    4    3    5              3
2  24   2    4    4    4    2    2              3
3  25   5    1    2    5    4    3              3
3  26   1    3    3    3    4    3              3
3  27   4    1    4    4    4    3              4
3  28   4    2    4    5    4    2              3
3  29   2    1    1    5    2    3              3
3  30   1    2    4    5    4    3              4
3  31   3    5    4    4    2    4              5
3  32   1    1    3    4    3    3              3
3  33   3    1    5    3    4    3              3
3  34   2    2    2    5    5    3              3
3  35   2    1    2    2    2    3              3
3  36   2    2    4    5    1    1              3
3  37   2    2    4    4    2    3              2
3  38   3    5    5    4    2    3              3
4  39   5    5    5    5    2    3              3
4  40   3    1    1    5    3    3              3
4  41   2    5    1    4    1    2              2
4  42   5    4    1    5    3    3              3
5  43   5    4    1    1    2    2              2
5  44   4    5    1    5    1    2              3
5  45   2    5    1    5    3    3              4
5  46   4    4    2    3    4    2              2
5  47   5    5    1    4    2    2              4
5  48   5    5    1    5    2    3              3
5  49   4    1    1    4    2    2              2
5  50   5    2    1    5    1    3              4
5  51   4    3    4    2    2    2              3
5  52   5    5    1    2    1    1              2
6  53   2    1    1    5    1    2              2
6  54   5    1    1    1    2    1              2
6  55   1    1    1    1    1    2              2
6  56   4    4    3    4    2    2              3
6  57   2    2    1    4    3    2              2
6  58   4    4    1    5    1    2              3
6  59   1    1    1    5    1    1              2
6  60   4    1    1    4    1    1              4
6  61   4    1    1    5    1    2              2
6  62   5    1    1    1    1    1              3
6  63   2    1    1    1    1    1              2
6  64   2    2    1    1    1    1              1
7  65   4    5    1    5    1    2              3
7  66   4    4    1    2    1    1              2
7  67   1    1    1    5    1    1              2
7  68   5    1    1    5    1    1              1
7  69   5    2    1    2    3    2              3
7  70   5    1    2    5    1    2              2
7  71   4    4    1    4    2    2              3
7  72   5    4    1    3    2    3              3
7  73   2    1    1    3    2    2              3

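Let the variance–covariance matrix between evaluation items for the average data be

\[ S = \begin{pmatrix} s_{11} & s_{12} & \cdots & s_{1N} \\ s_{21} & s_{22} & \cdots & s_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ s_{N1} & s_{N2} & \cdots & s_{NN} \end{pmatrix} \tag{11} \]

where

\[ s_{nn'} = \frac{1}{M} \sum_{m=1}^{M} (z_{mn} - z_n)(z_{mn'} - z_{n'}), \qquad z_n = \frac{1}{M} \sum_{m=1}^{M} z_{mn} \tag{12} \]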


Here, we set

\[ z_0 = (z_1, z_2, \ldots, z_N)^t \tag{13} \]

and set $z_{mk} - z_0$ and $z_m - z_0$ anew to $z_{mk}$ and $z_m$. Therefore, the following holds in the rest of this paper:

\[ \sum_{m=1}^{M} z_m = \sum_{m=1}^{M} \frac{1}{|E_m|} \sum_{k \in E_m} z_{mk} = 0 \tag{14} \]

We find the eigenvalues and eigenvectors of the variance–covariance matrix S by

\[ S a_p = \lambda_p a_p, \qquad p = 1, 2, \ldots, N \tag{15} \]

where we let

\[ a_p = (a_{p1}, a_{p2}, \ldots, a_{pN})^t, \qquad a_p^t a_p = 1, \ \forall p; \qquad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N \ge 0 \tag{16} \]

From the above, the pth principal component score $x_{pm}$ for the object m due to the average data is given by

\[ x_{pm} = a_p^t z_m \tag{17} \]

As described above, $z_m$ is actually found by subtracting the average vector $z_0$ for the evaluator and object, so $z_0$ is mapped to the origin of the space spanned by the principal component axes.

3.4. Average model for environment evaluation

When we work from the data given in Table 1:

\[ \{ z_{mnk};\ m = 1, 2, \ldots, 7;\ n = 1, 2, \ldots, 5;\ k = 1, 2, \ldots, 73 \} \tag{18} \]

and find the average data for evaluators:

\[ \{ z_{mn};\ m = 1, 2, \ldots, 7;\ n = 1, 2, \ldots, 5 \} \tag{19} \]

the results are as shown in Table 2.

Table 2
Average data for evaluators

       n=1    n=2    n=3    n=4    n=5    Water quality  Pleasantness
m=1    2.688  3.000  3.125  3.813  2.000  2.500          3.125
m=2    3.375  2.625  3.875  3.625  2.625  3.250          3.000
m=3    2.500  2.071  3.357  4.143  3.071  2.860          3.210
m=4    3.750  3.750  2.000  4.750  2.250  2.750          2.750
m=5    4.300  3.900  1.400  3.600  2.000  2.200          2.900
m=6    3.000  1.667  1.167  3.083  1.333  1.500          2.330
m=7    3.889  2.556  1.111  3.778  1.556  1.800          2.440


Here, $z_0$ is calculated as

\[ z_0 = (3.357,\ 2.796,\ 2.291,\ 3.827,\ 2.119)^t \tag{20} \]

and subtracting this from the original average data shown in Table 2, we have

\[ \begin{aligned}
z_1 &= (-0.669,\ 0.204,\ 0.834,\ -0.014,\ -0.119)^t \\
z_2 &= (0.018,\ -0.171,\ 1.584,\ -0.202,\ 0.506)^t \\
z_3 &= (-0.857,\ -0.725,\ 1.066,\ 0.316,\ 0.952)^t \\
z_4 &= (0.393,\ 0.954,\ -0.291,\ 0.923,\ 0.131)^t \\
z_5 &= (0.943,\ 1.104,\ -0.891,\ -0.227,\ -0.119)^t \\
z_6 &= (-0.357,\ -1.129,\ -1.124,\ -0.744,\ -0.786)^t \\
z_7 &= (0.532,\ -0.240,\ -1.180,\ -0.049,\ -0.563)^t
\end{aligned} \tag{21} \]

Calculating the variance–covariance matrix using Eqs. (11) and (12), we obtain

\[ S = \begin{pmatrix}
0.377 & 0.310 & -0.375 & 0.018 & -0.115 \\
0.310 & 0.580 & -0.083 & 0.184 & 0.031 \\
-0.375 & -0.083 & 1.125 & 0.119 & 0.476 \\
0.018 & 0.184 & 0.119 & 0.229 & 0.137 \\
-0.115 & 0.031 & 0.476 & 0.137 & 0.306
\end{pmatrix} \tag{22} \]

The eigenvalues and eigenvectors of S are calculated as follows:

\[ \begin{aligned}
\lambda_1 &= 1.521, & a_1 &= (-0.355,\ -0.162,\ 0.839,\ 0.089,\ 0.369)^t \\
\lambda_2 &= 0.801, & a_2 &= (0.382,\ 0.800,\ 0.176,\ 0.360,\ 0.230)^t \\
\lambda_3 &= 0.168, & a_3 &= (-0.385,\ -0.157,\ -0.380,\ 0.792,\ 0.236)^t \\
\lambda_4 &= 0.094, & a_4 &= (0.637,\ -0.459,\ -0.088,\ -0.006,\ 0.614)^t \\
\lambda_5 &= 0.033, & a_5 &= (-0.418,\ 0.314,\ -0.335,\ -0.485,\ 0.616)^t
\end{aligned} \tag{23} \]

The first and second principal component scores $(x_{1m}, x_{2m})$ for objects m = 1, 2, ..., 7 due to the average data are calculated as follows:

\[ \begin{aligned}
(x_{11}, x_{21}) &= (0.859,\ 0.022), & (x_{12}, x_{22}) &= (1.520,\ 0.193) \\
(x_{13}, x_{23}) &= (1.695,\ -0.386), & (x_{14}, x_{24}) &= (-0.408,\ 1.225) \\
(x_{15}, x_{25}) &= (-1.324,\ 0.978), & (x_{16}, x_{26}) &= (-0.990,\ -1.687) \\
(x_{17}, x_{27}) &= (-1.352,\ -0.345)
\end{aligned} \tag{24} \]

Fig. 1 shows a plot of these scores on a two-dimensional plane. Looking at the $a_1$ and $a_2$ components, it appears that the first axis (horizontal axis) reflects evaluation of water quality, and the second axis (vertical axis) reflects the degree to which the natural conditions remain.
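The average model of this section can be reproduced approximately from the rounded values in Table 2. The sketch below is one possible NumPy implementation of Eqs. (11) to (17); the variable names are assumptions, and because eigenvectors are determined only up to sign, the scores should be compared with the reported ones up to the orientation of each axis.

```python
import numpy as np

# Evaluator-averaged ratings z_{mn} from Table 2 (rows m = 1..7, columns n = 1..5).
table2 = np.array([
    [2.688, 3.000, 3.125, 3.813, 2.000],
    [3.375, 2.625, 3.875, 3.625, 2.625],
    [2.500, 2.071, 3.357, 4.143, 3.071],
    [3.750, 3.750, 2.000, 4.750, 2.250],
    [4.300, 3.900, 1.400, 3.600, 2.000],
    [3.000, 1.667, 1.167, 3.083, 1.333],
    [3.889, 2.556, 1.111, 3.778, 1.556],
])

z0 = table2.mean(axis=0)           # Eq. (13): mean over objects
Z = table2 - z0                    # centred average vectors z_m
S = Z.T @ Z / Z.shape[0]           # Eqs. (11)-(12): divisor M, not M - 1

# Eq. (15): eigh returns ascending eigenvalues for a symmetric matrix,
# so reorder them to satisfy lambda_1 >= ... >= lambda_N.
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Eq. (17): first and second principal component scores for each object.
scores = Z @ eigvecs[:, :2]

print(np.round(eigvals, 3))
print(np.round(scores, 3))
```

Small discrepancies in the third decimal place are to be expected, since Table 2 lists rounded averages.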


Fig. 1. Average principal component model.

4. Fuzzy data mapping

4.1. Fuzzifying sensibility data

Here we fuzzify the average data vector $z_m$, and introduce the fuzzy vector

\[ Z_m = (Z_{m1}, Z_{m2}, \ldots, Z_{mN})^t \tag{25} \]

First we calculate the variance–covariance matrix

\[ T_m = \frac{1}{|E_m|} \sum_{k \in E_m} (z_{mk} - z_m)(z_{mk} - z_m)^t \tag{26} \]

Here, we assume that $T_m$ is a positive definite matrix. Assuming $c_m$ to be a positive real number which will be determined later, we let

\[ D_{Z_m} = c_m \cdot T_m \tag{27} \]

and define the fuzzy vector $Z_m$ with the following multi-dimensional membership function:

\[ \mu_{Z_m}(z) = \exp\{ -(z - z_m)^t D_{Z_m}^{-1} (z - z_m) \} \tag{28} \]

The parameter $c_m$ can be set as follows [14]. That is, taking a certain real number $h \in (0, 1)$, we find the minimum $c_m$ satisfying the following inequality:

\[ \min_{k \in E_m} \{ \mu_{Z_m}(z_{mk}) \} \ge h \tag{29} \]

That value is

\[ c_m = \frac{\max_{k \in E_m} \{ (z_{mk} - z_m)^t T_m^{-1} (z_{mk} - z_m) \}}{-\log h} \tag{30} \]
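The determination of $c_m$ in Eqs. (26) to (30) can be sketched as follows. The function name `spread_factor` and the two-dimensional ratings are hypothetical and chosen only so that $T_m$ is positive definite; they are not taken from the survey data.

```python
import numpy as np

def spread_factor(z_mk, h):
    """Return c_m of Eq. (30) and the quadratic forms (z_mk - z_m)^t T_m^{-1} (z_mk - z_m)."""
    z_m = z_mk.mean(axis=0)                  # evaluator average, Eq. (9)
    d = z_mk - z_m
    T_m = d.T @ d / len(z_mk)                # Eq. (26)
    q = np.einsum("ki,ij,kj->k", d, np.linalg.inv(T_m), d)
    return q.max() / (-np.log(h)), q

# Hypothetical ratings of one object by six evaluators on two items.
z_mk = np.array([[1, 2], [2, 1], [3, 5], [4, 4], [5, 3], [2, 4]], dtype=float)
h = 0.05

c_m, q = spread_factor(z_mk, h)

# Eq. (28) evaluated at each data point, using D_{Z_m} = c_m * T_m.
membership = np.exp(-q / c_m)
print(c_m, membership.min())   # the smallest membership equals h, as Eq. (29) requires
```

Because $c_m$ is proportional to $1/(-\log h)$, the values of $c_m$ for two thresholds differ only by the factor $\log h_1 / \log h_2$. For instance, $1.91 \times \log 0.005 / \log 0.5 \approx 14.6$, which matches the $c_1$ entries reported for h = 0.005 and h = 0.50 in Table 3.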

This is a method which reflects the approach of an analyst who asks the question: to what degree should data possibilities be incorporated? Since an external standard (overall evaluation) does not exist, h must be determined subjectively. However, as will be shown later, it is possible to find the relative fuzziness of principal components, and thereby compare the vagueness of evaluation with respect to objects.

Note 1: Here we explain the meaning of fuzzifying the above data for the case of one-dimensional data. As shown in Fig. 2, we assume that five data points (we have used only five to make the situation easier to visualize) are dispersed around the average, and that the leftmost data point is farthest from the average. This is a one-dimensional case, so $\mu_{Z_m}(z)$ given by Eq. (28) is bell-shaped with left–right symmetry, and this is regarded as the possibility distribution of the data. In the left diagram the possibility of the leftmost data point is set to $h_1$, and in the right diagram it is set to $h_2$ (> $h_1$). Determination of the spread of the membership function depends on how we estimate the possibility of the data occurring most distant from the average. Our approach is to achieve this sort of spread by multiplying the variance–covariance matrix $T_m$ by $c_m$.

Fig. 2. Membership function as a possibility distribution.

4.2. Fuzzy principal components (1)

If we apply the extension principle [20], the membership function for the fuzzy pth principal component $X_{pm}$ of the object m is given as follows:

\[ \mu_{X_{pm}}(x) = \max_{z} \{ \mu_{Z_m}(z) \mid x = a_p^t z \} \tag{31} \]

This can be found by solving the following optimization problem [14]:

\[ \begin{cases} \text{minimize} & J(z) = (z - z_m)^t D_{Z_m}^{-1} (z - z_m) \\ \text{subject to} & x = a_p^t z \end{cases} \tag{32} \]


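That is, if we introduce the Lagrange multiplier $\lambda$ and set

\[ L(z, \lambda) = (z - z_m)^t D_{Z_m}^{-1} (z - z_m) + \lambda (x - a_p^t z) \tag{33} \]

and solve

\[ \frac{\partial L(z, \lambda)}{\partial z} = 0, \qquad \frac{\partial L(z, \lambda)}{\partial \lambda} = 0 \tag{34} \]

then we obtain

\[ z = z_m + \frac{x - a_p^t z_m}{a_p^t D_{Z_m} a_p} D_{Z_m} a_p \tag{35} \]

Substituting this for J(z), we have

\[ \mu_{X_{pm}}(x) = \exp\left\{ -\frac{(x - a_p^t z_m)^2}{a_p^t D_{Z_m} a_p} \right\} \tag{36} \]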
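The closed forms in Eqs. (35) and (36) can be checked numerically. In the sketch below, the values of `z_m`, `D`, `a_p` and `x` are hypothetical two-dimensional inputs, and the helper names are assumptions introduced for illustration.

```python
import numpy as np

def mu_X(x, a_p, z_m, D):
    """Membership of the fuzzy principal component score, Eq. (36)."""
    return np.exp(-(x - a_p @ z_m) ** 2 / (a_p @ D @ a_p))

def z_star(x, a_p, z_m, D):
    """Constrained minimiser of J(z), Eq. (35)."""
    return z_m + (x - a_p @ z_m) / (a_p @ D @ a_p) * (D @ a_p)

def J(z, z_m, D):
    """Objective of the optimisation problem in Eq. (32)."""
    d = z - z_m
    return d @ np.linalg.solve(D, d)

# Hypothetical inputs: centre, positive definite spread matrix, unit weight vector.
z_m = np.array([0.3, -0.2])
D = np.array([[2.0, 0.5], [0.5, 1.0]])
a_p = np.array([0.6, 0.8])
x = 1.0

z_opt = z_star(x, a_p, z_m, D)
J_opt = J(z_opt, z_m, D)

# Other feasible points lie on the line a_p^t z = x; move along its direction.
direction = np.array([-a_p[1], a_p[0]])
J_others = [J(z_opt + t * direction, z_m, D) for t in (-1.0, -0.1, 0.1, 1.0)]

print(a_p @ z_opt)                       # equals x, so the constraint holds
print(np.exp(-J_opt), mu_X(x, a_p, z_m, D))
```

Every other feasible point gives a larger value of J, so exp(-J) is largest at the point given by Eq. (35), in agreement with Eq. (31).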

Note 2: The membership function for the fuzzy pth principal component found above can be interpreted as follows. That is, the orthogonal projection onto the pth principal component axis for the individual data vector is

\[ x_{pmk} = a_p^t z_{mk} \tag{37} \]

When this is averaged over evaluators, it becomes

\[ x_{pm} = a_p^t z_m \tag{38} \]

The variance is given by $a_p^t T_m a_p$. The value of this multiplied by $c_m$ is $a_p^t D_{Z_m} a_p$.

Note 3: If there is no change in data, or if the change is extremely small, the fuzzy vector $Z_m$ is defined by the following equation:

\[ \mu_{Z_m}(z) = \begin{cases} 1, & z = z_m \\ 0, & \text{otherwise} \end{cases} \tag{39} \]

In this case, we have

\[ \mu_{X_{pm}}(x) = \begin{cases} 1, & x = a_p^t z_m \\ 0, & \text{otherwise} \end{cases} \tag{40} \]

Note 4: In modeling with a Gaussian distribution, if

\[ x_{pm} = a_{p1} z_{m1} + a_{p2} z_{m2} + \cdots + a_{pN} z_{mN} \tag{41} \]

and the $z_{mn}$'s are mutually independent in accordance with the Gaussian distribution $N(\mu_{mn}, \sigma_{mn}^2)$, then $x_{pm}$ follows the Gaussian distribution:

\[ N\left( \sum_{n=1}^{N} a_{pn} \mu_{mn},\ \sum_{n=1}^{N} a_{pn}^2 \sigma_{mn}^2 \right) \tag{42} \]


Thus we can calculate the distribution of principal component scores. However, a chi-square test confirms that the environment evaluation data used in this paper does not satisfy the independence hypothesis. We shall note that, even if we choose to ignore this fact and graph the distribution of $x_{pm}$, the results closely resemble the results shown later in Section 4.4.

4.3. Relative fuzziness

Using the above method, it is possible to express the relative fuzziness of the principal component scores for objects as shown in Fig. 3. Here, the membership function for the fuzzy principal component score in the plane of the first and second principal component axes is defined by the following equation:

\[ \mu_{X_{1m} X_{2m}}(x_1, x_2) = \mu_{X_{1m}}(x_1) \cdot \mu_{X_{2m}}(x_2) = \exp\left\{ -\frac{(x_1 - a_1^t z_m)^2}{a_1^t D_{Z_m} a_1} - \frac{(x_2 - a_2^t z_m)^2}{a_2^t D_{Z_m} a_2} \right\} \tag{43} \]

Fig. 3 shows the $\alpha$-level sets described by

\[ \{ (x_1, x_2) \mid \mu_{X_{1m} X_{2m}}(x_1, x_2) \ge \alpha \} \tag{44} \]

Fig. 3. Conceptual diagram of fuzzy principal component scores. (Axes: 1st principal component, 2nd principal component; plotted objects: Students 1 to 6.)
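Taking logarithms in Eq. (43) shows that each level set of Eq. (44) is an axis-aligned ellipse centred at $(a_1^t z_m, a_2^t z_m)$ with semi-axes $\sqrt{-\log\alpha \cdot a_p^t D_{Z_m} a_p}$, p = 1, 2. The sketch below illustrates this; the spread values are hypothetical stand-ins for $a_p^t D_{Z_m} a_p$, and the function names are assumptions.

```python
import numpy as np

def membership_2d(x1, x2, center, spreads):
    """Eq. (43): product of the two one-dimensional membership functions."""
    (c1, c2), (s1, s2) = center, spreads
    return np.exp(-(x1 - c1) ** 2 / s1 - (x2 - c2) ** 2 / s2)

def alpha_cut_semi_axes(spreads, alpha):
    """Semi-axes of the ellipse {mu >= alpha} of Eq. (44) along each component axis."""
    s1, s2 = spreads
    scale = -np.log(alpha)
    return np.sqrt(scale * s1), np.sqrt(scale * s2)

center = (0.859, 0.022)    # (x_11, x_21) from Eq. (24)
spreads = (1.8, 0.9)       # hypothetical values of a_p^t D_{Z_m} a_p, p = 1, 2

r1_09, r2_09 = alpha_cut_semi_axes(spreads, 0.9)
r1_07, r2_07 = alpha_cut_semi_axes(spreads, 0.7)

# A point at the end of a semi-axis has membership exactly alpha.
print(membership_2d(center[0] + r1_09, center[1], center, spreads))
print(membership_2d(center[0], center[1] + r2_07, center, spreads))
```

Lowering $\alpha$ enlarges every ellipse by the same factor, so the relative sizes across objects do not depend on the choice of $\alpha$.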


Table 3
Values of parameter $c_m$

h      0.005  0.01   0.05   0.10   0.25   0.50
c1     1.91   2.19   3.37   4.39   7.29   14.6
c2     1.27   1.46   2.24   2.92   4.84   9.69
c3     1.58   1.82   2.80   3.64   6.05   12.1
c4     2.43   2.80   4.30   5.60   9.29   18.6
c5     1.63   1.88   2.89   3.76   6.24   12.5
c6     2.08   2.39   3.67   4.78   7.93   15.9
c7     1.51   1.74   2.67   3.48   5.77   11.5

The ellipses in Fig. 3 express the fuzziness of the principal component scores, but their size is not necessarily absolute and they are the results of subjective judgment. However, the size of the ellipses can be regarded as indicative of relative size of fuzziness. This is the meaning of relative fuzziness in this paper.

4.4. Principal component analysis for environmental evaluation data by fuzzifying data

We calculate the variance–covariance matrices $T_1, T_2, \ldots, T_7$ defined by Eq. (26), and find the matrix $D_{Z_m}$ defined by Eq. (27). Here, the value of $c_m$ is calculated as in Table 3 for a number of h values. Using h = 0.005 from Table 3, and graphing the sets given by Eq. (44) taking $\alpha$ = 0.9 and $\alpha$ = 0.7, we obtain Figs. 4 and 5.

Fig. 4. Analysis by fuzzifying data: $\alpha$ = 0.9.

Fig. 5. Analysis by fuzzifying data: $\alpha$ = 0.7.

In the above method, it is difficult to decide how to determine the h taken into account in Eq. (29) and the a used when graphing. Interpretation of h is particularly difficult. Nevertheless, it is thought to express the relative fuzziness of sensibility data, as described above.

5. Fuzzy principal component model

In this section, we construct the following fuzzy principal component model by fuzzifying the weight parameters:

$$X_p = A_{p1} z_1 + A_{p2} z_2 + \cdots + A_{pN} z_N \tag{45}$$

Here, $X_p$ is the fuzzy number indicating the pth principal component, $A_{pn}$ is the fuzzy number indicating the weight of evaluation item n, and $z_n$ indicates the crisp evaluation value for evaluation item n. Modeling in this way allows us to output, as a fuzzy number, the principal component score for a new evaluation vector not used in constructing the model.

5.1. Model identification

The membership function for the fuzzy vector

$$A_p = (A_{p1}, A_{p2}, \ldots, A_{pN})^t \tag{46}$$

is defined by

$$\mu_{A_p}(a) = \exp\{-(a - a_p)^t D_{A_p}^{-1} (a - a_p)\} \tag{47}$$

Here, the center of the membership function is given by the eigenvector of the variance–covariance matrix S found with Eq. (15):

$$a_p = (a_{p1}, a_{p2}, \ldots, a_{pN})^t \tag{48}$$

On the other hand, $D_{A_p}$, which governs the spread, is established as follows. First, the weight vector specific to evaluator k

$$a_{pk} = (a_{p1k}, a_{p2k}, \ldots, a_{pNk})^t \tag{49}$$

is defined by the following equation:

$$a_{pk} = a_p + (z_{mk} - z_m), \quad k \in E_m,\ \forall p \tag{50}$$

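This takes into account the dispersion of the evaluation of evaluator k, and the following equation holds due to Eq. (10):

$$\frac{1}{K}\sum_{k=1}^{K} a_{pk} = a_p + \frac{1}{K}\sum_{m=1}^{M}\sum_{k \in E_m} (z_{mk} - z_m) = a_p \tag{51}$$

Here, the variance–covariance matrix of $\{a_{pk}\}$ is common to all p, as indicated below:

$$\frac{1}{K}\sum_{k=1}^{K} (a_{pk} - a_p)(a_{pk} - a_p)^t = \frac{1}{K}\sum_{k=1}^{K} (z_{mk} - z_m)(z_{mk} - z_m)^t = \frac{1}{K}\sum_{m=1}^{M}\sum_{k \in E_m} (z_{mk} - z_m)(z_{mk} - z_m)^t = \frac{\sum_{m=1}^{M} |E_m|\, T_m}{\sum_{m=1}^{M} |E_m|} \tag{52}$$

Then we write $D_{A_p}$ as $D_A$ and define

$$D_A = \Sigma \tag{53}$$

where $\Sigma$ denotes the common variance–covariance matrix of Eq. (52).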

$D_A$ is given by the average of the $T_m$ weighted by $|E_m|$, so the variance is emphasized for geographical sites with many data points. For example, if N = 2 and K = 7, the positional relationship of the evaluator weight parameters and the α-level sets are shown in Fig. 6.

Note 5: For Case 1 ($|O_k| = |O| = M$) or in the general case ($|O_k| > 1$, $\exists k$), if we define

$$a_{pk} = a_p + \sum_{m \in O_k} (z_{mk} - z_m), \quad \forall p \tag{54}$$

then the average of the evaluators' parameters coincides with $a_p$, and the variance is given as

$$\frac{1}{K}\sum_{k=1}^{K}\sum_{m \in O_k}\sum_{m' \in O_k} (z_{mk} - z_m)(z_{m'k} - z_{m'})^t \tag{55}$$

Fig. 6. Conceptual diagram of weight membership functions.
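The construction of the spread matrix in Eqs. (50)–(53) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy data, the function names, and the dictionary layout (`site -> list of evaluator vectors`) are hypothetical, and $T_m$ is taken to be the per-site covariance with divisor $|E_m|$, which is what Eq. (52) implies.

```python
def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(v[i] for v in vectors) / count for i in range(len(vectors[0]))]


def site_covariance(vectors):
    """T_m: covariance of the evaluations at one site (divisor |E_m|, assumed)."""
    z_bar = mean_vector(vectors)
    size = len(z_bar)
    t_m = [[0.0] * size for _ in range(size)]
    for z in vectors:
        dev = [zi - mi for zi, mi in zip(z, z_bar)]
        for i in range(size):
            for j in range(size):
                t_m[i][j] += dev[i] * dev[j] / len(vectors)
    return t_m


def pooled_spread(sites):
    """D_A of Eq. (53): average of the T_m weighted by |E_m|."""
    total = sum(len(vectors) for vectors in sites.values())  # K
    size = len(next(iter(sites.values()))[0])
    d_a = [[0.0] * size for _ in range(size)]
    for vectors in sites.values():
        weight = len(vectors) / total
        t_m = site_covariance(vectors)
        for i in range(size):
            for j in range(size):
                d_a[i][j] += weight * t_m[i][j]
    return d_a


# Hypothetical toy data: two sites, N = 2 items, K = 5 evaluators in total.
toy_sites = {
    1: [[1, 2], [3, 4]],
    2: [[2, 2], [4, 2], [3, 5]],
}
D_A = pooled_spread(toy_sites)
```

With these toy numbers the site with three evaluators contributes three fifths of the pooled matrix, which is the weighting effect described above.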

5.2. Fuzzy principal components (2)

When a crisp evaluation vector z is given, the membership function for the fuzzy principal component score can be found by applying the extension principle [20] as follows:

$$\mu_{X_p}(x) = \max_{a}\{\mu_{A_p}(a) \mid x = a^t z\} = \exp\left\{-\frac{(x - a_p^t z)^2}{z^t D_A z}\right\} \tag{56}$$

The equation above is obtained by solving the following optimization problem:

$$\begin{cases} \text{minimize} & (a - a_p)^t D_A^{-1} (a - a_p) \\ \text{subject to} & x = a^t z \end{cases} \tag{57}$$

Even in this case, using $z = z_m$, it is possible to express the relative fuzziness of the principal component score for object m, as indicated in Fig. 7. Here, the membership function for the fuzzy principal component score in the plane of the first and second principal components is defined by the following equation:

$$\mu_{X_{1m}X_{2m}}(x_1, x_2) = \mu_{X_{1m}}(x_1) \times \mu_{X_{2m}}(x_2) = \exp\left\{-\frac{(x_1 - a_1^t z_m)^2 + (x_2 - a_2^t z_m)^2}{z_m^t D_A z_m}\right\} \tag{58}$$

Fig. 7 indicates the α-level sets described by

$$\{(x_1, x_2) \mid \mu_{X_{1m}X_{2m}}(x_1, x_2) \ge \alpha\} \tag{59}$$

Fig. 7. Conceptual diagram of fuzzy principal component scores. (Axes: 1st and 2nd principal components; plotted objects: Students 1 to 6.)
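As a small numerical illustration of Eqs. (56) and (58), the sketch below evaluates the membership function and the radius of the α-level set of Eq. (59), which by Eq. (58) is a circle whose squared radius is $z^t D_A z \cdot (-\log \alpha)$. All numbers and function names are hypothetical and chosen only for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def quad_form(z, d):
    """z^t D z for a square matrix D given as nested lists."""
    size = len(z)
    return sum(z[i] * d[i][j] * z[j] for i in range(size) for j in range(size))


def score_membership(x, a_p, z, d_a):
    """Eq. (56): membership of the score x for a crisp input vector z."""
    return math.exp(-((x - dot(a_p, z)) ** 2) / quad_form(z, d_a))


def level_radius(z, d_a, alpha):
    """Radius of the alpha-level circle implied by Eq. (58)."""
    return math.sqrt(quad_form(z, d_a) * (-math.log(alpha)))


# Hypothetical weight center, spread matrix and evaluation vector.
a_1 = [0.6, 0.8]
D_A = [[2.0, 0.5], [0.5, 1.0]]
z = [1.0, 2.0]

r_short = level_radius(z, D_A, 0.7)
r_long = level_radius([2 * zi for zi in z], D_A, 0.7)  # same direction, twice the length
```

Doubling the length of z doubles the radius, so under this construction the spread of the score grows with the length of the evaluation vector.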

However, from Eq. (58) it is clear that this method has a problem: the fuzziness depends on the length of the evaluation vector. The following attempts to correct this point.

5.3. Numerical quantification of vagueness

The α-level set for the fuzzy principal component score in the pth–qth principal component plane is expressed by the following circle:

$$(x_p - a_p^t z)^2 + (x_q - a_q^t z)^2 = z^t D_A z \cdot (-\log \alpha), \quad 0 < \alpha < 1 \tag{60}$$

One feature of a possibility linear model based on the extension principle is that the possibility spreads out as the value of the explanatory variable increases. This is understandable in the case of a regression model, but in a 5-level evaluation, it is unnatural to assume that 5 has greater vagueness than 3. We can instead regard 3 as being the most vague. So we consider removing the effect of the length of the evaluation data vector. First, we can try to correct the size of the circle as follows:

$$(x_p - a_p^t z)^2 + (x_q - a_q^t z)^2 = \frac{z^t D_A z}{z^t z} \cdot (-\log \alpha) \tag{61}$$

However, since our intention is not to indicate the absolute value of fuzziness, we propose drawing a circle as follows. First, letting $\lambda_{\max}(D_A)$ and $\lambda_{\min}(D_A)$ be respectively the maximum and minimum eigenvalues of the matrix $D_A$, note that

$$\lambda_{\min}(D_A) \le \frac{z^t D_A z}{z^t z} \le \lambda_{\max}(D_A) \tag{62}$$

We assume that the fuzzy principal component score of the evaluation vector z in the pth–qth principal component plane can be expressed by the following circle:

$$(x_p - a_p^t z)^2 + (x_q - a_q^t z)^2 = r^2 \tag{63}$$

Here, we find the radius r so that it satisfies

$$\frac{r^2 - r_{\min}^2}{r_{\max}^2 - r_{\min}^2} = \frac{\lambda - \lambda_{\min}(D_A)}{\lambda_{\max}(D_A) - \lambda_{\min}(D_A)} \tag{64}$$

where

$$\lambda = \frac{z^t D_A z}{z^t z} \tag{65}$$

and $r_{\max}$ and $r_{\min}$ are design parameters indicating the maximum and minimum radius. A larger value of λ means that opinions are more likely to disperse in the direction of the evaluation vector z. This is the primary idea being proposed in this paper.

Note 6: We can now write the membership function of the pth principal component score as

$$\mu_{X_p}(x) = \exp\left\{-\frac{(x - a_p^t z)^2}{\lambda}\right\} \tag{66}$$

instead of the one given in Eq. (56), where

$$\lambda = \frac{z^t D_A z}{z^t z} \tag{67}$$

and z is any evaluation vector. If we define the membership function like this, it is possible to draw the α-level set directly, without using Eqs. (63) and (64).

5.4. Evaluation model of local environment based on fuzzification of weights

Here we illustrate the proposed method using questionnaire data relating to waterside spaces adjoining the locality of residence. First, we use Eq. (50) to find the weight parameters peculiar to evaluator k:

$$\{a_{pk};\ k = 1, 2, \ldots, 73\} \tag{68}$$

The variance–covariance matrix $D_A$ of these parameters, which does not depend on the principal component p, is calculated as follows:

$$D_A = \begin{pmatrix} 1.761 & 0.154 & 0.348 & 0.152 & 0.061 \\ 0.154 & 1.991 & 0.600 & 0.308 & 0.008 \\ 0.348 & 0.600 & 1.463 & 0.151 & 0.124 \\ 0.152 & 0.308 & 0.151 & 1.666 & 0.024 \\ 0.061 & 0.008 & 0.124 & 0.024 & 1.020 \end{pmatrix} \tag{69}$$

The eigenvalues and eigenvectors of $D_A$ are calculated as follows:

$$\begin{cases} \lambda_1 = 2.664, & a_1 = (0.375, 0.695, 0.504, 0.347, 0.050)^t \\ \lambda_2 = 1.692, & a_2 = (0.869, 0.466, 0.070, 0.120, 0.091)^t \\ \lambda_3 = 1.540, & a_3 = (0.023, 0.257, 0.282, 0.917, 0.111)^t \\ \lambda_4 = 1.079, & a_4 = (0.285, 0.355, 0.525, 0.154, 0.703)^t \\ \lambda_5 = 0.927, & a_5 = (0.151, 0.329, 0.621, 0.018, 0.695)^t \end{cases} \tag{70}$$

The maximum and minimum eigenvalues for $D_A$ are

$$\lambda_{\max}(D_A) = 2.664, \quad \lambda_{\min}(D_A) = 0.927 \tag{71}$$

and we set

$$r_{\max} = 0.5, \quad r_{\min} = 0.05 \tag{72}$$

The fuzzy principal component scores obtained when the average data $z_m$ (m = 1, 2, . . . , 7) are input to the model are given in Fig. 8. Fig. 8 differs from Figs. 4 and 5: the latter two attempt to reflect the dispersion of the evaluations in the dispersion of the principal components, whereas Fig. 8 models the fuzziness of the evaluation vector for each geographical site. The approach of Fig. 8 rests on the idea of standardizing the evaluation vector when calculating the degree of vagueness.
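The radius rule of Eqs. (64) and (65) can be checked numerically with the eigenvalues of Eq. (71) and the design parameters of Eq. (72). The following is a minimal sketch: the function names are hypothetical, and the diagonal matrix `D_TOY` is an assumed stand-in used only to show that λ does not change when the input vector is rescaled.

```python
import math

LAMBDA_MAX, LAMBDA_MIN = 2.664, 0.927  # Eq. (71)
R_MAX, R_MIN = 0.5, 0.05               # Eq. (72)


def rayleigh_quotient(z, d):
    """lambda of Eq. (65): z^t D z / z^t z."""
    size = len(z)
    numerator = sum(z[i] * d[i][j] * z[j] for i in range(size) for j in range(size))
    return numerator / sum(zi * zi for zi in z)


def radius(lam, lam_min=LAMBDA_MIN, lam_max=LAMBDA_MAX, r_min=R_MIN, r_max=R_MAX):
    """Radius r solving Eq. (64) for a given lambda."""
    fraction = (lam - lam_min) / (lam_max - lam_min)
    return math.sqrt(r_min ** 2 + fraction * (r_max ** 2 - r_min ** 2))


# Hypothetical 2 x 2 spread matrix with the two extreme eigenvalues on its diagonal.
D_TOY = [[LAMBDA_MAX, 0.0], [0.0, LAMBDA_MIN]]

lam_small = rayleigh_quotient([1.0, 1.0], D_TOY)
lam_large = rayleigh_quotient([5.0, 5.0], D_TOY)  # same direction, five times longer
r_mid = radius(lam_small)
```

An input parallel to the eigenvector of the largest eigenvalue receives the radius 0.5, one parallel to the eigenvector of the smallest eigenvalue receives 0.05, and every other direction falls in between, regardless of the length of the vector.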


Fig. 8. Fuzzy principal component scores for respective evaluated geographical sites.

The following are the evaluation vectors obtained from evaluation near the Mikawa Bridge (m = 2):

$$\begin{cases} z_{21} = (2, 2, 5, 4, 1), & z_{22} = (2, 2, 4, 4, 4), & z_{23} = (5, 1, 5, 5, 2) \\ z_{24} = (4, 5, 4, 3, 3), & z_{25} = (3, 2, 2, 3, 3), & z_{26} = (4, 2, 2, 2, 3) \\ z_{27} = (5, 3, 5, 4, 3), & z_{28} = (2, 4, 4, 4, 2) \end{cases} \tag{73}$$

When these vectors, after subtracting

$$z_0 = (3.357, 2.796, 2.291, 3.827, 2.119) \tag{74}$$

are input to the model, we obtain Fig. 9. The item indicated by 2 in Fig. 9 denotes the case where the average data for the 8 people was used, and the items indicated by 21, 22, . . . , 28 denote the cases where the respective evaluator data was input. Furthermore, Figs. 10 and 11 show the fuzzy principal component scores for a number of possible (extremal) input vectors. Looking at Fig. 10, we can see that the evaluation vectors from (1, 1, 1, 1, 1) to (5, 5, 5, 5, 5) lie on a single straight line. The small radii of the evaluation vectors (4, 4, 1, 4, 3) and (4, 4, 1, 4, 5) are attributable to the fact that, after the average vector $z_0$ is subtracted, they are nearly parallel with the eigenvector corresponding to the minimum eigenvalue of $D_A$. This suggests that responses tend not to disperse in that direction. Conversely, the large radius of the evaluation vector (2, 1, 2, 5, 2) arises because it is almost parallel with the eigenvector which corresponds to the maximum eigenvalue of $D_A$. This suggests that responses are dispersed in that direction. In other words, there are many responses that vary in proportion to the vector obtained by subtracting the average vector from this one. We can say that this signifies that the evaluated geographical site is difficult to specify.

Fig. 9. Fuzzy principal component scores from 8 evaluators at geographical site 2.

Fig. 10. Fuzzy principal component scores (1).

Fig. 11. Fuzzy principal component scores (2).

6. Correspondence analysis

Here we consider the handling of sensibility data in correspondence analysis, by directly applying the technique of the previous section. First we construct an average model, and then we introduce relative fuzziness with respect to evaluation objects, and relative fuzziness with respect to evaluation items.

6.1. Average model

We first normalize the evaluator average data $\{z_{mn}\}$ given by Eq. (9) as follows:

$$p_{mn} = \frac{z_{mn}}{z_0}, \quad z_0 = \sum_{m=1}^{M}\sum_{n=1}^{N} z_{mn} \tag{75}$$

and prepare the following correlation table:

$$P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1N} \\ p_{21} & p_{22} & \cdots & p_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ p_{M1} & p_{M2} & \cdots & p_{MN} \end{pmatrix} \tag{76}$$

Here, if we let

$$p_{m\cdot} = \sum_{n=1}^{N} p_{mn}, \quad p_{\cdot n} = \sum_{m=1}^{M} p_{mn} \tag{77}$$

then the following equation holds:

$$\sum_{m=1}^{M} p_{m\cdot} = \sum_{n=1}^{N} p_{\cdot n} = 1 \tag{78}$$
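The normalization of Eqs. (75)–(78) amounts to dividing the table of averages by its grand total and then summing rows and columns. A minimal sketch follows; the 3 x 2 table of averages and the function name are hypothetical.

```python
def correspondence_table(z_avg):
    """Return the table P of Eq. (76) with its row and column marginals (Eq. (77))."""
    total = sum(sum(row) for row in z_avg)                # z_0 of Eq. (75)
    p = [[value / total for value in row] for row in z_avg]
    row_marginals = [sum(row) for row in p]               # sums over items n
    col_marginals = [sum(column) for column in zip(*p)]   # sums over objects m
    return p, row_marginals, col_marginals


# Hypothetical evaluator averages for M = 3 objects and N = 2 items.
Z_AVG = [[3.0, 1.0], [2.0, 4.0], [5.0, 5.0]]
P, P_ROW, P_COL = correspondence_table(Z_AVG)
```

Both sets of marginals sum to one, as stated in Eq. (78), so they can serve as weights in the averages and variances that follow.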

In correspondence analysis, a quantity $x_m$ is associated with evaluation object m, and a quantity $y_n$ is associated with evaluation item n, and we find the following vectors to maximize the correlation coefficient $\rho_{xy}$ defined below:

$$x = (x_1, x_2, \ldots, x_M)^t, \quad y = (y_1, y_2, \ldots, y_N)^t \tag{79}$$

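Here, the correlation coefficient is defined by the following equation:

$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \tag{80}$$

where

$$\sigma_x^2 = \sum_{m=1}^{M} p_{m\cdot} x_m^2 - \left(\sum_{m=1}^{M} p_{m\cdot} x_m\right)^2, \quad \sigma_y^2 = \sum_{n=1}^{N} p_{\cdot n} y_n^2 - \left(\sum_{n=1}^{N} p_{\cdot n} y_n\right)^2 \tag{81}$$

$$\sigma_{xy} = \sum_{m=1}^{M}\sum_{n=1}^{N} p_{mn} x_m y_n - \sum_{m=1}^{M} p_{m\cdot} x_m \sum_{n=1}^{N} p_{\cdot n} y_n \tag{82}$$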
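For given score vectors, the coefficient of Eqs. (80)–(82) can be evaluated directly. The sketch below uses a hypothetical 2 x 2 table whose mass is concentrated on the diagonal; the function name and the scores are assumptions made only for illustration.

```python
import math


def correlation(p, x, y):
    """rho_xy of Eq. (80), using the variances and covariance of Eqs. (81) and (82)."""
    p_row = [sum(row) for row in p]
    p_col = [sum(column) for column in zip(*p)]
    mean_x = sum(pm * xm for pm, xm in zip(p_row, x))
    mean_y = sum(pn * yn for pn, yn in zip(p_col, y))
    var_x = sum(pm * xm * xm for pm, xm in zip(p_row, x)) - mean_x ** 2
    var_y = sum(pn * yn * yn for pn, yn in zip(p_col, y)) - mean_y ** 2
    cross = sum(p[m][n] * x[m] * y[n] for m in range(len(x)) for n in range(len(y)))
    return (cross - mean_x * mean_y) / math.sqrt(var_x * var_y)


# Hypothetical table: object 1 responds mainly to item 1, object 2 to item 2.
P = [[0.4, 0.1], [0.1, 0.4]]
rho_aligned = correlation(P, [0.0, 1.0], [0.0, 1.0])  # similar scores for paired m and n
rho_flipped = correlation(P, [0.0, 1.0], [1.0, 0.0])  # scores assigned the other way round
```

Giving similar scores to the object and item that respond strongly to each other yields a positive coefficient (0.6 here), while swapping the item scores yields the same magnitude with the opposite sign.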

This maximization problem reduces to an eigenvalue problem, and the solution is given by the eigenvectors (see Note 8):

$$\tilde{x}_i = (\tilde{x}_{i1}, \tilde{x}_{i2}, \ldots, \tilde{x}_{iM})^t, \quad i = 1, 2, \ldots, M \tag{83}$$

$$\tilde{y}_i = (\tilde{y}_{i1}, \tilde{y}_{i2}, \ldots, \tilde{y}_{iN})^t, \quad i = 1, 2, \ldots, N \tag{84}$$

However, the first eigenvector is a meaningless solution, and in correspondence analysis, we check the proximity relationship between objects and evaluation items by plotting the following points on a two-dimensional plane using the second and third eigenvectors:

$$(\tilde{x}_{2m}, \tilde{x}_{3m}), \quad m = 1, 2, \ldots, M \tag{85}$$

$$(\tilde{y}_{2n}, \tilde{y}_{3n}), \quad n = 1, 2, \ldots, N \tag{86}$$


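Note 7: The correlation coefficient introduced above is defined as follows. First we calculate the weighted averages of $\{x_1, x_2, \ldots, x_M\}$ and $\{y_1, y_2, \ldots, y_N\}$:

$$\bar{x} = \sum_{m=1}^{M} p_{m\cdot} x_m, \quad \bar{y} = \sum_{n=1}^{N} p_{\cdot n} y_n \tag{87}$$

and define the variance and covariance as follows:

$$\sigma_x^2 = \sum_{m=1}^{M} p_{m\cdot} (x_m - \bar{x})^2, \quad \sigma_y^2 = \sum_{n=1}^{N} p_{\cdot n} (y_n - \bar{y})^2 \tag{88}$$

$$\sigma_{xy} = \sum_{m=1}^{M}\sum_{n=1}^{N} p_{mn} (x_m - \bar{x})(y_n - \bar{y}) \tag{89}$$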

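If object m responds strongly to item n, then $p_{mn}$ becomes large. In this case we assign similar values to $x_m$ and $y_n$, which makes $\sigma_{xy}$ large.

Note 8: Standardizing x and y:

$$\tilde{x} = (\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_M)^t, \quad \tilde{x}_m = \frac{1}{\sigma_x}\left(x_m - \sum_{m'=1}^{M} p_{m'\cdot} x_{m'}\right) \tag{90}$$

$$\tilde{y} = (\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_N)^t, \quad \tilde{y}_n = \frac{1}{\sigma_y}\left(y_n - \sum_{n'=1}^{N} p_{\cdot n'} y_{n'}\right) \tag{91}$$

and introducing the following matrices $P_x$ and $P_y$:

$$P_x = \begin{pmatrix} p_{1\cdot} & 0 & \cdots & 0 \\ 0 & p_{2\cdot} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & p_{M\cdot} \end{pmatrix}, \quad P_y = \begin{pmatrix} p_{\cdot 1} & 0 & \cdots & 0 \\ 0 & p_{\cdot 2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & p_{\cdot N} \end{pmatrix} \tag{92}$$

we obtain the following necessary conditions:

$$P\tilde{y} - \rho_{xy} P_x \tilde{x} = 0, \quad P^t \tilde{x} - \rho_{xy} P_y \tilde{y} = 0 \tag{93}$$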
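Eq. (93) couples $\tilde{x}$ and $\tilde{y}$ through the rectangular matrix P. One standard way to decouple the two conditions is sketched below. It assumes that every marginal $p_{m\cdot}$ and $p_{\cdot n}$ is positive, and the symbols u, v and B are introduced only for this sketch.

```latex
% Change of variables (assumes all diagonal entries of P_x and P_y are positive)
u = P_x^{1/2}\,\tilde{x}, \qquad
v = P_y^{1/2}\,\tilde{y}, \qquad
B = P_x^{-1/2}\, P\, P_y^{-1/2}

% Eq. (93) then reads
B v = \rho_{xy}\, u, \qquad B^{t} u = \rho_{xy}\, v

% Eliminating v (or u) gives eigenvalue problems for symmetric matrices
B B^{t} u = \rho_{xy}^{2}\, u, \qquad B^{t} B\, v = \rho_{xy}^{2}\, v
```

The scores are recovered from $\tilde{x} = P_x^{-1/2} u$ and $\tilde{y} = P_y^{-1/2} v$.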

We can solve these equations by transforming them into an eigenvalue problem for a symmetric matrix.

6.2. Relative fuzziness with respect to evaluation objects

We define the following quantity, which indicates the variation in the evaluation of evaluation object m by evaluator k:

$$b_{mk} = \frac{1}{N}\sum_{n=1}^{N} (z_{mnk} - z_{mn}), \quad k \in E_m \tag{94}$$


Using this, the following vector is introduced:

$$b_k = b_{mk}\,(\delta_{1k}, \delta_{2k}, \ldots, \delta_{Mk})^t, \quad \delta_{mk} = \begin{cases} 1, & k \in E_m \\ 0, & k \notin E_m \end{cases} \tag{95}$$

The pseudo-eigenvector peculiar to evaluator k

$$\tilde{x}_{ik} = (\tilde{x}_{i1k}, \tilde{x}_{i2k}, \ldots, \tilde{x}_{iMk})^t \tag{96}$$

is defined by the following equation:

$$\tilde{x}_{ik} = \tilde{x}_i + b_k \tag{97}$$

Here, the following equation holds due to Eq. (10):

$$\frac{1}{K}\sum_{k=1}^{K} \tilde{x}_{ik} = \tilde{x}_i \tag{98}$$

We introduce a fuzzy vector, whose components are fuzzy numbers

$$X_i = (X_{i1}, X_{i2}, \ldots, X_{iM})^t \tag{99}$$

and define the membership function with the following equation:

$$\mu_{X_i}(\mathbf{x}) = \exp\{-(\mathbf{x} - \tilde{x}_i)^t D_{X_i}^{-1} (\mathbf{x} - \tilde{x}_i)\} \tag{100}$$

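The matrix $D_{X_i}$, which stipulates the spread of the membership function, is defined using the variance–covariance matrix:

$$\Sigma = \frac{1}{K}\sum_{k=1}^{K} (\tilde{x}_{ik} - \tilde{x}_i)(\tilde{x}_{ik} - \tilde{x}_i)^t = \frac{1}{K}\begin{pmatrix} \sum_{k=1}^{K} b_{1k}^2 \delta_{1k} & 0 & \cdots & 0 \\ 0 & \sum_{k=1}^{K} b_{2k}^2 \delta_{2k} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sum_{k=1}^{K} b_{Mk}^2 \delta_{Mk} \end{pmatrix} \tag{101}$$

Here, $D_{X_i}$ does not depend on i, so we write $D_X$, and set

$$D_X = \Sigma \tag{102}$$

When we introduce the vector

$$a_m = (a_{m1}, a_{m2}, \ldots, a_{mM})^t, \quad a_{mm'} = \begin{cases} 1, & m = m' \\ 0, & m \ne m' \end{cases} \tag{103}$$

we obtain

$$X_{im} = a_m^t X_i \tag{104}$$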

Applying the extension principle, the membership function of $X_{im}$ is found as follows:

$$\mu_{X_{im}}(x) = \max\{\mu_{X_i}(\mathbf{x}) \mid x = a_m^t \mathbf{x}\} = \exp\left\{-\frac{(x - a_m^t \tilde{x}_i)^2}{a_m^t D_X a_m}\right\} \tag{105}$$

We define the membership function for the fuzzy vector $(X_{2m}, X_{3m})$ by

$$\mu_{X_{2m}X_{3m}}(x_2, x_3) = \mu_{X_{2m}}(x_2) \times \mu_{X_{3m}}(x_3) = \exp\left\{-\frac{(x_2 - a_m^t \tilde{x}_2)^2 + (x_3 - a_m^t \tilde{x}_3)^2}{a_m^t D_X a_m}\right\} \tag{106}$$

As in the previous section, we let the following be the circle indicating relative fuzziness:

$$(x_2 - a_m^t \tilde{x}_2)^2 + (x_3 - a_m^t \tilde{x}_3)^2 = r_m^2 \tag{107}$$

and establish the radius $r_m$ with the following equation:

$$\frac{r_m^2 - r_{\min}^2}{r_{\max}^2 - r_{\min}^2} = \frac{d_m - \min\{d_m\}}{\max\{d_m\} - \min\{d_m\}} \tag{108}$$

where

$$d_m = \frac{a_m^t D_X a_m}{a_m^t a_m} = \frac{1}{K}\sum_{k=1}^{K} b_{mk}^2 \delta_{mk} = \frac{1}{K}\sum_{k \in E_m} b_{mk}^2 \tag{109}$$
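The chain from the evaluator deviations $b_{mk}$ of Eq. (94) to the radii $r_m$ of Eq. (108) can be sketched as follows. The data are hypothetical, each evaluator is assumed to rate exactly one object, and $z_{mn}$ is assumed to be the mean of $z_{mnk}$ over $k \in E_m$; the function names are not from the paper.

```python
import math


def object_spreads(evaluations):
    """d_m of Eq. (109) for every object m.

    `evaluations` maps an object m to the list of evaluator vectors z_mk.
    """
    total = sum(len(vectors) for vectors in evaluations.values())  # K
    spreads = {}
    for m, vectors in evaluations.items():
        n_items = len(vectors[0])
        z_bar = [sum(v[n] for v in vectors) / len(vectors) for n in range(n_items)]
        # b_mk of Eq. (94): deviation of evaluator k averaged over the N items
        b = [sum(v[n] - z_bar[n] for n in range(n_items)) / n_items for v in vectors]
        spreads[m] = sum(bk * bk for bk in b) / total
    return spreads


def object_radii(spreads, r_min=0.05, r_max=0.5):
    """r_m of Eq. (108); requires at least two distinct d_m values."""
    low, high = min(spreads.values()), max(spreads.values())
    return {
        m: math.sqrt(r_min ** 2 + (d - low) / (high - low) * (r_max ** 2 - r_min ** 2))
        for m, d in spreads.items()
    }


# Hypothetical data: three objects, N = 2 items, K = 7 evaluators in total.
TOY = {
    1: [[1, 1], [3, 3]],
    2: [[2, 4], [4, 2], [3, 3]],
    3: [[1, 2], [2, 3]],
}
D = object_spreads(TOY)
R = object_radii(D)
```

Object 2 receives the minimum radius even though its evaluators disagree item by item, because $b_{mk}$ averages the deviations over the items and opposite deviations cancel.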

6.3. Relative fuzziness with respect to evaluation items

The vector $c_k$ indicating the response variation of an evaluator with respect to evaluation item n is defined as follows:

$$c_k = (c_{1k}, c_{2k}, \ldots, c_{Nk})^t, \quad c_{nk} = z_{mnk} - z_{mn}, \quad k \in E_m \tag{110}$$

The pseudo-eigenvector peculiar to evaluator k

$$\tilde{y}_{ik} = (\tilde{y}_{i1k}, \tilde{y}_{i2k}, \ldots, \tilde{y}_{iNk})^t \tag{111}$$

is given by the following equation:

$$\tilde{y}_{ik} = \tilde{y}_i + c_k \tag{112}$$

Here too, the following equation holds due to Eq. (10):

$$\frac{1}{K}\sum_{k=1}^{K} \tilde{y}_{ik} = \tilde{y}_i \tag{113}$$

Just as we previously introduced the relative fuzziness with respect to the evaluation object, here too we introduce a fuzzy vector, whose components are fuzzy numbers

$$Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{iN})^t \tag{114}$$

The membership function is defined as follows:

$$\mu_{Y_i}(\mathbf{y}) = \exp\{-(\mathbf{y} - \tilde{y}_i)^t D_{Y_i}^{-1} (\mathbf{y} - \tilde{y}_i)\} \tag{115}$$

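Here too, $D_{Y_i}$ is defined by using the variance–covariance matrix

$$T = \frac{1}{K}\sum_{k=1}^{K} (\tilde{y}_{ik} - \tilde{y}_i)(\tilde{y}_{ik} - \tilde{y}_i)^t = \frac{1}{K}\begin{pmatrix} \sum_{k=1}^{K} c_{1k}c_{1k} & \sum_{k=1}^{K} c_{1k}c_{2k} & \cdots & \sum_{k=1}^{K} c_{1k}c_{Nk} \\ \sum_{k=1}^{K} c_{2k}c_{1k} & \sum_{k=1}^{K} c_{2k}c_{2k} & \cdots & \sum_{k=1}^{K} c_{2k}c_{Nk} \\ \vdots & \vdots & \ddots & \vdots \\ \sum_{k=1}^{K} c_{Nk}c_{1k} & \sum_{k=1}^{K} c_{Nk}c_{2k} & \cdots & \sum_{k=1}^{K} c_{Nk}c_{Nk} \end{pmatrix} \tag{116}$$

This is not dependent on i, so we write $D_{Y_i}$ as $D_Y$, and define as follows:

$$D_Y = T \tag{117}$$

Next, we introduce the vector

$$a_n = (a_{n1}, a_{n2}, \ldots, a_{nN})^t, \quad a_{nn'} = \begin{cases} 1, & n = n' \\ 0, & n \ne n' \end{cases} \tag{118}$$

and map the fuzzy vector $Y_i$ onto the fuzzy number $Y_{in}$ corresponding to the nth evaluation item:

$$Y_{in} = a_n^t Y_i \tag{119}$$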

Using the extension principle, we obtain the following membership function:

$$\mu_{Y_{in}}(y) = \max\{\mu_{Y_i}(\mathbf{y}) \mid y = a_n^t \mathbf{y}\} = \exp\left\{-\frac{(y - a_n^t \tilde{y}_i)^2}{a_n^t D_Y a_n}\right\} \tag{120}$$

The membership function of the fuzzy vector $(Y_{2n}, Y_{3n})$ is defined as follows:

$$\mu_{Y_{2n}Y_{3n}}(y_2, y_3) = \mu_{Y_{2n}}(y_2) \times \mu_{Y_{3n}}(y_3) = \exp\left\{-\frac{(y_2 - a_n^t \tilde{y}_2)^2 + (y_3 - a_n^t \tilde{y}_3)^2}{a_n^t D_Y a_n}\right\} \tag{121}$$

We define the relative fuzziness by the following equation:

$$(y_2 - a_n^t \tilde{y}_2)^2 + (y_3 - a_n^t \tilde{y}_3)^2 = s_n^2 \tag{122}$$

The radius $s_n$ is calculated using the following equation:

$$\frac{s_n^2 - s_{\min}^2}{s_{\max}^2 - s_{\min}^2} = \frac{d_n - \min\{d_n\}}{\max\{d_n\} - \min\{d_n\}} \tag{123}$$

where $d_n$ is given by the following equation:

$$d_n = \frac{a_n^t D_Y a_n}{a_n^t a_n} = \frac{1}{K}\sum_{k=1}^{K} c_{nk}^2 = \frac{1}{K}\sum_{m=1}^{M}\sum_{k \in E_m} (z_{mnk} - z_{mn})^2 \tag{124}$$

Thus, the square of the radius is determined in such a way that it varies in proportion to the variance. Note that $d_n$ is given by the weighted average of the variance of the original data, as indicated below:

$$d_n = \frac{1}{K}\sum_{m=1}^{M}\sum_{k \in E_m} (z_{mnk} - z_{mn})^2 = \frac{\sum_{m=1}^{M} |E_m| \left\{\frac{1}{|E_m|}\sum_{k \in E_m} (z_{mnk} - z_{mn})^2\right\}}{\sum_{m=1}^{M} |E_m|} \tag{125}$$

6.4. Application to subjective evaluation data on waterside environments

Here we conduct a correspondence analysis that expresses relative fuzziness, using sensibility evaluation data on waterside environments. The second and third eigenvectors were found as follows:

$$x_2 = (0.8296, 1.0557, 1.2917, 0.5099, 1.1373, 0.7321, 1.1812)^t, \quad x_3 = (1.3480, 0.0947, 0.9417, 0.6092, 1.0063, 1.6744, 0.8953)^t \tag{126}$$

$$y_2 = (1.0326, 0.6659, 1.9128, 0.1640, 0.7161)^t, \quad y_3 = (0.8225, 1.9050, 0.2835, 0.5305, 0.5799)^t \tag{127}$$

When we set

$$r_{\max} = s_{\max} = 0.5, \quad r_{\min} = s_{\min} = 0.05 \tag{128}$$

we can find the centers and radii of the circles indicating the evaluation objects m = 1, 2, . . . , 7 and the evaluation items n = 1, 2, . . . , 5, as indicated below:

$$\begin{cases} m = 1: & (x_{21}, x_{31}) = (0.8296, 1.3480), & r_1 = 0.5000 \\ m = 2: & (x_{22}, x_{32}) = (1.0557, 0.0947), & r_2 = 0.1531 \\ m = 3: & (x_{23}, x_{33}) = (1.2917, 0.9417), & r_3 = 0.3195 \\ m = 4: & (x_{24}, x_{34}) = (0.5099, 0.6092), & r_4 = 0.1757 \\ m = 5: & (x_{25}, x_{35}) = (1.1373, 1.0063), & r_5 = 0.0500 \\ m = 6: & (x_{26}, x_{36}) = (0.7321, 1.6744), & r_6 = 0.3529 \\ m = 7: & (x_{27}, x_{37}) = (1.1812, 0.8953), & r_7 = 0.1532 \end{cases} \tag{129}$$

$$\begin{cases} n = 1: & (y_{21}, y_{31}) = (1.0326, 0.8225), & s_1 = 0.4375 \\ n = 2: & (y_{22}, y_{32}) = (0.6659, 1.9050), & s_2 = 0.5000 \\ n = 3: & (y_{23}, y_{33}) = (1.9128, 0.2835), & s_3 = 0.3398 \\ n = 4: & (y_{24}, y_{34}) = (0.1640, 0.5305), & s_4 = 0.4090 \\ n = 5: & (y_{25}, y_{35}) = (0.7161, 0.5799), & s_5 = 0.0500 \end{cases} \tag{130}$$

Fig. 12. Results using the proposed technique. (Evaluation objects m = 1, 2, . . . , 7; evaluation items: n = 1 Water plans, n = 2 Recreation, n = 3 Barbecuing, Camping, n = 4 Embankments, n = 5 Clean.)

Fig. 12 was obtained from the above. The figure shows circles, with radii corresponding to the relative fuzziness obtained using the proposed technique.

7. Conclusion

In this paper we considered a new fuzzy principal component analysis technique for analyzing sensibility data. We investigated a method of fuzzifying data and a method of fuzzifying weight parameters. In both cases, we attempted to reflect the dispersion of evaluation values among evaluators in the principal component scores as faithfully as possible; the two methods differ in how they achieve this.

One remaining problem is that parameter setting is extremely ad hoc, and we would like to optimize it by introducing some kind of standard. However, it is impossible to define an absolute value for "vagueness". Nor is there a solid basis for formulating the problem so as to keep the spread of the fuzzy eigenvalues to a minimum, as in the Yabuuchi and Watada method. Consequently, in this paper, we have introduced the concept of relative fuzziness.

The main purpose of principal component analysis is to see the positional relationships of objects in a low-dimensional principal component space; we introduced vagueness of position into this space in a relative fashion. We then assumed that vagueness reflects the manner of dispersion of the evaluation data. This allowed us to see the distinctive features of the evaluators' manner of evaluation. This could be used in considering combinations of examiners so that university or company entrance examinations are conducted fairly.

Finally, the proposed fuzzy principal component model was modified to eliminate the effect of the evaluation vector length. In this way, the fuzziness of the principal component score is found using the relationship with the variance–covariance matrix which stipulates the fuzziness of the model parameters. The radii of the circles in several figures in the paper express the relative fuzziness of the locations of the objects evaluated or of the words used in evaluation. Looking at those figures, we can understand how opinions are spread in terms of objects or words. Such information is useful in decision-making or in the next stage of analysis using subjective evaluation.

We also proposed correspondence analysis incorporating relative fuzziness, as a direct application of the second of the abovementioned methods for principal component analysis. Additionally, in this paper we developed theory relating to the evaluation of a single object by evaluators; an issue for future study will be expanding this to cases where general incomplete 3-mode data is available.


References

[1] P. Arabie, J.D. Carroll, W.S. DeSarbo, Three-way Scaling and Clustering, Sage Publications, 1987.
[2] J.L. García-Lapresta, A general class of simple majority decision rules based on linguistic opinions, Information Sciences 176 (2006) 352–365.
[3] J. Gebhardt, R. Kruse, The context model: an integrating view of vagueness and uncertainty, International Journal of Approximate Reasoning 9 (1993) 282–314.
[4] R.A. Harshman, M.E. Lundy, Data preprocessing and the extended PARAFAC model, in: H.G. Law, C.W. Snyder, J.A. Hattie, R.P. McDonald (Eds.), Research Methods for Multimode Data Analysis, Praeger, 1984, pp. 184–216.
[5] V.N. Huynh, Y. Nakamori, T.B. Ho, G. Resconi, A context model for fuzzy concept analysis based upon modal logic, Information Sciences 160 (2004) 111–129.
[6] V.N. Huynh, Y. Nakamori, A satisfactory-oriented approach to multiexpert decision-making with linguistic assessments, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 35 (2) (2005) 184–196.
[7] R. Kruse, J. Gebhardt, F. Klawonn, Numerical and logical approaches of fuzzy set theory by the context model, in: R. Lowen, M. Roubens (Eds.), Fuzzy Logic: State of the Art, Kluwer Academic Publishers, 1993, pp. 365–376.
[8] Y. Nakamori, M. Ryoke, Modeling of fuzziness in multivariate data analysis, in: Proceedings of the SMC'99 (1999 IEEE International Conference on Systems, Man and Cybernetics), Tokyo, Japan, October 12–15, 1999, pp. 302–307.
[9] Y. Nakamori, M. Ryoke, Fuzzy data analysis for three-way data, in: Proceedings of the Joint 9th IFSA World Congress and 20th NAFIPS International Conference: Fuzziness and Soft Computing in the New Millennium, Vancouver, Canada, June 25–28, 2001, pp. 2189–2194.
[10] W. Pedrycz, Fuzzy equalization in the construction of fuzzy sets, Fuzzy Sets and Systems 119 (2001) 329–335.
[11] G. Resconi, I.B. Turksen, Canonical forms of fuzzy truthoods by meta-theory based upon modal logic, Information Sciences 131 (2001) 157–194.
[12] H. Tanaka, Fuzzy data analysis by possibilistic linear models, Fuzzy Sets and Systems 24 (1987) 363–375.
[13] H. Tanaka, H. Ishibuchi, Identification of possibilistic linear systems by quadratic membership functions of fuzzy parameters, Fuzzy Sets and Systems 41 (1991) 145–160.
[14] H. Tanaka, H. Ishibuchi, Soft Data Analysis, Asakura-Shoten, 1995.
[15] L.R. Tucker, The extension of factor analysis to three-dimensional matrices, in: H. Gulliksen, N. Frederiksen (Eds.), Contributions to Mathematical Psychology, Holt, Rinehart and Winston, 1964, pp. 110–127.
[16] I.B. Turksen, Measurement of membership functions and their acquisition, Fuzzy Sets and Systems 40 (1991) 5–38.
[17] Z. Xu, A method based on linguistic aggregation operators for group decision making with linguistic preference relations, Information Sciences 166 (2004) 19–30.
[18] Y. Yabuuchi, J. Watada, Fuzzy principal component analysis and its application, Journal of Biomedical Fuzzy Systems Association 3 (1997) 83–92.
[19] Y. Yabuuchi, J. Watada, Y. Nakamori, Fuzzy principal component analysis for fuzzy data, in: Proceedings of the 6th IEEE International Conference on Fuzzy Systems, Barcelona, Spain, July 1–5, 1997, pp. 1127–1130.
[20] L.A. Zadeh, The concept of linguistic variable and its application to approximate reasoning, Information Sciences 8 (1975) 199–249, II: 8 (1975), III: 9 43–80.