Principal Component Analysis


Upload: alexandra-ronaldinha

Post on 11-Nov-2014


INTRODUCTION

This method is used to study an individuals × variables table in which all the variables are numeric.

We first present an exploratory approach that describes the individuals along their multiple dimensions and visualizes the relationships between the variables.

Principal Component Analysis (PCA) comes next. This method yields a map of the individuals based on their similarities and a map of the variables based on their correlations.

It is also possible to represent the individuals and the variables simultaneously on a single map.

In addition, the graphical representation of the data needs to be complemented by a typology of the individuals.

We therefore also present ascending hierarchical classification with Ward's criterion, which is very well suited to numeric data.

Exploratory analysis of multidimensional data

Table 1 serves as the running example for this presentation. Its rows are car models from a single year (the year is illegible in the source) and its columns are technical characteristics: engine displacement, power, speed, weight, length, and width.

Table 1. Characteristics of the 24 car models
Columns: No., Model, Displacement, Power, Speed, Weight, Length, Width (the table body is missing from the transcript)

Descriptive study of the individuals

The 24 individuals can be represented together with their 6 characteristics using the star plot shown in Figure 1.

Figure 1. Star plot (axes: Power, Displacement, Weight, Speed, Width, Length)

Each individual is represented by a hexagon, and each vertex of the hexagon corresponds to one variable. For a given individual, the distance from a vertex to the origin is proportional to the deviation of the variable's value from its minimum: it is smallest when the characteristic takes its minimum value and largest when it takes its maximum value.

The 24 individuals are shown in Figure 2. A few particular cases are worth noting:

- Peugeot 205 Rallye, Seat Ibiza SXi, and Citroën AX Sport have high speed and high power relative to their other characteristics.
- Nissan Vanette and VW Caravelle are characterized by low speed.

Figure 2. Star plots for the individuals

- One Renault model (its number is illegible in the source) has low power relative to its displacement: it is a diesel.

In general, all the characteristics move in the same direction. The star plots grow steadily from the small cars, such as the Ford Fiesta and Fiat Uno, to the largest ones, such as the BMW 530i, the Rover, and the Renault 25.

Descriptive study of the variables

Table 2. Summary statistics of the data

Elementary statistics
Columns: Variable, Mean, Variance, Standard deviation, Minimum, Maximum

Correlations
Columns: Variable, Displacement, Power, Speed, Weight, Length

Table 2 presents some elementary statistics and the correlation matrix of the variables. Note that the variances are computed by dividing by n rather than by (n-1), because this is a geometric analysis of the data and no statistical inference is involved. (Inferential statistics studies a sample and draws conclusions about the whole population.)

To get a complete view of the data and of the inter-relationships between the variables, we built the plot in Figure 3.

Figure 3. Plot of the correlations between variables

All the variables are positively correlated with one another. This means that there is a size factor and that, as a first analysis, the cars can be ordered from the smallest to the largest. This is also visible from an overall look at the star plots in Figure 2.

Speed is strongly correlated with Power and less so with the other variables.

The variables in the group (Displacement, Length, Weight) are strongly correlated with one another:

Cor(Displacement, Length) = 0.86
Cor(Displacement, Weight) = 0.90
Cor(Length, Weight) = 0.92

Other notable correlations are the one between Length and Width and the one between Power and Speed (0.89).

This first analysis can be summarized and visualized through an ascending hierarchical classification of the variables, using their correlations (all positive here) as the similarity index between variables. If some correlations are negative, their absolute values are used instead.

The method proceeds as follows:

- Step 1: the two most strongly correlated variables, Weight and Length (0.92), are grouped together.
- Step 2: the strongest remaining correlation is sought. It is the one between Weight and Displacement (0.90), so Displacement joins the group (Weight, Length).
- Step 3: the group (Power, Speed) is formed, with a correlation of 0.894.
- Step 4: Width joins the group (Displacement, Weight, Length) through its correlation with Length.
- Step 5: finally, the two groups (Power, Speed) and (Displacement, Weight, Length, Width) merge. The strongest correlation between a variable of the first group and a variable of the second is 0.861, between Power and Displacement.

This iterative aggregation procedure is visualized by the dendrogram in Figure 4. The aggregation index is the strongest correlation between a variable of one group and a variable of the other at the moment the two groups are merged.
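The aggregation procedure described above can be sketched in a few lines. This is only an illustration: the function name `max_correlation_clustering` is an assumption, and the matrix holds just three of the six variables, using the correlations quoted earlier in the text (0.90, 0.86, 0.92).

```python
def max_correlation_clustering(names, corr):
    """Merge variables step by step; two groups are linked by the largest
    absolute correlation between a member of one and a member of the other."""
    groups = [[i] for i in range(len(names))]
    history = []
    while len(groups) > 1:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                link = max(abs(corr[i][j]) for i in groups[a] for j in groups[b])
                if best is None or link > best[0]:
                    best = (link, a, b)
        link, a, b = best
        merged = groups[a] + groups[b]
        history.append((sorted(names[i] for i in merged), link))
        groups = [g for k, g in enumerate(groups) if k not in (a, b)] + [merged]
    return history


# Three of the six variables, with the correlations quoted in the text.
names = ["Displacement", "Weight", "Length"]
corr = [
    [1.00, 0.90, 0.86],
    [0.90, 1.00, 0.92],
    [0.86, 0.92, 1.00],
]
history = max_correlation_clustering(names, corr)
for members, link in history:
    print(members, link)
```

On this subset the first merge is (Length, Weight) at 0.92 and the second adds Displacement at 0.90, matching the first two steps of the procedure.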

Figure 4. Ascending hierarchical classification of the variables (maximum-correlation method)
Axes: aggregation index, variable

We can also measure the similarity between each variable and the set of all variables (itself included) using squared correlations. This amounts to measuring the importance of a variable.

For example, the importance of Displacement is the mean of its squared correlations with all the variables:

$$\frac{1}{6}\left(1^2 + 0.861^2 + 0.693^2 + 0.905^2 + 0.864^2 + 0.709^2\right) = \frac{4.29}{6} = 0.715$$
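The arithmetic of this importance measure is easy to reproduce. The sketch below uses the six correlations that appear in the formula; the function name `importance` is an assumption made for the illustration.

```python
# Correlations of Displacement with the six variables (itself included),
# as they appear in the worked formula.
correlations = [1.0, 0.861, 0.693, 0.905, 0.864, 0.709]


def importance(cors):
    """Mean of the squared correlations of one variable with all variables."""
    return sum(c * c for c in cors) / len(cors)


displacement_importance = importance(correlations)
print(round(displacement_importance, 3))  # 0.715
```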

Table 3 gives the similarity of each variable with the whole set of variables.

Table 3. Similarity of each variable with the set of all variables

The variable that best summarizes the set of 6 variables is therefore Displacement. Speed is more independent of the others.

Principal Component Analysis

The data to be analyzed take the form of an individuals × variables table. There are p variables $X_1, \dots, X_j, \dots, X_p$ observed on n individuals $1, \dots, i, \dots, n$. We write:

- $x_{ij}$ for the value taken by variable $X_j$ for individual i;
- $x_i = (x_{i1}, \dots, x_{ip})$ for the set of characteristics of individual i;
- $\bar{x}_j$, $s_j^2 = \frac{1}{n}\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)^2$, and $s_j$ for the mean, the variance, and the standard deviation of variable $X_j$.

Note that the differences encountered between French and American PCA programs in the computation of the principal components come from dividing by n or by (n-1) when computing the variance.

Principal Component Analysis consists in finding a small number of new variables $Y_1, \dots, Y_m$, called principal components, that are uncorrelated with one another and summarize the initial data as well as possible.

Several criteria lead to the principal components. The inertia criterion is the oldest (Pearson, 1901) and has the following advantages:

- The approach is geometric, which gives a deeper understanding of the method and of the intermediate results that help in interpreting it.
- Correspondence Analysis becomes a generalization of Principal Component Analysis and is easier to understand in this geometric framework.
- The results of French PCA programs correspond to this approach.

On the other hand, the inertia criterion is much more complex than the two other criteria, proposed by Hotelling (1933): the correlation criterion and the variance criterion.

We also present these two criteria, since they correspond to the results produced by American PCA programs.

Presentation of PCA following Pearson's geometric approach

The cloud of points associated with the data and its characteristics

In this geometric approach, the data are associated with the cloud of points $N = \{x_1, \dots, x_i, \dots, x_n\}$ in a space of dimension p: each vector $x_i = (x_{i1}, \dots, x_{ip})$ of characteristics of individual i is treated as a point in a p-dimensional space.

The center of gravity of the cloud N is the point g whose coordinates are the means of the variables:

$$g = \frac{1}{n}\sum_{i=1}^{n} x_i = (\bar{x}_1, \dots, \bar{x}_j, \dots, \bar{x}_p).$$

For our example, g = (1906, 114, 183, 1111, 422, 169). The vector g represents, in a sense, the characteristics of an average car.

The dispersion of the cloud around its center of gravity is measured by the total inertia of the cloud N, defined by:

$$I(N, g) = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{p}(x_{ij} - \bar{x}_j)^2.$$

The total inertia can be computed directly: it is equal to the sum of the variances of the variables.

$$I(N, g) = \frac{1}{n}\sum_{i=1}^{n} d^2(x_i, g) = \frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{p}(x_{ij} - \bar{x}_j)^2 = \sum_{j=1}^{p}\frac{1}{n}\sum_{i=1}^{n}(x_{ij} - \bar{x}_j)^2 = \sum_{j=1}^{p} s_j^2$$

For the example we obtain I(N, g) = 267072 + 1441 + 609 + 50824 + 1638 + 56 = 321640. The inertia of the cloud is due mainly to Displacement. This is a consequence of the choice of measurement units: had Displacement been measured in liters, its exaggerated weight in the inertia would have disappeared.

In practice, it is often preferable to obtain a description of the data that does not depend on the choice of units. Homogeneous data can be obtained by transforming the initial data into standardized (centered and reduced) variables: each variable $X_j$ is associated with the standardized variable

$$X_j^* = \frac{X_j - \bar{x}_j}{s_j},$$

which has mean 0 and variance 1. The new table under study consists of the quantities

$$x_{ij}^* = \frac{x_{ij} - \bar{x}_j}{s_j}.$$
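As a minimal sketch of this transformation, the following code standardizes one column, dividing the variance by n as the text specifies. The function name `standardize` and the sample values are hypothetical; they are not taken from the car table.

```python
import math


def standardize(column):
    """Center a column and divide by its standard deviation (variance over n)."""
    n = len(column)
    mean = sum(column) / n
    variance = sum((x - mean) ** 2 for x in column) / n  # divide by n, not n - 1
    std = math.sqrt(variance)
    return [(x - mean) / std for x in column]


# Hypothetical values, used only to illustrate the transformation.
column = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standardize(column)
z_mean = sum(z) / len(z)
z_variance = sum(v * v for v in z) / len(z)
print(z, z_mean, z_variance)
```

The standardized column has mean 0 and variance 1 whatever the original unit, which is why the choice of units no longer influences the inertia.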

Individual i is now associated with the point $x_i^* = (x_{i1}^*, \dots, x_{ip}^*)$.

The new cloud of points is $N^* = \{x_1^*, \dots, x_n^*\}$.

The center of gravity of the cloud $N^*$ is 0, and its total inertia is equal to the number p of variables.

First principal axis and first principal component

We study the construction and the properties of the first principal component, then illustrate its interpretation with an example. The first step of this construction is to find the first principal axis of the cloud $N^*$.

First principal axis

We look for a line $\Delta_1$ that passes as closely as possible through the middle of the cloud $N^*$. The dispersion of the cloud $N^*$ around a line $\Delta$ is measured by the inertia $I(N^*, \Delta)$ of $N^*$ with respect to $\Delta$:

$$I(N^*, \Delta) = \frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, y_i),$$

where $y_i = P_\Delta(x_i^*)$ is the orthogonal projection of the point $x_i^*$ onto the line $\Delta$.

The line $\Delta_1$ that minimizes $I(N^*, \Delta)$ is called the first principal axis of the cloud $N^*$.

It can be shown that $\Delta_1$ passes through the origin O, the center of gravity of the cloud $N^*$ of standardized data, and is spanned by the unit vector $u_1$, the normalized eigenvector of the correlation matrix R of the variables $X_j$ associated with the largest eigenvalue $\lambda_1$.

The eigenvalues and eigenvectors of R are given in Table 4. The search for the first principal axis $\Delta_1$ is illustrated in Figure 5.

Table 4. Eigenvalues and eigenvectors of the correlation matrix

Figure 5. Search for the first principal axis

For the car example we obtained:

$$\lambda_1 = 4.6745, \qquad u_1 = (0.4434;\ 0.4182;\ 0.3497;\ 0.4252;\ 0.4246;\ 0.3811).$$

The result can be checked:

$$\|u_1\|^2 = \sum_{j=1}^{6} u_{1j}^2 = 0.4434^2 + \dots + 0.3811^2 = 1.$$
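The text states that the first axis is the unit eigenvector of the correlation matrix for its largest eigenvalue, without saying how that eigenvector is computed. One simple way to approximate it is power iteration, sketched below. This technique is an addition for illustration, not the source's procedure, and the 3 × 3 matrix is hypothetical (it is not the car data).

```python
import math


def power_iteration(matrix, iterations=200):
    """Approximate the dominant eigenvalue and unit eigenvector of a
    symmetric matrix by repeated multiplication and normalization."""
    size = len(matrix)
    u = [1.0] + [0.0] * (size - 1)  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(matrix[r][c] * u[c] for c in range(size)) for r in range(size)]
        norm = math.sqrt(sum(x * x for x in w))
        u = [x / norm for x in w]
    # For a unit vector u, the Rayleigh quotient u' R u gives the eigenvalue.
    ru = [sum(matrix[r][c] * u[c] for c in range(size)) for r in range(size)]
    eigenvalue = sum(u[r] * ru[r] for r in range(size))
    return eigenvalue, u


# Hypothetical 3 x 3 correlation matrix (not the car data).
R = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
lambda_1, u_1 = power_iteration(R)
print(round(lambda_1, 6), [round(x, 4) for x in u_1])
```

For this matrix the dominant eigenvalue is 2 and the eigenvector has three equal coordinates, so all variables contribute equally to the first axis.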

First principal component

The first principal component $Y_1$ is a new variable defined for each individual i as the algebraic length of the projection of the point $x_i^*$ onto the axis $\Delta_1$. The value $Y_1(i)$ is therefore the scalar product of the vectors $u_1$ and $x_i^*$:

$$Y_1(i) = \overline{Oy_i} = \sum_{j=1}^{p} u_{1j}\,\frac{x_{ij} - \bar{x}_j}{s_j}.$$

For instance, the value of the first principal component $Y_1$ for the Rover is:

$Y_1$(Rover) = 0.44*1.49 + 0.41*1.67 + 0.34*1.58 + 0.43*1.13 + 0.43*1.17 + 0.38*0.83 = 3.19

Overall, $Y_1$ is written:

$$Y_1 = 0.44\,\text{Displacement}^* + 0.41\,\text{Power}^* + 0.34\,\text{Speed}^* + 0.43\,\text{Weight}^* + 0.43\,\text{Length}^* + 0.38\,\text{Width}^*.$$

The values of $Y_1$ for each individual are given in Table 5.
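The Rover computation is a plain scalar product, as the short sketch below shows. It uses the two-decimal coefficients and standardized values from the worked example; with these rounded inputs the sum comes to about 3.18, slightly below the 3.19 quoted in the text, which presumably relies on unrounded coefficients. The function name `component_score` is an assumption.

```python
# Rounded coefficients of u_1 and the standardized values of the Rover,
# both taken from the worked example.
u1 = [0.44, 0.41, 0.34, 0.43, 0.43, 0.38]
rover_standardized = [1.49, 1.67, 1.58, 1.13, 1.17, 0.83]


def component_score(u, x_star):
    """Scalar product of an axis vector with a standardized individual."""
    return sum(a * b for a, b in zip(u, x_star))


y1_rover = component_score(u1, rover_standardized)
print(round(y1_rover, 2))  # 3.18
```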

Table 5. Squared distances to the origin, principal components, and squared cosines

The first principal component $Y_1$ is centered, being a linear combination of centered variables.

It can be shown that its variance is equal to $\lambda_1$:

$$\mathrm{Var}(Y_1) = \frac{1}{n}\sum_{i=1}^{n} Y_1(i)^2 = \frac{1}{n}\sum_{i=1}^{n} d^2(y_i, 0) = I(\{y_1, \dots, y_n\}, 0) = \lambda_1.$$

The variance of the first principal component $Y_1$ is equal to the inertia of the cloud of points projected onto $\Delta_1$, taken with respect to the center of gravity O.

The correlations between the variables $X_j$ and the principal component $Y_1$ can be computed with the formula:

$$\mathrm{cor}(X_j, Y_1) = \sqrt{\lambda_1}\,u_{1j}.$$

It follows that the similarity of $Y_1$ with the set of variables is:

$$\frac{1}{p}\sum_{j=1}^{p}\mathrm{cor}^2(X_j, Y_1) = \frac{\lambda_1}{p}.$$

For our example we obtain 4.656/6 = 0.776, to be compared with the 0.715 of Displacement in Table 3. The correlations between the $X_j$ and $Y_1$ appear in the first column of Table 6.

Table 6. Correlations between variables and principal components

Since the first principal component $Y_1$ is strongly and positively correlated with all the variables, it can be interpreted as a size factor, ranking the cars from the smallest ($Y_1$(Fiat Uno) = -3.76; $Y_1$(Ford Fiesta) = -3.50) to the largest ($Y_1$(Renault 25) = 3.44; $Y_1$(BMW 530i) = 3.95).

Overall quality of the first principal component

To measure the overall quality of the first principal component as a summary of the data, we use the decomposition of the total inertia.

Since the vector $y_i$ is the orthogonal projection of $x_i^*$ onto the line $\Delta_1$, we have

$$d^2(x_i^*, 0) = d^2(y_i, 0) + d^2(x_i^*, y_i),$$

hence:

$$\frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, 0) = \frac{1}{n}\sum_{i=1}^{n} d^2(y_i, 0) + \frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, y_i).$$

The total inertia

$$I(N^*, 0) = \frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, 0) = p$$

therefore splits into two parts:

- The first term, $\frac{1}{n}\sum_{i=1}^{n} d^2(y_i, 0) = I(\{y_1, \dots, y_n\}, 0)$, is the total inertia of the cloud $\{y_1, \dots, y_n\}$ of the projections of the points $x_i^*$ onto the axis $\Delta_1$. This quantity is the inertia explained by the axis $\Delta_1$ and is equal to $\lambda_1$.
- The second term, $\frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, y_i) = I(N^*, \Delta_1)$, is the residual inertia of the cloud around the axis $\Delta_1$.

For the car example we obtain:

- total inertia: p = 6
- inertia explained by $\Delta_1$: $\lambda_1$ = 4.656
- residual inertia: p - $\lambda_1$ = 1.344

The overall quality of the first principal component is measured by the share of inertia explained, $\lambda_1 / p$. This is the same quantity as the similarity of the principal component $Y_1$ with the set of variables.

In the example, the share of inertia explained by $\Delta_1$ is 4.656/6 = 0.776. In other words, 77.6% of the total inertia is explained by the elongation of the cloud along the first principal axis.

Quality of representation of the individuals on the first principal axis

The quality of representation of each individual on the axis $\Delta_1$ is measured by the squared cosine of the angle between the vector $x_i^*$ and the axis $\Delta_1$:

$$\cos^2(x_i^*, \Delta_1) = \frac{d^2(y_i, 0)}{d^2(x_i^*, 0)} = \frac{Y_1(i)^2}{d^2(x_i^*, 0)}.$$

For the Rover we have:

$$Y_1(\text{Rover}) = 3.19$$

$$d^2(\text{Rover}, 0) = 1.49^2 + 1.67^2 + 1.58^2 + 1.13^2 + 1.17^2 + 0.83^2 = 10.8$$

$$\cos^2(\text{Rover}, \Delta_1) = \frac{10.18}{10.80} = 0.94$$

The Rover is well represented on the principal axis $\Delta_1$. The squared distances of each individual to the origin and the squared cosines are given in Table 5.
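The squared-cosine computation for the Rover can be reproduced as follows. The sketch takes the standardized values and the score 3.19 from the worked example; the function name `squared_cosine` is an assumption.

```python
rover_standardized = [1.49, 1.67, 1.58, 1.13, 1.17, 0.83]
y1_rover = 3.19  # value of the first principal component quoted in the text


def squared_cosine(score, x_star):
    """Squared cosine of the angle between an individual and an axis."""
    squared_distance = sum(v * v for v in x_star)
    return score ** 2 / squared_distance


d2_rover = sum(v * v for v in rover_standardized)
cos2_rover = squared_cosine(y1_rover, rover_standardized)
print(round(d2_rover, 2), round(cos2_rover, 2))  # 10.84 0.94
```

A value close to 1 means that the individual lies almost on the axis, so its position along the axis can be interpreted safely.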

Second principal axis and second principal component

We now present the construction and the properties of the second principal component.

Second principal axis

We look for an axis $\Delta_2$ that is orthogonal to $\Delta_1$ and minimizes the inertia $I(N^*, \Delta)$. This second principal axis $\Delta_2$ passes through the origin O and is spanned by the vector $u_2$, the normalized eigenvector of the correlation matrix R associated with the second largest eigenvalue $\lambda_2$.

The eigenvalue $\lambda_2$ and the eigenvector $u_2$ for the car example are given in Table 4. The search for the second principal axis $\Delta_2$ is illustrated in Figure 6.

Figure 6. Search for the second principal axis

Let $z_i$ and $a_i$ denote the projections of the point $x_i^*$ onto the axis $\Delta_2$ and onto the plane $(\Delta_1, \Delta_2)$, respectively. The vectors $y_i$ and $z_i$ are also the projections of the point $a_i$ onto the axes $\Delta_1$ and $\Delta_2$.

From the decomposition

$$d^2(x_i^*, 0) = d^2(a_i, 0) + d^2(x_i^*, a_i) = d^2(y_i, 0) + d^2(z_i, 0) + d^2(x_i^*, a_i)$$

we deduce:

$$I(N^*, O) = I(\{y_1, \dots, y_n\}, 0) + I(\{z_1, \dots, z_n\}, 0) + I(N^*, (\Delta_1, \Delta_2)) \qquad (1)$$

where

$$I(N^*, (\Delta_1, \Delta_2)) = \frac{1}{n}\sum_{i=1}^{n} d^2(x_i^*, a_i)$$

is the inertia of the cloud $N^*$ with respect to the plane $(\Delta_1, \Delta_2)$. It can be shown that $I(N^*, (\Delta_1, \Delta_2))$ is smaller than the inertia with respect to any other plane.

The plane $(\Delta_1, \Delta_2)$ is called the first principal plane. It is the plane that passes as closely as possible through the middle of the cloud $N^*$ in the sense of the inertia criterion.

Second principal component

The second principal component $Y_2$ is a new variable defined for each individual i as the algebraic length of the segment $[0, z_i]$:

$$Y_2(i) = \sum_{j=1}^{p} u_{2j}\,\frac{x_{ij} - \bar{x}_j}{s_j}.$$

For the example, $Y_2$ is written overall as:

$$Y_2 = 0.03\,\text{Displacement}^* + 0.42\,\text{Power}^* + 0.66\,\text{Speed}^* - 0.26\,\text{Weight}^* - 0.30\,\text{Length}^* - 0.48\,\text{Width}^*.$$

Its values are given in Table 5. For the Rover, $Y_2$ takes the value 0.77.

The second principal component $Y_2$ is centered and has variance $\lambda_2$:

$$\mathrm{Var}(Y_2) = \frac{1}{n}\sum_{i=1}^{n} Y_2(i)^2 = \frac{1}{n}\sum_{i=1}^{n} d^2(z_i, 0) = I(\{z_1, \dots, z_n\}, 0) = \lambda_2 \qquad (2)$$

Moreover, the correlation between $Y_1$ and $Y_2$ is zero. The correlations between the variables $X_j$ and $Y_2$ are computed with the formula:

$$\mathrm{cor}(X_j, Y_2) = \sqrt{\lambda_2}\,u_{2j}.$$

The correlations between the variables and the principal component $Y_2$ for our example are given in Table 6. $Y_2$ is positively correlated with the "engine" variables (Displacement, Power, Speed) and negatively correlated with the "comfort" variables (Weight, Length, Width).

The second principal component $Y_2$ thus contrasts sporty cars, whose engine is very powerful relative to their comfort ($Y_2$(Peugeot 205 Rallye) = 1.48, $Y_2$(Audi 90 Quattro) = 1.36), with family cars, whose comfort is high relative to their engine ($Y_2$(VW Caravelle) = -2.38, $Y_2$(Nissan Vanette) = -1.82).

Overall quality of the second principal component and of the first two principal components

From equations (1) and (2) it follows that the share of inertia explained by the second principal axis is $\lambda_2 / p$, and that the share explained by the plane $(\Delta_1, \Delta_2)$ is $(\lambda_1 + \lambda_2)/p$.

The numerical values for the example are illegible in the source. Taking $\lambda_2 \approx 0.84$ (the square of the value $\sqrt{\lambda_2} = 0.9152$ used in the reconstitution example for the Rover), $\Delta_2$ explains about 0.84/6 ≈ 0.14 of the total inertia, and $(\Delta_1, \Delta_2)$ explains about (4.656 + 0.84)/6 ≈ 0.92 of the total inertia.

Quality of representation of the individuals on the second principal axis and on the first principal plane

The quality of representation of each point $x_i^*$ on the axis $\Delta_2$ and on the plane $(\Delta_1, \Delta_2)$ is measured by the squared cosines of the angles between the vector $x_i^*$ and, respectively, the axis $\Delta_2$ and the plane $(\Delta_1, \Delta_2)$.

On $\Delta_2$:

$$\cos^2(x_i^*, \Delta_2) = \frac{d^2(z_i, 0)}{d^2(x_i^*, 0)} = \frac{Y_2(i)^2}{d^2(x_i^*, 0)}$$

On $(\Delta_1, \Delta_2)$:

$$\cos^2(x_i^*, (\Delta_1, \Delta_2)) = \frac{d^2(a_i, 0)}{d^2(x_i^*, 0)} = \frac{d^2(y_i, 0) + d^2(z_i, 0)}{d^2(x_i^*, 0)} = \cos^2(x_i^*, \Delta_1) + \cos^2(x_i^*, \Delta_2)$$

For the Rover, we have:

$$\cos^2(\text{Rover}, \Delta_2) = 0.06$$

$$\cos^2(\text{Rover}, (\Delta_1, \Delta_2)) = 0.94 + 0.06 = 1.00$$

Up to rounding, the Rover can therefore be said to lie in the first principal plane.

General results

Extending the results of the previous sections, we obtain a set of p principal axes $\Delta_1, \dots, \Delta_p$ spanned by the orthonormal eigenvectors $u_1, \dots, u_p$ of the correlation matrix R, associated with its eigenvalues $\lambda_1, \dots, \lambda_p$ sorted in decreasing order. Figure 7 shows this new frame of reference.

Figure 7. The principal axes

Principal components

The principal components $Y_1, \dots, Y_p$ are defined by

$$Y_h(i) = \sum_{j=1}^{p} u_{hj}\,x_{ij}^*.$$

They are the coordinates of the points $x_i^*$ in the new frame. It can be shown that they are centered, have variance $\lambda_h$, and are mutually uncorrelated. The points $x_i^*$ can be expressed in this new frame:

$$x_i^* = \sum_{h=1}^{p} Y_h(i)\,u_h.$$

The following formulas are very important and follow directly from the construction of the principal components.

Data reconstitution formula:

$$x_{ij}^* = \sum_{h=1}^{p} Y_h(i)\,u_{hj} \qquad (3)$$

Reconstitution formula for the correlation matrix of the variables:

$$\mathrm{cor}(X_j, X_l) = \sum_{h=1}^{p} \lambda_h\,u_{hj}\,u_{hl} \qquad (4)$$

Decomposition of the squared distance of a point to the origin:

$$d^2(x_i^*, 0) = \|x_i^*\|^2 = \sum_{h=1}^{p} Y_h(i)^2,$$

from which it follows that:

(i) $\sum_{h=1}^{p} \cos^2(x_i^*, \Delta_h) = 1$

(ii) $\sum_{h=1}^{p} \lambda_h = p$

Correlations between the variables $X_j$ and the principal components $Y_h$:

$$\mathrm{cor}(X_j, Y_h) = \sqrt{\lambda_h}\,u_{hj} \qquad (5)$$

It follows that the similarity of the principal component $Y_h$ with the variables $X_1, \dots, X_p$ is

$$\frac{1}{p}\sum_{j=1}^{p}\mathrm{cor}^2(X_j, Y_h) = \frac{\lambda_h}{p},$$

that is, the share of inertia explained by the principal axis $\Delta_h$.

Mahalanobis distance

The distance between an individual and the center of gravity of the cloud is often measured with the Mahalanobis distance. It is defined as follows. First, principal components $Z_h$ are built, preferably on the original data rather than on the standardized data. To do so, the eigenvectors $v_h$ of the covariance matrix of the variables $X_j$ are used, and the variables $Z_h$ are computed with the formula:

$$Z_h(i) = \sum_{j=1}^{p} v_{hj}\,(x_{ij} - \bar{x}_j).$$

The Mahalanobis distance $d_M(x_i, \bar{x})$ between the point $x_i$ and the center of gravity $\bar{x}$ of the cloud of original data is defined by:

$$d_M^2(x_i, \bar{x}) = \sum_{h=1}^{p} Z_h^*(i)^2,$$

where $Z_h^*$ is the variable $Z_h$ scaled to unit variance.
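The definition above can be illustrated on a small two-variable case, where the eigenvalues and eigenvectors of the covariance matrix have a closed form. This is a sketch under stated assumptions: the function name `mahalanobis_squared_2d` and the data points are hypothetical, the covariance is computed with division by n as elsewhere in the text, and the covariance between the two variables is assumed to be non-zero.

```python
import math


def mahalanobis_squared_2d(points):
    """Squared Mahalanobis distance of each 2-D point to the centroid,
    computed as the sum of squared unit-variance principal components."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Covariance matrix [[a, b], [b, d]] with division by n.
    a = sum((p[0] - mx) ** 2 for p in points) / n
    d = sum((p[1] - my) ** 2 for p in points) / n
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Closed-form eigenvalues of a symmetric 2 x 2 matrix.
    half_gap = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    eigenvalues = [(a + d) / 2 + half_gap, (a + d) / 2 - half_gap]
    result = []
    for x, y in points:
        total = 0.0
        for lam in eigenvalues:
            # (b, lam - a) is an eigenvector when b != 0 (assumed here).
            norm = math.hypot(b, lam - a)
            v = (b / norm, (lam - a) / norm)
            z = v[0] * (x - mx) + v[1] * (y - my)  # component Z_h(i)
            total += z * z / lam                   # Z_h*(i)^2 = Z_h(i)^2 / variance
        result.append(total)
    return result


# Hypothetical data with correlated coordinates.
points = [(0, 0), (1, 1), (2, 2), (1, 0), (1, 2)]
distances = mahalanobis_squared_2d(points)
print([round(v, 6) for v in distances])
```

With variances divided by n, the squared distances average to the number of variables (here 2), which gives a quick sanity check on the result.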

Graphical representations

These are graphical representations of the individuals and of the variables.

Map of the individuals

The projections of the points $x_i^*$ onto the first principal plane $(\Delta_1, \Delta_2)$ have as coordinates on the principal axes $\Delta_1$, $\Delta_2$ the values $Y_1(i)$ and $Y_2(i)$.

Plotting the points $A_i = (Y_1(i), Y_2(i))$ therefore gives the best summary of the data in a plane. This map of the individuals is shown in Figure 8.

The interpretation of the axes given earlier is confirmed: the cars are spread along the first axis according to their size, from the smallest (Fiat Uno, Ford Fiesta) to the largest (Renault 25, BMW 530i), and along the second axis according to their character, from family cars (Nissan Vanette, VW Caravelle) to sporty ones (Citroën AX Sport, Peugeot 205 Rallye).

Figure 8. First principal plane and correlation circle

Map of the variables

The variables are represented in a plane by the points $B_j = (\mathrm{cor}(X_j, Y_1), \mathrm{cor}(X_j, Y_2))$. The resulting plot, shown in Figure 8, is called the "correlation circle".

It shows clearly that the first principal component, positively correlated with all the variables, is a size factor, and that the second principal component, which contrasts (Speed, Power) with (Width, Length, Weight), ranks the models according to their sporty or family character.

The length $R_j$ of the variable vector $B_j$ is the multiple correlation $R(X_j; Y_1, Y_2)$ between the variable $X_j$ and the two principal components. Indeed,

$$\|B_j\|^2 = \mathrm{cor}^2(X_j, Y_1) + \mathrm{cor}^2(X_j, Y_2) = R^2(X_j; Y_1, Y_2),$$

because the variables $Y_1$ and $Y_2$ are uncorrelated. For the example we obtain:

Variable        $R_j$
Displacement    0.96
Power           0.98
Speed           0.97
Weight          0.92
Length          0.97
Width           0.93

All the variables are well represented on the correlation circle.

Since the share of inertia explained by the first principal plane is very large, the correlations between variables are well reconstituted using only the first two terms of formula (4):

$$\mathrm{cor}(X_j, X_l) \approx \sum_{h=1}^{2} \lambda_h\,u_{hj}\,u_{hl},$$

which, taking (5) into account, becomes

$$\mathrm{cor}(X_j, X_l) \approx \sum_{h=1}^{2} \mathrm{cor}(X_j, Y_h)\,\mathrm{cor}(X_l, Y_h).$$

Thus the correlation between the variables $X_j$ and $X_l$ can be approximated by the scalar product $\langle B_j, B_l \rangle$ of the vectors $B_j$ and $B_l$.

Example: cor(Displacement, Power) = 0.861 is well approximated by

cor(Displacement, $Y_1$) × cor(Power, $Y_1$) + cor(Displacement, $Y_2$) × cor(Power, $Y_2$) = 0.96 × 0.89 + 0.03 × 0.40 = 0.8664

Since the cosine of the angle between the vectors $B_j$ and $B_l$ is given by

$$\cos(B_j, B_l) = \frac{\langle B_j, B_l \rangle}{\|B_j\|\,\|B_l\|},$$

the correlation between $X_j$ and $X_l$ can be written approximately as:

$$\mathrm{cor}(X_j, X_l) \approx \|B_j\|\,\|B_l\|\,\cos(B_j, B_l).$$

Thus the correlations between the variables $X_j$ are approximately reconstituted on the correlation circle from the lengths of the variable vectors and the cosines of the angles between them.

One can check, for example, that the dendrogram in Figure 4 reflects well the relative positions of the variable vectors in the correlation circle.
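The two-term approximation of a correlation and the length of a variable vector are both simple scalar products. The sketch below uses the correlations with $Y_1$ and $Y_2$ quoted in the worked example for Displacement and Power; the names `B` and `dot` are assumptions made for the illustration.

```python
import math

# Correlations with (Y1, Y2) quoted in the worked example.
B = {
    "Displacement": (0.96, 0.03),
    "Power": (0.89, 0.40),
}


def dot(b_j, b_l):
    """Scalar product <B_j, B_l>, the two-term approximation of cor(X_j, X_l)."""
    return b_j[0] * b_l[0] + b_j[1] * b_l[1]


approx_cor = dot(B["Displacement"], B["Power"])
lengths = {name: math.sqrt(dot(b, b)) for name, b in B.items()}
print(round(approx_cor, 4))                          # 0.8664
print({k: round(v, 2) for k, v in lengths.items()})  # Displacement 0.96, Power 0.98
```

The computed lengths agree with the $R_j$ values listed for Displacement and Power.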

The biplot

With some care regarding the scale of the representation, the two plots (first principal plane and correlation circle) can be superimposed, giving an enriched representation.

This simultaneous representation of the individuals and the variables is called a biplot, a term introduced by Gabriel (1971).

We first assume that most of the total inertia is explained by the first principal plane. If this is not the case, the results that follow must be restricted to points that are well represented on the first principal plane and to variables that are very strongly correlated with the first two principal components.

Under these assumptions, the data reconstitution formula

$$x_{ij}^* = \sum_{h=1}^{p} Y_h(i)\,u_{hj}$$

gives a good approximation of the values $x_{ij}^*$ using only the first two dimensions:

$$x_{ij}^* \approx \sum_{h=1}^{2} Y_h(i)\,u_{hj}.$$

Writing $Y_h^* = Y_h / \sqrt{\lambda_h}$ for the principal component $Y_h$ scaled to unit variance, and using the fact that $\mathrm{cor}(X_j, Y_h) = \sqrt{\lambda_h}\,u_{hj}$, this formula becomes:

$$x_{ij}^* \approx \sum_{h=1}^{2} Y_h^*(i)\,\mathrm{cor}(X_j, Y_h) \qquad (6)$$

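Example

We have

$$x^*(\text{Rover}, \text{Displacement}) = \frac{2675 - 1906.1}{516.79} = 1.49,$$

which is well reconstituted by

$$Y_1^*(\text{Rover})\,\mathrm{cor}(\text{Displacement}, Y_1) + Y_2^*(\text{Rover})\,\mathrm{cor}(\text{Displacement}, Y_2) = 3.19 \times \frac{1}{\sqrt{4.656}} \times 0.96 + 0.77 \times \frac{1}{0.9152} \times 0.03 = 1.44.$$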
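The Rover reconstitution can be checked numerically. The sketch below plugs the figures of the worked example into formula (6); all variable names are assumptions made for the illustration.

```python
import math

x_star = (2675 - 1906.1) / 516.79  # standardized displacement of the Rover
y1, y2 = 3.19, 0.77                # principal components of the Rover
sqrt_lambda_1 = math.sqrt(4.656)
sqrt_lambda_2 = 0.9152
cor_y1, cor_y2 = 0.96, 0.03        # correlations of Displacement with Y1, Y2

# Formula (6): x*_ij ~ sum over h of Y*_h(i) * cor(X_j, Y_h)
approximation = (y1 / sqrt_lambda_1) * cor_y1 + (y2 / sqrt_lambda_2) * cor_y2
print(round(x_star, 2), round(approximation, 2))  # 1.49 1.44
```

The gap between 1.49 and 1.44 is the part of the value carried by the four axes that the first principal plane leaves out.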

Formula (6) expresses the fact that $x_{ij}^*$ is approximately reconstituted by the scalar product of the vectors $A_i^* = (Y_1^*(i), Y_2^*(i))$ and $B_j = (\mathrm{cor}(X_j, Y_1), \mathrm{cor}(X_j, Y_2))$.

Let $P_{ij}$ denote the projection of the vector $A_i^*$ onto the axis $\Delta(B_j)$ spanned by the vector $B_j$. This notation is illustrated in Figure 9. The algebraic length $\overline{OP_{ij}}$ is equal to:

$$\overline{OP_{ij}} = \frac{1}{\sqrt{\mathrm{cor}^2(X_j, Y_1) + \mathrm{cor}^2(X_j, Y_2)}}\sum_{h=1}^{2} Y_h^*(i)\,\mathrm{cor}(X_j, Y_h).$$

Figure 9. Individual points and variable axes

Since the denominator is equal to the multiple correlation $R_j$ between $X_j$ and the first two principal components, we have:

$$x_{ij}^* \approx R_j\,\overline{OP_{ij}}.$$

The projections of the individual points $A_i^*$ onto the variable axes $\Delta(B_j)$ therefore have algebraic lengths proportional to the data $x_{ij}^*$. The distribution of the projections $P_{ij}$ on the axis $\Delta(B_j)$ thus reflects well the distribution of the values $x_{ij}^*$ of the variable $X_j^*$ and, consequently, that of the values $x_{ij}$ of the original variable $X_j$.

In Figure 10 we built the biplot (simultaneous representation of the individuals and the variables) as follows:

- the individuals are represented by the points $A_i^* = (Y_1^*(i), Y_2^*(i))$;
- the variables are represented by the axes $\Delta(B_j)$, placed on the plot using the points $(3\,\mathrm{cor}(X_j, Y_1), 3\,\mathrm{cor}(X_j, Y_2))$. The factor 3 was chosen to make the variable points more visible.

Figure 10. Biplot: simultaneous representation of the individuals and the variables

One can check that the projection of the cars onto the Speed axis reproduces well the distribution of the original data: the projections of the fastest cars (BMW 530i, Renault 25, Audi 90 Quattro) are clearly opposed to those of the slowest (Ford Fiesta, Nissan Vanette, Fiat Uno, VW Caravelle).

Likewise, the projections of the cars onto the Width axis clearly contrast the widest car (VW Caravelle) with the narrowest (Fiat Uno).

Presentation of Principal Component Analysis (PCA) following Hotelling's approach

The construction of the principal components presented so far is laborious, but it leads to a very complete set of results. Hotelling (1933) proposed criteria that give the principal components more directly, at the cost of losing the geometric dimension of the problem.

We present the correlation criterion, then the variance criterion.

Correlation criterion

We look for m standardized and mutually uncorrelated variables $F_1, \dots, F_m$ that maximize the criterion:

$$\sum_{h=1}^{m}\left[\frac{1}{p}\sum_{j=1}^{p}\mathrm{cor}^2(X_j, F_h)\right] \qquad (7)$$

In other words, we try to summarize the original variables $X_1, \dots, X_p$ by a smaller number of variables $F_1, \dots, F_m$ that are uncorrelated with one another and represent the main dimensions of the phenomenon under study.

It can be shown that the maximum of (7) is attained for the variables

$$F_h = Y_h^* = \frac{Y_h}{\sqrt{\lambda_h}},$$

which are precisely the principal components scaled to unit variance. The value of the maximum is $(\lambda_1 + \dots + \lambda_m)/p$.

Variance criterion

We look for m variables $Z_1, \dots, Z_m$ of the form

$$Z_h = \sum_{j=1}^{p} v_{hj}\,X_j$$

with orthonormal vectors $v_h = (v_{h1}, \dots, v_{hp})$ that maximize the criterion:

$$\sum_{h=1}^{m}\mathrm{Var}(Z_h) \qquad (8)$$

It can be shown that the maximum of (8) is attained for the normalized eigenvectors $v_1, \dots, v_m$ of the covariance matrix of the variables $X_j$ associated with its m largest eigenvalues $\nu_1, \dots, \nu_m$, and that its value is $\nu_1 + \dots + \nu_m$. Taking m = p gives:

$$\nu_1 + \dots + \nu_p = \sum_{j=1}^{p}\mathrm{Var}(X_j).$$

The sum of the first m eigenvalues is the variance explained by the m variables $Z_1, \dots, Z_m$.

When working with the standardized variables $X_1^*, \dots, X_p^*$, we have $Z_h = Y_h$ and obtain:

$$\sum_{h=1}^{m}\mathrm{Var}(Z_h) = \lambda_1 + \dots + \lambda_m.$$

Metoda de clDVLILFDUH� DVFHQGHQW � LHUDUKLF � FX� DMXWRUXO� FULWHULXOXL� OXL�

Ward $FHDVW � PHWRG � FRQGXFH� OD� XQ� DOW� SURFHGHX� GH� D� UH]XPD� GDWHOH��

FRQVWUXLUHD�XQXL�WLSRORJLL��VDX�SDUWL LL��D�LQGLYL]LORU�vQ�FODVH�DVWIHO�FD�LQGLYL]LL�FDUH� DSDU LQ� DFHOHLDúL� FODVH� V � ILH� DVHP Q WRri (similari) în timp ce indivizii FDUH�DSDU LQ�OD�FODVH�GLIHULWH�V �ILH��GHRVHEL L��GHS UWD L��GLVLPLODUL��

Quality of a typology

Consider a typology of our set of individuals into $k$ classes containing $n_1, \dots, n_k$ individuals respectively. Let $G_1, \dots, G_k$ denote the corresponding partition of the associated point cloud $N = \{x_1, \dots, x_n\}$, and let $g_1, \dots, g_k$ be the centers of gravity of these classes. The total inertia of the cloud $N$ about its center of gravity $g$ decomposes as follows:

$$I(N, g) = \sum_{i=1}^{k} \frac{n_i}{n}\, d^2(g_i, g) + \sum_{i=1}^{k} \frac{n_i}{n}\, I(G_i, g_i).$$

The first term on the right-hand side is called the between-class inertia; it measures how far the classes lie from one another.

This term is denoted $I(G_1, \dots, G_k)$ and represents the inertia explained by the typology.

The second term on the right-hand side is called the within-class inertia; it measures the homogeneity of the classes.

The quality of the typology is measured by the ratio of the between-class inertia to the total inertia.
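The decomposition of the total inertia into a between-class and a within-class part can be verified directly. The following is a minimal sketch, assuming equal weights $1/n$ for all individuals and the squared Euclidean distance; the six points and their split into two classes are hypothetical.

```python
def centroid(points):
    # Center of gravity of a list of equally weighted points.
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))

def sq_dist(u, w):
    # Squared Euclidean distance between two points.
    return sum((a - b) ** 2 for a, b in zip(u, w))

def inertia(points, center):
    # Mean squared distance of the points to the given center.
    return sum(sq_dist(p, center) for p in points) / len(points)

# Hypothetical partition of 6 points into k = 2 classes.
classes = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    [(4.0, 4.0), (5.0, 4.0), (4.0, 6.0)],
]
cloud = [p for cls in classes for p in cls]
n = len(cloud)
g = centroid(cloud)

total = inertia(cloud, g)
between = sum(len(c) / n * sq_dist(centroid(c), g) for c in classes)
within = sum(len(c) / n * inertia(c, centroid(c)) for c in classes)

# Quality of the typology: share of the inertia explained by the classes.
quality = between / total
print(total, between, within, quality)
```

A quality close to 1 means that most of the inertia is carried by the differences between class centers, so the classes are compact and well separated.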

Ward's criterion

When two classes $G_i$ and $G_j$ of the typology $G_1, \dots, G_k$ are replaced by their union $G_i \cup G_j$, the between-class inertia decreases. This decrease,

$$D(G_i, G_j) = I(G_1, \dots, G_i, \dots, G_j, \dots, G_k) - I(G_1, \dots, G_i \cup G_j, \dots, G_k),$$

can be computed explicitly and equals

$$D(G_i, G_j) = \frac{n_i\, n_j}{n\,(n_i + n_j)}\, d^2(g_i, g_j).$$

This criterion, used to measure the distance between two classes $G_i$ and $G_j$, is called Ward's aggregation criterion.

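Example:

Take $G_1 = \{x^*_{\text{Citroen BX}}\}$ and $G_2 = \{x^*_{\text{Peugeot 405}}\}$. We have

$$d^2\!\left(x^*_{\text{Citroen BX}}, x^*_{\text{Peugeot 405}}\right) = \frac{(1769-1769)^2}{267072} + \frac{(90-90)^2}{1442} + \frac{(182-180)^2}{609} + \frac{(1060-1080)^2}{50824} + \frac{(424-440)^2}{1638} + \frac{(168-169)^2}{56} = 0.189$$

$$D\!\left(x^*_{\text{Citroen BX}}, x^*_{\text{Peugeot 405}}\right) = \frac{1 \times 1}{24 \times (1+1)} \times 0.189 = 0.00393$$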
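The arithmetic of the Citroen BX / Peugeot 405 example can be reproduced in a few lines. In this sketch the numbers are those printed in the example; reading the denominators as the variances of the six variables, and the variable names in the comments, are assumptions based on the column order of Table 1.

```python
# (Citroen BX value, Peugeot 405 value, denominator as printed in the example).
# The denominators are presumably the variances of the variables (assumption).
rows = [
    (1769, 1769, 267072),  # engine displacement
    (90, 90, 1442),        # power
    (182, 180, 609),       # speed
    (1060, 1080, 50824),   # weight
    (424, 440, 1638),      # length
    (168, 169, 56),        # width
]

# Squared distance between the two standardized observations.
d2 = sum((a - b) ** 2 / var for a, b, var in rows)

def ward(n_i, n_j, n, d2_centers):
    # Ward's aggregation criterion for two classes of sizes n_i and n_j
    # in a cloud of n equally weighted individuals.
    return n_i * n_j / (n * (n_i + n_j)) * d2_centers

# Two singleton classes among the 24 car models.
D = ward(1, 1, 24, d2)
print(round(d2, 3), round(D, 5))  # 0.189 0.00393
```

Almost all of the distance comes from the difference in length, which is why the two models are merged so early.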

Hierarchical ascending classification

The hierarchical ascending classification algorithm is iterative. At the start of a step we have a partition of the set of individuals into $k$ classes $G_1, \dots, G_k$, and we merge the two classes $G_i$ and $G_j$ that minimize Ward's criterion $D(G_i, G_j)$. During that iteration the between-class inertia therefore decreases by exactly $D(G_i, G_j)$.

At the initial step each individual forms its own class, so the total inertia equals the between-class inertia. At the final step only one class remains, so the between-class inertia is zero. The sum of the between-class inertia losses over all steps is therefore equal to the total inertia. At each step an index is computed by dividing the loss of between-class inertia by the total inertia.

The typology retained is the one obtained at the step where this index increases sharply, that is, the partition in place just before a merge that would cost a large share of the inertia.
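The iterative procedure can be sketched as follows. This is a naive illustration, not the implementation used by the programs cited in this chapter: it assumes equal weights, the squared Euclidean distance, and a numbering in which the $n$ individuals are labeled 1 to $n$ and each new class receives the next integer. The function name `ward_hac` and the five test points are hypothetical.

```python
def centroid(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))

def sq_dist(u, w):
    return sum((a - b) ** 2 for a, b in zip(u, w))

def ward_hac(points):
    """Naive hierarchical ascending classification with Ward's criterion.

    Returns the total inertia and one tuple per merge:
    (new class label, first merged label, second merged label,
     Ward criterion, index = criterion / total inertia).
    """
    n = len(points)
    classes = {i + 1: [p] for i, p in enumerate(points)}
    g = centroid(points)
    total = sum(sq_dist(p, g) for p in points) / n
    next_label = n + 1
    history = []
    while len(classes) > 1:
        best = None
        labels = sorted(classes)
        # Examine every pair of current classes and keep the cheapest merge.
        for pos, a in enumerate(labels):
            for b in labels[pos + 1:]:
                na, nb = len(classes[a]), len(classes[b])
                d2 = sq_dist(centroid(classes[a]), centroid(classes[b]))
                cost = na * nb / (n * (na + nb)) * d2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        classes[next_label] = classes.pop(a) + classes.pop(b)
        history.append((next_label, a, b, cost, cost / total))
        next_label += 1
    return total, history

# Hypothetical points: two tight pairs and one isolated individual.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
total, history = ward_hac(points)
for step in history:
    print(step)
```

The Ward criteria of the successive merges add up to the total inertia, and a sharp rise in the last column signals the step at which the tree should be cut.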

Application

We carried out a hierarchical ascending classification of the centered and standardized data of the example using Ward's criterion. Table 7 shows how the algorithm unfolds, and the results are displayed as a dendrogram (classification tree) in Figure 11.

In the first step the models Citroen BX and Peugeot 405 (observations 4 and 6) are merged; their Ward distance is $D(x_4^*, x_6^*) = 0.00393$.

The aggregation index is this distance divided by the total inertia, which is 6 for the six standardized variables: $0.00393/6 \approx 0.07\%$. It appears in the dendrogram at the level of Citroen BX, that is, of the element that precedes the other. The class is numbered 25.

In the second step class 26 is obtained by merging Ford Sierra and Peugeot 405 Break (observations 11 and 12), whose aggregation index $D(x_{11}^*, x_{12}^*)/6$ equals 0.15%.

In the third step class 27 is built by merging Renault 19 (observation 2) with class 25 = (4, 6); its aggregation index $D(x_2^*, (x_4^*, x_6^*))/6$ equals 0.19%.

The algorithm proceeds in the same way up to the last step, where class 43 of small cars (Civic, Seat, Fiesta) is merged with class 46, formed by the rest of the sample.

Cumulating Ward's criterion backwards from the last iteration gives the inertia explained by the successive typologies. Indeed, at the last iteration we have

$$I(43, 46) = D(43, 46) = 3.07202,$$

since the inertia explained by class 47, which contains all the observations, is zero. Consequently the inertia explained by the two-class typology (43, 46) equals 3.07202, and the explained share of inertia is $3.07202/6 \approx 51.2\%$.

Class 46 was formed by merging class 44 (Audi, BMW, ...) and class 45 (Tipo, R19, ..., VW Caravelle).

From $D(44, 45) = I(43, 44, 45) - I(43, 46)$ we deduce

$$I(43, 44, 45) = D(43, 46) + D(44, 45) = 3.07202 + 1.42919 = 4.50121,$$

and the share of inertia explained by this three-class typology is $4.50121/6 \approx 75.0\%$.

We continue backwards toward the beginning of the algorithm. Class 45 results from merging classes 38 and 42, one containing {Espace, Omega, ..., VW Caravelle} and the other {Tipo, R19, ..., R21}. From $D(38, 42) = I(43, 44, 38, 42) - I(43, 44, 45)$ we deduce

$$I(43, 44, 38, 42) = D(43, 46) + D(44, 45) + D(38, 42) = 3.07202 + 1.42919 + 0.29270 = 4.79391.$$

Since the explained inertia increases only slightly when moving from the three-class typology (43, 44, 45) to the four-class typology (43, 44, 38, 42), we adopt the three-class typology.
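The backward accumulation of Ward's criterion is easy to tabulate. The sketch below uses the three criterion values quoted in the text; taking the total inertia to be 6 is an assumption based on the division by 6 in the aggregation indices (six standardized variables).

```python
# Assumption: total inertia of the six standardized variables.
total_inertia = 6.0

# Ward criterion of the last merges, read backwards from the final step,
# paired with the typology that exists just before each merge.
ward_values = [
    ("2 classes (43, 46)", 3.07202),
    ("3 classes (43, 44, 45)", 1.42919),
    ("4 classes (43, 44, 38, 42)", 0.29270),
]

explained = 0.0
shares = []
for name, criterion in ward_values:
    # Undoing one more merge adds its criterion to the explained inertia.
    explained += criterion
    percent = round(100 * explained / total_inertia, 1)
    shares.append((name, round(explained, 5), percent))

for row in shares:
    print(row)
```

Going from two to three classes raises the explained share by about 24 points, while going from three to four adds only about 5 points, which supports stopping at three classes.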

Table 7
Hierarchical ascending classification: description of the classes formed

Columns: Class, Preceding element, Following element, Number of elements contained, Ward criterion, Index (%).


Table 8
Class means of the variables and Fisher test

Figure 11
Dendrogram


Figure 12
Display of the three-class typology

To interpret this typology more precisely, we plotted it on the principal plane in Figure 12 and built Table 8, in which the variables are listed in decreasing order of the Fisher test statistic between each variable and the typology.

The class of small cars corresponds to class 43: Honda Civic, Seat Ibiza Sxi, Citroen AX Sport, Peugeot 205 Rallye, Peugeot 205, Fiat Uno, Ford Fiesta.

The class of medium cars corresponds to class 45: Fiat Tipo, Renault 19, Citroen BX, Peugeot 405, Renault 21, Espace, Opel Omega, Ford Sierra, Peugeot 405 Break, Nissan Vanette, VW Caravelle.

The class of large cars corresponds to class 44: Audi 90 Quatro, BMW 325ix, Ford Scorpio, Renault 25, BMW 530i, Rover 827i.


Conclusion

This chapter has presented all the elements needed to interpret the output of a principal component analysis program. The example was processed with the Statgraphics and SPAD.N programs.

For readers who wish to learn more about Principal Component Analysis, at both the theoretical and the practical level, we recommend the following works: Bouroche and Saporta; Jackson; Jolliffe; Lebart, Morineau and Fénelon; Lebart, Morineau and Tabard (1977); Saporta; Saporta and Ștefănescu.

Regarding methods for building typologies, we particularly recommend Everitt and the ACECLUS, CLUSTER and FASTCLUS procedures of the SAS software.

BIBLIOGRAPHY

Michel Tenenhaus, Méthodes Statistiques en Gestion. Dunod, Paris, 1994.

Gilbert Saporta, Viorica Ștefănescu, Analiza datelor și Informatică. Editura Economică, 1996.