Practical 2 (continued)
The SOM as a classifier
Kohonen self-organising maps can also be used as classifiers through supervised learning: part of the data is used to train the network, adjusting the weights of each neuron so that the output error is minimised, while the remaining data are held back to check whether the training has been effective.
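As a quick overview, the whole pipeline developed step by step below can be summarised in a few lines (object names such as idx, fit and pred are illustrative; the data, functions and grid are the same ones used later in this practical):

library("kohonen")
data(wines)                               # 13 chemical measurements plus the 'vintages' factor

idx = sample(nrow(wines), 125)            # ~70% of the rows for training
Xtr = scale(wines[idx, ])
Xte = scale(wines[-idx, ],                # scale the test set with the training parameters
            center = attr(Xtr, "scaled:center"),
            scale = attr(Xtr, "scaled:scale"))

fit = supersom(list(measurements = Xtr, vintages = vintages[idx]),
               grid = somgrid(5, 5, "hexagonal"))
pred = predict(fit, newdata = list(measurements = Xte, vintages = vintages[-idx]))
table(vintages[-idx], pred$predictions[["vintages"]])    # confusion matrix on the test set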
Example 1. Selecting a training set
Randomly choose a training set containing 70% of the wines data (keeping the first column, which contains the information on the winery of origin). From this selection we build the scaled training set and the scaled test set. For the comparison to be meaningful, the test set must be scaled with respect to the training set. From these we then create two lists that specify which set will be checked against which attribute; in our case, the wineries (Cultivars).
The wineries are:
Cultivars (grapes) = (Nebbiolo, Barbera, Grignolino)
library("kohonen")
data(wines)
Swine<-scale(wines)
train.wine = sample(nrow(Swine), 125)
train.wine
 61  54  91  71  38 119  70 128  33 170  60  56 173 112 105  30  81 103  58 154
 53 177  15 153  24 145  10 166 152 167 169  75  69 101 146 159  45 114  64  57
 31 106  55  68  19  21  72  39 168  95 108 123  99  43 157  97 160  87  22  28
109 118  32   4  92 136  67   1  20 155  89  37 135 172  12  93 139 176 142 175
 34  73 137 125  25  79  11   2  84 115 174 111  36 127  85 156   5 150  23 120
 63  46 102  90 116 148   3  17   6 140 129 158  49 130   8 138  82 132 121 144
124 163 151  14 133
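The indices above come from sample(), so they (and everything that depends on them) change on every run. If a reproducible split is desired, fix the random seed first; the value 42 is an arbitrary choice and not part of the original practical:

set.seed(42)                              # arbitrary seed, only for reproducibility
train.wine = sample(nrow(Swine), 125)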
Xtrain.wine = scale(wines[train.wine, ])
Xtest.wine = scale(wines[-train.wine, ],
                   center = attr(Xtrain.wine, "scaled:center"),
                   scale = attr(Xtrain.wine, "scaled:scale"))
head(Xtest.wine)
alcohol | malic acid | ash | ash alkalinity | magnesium | tot. phenols | flavonoids | non-flav. phenols | proanth | col. int. | col. hue | OD ratio | proline |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1.3296123 | -0.2385973 | 0.78284260 | -0.67864943 | 1.4081430 | 0.5449968 | 0.5422879 | -0.4872614 | -0.5992395 | -0.02575895 | 0.4626919 | 1.35932800 | 1.7616771 |
1.0806026 | -0.9436719 | -0.51832324 | -1.15556399 | -0.1513374 | 1.1689387 | 1.1914430 | -1.1982460 | 0.4961288 | 0.85534002 | 0.2563394 | 1.31847626 | 0.9668342 |
2.1886959 | -0.6087614 | -0.05908824 | -2.52669337 | -0.6259619 | 1.3659730 | 1.7391676 | 0.4607180 | 2.2487182 | 0.11635379 | 1.2468316 | 0.20186206 | 1.3006682 |
1.6284240 | -0.4413062 | 1.20380802 | 0.03672242 | 1.3403395 | 0.8733873 | 1.1812999 | -0.3292649 | 0.7152025 | 0.44118290 | 0.5039624 | 0.09292409 | 1.7139865 |
1.4914687 | -0.7321495 | 0.28533801 | -1.00652819 | 0.5266975 | 1.6943634 | 1.9826007 | -0.4082632 | 0.5143850 | 1.45627388 | 1.1642906 | 0.32441728 | 2.9857352 |
0.4954298 | -0.5735077 | 0.82111218 | -1.12575683 | -0.4903549 | 0.9554849 | 0.9784390 | -0.2502666 | -0.2341167 | -0.12726805 | -0.1150952 | 0.86910713 | 1.4437399 |
trainingdata = list(measurements = Xtrain.wine,
                    vintages = vintages[train.wine])
trainingdata$vintages
Grignolino Barolo Grignolino Grignolino Barolo Grignolino Grignolino Grignolino Barolo Barbera
Grignolino Barolo Barbera Grignolino Grignolino Barolo Grignolino Grignolino Barolo Barbera
Barolo Barbera Barolo Barbera Barolo Barbera Barolo Barbera Barbera Barbera
Barbera Grignolino Grignolino Grignolino Barbera Barbera Barolo Grignolino Grignolino Barolo
Barolo Grignolino Barolo Grignolino Barolo Barolo Grignolino Barolo Barbera Grignolino
Grignolino Grignolino Grignolino Barolo Barbera Grignolino Barbera Grignolino Barolo Barolo
Grignolino Grignolino Barolo Barolo Grignolino Barbera Grignolino Barolo Barolo Barbera
Grignolino Barolo Barbera Barbera Barolo Grignolino Barbera Barbera Barbera Barbera
Barolo Grignolino Barbera Grignolino Barolo Grignolino Barolo Barolo Grignolino Grignolino
Barbera Grignolino Barolo Grignolino Grignolino Barbera Barolo Barbera Barolo Grignolino
Grignolino Barolo Grignolino Grignolino Grignolino Barbera Barolo Barolo Barolo Barbera
Grignolino Barbera Barolo Barbera Barolo Barbera Grignolino Barbera Grignolino Barbera
Grignolino Barbera Barbera Barolo Barbera
Levels: Barbera Barolo Grignolino
testdata = list(measurements = Xtest.wine, vintages = vintages[-train.wine])
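Before training, it can be worth checking that the three cultivars are reasonably represented in both sets. This quick check is an addition to the original script and only uses base R's table():

table(trainingdata$vintages)    # cultivar counts in the training set
table(testdata$vintages)        # cultivar counts in the test set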
Exercise 1. Classification using the training set and error computation
Use a SOM with the same size and grid as in the previous section, together with the supersom command of the kohonen package, to obtain the classification of the testdata set.
Compute the confusion matrix and determine the classification error.
mygrid = somgrid(5, 5, "hexagonal")                        # same 5 x 5 hexagonal grid as before
som.wines = supersom(trainingdata, grid = mygrid)          # supervised SOM on both layers
som.prediction = predict(som.wines, newdata = testdata)    # map the test set onto the trained SOM
som.prediction$predictions
$measurements
A matrix: 52 × 13 of type dbl (the predicted values of the 13 measurements for each of the 52 test samples; full matrix omitted)

$vintages
Barolo Barolo Barolo Barolo Barolo Barolo Barolo Barolo Barolo Barolo
Barolo Barolo Barolo Barolo Barolo Barolo Barolo Barolo Grignolino Grignolino
Grignolino Grignolino Grignolino Grignolino Grignolino Grignolino Grignolino Grignolino Grignolino Grignolino
Grignolino Grignolino Grignolino Grignolino Grignolino Grignolino Grignolino Grignolino Grignolino Grignolino
Grignolino Barbera Barbera Barbera Barbera Barbera Barbera Barbera Barbera Barbera
Barbera Barbera
Levels: Barbera Barolo Grignolino
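Note that testdata still carries the true vintages, which is convenient for the error computation below but is not needed for prediction itself. To classify samples whose cultivar is genuinely unknown, predict() should also accept a newdata list containing only the measurements layer (a hedged sketch based on the kohonen documentation, not part of the original practical):

pred.unknown = predict(som.wines,
                       newdata = list(measurements = Xtest.wine),
                       whatmap = "measurements")    # map using the measurements layer only
pred.unknown$predictions[["vintages"]]              # predicted cultivars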
confusion <- table(vintages[-train.wine], som.prediction$predictions[["vintages"]])
confusion
             Barbera Barolo Grignolino
  Barbera         11      0          0
  Barolo           0     18          0
  Grignolino       0      0         23
# error rate = off-diagonal (misclassified) counts divided by the total number of test samples
error_rate <- sum(confusion - diag(diag(confusion))) / sum(confusion)
error_rate
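Equivalently, the error rate can be computed without the confusion matrix by comparing predicted and true labels directly (added here only as a cross-check):

# proportion of test samples whose predicted cultivar differs from the true one
mean(as.character(som.prediction$predictions[["vintages"]]) !=
     as.character(vintages[-train.wine]))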
Exercise
Repeat the practical (clustering and classification) using:
- subsets of the wine data (note: scatter matrix);
- different SOM parameters (number of neurons, functions).
Compare the results obtained (a possible starting point is sketched below).
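A possible starting point, purely as a sketch: the chosen variable subset, grid size and number of training iterations below are arbitrary examples, not prescribed values.

# Scatter matrix of a few variables, coloured by cultivar, to help choose a subset
sub = c("alcohol", "flavonoids", "col. int.", "proline")
pairs(wines[, sub], col = as.integer(vintages))

# Repeat the supervised SOM on that subset with different SOM parameters
Xtrain.sub = scale(wines[train.wine, sub])
Xtest.sub = scale(wines[-train.wine, sub],
                  center = attr(Xtrain.sub, "scaled:center"),
                  scale = attr(Xtrain.sub, "scaled:scale"))
som.sub = supersom(list(measurements = Xtrain.sub,
                        vintages = vintages[train.wine]),
                   grid = somgrid(4, 4, "rectangular"),    # smaller, rectangular grid
                   rlen = 500)                             # more training iterations
pred.sub = predict(som.sub,
                   newdata = list(measurements = Xtest.sub,
                                  vintages = vintages[-train.wine]))
table(vintages[-train.wine], pred.sub$predictions[["vintages"]])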