Market price predictor - General Motors - Part III

David Cerezal Landa · 28 minute read

As discussed in the previous posts, in this set of notebooks we analyze the time series of General Motors' closing market prices on the NY stock exchange. The goal is to build a baseline model and then try to improve on it using sequential networks. Today's object of study is:

The last point, sequential models: we will try to build sequential network models that improve on those baseline models.

Forecast: sequential networks

Having defined ARIMA as our baseline model, we will now try to build sequential network models that beat it. At this point we will also incorporate technical indicators that further improve the prediction.

In today's post we will cover:

1.- RNN model

2.- LSTM models

3.- Final results and conclusions

RNN model (simple)

Loading the previously created dataset

With the dataset already created, we can import it directly from the directory.

In [0]:
symbol = 'GM'
In [0]:
serie = pd.read_csv(file_data_GM)
serie.index = serie.pop('date')
serie.index = pd.to_datetime(serie.index)
serie.head()
Out[0]:
1. open 2. high 3. low 4. close 5. volume
date
2010-11-18 35.00 35.99 33.8900 34.19 457044300.0
2010-11-19 34.15 34.50 33.1100 34.26 107842000.0
2010-11-22 34.20 34.48 33.8100 34.08 36650600.0
2010-11-23 33.95 33.99 33.1900 33.25 31170200.0
2010-11-24 33.73 33.80 33.2186 33.48 26138000.0
In [0]:
data = serie['4. close'].values
print(data)
[34.19 34.26 34.08 ... 37.11 37.61 37.13]

Computing some indicators

In [0]:
# MACD indicator via Alpha Vantage's TechIndicators
from alpha_vantage.techindicators import TechIndicators
symbol='NYSE:GM'
ti = TechIndicators(key=API_KEY,output_format='pandas')
# Fetch the daily MACD computed on the close series.
data_macd, meta_macd = ti.get_macd(symbol=symbol, interval='daily',series_type='close')
In [0]:
serie = pd.merge(serie, data_macd,on='date')
serie.head()
Out[0]:
1. open 2. high 3. low 4. close 5. volume MACD_Signal MACD_Hist MACD
date
2011-01-06 38.24 39.48 38.07 38.90 38556900.0 0.4572 0.5579 1.0151
2011-01-07 38.84 39.33 38.51 38.98 19901100.0 0.5918 0.5382 1.1299
2011-01-10 39.34 39.36 38.44 38.56 18341600.0 0.7081 0.4654 1.1735
2011-01-11 38.66 39.43 38.51 38.75 14856500.0 0.8084 0.4011 1.2094
2011-01-12 38.95 39.37 38.37 38.62 16773900.0 0.8894 0.3240 1.2134
In [0]:
data = serie.loc[:,['4. close', 'MACD']].values
print(data)
[[38.9     1.0151]
 [38.98    1.1299]
 [38.56    1.1735]
 ...
 [36.77   -0.181 ]
 [37.11   -0.2211]
 [37.61   -0.2102]]

The first step for the RNN is to arrange the data into n blocks of sliding windows, shifted by one step at a time, producing a data cube in which each row is one input to the network, each column is the value at instant t-n, and the depth is the number of independent input variables.

If we only take the price of previous days, the number of independent variables is 1. This value grows by one for each exogenous variable we add.

In [0]:
# Split a multivariate sequence into (input window, next-value target) samples
def split_sequence(sequence, n_steps):
    X = list()
    Y = list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps
        # check if we are beyond the sequence
        if end_ix > len(sequence)-1:
            break
        # gather input and output parts of the pattern:
        # the input window holds n_steps-1 rows (all features),
        # the target is the close price (first column) at position end_ix
        seq_x = sequence[i:end_ix-1,:]
        seq_y = sequence[end_ix,0:1]
        X.append(seq_x)
        Y.append(seq_y)

    return (np.array(X), np.array(Y))
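The cell that builds x_train_data, y_train_data, x_test_data and y_test_data is not shown in the notebook. A minimal sketch of how those arrays could be produced with the helper above, assuming a window of split_steps = 6 (5 past values per input) and a chronological 70/30 split (the exact proportion used in the original run is not shown):

# Hypothetical reconstruction of the missing cell: window the (close, MACD)
# matrix and split it chronologically into train and test sets.
split_steps = 6                              # assumed window size: 5 inputs + 1 target
X, Y = split_sequence(data, split_steps)     # X: (samples, 5, 2), Y: (samples, 1)

split_idx = int(len(X) * 0.7)                # assumed chronological 70/30 split
x_train_data, y_train_data = X[:split_idx], Y[:split_idx]
x_test_data, y_test_data = X[split_idx:], Y[split_idx:]

print(x_train_data.shape, x_test_data.shape)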

Modeling

Recurrent Neural Networks

For the neural network modeling we have chosen recurrent architectures, which are well suited to sequential data. As we already know, in a time series each input of the sequence depends on the previous one(s). Recurrent neural networks (RNNs) follow the same principle as the multilayer perceptron, but each new input of the sequence is also fed back with the output of the previous inputs, as illustrated below.

[Figure: unrolled RNN with feedback connections]

This feedback acts as a kind of "memory", capturing information accumulated along the sequence. In our case, the input is a sequence of prices over time and we want to predict the next value, so we use a many-to-one model.

[Figure: many-to-one recurrent architecture]
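To make the feedback explicit, here is a toy NumPy sketch of a single recurrent unit stepping through a short sequence; the weights and inputs are arbitrary numbers chosen only to show that each output depends both on the current input and on the previous hidden state:

import numpy as np

x = np.array([0.2, -0.5, 0.7, 0.1])   # a tiny (normalized) input sequence
w_x, w_h, b = 0.5, 0.8, 0.1           # arbitrary example weights
h = 0.0                               # initial hidden state

for t, x_t in enumerate(x):
    # h_t depends on the current input x_t and on the previous state h_{t-1}
    h = np.tanh(w_x * x_t + w_h * h + b)
    print(f't={t}  h_t={h:.4f}')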

Two problems can arise when trying to fit a simple RNN:

  • Vanishing gradient: arises when we backpropagate the error and the partial derivatives become smaller and smaller. As a result, the contribution of the earliest steps to the gradient descent update is practically zero, so long-range dependencies are lost when the sequences are very long. This can be mitigated by using other activation functions (ReLU) or different types of recurrent cells such as LSTM (Long Short Term Memory) or GRU (Gated Recurrent Unit).

  • Exploding gradient: arises when the algorithm assigns excessively large values to the weights. This can be mitigated by applying gradient clipping or by using optimization algorithms that adjust the learning rate more efficiently, such as RMSprop or Adam (see the short sketch after this list).
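As a quick illustration of that second mitigation, this is how gradient clipping could be enabled when compiling a Keras model like the ones below; clipnorm is a standard argument of Keras optimizers, and the 1.0 threshold is an arbitrary choice for the example:

from keras.optimizers import Adam

# Clip each gradient to a maximum L2 norm of 1.0 before the update step,
# so that a single batch cannot produce an arbitrarily large weight change.
clipped_adam = Adam(lr=0.001, clipnorm=1.0)
# model.compile(optimizer=clipped_adam, loss='mse')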

Model definition

For the model we use L2 regularization, which yields non-sparse weights, which is of interest in our case:

In [0]:
num_steps = split_steps - 1
num_var = x_train_data.shape[2]
output_size = y_train_data.shape[1]
In [0]:
import keras.backend as K
def rmse (y_true, y_pred):
    return K.sqrt(K.mean(K.square(y_pred -y_true), axis=-1))
In [0]:
from keras.models import Sequential
from keras.layers import Dense, Activation, LSTM, Dropout, SimpleRNN, BatchNormalization
from keras.optimizers import SGD
from keras.regularizers import l2
from keras.callbacks import EarlyStopping

model = Sequential()
model.add(SimpleRNN(50,
                    input_shape=(num_steps, num_var),
                    activation='relu',
                    kernel_regularizer=l2(0.001),
                    return_sequences=True))
model.add(BatchNormalization())
model.add(Dropout(0.2))

model.add(SimpleRNN(50,
                    activation='relu',
                    kernel_regularizer=l2(0.001),
                    return_sequences=True))
model.add(BatchNormalization())
model.add(Dropout(0.2))

model.add(SimpleRNN(50,
                    activation='relu',
                    kernel_regularizer=l2(0.001)))
model.add(BatchNormalization())
model.add(Dropout(0.2))

model.add(Dense(output_size,
                activation='linear',
                kernel_regularizer=l2(0.001)))


model.compile(optimizer='adam', loss='mse', metrics=['mse',rmse,'mae','mape'])

Training

For training we use the technique known as early stopping, which "cuts off" training to prevent over-fitting.

We train for up to 50 epochs, feeding the network in small batches (batch_size=3 in the code below):

In [0]:
es = EarlyStopping(monitor='val_loss', mode='min', patience=10, verbose=1)


print('RNN')
history_train = model.fit(x_train_data, y_train_data, 
                          epochs=50,
                          batch_size=3, 
                          shuffle=False,
                          validation_split=0.3,
                          callbacks=[es],
                          verbose=1)

plt.plot(history_train.history['mean_squared_error'])
plt.plot(history_train.history['mean_absolute_error'])
plt.plot(history_train.history['rmse'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['MSE','MAE','RMSE'], loc='upper left')
plt.show()

plt.plot(history_train.history['mean_absolute_percentage_error'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['MAPE'], loc='upper left')
plt.show()
RNN
Train on 1010 samples, validate on 433 samples
Epoch 1/50
1010/1010 [==============================] - 7s 7ms/step - loss: 785.0313 - mean_squared_error: 784.9274 - rmse: 27.5482 - mean_absolute_error: 27.5482 - mean_absolute_percentage_error: 92.4799 - val_loss: 546.6240 - val_mean_squared_error: 546.5150 - val_rmse: 23.1830 - val_mean_absolute_error: 23.1830 - val_mean_absolute_percentage_error: 70.7623
Epoch 2/50
1010/1010 [==============================] - 2s 2ms/step - loss: 260.8698 - mean_squared_error: 260.7534 - rmse: 15.0494 - mean_absolute_error: 15.0494 - mean_absolute_percentage_error: 50.1351 - val_loss: 47.0533 - val_mean_squared_error: 46.9274 - val_rmse: 6.3590 - val_mean_absolute_error: 6.3590 - val_mean_absolute_percentage_error: 18.9590
Epoch 3/50
1010/1010 [==============================] - 2s 2ms/step - loss: 44.5053 - mean_squared_error: 44.3772 - rmse: 5.3668 - mean_absolute_error: 5.3668 - mean_absolute_percentage_error: 17.6814 - val_loss: 17.8657 - val_mean_squared_error: 17.7334 - val_rmse: 3.2436 - val_mean_absolute_error: 3.2436 - val_mean_absolute_percentage_error: 9.3669
Epoch 4/50
1010/1010 [==============================] - 2s 2ms/step - loss: 41.2103 - mean_squared_error: 41.0797 - rmse: 5.3577 - mean_absolute_error: 5.3577 - mean_absolute_percentage_error: 18.7917 - val_loss: 7.9954 - val_mean_squared_error: 7.8622 - val_rmse: 2.4976 - val_mean_absolute_error: 2.4976 - val_mean_absolute_percentage_error: 7.7899
Epoch 5/50
1010/1010 [==============================] - 2s 2ms/step - loss: 40.8599 - mean_squared_error: 40.7289 - rmse: 5.2604 - mean_absolute_error: 5.2604 - mean_absolute_percentage_error: 18.5122 - val_loss: 8.2714 - val_mean_squared_error: 8.1376 - val_rmse: 2.3358 - val_mean_absolute_error: 2.3358 - val_mean_absolute_percentage_error: 7.0065
Epoch 6/50
1010/1010 [==============================] - 2s 2ms/step - loss: 39.0972 - mean_squared_error: 38.9659 - rmse: 5.0940 - mean_absolute_error: 5.0940 - mean_absolute_percentage_error: 18.0374 - val_loss: 9.0630 - val_mean_squared_error: 8.9286 - val_rmse: 2.5678 - val_mean_absolute_error: 2.5678 - val_mean_absolute_percentage_error: 7.7950
Epoch 7/50
1010/1010 [==============================] - 2s 2ms/step - loss: 37.2390 - mean_squared_error: 37.1070 - rmse: 5.0111 - mean_absolute_error: 5.0111 - mean_absolute_percentage_error: 17.5992 - val_loss: 9.7055 - val_mean_squared_error: 9.5701 - val_rmse: 2.4810 - val_mean_absolute_error: 2.4810 - val_mean_absolute_percentage_error: 7.3845
Epoch 8/50
1010/1010 [==============================] - 2s 2ms/step - loss: 38.2450 - mean_squared_error: 38.1124 - rmse: 5.0381 - mean_absolute_error: 5.0381 - mean_absolute_percentage_error: 17.7691 - val_loss: 9.0316 - val_mean_squared_error: 8.8959 - val_rmse: 2.6599 - val_mean_absolute_error: 2.6599 - val_mean_absolute_percentage_error: 8.3955
Epoch 9/50
1010/1010 [==============================] - 2s 2ms/step - loss: 36.2072 - mean_squared_error: 36.0744 - rmse: 4.9409 - mean_absolute_error: 4.9409 - mean_absolute_percentage_error: 17.3287 - val_loss: 8.9092 - val_mean_squared_error: 8.7734 - val_rmse: 2.6178 - val_mean_absolute_error: 2.6178 - val_mean_absolute_percentage_error: 8.0490
Epoch 10/50
1010/1010 [==============================] - 2s 2ms/step - loss: 36.3066 - mean_squared_error: 36.1737 - rmse: 4.9753 - mean_absolute_error: 4.9753 - mean_absolute_percentage_error: 17.5653 - val_loss: 7.2585 - val_mean_squared_error: 7.1223 - val_rmse: 2.3085 - val_mean_absolute_error: 2.3085 - val_mean_absolute_percentage_error: 7.0583
Epoch 11/50
1010/1010 [==============================] - 2s 2ms/step - loss: 35.7151 - mean_squared_error: 35.5819 - rmse: 4.9014 - mean_absolute_error: 4.9014 - mean_absolute_percentage_error: 17.2317 - val_loss: 7.5640 - val_mean_squared_error: 7.4278 - val_rmse: 2.4328 - val_mean_absolute_error: 2.4328 - val_mean_absolute_percentage_error: 7.6006
Epoch 12/50
1010/1010 [==============================] - 2s 2ms/step - loss: 35.5824 - mean_squared_error: 35.4490 - rmse: 4.8950 - mean_absolute_error: 4.8950 - mean_absolute_percentage_error: 17.1913 - val_loss: 7.7086 - val_mean_squared_error: 7.5716 - val_rmse: 2.3091 - val_mean_absolute_error: 2.3091 - val_mean_absolute_percentage_error: 6.9876
Epoch 13/50
1010/1010 [==============================] - 2s 2ms/step - loss: 35.6137 - mean_squared_error: 35.4797 - rmse: 4.7993 - mean_absolute_error: 4.7993 - mean_absolute_percentage_error: 16.9660 - val_loss: 8.5271 - val_mean_squared_error: 8.3899 - val_rmse: 2.4734 - val_mean_absolute_error: 2.4734 - val_mean_absolute_percentage_error: 7.5144
Epoch 14/50
1010/1010 [==============================] - 2s 2ms/step - loss: 35.6660 - mean_squared_error: 35.5319 - rmse: 4.8553 - mean_absolute_error: 4.8553 - mean_absolute_percentage_error: 17.0522 - val_loss: 10.6545 - val_mean_squared_error: 10.5167 - val_rmse: 2.8742 - val_mean_absolute_error: 2.8742 - val_mean_absolute_percentage_error: 9.1390
Epoch 15/50
1010/1010 [==============================] - 2s 2ms/step - loss: 36.3775 - mean_squared_error: 36.2427 - rmse: 4.9929 - mean_absolute_error: 4.9929 - mean_absolute_percentage_error: 17.5291 - val_loss: 11.4388 - val_mean_squared_error: 11.3007 - val_rmse: 2.9772 - val_mean_absolute_error: 2.9772 - val_mean_absolute_percentage_error: 9.4595
Epoch 16/50
1010/1010 [==============================] - 2s 2ms/step - loss: 35.0341 - mean_squared_error: 34.8989 - rmse: 4.8476 - mean_absolute_error: 4.8476 - mean_absolute_percentage_error: 17.0562 - val_loss: 8.7656 - val_mean_squared_error: 8.6270 - val_rmse: 2.6318 - val_mean_absolute_error: 2.6318 - val_mean_absolute_percentage_error: 8.2130
Epoch 17/50
1010/1010 [==============================] - 2s 2ms/step - loss: 34.8604 - mean_squared_error: 34.7249 - rmse: 4.7980 - mean_absolute_error: 4.7980 - mean_absolute_percentage_error: 16.8143 - val_loss: 963.7691 - val_mean_squared_error: 963.6302 - val_rmse: 24.8952 - val_mean_absolute_error: 24.8952 - val_mean_absolute_percentage_error: 72.4567
Epoch 18/50
1010/1010 [==============================] - 2s 2ms/step - loss: 34.5862 - mean_squared_error: 34.4503 - rmse: 4.8394 - mean_absolute_error: 4.8394 - mean_absolute_percentage_error: 16.9539 - val_loss: 9.2740 - val_mean_squared_error: 9.1350 - val_rmse: 2.7145 - val_mean_absolute_error: 2.7145 - val_mean_absolute_percentage_error: 8.4752
Epoch 19/50
1010/1010 [==============================] - 2s 2ms/step - loss: 33.5859 - mean_squared_error: 33.4498 - rmse: 4.7741 - mean_absolute_error: 4.7741 - mean_absolute_percentage_error: 16.7261 - val_loss: 8.9045 - val_mean_squared_error: 8.7650 - val_rmse: 2.6516 - val_mean_absolute_error: 2.6516 - val_mean_absolute_percentage_error: 8.2715
Epoch 20/50
1010/1010 [==============================] - 2s 2ms/step - loss: 32.8114 - mean_squared_error: 32.6750 - rmse: 4.7105 - mean_absolute_error: 4.7105 - mean_absolute_percentage_error: 16.5936 - val_loss: 10.1314 - val_mean_squared_error: 9.9918 - val_rmse: 2.8437 - val_mean_absolute_error: 2.8437 - val_mean_absolute_percentage_error: 8.9522
Epoch 00020: early stopping

Prediction

The model's predictions work as follows:

  • Given a window of the previous days' closing prices and the MACD indicator, obtain the value for the next day. For each test sequence we obtain the next-day prediction and store the results.
  • Once we have the model's results, we compute the error metrics to evaluate its fit.
  • Finally, we plot both the predicted points and the real values to see how the network behaves on values it has never seen before and how well it follows the pattern of the series.
In [0]:
y_pred = model.predict(x_test_data, verbose=0)

Metrics

In [0]:
def mape(y_true, y_pred): 
    y_true_, y_pred_ = np.array(y_true), np.array(y_pred)
    return np.mean(np.abs((y_true_ - y_pred_) / y_true_)) * 100
In [0]:
# sk_mse / sk_mae are sklearn's error functions, presumably imported earlier in the series as:
# from sklearn.metrics import mean_squared_error as sk_mse, mean_absolute_error as sk_mae
prd, ytr = np.squeeze(y_pred), np.squeeze(y_test_data)

mse = sk_mse(ytr, prd)
rmse = np.sqrt(sk_mse(ytr, prd))
mae = sk_mae(ytr, prd)
mape = mape(ytr, prd)

metrics_ = pd.DataFrame([mse,rmse,mae,mape], index=['mse','rmse','mae','mape'], columns=['Metrics'])
metrics_.loc['mape'] = metrics_.loc['mape'].map(lambda x: str(round(x, 2))+'%')
metrics_
Out[0]:
Metrics
mse 26.9055
rmse 5.18705
mae 4.18687
mape 10.6%

Visualizing the prediction

We can see that the model learns the pattern followed by the real series, except in some stretches where anomalous values (spikes) occur. Broadly speaking, though, it manages to detect when the series tends to rise or fall.

In [0]:
y_pred[:5], y_test_data[:5]
Out[0]:
(array([[33.913372],
        [33.92289 ],
        [33.924004],
        [33.925922],
        [33.906902]], dtype=float32), array([[31.4 ],
        [31.85],
        [31.75],
        [32.04],
        [32.98]]))
In [0]:
plt.plot(y_pred, c='seagreen', alpha=0.6)
plt.legend(['prediction'], loc='best')
plt.title('RNN')
Out[0]:
Text(0.5, 1.0, 'RNN')
In [0]:
plt.plot(y_test_data, c='orange', alpha=0.5)
plt.legend(['real'], loc='best')
plt.title('RNN')
Out[0]:
Text(0.5, 1.0, 'RNN')

Long Short Term Memory (LSTM)

Loading the previously created dataset

With the dataset already created, we can import it directly from the directory.

In [0]:
serie = pd.read_csv(file_data_GM)
serie.index = serie.pop('date')
serie.index = pd.to_datetime(serie.index)
serie.head()
Out[0]:
1. open 2. high 3. low 4. close 5. volume
date
2010-11-18 35.00 35.99 33.8900 34.19 457044300.0
2010-11-19 34.15 34.50 33.1100 34.26 107842000.0
2010-11-22 34.20 34.48 33.8100 34.08 36650600.0
2010-11-23 33.95 33.99 33.1900 33.25 31170200.0
2010-11-24 33.73 33.80 33.2186 33.48 26138000.0
In [0]:
serie.shape
Out[0]:
(2229, 5)

Computing some indicators

We compute some indicators such as the MACD, the moving average, etc. Finally, since we know the series shows strong signs of being a random walk, we take the first-order difference of the series and add it as a variable.

In [0]:
serie['26_ema'] = serie['4. close'].ewm(span=26, min_periods=0, adjust=True, ignore_na=False).mean()
serie['12_ema'] = serie['4. close'].ewm(span=12, min_periods=0, adjust=True, ignore_na=False).mean()
serie['MACD'] = serie['12_ema'] - serie['26_ema']
serie = serie.fillna(0)
serie.head()
Out[0]:
1. open 2. high 3. low 4. close 5. volume 26_ema 12_ema MACD
date
2010-11-18 35.00 35.99 33.8900 34.19 457044300.0 34.190000 34.190000 0.000000
2010-11-19 34.15 34.50 33.1100 34.26 107842000.0 34.226346 34.227917 0.001571
2010-11-22 34.20 34.48 33.8100 34.08 36650600.0 34.173765 34.170185 -0.003581
2010-11-23 33.95 33.99 33.1900 33.25 31170200.0 33.915521 33.879718 -0.035803
2010-11-24 33.73 33.80 33.2186 33.48 26138000.0 33.814522 33.771116 -0.043406
In [0]:
daily = serie['4. close'].asfreq('D', method='pad')
mean_roll = daily.rolling(15).mean().fillna(0)
mean_roll.name = 'mavg'
std_roll = daily.rolling(15).std().fillna(0)
std_roll.name = 'mstd'
diff_1 = serie['4. close'].diff().dropna()
diff_1.name = 'diff_1'
In [0]:
serie = pd.concat([serie, mean_roll, std_roll, diff_1], axis=1).dropna()
In [0]:
serie.head()
Out[0]:
1. open 2. high 3. low 4. close 5. volume 26_ema 12_ema MACD mavg mstd diff_1
date
2010-11-19 34.15 34.50 33.1100 34.26 107842000.0 34.226346 34.227917 0.001571 0.0 0.0 0.07
2010-11-22 34.20 34.48 33.8100 34.08 36650600.0 34.173765 34.170185 -0.003581 0.0 0.0 -0.18
2010-11-23 33.95 33.99 33.1900 33.25 31170200.0 33.915521 33.879718 -0.035803 0.0 0.0 -0.83
2010-11-24 33.73 33.80 33.2186 33.48 26138000.0 33.814522 33.771116 -0.043406 0.0 0.0 0.23
2010-11-26 33.41 33.81 33.2100 33.80 12301200.0 33.811613 33.778137 -0.033477 0.0 0.0 0.32

Preparing the input

To feed the recurrent neural network, the input must consist of sequences of N timesteps with a fixed window. The input can be univariate (only one variable is considered for the analysis) or multivariate, where each temporal sequence is the combination of two or more variables, which may be indicators or variables built from the series itself, such as its first-order difference.

The choice of variables that make up each sequence, as well as the window size, are parameters that must be tuned according to the characteristics of the series. For training we keep the first 1558 windows, leaving 665 for evaluating the model.

In [0]:
from sklearn.preprocessing import MinMaxScaler  # used when scale=True in train_test()

def split_sequence(sequence, n_steps):
    # Each sample holds n_steps past values plus the next value as its last
    # element; train_test() below separates inputs and target.
    X = list()
    for i in range(len(sequence)):
        end_ix = i + n_steps
        if end_ix > len(sequence)-1:
            break
        seq = sequence[i:end_ix+1]
        X.append(seq)
    return np.array(X)

def train_test(data, test_size=0.2, scale=True):
    if scale:
        scaler = MinMaxScaler(feature_range=(0,1))
        data2 = scaler.fit_transform(data)
    else:
        data2 = np.copy(data)
        scaler = None
    
    # Chronological split: the negative stop/start index effectively keeps the last
    # len(data)*test_size windows for test and the rest for training (no shuffling).
    tr, ts = data2[0:int(1-len(data2)*test_size),:], data2[int(1-len(data)*test_size):,:]
    # All but the last value of each window are the inputs (with a features axis added);
    # the last value is the target.
    xtr, xts = tr[:,:-1,np.newaxis], ts[:,:-1,np.newaxis]
    ytr, yts = tr[:,-1], ts[:,-1]
    
    for i in (xtr, xts, ytr, yts):
        print(i.shape)
    
    return (xtr, xts, ytr, yts, scaler)
    

def split_sequence_indicators(frame, columns, n_steps, y_true='close', test_size=0.2, scale=True):
    xtrain, xtest = [], []
    targ_tr, targ_ts = None, None
    scalers = {}
    
    for col in columns:
        xy = split_sequence(frame[col].values, n_steps)
        xtr, xts, ytr, yts, scaler = train_test(xy, test_size, scale)
        scalers[col] = scaler
        if all(i is None for i in [targ_tr, targ_ts]) and col == y_true:
            targ_tr, targ_ts = ytr, yts
        xtrain.append(xtr)
        xtest.append(xts)
    
    xtrain = np.concatenate(xtrain, axis=2)
    xtest = np.concatenate(xtest, axis=2)
    
    print()
    for i in (xtrain, xtest, ytr, yts):
        print(i.shape)
    
    return (xtrain, xtest, targ_tr, targ_ts, scalers)

def normalize(data):
    normalized = (data-data.mean())/data.std()
    return normalized
In [0]:
xTrain, xTest, ytrain, ytest, scalers_ = split_sequence_indicators(serie, ['4. close', 'diff_1'], 5, y_true='4. close', test_size=0.3, scale=False)
(1558, 5, 1)
(665, 5, 1)
(1558,)
(665,)
(1558, 5, 1)
(665, 5, 1)
(1558,)
(665,)

(1558, 5, 2)
(665, 5, 2)
(1558,)
(665,)
In [0]:
xTrain[0], ytrain[0]
Out[0]:
(array([[34.26,  0.07],
        [34.08, -0.18],
        [33.25, -0.83],
        [33.48,  0.23],
        [33.8 ,  0.32]]), 33.8)

Modeling

 LSTM

As mentioned above for RNNs, one of the most common ways to mitigate the problems of simple RNNs is to use LSTM architectures. Essentially, they are RNNs capable of learning long-term dependencies, which lets us work with longer sequences. To achieve this, LSTM cells have three internal gates:

  • Forget gate
  • Update (input) gate
  • Output gate

These gates let the cell analyze the input and decide, along the sequence, what is important (and therefore kept) and what is not (and therefore discarded), as the toy sketch below illustrates.

[Figure: LSTM cell and its gates]
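A toy NumPy sketch of a single LSTM step can make the role of the three gates concrete. It only illustrates the standard LSTM equations with scalar toy weights (a real cell has separate weight matrices per gate and is implemented internally by Keras):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, w=0.5, u=0.3, b=0.0):
    # For brevity the same toy weights are reused for every gate.
    f = sigmoid(w * x_t + u * h_prev + b)        # forget gate: how much of c_prev to keep
    i = sigmoid(w * x_t + u * h_prev + b)        # update gate: how much new information to add
    o = sigmoid(w * x_t + u * h_prev + b)        # output gate: how much of the cell to expose
    c_tilde = np.tanh(w * x_t + u * h_prev + b)  # candidate cell state
    c = f * c_prev + i * c_tilde                 # new cell state (long-term memory)
    h = o * np.tanh(c)                           # new hidden state (output)
    return h, c

h, c = 0.0, 0.0
for x_t in [0.2, -0.1, 0.4]:                     # arbitrary toy inputs
    h, c = lstm_step(x_t, h, c)
    print(round(h, 4), round(c, 4))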

The model that obtained the best results is base_lstm_model.

Architecture

  • Input: it receives an input of shape (batch, timesteps, features), i.e. a multivariate input consisting of the daily closing value ("close") and the first-order difference ("diff_1") as a second variable. The window chosen for the model is 5 timesteps, so given a batch of 32, a window of 5 steps and 2 features, the model input has shape (32, 5, 2).
  • 3 blocks of:
    • BatchNormalization: normalizes the output of each layer of the network to keep a stable distribution before feeding the next layer.
    • An LSTM cell with 50 units and L2 regularization of 0.001
    • Dropout: reduces over-fitting by "switching off" connections between layers with a given probability
  • Output:
    • A dense layer with 1 unit that maps the prediction of the next day's value, with linear activation and L2 regularization of 0.002

Next we define the last two dimensions of the input:

In [0]:
n_features = 1 if len(xTrain.shape) < 3 else xTrain.shape[2]
time_steps = xTrain.shape[1]

print(f'n_features: {n_features}\ntime_steps: {time_steps}')
n_features: 2
time_steps: 5

Model definition

In [0]:
# Functional-API building blocks (presumably imported in earlier cells of the notebook;
# repeated here so the cell is self-contained)
from keras.models import Model
from keras.layers import Input, Reshape, Conv1D, MaxPooling1D, GRU, Multiply
from keras import regularizers
from keras.callbacks import ModelCheckpoint

def base_lstm_model():
    input_ = Input((time_steps, n_features), name='Input1')
    x = BatchNormalization(name='Bn1')(input_)
    x = Reshape((time_steps, n_features))(x)

    x = LSTM(50, return_sequences=True, name='Lstm1', kernel_regularizer=regularizers.l2(0.001))(x)
    x = BatchNormalization(name='Bn2')(x)
    x = Dropout(0.2, name='Dp1')(x)

    x = LSTM(50, return_sequences=True, name='Lstm2', kernel_regularizer=regularizers.l2(0.001))(x)
    x = BatchNormalization(name='Bn3')(x)
    x = Dropout(0.2, name='Dp2')(x)

    x = LSTM(50, name='Lstm3', kernel_regularizer=regularizers.l2(0.001))(x)
    x = BatchNormalization(name='Bn4')(x)
    x = Dropout(0.2, name='Dp3')(x)

    out = Dense(1, activation='linear', name='Output', kernel_regularizer=regularizers.l2(0.002))(x)

    model = Model(inputs=input_, outputs=out, name='Regressor1')
    return model


def lstm_model():
    input_ = Input((time_steps, n_features), name='Input1')
    x = BatchNormalization(name='Bn1')(input_)
    x = Reshape((time_steps, n_features))(x)
    
    x = LSTM(100, return_sequences=True, name='Lstm1', kernel_regularizer=regularizers.l2(0.001))(x)
    x = BatchNormalization(name='Bn2')(x)
    x = Dropout(0.2, name='Dp1')(x)

    x = LSTM(100, return_sequences=True, name='Lstm2', kernel_regularizer=regularizers.l2(0.001))(x)
    x = BatchNormalization(name='Bn3')(x)
    x = Dropout(0.2, name='Dp2')(x)

    x = LSTM(64, return_sequences=True, name='Lstm3', kernel_regularizer=regularizers.l2(0.001))(x)
    x = BatchNormalization(name='Bn4')(x)
    x = Dropout(0.2, name='Dp3')(x)
    
    x = LSTM(50, name='Lstm4', kernel_regularizer=regularizers.l2(0.001))(x)
    x = BatchNormalization(name='Bn5')(x)
    x = Dropout(0.2, name='Dp4')(x)
    
    out = Dense(1, activation='linear', name='Output', kernel_regularizer=regularizers.l2(0.002))(x)

    model = Model(inputs=input_, outputs=out, name='Regressor1')
    return model



def cnnlstm_model():
    input_ = Input((time_steps, n_features), name='Input1')
    x = BatchNormalization(name='Bn1')(input_)
    x = Reshape((time_steps, n_features))(x)
    
    x = Conv1D(filters=128, kernel_size=3, activation='relu', name='Conv1')(x)
    x = MaxPooling1D(pool_size=2, strides=None, padding='same', name='MaxPool1')(x)
    x = BatchNormalization(name='Bn2')(x)
    x = Dropout(0.2, name='Dp1')(x)
    
    x = LSTM(100, return_sequences=True, name='Lstm1')(x)
    x = BatchNormalization(name='Bn3')(x)
    x = Dropout(0.2, name='Dp2')(x)
    
    x = LSTM(100, return_sequences=True, name='Lstm2', kernel_regularizer=regularizers.l2(0.001))(x)
    x = BatchNormalization(name='Bn4')(x)
    x = Dropout(0.2, name='Dp3')(x)
    
    x = LSTM(50, name='Lstm3', kernel_regularizer=regularizers.l2(0.001))(x)
    x = BatchNormalization(name='Bn5')(x)
    x = Dropout(0.2, name='Dp4')(x)
    
    out = Dense(1, activation='linear', name='Output', kernel_regularizer=regularizers.l2(0.002))(x)
    
    model = Model(inputs=input_, outputs=out, name='Regressor1')
    return model



def cnnGru_model():
    input_ = Input((time_steps, n_features), name='Input1')
    x = BatchNormalization(name='Bn1')(input_)
    x = Reshape((time_steps, n_features))(x)
    
    x = Conv1D(filters=128, kernel_size=3, activation='relu', name='Conv1')(x)
    x = MaxPooling1D(pool_size=2, strides=None, padding='same', name='MaxPool1')(x)
    x = BatchNormalization(name='Bn2')(x)
    x = Dropout(0.5, name='Dp1')(x)
    
    x = GRU(100, return_sequences=True, name='Lstm1')(x)
    x = BatchNormalization(name='Bn3')(x)
    x = Dropout(0.5, name='Dp2')(x)
    
    x = GRU(100, return_sequences=True, name='Lstm2', kernel_regularizer=regularizers.l2(0.001))(x)
    x = BatchNormalization(name='Bn4')(x)
    x = Dropout(0.2, name='Dp3')(x)
    
    x = GRU(50, name='Lstm3', kernel_regularizer=regularizers.l2(0.001))(x)
    x = BatchNormalization(name='Bn5')(x)
    x = Dropout(0.2, name='Dp4')(x)
    
    out = Dense(1, activation='linear', name='Output')(x)
    
    model = Model(inputs=input_, outputs=out, name='Regressor1')
    return model


def cnn2lstm_model():
    input_ = Input((time_steps, n_features), name='Input1')
    x = BatchNormalization(name='Bn1')(input_)
    x = Reshape((time_steps, n_features))(x)
    
    x1 = Conv1D(filters=128, kernel_size=2, activation='relu', name='Conv1')(x)
    x1 = MaxPooling1D(pool_size=2, strides=None, padding='valid', name='MaxPool1')(x1)
    x1 = BatchNormalization(name='Bn2')(x1)
    x1 = Dropout(0.2, name='Dp1')(x1)
    
    x2 = Conv1D(filters=128, kernel_size=3, activation='relu', name='Conv2')(x)
    x2 = MaxPooling1D(pool_size=2, strides=None, padding='same', name='MaxPool2')(x2)
    x2 = BatchNormalization(name='Bn22')(x2)
    x2 = Dropout(0.2, name='Dp11')(x2)
    
    concat = Multiply(name='concat')([x1, x2])
    
    x = LSTM(100, return_sequences=True, name='Lstm1')(concat)
    x = BatchNormalization(name='Bn3')(x)
    x = Dropout(0.2, name='Dp2')(x)
    
    x = LSTM(100, return_sequences=True, name='Lstm2', kernel_regularizer=regularizers.l2(0.001))(x)
    x = BatchNormalization(name='Bn4')(x)
    x = Dropout(0.2, name='Dp3')(x)
    
    x = LSTM(50, name='Lstm3', kernel_regularizer=regularizers.l2(0.001))(x)
    x = BatchNormalization(name='Bn5')(x)
    x = Dropout(0.2, name='Dp4')(x)
    
    out = Dense(1, activation='linear', name='Output', kernel_regularizer=regularizers.l2(0.002))(x)
    
    model = Model(inputs=input_, outputs=out, name='Regressor1')
    return model
In [0]:
model = base_lstm_model()
In [0]:
model.summary()
Model: "Regressor1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
Input1 (InputLayer)          (None, 15, 2)             0         
_________________________________________________________________
Bn1 (BatchNormalization)     (None, 15, 2)             8         
_________________________________________________________________
reshape_8 (Reshape)          (None, 15, 2)             0         
_________________________________________________________________
Lstm1 (LSTM)                 (None, 15, 50)            10600     
_________________________________________________________________
Bn2 (BatchNormalization)     (None, 15, 50)            200       
_________________________________________________________________
Dp1 (Dropout)                (None, 15, 50)            0         
_________________________________________________________________
Lstm2 (LSTM)                 (None, 15, 50)            20200     
_________________________________________________________________
Bn3 (BatchNormalization)     (None, 15, 50)            200       
_________________________________________________________________
Dp2 (Dropout)                (None, 15, 50)            0         
_________________________________________________________________
Lstm3 (LSTM)                 (None, 50)                20200     
_________________________________________________________________
Bn4 (BatchNormalization)     (None, 50)                200       
_________________________________________________________________
Dp3 (Dropout)                (None, 50)                0         
_________________________________________________________________
Output (Dense)               (None, 1)                 51        
=================================================================
Total params: 51,659
Trainable params: 51,355
Non-trainable params: 304
_________________________________________________________________
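As a sanity check on the summary above, the parameter counts can be reproduced by hand: an LSTM layer with u units and n input features has 4·(u·(n+u)+u) parameters (four gate blocks, each with an input kernel, a recurrent kernel and a bias), BatchNormalization adds 4 parameters per feature, and the timestep dimension does not affect the count:

def lstm_params(units, n_in):
    # four gate blocks: input kernel (n_in*units) + recurrent kernel (units*units) + bias (units)
    return 4 * (units * (n_in + units) + units)

print(lstm_params(50, 2))    # Lstm1: 10600
print(lstm_params(50, 50))   # Lstm2 and Lstm3: 20200
print(50 * 1 + 1)            # Output (Dense): 51
print(4 * 2, 4 * 50)         # BatchNormalization on 2 and on 50 features: 8, 200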
In [0]:
mod_version = '0.8'
model_name = f'lstm_model_{symbol}_v{mod_version}'
model_name
Out[0]:
'lstm_model_GM_v0.8'
In [0]:
# Uncomment to save the freshly built model; the output below comes from an earlier run
#save_keras_model(export_path='.', model=model, model_name=model_name)
Saved model to disk as ./lstm_model_GM_v0.9
Out[0]:
(True, '.')
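save_keras_model is a helper defined earlier in this series and its code is not shown here; a plausible sketch, assuming it serializes the architecture to JSON, the weights to HDF5 and returns a (success, path) tuple like the output above:

def save_keras_model(export_path, model, model_name):
    # Hypothetical reconstruction of the helper used in this series.
    with open(f'{export_path}/{model_name}.json', 'w') as f:
        f.write(model.to_json())                            # architecture
    model.save_weights(f'{export_path}/{model_name}.h5')    # trained weights
    print(f'Saved model to disk as {export_path}/{model_name}')
    return (True, export_path)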
In [0]:
def train_network(model, inputs, outputs, params):
    history = model.fit(inputs,
                        outputs,
                        epochs=params['epochs'],
                        batch_size=params['batch_size'],
                        shuffle=True,
                        validation_split=0.2,
                        callbacks=params['callbacks']
                        )
    
    return history

def mape(y_true, y_pred):
    diff = K.abs((y_true - y_pred) / K.clip(K.abs(y_true),
                                            K.epsilon(),
                                            None))
    return 100. * K.mean(diff, axis=-1)
In [0]:
np.random.seed(42)
model.compile(optimizer='adam',
              loss='mse', #mean_absolute_percentage_error,
              #loss_weights = losses_weights,
              metrics=[mape, 'mae']
             )
In [0]:
train_params = {'epochs': 50,
                'batch_size': 32,
                'callbacks': [EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=10),
                              ModelCheckpoint(model_name+'.h5', monitor='val_loss', mode='min', verbose=1, save_best_only=True)]
               }

Training

We train the model for 50 epochs, feeding it with batches of 32 sequences.

In [0]:
# Uncomment to retrain; the log below (with the v0.9 filenames) comes from an earlier run
#hist = train_network(model, xTrain, ytrain, train_params)
Train on 1236 samples, validate on 310 samples
Epoch 1/50
1236/1236 [==============================] - 15s 12ms/step - loss: 943.0905 - mape: 101.9550 - mean_absolute_error: 30.5704 - val_loss: 895.3371 - val_mape: 93.5342 - val_mean_absolute_error: 29.8886

Epoch 00001: val_loss improved from inf to 895.33711, saving model to lstm_model_GM_v0.9.h5
Epoch 2/50
1236/1236 [==============================] - 4s 4ms/step - loss: 913.4573 - mape: 101.2667 - mean_absolute_error: 30.1624 - val_loss: 867.4886 - val_mape: 92.1287 - val_mean_absolute_error: 29.4289

Epoch 00002: val_loss improved from 895.33711 to 867.48856, saving model to lstm_model_GM_v0.9.h5
Epoch 3/50
1236/1236 [==============================] - 5s 4ms/step - loss: 875.8655 - mape: 99.4817 - mean_absolute_error: 29.5408 - val_loss: 869.1093 - val_mape: 92.1328 - val_mean_absolute_error: 29.4593

Epoch 00003: val_loss did not improve from 867.48856
Epoch 4/50
1236/1236 [==============================] - 5s 4ms/step - loss: 834.0199 - mape: 96.9678 - mean_absolute_error: 28.8087 - val_loss: 765.5155 - val_mape: 86.5691 - val_mean_absolute_error: 27.6379

Epoch 00004: val_loss improved from 867.48856 to 765.51547, saving model to lstm_model_GM_v0.9.h5
Epoch 5/50
1236/1236 [==============================] - 5s 4ms/step - loss: 774.6928 - mape: 93.6449 - mean_absolute_error: 27.7623 - val_loss: 756.8908 - val_mape: 85.9582 - val_mean_absolute_error: 27.4890

Epoch 00005: val_loss improved from 765.51547 to 756.89082, saving model to lstm_model_GM_v0.9.h5
Epoch 6/50
1236/1236 [==============================] - 5s 4ms/step - loss: 703.7569 - mape: 89.0417 - mean_absolute_error: 26.4295 - val_loss: 700.5804 - val_mape: 82.4909 - val_mean_absolute_error: 26.4385

Epoch 00006: val_loss improved from 756.89082 to 700.58035, saving model to lstm_model_GM_v0.9.h5
Epoch 7/50
1236/1236 [==============================] - 5s 4ms/step - loss: 619.9641 - mape: 83.1511 - mean_absolute_error: 24.7741 - val_loss: 561.8658 - val_mape: 74.1771 - val_mean_absolute_error: 23.6628

Epoch 00007: val_loss improved from 700.58035 to 561.86583, saving model to lstm_model_GM_v0.9.h5
Epoch 8/50
1236/1236 [==============================] - 5s 4ms/step - loss: 523.4752 - mape: 76.3574 - mean_absolute_error: 22.7380 - val_loss: 487.8759 - val_mape: 68.8868 - val_mean_absolute_error: 22.0604

Epoch 00008: val_loss improved from 561.86583 to 487.87586, saving model to lstm_model_GM_v0.9.h5
Epoch 9/50
1236/1236 [==============================] - 5s 4ms/step - loss: 427.3276 - mape: 68.6596 - mean_absolute_error: 20.4720 - val_loss: 390.6747 - val_mape: 61.6415 - val_mean_absolute_error: 19.7311

Epoch 00009: val_loss improved from 487.87586 to 390.67466, saving model to lstm_model_GM_v0.9.h5
Epoch 10/50
1236/1236 [==============================] - 5s 4ms/step - loss: 337.7580 - mape: 61.1728 - mean_absolute_error: 18.1564 - val_loss: 313.5560 - val_mape: 55.2599 - val_mean_absolute_error: 17.6617

Epoch 00010: val_loss improved from 390.67466 to 313.55604, saving model to lstm_model_GM_v0.9.h5
Epoch 11/50
1236/1236 [==============================] - 5s 4ms/step - loss: 259.1328 - mape: 52.9807 - mean_absolute_error: 15.8142 - val_loss: 245.7770 - val_mape: 48.8031 - val_mean_absolute_error: 15.6227

Epoch 00011: val_loss improved from 313.55604 to 245.77701, saving model to lstm_model_GM_v0.9.h5
Epoch 12/50
1236/1236 [==============================] - 5s 4ms/step - loss: 187.1905 - mape: 44.6546 - mean_absolute_error: 13.3016 - val_loss: 178.6244 - val_mape: 41.4727 - val_mean_absolute_error: 13.3102

Epoch 00012: val_loss improved from 245.77701 to 178.62435, saving model to lstm_model_GM_v0.9.h5
Epoch 13/50
1236/1236 [==============================] - 5s 4ms/step - loss: 133.9164 - mape: 36.8668 - mean_absolute_error: 11.0302 - val_loss: 106.6127 - val_mape: 32.0915 - val_mean_absolute_error: 10.2757

Epoch 00013: val_loss improved from 178.62435 to 106.61273, saving model to lstm_model_GM_v0.9.h5
Epoch 14/50
1236/1236 [==============================] - 5s 4ms/step - loss: 87.2871 - mape: 29.1872 - mean_absolute_error: 8.7359 - val_loss: 73.9379 - val_mape: 26.6030 - val_mean_absolute_error: 8.5469

Epoch 00014: val_loss improved from 106.61273 to 73.93792, saving model to lstm_model_GM_v0.9.h5
Epoch 15/50
1236/1236 [==============================] - 5s 4ms/step - loss: 59.3175 - mape: 23.0814 - mean_absolute_error: 6.8884 - val_loss: 38.6296 - val_mape: 18.9930 - val_mean_absolute_error: 6.1223

Epoch 00015: val_loss improved from 73.93792 to 38.62957, saving model to lstm_model_GM_v0.9.h5
Epoch 16/50
1236/1236 [==============================] - 5s 4ms/step - loss: 42.5138 - mape: 18.8758 - mean_absolute_error: 5.6259 - val_loss: 22.1341 - val_mape: 14.3496 - val_mean_absolute_error: 4.5938

Epoch 00016: val_loss improved from 38.62957 to 22.13413, saving model to lstm_model_GM_v0.9.h5
Epoch 17/50
1236/1236 [==============================] - 5s 4ms/step - loss: 28.6746 - mape: 14.6229 - mean_absolute_error: 4.3627 - val_loss: 12.0898 - val_mape: 10.5429 - val_mean_absolute_error: 3.3495

Epoch 00017: val_loss improved from 22.13413 to 12.08984, saving model to lstm_model_GM_v0.9.h5
Epoch 18/50
1236/1236 [==============================] - 5s 4ms/step - loss: 22.7058 - mape: 13.0452 - mean_absolute_error: 3.8600 - val_loss: 9.5645 - val_mape: 8.9047 - val_mean_absolute_error: 2.8910

Epoch 00018: val_loss improved from 12.08984 to 9.56454, saving model to lstm_model_GM_v0.9.h5
Epoch 19/50
1236/1236 [==============================] - 5s 4ms/step - loss: 17.3008 - mape: 10.9155 - mean_absolute_error: 3.2461 - val_loss: 3.9953 - val_mape: 5.4291 - val_mean_absolute_error: 1.7434

Epoch 00019: val_loss improved from 9.56454 to 3.99529, saving model to lstm_model_GM_v0.9.h5
Epoch 20/50
1236/1236 [==============================] - 5s 4ms/step - loss: 15.9539 - mape: 10.6181 - mean_absolute_error: 3.1225 - val_loss: 1.3632 - val_mape: 2.6197 - val_mean_absolute_error: 0.8409

Epoch 00020: val_loss improved from 3.99529 to 1.36324, saving model to lstm_model_GM_v0.9.h5
Epoch 21/50
1236/1236 [==============================] - 5s 4ms/step - loss: 14.5641 - mape: 10.2284 - mean_absolute_error: 3.0103 - val_loss: 1.0513 - val_mape: 2.2749 - val_mean_absolute_error: 0.7277

Epoch 00021: val_loss improved from 1.36324 to 1.05126, saving model to lstm_model_GM_v0.9.h5
Epoch 22/50
1236/1236 [==============================] - 5s 4ms/step - loss: 14.4249 - mape: 10.1539 - mean_absolute_error: 2.9676 - val_loss: 1.3969 - val_mape: 2.7617 - val_mean_absolute_error: 0.8910

Epoch 00022: val_loss did not improve from 1.05126
Epoch 23/50
1236/1236 [==============================] - 5s 4ms/step - loss: 15.4299 - mape: 10.5142 - mean_absolute_error: 3.0810 - val_loss: 0.9469 - val_mape: 2.0291 - val_mean_absolute_error: 0.6411

Epoch 00023: val_loss improved from 1.05126 to 0.94685, saving model to lstm_model_GM_v0.9.h5
Epoch 24/50
1236/1236 [==============================] - 5s 4ms/step - loss: 14.9976 - mape: 10.4425 - mean_absolute_error: 3.0645 - val_loss: 1.2196 - val_mape: 2.4256 - val_mean_absolute_error: 0.7615

Epoch 00024: val_loss did not improve from 0.94685
Epoch 25/50
1236/1236 [==============================] - 5s 4ms/step - loss: 13.8848 - mape: 9.8138 - mean_absolute_error: 2.9006 - val_loss: 0.9852 - val_mape: 2.1130 - val_mean_absolute_error: 0.6652

Epoch 00025: val_loss did not improve from 0.94685
Epoch 26/50
1236/1236 [==============================] - 5s 4ms/step - loss: 14.2419 - mape: 10.0624 - mean_absolute_error: 2.9498 - val_loss: 1.0022 - val_mape: 2.1511 - val_mean_absolute_error: 0.6815

Epoch 00026: val_loss did not improve from 0.94685
Epoch 27/50
1236/1236 [==============================] - 5s 4ms/step - loss: 13.5996 - mape: 9.9798 - mean_absolute_error: 2.9223 - val_loss: 1.0403 - val_mape: 2.2971 - val_mean_absolute_error: 0.7226

Epoch 00027: val_loss did not improve from 0.94685
Epoch 28/50
1236/1236 [==============================] - 5s 4ms/step - loss: 14.3143 - mape: 10.1534 - mean_absolute_error: 3.0084 - val_loss: 0.8587 - val_mape: 1.9496 - val_mean_absolute_error: 0.6217

Epoch 00028: val_loss improved from 0.94685 to 0.85871, saving model to lstm_model_GM_v0.9.h5
Epoch 29/50
1236/1236 [==============================] - 5s 4ms/step - loss: 13.6602 - mape: 9.7687 - mean_absolute_error: 2.8800 - val_loss: 0.8767 - val_mape: 1.9785 - val_mean_absolute_error: 0.6230

Epoch 00029: val_loss did not improve from 0.85871
Epoch 30/50
1236/1236 [==============================] - 5s 4ms/step - loss: 13.7913 - mape: 9.8895 - mean_absolute_error: 2.9023 - val_loss: 0.8516 - val_mape: 1.7773 - val_mean_absolute_error: 0.5656

Epoch 00030: val_loss improved from 0.85871 to 0.85158, saving model to lstm_model_GM_v0.9.h5
Epoch 31/50
1236/1236 [==============================] - 5s 4ms/step - loss: 12.8634 - mape: 9.5612 - mean_absolute_error: 2.8046 - val_loss: 0.7580 - val_mape: 1.7034 - val_mean_absolute_error: 0.5386

Epoch 00031: val_loss improved from 0.85158 to 0.75797, saving model to lstm_model_GM_v0.9.h5
Epoch 32/50
1236/1236 [==============================] - 5s 4ms/step - loss: 12.6364 - mape: 9.5363 - mean_absolute_error: 2.8095 - val_loss: 0.6839 - val_mape: 1.5693 - val_mean_absolute_error: 0.5002

Epoch 00032: val_loss improved from 0.75797 to 0.68392, saving model to lstm_model_GM_v0.9.h5
Epoch 33/50
1236/1236 [==============================] - 5s 4ms/step - loss: 12.1602 - mape: 9.2310 - mean_absolute_error: 2.7179 - val_loss: 0.8065 - val_mape: 1.8008 - val_mean_absolute_error: 0.5680

Epoch 00033: val_loss did not improve from 0.68392
Epoch 34/50
1236/1236 [==============================] - 5s 4ms/step - loss: 12.6006 - mape: 9.5097 - mean_absolute_error: 2.7746 - val_loss: 0.9921 - val_mape: 2.1675 - val_mean_absolute_error: 0.6942

Epoch 00034: val_loss did not improve from 0.68392
Epoch 35/50
1236/1236 [==============================] - 5s 4ms/step - loss: 11.8797 - mape: 9.2566 - mean_absolute_error: 2.7379 - val_loss: 1.0988 - val_mape: 2.2790 - val_mean_absolute_error: 0.7439

Epoch 00035: val_loss did not improve from 0.68392
Epoch 36/50
1236/1236 [==============================] - 4s 4ms/step - loss: 12.8867 - mape: 9.4886 - mean_absolute_error: 2.7986 - val_loss: 0.9222 - val_mape: 1.9428 - val_mean_absolute_error: 0.6285

Epoch 00036: val_loss did not improve from 0.68392
Epoch 37/50
1236/1236 [==============================] - 5s 4ms/step - loss: 11.4435 - mape: 9.1275 - mean_absolute_error: 2.6654 - val_loss: 0.8069 - val_mape: 1.7983 - val_mean_absolute_error: 0.5731

Epoch 00037: val_loss did not improve from 0.68392
Epoch 38/50
1236/1236 [==============================] - 5s 4ms/step - loss: 12.3426 - mape: 9.4397 - mean_absolute_error: 2.7561 - val_loss: 1.2831 - val_mape: 2.6728 - val_mean_absolute_error: 0.8745

Epoch 00038: val_loss did not improve from 0.68392
Epoch 39/50
1236/1236 [==============================] - 5s 4ms/step - loss: 12.1250 - mape: 9.3670 - mean_absolute_error: 2.7506 - val_loss: 0.9321 - val_mape: 2.0485 - val_mean_absolute_error: 0.6492

Epoch 00039: val_loss did not improve from 0.68392
Epoch 40/50
1236/1236 [==============================] - 5s 4ms/step - loss: 12.5294 - mape: 9.3223 - mean_absolute_error: 2.7676 - val_loss: 1.4251 - val_mape: 2.7011 - val_mean_absolute_error: 0.8918

Epoch 00040: val_loss did not improve from 0.68392
Epoch 41/50
1236/1236 [==============================] - 5s 4ms/step - loss: 11.5328 - mape: 9.1536 - mean_absolute_error: 2.7082 - val_loss: 0.7588 - val_mape: 1.7872 - val_mean_absolute_error: 0.5730

Epoch 00041: val_loss did not improve from 0.68392
Epoch 42/50
1236/1236 [==============================] - 5s 4ms/step - loss: 11.3477 - mape: 9.0098 - mean_absolute_error: 2.6583 - val_loss: 1.0152 - val_mape: 2.1508 - val_mean_absolute_error: 0.7007

Epoch 00042: val_loss did not improve from 0.68392
Epoch 00042: early stopping
In [0]:
lossNames = ["loss"]
plt.style.use("ggplot")
(fig, ax) = plt.subplots(len(lossNames), 1, figsize=(8, 8))
 
for (i, l) in enumerate(lossNames):
    metric_ = hist.history
    title = "Loss for {}".format(l) if l != "loss" else "Total loss"
    
    if len(lossNames) == 1:
        ax_ = ax
    else:
        ax_ = ax[i]
    
    ax_.set_title(title)
    ax_.set_xlabel("Epoch #")
    ax_.set_ylabel("Loss")
    ax_.plot(np.arange(0, len(metric_[l])), metric_[l], label=l)
    ax_.plot(np.arange(0, len(metric_[l])), metric_["val_" + l], label="val_" + l)
    ax_.legend()
    
plt.plot() 
plt.tight_layout()
In [0]:
metricNames = ["mape", "mean_absolute_error"]
plt.style.use("ggplot")
(fig, ax) = plt.subplots(len(metricNames), 1, figsize=(8, 8))
 
for (i, l) in enumerate(metricNames):
    metric_ = hist.history
    
    if len(metricNames) == 1:
        ax_ = ax
    else:
        ax_ = ax[i]
    
    ax_.set_title("{}".format(l))
    ax_.set_xlabel("Epoch #")
    ax_.set_ylabel("Accuracy")
    ax_.plot(np.arange(0, len(metric_[l])), metric_[l], label=l)
    ax_.plot(np.arange(0, len(metric_[l])), metric_["val_" + l], label="val_" + l)
    ax_.legend()

plt.plot()

plt.tight_layout()

Loading the trained model

We load the model by name and then load the already trained weights.
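load_keras_model is also a helper from an earlier post in this series; a minimal sketch of what it might look like, assuming it rebuilds the architecture from a JSON file saved next to the weights (the weights themselves are loaded in a later cell):

from keras.models import model_from_json

def load_keras_model(name, path='.'):
    # Hypothetical reconstruction: rebuild the architecture from <name>.json.
    json_file = f'{name}.json'
    print(json_file)
    with open(f'{path}/{json_file}') as f:
        model = model_from_json(f.read())
    print('Loaded model from disk')
    return model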

In [0]:
loaded_reg = load_keras_model(model_name)
loaded_reg.summary()
lstm_model_GM_v0.8.json
Loaded model from disk
Model: "Regressor1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
Input1 (InputLayer)          (None, 5, 2)              0         
_________________________________________________________________
Bn1 (BatchNormalization)     (None, 5, 2)              8         
_________________________________________________________________
reshape_5 (Reshape)          (None, 5, 2)              0         
_________________________________________________________________
Lstm1 (LSTM)                 (None, 5, 50)             10600     
_________________________________________________________________
Bn2 (BatchNormalization)     (None, 5, 50)             200       
_________________________________________________________________
Dp1 (Dropout)                (None, 5, 50)             0         
_________________________________________________________________
Lstm2 (LSTM)                 (None, 5, 50)             20200     
_________________________________________________________________
Bn3 (BatchNormalization)     (None, 5, 50)             200       
_________________________________________________________________
Dp2 (Dropout)                (None, 5, 50)             0         
_________________________________________________________________
Lstm3 (LSTM)                 (None, 50)                20200     
_________________________________________________________________
Bn4 (BatchNormalization)     (None, 50)                200       
_________________________________________________________________
Dp3 (Dropout)                (None, 50)                0         
_________________________________________________________________
Output (Dense)               (None, 1)                 51        
=================================================================
Total params: 51,659
Trainable params: 51,355
Non-trainable params: 304
_________________________________________________________________
In [0]:
loaded_reg.load_weights('./'+model_name+'.h5')

Prediction

The model's predictions work as follows:

  • Given a window of 5 days of prices and their first-order differences, obtain the value for the next day. For each test sequence we obtain the next-day prediction and store the results.
  • Once we have the model's results, we compute the error metrics to evaluate its fit.
  • Finally, we plot both the predicted points and the real values to see how the network behaves on values it has never seen before and how well it follows the pattern of the series.
In [0]:
n_features
Out[0]:
2
In [0]:
to_pred = xTest.reshape(xTest.shape[0], time_steps, n_features)
yhat = loaded_reg.predict(to_pred, verbose=0).reshape((-1,1))
trues = ytest.reshape(-1,1)
yhat.shape, trues.shape
Out[0]:
((664, 1), (664, 1))
In [0]:
yhat[:5], trues[:5]
Out[0]:
(array([[36.103447],
        [36.196735],
        [35.98827 ],
        [35.516903],
        [36.138325]], dtype=float32), array([[36.61],
        [36.14],
        [35.73],
        [36.33],
        [36.83]]))

Metrics

In [0]:
def mape(y_true, y_pred): 
    y_true_, y_pred_ = np.array(y_true), np.array(y_pred)
    return np.mean(np.abs((y_true_ - y_pred_) / y_true_)) * 100
In [0]:
prd, ytr = np.squeeze(yhat), np.squeeze(trues)

mse = sk_mse(ytr, prd)
rmse = np.sqrt(sk_mse(ytr, prd))
mae = sk_mae(ytr, prd)
mape = mape(ytr, prd)

metrics_ = pd.DataFrame([mse,rmse,mae,mape], index=['mse','rmse','mae','mape'], columns=['Metrics'])
metrics_.loc['mape'] = metrics_.loc['mape'].map(lambda x: str(round(x, 2))+'%')
metrics_
Out[0]:
Metrics
mse 1.85992
rmse 1.36379
mae 0.903904
mape 2.26%

Visualizing the prediction

We can see that the model learns the pattern followed by the real series, except in some stretches where anomalous values (spikes) occur. Broadly speaking, though, it manages to detect when the series tends to rise or fall.

In [0]:
plt.plot(yhat, c='seagreen', alpha=0.6)
plt.plot(trues, c='orange', alpha=0.5)
plt.legend(['prediction', 'real'], loc='best')
plt.title(model_name)
Out[0]:
Text(0.5, 1.0, 'lstm_model_GM_v0.8')

Let's zoom in on some narrower segments

In [0]:
plt.plot(yhat[50:150], c='seagreen', alpha=0.6)
plt.plot(trues[50:150], c='orange', alpha=0.5)
plt.legend(['prediction', 'real'], loc='best')
plt.title(model_name)
Out[0]:
Text(0.5, 1.0, 'lstm_model_GM_v0.8')
In [0]:
plt.plot(yhat[250:350], c='seagreen', alpha=0.6)
plt.plot(trues[250:350], c='orange', alpha=0.5)
plt.legend(['prediction', 'real'], loc='best')
plt.title(model_name)
Out[0]:
Text(0.5, 1.0, 'lstm_model_GM_v0.8')

Final results and conclusions

After studying the different machine learning models for making market predictions, we see that the sharp changes in prices are hard to learn for any of the models presented, since those price swings can be driven by external, political and social factors. The results are as follows:

Models

ARIMA => MSE: 0.36; RMSE: 0.63

NAÏVE DAILY => MSE: 0.46; RMSE: 0.68

RF EST. => MSE: 0.45; RMSE: 0.67

Simple RNN => MSE: 26.9; RMSE: 5.18

LSTM => MSE: 1.85; RMSE: 1.36

As the results above show, the traditional baseline models do not perform badly at all on 1-day-ahead predictions.

Of the sequential networks studied here, the ones with LSTM layers perform best. The gap, in favour of the non-sequential models, may come from the fact that the sequential models do not adapt as well to abrupt changes, while the simple models, which mainly rely on the most recent days, adapt better to time series that behave so much like a random walk.

Predicting market prices over the long term is hard for any of these models without taking into account the external factors that affect the world economy and, in turn, each individual company.
