Technical forecasting – Historical analogue method, Time series analysis
Historical analogue method –
The historical analogue method is simple and straightforward: it forecasts by comparing current patterns with similar patterns observed in the past.
Here we collect the Uber, yellow-taxi and green-taxi datasets for the year 2014 in order to predict the future trend and demand for taxis at a particular location and time interval.
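The idea of matching the current pattern against past patterns can be sketched in a few lines. The function and series below are illustrative, not the project's actual implementation: it finds the past window closest (in Euclidean distance) to the current one and returns the values that followed it as the forecast.

```python
# Minimal sketch of the historical analogue method; the pickup counts
# and window length below are made-up illustrative values.
def analogue_forecast(history, current, horizon):
    """Find the past window most similar to `current` and return the
    `horizon` values that followed it as the forecast."""
    w = len(current)
    best_start, best_dist = None, float("inf")
    # slide over every past window that leaves `horizon` points after it
    for start in range(len(history) - w - horizon + 1):
        window = history[start:start + w]
        dist = sum((a - b) ** 2 for a, b in zip(window, current)) ** 0.5
        if dist < best_dist:
            best_start, best_dist = start, dist
    return history[best_start + w:best_start + w + horizon]

pickups = [10, 12, 15, 11, 13, 16, 12, 14, 17, 13]
print(analogue_forecast(pickups, current=[11, 13], horizon=2))  # [16, 12]
```

The exact match `[11, 13]` occurs at positions 3-4, so the forecast is the two values that followed it.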
Time series analysis –
In time series analysis, we analyze time-series data, a sequence of data points recorded over time, to extract meaningful statistics from it. In time series forecasting, we then apply a model to predict future values based on the recorded data.
In this application, we bucket time into 10-minute bins, so that given any region ID (derived from the zip code) and a 10-minute time bin we can predict the number of pickups. To evaluate the performance of our model, we split the data into a training set (80% of the data set) and a testing set (20%), where the training examples are all ordered chronologically before the testing examples. This configuration mimics the task of predicting future numbers of taxi pickups using only past data.
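The binning and the chronological split can be sketched as follows. The timestamps, zip codes and helper names are illustrative assumptions, not the project's actual code:

```python
from collections import Counter
from datetime import datetime

def ten_min_bin(ts):
    """Map a pickup timestamp to its 10-minute bin within the day (0..143)."""
    t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    return t.hour * 6 + t.minute // 10

def chronological_split(rows, train_frac=0.8):
    """80/20 split with every training row strictly before the test rows."""
    rows = sorted(rows)
    cut = int(len(rows) * train_frac)
    return rows[:cut], rows[cut:]

# Hypothetical pickup log: (timestamp, region ID via zip code)
pickups = [
    ("2014-01-05 08:03:10", "10001"),
    ("2014-01-05 08:07:55", "10001"),
    ("2014-01-05 08:14:02", "10001"),
]

counts = Counter((zip_code, ten_min_bin(ts)) for ts, zip_code in pickups)
print(counts[("10001", 48)])  # bin 48 covers 08:00-08:10, so 2 pickups
```

Each (region, bin) pair then becomes one training example whose target is the pickup count.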
To train the network, we encode the historical dataset and add the date, day of week and time as impacting factors; these features form the input to the feed-forward neural network that produces the prediction.
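One plausible encoding of those impacting factors is sketched below. The feature layout (one-hot day of week, scaled time bin, scaled day of month) is an assumption for illustration, not the report's documented scheme:

```python
from datetime import datetime

def encode_features(ts):
    """Illustrative feature vector: one-hot day of week (7 values),
    plus the 10-minute time bin and day of month scaled to [0, 1]."""
    t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    day_onehot = [1.0 if i == t.weekday() else 0.0 for i in range(7)]
    time_bin = (t.hour * 6 + t.minute // 10) / 143.0
    day_of_month = t.day / 31.0
    return day_onehot + [time_bin, day_of_month]

vec = encode_features("2014-03-07 17:35:00")  # a Friday evening
print(len(vec))  # 9 features
```

Categorical factors such as the weekday are one-hot encoded so the network does not impose an artificial ordering on them.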
LSTM Model –
Here we transform the time series into a supervised learning problem: we use the observation at the previous time step, say t-1, as the input and the observation at the current time step t as the output.
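This lag-based transformation can be sketched in one small function (the series values are illustrative):

```python
def series_to_supervised(series, lag=1):
    """Pair each observation at time t-lag (input) with the value at t (output)."""
    return [(series[i - lag], series[i]) for i in range(lag, len(series))]

print(series_to_supervised([5, 8, 6, 9]))  # [(5, 8), (8, 6), (6, 9)]
```

Each tuple is one training example for the LSTM: the first element is the input at t-1, the second the target at t.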
The dataset includes the pickup and drop-off dates, times, and locations of taxis in New York City. We selected 6 months of logs, containing more than 100 million instances.
Below, we call this dataset the NYC dataset. For each dataset, we use the first 70% as the training instances and the remaining 30% as the test instances. If a model needs hyperparameter tuning, we further divide the training instances into training (60% of the full data) and validation (10%) sets.
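The chronological 60/10/30 split described above can be sketched as follows (the helper name and toy data are illustrative):

```python
def three_way_split(rows):
    """Chronological split of the full dataset into train (60%),
    validation (10%) and test (30%) sets, oldest rows first."""
    n = len(rows)
    train = rows[:int(n * 0.6)]
    val = rows[int(n * 0.6):int(n * 0.7)]
    test = rows[int(n * 0.7):]
    return train, val, test

train, val, test = three_way_split(list(range(100)))
print(len(train), len(val), len(test))  # 60 10 30
```

Keeping the slices contiguous in time prevents leakage of future information into training or validation.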
We use an encoder-decoder (as an autoencoder) together with a fully connected feed-forward network, used to predict the number of trips in a city based on previous data or to detect anomalies in real time.
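The anomaly-detection side of the autoencoder rests on reconstruction error: windows the model reconstructs poorly are flagged. The sketch below illustrates only that principle; a real encoder-decoder would learn the reconstruction, whereas here the per-position training mean stands in for the decoder output, and the threshold rule is an assumption:

```python
def anomaly_scores(train_windows, new_windows):
    """Flag windows whose reconstruction error exceeds a threshold.
    The per-position training mean is a stand-in for a learned decoder."""
    w = len(train_windows[0])
    mean = [sum(win[i] for win in train_windows) / len(train_windows)
            for i in range(w)]
    def err(win):
        return sum((a - b) ** 2 for a, b in zip(win, mean)) ** 0.5
    # illustrative rule: twice the worst error seen on the training data
    threshold = 2 * max(err(win) for win in train_windows)
    return [(err(win), err(win) > threshold) for win in new_windows]

normal = [[10, 12], [11, 13], [9, 11]]
scores = anomaly_scores(normal, [[10, 12], [50, 60]])
print(scores[1][1])  # True: the [50, 60] window is flagged as anomalous
```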
—– Akansha Kaushik
Hadoop is an open-source software framework used to store and process big data. Hadoop distributes large data sets across clusters of computers and analyzes them in parallel. The distributed processing of data sets can