Predictive Modelling Process:
Once the dataset is available, it is sent for processing and cleaning. The refined dataset is then split into training and test sets, here in a 70:30 ratio. The larger set is used to train the model, whereas the test set is held back to evaluate the performance of the final model at the very end. Many learning algorithms can be used to train the model, including Random Forest, Support Vector Machine (SVM), Naive Bayes, Artificial Neural Networks (ANN) and Decision Tree classifiers. Techniques such as cross-validation are used during model creation and refinement to estimate performance without touching the test set. Popular tools include Python (with the scikit-learn library), R, SAS, Mathematica and MATLAB. Once the model is ready, it is evaluated on the test data. There are many techniques for evaluating a model, and the appropriate one depends on the type of model (regression or classification) and the problem domain; for example, accuracy is a common measure for classification and mean squared error for regression.
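The workflow described above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the synthetic two-class data, the toy nearest-centroid "model" and all function names here are assumptions made for the example, not part of any particular platform. In practice a library such as scikit-learn offers ready-made equivalents (for instance `train_test_split` and `cross_val_score`) along with the algorithms listed above.

```python
import random
from statistics import mean


def train_test_split(rows, test_ratio=0.3, seed=0):
    """Shuffle the rows and return (train, test), with test_ratio held out."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(rows):
    """Toy stand-in for a learning algorithm: one mean vector per class."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {label: [mean(col) for col in zip(*vectors)]
            for label, vectors in by_label.items()}


def predict(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def accuracy(centroids, rows):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(predict(centroids, features) == label for features, label in rows)
    return hits / len(rows)


def cross_val_accuracy(rows, k=5):
    """k-fold cross-validation: train on k-1 folds, score on the held-out fold."""
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        training = [row for j, fold in enumerate(folds) if j != i for row in fold]
        scores.append(accuracy(fit_centroids(training), held_out))
    return sum(scores) / k


# Synthetic, well-separated two-class data (an assumption for illustration).
rng = random.Random(42)
data = [([rng.gauss(0, 1), rng.gauss(0, 1)], "a") for _ in range(50)]
data += [([rng.gauss(6, 1), rng.gauss(6, 1)], "b") for _ in range(50)]

train, test = train_test_split(data, test_ratio=0.3)  # 70% train, 30% test
cv_score = cross_val_accuracy(train)                  # used while refining the model
model = fit_centroids(train)                          # final model, fitted on all training data
test_score = accuracy(model, test)                    # evaluated once, at the very end

print(f"train={len(train)} test={len(test)}")
print(f"cross-validation accuracy={cv_score:.2f} test accuracy={test_score:.2f}")
```

Note that the test set is used only once, after all model choices have been made on the basis of the cross-validation score; this keeps the final performance estimate honest.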
As a complete solution for predictive modelling, the ALTEN Calsoft Labs’ Predictive Analytics Platform provides multiple microservices covering data processing, analysis and, finally, data visualization.