Data-set preparation

Before data can be used to design a neural network, it typically goes through four data preparation steps.


Figure 3.1: Data preparation for neural network design.


  1. Raw data is first collected.
  2. In the data processing step, the data-set is cleaned by removing corrupted and incorrect records. Transformation techniques may be used to extract useful features or to reduce the dimensionality of the data. Categorical variables in the data-set are also converted to numerical values that a neural network can process.
  3. Data labeling may be applied to assign a target value to each record.
  4. The data-set is then divided into three sets: the training set is used to train the neural network, the validation set is used to detect and limit over-fitting during training, and the test set is used to evaluate how well the trained neural network copes with completely new data.
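
The processing and splitting steps above can be illustrated with a minimal Python sketch. It is not DLHUB code: the function names (clean_records, one_hot_encode, split_dataset), the sample records, and the 70/15/15 split ratio are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import random


def clean_records(records):
    """Step 2a: drop records that contain a missing (None) value."""
    return [r for r in records if all(v is not None for v in r.values())]


def one_hot_encode(records, column):
    """Step 2b: replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        row = {k: v for k, v in r.items() if k != column}
        for c in categories:
            row[f"{column}={c}"] = 1 if r[column] == c else 0
        encoded.append(row)
    return encoded


def split_dataset(records, train_pct=70, val_pct=15, seed=0):
    """Step 4: shuffle, then split into training, validation and test sets.

    The percentages are illustrative assumptions; the remainder after the
    training and validation sets becomes the test set.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical raw data: 20 valid records plus one corrupted record.
colors = ["red", "green", "blue"]
raw = [{"size": float(i), "color": colors[i % 3], "label": i % 2} for i in range(20)]
raw.append({"size": None, "color": "red", "label": 0})

cleaned = clean_records(raw)
encoded = one_hot_encode(cleaned, "color")
train_set, val_set, test_set = split_dataset(encoded)

print(len(train_set), len(val_set), len(test_set))
```

Shuffling before splitting keeps the three sets statistically similar, and fixing the seed makes the split reproducible between runs.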


DLHUB supports three data-set formats; please refer to the section "Load Training Data - Supported Files" for more details.


Note: ANSCENTER is continuously working on supporting new data-set formats. These formats will be introduced in future releases.