In this article, you will learn about the concepts of overfitting and underfitting: what they are, which techniques reduce them, and how they differ.
What is Overfitting?
Overfitting is a term used in statistics for a modeling error that occurs when a function fits a particular dataset too closely. As a result, the model may fail to fit additional data, which can reduce the accuracy of predictions for future observations.
Techniques to reduce overfitting:
- Increase training data
- Reduce model complexity
- Use early stopping during the training phase
- Use regularization (e.g., L1 or L2) to deal with overfitting
- Use dropout for neural networks
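Of the techniques above, early stopping is the easiest to sketch without any framework. The following is a minimal illustration rather than a library API: the function name `early_stopping`, the `patience` parameter, and the validation losses are all assumptions made up for this example.

```python
def early_stopping(val_losses, patience=2):
    """Return (best_epoch, stop_epoch) for a sequence of validation losses.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs. Hypothetical helper for illustration.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Validation loss improved: remember this epoch and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    # Never ran out of patience: training ran for all epochs.
    return best_epoch, len(val_losses) - 1


# Illustrative (invented) validation losses: they fall, then start rising.
losses = [1.0, 0.8, 0.7, 0.72, 0.75, 0.9]
best_epoch, stop_epoch = early_stopping(losses, patience=2)
print(best_epoch, stop_epoch)
```

In practice the weights saved at `best_epoch` would be the ones kept, since later epochs only fit the training data more closely.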
What is Underfitting?
Underfitting refers to a model that neither models the training dataset nor generalizes to new data. An underfit machine learning model is not a suitable model, and this will be obvious because it performs poorly even on the training data.
Techniques to reduce underfitting:
- Increase model complexity
- Increase the number of features by performing feature engineering
- Remove noise from the data.
- Increase the duration of training
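The last item, training for longer, can be illustrated with plain gradient descent. This is only a sketch under assumed settings: the data (`y = 2x + 1`), the learning rate, and the step counts are invented for the example, and `train_linear` is a hypothetical helper.

```python
def train_linear(xs, ys, steps, lr=0.1):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error of the line y = w*x + b on the given points."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2 * x + 1 for x in xs]  # assumed noise-free linear target

w_short, b_short = train_linear(xs, ys, steps=5)     # stopped too early
w_long, b_long = train_linear(xs, ys, steps=2000)    # trained much longer

mse_short = mse(xs, ys, w_short, b_short)
mse_long = mse(xs, ys, w_long, b_long)
print(mse_short, mse_long)
```

The model trained for only a few steps still has a large training error even though it is flexible enough for the data, which is underfitting caused by too little training rather than too little capacity.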
Overfitting vs Underfitting:
The problem of overfitting vs underfitting finally appears when we talk about the polynomial degree. The degree represents how much flexibility the model has: a higher power gives the model the freedom to hit as many data points as possible. An underfit model is less flexible and cannot account for the data. The best way to understand this problem is to look at models that demonstrate both situations.
The first is an underfit model with a degree-1 polynomial fit. In the image on the left, the model function in orange is shown on top of the true function and the training observations. On the right, the model predictions for the test data are shown compared to the true function and the test data points.
A degree-1 polynomial model underfits on the training (left) and test (right) datasets.
Our model passes straight through the training set with no regard for the data. This is because an underfit model has low variance and high bias. Variance refers to how much the model depends on the training data. In the case of a degree-1 polynomial, the model depends very little on the training data because it pays barely any attention to the points. Instead, the model has high bias, which means that it makes a strong assumption about the data. In this example, the assumption is that the data is linear, which is clearly incorrect. When the model makes predictions on the test data, the bias leads to inaccurate estimates.
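The behaviour described above can be reproduced with a few lines of plain Python. This is a sketch, not the data behind the original figures: the true function `x**2` and the evenly spaced points are assumptions chosen so the result is easy to follow.

```python
def true_function(x):
    """Assumed curved ground truth (the original figure's data is not available)."""
    return x ** 2


def fit_line(xs, ys):
    """Ordinary least squares fit of a degree-1 polynomial y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mse(xs, ys, slope, intercept):
    """Mean squared error of the fitted line on the given points."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


train_x = [-1 + 0.2 * i for i in range(11)]    # -1.0, -0.8, ..., 1.0
test_x = [-0.9 + 0.2 * i for i in range(10)]   # points between the training points
train_y = [true_function(x) for x in train_x]
test_y = [true_function(x) for x in test_x]

slope, intercept = fit_line(train_x, train_y)
train_mse = mse(train_x, train_y, slope, intercept)
test_mse = mse(test_x, test_y, slope, intercept)
print(slope, intercept, train_mse, test_mse)
```

The fitted line is essentially flat, so it ignores the curve entirely, and the error is high on both the training and the test points. High error on the training set itself is the signature of high bias.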
How to Find the Right Balance?
Lowering high bias (underfitting):
- Use non-parametric algorithms
- Make the model more complex with more features
- Use non-linear algorithms, for example polynomial regression or a kernel function in SVM
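As a rough sketch of the second and third points, the example below fits the same least-squares line twice: once on the raw input `x`, and once on the engineered feature `x**2`. The target `3*x**2 - 1` and the helper names are assumptions for illustration, and the second model is a degree-2 polynomial without the linear term to keep the code short.

```python
def fit_line(zs, ys):
    """Least squares fit of y = slope*z + intercept for a single feature z."""
    n = len(zs)
    mean_z = sum(zs) / n
    mean_y = sum(ys) / n
    cov = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
    var = sum((z - mean_z) ** 2 for z in zs)
    slope = cov / var
    return slope, mean_y - slope * mean_z


def mse(predictions, targets):
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


xs = [-1 + 0.2 * i for i in range(11)]
ys = [3 * x ** 2 - 1 for x in xs]  # assumed curved target

# Model 1: a straight line in x (high bias for this data).
slope, intercept = fit_line(xs, ys)
mse_linear = mse([slope * x + intercept for x in xs], ys)

# Model 2: the same linear fit, but on the added feature x**2.
squared = [x ** 2 for x in xs]
slope_sq, intercept_sq = fit_line(squared, ys)
mse_quadratic = mse([slope_sq * z + intercept_sq for z in squared], ys)

print(mse_linear, mse_quadratic)
```

Adding one well-chosen feature removes the bias here because the model can now represent the shape of the data.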
Lowering high variance (overfitting):
- Use more data for training so that the model learns as much of the hidden pattern as possible and generalizes better.
- Use regularization techniques, e.g., L1, L2, dropout, early stopping (in the case of neural networks), etc.
- Tune hyperparameters to avoid overfitting. Examples: a higher value of K in KNN, tuning C and gamma for SVM, limiting tree depth in a decision tree.
- Use fewer features: select them manually, with a feature selection algorithm, or automatically with L1 regularization (L2 shrinks coefficients but does not remove features).
- Reduce model complexity: lower the polynomial degree in the case of polynomial regression and logistic regression.
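The effect of K in KNN mentioned in the list can be sketched with a tiny nearest-neighbour regressor. Everything here is assumed for illustration: the data follows `y = x` with an artificial alternating noise of +1 and -1 on the training targets, and `knn_predict` is a hypothetical helper, not a library function.

```python
def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points closest to `query`."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))[:k]
    return sum(train_y[i] for i in nearest) / k


def knn_mse(xs, ys, train_x, train_y, k):
    """Mean squared error of k-NN predictions on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


# Noisy training set: true relation y = x, plus alternating +1 / -1 noise.
train_x = [float(i) for i in range(10)]
noise = [1.0 if i % 2 == 0 else -1.0 for i in range(10)]
train_y = [x + n for x, n in zip(train_x, noise)]

# Noise-free test points between the training points.
test_x = [i + 0.4 for i in range(9)]
test_y = list(test_x)

train_mse_k1 = knn_mse(train_x, train_y, train_x, train_y, k=1)
test_mse_k1 = knn_mse(test_x, test_y, train_x, train_y, k=1)
test_mse_k4 = knn_mse(test_x, test_y, train_x, train_y, k=4)
print(train_mse_k1, test_mse_k1, test_mse_k4)
```

With K = 1 the model memorizes the training set, noise included, so its training error is zero while its test error is high. Averaging over four neighbours cancels much of the noise and lowers the test error.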
Overfitting and underfitting are a major problem that trips up even experienced data analysts. I have seen many grad students fit a model with very low error on their data. Their model looks great, but the problem is that they never used a test set, let alone a validation set! The model is nothing more than an overfit representation of the training data, a lesson the student learns as soon as someone else tries to apply the model to new data.
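Holding out data is simple to do. The sketch below shows one possible way to split a dataset into training, validation, and test sets using only the standard library. The function name `train_val_test_split`, the 60/20/20 proportions, and the seed are assumptions for this example.

```python
import random


def train_val_test_split(items, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the items and split them into disjoint train, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Placeholder dataset: 100 sample indices.
train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))
```

The validation set is used to choose hyperparameters or decide when to stop training, and the test set is touched only once at the end to estimate performance on new data.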