Important Machine Learning Concepts Part – 1
- Naveen
Features
Input data/variables used by the ML model.
Feature Engineering
Transforming input features to make them more useful for the model, e.g., mapping categories to buckets, normalizing values between -1 and 1, or removing null values.
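
A minimal sketch of these transformations with pandas; the column names (age, city, income) and values are made-up examples, not from the article.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [22, 35, None, 58],
    "city": ["NY", "SF", "NY", "LA"],
    "income": [40_000, 85_000, 62_000, 120_000],
})

# Remove rows containing null values.
df = df.dropna()

# Map the categorical column to integer buckets (one code per category).
df["city_bucket"] = df["city"].astype("category").cat.codes

# Normalize the numeric column to the range [-1, 1].
income = df["income"]
df["income_norm"] = 2 * (income - income.min()) / (income.max() - income.min()) - 1

print(df)
```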
Train/Eval/Test
Training data is used to optimize the model, evaluation data is used to assess the model on unseen data during training, and test data is used to produce the final result.
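
A minimal sketch of a train/validation (eval)/test split using scikit-learn's train_test_split; the 60/20/20 ratio and toy data are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(-1, 1)   # toy features
y = np.arange(100)                  # toy targets

# First hold out 20% of the data as the final test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Then carve a validation (evaluation) set out of the remaining data.
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```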
Classification/Regression
Regression predicts a continuous quantity (e.g., a housing price); classification predicts discrete class labels (e.g., red/blue/green).
Linear Regression
Predicts an output by multiplying input features by weights, summing the results, and adding a bias.
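
A minimal sketch of the prediction step as a weighted sum of the input features plus a bias, written with NumPy; the numbers are made up.

```python
import numpy as np

features = np.array([2.0, 3.0, 1.0])   # input features x
weights = np.array([0.5, -1.2, 2.0])   # learned weights w
bias = 0.7                             # learned bias b

# y_hat = w . x + b
y_hat = np.dot(weights, features) + bias
print(y_hat)  # 0.5*2.0 + (-1.2)*3.0 + 2.0*1.0 + 0.7 ~= 0.1
```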
Logistic Regression
Similar to linear regression, but the weighted sum is passed through a sigmoid so the output is a probability (typically used for classification).
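
A minimal sketch, assuming a standard sigmoid on top of the same weighted sum as above, so the output lands in (0, 1) and can be read as a probability; the values are made up.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

features = np.array([2.0, 3.0, 1.0])
weights = np.array([0.5, -1.2, 2.0])
bias = 0.7

z = np.dot(weights, features) + bias   # linear part, as in linear regression
p = sigmoid(z)                         # probability of the positive class
print(p)  # ~0.52
```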
Overfitting
Model performs very well on the training data but poorly on the test data (combat it with dropout, early stopping, or by reducing the number of nodes or layers).
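
A minimal sketch of two of the countermeasures mentioned above, dropout and early stopping, using Keras; the layer sizes, dropout rate, and random data are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

# Toy data: 200 samples, 10 features, binary labels.
X = np.random.rand(200, 10).astype("float32")
y = np.random.randint(0, 2, size=200)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(10,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dropout(0.5),          # randomly drop units during training
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Stop training once the validation loss stops improving.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True)

model.fit(X, y, validation_split=0.2, epochs=50,
          callbacks=[early_stop], verbose=0)
```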
Underfitting
Model performs well on neither the training data nor the test data, generating a high error rate on both the training set and unseen data.
Bias/Variance
Bias is error from overly simple assumptions about the data; variance is error from sensitivity to the particular training set. High variance often means overfitting, while high bias often means a model that is too simple (underfitting).
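
A minimal sketch contrasting the two failure modes by fitting the same noisy data with an overly simple and an overly flexible polynomial; the data, degrees, and seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of a sine curve: training points and held-out points.
x_train = np.linspace(0, 1, 20)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, size=20)
x_val = rng.uniform(0, 1, size=20)
y_val = np.sin(2 * np.pi * x_val) + rng.normal(0, 0.2, size=20)

for degree in (1, 12):
    coeffs = np.polyfit(x_train, y_train, degree)   # fit a polynomial
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    val_mse = np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.3f}, val MSE {val_mse:.3f}")

# The degree-1 fit (high bias) typically has similarly high error on both
# sets; the degree-12 fit (high variance) typically has much lower training
# error than validation error, i.e., it overfits.
```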
Regularization
A variety of approaches to reduce overfitting, including adding a penalty on the weights to the loss function (L1/L2) and randomly dropping units during training (dropout).
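
A minimal sketch of one of these approaches, an L2 weight penalty added to a mean-squared-error loss; the toy data and the lambda value are illustrative assumptions.

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -0.3])
b = 0.1
lambda_ = 0.01   # regularization strength

predictions = X @ w + b
data_loss = np.mean((predictions - y) ** 2)   # ordinary MSE on the data
l2_penalty = lambda_ * np.sum(w ** 2)         # penalty that discourages large weights
total_loss = data_loss + l2_penalty
print(total_loss)
```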