Volunteers
Name | ML Category |
---|---|
Jahanvi | Supervised |
Akanksha | Unsupervised |
Kanak Raj | Reinforcement |
Supervised
- Supervised learning algorithms make predictions based on a set of labeled examples.
- Classification: When the data are used to predict a categorical variable, the supervised learning task is called classification. This is the case when assigning a label or indicator, such as "dog" or "cat", to an image. With only two labels it is called binary classification; with more than two categories it is called multi-class classification.
- Regression: When the value being predicted is continuous, the task is called regression.
- Forecasting: This is the process of making predictions based on past and present data. It is most commonly used to analyze trends. A common example is estimating next year's sales from the sales of the current and previous years.
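
To make the difference between classification and regression concrete, here is a minimal sketch using k-nearest neighbours (KNN), written in plain Python. The toy data, the function names, and the choice of `k=3` are illustrative assumptions, not part of this project. The same neighbour search feeds both tasks; only the way the neighbours' targets are combined changes.

```python
from collections import Counter
import math

def nearest_neighbors(train, query, k):
    """Return the k training rows closest to `query` (Euclidean distance)."""
    return sorted(train, key=lambda row: math.dist(row[0], query))[:k]

def knn_classify(train, query, k=3):
    """Classification: predict a category by majority vote among the neighbours."""
    labels = [label for _, label in nearest_neighbors(train, query, k)]
    return Counter(labels).most_common(1)[0][0]

def knn_regress(train, query, k=3):
    """Regression: predict a continuous value by averaging the neighbours' targets."""
    targets = [target for _, target in nearest_neighbors(train, query, k)]
    return sum(targets) / len(targets)

# Hypothetical toy data: (features, target) pairs.
animals = [((1.0, 1.0), "cat"), ((1.2, 0.8), "cat"), ((0.9, 1.1), "cat"),
           ((3.0, 3.2), "dog"), ((3.1, 2.9), "dog"), ((2.8, 3.0), "dog")]
prices = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((4.0,), 40.0)]

label = knn_classify(animals, (2.9, 3.1))  # a category (binary classification here)
value = knn_regress(prices, (2.0,), k=3)   # a continuous number
```

With two possible labels this is binary classification; adding a third label to `animals` would make it multi-class without changing the code.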
Algorithms
Name | Comments on Applicability | Reference |
---|---|---|
LOGISTIC REGRESSION | | |
KNN | | |
SUPPORT VECTOR MACHINE | | |
Kernel SVM | | |
RBF Kernel | Rule of thumb: use linear SVMs for linear problems, and nonlinear kernels such as the RBF kernel for non-linear problems. | |
NAIVE BAYES | | |
DECISION TREE CLASSIFICATION | | |
RANDOM FOREST CLASSIFICATION | | |
GRADIENT BOOSTING CLASSIFICATION | | |
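
The linear versus RBF rule of thumb can be illustrated on the XOR problem, which no straight line can separate. The sketch below is not an SVM: to stay dependency-free it swaps in a kernel perceptron, a simpler kernel-based classifier, so it only demonstrates the effect of the kernel choice. All function names, `gamma=1.0`, and the epoch limit are assumptions made for illustration.

```python
import math

def linear_kernel(a, b):
    """Plain dot product: yields a linear decision boundary."""
    return sum(x * y for x, y in zip(a, b))

def rbf_kernel(a, b, gamma=1.0):
    """RBF (Gaussian) kernel: similarity decays with squared distance."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def decision(X, y, alpha, kernel, x):
    """Kernel expansion of the decision function; the +1 acts as a bias term."""
    return sum(a * yj * (kernel(xj, x) + 1) for a, xj, yj in zip(alpha, X, y))

def train_kernel_perceptron(X, y, kernel, epochs=100):
    """Return dual coefficients; alpha[i] counts the mistakes made on point i."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, xi in enumerate(X):
            if y[i] * decision(X, y, alpha, kernel, xi) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return alpha

def predict(X, y, alpha, kernel, x):
    return 1 if decision(X, y, alpha, kernel, x) > 0 else -1

# XOR: the classic non-linear problem.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, 1, 1, -1]

def accuracy(kernel):
    alpha = train_kernel_perceptron(X, y, kernel)
    hits = sum(predict(X, y, alpha, kernel, xi) == yi for xi, yi in zip(X, y))
    return hits / len(X)

linear_acc = accuracy(linear_kernel)  # cannot reach 1.0: XOR is not linearly separable
rbf_acc = accuracy(rbf_kernel)        # the RBF kernel separates all four points
```

On a problem that is linearly separable, both kernels would fit the training data, and the linear one is cheaper to train and easier to interpret, which is the reasoning behind the rule of thumb.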
Unsupervised
- Clustering: hierarchical clustering, k-means, mixture models, DBSCAN, and OPTICS
- Anomaly detection: Local Outlier Factor and Isolation Forest
- Dimensionality reduction: principal component analysis, independent component analysis, non-negative matrix factorization, and singular value decomposition
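
As a small illustration of clustering, the following is a rough sketch of k-means (Lloyd's algorithm) in plain Python. No labels are involved; the algorithm only alternates between assigning points to the nearest centroid and recomputing the centroids. The data points and the hand-picked starting centroids are hypothetical, and real implementations typically choose the starting centroids more carefully.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2-D data with two visually obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
```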
Algorithms
...
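- Hierarchical (agglomerative) clustering on N points performs N-1 successive merges, so there is a hierarchy of N-1 candidate clusterings to choose from.
- Expensive and slow: an n×n distance matrix has to be computed and stored.
- Does not scale to very large datasets.
- Results are reproducible, since there is no random initialization.
- Does not work well with hyper-spherical clusters.
- Can provide insight into how the data points are grouped (for example via a dendrogram).
- Supports various linkage methods (not only centroid linkage).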
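
The points above can be seen in a naive sketch of agglomerative clustering in plain Python. It builds the full n×n distance matrix, performs exactly N-1 merges, and is deterministic. The function name, the `linkage` parameter, and the sample points are assumptions for illustration; passing `min` gives single linkage and `max` gives complete linkage.

```python
import math

def agglomerative(points, linkage=min):
    """Merge the two closest clusters until one remains; return the merge history."""
    n = len(points)
    # Full n x n pairwise distance matrix: the quadratic memory cost.
    dist = [[math.dist(p, q) for q in points] for p in points]
    clusters = [[i] for i in range(n)]  # start with every point on its own
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Cluster-to-cluster distance according to the linkage rule.
                d = linkage(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges

# Hypothetical sample: two tight pairs and one isolated point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
merges = agglomerative(points)  # list of (cluster_a, cluster_b, distance)
```

Each entry in `merges` records which clusters (as lists of point indices) were joined and at what distance, which is the information a dendrogram displays. Cutting the history after any merge gives one of the candidate clusterings.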
Name | Comments on Applicability | Reference |
---|---|---|
Hierarchical Clustering | | |
k-means | | |
Gaussian Mixture Models | | |
DBSCAN | | |
DIMENSIONALITY REDUCTION ALGORITHMS | APPLICABILITY |
---|---|
Linear Discriminant Analysis | It finds a linear combination of features that characterizes or separates two or more classes of objects or events. LDA is a supervised method, since it uses class labels. It is also sometimes used for clustering. It can outperform logistic regression, particularly when the classes are well separated, but this is not guaranteed. |
Principal Component Analysis | It performs a linear mapping of the data from a higher-dimensional space to a lower-dimensional space in such a manner that the variance of the data in the low-dimensional representation is maximized. PCA is unsupervised. |
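
To show what "maximizing the variance in the low-dimensional representation" means for PCA, here is a minimal sketch that finds only the first principal component, using power iteration on the covariance matrix in plain Python. The function name, the iteration count, and the sample data are assumptions; the sketch also assumes the data has non-zero variance, and a library implementation would normally use an SVD or a full eigendecomposition instead.

```python
import math

def first_principal_component(data, iterations=100):
    """Return (unit direction, 1-D projections) for the top principal component."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue,
    # which is the direction of maximum variance.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Linear mapping from d dimensions down to 1 dimension.
    projections = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, projections

# Hypothetical 2-D data lying along the line y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, projections = first_principal_component(data)
```

Because the sample points lie on a single line, the recovered direction is parallel to that line and the one-dimensional projections keep all of the variance. No labels are used at any step.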
Reinforcement Learning
...