Volunteers

...

Un-supervised

  1. Clustering - hierarchical clustering, k-means, mixture models, DBSCAN, and the OPTICS algorithm
  2. Anomaly Detection - Local Outlier Factor and Isolation Forest
  3. Dimensionality Reduction - Principal component analysis, Independent component analysis, Non-negative matrix factorization, Singular value decomposition

...

Name | Comments on Applicability | Reference
Hierarchical Clustering
  1. Produces a hierarchy of N-1 successive merges, so any number of clusters can be chosen afterwards.
  2. Expensive and slow: an n×n distance matrix has to be built.
  3. Does not scale to very large datasets.
  4. Results are reproducible.
  5. Tends to work less well than k-means when the clusters are hyper-spherical.
  6. Can provide insight into how the data points are grouped.
  7. Supports various linkage methods besides centroid linkage (for example single, complete, and average).
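
To make the points above concrete, here is a minimal sketch of agglomerative clustering in plain Python. It is only an illustration under assumptions: the five sample points and the helper names `pairwise_distances` and `agglomerate` are made up for this page, and only single and complete linkage are covered. It shows the n×n distance matrix, the N-1 merges, and the fact that no randomness is involved.

```python
from itertools import combinations
import math

def pairwise_distances(points):
    """Build the full n x n Euclidean distance matrix (the quadratic cost noted above)."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = math.dist(points[i], points[j])
    return dist

def agglomerate(points, linkage="single"):
    """Repeatedly merge the two closest clusters; return the list of merges."""
    dist = pairwise_distances(points)
    # Linkage = how the distance between two clusters is derived from point distances.
    reduce_fn = {"single": min, "complete": max}[linkage]
    clusters = [(i,) for i in range(len(points))]   # start with one cluster per point
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: reduce_fn(dist[i][j] for i in pair[0] for j in pair[1]),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
        merges.append((a, b))
    return merges

# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
merges = agglomerate(points, linkage="single")
print(len(merges), "merges for", len(points), "points")
for step_no, (a, b) in enumerate(merges, start=1):
    print(step_no, a, "+", b)
```

Stopping the merge loop early, or replaying only the first few merges, gives any desired number of clusters without re-running the algorithm. For real workloads a library implementation would normally be used instead of this quadratic-memory sketch.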

k-means
  1. Requires the number of clusters to be specified in advance.
  2. Less computationally intensive than hierarchical clustering.
  3. Suited to large datasets.
  4. The starting centroids can be chosen at random, so results can differ from run to run.
  5. Works best when the clusters are roughly hyper-spherical (circular in two dimensions).
  6. Simply divides the data into mutually exclusive subsets, giving little insight into how the division arises.
  7. Uses the mean to compute the centroid that represents each cluster (the median is used by the k-medians variant).
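
The following is a rough sketch of the k-means loop (Lloyd's algorithm) in plain Python, assuming Euclidean distance and a fixed number of iterations. The function name `kmeans`, the `seed` parameter, and the sample points are illustrative assumptions. The seed controls which points are used as starting centroids, which is where the run-to-run variation comes from.

```python
import math
import random

def kmeans(points, k, seed, iterations=20):
    """Assign each point to its nearest centroid, then move centroids to the cluster mean."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # random starting centroids
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        # New centroid = coordinate-wise mean; an empty cluster keeps its old centroid.
        centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
    return centroids, labels

# Hypothetical sample data: two well-separated, roughly round blobs.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, labels = kmeans(points, k=2, seed=0)
print(centroids, labels)
```

On data this well separated, different seeds should recover the same two groups, although the cluster indices may be swapped. On overlapping or elongated clusters the partitions themselves can differ between seeds, which is why several restarts are commonly used.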

Gaussian Mixture Models

Reinforcement Learning

...

  1. Policy Optimization
  2. Q-Learning

    Policy Optimization | Q-Learning
    Optimizes the policy parameters either directly, by gradient ascent on the performance objective, or indirectly, by maximizing local approximations of it. | Learns an approximator for the optimal action-value function.
    Performed on-policy: each update only uses data collected while acting according to the most recent version of the policy. | Performed off-policy: each update can use data collected at any point during training.
    Directly optimizes for the thing you want. | Only indirectly optimizes for agent performance.
    Tends to be more stable. | Tends to be less stable.
    Less sample efficient and takes longer to learn, because the learning data available at each iteration is limited. | Substantially more sample efficient when it works, because it can reuse data more effectively.
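
As an illustration of the off-policy property, here is a small tabular Q-learning sketch. Everything about the environment is an assumption made for this example: a five-state chain where action 1 moves right, action 0 moves left, and reaching the last state gives reward 1. Transitions are collected once with a purely random behaviour policy and then replayed many times, which is the data reuse referred to above.

```python
import random

N_STATES, ACTIONS = 5, (0, 1)     # hypothetical chain: action 0 = left, 1 = right
GOAL = N_STATES - 1

def step(state, action):
    """Deterministic toy environment: reward 1.0 only when the goal state is reached."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def collect(episodes, rng):
    """Gather transitions with a uniformly random behaviour policy."""
    buffer = []
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)
            nxt, reward, done = step(state, action)
            buffer.append((state, action, reward, nxt, done))
            state = nxt
    return buffer

def q_learning(buffer, sweeps=50, alpha=0.5, gamma=0.9):
    """Replay stored transitions; the target uses the greedy value of the next state."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for state, action, reward, nxt, done in buffer:   # old data is reused every sweep
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
    return q

q = q_learning(collect(episodes=200, rng=random.Random(0)))
greedy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(GOAL)]
print(greedy)
```

The data come from a random policy, yet the learned greedy policy is a different one. A policy optimization method would instead need fresh data from the current policy for each update.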



  • Value-based methods
    • Q-learning, Deep Q-learning: learn a value function that maps each state-action pair to a value.
    • Find the best action to take in each state: the action with the largest value.
    • Work well when the set of actions is finite.
  • Policy-based methods
    • REINFORCE with policy gradients.
    • Directly optimize the policy without using a value function.
    • Useful when the action space is continuous or the policy needs to be stochastic.
    • Use the total reward of the episode.
    • The problem is finding a good score function to compute how good a policy is.
  • Hybrid method
    • Actor-Critic method
      • Policy learning + value learning.
      • Policy function → Actor: chooses which moves to make.
      • Value function → Critic: judges how well the agent is performing.
      • An update is made at each step (TD learning).
      • Because an update is made at each time step, the total reward R(t) cannot be used.
      • Both learn in parallel, like GANs.
      • Not stable in its basic form, but several variants are more stable.
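
The actor-critic idea can be sketched as a one-step, tabular update. This is a simplified illustration, not a reference implementation. It assumes a made-up five-state chain task, a softmax actor over per-state action preferences, and a table of state values as the critic. The step sizes and episode count are arbitrary choices.

```python
import math
import random

N_STATES, GOAL, GAMMA = 5, 4, 0.9   # hypothetical chain task; action 0 = left, 1 = right

def step(state, action):
    """Deterministic toy environment: reward 1.0 only when the goal state is reached."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def policy(prefs, state):
    """Actor: softmax over the two action preferences of this state."""
    exps = [math.exp(p) for p in prefs[state]]
    total = sum(exps)
    return [e / total for e in exps]

def train(episodes=3000, alpha_actor=0.05, alpha_critic=0.2, max_steps=200, seed=0):
    rng = random.Random(seed)
    prefs = [[0.0, 0.0] for _ in range(N_STATES)]   # actor parameters
    values = [0.0] * N_STATES                       # critic: state-value estimates
    for _ in range(episodes):
        state = 0
        for _step in range(max_steps):
            probs = policy(prefs, state)
            action = 0 if rng.random() < probs[0] else 1
            nxt, reward, done = step(state, action)
            # TD error: the critic's one-step judgement of the move just made.
            td_error = reward + (0.0 if done else GAMMA * values[nxt]) - values[state]
            values[state] += alpha_critic * td_error
            # Actor update: d log pi(a|s) / d pref[b] = 1[a == b] - pi(b|s).
            for b in (0, 1):
                grad = (1.0 if b == action else 0.0) - probs[b]
                prefs[state][b] += alpha_actor * td_error * grad
            if done:
                break
            state = nxt
    return prefs, values

prefs, values = train()
right_prob = [policy(prefs, s)[1] for s in range(GOAL)]
print([round(p, 2) for p in right_prob])
```

Both tables are updated after every single step, using the TD error in place of the total episode reward that REINFORCE would wait for. In this toy task the actor should come to prefer moving right in every state.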

Algorithms

Name | Comments on Applicability | Reference

Q Learning









...