From confusion matrix to ROC graph - object-recognition

I recently implemented a Bag of Words categorization algorithm based on the one described in this paper.
All works well, but I'd like to measure the accuracy of the classifiers using ROC curves or, perhaps, precision-recall graphs.
I can easily get the confusion matrix for each of the classifiers, but I don't know what parameter I should change to get more points and actually plot the curves.
Could someone please explain this to me?

I think the classifier needs to output continuous scores rather than discrete labels to draw a ROC curve. If the predictions are continuous scores, you can sweep a decision threshold over them to calculate the points of the ROC curve. If the predictions are just one of two class labels (discrete values), you will only get a single point on the ROC curve.
http://en.wikipedia.org/wiki/Receiver_operating_characteristic
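A minimal sketch of this idea, assuming Python/NumPy (the question does not say which framework is used) and made-up labels and scores: each threshold produces one hard labelling, and thus one (FPR, TPR) point of the curve.

```python
import numpy as np

def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per distinct score used as a threshold."""
    thresholds = np.unique(scores)[::-1]          # sweep thresholds from high to low
    P = np.sum(y_true == 1)
    N = np.sum(y_true == 0)
    points = []
    for t in thresholds:
        y_pred = (scores >= t).astype(int)        # one hard labelling per threshold
        tp = np.sum((y_pred == 1) & (y_true == 1))
        fp = np.sum((y_pred == 1) & (y_true == 0))
        points.append((fp / N, tp / P))           # (false positive rate, true positive rate)
    return points

# Hypothetical example: 8 samples with ground-truth labels and classifier scores.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
scores = np.array([0.9, 0.8, 0.7, 0.55, 0.5, 0.3, 0.2, 0.1])
for fpr, tpr in roc_points(y_true, scores):
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```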

Related

Plot ROC curve for ANN and SVM

I'm using ANN and SVM for classification of 6 classes. All works well, but I'd like to measure accuracies of the classifiers using ROC curves.
I can easily get the confusion matrix for each of the classifiers but I don't know what parameter I should change to get more points and actually plot the ROC curves.
Could someone help me, please?

Recall Precision Curve for clustering algorithms

I would like to know whether a precision-recall curve is relevant for clustering algorithms, for example when using unsupervised learning techniques such as Mean Shift or DBSCAN (or is it relevant only for classification algorithms?). If yes, how do I get the plot points for low recall values? Is it allowed to change the model parameters to get low recall rates for a model?
PR curves (and ROC curves) require a ranking.
E.g. a classifier score that can be used to rank objects by how likely they are to belong to class A or not.
In clustering, you usually do not have such a ranking.
Without a ranking, you don't get a curve. Also, what is precision and recall in clustering? Use ARI and NMI for evaluation.
But there are unsupervised methods such as outlier detection where, e.g., the ROC curve is a fairly common evaluation method. The PR curve is more problematic, because precision is not defined at recall 0 and one shouldn't linearly interpolate. Thus, the popular "area under the curve" is not well defined for PR curves. Since there are a dozen other measures, I'd avoid PR-AUC because of this.
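To make the ranking point concrete, here is a small sketch (plain Python/NumPy, invented data, not tied to any particular clustering or outlier library): precision and recall are computed at every cut-off of a score-ordered list, so without per-object scores there is nothing to rank and hence no curve.

```python
import numpy as np

def pr_points(y_true, scores):
    """Precision/recall at every prefix of the ranking induced by the scores."""
    scores = np.asarray(scores)
    order = np.argsort(-scores)                   # rank objects by descending score
    y_ranked = np.asarray(y_true)[order]
    total_pos = y_ranked.sum()
    points = []
    tp = 0
    for k, label in enumerate(y_ranked, start=1): # treat the top-k as "positive"
        tp += label
        points.append((tp / k, tp / total_pos))   # (precision@k, recall@k)
    return points

y_true = [1, 0, 1, 0, 0, 1, 0, 0]                 # 1 = true outlier / positive class
scores = [0.95, 0.9, 0.7, 0.6, 0.5, 0.4, 0.3, 0.1]
for prec, rec in pr_points(y_true, scores):
    print(f"precision={prec:.2f}  recall={rec:.2f}")
```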

Machine learning clustering algorithms: k-means and Gaussian mixtures [closed]

Suppose we clustered a set of N data points using two different clustering algorithms: k-means and Gaussian mixtures. In both cases we obtained 5 clusters and in both cases the centers of the clusters are exactly the same. Can 3 points that are assigned to different clusters in the k-means solution be assigned to the same cluster in the Gaussian mixture solution? If no, explain. If so, sketch an example or explain in 1-2 sentences.
From my understanding of machine learning theory, the Gaussian Mixture Model (GMM) and K-Means differ in the fundamental setting that K-Means is a hard clustering algorithm, while GMM is a soft clustering algorithm. K-Means will assign every point to exactly one cluster, whereas GMM will give you a probability distribution describing how likely the point is to belong to each of the 5 clusters. Furthermore, this also depends on the kind of parameters you are using for GMM. It is possible for GMM to produce clusters somewhat similar to K-Means if you use a constant variance.
Now, I am not sure about this, because you need to provide more information on how you are picking hard clusters from GMM and how you are calculating the cluster centers. If you are just making a hard assignment from GMM based on the cluster with the maximum probability, then it could be possible that the points get assigned to the same clusters. In my opinion this will be possible only if the data points are easily separable and your GMM is assuming constant variance.
As far as the cluster centers go, it depends on the way you are calculating them. If you are using the mean vectors obtained from GMM, then it is very unlikely that K-Means and GMM will give you the same cluster centers. On the other hand, if you are first generating hard clusters as mentioned above and then calculating the centers yourself, then it could be possible that they have the same centers when the hard clustering of all your points is the same in both K-Means and GMM.
I think you should provide more information about the way you are doing this, so that the community members can better help you. You should also identify your use case well and decide whether you need hard or soft clustering. Choose GMM only if you desire soft clustering and/or you have a prior belief that the data points in each cluster were generated from a Gaussian distribution.
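As an illustration of the hard vs. soft assignment point, here is a hedged sketch that assumes scikit-learn is available (the question does not name any library); it derives hard labels from the GMM posteriors by taking the maximum-probability component and counts how often they disagree with k-means on synthetic, elongated clusters where covariance matters:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two elongated, overlapping blobs so that the covariance structure matters.
X = np.vstack([
    rng.normal([0, 0], [3.0, 0.5], size=(200, 2)),
    rng.normal([4, 2], [0.5, 3.0], size=(200, 2)),
])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
gmm = GaussianMixture(n_components=2, covariance_type='full', random_state=0).fit(X)

km_labels = km.predict(X)                     # hard assignment
gmm_post = gmm.predict_proba(X)               # soft assignment (posterior per component)
gmm_labels = gmm_post.argmax(axis=1)          # hard labels derived from the posteriors

# Fraction of points assigned differently (up to label permutation, 2 clusters).
disagree = min(np.mean(km_labels != gmm_labels),
               np.mean(km_labels != 1 - gmm_labels))
print(f"fraction of points assigned differently: {disagree:.2%}")
```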

Algorithm for path simplification and smoothing of 2D trajectories

I'm searching for an algorithm for path simplification and smoothing of 2D trajectories. So I have an ordered list of 2D points. These points should be simplified, e.g. with the Ramer–Douglas–Peucker algorithm. But the output must be smooth, so the resulting path should be constructed from Bezier curves or splines. Is there any modification of the Ramer–Douglas–Peucker algorithm which could handle this?
I found a path simplification algorithm in the paper.js library, which does exactly what I'm searching for: http://paperjs.org/examples/path-simplification/ But I was not able to understand the algorithm from the undocumented javascript source code.
The work you want to do falls into the category of "curve fitting". There are tons of different algorithms for curve fitting but almost all curve fitting algorithms can be divided into two different categories: interpolation and approximation. Interpolation algorithms produce a curve that passes through all the data points exactly while approximation algorithms generate a curve that lies close to the data points. Of course, hybrid algorithms also exist.
Since you want the data points to be smoothed, you should be looking for approximation algorithms. The two algorithms you mentioned, the RDP algorithm and Schneider's algorithm (the one in Paper.js), are both approximation algorithms, so basically you can use either of them. For RDP, after obtaining the simplified path, you can create a Catmull-Rom or Overhauser spline through the vertices of the simplified path to obtain a smooth curve. However, you don't have direct control over the deviation between the resulting spline and the vertices of the original path.
Schneider's algorithm starts by fitting the data points with a cubic Bezier curve with end-tangent constraints. If the deviation from the resulting curve is too large, it splits the data points into two "regions" and fits each region of data with a cubic Bezier curve with end-tangent constraints. This process is repeated until the deviation from all cubic Bezier curves is small enough. As a result, it produces a series of cubic Bezier curves connected at best with C1 continuity (very likely it is actually only G1). Furthermore, since this algorithm evaluates the end tangents from the original data points, noise in the data points will affect the end-tangent evaluation and therefore the cubic Bezier fitting.
If you can spend time on the topic of curve fitting, you should look into least-squares fitting with B-spline curves. This will generate an output curve with high continuity (C2 for cubic B-spline curves, for example). If you don't have much time to spend, then Schneider's algorithm is a good choice that strikes a balance between the implementation cost (if you have to re-implement it in a specific language) and the resulting curve's quality.
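If you go the least-squares route, a quick sketch with SciPy's smoothing B-spline (`splprep`) gives the flavour. This is not Schneider's algorithm, SciPy is an assumption rather than something the question mentions, and the trajectory and smoothing factor `s` below are arbitrary illustration values:

```python
import numpy as np
from scipy.interpolate import splprep, splev

# Hypothetical noisy 2D trajectory (ordered list of points).
t = np.linspace(0, 2 * np.pi, 100)
x = np.cos(t) + np.random.normal(0, 0.03, t.size)
y = np.sin(t) + np.random.normal(0, 0.03, t.size)

# Fit a cubic B-spline whose total squared deviation stays within roughly `s`.
tck, u = splprep([x, y], s=0.5, k=3)

# Evaluate the smooth curve on a finer parameterisation.
u_fine = np.linspace(0, 1, 400)
x_smooth, y_smooth = splev(u_fine, tck)
print(len(tck[1][0]), "control coefficients describe the smoothed path")
```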
What you're trying to do is called Curve Fitting.
While the Ramer-Douglas-Peucker algorithm essentially smooths 'noise' out of a polyline by removing unnecessary points, a curve fitting algorithm will fit Bezier curves through the remaining points.
Here is a pretty nice example on Youtube and here is the original paper describing the algorithm itself.
As for the Paper.js example:
This is the GitHub link for that particular functionality you mentioned, and it is pretty well commented. The research paper that was used is this.
Also, here is a very short discussion on the mailing list about what was used and what not (apparently Ramer-Douglas-Peucker was used but removed later).

Outlier detection using ELKI

I am using the ELKI data mining software for outlier detection. It has many outlier detection techniques, but they all produce the same results (the same outliers with all techniques; the only difference is in the size of the circle around the points, as shown in the figures below). I use the mouse head dataset provided on the ELKI website. In the dataset all the points are labeled with their respective cluster name, i.e. whether they belong to ear_left, ear_right, head or noise. If I change the label of a noise point to ear_right, it then shows that outlier point as ear_right. I have changed 5 out of 10 noise labels to ear_right.
Here is the result of using the KNN and LDOF outlier detection techniques with the modified dataset in ELKI:
Is it a problem with the software, or am I doing something wrong? Has anyone tried using it for outlier detection? Is there any good software which can perform outlier detection using different algorithms like LOF, LDOF, or KNN, or where could I find the source code of these techniques?
This is a very simplistic data set.
It is not surprising that the methods all work more or less well, because this is a toy data set, not real data; on real data, outlier detection is much, much harder.
Note that the implementations in ELKI assign numerical scores. They do not produce a yes/no outlier decision; this is trivial to derive from the scores.
If you want a binary result, you can for example set the visualization scaling parameter to only visualize the top k results. In other cases, you may want to read the actual papers. For example, the authors of LOCI suggest treating objects with a score larger than 3 as outliers. (Unfortunately, most methods do not have a particularly easy interpretation available.)
Don't think in the classification box. Outlier detection is an explorative technique, not classification.
ELKI can also evaluate the quality of the outlier method using a number of measures, such as ROC AUC, ROC curves, Precision#k, AveP, Maximum-F1.
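For illustration only (plain Python, not ELKI's API): a binary yes/no decision can be derived from numerical outlier scores either by taking the top-k scores or by applying a fixed cut-off such as the LOCI suggestion mentioned above. The scores and the value of k are invented.

```python
import numpy as np

scores = np.array([0.2, 3.5, 0.4, 0.3, 4.1, 0.25, 0.35, 2.9])

# Option 1: flag the k highest-scoring points as outliers.
k = 3
top_k_idx = np.argsort(-scores)[:k]
is_outlier_topk = np.zeros(scores.size, dtype=bool)
is_outlier_topk[top_k_idx] = True

# Option 2: flag everything above a fixed cut-off (e.g. the LOCI suggestion of 3).
is_outlier_threshold = scores > 3.0

print("top-k outliers:     ", np.where(is_outlier_topk)[0])
print("threshold outliers: ", np.where(is_outlier_threshold)[0])
```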
