.. AUTO-GENERATED FILE -- DO NOT EDIT!

.. _example_pylab_2d:


Simple Plotting of Classifier Behavior
======================================

.. index:: plotting example

This example runs a number of classifiers on a simple 2D dataset and plots the
decision surface of each classifier.

First, we compose some sample data -- no PyMVPA is involved yet.

::
  
  import numpy as N
  
  # set up the labeled data
  # two skewed 2-D distributions
  num_dat = 200
  dist = 4
  feat_pos = N.random.randn(2, num_dat)
  feat_pos[0, :] *= 2.
  feat_pos[1, :] *= .5
  feat_pos[0, :] += dist
  feat_neg = N.random.randn(2, num_dat)
  feat_neg[0, :] *= .5
  feat_neg[1, :] *= 2.
  feat_neg[0, :] -= dist
  
  # set up the testing features
  x1 = N.linspace(-10, 10, 100)
  x2 = N.linspace(-10, 10, 100)
  x, y = N.meshgrid(x1, x2)
  feat_test = N.array((N.ravel(x), N.ravel(y)))
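
A quick sanity check on the array shapes can catch mistakes early. This
is a minimal sketch in plain NumPy; nothing here is PyMVPA-specific.

::

  # both labeled sets are 2 x num_dat, the test grid is 2 x 10000
  assert feat_pos.shape == (2, num_dat)
  assert feat_neg.shape == (2, num_dat)
  assert feat_test.shape == (2, len(x1) * len(x2))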
  

Now load PyMVPA and convert the data into a proper
:class:`~mvpa.datasets.base.Dataset`.

::
  
  from mvpa.suite import *
  
  # create the pymvpa dataset from the labeled features
  patternsPos = Dataset(samples=feat_pos.T, labels=1)
  patternsNeg = Dataset(samples=feat_neg.T, labels=0)
  ds_lin = patternsPos + patternsNeg
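
At this point it is easy to verify that the two datasets were merged as
intended. A quick inspection -- a sketch assuming the 0.4-era
:class:`~mvpa.datasets.base.Dataset` attributes ``nsamples``,
``nfeatures`` and ``uniquelabels``.

::

  # expect 400 samples, 2 features, and labels 0 and 1
  print ds_lin.nsamples, ds_lin.nfeatures
  print ds_lin.uniquelabels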
  

Let's add another dataset: XOR. This problem is not linearly separable
and therefore needs a non-linear classifier to be solved. The dataset is
provided by the PyMVPA dataset warehouse.

::
  
  # 30 samples per condition, SNR 3
  ds_nl = pureMultivariateSignal(30, 3)
  
  datasets = {'linear': ds_lin, 'non-linear': ds_nl}
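
For readers unfamiliar with XOR, the following sketch builds a
comparable problem by hand. It merely illustrates the XOR structure and
is not the actual implementation of ``pureMultivariateSignal()``.

::

  # illustrative only: coordinates with matching signs get one label,
  # mixed signs get the other -- no straight line separates the classes
  xor_feat = N.random.randn(2, 60)
  xor_labels = (xor_feat[0] * xor_feat[1] > 0).astype(int)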
  

This demo utilizes a number of classifiers. Instantiating a classifier
involves almost no runtime cost, so it is easy to compile a long list,
if necessary.

::
  
  # set up classifiers to try out
  clfs = {'Ridge Regression': RidgeReg(),
          'Linear SVM': LinearNuSVMC(probability=1,
                                     enable_states=['probabilities']),
          'RBF SVM': RbfNuSVMC(probability=1,
                               enable_states=['probabilities']),
          'SMLR': SMLR(lm=0.01),
          'Logistic Regression': PLR(criterion=0.00001),
          'k-Nearest-Neighbour': kNN(k=10)}
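
Extending the comparison is just a matter of adding another entry to the
dictionary. For instance, a plain linear C-SVM could be appended -- a
hedged sketch; any PyMVPA classifier exposing the standard train/predict
interface should work. Note that the plotting loop below assumes a 2x3
subplot grid, which would need to grow along with the list.

::

  # hypothetical addition; enlarge the subplot grid below accordingly
  clfs['Linear C-SVM'] = LinearCSVMC(probability=1,
                                     enable_states=['probabilities'])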
  

Now we are ready to run the classifiers. The following loop trains and
queries each classifier and generates one figure per dataset, showing
the decision surface of each individual classifier for both the linear
and the non-linear problem.

::
  
  for problem, ds in datasets.iteritems():
      # reset the subplot counter for this dataset
      fig = 0
  
      # make a new figure
      P.figure(figsize=(6, 6))
  
      print "Processing %s problem..." % id
  
      for c in clfs:
          # tell which one we are doing
          print "Running %s classifier..." % (c)
  
          # make a new subplot for each classifier
          fig += 1
          P.subplot(2, 3, fig)
  
          # plot the training points
          P.plot(ds.samples[ds.labels == 1, 0],
                 ds.samples[ds.labels == 1, 1],
                 "r.")
          P.plot(ds.samples[ds.labels == 0, 0],
                 ds.samples[ds.labels == 0, 1],
                 "b.")
  
          # select the classifier
          clf = clfs[c]
  
          # enable saving of the values used for the prediction
          clf.states.enable('values')
  
          # train with the known points
          clf.train(ds)
  
          # run the predictions on the test values
          pre = clf.predict(feat_test.T)
  
          # each classifier exposes its continuous output differently
          if c == 'Ridge Regression' or c.startswith('k-Nearest'):
              # ridge and kNN: use the prediction directly
              res = N.asarray(pre)
          elif c == 'Logistic Regression':
              # get out the values used for the prediction
              res = N.asarray(clf.values)
          elif c == 'SMLR':
              res = N.asarray(clf.values[:, 1])
          else:
              # get the probabilities from the svm
              res = N.asarray([(q[1][1] - q[1][0] + 1) / 2
                               for q in clf.probabilities])
  
          # reshape the flat predictions onto the 100x100 test grid
          z = res.reshape((100, 100))
  
          # plot the predictions
          P.pcolor(x, y, z, shading='interp')
          P.clim(0, 1)
          P.colorbar()
          P.contour(x, y, z, linewidths=1, colors='black', hold=True)
  
          # add the title
          P.title(c)
  
  if cfg.getboolean('examples', 'interactive', True):
      # show all the cool figures
      P.show()
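
When this example runs non-interactively (e.g. during an automated
documentation build), the figures could be written to disk instead of
being shown. A minimal sketch using the standard pylab ``savefig``
call; the filename is made up for illustration.

::

  # hypothetical alternative to P.show() for batch runs:
  # save the current (i.e. most recently created) figure
  P.savefig('pylab_2d.png', dpi=80)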

.. seealso::
  The full source code of this example is included in the PyMVPA source distribution (`doc/examples/pylab_2d.py`).
