On Discriminative vs. Generative Classifiers: A comparison of logistic regression and Naive Bayes

We compare discriminative and generative learning as typified by logistic regression and naive Bayes. We show, contrary to a widelyheld belief that discriminative classifiers are almost always to be preferred, that there can often be two distinct regimes of performance as the training set size is increased, one in which each algorithm does better. This […]

Distance metric learning, with application to clustering with side-information

Many algorithms rely critically on being given a good metric over their inputs. For instance, data can often be clustered in many “plausible” ways, and if a clustering algorithm such as K-means initially fails to find one that is meaningful to a user, the only recourse may be for the user to manually tweak the […]

Latent Dirichlet Allocation

We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture […]

Classification with Hybrid Generative/Discriminative Models

Although discriminatively trained classiﬁers are usually more accurate when labeled training data is abundant, previous work has shown that when training data is limited, generative classiﬁers can out-perform them. This paper describes a hybrid model in which a high-dimensional subset of the parameters are trained to maximize generative likelihood,and another, small, subset of parameters are […]

Policy search by dynamic programming

We consider the policy search approach to reinforcement learning. We show that if a “baseline distribution” is given (indicating roughly how often we expect a good policy to visit each state), then we can derive a policy search algorithm that terminates in a finite number of steps, and for which we can provide non-trivial performance […]

Online Learning of Pseudo-Metrics

We describe and analyze an online algorithm for supervised learning of pseudo-metrics. The algorithm receives pairs of instances and predicts their similarity according to a pseudo-metric. The pseudo-metrics we use are quadratic forms parameterized by positive semi-definite matrices. The core of the algorithm is an update rule that is based on successive projections onto the […]

Learning random walk models for inducing word dependency probabilities

Many NLP tasks rely on accurately estimating word dependency probabilities P(w1|w2), where the words w1 and w2 have a particular relationship (such as verb-object). Because of the sparseness of counts of such dependencies, smoothing and the ability to use multiple sources of knowledge are important challenges. For example, if the probability P(N|V ) of noun […]

Feature selection, L1 vs. L2 regularization, and rotational invariance

We consider supervised learning in the presence of very many irrelevant features, and study two different regularization methods for preventing overfitting. Focusing on logistic regression, we show that using L1 regularization of the parameters, the sample complexity (i.e., the number of training examples required to learn “well,”) grows only logarithmically in the number of irrelevant […]

Apprenticeship Learning Via Inverse Reinforcement Learning

We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we want to learn to perform. This setting is useful in applications (such as the task of driving) where it may be difficult to write down […]

Inverted Autonomous Helicopter Flight Via Reinforcement Learning

Helicopters have highly stochastic, nonlinear, dynamics, and autonomous helicopter flight is widely regarded to be a challenging control problem. As helicopters are highly unstable at low speeds, it is particularly difficult to design controllers for low speed aerobatic maneuvers. In this paper, we describe a successful application of reinforcement learning to designing a controller for […]

Learning first order Markov models for control

First-order Markov models have been successfully applied to many problems, for example in modeling sequential data using Markov chains, and modeling control problems using the Markov decision processes (MDP) formalism. If a first-order Markov model’s parameters are estimated from data, the standard maximum likelihood estimator considers only the first-order (single-step) transitions. But for many problems, […]

Online bounds for Bayesian algorithms

We present a competitive analysis of Bayesian learning algorithms in the online learning setting and show that many simple Bayesian algorithms (such as Gaussian linear regression and Bayesian logistic regression) perform favorably when compared, in retrospect, to the single best model in the model class. The analysis does not assume that the Bayesian algorithms’ modeling […]