The multiplicative weights update method is an algorithmic technique most commonly used for decision making and prediction, and also widely deployed in game theory and algorithm design. The simplest use case is the problem of prediction from expert advice, in which a decision maker needs to iteratively decide on an expert whose advice to follow. The method assigns initial weights to the experts (usually identical initial weights), and updates these weights multiplicatively and iteratively according to the feedback of how well an expert performed: reducing it in case of poor performance, and increasing it otherwise. It was discovered repeatedly in very diverse fields such as machine learning (AdaBoost, Winnow, Hedge), optimization (solving linear programs), theoretical computer science (devising fast algorithm for LPs and SDPs), and game theory.
"Multiplicative weights" implies the iterative rule used in algorithms derived from the multiplicative weight update method. It is given with different names in the different fields where it was discovered or rediscovered.
The earliest known version of this technique was in an algorithm named "fictitious play" which was proposed in game theory in the early 1950s. Grigoriadis and Khachiyan applied a randomized variant of "fictitious play" to solve two-player zero-sum games efficiently using the multiplicative weights algorithm. In this case, player allocates higher weight to the actions that had a better outcome and choose his strategy relying on these weights. In machine learning, Littlestone applied the earliest form of the multiplicative weights update rule in his famous winnow algorithm, which is similar to Minsky and Papert's earlier perceptron learning algorithm. Later, he generalized the winnow algorithm to weighted majority algorithm. Freund and Schapire followed his steps and generalized the winnow algorithm in the form of hedge algorithm.
The multiplicative weights algorithm is also widely applied in computational geometry such as Clarkson's algorithm for linear programming (LP) with a bounded number of variables in linear time. Later, Bronnimann and Goodrich employed analogous methods to find set covers for hypergraphs with small VC dimension.
In operation research and on-line statistical decision making problem field, the weighted majority algorithm and its more complicated versions have been found independently.
In computer science field, some researchers have previously observed the close relationships between multiplicative update algorithms used in different contexts. Young discovered the similarities between fast LP algorithms and Raghavan's method of pessimistic estimators for derandomization of randomized rounding algorithms; Klivans and Servedio linked boosting algorithms in learning theory to proofs of Yao's XOR Lemma; Garg and Khandekar defined a common framework for convex optimization problems that contains Garg-Konemann and Plotkin-Shmoys-Tardos as subcases.
A binary decision needs to be made based on n experts’ opinions to attain an associated payoff. In the first round, all experts’ opinions have the same weight. The decision maker will make the first decision based on the majority of the experts' prediction. Then, in each successive round, the decision maker will repeatedly update the weight of each expert's opinion depending on the correctness of his prior predictions. Real life examples includes predicting if it is rainy tomorrow or if the stock market will go up or go down.
The multiplicative weights method is usually used to solve a constrained optimization problem. Let each expert be the constraint in the problem, and the events represent the points in the area of interest. The punishment of the expert corresponds to how well its corresponding constraint is satisfied on the point represented by an event.
In machine learning, Littlestone and Warmuth generalized the winnow algorithm to the weighted majority algorithm. Later, Freund and Schapire generalized it in the form of hedge algorithm. AdaBoost Algorithm formulated by Yoav Freund and Robert Schapire also employed the Multiplicative Weight Update Method.
Based on current knowledge in algorithms, multiplicative weight update method was first used in Littlestone's winnow algorithm. It is used in machine learning to solve a linear program.
Given labeled examples where are feature vectors, and are their labels.
The aim is to find non-negative weights such that for all examples, the sign of the weighted combination of the features matches its labels. That is, require that for all . Without loss of generality, assume the total weight is 1 so that they form a distribution. Thus, for notational convenience, redefine to be , the problem reduces to finding a solution to the following LP:
, , .
This is general form of LP.
Assume the learning rate and for , is picked by Hedge. Then for all experts ,
Initialization: Fix an . For each expert, associate the weight ≔1 For t=1,2,…,T:
1. Pick the distribution where . 2. Observe the cost of the decision . 3. Set ).
This algorithm maintains a set of weights over the training examples. On every iteration , a distribution is computed by normalizing these weights. This distribution is fed to the weak learner WeakLearn which generates a hypothesis that (hopefully) has small error with respect to the distribution. Using the new hypothesis , AdaBoost generates the next weight vector . The process repeats. After T such iterations, the final hypothesis is the output. The hypothesis combines the outputs of the T weak hypotheses using a weighted majority vote.
Input: Sequence of labeled examples (,),…,(, ) Distribution over the examples Weak learning algorithm "'WeakLearn"' Integer specifying number of iterations Initialize the weight vector: for ,..., . Do for ,..., 1. Set . 2. Call WeakLearn, providing it with the distribution ; get back a hypothesis [0,1]. 3. Calculate the error of |. 4. Set . 5. Set the new weight vector to be . Output the hypothesis:
Given a matrix and , is there a such that ?
(1)
Using the oracle algorithm in solving zero-sum problem, with an error parameter , the output would either be a point such that or a proof that does not exist, i.e., there is no solution to this linear system of inequalities.
Given vector , solves the following relaxed problem
(2)
If there exists a x satisfying (1), then x satisfies (2) for all . The contrapositive of this statement is also true. Suppose if oracle returns a feasible solution for a , the solution it returns has bounded width . So if there is a solution to (1), then there is an algorithm that its output x satisfies the system (2) up to an additive error of . The algorithm makes at most calls to a width-bounded oracle for the problem (2). The contrapositive stands true as well. The multiplicative updates is applied in the algorithm in this case.
Multiplicative weights update is the discrete-time variant of the replicator equation (replicator dyamics), which is a commonly used model in evolutionary game theory. It converges to Nash equilibrium when applied to a congestion game.
The multiplicative weights algorithm is also widely applied in computational geometry, such as Clarkson's algorithm for linear programming (LP) with a bounded number of variables in linear time. Later, Bronnimann and Goodrich employed analogous methods to find Set Covers for hypergraphs with small VC dimension.