Ad Ranking Algorithms

Introduction

 Internet search engines companies like Google, Yahoo and Microsoft have revolutionized not only the use of the Internet by individuals but also the way people do business and advertising. When the consumers type in the keywords to do the search the search engines display what they are looking for along with the highly targeted ads which are relevant to the specific search term. Search engines can provide advertisements to targeted audience precisely by linking the advertisements with the search terms. The search engine market is essentially a large auction where businesses place bids for individual keywords, together with limits specifying their maximum daily budget. The advertiser then pays the search engine when the users click on their ads. The number of ads that the search engine displays to the user is limited and each advertiser may desire a specific position for their ads to be shown on the search results page. An ad shown on the top of the page may be more likely to be clicked by the user than the ad shown on the bottom. Therefore the search engines should come up with a mechanism to rank these ads and increase their revenue.

 Several algorithms have been proposed in the recent years to maximize the revenue generated from these ads. The algorithms consider many different things like daily budget amount of advertisers, and the bid amount they are willing to pay for keywords they are interested in. The objective of each of the algorithm is to match the keywords to the highest paying advertiser interested in the keyword, thereby maximizing the revenue generated. The matchings generally take place in real time, and since advertisers have limited budgets this generalized matching problem is nontrivial.


Online Algorithms

Algorithms play an important role in Computer Science although people have been using them as long as they have been solving problems systematically. An algorithm is a complete, step-by-step procedure for solving a specific problem. Usually before designing the algorithm, all the needed data is available before the execution of the algorithm but in reality this is not the case. An online algorithm receives a sequence of requests and must respond to each request as soon as it is received. An offline algorithm may wait until all requests have been received before determining its responses. One approach to evaluating an online algorithm is to compare its performance with that of the best possible offline algorithm for the same problem. Thus, given a measure of "profit", the performance of an online algorithm can be measured by the worst-case ratio of its profit to that of the optimal offline algorithm. Since complete input is not available, the action taken by the online algorithm may seem right at that moment of time, but may turn out to be wrong after complete information is available later. In other words, online algorithms try to model a real life situation, where the entire input is not available from the start. The input is obtained incrementally, and the algorithm has to make irrevocable decisions to respond to the input. In contrast, an offline algorithm outputs an answer for a problem where whole data is given from the beginning.

Competitive Ratio

Typically, the online algorithm is compared to an optimal offline algorithm that knows the entire request sequence in advance. The competitiveness of an online algorithm is the ratio of its performance to the performance of an optimal offline algorithm. We say that an online algorithm A has competitive ratio ± (or A is ±-competitive) if, for every input sequence the ratio of cost of online algorithm (CA), to the cost of the solution produced by optimal offline algorithm (COPT) can be given by.


Matching

Matching is one of the most basic problems in online algorithms. There are several key applications namely Online Bipartite Matching, and k-server Problem - assignment of resources to jobs or clients to servers. Online Bipartite Matching has its applications in the world of search engine marketing.


Online Bipartite Matching

A bipartite (or bigraph) is a graph whose vertices can be divided into two disjoint sets B and G such that every edge connects a vertex in B to one in G and B and G are independent sets.

The Matching Problem

Initially there are a set of boy vertices B1...B5. Assume that the girl vertices arrive in a preselected order, and that the edges incident to a vertex are revealed to us only when the vertex arrives. Each girl has a set of preferred boys she wants to get matched to. The task is to decide, as each girl vertex arrives, which boy vertex to match her to, so that the size (cardinality) of the matching obtained is maximized. At that time, we have to decide to either pair the girl with a boy or dont match her at all. In the example 2.1 the girls G1...G5 come in that order.

Matching Problem

Offline Matching

Every girl can be matched to at most one boy. Therefore, a possible set of matching can be represented as a subset M of the edges, no two of which are adjacent. Such a set of edges is called a matching in the graph. A matching is called perfect if every vertex is incident to an edge of the matching or the cardinality of the match is equal to the number of vertices. In offline analysis in this example, the perfect matching would be to have G1 paired with B1, G2 with B3, G3 with B5, G4 with B2 and G5 with B4. Such a matching has a maximum cardinality (of 5 in this example).

Offline Matching

Online Matching

In online matching, when a girl arrives it must be matched to an unmatched boy at that moment in time and this decision is irrevocable. One such matching can be as follows ; G1 comes it is paired with B1, G2 gets paired with B2, G3 with B5, G4 could not get paired as both B2 & B5 are taken, while G5 gets paired with B4. The cardinality of the online matching in is 4.

Online Matching

Ad Ranking Algorithms

The paid search problem is generalization of the online bipartite matching problem. In this generalization, the boys represent the bidders while the girls represent the keywords. Each edge is a bid or an interest in the keyword. The keywords arrive online, and the bids from each bidder for a keyword are revealed once it arrives. The algorithm then decides, irrevocably, which bidder to display for that keyword. The advertisers submit a bid for a specific keyword stating their willingness to pay per click. When a user enters a keyword or query the query results are displayed along with the ads. The search engine has some constraints that it cannot exceed the declared budgets of the advertisers, chooses which advertisements to display based on relevance and how much to charge based on their bids.

Rank by Bid

In 1997, Overture came up with this idea of linking the advertisers to rank the search results by auction and give the top position to the highest bidder for the keywords. The determination was purely based on bid and not related to how relevant the ad was for the consumer. Googles founders, Larry Page and Sergey Brin, separated the advertising (or paid search) from the search rankings (natural search). In 2000, they introduced the Google model where instead of auctioning off the search rankings, they auctioned only the advertisements and separated them from the search results. Previously advertisers paid per impression, but now an advertiser paid based on the click. 

Rank by Revenue

Almost all the modern search engines separate the ads from the search results and the advertisers choose one or more search keywords to which they want to link their ads and the bid what they are willing to pay every time someone clicks on the link. These ad links are displayed next to the search results for the current keyword and these links are ordered in the order of their yield. Yield at simplistic level is the product of the bid and click through rate CTR. CTRs can be separated into advertiser specific quality score factor and a position factor. Quality score QS of the ad in turn is the weighted sum of several factors like relevance of the ad, quality of the advertiser, quality of landing page, quality of the publisher where the ad is published. Another important consideration is the daily budget of each advertiser with the ad being trafficked as long as the daily budget is not exhausted.

Take a simple case of 4 advertisers. The table below shows their bids on a given keyword and the scores.

AdvertiserBidQSYield
A1.0011.00
B0.7521.50
C0.50 4 2.00
D 0.25 7 1.75

If the ranking function ranks them based on expected revenue the advertisers would be ranked as follows

Advertiser Bid QS Yield
C 0.50 4 2.00
D 0.25 7 1.75
B 0.75 2 1.50
A 1.00 1 1.00

Bidding for the keywords takes place continuously in dynamic auctions. The advertiser tries to maximize the utility towards a specific goal whether it is a sale, download or registration. The search engine on the other hand tries to maximize revenue each day, with certain limitations, that it must respect each advertisers daily budget and the revenue depends on an advertisers bid for a keyword he wins with a user clicking on the ad.

Random Ranking

Introduced in 1990 by Karp, Vazirani, and Vazirani [KVV], online bipartite matching was one of the first problems to receive the attention of competitive analysis. This algorithm, called Ranking, initially chooses a single random ranking on the vertices that is used to choose matches throughout the entire run of the algorithm.

The algorithm has two phases

·    Initialization: Assign to each boy (bidder) a random priority or ranking.

·    Matching Phase: As each girl (keyword query) arrives, match her to the highest ranked boy or a bidder (if any).

When the query arrives, it is immediately matched to the first interested bidder. The algorithm assumes that each bid is either 0 or 1 and the daily budget is 1. The algorithm goes on to prove that in random match scenario it has a competitive ratio of 1-1/e ~ 0:63.

Balance

Kalyanasundaram and Pruhs came up with this algorithm. They have servers in place of advertisers and server requests instead of queries. In this problem each bidder has a daily budget of b dollars, but makes only 0/1 dollar bids on each query. The algorithm awards the query to that interested advertiser who has the highest unspent budget. They show that the competitive ratio of this algorithm tends competitive ratio of 1-1/e as b approaches infinity.

To illustrate, consider two advertisers A, B which bid on a couple of keywords 1, 2 as follows with each having a budget of 4 units.

A bids on 1

B bids on 1, 2

Assume that the query stream is 11112222. The balance choice would be 121222__ while the optimal choice would have been 11112222. The competitive ratio in this example is 3/4.

Greedy

Greedy Algorithm matches the queries to the advertisers with highest bid. The goal of the greedy algorithm is to maximize the revenue accrued at each step but it is proved that this algorithm achieves a competitive ratio of ½. Suppose the query sequence consists of a number of occurrences of the first query followed by a number of occurrences of the second query. The first query words are awarded to the second bidder and assume that these queries exhaust the bidder 2s budget. When second query arrives bidder 1 is not interested in this query and bidder 2 does not have any budget left to place the bid and no further revenue is accrued.

To illustrate, consider the same example.

A bids on 1

B bids on 1, 2

Assume that the query stream is 11112222. Worst case the greedy choice could be 2222____ while the optimal choice would have been 11112222 i.e. the greedy may end up randomly matching 2 for the first 4 keywords, while for 1 we dont have any interested bidder left with unspent budget. From this we can see that the greedy achieves a competitive ratio of ½.

Generalized online matching

Mehta, Saberi, Vazirani and Vijay Vazirani [MSVV] came up with this algorithm. In the early 2004 Google was ranking ads by the yield - product of the click through rate and the bid. Their algorithm enhanced the matching by considering the remaining fraction of each bidders budget against the amount of its bid in determining the allocation. The problem is to allocate the queries to the advertisers based on the bid and the daily budget.

MSVV can be generalized as follows:

·    Advertisers can enter at different times

·    Advertisers have a specific daily budget and the budgets are different.

·    A bidder pays only when the user clicks on his ad when it was presented as the query result.

·    The optimal allocation does not exhaust the advertisers entire budget.

Given a query the ranking algorithm tries to maximize bid with the fraction of budget spent. For an advertiser assume that the amount of money spent is m, while the total budget is b. The fraction of budget spent spent is m/ b. The algorithm assumes that the bids are less than the advertisers daily budget and defines a trade off function

To simplify the online algorithm the budget of each bidder can be descretized into k equal parts (called slabs) numbered 1 through k. Each bidder spends money in slab j before moving to slab j + 1.

When a new query arrives, let the bid of bidder i be bi. Allocate the query to the bidder i who maximizes

Competitive ratios

The worst case competitive ratios of the different algorithms are.

Greedy

½

Ranking

½

Balance

1-1/e

MSVV

1-1/e


References

[KP00] Bala Kalyanasundaram and Kirk R. Pruhs. An optimal deterministic algorithm for online b -matching. Theoretical Computer Science 2000.

[MSVV05] Aranyak Mehta, Amin Saberi, Umesh Vazirani, and Vijay Vazirani. Adwords and generalized online matching. FOCS, 2005

[KVV90] R.M. Karp, U.V. Vazirani, and V.V. Vazirani. An optimal algorithm for online bipartite matching. In Proceedings of the 22nd Annual ACM Symposium on Theory

[MSVV05]Mehta, Aranyak, Amin Saberi, Umesh Vazirani,and Vijay Vazirani. 2005. AdWords and Generalized On-line Matching. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, 264-73.

Online Matching Problem with Application: Google Adwords Problem - UNIVERSITY OF CINCINNATI, Sushma Maramreddy

On-line Bipartite Matching Made Simple Benjamin Birnbaum,Claire Mathieuy