[PYTHON] Bid optimization simulation

Motivation

I'm working on an algorithm for automatic bidding for ads (it's going to be in the past soon), but There is no log stored at all. Therefore, I had no choice but to generate data and perform a simulation.

What I wrote in this blog

Overview of multipurpose optimization

What is multipurpose optimization? Book There is a description as follows.

In multi-objective optimization, consider minimizing multiple objective functions $ \ bf {f} $. Ideally, you might want to be able to minimize all objective functions $ f_1 $, ..., $ f_k $ at the same time. However, that means that it was not necessary to solve it as a multi-objective optimization in the first place. This is because if one objective function is minimized, the other objective functions are minimized at the same time. Therefore, the problem that should usually be dealt with in multi-objective optimization is that there is a trade-off relationship between the objective functions $ \ bf {f} $, such that trying to make one objective function smaller makes the other objective function larger. There is a case. (Quoted from Mathematical Optimization Practice Guide p.110)

In other words, let's minimize (maximize) the value of the objective function in the situation where "if you set it up, you can't stand up here". Basically, you will find a Pareto solution. There are various methods, and the method differs depending on whether the Pareto front is convex or non-convex. 50 optimization problems useful for application (application optimization series) = 1211 & creativeASIN = 4254117884 & linkCode = as2 & tag = shimashimao06-22 "> 50 optimization problems useful for application (application optimization series) ) Will be helpful. However, this time I chose a heuristic method because I wanted to perform calculations at high speed and I wanted to optimize for a fairly fine time slot.

Overview of bid optimization

References are here. Real time bid optimization with smooth budget delivery in online advertising, Lee+, Turn Inc.,'13 [PDF] Please refer to slidshare for the contents. .. (However, if you look at this now, the interpretation part is a little wrong ... I will correct it if you feel like it w)

The code is created based on this paper. However, the parts that take time to generate data are omitted. As I wrote at the beginning, since there is no data, I set up a user agent and a bid agent to make a pseudo bid. The time slot is set to 1 minute, and the budget is distributed and the bid amount is optimized for each time slot.

Reference code

The code I created is shown below. Since it is written by a non-engineer, I think there are many things to do, but please be patient.

User agent

Create a site for a user agent to visit the site.

python


# coding: utf-8
import random

class Agent:
    def __init__(self,num):
        random.seed(num) #Occurrence of random seed
        self.ctr = random.uniform(0,1)
        
    def get_ctr(self):
        return self.ctr
    
    def get_visiter_ctr(self, pred):
        if random.uniform(0,1) > (1-pred):
            return self.get_ctr()
        else: return False

Next, create an agent (competitor) to bid.

Bid agent

python


# coding: utf-8

import random
from math import ceil

class Agent:
    def __init__(self,n):
        random.seed(n) #Occurrence of random seed
        self.ctr = random.uniform(0,1)
        self.maxprice = random.randint(40,90)

    def get_maxprice(self):
        return self.maxprice
        
    def get_bid_price(self, ctr, pred):
        price = self.maxprice*ctr*pred
        
        if random.uniform(0,1) > 0.35 :
            return ceil(price) + random.randint(0,5)
        else: 
            return False

How to write the code of the optimization part.

Budget and bid optimization

python


# coding:utf-8
#cython: boundscheck=False
#cython: wraparound=False

import pyximport; pyximport.install()

from math import sin, cos, pi, fmod, sqrt, floor
import random

import useragent
import bidagent

import numpy as np
cimport numpy as np
cimport cython

DTYPE = np.float64
INTTYPE = np.int64
ctypedef np.float64_t DTYPE_t
ctypedef np.int64_t INTTYPE_t

cdef extern from 'stdlib.h':
    ctypedef unsigned long size_t
    void *malloc(size_t size)
    void free(void *prt)

class CreatePredict:
    pi2 = 2*pi
    #random.seed(3)
    PHI = 2*pi/8 #Representing phase shift
#    TBit = 10000000.0
    
    def __init__(self, period, agent_num):
        self.minbit = 1000*agent_num/1000
        self.E = 1250*agent_num/1000

        self.PERIOD = period
        self.OMEGA = self.pi2/self.PERIOD
        self.AGENT_NUM = agent_num

    def getvalue(self,period):
        """
Probability of coming to the site
        sin(theta) = sin(OMEGA*t + phi)
        T = 2*pai/OMEGA
        60*60*24=86400sec
        OMEGA = 2pai/86400
        """
        cdef sec
        cdef np.ndarray[np.double_t, ndim=1] tmp = np.zeros(period,dtype=np.double)
        for sec in xrange(period):
            tmp[sec] = self.E * ( sin(self.OMEGA*sec - self.PHI) + 1 ) + self.minbit
        return tmp/self.AGENT_NUM  #Calculation of pred


cdef class ComputeBid:
    cdef int time, day
    gamma = 1.96 #Used to calculate confidence intervals(95%Assuming confidence interval)
    beta1=0.3
    beta2=0.8
    
    theta_bin = 100

    cdef np.ndarray \
        bid_num,win_num,cmsp_bag,theta,ctr_hist,mu,sigma
    cdef int period, daybuget, gcpc, bidcap, bin_n
    cdef double ctrbin, avgprice
    
    def __init__(self,period,daybuget,gcpc,bidcap,ctrbin):
        self.period = period
        self.ctrbin = ctrbin
        
        self.daybuget = daybuget
        self.gcpc = gcpc
        self.bidcap = bidcap

        self.bid_num = np.zeros(period, dtype=np.int64)
        self.theta = np.zeros(self.theta_bin, dtype=np.float64)

        #Storage of ctr hist grams
        self.bin_n = int( float(1.0)/ctrbin )
        self.ctr_hist = np.zeros((period,self.bin_n), dtype=np.int64)
        
        self.mu = np.zeros(period, dtype=np.float64) #Average value for each slot of tau of bid threshold
        self.sigma = np.zeros(period, dtype=np.float64) #Variance of bid threshold tau for each slot
        self.avgprice = 0.0
    
    def __dealloc__(self):
        pass
    
    def output(self, file,user_agent_num,bid_agent_num,days):
        cdef int t,n,i,c,b
        
        #Calculation of visit prediction
        pred_creater = CreatePredict(self.period, user_agent_num)
    
        #Occurrence of user agent
        user_agent = [ useragent.Agent(n) for n in xrange(user_agent_num) ]
        
        #Occurrence of bid agent
        bid_agent = [ bidagent.Agent(n) for n in xrange(bid_agent_num) ]
           
        for t in xrange(self.period*days):
            self.time = fmod(t,self.period)
            if fmod(t,self.period)==0:
                if self.day!=0: self.avgprice /= sum(self.win_num)
                self.bid_num = np.zeros(self.period, dtype=np.int64)
                self.win_num = np.zeros(self.period, dtype=np.int64)
                self.cmsp_bag = np.zeros(self.period, dtype=np.float64)
                self.day+=1
                
            print "progress:",t
    
            #Calculation of visit prediction
            pred = pred_creater.getvalue(self.period)  
            #Calculation of budget allocation for each slot
            slot_bag = self.calc_slot_bag(pred)            
            
            #Get visitor ctr
            ctr = []
            bidp = []
            visit = 0.0
            dbidprice = {}
            for i in xrange(user_agent_num):
                ctr.append( user_agent[i].get_visiter_ctr(pred[self.time]) )
            
                #Calculation of bid amount for each agent
                if ctr[i] == False: continue
                visit += 1.0
                
                """Create a histogram of ctr"""
                if self.day==1: self.get_hist(ctr[i])
                    
                maxprice = 0.0
                for b in xrange(bid_agent_num):
                    price = bid_agent[b].get_bid_price(ctr[i], pred[self.time])
                    if price > maxprice:
                        maxprice = price
                bidp.append(maxprice)

                """Counting the number of bids and determining the bid amount"""
                #Pacing used for bidding pricing_Calculate rate
                if self.bid_num[self.time]==0:
                    pacing_rate = 0.0
                else:
                    pacing_rate = self.bid_num[self.time]/float(visit)
                
                #Bid price determination
                bidprice = self.calc_bidprice(self.day, ctr[i], pacing_rate)
                print 'price',bidprice
                
                #Judgment whether to bid or not
                if bidprice == False: continue
                
                """Creating a dictionary for confirming the calculated bid amount"""
                if str(floor(bidprice)) not in dbidprice: dbidprice[str(floor(bidprice))] = 1
                else: dbidprice[str(floor(bidprice))] += 1
                
                #If the bid amount exceeds cap, it will be the value of cap.
                if bidprice>self.bidcap: bidprice = self.bidcap
                #If you want to bid, check the number of bids
                self.bid_num[self.time] += 1
                
                #Judgment of winning bid
                if bidprice > maxprice:
                    #Judgment of whether to bid
                    if slot_bag[self.time] - self.cmsp_bag[self.time] - bidprice > 0:
                        self.cmsp_bag[self.time] += bidprice
                        print 'cmsp=',self.cmsp_bag[self.time]
                        self.calc_theta(bidprice,maxprice)
                        self.win_num[self.time] += 1
                        #Calculation of average cost around 1bid
                        if self.day==1: self.avgprice += bidprice
        print dbidprice
        print np.float64(self.win_num)/self.bid_num
        num1 = []
        num2 = []
        for i in xrange(self.period):
            num1.append( str(slot_bag[i]) )
            num2.append( str(self.cmsp_bag[i]) )
        file.write(u'slot bags\n'+u','.join(num1)+u'\n')    
        file.write(u'comsumption bags\n'+u','.join(num2)+u'\n')
    
    def get_hist(self,ctr):
        """Create a histogram of ctr"""
        cdef int bin
        for bin in xrange(self.bin_n):
            if bin*self.ctrbin<=ctr<(bin+1)*self.ctrbin:
                self.ctr_hist[self.time,bin] += 1
                break
    
    def calc_bidprice(self, day, ctr, pacing_rate): #ctr[i]
        cdef int i
        tau = 0.0
        if day<2: 
            if ctr>=0.2:
                return random.randint(20,80)
            else: return False
        else:
            if self.cmsp_bag[self.time]!=0.0:
                #Prediction imp calculation
                imps = float(self.cmsp_bag[self.time])/self.avgprice
                print 'imps',imps
                #Calculation of predicted number of bids
                #win_rate = bid_num[self.time]/win_num[self.time]
                bids = float(imps*self.bid_num[self.time])/self.win_num[self.time]
                print 'bids',bids
                #Calculation of the number of predicted requests
                print 'pacing_rate',pacing_rate
                reqs = bids/pacing_rate
            else:
                reqs=0.0
            
            print 'reqs',reqs
            #Get bid threshold
            for i in xrange(len(self.ctr_hist[self.time,:])):
                if sum(self.ctr_hist[self.time,-(i+1):]) - reqs >0.0:
                    tau = float((self.theta_bin-1)-i)/self.theta_bin
                    break

            #Calculation of mean value
            if self.time == 0: t = self.period - 1
            else: t = self.time-1 #T here is t-Set to 1
            div_t = float(1)/(self.time+1)
            self.mu[self.time] = self.mu[t]+ div_t*(tau-self.mu[t]) #mu may need to save the current slot and the past slot
            #Variance calculation
            self.sigma[self.time] = div_t*t*self.sigma[t] \
                                + div_t*(tau-self.mu[t])*(tau-self.mu[self.time])
            
            #Provisional pricing
            u = ctr*self.gcpc # Ui = AR*G(Predictive CTR x Target CPC)
            
            #Determine the price according to the bid rate
            # beta1=0.3, beta2=0.Set to 8
            price = self.get_price(pacing_rate,u)
            print 'price1',price
            
            #Judgment of whether to bid
            return self.judge_bid(pacing_rate,price,tau)

    def calc_theta(self,bidprice,secprice):
        cdef int b
        theta = float(secprice)/bidprice
        for b in xrange(100):
            if 0.01*b<=theta<=0.01*(b+1):
                self.theta[b] += 1
                break
    
    def calc_slot_bag(self,pred):
        s = sum(pred)
        prop = np.array(pred, dtype=DTYPE)/s
        return prop*self.daybuget
    
    def get_price(self,pacing_rate,u):
        price = 0.0
        if pacing_rate<=self.beta1:
            s_theta = sum(self.theta)
            s = 0.0
            for i in xrange(len(self.theta)):
                s += self.theta[i]
                if float(s)/s_theta>0.01:
                    return i*(float(3.0)/(2*self.theta_bin))*u #price = self.theta[i]*u
        elif pacing_rate>=self.beta2: # pacing_You may need to make rage predictions
            lo =1.0 + (self.bidcap/self.avgprice-1.0)/(1.0-self.beta2) \
                                                        *(pacing_rate-self.beta2)
            return lo*u
        else:
            return u

    def judge_bid(self, pacing_rate, price, tau):
        term2 = self.gamma*self.sigma[self.time]/sqrt(self.day)
        print 'tau',tau

        if tau<self.mu[self.time]-term2: #Bid rejection limit
            return False
        elif tau>=self.mu[self.time]+term2: #Lower bid acceptance limit
            return price
        else:
            if pacing_rate>random.uniform(0,1):
                return price
            else: return False

Since compilation is required when using Cython, setup.py is also described for the time being.

setup.py

python


# coding: utf-8

from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
import numpy as np

setup(
    name="CPWOPTcritical",
    cmdclass={"build_ext":build_ext},
    ext_modules=[Extension("CySimulation",["CySimulation.pyx"])],
    include_dirs=[np.get_include()]#Specify the include path here
)

Also, because it is Cython, the main part is separated. It is described below.

main

python


# coding: utf-8

#import pyximport; pyximport.install()
import io

if __name__=='__main__':
    import CySimulation
    
    period = 1440 #Number of slots per day
    daybuget = 1000000 #Daily budget
    goalcpc = 100 #Goal cpc
    bidcap = 90 #Bid cap
    ctrbin = 0.01 #CTR histogram bin size
    days = 2 #Data generation days
    user_agent_num = 600 #Number of user agents
    bid_agent_num = 100 #Number of bid agents
            
    bid = CySimulation.ComputeBid(period, daybuget, goalcpc, bidcap,ctrbin)
    
    with io.open('sin_test','a') as f:
        bid.output(f,user_agent_num,bid_agent_num,days)

Experimental result

It seems that it will go well, but I feel that the value that I expected in the part where the data generation was omitted or the part predicted from the past data was skipped did not come out. Since the settings are quite detailed, it seems difficult to get sufficient results unless you create them precisely.

Summary (impression)

It's a fairly heuristic method, but I think it's the most usable method (paper) under conditions where a speed of 50ms or less is required in combination with CVR estimation. For this job, the first paper I read about bid optimization was the one introduced above. I have read several other papers, but the impact of this paper was the strongest. If logs other than IP and time stamp are stored, I wanted to optimize with actual data based on this method. I'm sorry.

If you find something strange, we would appreciate it if you could point it out.

Recommended Posts

Bid optimization simulation
Escalator simulation
Recommendation optimization