1. Recommendation

In recommendation systems, we often train a machine learning model on historical data, compute a prediction score for every user-item pair, and show each user the top-N items with the highest prediction scores. However, with this approach popular items tend to receive high prediction scores and unpopular items low ones, so only popular items are advertised and unpopular items are never shown. Advertising sites often need to promote unpopular items as well in order to get some conversions (CV) for them.
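As a rough illustration of the problem, here is a minimal sketch of plain top-N selection. The score values, `TOP_N`, and the helper `top_n_items` are hypothetical and chosen only to make the popularity bias visible; they are not taken from the sample source.

```python
from collections import Counter

# Hypothetical scores for 4 users x 4 items; item 0 plays the "popular" item.
scores = [
    [0.9, 0.4, 0.3, 0.1],
    [0.8, 0.5, 0.2, 0.1],
    [0.9, 0.3, 0.6, 0.2],
    [0.7, 0.6, 0.1, 0.3],
]
TOP_N = 2


def top_n_items(user_scores, n):
    """Return the indices of the n highest-scoring items, best first."""
    ranked = sorted(range(len(user_scores)),
                    key=lambda i: user_scores[i], reverse=True)
    return ranked[:n]


# Pick the top-N items independently for each user.
recommended = [top_n_items(row, TOP_N) for row in scores]

# Count how many users each item is shown to.
exposure = Counter(item for items in recommended for item in items)
for item in range(len(scores[0])):
    print(f"Item{item}: shown to {exposure[item]} users")
```

In this toy data every user is shown item 0 and nobody is shown item 3, which is the imbalance the rest of the article tries to correct.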

2. Optimization

This can be resolved by limiting, item by item, the number of users each item is advertised to. For example, you can set a maximum number of users for popular items and a minimum number of users for unpopular items.

3. Example

Suppose there are 30 users and 20 items, and each user is recommended at most 5 items.

However, each item may be advertised to at most 10 users. Items 0 and 1 are popular, so each of them may be advertised to at most 3 users. Items 18 and 19 are unpopular, so each of them must be advertised to at least 9 users.

Below, $ scores_{ui} $ denotes the predicted score of user u for item i.

Expressed as a mathematical formulation, the conditions above become:

Variables

$ choices_{ui} $: whether item i is advertised to user u (1 or 0)

Objective function

Maximize $ \sum_{u} \sum_{i} scores_{ui} \cdot choices_{ui} $.

Constraints

1. $ \sum_{i} choices_{ui} \le 5 \ (\forall u) $
2. $ \sum_{u} choices_{ui} \le 10 \ (\forall i) $
3. $ \sum_{u} choices_{ui} \le 3 \ (i = 0, 1) $
4. $ \sum_{u} choices_{ui} \ge 9 \ (i = 18, 19) $
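To make the constraints concrete, the following sketch checks a candidate 0/1 matrix against them. The function name `check_constraints` and its parameters are hypothetical helpers introduced here for illustration; the default limits are the ones stated in this example.

```python
def check_constraints(choices, per_user_max=5, per_item_max=10,
                      popular=(0, 1), popular_max=3,
                      unpopular=(18, 19), unpopular_min=9):
    """Return a list of violated constraints for a 0/1 matrix choices[u][i]."""
    violations = []
    # Constraint 1: each user is shown at most per_user_max items.
    for u, row in enumerate(choices):
        if sum(row) > per_user_max:
            violations.append(f"user {u}: {sum(row)} items > {per_user_max}")
    for i in range(len(choices[0])):
        count = sum(row[i] for row in choices)
        # Constraint 2: each item is shown to at most per_item_max users.
        if count > per_item_max:
            violations.append(f"item {i}: {count} users > {per_item_max}")
        # Constraint 3: popular items have a tighter upper limit.
        if i in popular and count > popular_max:
            violations.append(f"item {i}: {count} users > {popular_max}")
        # Constraint 4: unpopular items have a lower limit.
        if i in unpopular and count < unpopular_min:
            violations.append(f"item {i}: {count} users < {unpopular_min}")
    return violations


USER, ITEM = 30, 20
# Showing nothing to anyone breaks only the lower limits on items 18 and 19.
empty = [[0] * ITEM for _ in range(USER)]
# Showing items 18 and 19 to the first 9 users satisfies every constraint.
feasible = [[0] * ITEM for _ in range(USER)]
for u in range(9):
    feasible[u][18] = 1
    feasible[u][19] = 1

print(check_constraints(empty))
print(check_constraints(feasible))
```

The second matrix shows that the constraint set is feasible, although it is far from optimal because most recommendation slots are left unused.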

It can be solved by linear programming.

Written as actual code with pulp, the formulation looks like this:

import pulp

USER = 30
ITEM = 20
Users = list(range(USER))
Items = list(range(ITEM))

# scores is the USER x ITEM array of predicted scores, defined later in this article
prob = pulp.LpProblem("test", pulp.LpMaximize)

# Variable declaration: choices[u][i] is 1 if item i is advertised to user u, else 0
choices = pulp.LpVariable.dicts("Choice", (Users, Items), 0, 1, pulp.LpInteger)

# Objective function: total predicted score of the chosen (user, item) pairs
prob += pulp.lpSum([scores[u][i] * choices[u][i] for u in Users for i in Items])

# Constraints
# 1. At most 5 items per user: $\sum_{i} choices_{ui} <= 5 (\forall u)$
for u in Users:
    prob += pulp.lpSum([choices[u][i] for i in Items]) <= 5

# 2. At most 10 users per item: $\sum_{u} choices_{ui} <= 10 (\forall i)$
for i in Items:
    prob += pulp.lpSum([choices[u][i] for u in Users]) <= 10

# 3. At most 3 users for popular items: $\sum_{u} choices_{ui} <= 3 (i=0,1)$
for i in [0, 1]:
    prob += pulp.lpSum([choices[u][i] for u in Users]) <= 3

# 4. At least 9 users for unpopular items: $\sum_{u} choices_{ui} >= 9 (i=18,19)$
for i in [18, 19]:
    prob += pulp.lpSum([choices[u][i] for u in Users]) >= 9

status = prob.solve()

The code above solves the problem.
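Once the solver has run, the result still has to be read back out of the variables. The sketch below shows one way to do that on a deliberately tiny problem; the 3 x 3 score values and the limits (1 item per user, at most 1 user for item 0, at least 1 user for item 2) are hypothetical and are not the data used in this article.

```python
import pulp

# Hypothetical toy data: 3 users x 3 items.
scores = [[0.9, 0.5, 0.1],
          [0.8, 0.6, 0.2],
          [0.7, 0.4, 0.3]]
Users = list(range(3))
Items = list(range(3))

prob = pulp.LpProblem("toy", pulp.LpMaximize)
choices = pulp.LpVariable.dicts("Choice", (Users, Items), 0, 1, pulp.LpInteger)
prob += pulp.lpSum([scores[u][i] * choices[u][i] for u in Users for i in Items])

# Each user is shown at most 1 item.
for u in Users:
    prob += pulp.lpSum([choices[u][i] for i in Items]) <= 1
# Item 0 (popular) is shown to at most 1 user.
prob += pulp.lpSum([choices[u][0] for u in Users]) <= 1
# Item 2 (unpopular) is shown to at least 1 user.
prob += pulp.lpSum([choices[u][2] for u in Users]) >= 1

# Solve with the bundled CBC solver, suppressing its log output.
prob.solve(pulp.PULP_CBC_CMD(msg=False))

# Read back the solver status, the objective value, and the 0/1 decisions.
status = pulp.LpStatus[prob.status]
objective = pulp.value(prob.objective)
table = [[int(round(choices[u][i].value())) for i in Items] for u in Users]
per_item = [sum(row[i] for row in table) for i in Items]

print(status, objective)
for u, row in enumerate(table):
    print(f"User{u}", row)
print("users per item:", per_item)
```

Rounding the variable values before converting them to `int` guards against solvers that return values such as 0.9999999 for a binary variable. Checking `status` first is worthwhile because an infeasible constraint set would otherwise go unnoticed.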

The scores are generated as shown below. Items 0 and 1 are popular, so their scores are raised; items 18 and 19 are unpopular, so their scores are lowered.

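import numpy as np

# Uniform random scores in [0, 1) for every (user, item) pair
np.random.seed(10)
scores = np.random.rand(USER, ITEM)
# Raise the scores of the popular items 0 and 1
scores[:,0] += 0.3
scores[:,1] += 0.3
# Lower the scores of the unpopular items 18 and 19
scores[:,18] -= 0.3
scores[:,19] -= 0.3
# Keep all scores within [0, 1]
scores = np.clip(scores, 0, 1)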
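As a quick sanity check on this setup, the sketch below regenerates the scores in the same way and prints the mean score of the shifted items. The variable name `item_means` is introduced here for illustration only.

```python
import numpy as np

USER, ITEM = 30, 20

# Same generation steps as above, written with index lists for brevity.
np.random.seed(10)
scores = np.random.rand(USER, ITEM)
scores[:, [0, 1]] += 0.3     # popular items
scores[:, [18, 19]] -= 0.3   # unpopular items
scores = np.clip(scores, 0, 1)

# Mean predicted score of each item across all users.
item_means = scores.mean(axis=0)
for i in (0, 1, 18, 19):
    print(f"Item{i}: mean score {item_means[i]:.2f}")
```

The popular items should come out with clearly higher mean scores than the unpopular ones, which is why unconstrained selection rarely picks items 18 and 19.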

The sample source is available at https://github.com/tohmae/pulp_sample/blob/master/score_optimize.py

Optimizing with the scores above, but without constraints 2, 3 and 4, gives the following result:

	Item0	Item1	Item2	Item3	Item4	Item5	Item6	Item7	Item8	Item9	Item10	Item11	Item12	Item13	Item14	Item15	Item16	Item17	Item19
User0	1	0	0	1	0	0	0	1	0	0	0	1	0	0	1	0	0	0	0
User1	1	0	0	1	0	0	0	0	0	0	1	0	1	0	0	0	0	1	0
User2	0	1	0	1	0	0	0	1	0	0	0	0	0	0	0	1	1	0	0
User3	1	1	0	0	0	0	0	0	0	1	0	0	0	0	0	0	1	1	0
User4	1	1	1	0	0	0	0	0	0	0	0	0	0	1	0	0	1	0	0
User5	1	1	0	0	1	1	0	0	0	0	0	0	1	0	0	0	0	0	0
User6	0	1	0	1	0	0	0	0	0	0	0	0	0	0	1	1	0	0	1
User7	1	1	0	1	0	0	0	0	0	0	0	0	0	1	0	1	0	0	0
User8	0	0	0	0	0	0	0	0	1	0	1	1	0	1	0	0	1	0	0
User9	1	1	0	0	1	1	0	0	0	0	0	0	0	0	0	0	0	1	0
User10	1	1	0	0	0	0	0	0	0	0	0	1	1	0	1	0	0	0	0
User11	0	1	0	0	1	0	0	1	0	0	0	1	1	0	0	0	0	0	0
User12	0	0	1	0	0	1	0	0	1	0	0	0	1	0	1	0	0	0	0
User13	0	0	0	1	0	0	0	1	1	0	1	0	1	0	0	0	0	0	0
User14	0	0	1	1	0	0	1	0	0	0	0	0	0	0	1	1	0	0	0
User15	1	0	1	0	0	0	0	0	0	0	0	0	1	0	1	1	0	0	0
User16	1	0	0	1	0	0	0	1	0	0	0	0	0	1	0	1	0	0	0
User17	0	1	0	1	0	0	0	0	0	1	0	0	0	1	0	0	0	1	0
User18	0	0	1	1	0	0	1	0	0	1	0	0	0	0	0	0	0	1	0
User19	1	0	0	1	1	0	0	0	0	0	0	0	0	0	0	1	0	1	0
User20	0	1	0	0	1	0	0	0	0	1	1	0	0	0	0	0	0	1	0
User21	0	0	0	0	1	1	0	0	0	0	0	1	0	1	0	1	0	0	0
User22	1	0	0	1	0	0	0	0	0	0	1	1	0	0	0	0	0	1	0
User23	0	0	0	1	1	0	0	1	0	1	0	0	0	0	0	1	0	0	0
User24	0	0	0	0	1	1	0	0	0	1	0	0	0	1	0	0	1	0	0
User25	0	0	0	0	0	1	0	0	0	0	1	0	0	1	1	0	0	1	0
User26	0	1	1	0	1	0	0	0	0	0	0	1	0	1	0	0	0	0	0
User27	0	0	0	0	0	0	1	0	0	0	1	1	0	0	1	0	1	0	0
User28	0	1	0	0	0	1	0	0	0	0	0	1	1	0	0	0	1	0	0
User29	1	1	0	0	0	0	0	1	0	1	0	0	0	0	0	0	0	1	0

With constraints 2, 3 and 4 added, the result is:

	Item0	Item1	Item2	Item3	Item4	Item5	Item6	Item7	Item8	Item9	Item10	Item11	Item12	Item13	Item14	Item15	Item16	Item17	Item18	Item19
User0	0	0	0	0	0	0	0	1	0	0	0	1	0	0	1	0	0	0	1	1
User1	0	0	0	0	0	0	0	0	1	0	1	0	1	0	0	0	0	1	0	1
User2	0	0	0	1	0	0	0	1	0	0	0	0	0	0	0	1	1	0	1	0
User3	1	0	0	0	0	0	0	0	0	1	0	0	0	0	0	0	1	1	1	0
User4	1	0	1	0	0	0	0	0	0	1	0	0	0	1	0	0	1	0	0	0
User5	0	1	0	0	1	1	0	0	0	0	0	0	1	0	0	0	0	0	1	0
User6	0	1	0	1	0	0	0	0	0	0	0	0	0	0	1	1	0	0	0	1
User7	0	0	0	1	0	1	0	0	1	0	0	0	0	1	0	1	0	0	0	0
User8	0	0	0	0	0	0	0	0	1	0	1	1	0	1	0	0	1	0	0	0
User9	0	0	0	0	1	1	0	0	0	0	0	1	0	0	0	0	1	1	0	0
User10	0	0	0	1	0	0	1	0	0	0	0	1	1	0	1	0	0	0	0	0
User11	0	0	0	0	1	0	0	1	0	0	0	1	1	0	0	0	0	0	0	1
User12	0	0	0	0	0	1	0	0	1	0	0	0	1	0	1	0	0	0	1	0
User13	0	0	0	1	0	0	0	1	1	0	1	0	1	0	0	0	0	0	0	0
User14	0	0	1	1	0	0	1	0	0	0	0	0	0	0	1	1	0	0	0	0
User15	0	0	1	0	1	0	0	0	0	0	0	0	1	0	1	1	0	0	0	0
User16	1	0	0	0	0	0	0	1	0	0	0	0	0	1	0	1	0	0	0	1
User17	0	0	0	1	0	0	0	0	0	1	0	0	0	1	0	0	0	1	1	0
User18	0	0	1	0	0	0	1	0	0	1	0	0	0	0	0	0	0	1	0	1
User19	0	0	1	1	1	0	0	0	0	0	0	0	0	0	0	1	0	1	0	0
User20	0	0	0	0	1	0	0	0	0	1	1	0	0	0	0	1	0	1	0	0
User21	0	0	0	0	0	1	0	0	0	0	0	1	0	1	0	1	0	0	0	1
User22	0	0	0	1	0	0	0	0	0	0	1	1	0	0	0	0	0	1	1	0
User23	0	0	0	1	1	0	0	1	0	1	0	0	0	0	0	1	0	0	0	0
User24	0	0	0	0	1	1	0	0	0	1	0	0	0	1	0	0	1	0	0	0
User25	0	0	0	0	0	1	0	0	0	0	1	0	0	1	1	0	0	1	0	0
User26	0	0	1	0	1	0	0	0	0	0	0	1	0	1	0	0	0	0	0	1
User27	0	0	0	0	0	0	0	0	0	0	1	1	0	0	1	0	1	0	1	0
User28	0	0	0	0	0	1	0	0	0	0	0	1	1	0	0	0	1	0	1	0
User29	0	1	0	0	0	0	0	1	0	1	0	0	0	0	0	0	0	1	0	1

Comparing the two tables, adding constraints 2, 3 and 4 reduces the number of users who are shown items 0 and 1 (from 13 and 14 users to 3 each) and raises the number of users who are shown items 18 and 19 to 9 each.

I was able to optimize the recommendations using linear programming.