In this series, we introduce analytical techniques and methods. Starting with specialized content tends to skew the discussion toward technical and academic material, so this chapter first gives a brief overview of how analytical technology is used on the business side, taking online advertising operations as the example. Chapter 2 then gives an overview of analytical techniques from a technical perspective.
Using online advertising operations as an example, let's consider the use of analysis technology around the following two keywords.
A. Visualization, B. Control
As a concrete example, I will explain how analysis technology is used by following the simplified online advertising operation process shown in fig.1 below. I will briefly walk through the flow of fig.1 so that readers who are not familiar with online advertising operations can picture how it works.
Start: Set the goals for ad delivery on the web. For example: over one month, achieve 100 product purchases via online advertising with a budget of 1 million yen.
(1): Estimate budget consumption and the conversion rate (conversions being information requests, product purchases, etc.) from the target audience and the estimated number of users. Consider and decide on candidate delivery destinations based on a preliminary survey of the data.
(2): Configure delivery on the chosen delivery destination (for example, Google AdWords) based on (1). Specifically, create structures called campaigns and ad groups in the system, and set the creatives and keywords to deliver there. Also set the daily budget and the CPC (Cost Per Click: the cost per click).
(3): Start delivery.
(4): Delivery data is stored in a database or similar. You can check simple tabulated results on the management screen of the delivery system or download a report.
(5): Acquire and evaluate time-series data on spend and conversions for each campaign or ad group. If necessary, readjust the daily budget and CPC based on the evaluation, and repeat steps (3) to (5).
End: Delivery ends when the goal is achieved, the delivery period ends, or the entire budget is exhausted.
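The goal in the Start step (100 purchases on a budget of 1 million yen) implies a target cost per acquisition, and step (1) amounts to checking whether that target looks reachable. The following is only a rough sketch of that arithmetic. The CPC and conversion rate values are hypothetical numbers chosen for illustration and do not come from any real campaign.

```python
# Back-of-the-envelope check of the example goal:
# 100 purchases in one month on a budget of 1,000,000 yen.
# assumed_cpc and assumed_cvr are hypothetical values for illustration only.

def target_cpa(budget_yen: float, target_conversions: int) -> float:
    """Cost per acquisition that must not be exceeded to hit the goal."""
    return budget_yen / target_conversions


def estimate_conversions(budget_yen: float, cpc_yen: float, cvr: float) -> float:
    """Expected conversions if the whole budget is spent at the given CPC and conversion rate."""
    clicks = budget_yen / cpc_yen
    return clicks * cvr


budget = 1_000_000
goal = 100
assumed_cpc = 80.0   # hypothetical cost per click in yen
assumed_cvr = 0.01   # hypothetical: 1% of clicks convert

cpa_limit = target_cpa(budget, goal)
expected = estimate_conversions(budget, assumed_cpc, assumed_cvr)

print(f"target CPA: {cpa_limit:.0f} yen")
print(f"expected conversions: {expected:.0f}")
print("goal looks feasible" if expected >= goal else "goal looks infeasible")
```

In practice the CPC and conversion rate would be estimated from past delivery data or from the preliminary survey in step (1), not fixed by hand as they are here.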
For more specific information on online advertising, please refer to the following books.
・ [Attribution: A New Method That Overturns the Idea of Advertising Effectiveness from the Ground Up](http://www.amazon.co.jp/dp/4844331841/) Gen Tanaka, Yasuo Sato, Go Sugihara, Yuichi Arizono, Impress Japan
・ [Introduction to DSP/RTB Audience Targeting: The Advertising Revolution from "Frame" to "People" Realized in the Big Data Era](http://www.amazon.co.jp/dp/4864780013/) Ryuji Yokoyama, Kenichi Sugawara, Yoshiteru Umeda, Impress R&D
・ [The Ad Technology: From the Basics of Data Marketing to the Concept of Attribution](http://www.amazon.co.jp/dp/4798136557/) Kenichi Sugawara, Yuichi Arizono, Yoshihiro Okada, Go Sugihara, Shoeisha
・ [Ad Technology Professional Training Reader: Optimizing Advertising Effectiveness in the Digital Marketing Era!](http://www.amazon.co.jp/dp/4774164291/) Ryoji Yasushima, Yusuke Sato, Yuki Matsuda, Keiji Tokiyoshi, Takeshi Ishiguro, Taku Ogawa, Gijutsu-Hyoronsha
It is no exaggeration to say that the first step in data analysis is to aggregate the accumulated raw data and visualize (look at) the data in tables and graphs. In principle, the accumulated data reflects the facts up to this point (though there may be noise). Leveraging data visualization helps you make fact-based decisions and increases your chances of success in your next business cycle.
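As a minimal sketch of what "aggregating the accumulated raw data" can look like, the following sums raw delivery rows per campaign and derives a few common metrics. The row layout, campaign names, and numbers are all made up for illustration; real delivery reports will have their own columns.

```python
from collections import defaultdict

# Hypothetical raw delivery rows:
# (campaign, impressions, clicks, cost_yen, conversions)
rows = [
    ("campaign_a", 10000, 120, 9600, 3),
    ("campaign_a", 12000, 150, 12000, 2),
    ("campaign_b", 8000, 40, 4000, 0),
    ("campaign_b", 9000, 60, 5400, 1),
]


def aggregate(rows):
    """Sum the raw rows per campaign and derive CTR, CPC and CPA."""
    totals = defaultdict(lambda: [0, 0, 0, 0])
    for campaign, imps, clicks, cost, convs in rows:
        t = totals[campaign]
        t[0] += imps
        t[1] += clicks
        t[2] += cost
        t[3] += convs

    summary = {}
    for campaign, (imps, clicks, cost, convs) in totals.items():
        summary[campaign] = {
            "impressions": imps,
            "clicks": clicks,
            "cost": cost,
            "conversions": convs,
            # Ratios are None when the denominator is zero.
            "ctr": clicks / imps if imps else None,
            "cpc": cost / clicks if clicks else None,
            "cpa": cost / convs if convs else None,
        }
    return summary


summary = aggregate(rows)
for name, m in sorted(summary.items()):
    print(name, m["clicks"], m["cost"], m["conversions"], m["cpa"])
```

A table like this is the kind of simple tabulation that can then be plotted or compared against the goals set at the start of the process.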
For example, consider the process in fig.1. If you have past delivery records, visualization helps you clarify the facts in that past data during step (1), and design and configure delivery during step (2).
Recently, it has become relatively easy to visualize data with tools such as Excel (Solver, Analysis ToolPak, Power Pivot), Google Analytics, the management screens of delivery systems, and dashboards (Tableau, etc.).
When people need to give meaning to data (for example, when submitting a report to a customer), plotting a large amount of data as-is makes the report hard to interpret. For this reason, you need to think carefully about how the data is shown. If you want forms of expression beyond the graphs Excel can produce, tools such as D3.js, Graphviz, and R offer a wider range. Using these tools may require more skill in preparing the input data than Excel does.
Depending on the size of the data and what you want to aggregate, a database or distributed processing (Hadoop is a well-known example) may be used as the analysis technology. When simple aggregation is not enough to understand the data, it often becomes necessary to process the data (user action logs, etc.) in a programming language. While log-based aggregation provides visibility into the details of user behavior, performing large-scale aggregation at high speed under complex conditions can be technically difficult.
During online ad delivery, it is important to evaluate the delivery results and optimize the setting parameters (daily budget, CPC, etc.) every day in steps (3) to (5) of fig.1. The visualization described in A above is important for doing this. Although the visualization and evaluation of data and the adjustment of delivery parameters in (3) to (5) can be done manually, the following issues are often faced in day-to-day operations.
Examples of possible solutions to this challenge are
For the p1 and p3 processes, an automated system can be built using the technology introduced in A. Visualization (databases, distributed processing, etc.). Such a system can be developed without special logic such as machine learning. The p2 process, on the other hand, in some cases requires mathematical logic for evaluating the input data. The evaluation logic for p2 is chosen from rule-based logic, algorithm-based logic, or a hybrid of the two.
In the rule-based approach, KPI criteria (target CPA, etc.) are set in advance, multiple pre-designed rules are evaluated by conditional branching, and changes to the budget and other operational setting parameters are executed inside the logic. This does not necessarily require mathematics or complex analytical techniques. For example, consider the case where the target CPA is 1,000 yen and the budget to be consumed in one day is 30,000 yen. When the delivery data (at the campaign level) evaluates, via conditional branching, to (target CPA, daily budget) = (achieved, not achieved), the decision is to raise the CPC (bid price) of the ad group in order to promote budget consumption. The size of the CPC change may come from a simple formula that the operator defines in advance (if reports are evaluated in Excel, it may be based on the operator's experience).
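The conditional branching described above can be sketched as follows. The target CPA and daily budget come from the example in the text. The 10% step and the behavior of every branch other than (achieved, not achieved) are assumptions added for illustration, standing in for whatever rules an operator would actually define.

```python
# Rule-based sketch of the example in the text.
TARGET_CPA = 1000      # yen, from the example
DAILY_BUDGET = 30000   # yen, from the example
CPC_STEP = 0.10        # assumed operator-defined step (10%)


def decide_cpc(cost: float, conversions: int, current_cpc: int) -> int:
    """Return the new CPC for one day's campaign-level delivery data."""
    # Target CPA is "achieved" when cost per conversion is at or below the target.
    cpa_achieved = conversions > 0 and cost / conversions <= TARGET_CPA
    # Daily budget is "achieved" when the day's budget has been fully consumed.
    budget_achieved = cost >= DAILY_BUDGET

    if cpa_achieved and not budget_achieved:
        # Case from the text: raise the CPC to promote budget consumption.
        return round(current_cpc * (1 + CPC_STEP))
    if not cpa_achieved and budget_achieved:
        # Assumed rule (not from the text): lower the CPC when CPA is too high.
        return round(current_cpc * (1 - CPC_STEP))
    # Assumed rule (not from the text): otherwise leave the CPC unchanged.
    return current_cpc


print(decide_cpc(cost=20000, conversions=25, current_cpc=50))  # CPA 800, budget left
print(decide_cpc(cost=30000, conversions=20, current_cpc=50))  # CPA 1500, budget spent
print(decide_cpc(cost=30000, conversions=40, current_cpc=50))  # CPA 750, budget spent
```

Each rule is easy to read and to explain to a customer, but the number of branches grows quickly as more KPIs and parameters are added.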
Merit
Demerit
In the algorithm-based approach, KPI criteria (target CPA, etc.) are set in advance, a pre-designed (mathematical) algorithm evaluates whether they are met, and the decisions to change the budget and other operational setting parameters are made inside the logic. For example, as in the rule-based case, consider a target CPA of 1,000 yen and a budget of 30,000 yen to be consumed in one day. When the delivery data (at the campaign level) evaluates to (target CPA, daily budget) = (achieved, not achieved), the decision is to raise the CPC (bid price) of the ad group in order to promote budget consumption. At this point, the internal logic ranks all campaigns and ad groups being delivered by jointly considering by how much the target CPA is achieved and how much budget is left, and decides how much to raise each CPC. Probabilistic and statistical methods are sometimes used to handle variables with uncertainty.
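One very simple way to turn "how much the target CPA is achieved" and "how much budget is left" into a ranking and a raise amount is sketched below. This scoring formula is an illustrative assumption, not the algorithm of any particular system, and the ad group names, the numbers, and the 30% cap are hypothetical.

```python
TARGET_CPA = 1000.0     # yen, from the example in the text
DAILY_BUDGET = 30000.0  # yen, from the example in the text
MAX_RAISE = 0.30        # assumed cap on the fractional CPC raise


def cpc_raise_rate(cost: float, conversions: int) -> float:
    """Fractional CPC raise in [0, MAX_RAISE] for one ad group."""
    if conversions == 0:
        return 0.0
    cpa = cost / conversions
    # How far the CPA beats the target (0 when the target is missed).
    cpa_margin = max(0.0, (TARGET_CPA - cpa) / TARGET_CPA)
    # Share of the daily budget that is still unspent.
    budget_left = max(0.0, (DAILY_BUDGET - cost) / DAILY_BUDGET)
    # Raise more when both the CPA margin and the remaining budget are large.
    return MAX_RAISE * cpa_margin * budget_left


def rank_ad_groups(ad_groups):
    """Return (name, raise_rate, new_cpc) tuples, largest raise first."""
    scored = []
    for name, cost, conversions, cpc in ad_groups:
        rate = cpc_raise_rate(cost, conversions)
        scored.append((name, rate, cpc * (1 + rate)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical delivery data: (name, cost_yen, conversions, current_cpc_yen)
ad_groups = [
    ("group_b", 27000, 30, 60),
    ("group_c", 12000, 8, 40),
    ("group_a", 15000, 30, 50),
]

ranking = rank_ad_groups(ad_groups)
for name, rate, new_cpc in ranking:
    print(f"{name}: raise {rate:.1%}, new CPC {new_cpc:.2f} yen")
```

Unlike the rule-based branches, the size of the raise here changes continuously with the data, which is where statistical treatment of noisy CPA estimates would naturally come in.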
Merit
Demerit
In this chapter, we looked at the use of analysis technology (visualization and control) in business, using online advertising operations as the example. Visualization technology helps you read facts from data and is used for reporting and similar tasks. While vast amounts of data make detailed user behavior visible, attention must also be paid to large-scale data aggregation technology and to how the data is presented. Control technology uses rule-based and algorithm-based (machine learning, etc.) logic to support process automation. The advantages and disadvantages of the rule-based and algorithm-based approaches complement each other, so when using them in practice you need to understand their characteristics and consider how well they fit the business.
In the following Chapter 2, I would like to give a brief overview of analytical methods such as machine learning. We will explain each technology from Chapter 3 onwards.
I'm sorry we haven't really started coding yet (laughs). Chapters without much coding will continue for a while, but please look forward to them, as we will share plenty of useful knowledge!