Last time: Please stop printing and import logging for logging

The article above is still strangely popular, but I first wrote it in 2016, about four years before this one.

The outline of my opinion has not changed much, but some details have. I have heard various opinions and gained more experience since then, so I want to write a short supplementary article.

An excuse: the background of the original article

To be honest, I had been looking at the logging behavior of some external software, and I wrote the article on impulse in a fit of "Muki!" (irritation). So it does not have the accuracy or fairness you would expect from a textbook. It was not reviewed either, unlike the times I have contributed to a circle's doujinshi.

Yet despite being an emotional article (or perhaps because of it?), it was read surprisingly widely ...

I can somewhat understand how readers feel. Logging is covered surprisingly little in technical books, and it is a field where I am still struggling to hone my own skills. So I am grateful if the article is useful, even if only as a hint.

This is not limited to that article: whenever I get a notification of an LGTM (formerly "Like") or a stock, I sometimes reread and rework the text, so the "Muki!" feeling may be milder now. As for the content, looking back on it now, I am leaving it as is, on the grounds that "well ... I did not write anything bad."

By the way, it is a little disappointing that I never saw a solid counter-article saying "No, that's wrong!". It would be interesting if various people wrote their own takes along the lines of "here is how I do it".

(I was glad to get feedback in the comments, though it came from people who were friends to begin with.)

I want everyone to think about and share the right logging strategy

Not just in Python, but whenever software is operated, logs are

one of the few points of contact with users, and
at the same time, one of the few points of contact with the program.

I do not think you can avoid taking this seriously as a major issue in software development.

Despite their importance, I think logs receive unusually little consideration compared to testing. My impression is that everyone is feeling their way, or building things based on the field they belong to and their own experience.

On the other hand, as I will explain later, I think logging strategies often consist more of points peculiar to each field (domain) than of common items. Even if you try to share them, it may be hard to convey the knowledge just by writing a few lines of best practice.

Good logging ⇔ Good software

As mentioned above, a log faces two ways, toward people and toward the program, so if the viewpoint or the subject is even slightly blurred, it instantly becomes "What does this log mean???". It is all too easy to lose track of who the information is for.

You could say that a good log is an important requirement for software that is good for its users and operators (well, the direct UI matters far more, but that aside).

If the quality of the error handling, or of the software's foundations, is poor in the first place, it is impossible to arrive at a log that lets (sensible) non-developers review their own execution environment without having to file a report. A good log can simply tell the operator, "This is what we suspect, so please check here!" In other words, I think software that outputs good logs has to be software that firmly understands its own capabilities and limits.

WARNING: Found duplicate entries for the query "otemoyan". Choosing the first.

The log above is just an example, but it lets the reader understand what is happening, what is being treated as a problem, and what the software chose to do about it. It is also meant to imply that "normally there should not be duplicate entries in this pattern; it is not an error, but be careful." If it matters to the log reader, "who issued the query" could also be added here. As a hypothetical, if the query corresponds to personal information, it needs to be masked in production.

Whether this log looks "good" probably rests on certain premises, and it always depends on the context.

In my case, there was a time when I did contract development for BtoBtoB, where I would read logs submitted by customers with something like ag or grep and identify problems in the customer's environment from them. My impression was that this kind of log works well in such an environment. I feel the support staff in between were able to respond smoothly to customers afterwards (I cannot say much about this, but nothing came back to me, which I take as a good sign).
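As a minimal sketch of how a log line like the one above might be emitted, here is one way to do it with the standard logging module. The logger name "search" and the helpers choose_entry and mask are hypothetical names made up for illustration, and hashing the query is only one assumed way of masking it in production.

```python
import hashlib
import logging
from typing import List, Optional

logger = logging.getLogger("search")  # hypothetical logger name


def mask(value: str) -> str:
    """Replace a possibly sensitive value with a short, stable digest."""
    return "sha256:" + hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]


def choose_entry(
    query: str, entries: List[str], mask_query: bool = False
) -> Optional[str]:
    """Return the first entry, warning when more than one entry was found."""
    if not entries:
        return None
    if len(entries) > 1:
        # Say what happened, why it is notable, and what was chosen.
        shown = mask(query) if mask_query else query
        logger.warning(
            'Found duplicate entries for the query "%s". Choosing the first.',
            shown,
        )
    return entries[0]
```

Calling choose_entry("otemoyan", ["a", "b"]) returns "a" and emits the warning; with mask_query=True the raw query text no longer appears in the message.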

What I missed at the time, what I think is important now

It has been a long time since I wrote the original article, but among the points raised about it, there was one that I honestly thought "I see" about, and it is my biggest regret:

"Logging tactics / strategies need to be close to the domain from the entrance to the exit."

Hmm, what does that mean?

The following factors should each have a clear and substantial impact on your logging strategy:

The size of the target program
The field where the software is used (for yourself, in-house services, small contracts, venture-scale BtoB, BtoC, even Google-scale global services, mobile front ends)
The data size (of the logs) handled
How often the logs are viewed, and the escalation route taken when something at warning level or above occurs

In the original article this was muddled, even within the scope of the Python language alone. After all, Python users are a very diverse group ...

As an excuse, the scale I originally had in mind was software that starts out "for myself" or as a "small contract" and then "grows up". The picture is that the number of users may increase, starting with the first client, then people in-house or whoever commissioned the work, but the initial load is not large.

On the other hand, the logging strategy will be very different for software designed as a "global service" from the start. For example, when rewriting a large-scale SaaS that is already popular, you face anywhere from thousands to hundreds of millions of QPS from day one.

In that case, put simply, the typical scale the software encounters is different, and that scale should be the first target of your logging strategy from the beginning of development. The volume of logs and the speed at which they accumulate (and the complexity of the infrastructure the log stream flows through) are also different.

If you have a large system in mind from the start, the target you assume will naturally change. In my own (subjective) sense, the larger the scale, the less meaningful the fine-grained log-level distinctions I care about become, and the more you need to focus on frequency of occurrence in user environments and on actual impact. Well, this really is a guess, though ...

I wrote with a certain assertiveness even though the reader's domain does not necessarily match my own, and I think this was clearly inappropriate as a writer (by the way, I have deliberately not fixed the original article. It is that kind of article, so please enjoy it as such).

If the domain is the same but opinions differ, it is easy to discuss the disagreement and work toward reconciling it, but reconciling across different domains is often difficult. Looking back, it was a mistake not to say, "This is the domain I have in mind, and this is the logging strategy for it."

By the way, although it was not in Python, I have also written apps in the mobile industry, and I thought about logging strategies then as well. That was about five years before 2016, when I wrote the original article.

In that case, well ... the expected scale is certainly "global", but the logs do not flow in as a constant stream on a single server backend. The targets are individual devices, each producing similar logs in fragments. The logging strategy here differs from small-scale log management on the server side.

It also differs a little from a large server backend. I think the way you stay aware of software and hardware versions changes. In troubleshooting it is not unusual to ask, "In which version did we change the wording of the log message emitted for this error?"

In any case, once software becomes that pervasive in society, the logic of how it is operated is completely different, and so is the thinking about the logging that supports it.
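One possible way to keep version information attached to every log line, sketched in Python for consistency with the rest of this article (the mobile apps in question were not written in Python). The values of APP_VERSION and DEVICE_MODEL, the VersionFilter class, and the logger name "mobile_app" are all hypothetical.

```python
import logging

APP_VERSION = "1.4.2"      # hypothetical application version
DEVICE_MODEL = "model-x"   # hypothetical hardware identifier


class VersionFilter(logging.Filter):
    """Attach version information to every record that passes through."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.app_version = APP_VERSION
        record.device_model = DEVICE_MODEL
        return True  # never drop the record, only annotate it


handler = logging.StreamHandler()
handler.addFilter(VersionFilter())
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s "
    "[app=%(app_version)s device=%(device_model)s] %(message)s"
))

logger = logging.getLogger("mobile_app")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.INFO)
```

With this in place, a log fragment collected from an individual device carries the versions it came from, so a changed message wording can be traced back to a release.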

What I'm trying now

With the above in mind, here is a supplementary note on what I am trying in my own domain.

In some projects I do pass `logger: Logger` or `logger: Optional[Logger]` to functions. No default argument is specified. However, explicitly passing None may be allowed (assume the software scale here is "quite small").

Without `Optional`, it means "you must pass a logger". I feel this is often a very low-level, fine-grained function, or a component that depends little on its context (external environment).
With `Optional`, it means "if you do not want to pass a logger, pass None". The called function uses the object if one is passed; otherwise it uses the standard logger of the domain the function belongs to.

I also used `logger: Optional[Logger] = None` for a while, but this was a bad move. Logs can be silently swallowed, which makes me very sad, so now I stick to the two options above. Two options may not even be necessary, though.

This method is probably only useful for development at a scale where you "control the entire log" and "look after each component in detail". If the service you handle is on the scale of a leading BtoC service in Japan, it depends on the load, but you probably would not take this approach.

So I do not think this method is very general, but by doing it consistently within the range I personally handle, it has become much easier to see what is going on in the project's logs.

To begin with, I think part of this is my personal fixation ("kodawari"), on top of the small scale. Whether it makes for good or bad logs is arguable, but it is still better than `print()`. If another engineer is the lead and the coding conventions include a different logging convention, I will quietly follow it.
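The two signatures could look roughly like the following sketch. The function names parse_line and load_values and the logger name "myapp.storage" are hypothetical; the point is only that neither parameter has a default, so a caller cannot forget the logger by accident.

```python
import logging
from logging import Logger
from typing import List, Optional

# Hypothetical "standard logger of the domain" this module belongs to.
_domain_logger = logging.getLogger("myapp.storage")


def parse_line(line: str, logger: Logger) -> int:
    """Low-level helper: the caller must supply a logger."""
    logger.debug("parsing %r", line)
    return int(line.strip())


def load_values(lines: List[str], logger: Optional[Logger]) -> List[int]:
    """Higher-level function: pass None explicitly to use the domain logger."""
    log = logger if logger is not None else _domain_logger
    log.info("loading %d lines", len(lines))
    return [parse_line(line, log) for line in lines]
```

Because there is no `= None` default, load_values(["7"]) raises a TypeError, while load_values(["7"], None) is a visible, deliberate choice to fall back to the domain logger.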

What I'm thinking about now

This has not changed much: my problem is that `Logger.info()` and `Logger.debug()` alone are too coarse-grained. I want `Logger.trace()` (below `debug()`, fine-grained down to the level of function entry and exit), and there are moments when I want `Logger.average()` (possibly a Zabbix term; above warning and below error, or between info and warning). Again, this may matter less as the software scale increases.

If you write a wrapper for Logger, you can probably do it. In Python, DEBUG, INFO, and so on are constant values, and there are sufficient gaps between them (https://docs.python.org/3/library/logging.html#logging-levels).

However, there is also a rule of thumb that things work better in the long run if you stick to the "standard" of the language or framework, and intuitively this feels like an oddly hacky trick, so I have been hesitating for a long time. I simply envy logback, which includes trace from the start.
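A wrapper along those lines might look like the sketch below. The numeric values 5 and 35 and the class name ExtendedLogger are my own assumptions chosen to fit the gaps between the standard levels; they are not part of the standard library.

```python
import logging

# Assumed values that fit between the standard levels
# (DEBUG=10, INFO=20, WARNING=30, ERROR=40).
TRACE = 5      # below DEBUG
AVERAGE = 35   # above WARNING, below ERROR

logging.addLevelName(TRACE, "TRACE")
logging.addLevelName(AVERAGE, "AVERAGE")


class ExtendedLogger:
    """Thin wrapper that adds trace() and average() to a standard Logger."""

    def __init__(self, logger: logging.Logger) -> None:
        self._logger = logger

    def trace(self, msg, *args, **kwargs):
        self._logger.log(TRACE, msg, *args, **kwargs)

    def average(self, msg, *args, **kwargs):
        self._logger.log(AVERAGE, msg, *args, **kwargs)

    def __getattr__(self, name):
        # Delegate debug(), info(), warning(), etc. to the wrapped logger.
        return getattr(self._logger, name)
```

Usage would be something like ExtendedLogger(logging.getLogger(__name__)).trace("enter %s", "f"). One caveat of wrapping: caller information such as the function name in the record points at the wrapper method unless a stacklevel argument is passed through (available from Python 3.8).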

Summary

This time I was not in any heightened emotional state and just wrote loosely, so it may not have been interesting, but I wrote it as a follow-up.

[PYTHON] Thinking about logging, then