[PYTHON] How can I write a good program?

I don't have a computer science background, so I have had to learn a lot since getting a job as a programmer. I now work as a backend engineer on a machine learning team, in a position where I review juniors' code and teach them (surprise!).

However, the way I learned was not efficient, and I think I wasted a lot of time. I can't stand to let my juniors go through the same thing, so I'm writing this article to put into words what I think about when programming. It may all be obvious to those who can already do it.

Before reading this article, I would like you to read about programmer values and attitudes in pieces such as "How To Become A Hacker" and "Excellent Tips for becoming a good programmer".

There are many things to learn besides programming, but since there is no end to them, I will leave them for another occasion.

How do I deal with errors?

Errors happen all the time. Sometimes you cannot get the job done unless you know how to deal with them. At such times, programmers (consciously or not) follow this "hypothesis testing" process.

  1. Read the error message
  2. Form a hypothesis about the cause
  3. Investigate and test it
  4. If that doesn't work, form a different hypothesis and investigate again

If you still don't understand, ask someone who seems to know more. Also, when investigating, the official documentation is more accurate than Qiita or Stack Overflow, so check that first. AWS and Python in particular, which we use often, are well documented.

For example, in a batch process such as "writing numerical calculation results into MySQL", a timeout error may occur only in the production environment for some reason. In that case, first hypothesize that "the timeout setting may differ between the test environment and the production environment" and check whether that is actually so. If it turns out that a sufficiently long value is already set, move on to the next hypothesis: "maybe the connection is opened before the numerical calculation and stays open for a long time".
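If the second hypothesis holds, one possible remedy is to finish the calculation first and open the connection only for the write. The following is a minimal sketch under stated assumptions: the standard library's sqlite3 stands in for MySQL so that it runs anywhere, and `heavy_calculation`, `save_results`, and `run_batch` are hypothetical names, not the code from the actual batch job.

```python
import sqlite3
from contextlib import closing


def heavy_calculation(n):
    """Stand-in for a long-running numerical job (hypothetical)."""
    return [(i, i * i) for i in range(n)]


def save_results(db_path, rows):
    # Open the connection only when there is something to write,
    # so it is never left idle while the calculation runs.
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS results (n INTEGER, value INTEGER)"
            )
            conn.executemany("INSERT INTO results VALUES (?, ?)", rows)


def run_batch(db_path, n):
    rows = heavy_calculation(n)  # 1. compute first, no connection held
    save_results(db_path, rows)  # 2. connect, write, disconnect
    return len(rows)
```

With a real MySQL driver the shape would be the same; only the connect call and the placeholder syntax would differ.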

Once you get into this habit, you pick up knowledge of the next layer down while dealing with errors. Without it, you will be stuck with shallow knowledge no matter how many years you work as a programmer.

However, to avoid yak shaving and investigating endlessly, always keep in mind the option of "getting it done another way for now"!

The program is slow! How can I make it faster?

There are times when "it works, but it's strangely slow and unusable ...". In that case, proceed as follows.

  1. Identify the bottleneck
  2. Speed up that part
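Step 1 can be tried with the standard library's cProfile. This is only a sketch: `slow_part`, `fast_part`, and `main` are hypothetical functions invented to give the profiler something to measure.

```python
import cProfile
import io
import pstats


def slow_part(n):
    # Deliberately quadratic: the hypothetical bottleneck.
    total = 0
    for i in range(n):
        for j in range(n):
            total += i * j
    return total


def fast_part(n):
    return sum(range(n))


def main():
    fast_part(1000)
    return slow_part(300)


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_report(main)
print(report)
```

In the printed table, the functions with the largest cumulative time are the candidates to look at first; everything else can be left alone.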

If you optimize without looking at the bottleneck, you end up with a mess that adds complexity without improving speed. I think the same way of thinking applies to improving business processes and to improving the accuracy of machine learning models (once you add a step of "defining the target metric"). I think Dr. Andrew Ng's Coursera Machine Learning course made a similar point: "In a machine learning pipeline, we should improve the modules that contribute most to the loss of accuracy."

How to deal with it depends on the bottleneck.

Calculation processing is slow (CPU bound / memory bound)

If your program is strangely slow, first suspect a problem with the "complexity" of the algorithm (more precisely, its time complexity). Specifically, unnecessary nested loops are a common cause.

# As the number n of tag ids grows, the number of steps grows as O(n^2)!
def find_products_slow(favorite_tag_ids, products):
    # products: list of (product id, list of tag ids attached to the product)
    result = []
    for tag_id in favorite_tag_ids:
        for product_id, product_tag_ids in products:
            for product_tag_id in product_tag_ids:
                if tag_id == product_tag_id:
                    result.append(product_id)
                    break
    return result

To deal with this, you need to learn "algorithms and data structures". My recommendation is [Algorithms, Part I](https://www.coursera.org/learn/algorithms-part1), a free Coursera course that I learned about from an article by a Hatena developer. If the problem is the amount of calculation, please be aware that reimplementing the program in a faster language (for example, Go) will not solve it.

The code above can be rewritten to look products up in a dictionary (hash table) instead.

# Dictionary: tag id -> {product ids}
product_ids_by_tag = {
    "Tag 1": {"Product 1", "Product 2"},
    "Tag 2": {"Product 2", "Product 3"},
    # ...
}

def find_products_fast(candidate_product_ids, favorite_tag_ids):
    result = set(candidate_product_ids)
    for tag_id in favorite_tag_ids:
        # Set intersection: keep only the candidates that carry this tag
        result &= product_ids_by_tag[tag_id]
    return result
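The tag dictionary has to be built from the product data once before it can be used. A minimal sketch, assuming the data arrives as (product id, list of tag ids) pairs as in the nested-loop version; the function names are hypothetical. It also shows the difference between `|` (products with any favorite tag, which is what the nested loops collect) and `&` (products with every favorite tag).

```python
from collections import defaultdict


def build_tag_index(products):
    """Map each tag id to the set of product ids that carry it (one pass)."""
    index = defaultdict(set)
    for product_id, tag_ids in products:
        for tag_id in tag_ids:
            index[tag_id].add(product_id)
    return index


def products_with_any_tag(index, favorite_tag_ids):
    result = set()
    for tag_id in favorite_tag_ids:
        result |= index.get(tag_id, set())  # set union
    return result


def products_with_all_tags(index, favorite_tag_ids, candidate_product_ids):
    result = set(candidate_product_ids)
    for tag_id in favorite_tag_ids:
        result &= index.get(tag_id, set())  # set intersection
    return result


products = [
    ("Product 1", ["Tag 1"]),
    ("Product 2", ["Tag 1", "Tag 2"]),
    ("Product 3", ["Tag 2"]),
]
index = build_tag_index(products)
print(products_with_any_tag(index, ["Tag 1", "Tag 2"]))
print(products_with_all_tags(index, ["Tag 1", "Tag 2"], [p for p, _ in products]))
```

Building the index costs one pass over the data; after that each tag is a single hash lookup instead of a scan over every product.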

Also, if you cannot reduce the amount of calculation, consider parallel processing. In Python, the usual tools for this are concurrent.futures.ProcessPoolExecutor and [joblib](https://joblib.readthedocs.io/en/latest/parallel.html#parallel).

Sometimes the problem is not the CPU but running out of memory. Check the server metrics. On Unix you can check with the top command.
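As a rough illustration of ProcessPoolExecutor, the sketch below splits a CPU-bound job into chunks and runs them in separate processes. The prime-counting task and the function names are hypothetical and chosen only because the work is easy to split; `split_range` assumes `stop` is greater than zero.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(bounds):
    """CPU-bound worker: count primes in the half-open range [start, stop)."""
    start, stop = bounds
    count = 0
    for n in range(max(start, 2), stop):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split_range(stop, parts):
    """Split [0, stop) into at most `parts` contiguous (start, stop) chunks."""
    step = -(-stop // parts)  # ceiling division
    return [(lo, min(lo + step, stop)) for lo in range(0, stop, step)]


def count_primes_parallel(stop, workers=4):
    chunks = split_range(stop, workers)
    # Each chunk runs in its own OS process, so several CPU cores can work at once.
    with ProcessPoolExecutor(max_workers=workers) as executor:
        return sum(executor.map(count_primes, chunks))


if __name__ == "__main__":
    # The guard matters: worker processes may re-import this module.
    print(count_primes_parallel(200_000))
```

Starting processes and passing data between them has a cost, so this tends to pay off only when each chunk does a meaningful amount of work.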
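Server metrics show that memory is the problem, but not which code is responsible. As a complement (this is not from the original workflow), the standard library's tracemalloc can compare the peak memory of two approaches. The helper and the two sample functions are hypothetical, and tracemalloc only sees allocations made through Python's allocator.

```python
import tracemalloc


def peak_memory(func, *args):
    """Return (result, peak bytes allocated by Python while func ran)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def total_with_list(n):
    values = [i * 2 for i in range(n)]  # materializes every element
    return sum(values)


def total_with_generator(n):
    return sum(i * 2 for i in range(n))  # one element at a time


list_total, list_peak = peak_memory(total_with_list, 100_000)
gen_total, gen_peak = peak_memory(total_with_generator, 100_000)
print(f"list: {list_peak} bytes, generator: {gen_peak} bytes")
```

Both functions return the same total; only the peak differs, which is the kind of evidence needed before rewriting anything.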

IO is slow (IO bound)

If the program makes many requests to a web application, database, or other server, it may be slow because of IO waits. The same is true when crawling a site.

In that case, start by handling it with "concurrent processing". For concrete measures, please read "Parallel / Asynchronous story surrounding Python".

  • Actually executing multiple tasks at the same time -> Parallel processing
  • Switching efficiently between multiple tasks so that they appear to run at the same time -> Concurrent processing

Concurrent processing simply switches to other work while waiting for IO. In other words, the tools for concurrent processing, async / await and ThreadPoolExecutor, do not add OS processes, so they do nothing to speed up calculations that use the CPU. Also, please be aware of the C10K problem.
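The effect of switching during IO waits can be seen in a small asyncio sketch. `asyncio.sleep` stands in for a network or database wait, and `fake_request`, `fetch_all`, and `timed_fetch` are hypothetical names.

```python
import asyncio
import time


async def fake_request(name, delay):
    """Stand-in for an IO-bound call; asyncio.sleep simulates the wait."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def fetch_all(names, delay):
    # All coroutines share one thread; while one waits, the others proceed.
    return await asyncio.gather(*(fake_request(name, delay) for name in names))


def timed_fetch(names, delay=0.2):
    start = time.perf_counter()
    results = asyncio.run(fetch_all(names, delay))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_fetch(["a", "b", "c", "d", "e"])
    print(results, f"{elapsed:.2f}s")
```

Five waits of 0.2 seconds overlap instead of adding up to a full second. If `fake_request` did CPU work instead of waiting, there would be no such gain, because everything still runs in one process.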

What has changed in the Python world since that document was written is, I think, the arrival of trio, a library that is easier to handle than [asyncio](https://docs.python.org/ja/3/library/asyncio.html). I wonder if FastAPI will support it ...

How can I write a good program?

I would like to know that myself, but I will give what advice I can. In Joel Spolsky's "[The Perils of JavaSchools](https://web.archive.org/web/20190514152427/http://local.joelonsoftware.com/mediawiki/index.php/Java%E3%82%B9%E3%82%AF%E3%83%BC%E3%83%AB%E3%81%AE%E5%8D%B1%E9%99%BA)", the "Java" of that essay is today's Python.

"There is a reason résumé-grepping recruiters are fooled. Everyone I know who can handle Scheme, Haskell, and C pointers can pick up Java in two days and write better code than a Java programmer with five years of experience. But that is beyond the understanding of the average dull HR person."

After reading this essay, I tried C and Haskell in addition to Python. In particular, the feel of type-driven design from statically typed functional programming (I can't put it well, but building a program by creating and composing pure functions) is useful in Python implementations too, and having written some C has been useful for speeding up numpy processing. One answer to how to write a good program may be "write the program using a paradigm suited to the problem".
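What "building a program by composing pure functions" can look like in Python is sketched below. This is only one reading of the idea: the text-cleaning steps and the `compose` helper are hypothetical, and the type hints are a loose imitation of what a Haskell signature would enforce.

```python
from functools import reduce
from typing import Callable, TypeVar

T = TypeVar("T")


def compose(*steps: Callable[[T], T]) -> Callable[[T], T]:
    """Return a function that applies `steps` from left to right."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


# Each step is pure: it returns a new list and never mutates its input.
def drop_blank(lines: list[str]) -> list[str]:
    return [line for line in lines if line.strip()]


def normalize(lines: list[str]) -> list[str]:
    return [line.strip().lower() for line in lines]


def deduplicate(lines: list[str]) -> list[str]:
    return list(dict.fromkeys(lines))  # keeps the first occurrence, in order


clean = compose(drop_blank, normalize, deduplicate)

raw = ["  Apple ", "", "banana", "APPLE"]
print(clean(raw))  # ['apple', 'banana']
```

Because no step touches shared state, each one can be tested on its own and the pipeline can be reordered or extended without surprises.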

For design, good information can be found in "Introduction to iOS App Design Patterns". As the article "'Introduction to iOS App Design Patterns' was a good book that helps even 'non-'iOS engineers who struggle with design" says, once you understand the advantages and disadvantages of each pattern, you can discuss which one to adopt.

"Patterns do not come first; they are discovered in the final form that emerges as the code is revised. ... (Omitted) ... You start from a simple design. If a known pattern can be applied there, you can picture what the result will be. Because the advantages and disadvantages of the pattern have already been analyzed, you can jump straight to a mature design. Or you can write tests little by little and proceed by refactoring."

Without this awareness of "choosing an appropriate option with the trade-offs in mind", you become dogmatic and, as [this article](http://rirakkumya.hatenablog.com/entry/2013/04/20/093044) shows, end up having "rewritten simply implemented code into something complicated". I think that article takes a somewhat extreme position, though.

You also need to learn about object orientation. However, it is confusing because discussions often mix several of its roles together. For details, please read "Technology that supports coding". The author's blog post "Three Roles of Classes" also touches on this.

Whatever the method, I think it is a repetition of "increasing the options and considering the trade-offs from them." It may be similar to studying a foreign language and increasing your vocabulary. I think it will be a long-term battle, so please look for a learning method that you can enjoy yourself.

Summary

If I summarize my 5 years of learning in 3 lines, it looks like this. It was really refreshing.

If you say something wrong in the article, please let us know in the comments.
