[PYTHON] Try refactoring step by step

I have wanted to try a write-up like this for a while.

Premise

The starting point is a module I wrote called csv.py that massages a csv-like list of rows into shape.
The module is new, so there is still time and budget to redesign it.
Real file input/output is a hassle, so a stub module called f.py stands in for it.
In practice you would also need a path, but that is glossed over as well.
Any language would have done; I picked Python because I had not written any recently.
The subject and the language are only for illustration. I am not concerned with the subject itself or with whether this is idiomatic Python.

First edition

code

`f.py`


def write(lines):
    print 'write:', lines

`csv.py`


# -*- coding: utf-8 -*-

import f


def as_kvs(rows):
    keys = rows[0]
    value_rows = rows[1:]

    return [dict(zip(keys, value_row)) for value_row in value_rows]


def write_first_one_or_empty(rows, sort_key, filter_key, filter_value, write_key):
    #Make multiple kv
    kvs = as_kvs(rows)

    #Sort by specified key
    _sorted = sorted(kvs, key=lambda row: row[sort_key])

    #Filters only where the specified key has the specified value
    _filtered = filter(lambda row: row[filter_key] == filter_value, _sorted)

    if 0 < len(_filtered):
        #Extract the first line with the specified key
        _mapped_one = map(lambda row: row[write_key], _filtered)[0]
        #Write
        f.write([_mapped_one])
    else:
        #Create empty file
        f.write([])


def write_all_or_empty(rows, filter_key, filter_value, write_key):
    #Make multiple kv
    kvs = as_kvs(rows)

    #Filters only where the specified key has the specified value
    _filtered = filter(lambda row: row[filter_key] == filter_value, kvs)

    if _filtered:
        #Extract all lines with the specified key
        _mapped = map(lambda row: row[write_key], _filtered)
        #Write
        f.write(_mapped)
    else:
        #Create empty file
        f.write([])


def write_all_or_error(rows, filter_key, filter_value, write_key):
    #Make multiple kv
    kvs = as_kvs(rows)

    #Filters only where the specified key has the specified value
    _filtered = filter(lambda row: row[filter_key] == filter_value, kvs)

    if _filtered:
        #Extract all lines with the specified key
        _mapped = map(lambda row: row[write_key], _filtered)
        #Write
        f.write(_mapped)
    else:
        #error
        raise Exception("no result")

Caller

`main.py`


# Given a csv made up of status and code columns:
# sort by code,
# take the first row whose status is active,
# and write out its code.
# If there is no such row, just create an empty file.

csv.write_first_one_or_empty(
    [['status', 'code'], ['dead', '001'], ['active', '003'], ['active', '002']],
    'code', 'status', 'active', 'code'
)

#result
# write: ['002']

`main.py`


# Given a csv made up of name and gender columns:
# take all rows whose gender is male
# and write out their names.
# If there are none, just create an empty file.

csv.write_all_or_empty(
    [['name', 'gender'], ['Eva', 'female'], ['Sunny', 'female']],
    'gender', 'male', 'name'
)

#result
# write: []

`main.py`


# Given a csv made up of status and tel columns:
# take all rows whose status is dead
# and write out their tel values.
# If there are none, raise an error.

csv.write_all_or_error(
    [['status', 'tel'], ['dead', '090-1111-1111'], ['active', '090-9999-9999']],
    'status', 'dead', 'tel'
)

#result
# write: ['090-1111-1111']
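The listings above are written for Python 2: `print` is a statement, and `filter` and `map` return lists. As an aside that is not part of the original code, here is a minimal sketch of how the first method might look on Python 3, where `filter` returns a lazy iterator and `len(_filtered)` would raise a TypeError. The `write` function is a stand-in for `f.write` that also returns its argument (an assumption made only so the result can be inspected).

```python
def write(lines):
    # Stand-in for f.write (assumption: returns the lines so they can be checked).
    print('write:', lines)
    return lines


def as_kvs(rows):
    keys = rows[0]
    value_rows = rows[1:]
    return [dict(zip(keys, value_row)) for value_row in value_rows]


def write_first_one_or_empty(rows, sort_key, filter_key, filter_value, write_key):
    kvs = as_kvs(rows)
    _sorted = sorted(kvs, key=lambda row: row[sort_key])
    # A list comprehension yields a real list on both Python 2 and 3,
    # so len() and indexing keep working.
    _filtered = [row for row in _sorted if row[filter_key] == filter_value]

    if 0 < len(_filtered):
        return write([_filtered[0][write_key]])
    else:
        return write([])
```

The rest of the article keeps the Python 2 style of the original listings.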

problem

Hard to read
Nothing is shared between the methods
Any method added later will be another copy-paste
If the shared logic changes, there is no telling how many places need fixing
The comments carry no information
They say only what you would learn from reading the code anyway

step-1 Rewrite the comments in English for now

Difference

 def write_first_one_or_empty(rows, sort_key, filter_key, filter_value, write_key):
-    #Make multiple kv
+    # as kvs
     kvs = as_kvs(rows)

-    #Sort by specified key
+    # sort by specified key
     _sorted = sorted(kvs, key=lambda row: row[sort_key])

-    #Filters only where the specified key has the specified value
+    # filter by specified key and value
     _filtered = filter(lambda row: row[filter_key] == filter_value, _sorted)

     if 0 < len(_filtered):
-        #Extract the first line with the specified key
+        # extract by specified key and first one
         _mapped_one = map(lambda row: row[write_key], _filtered)[0]
-        #Write
+        # write
         f.write([_mapped_one])
     else:
-        #Create empty file
+        # write empty
         f.write([])

 def write_all_or_empty(rows, filter_key, filter_value, write_key):
-    #Make multiple kv
+    # as kvs
     kvs = as_kvs(rows)

-    #Filters only where the specified key has the specified value
+    # filter by specified key and value
     _filtered = filter(lambda row: row[filter_key] == filter_value, kvs)

     if _filtered:
-        #Extract all lines with the specified key
+        # extract by specified key
         _mapped = map(lambda row: row[write_key], _filtered)
-        #Write
+        # write
         f.write(_mapped)
     else:
-        #Create empty file
+        # write empty
         f.write([])

 def write_all_or_error(rows, filter_key, filter_value, write_key):
-    #Make multiple kv
+    # as kvs
     kvs = as_kvs(rows)

-    #Filters only where the specified key has the specified value
+    # filter by specified key and value
     _filtered = filter(lambda row: row[filter_key] == filter_value, kvs)

     if _filtered:
-        #Extract all lines with the specified key
+        # extract by specified key
         _mapped = map(lambda row: row[write_key], _filtered)
-        #Write
+        # write
         f.write(_mapped)
     else:
-        #error
+        # error
         raise Exception("no result")

Next problem

Some comments, such as # as kvs and # write, merely repeat the code.
They add no information, yet everything is now written twice and has to be maintained twice, which is not pretty.

step-2 Organize comments and cut out methods based on English comments

policy

Once written in English, comments that say almost the same thing as the code are worthless, so delete them.
For the remaining comments, extract the commented code into a method, use the comment as the method name, and then delete the comment.

Difference

The extracted methods look like this

+def sort_by_specified_key(kvs, key):
+    return sorted(kvs, key=lambda row: row[key])

+def filter_by_specified_key_and_value(kvs, key, value):
+    return filter(lambda row: row[key] == value, kvs)

+def extract_by_specified_key_and_first_one(kvs, key):
+    return kvs[0][key]

+def extract_by_specified_key(kvs, key):
+    return map(lambda row: row[key], kvs)

+def error():
+    raise Exception("no result")

The main methods now look like this

-def write_first_one_or_empty(rows, sort_key, filter_key, filter_value, write_key):
+def write_first_one_or_empty(rows, sort_key, filter_key, filter_value, extraction_key):
-    # as kvs
     kvs = as_kvs(rows)

-    # sort by specified key
-    _sorted = sorted(kvs, key=lambda row: row[sort_key])
+    _sorted = sort_by_specified_key(kvs, sort_key)

-    # filter by specified key and value
-    _filtered = filter(lambda row: row[filter_key] == filter_value, _sorted)
+    _filtered = filter_by_specified_key_and_value(_sorted, filter_key, filter_value)

     if 0 < len(_filtered):
-        # extract by specified key and first one
-        _mapped_one = map(lambda row: row[write_key], _filtered)[0]
-        # write
-        f.write([_mapped_one])
+        extracted_one = extract_by_specified_key_and_first_one(_filtered, extraction_key)
+        f.write([extracted_one])
     else:
-        # write empty
         f.write([])

-def write_all_or_empty(rows, filter_key, filter_value, write_key):
+def write_all_or_empty(rows, filter_key, filter_value, extraction_key):
-    # as kvs
     kvs = as_kvs(rows)

-    # filter by specified key and value
-    _filtered = filter(lambda row: row[filter_key] == filter_value, kvs)
+    _filtered = filter_by_specified_key_and_value(kvs, filter_key, filter_value)

     if _filtered:
-        # extract by specified key
-        _mapped = map(lambda row: row[write_key], _filtered)
-        # write
-        f.write(_mapped)
+        extracted = extract_by_specified_key(_filtered, extraction_key)
+        f.write(extracted)
     else:
-        # write empty
         f.write([])

-def write_all_or_error(rows, filter_key, filter_value, write_key):
+def write_all_or_error(rows, filter_key, filter_value, extraction_key):
-    # as kvs
     kvs = as_kvs(rows)

-    # filter by specified key and value
-    _filtered = filter(lambda row: row[filter_key] == filter_value, kvs)
+    _filtered = filter_by_specified_key_and_value(kvs, filter_key, filter_value)

     if _filtered:
-        # extract by specified key
-        _mapped = map(lambda row: row[write_key], _filtered)
-        # write
-        f.write(_mapped)
+        extracted = extract_by_specified_key(_filtered, extraction_key)
+        f.write(extracted)
     else:
-        # error
-        raise Exception("no result")
+        error()

Next problem

The filter can only test filter_key == filter_value, so a condition like `age < 20` is impossible
Lists are accessed by index all over the place

step-3 Do a little local refactoring

Difference

Index access is a little unfriendly, so I added head and tail.

+def head(xs):
+    return xs[0]

+def tail(xs):
+    return xs[1:]

Where to use

 def as_kvs(rows):
-    keys = rows[0]
-    value_rows = rows[1:]
+    keys = head(rows)
+    value_rows = tail(rows)

 def extract_by_specified_key_and_first_one(kvs, key):
-    return kvs[0][key]
+    return head(kvs)[key]

filter_by_specified_key_and_value now takes a predicate instead of a key and a value, which makes it a little more flexible.

-def filter_by_specified_key_and_value(kvs, key, value):
-    return filter(lambda row: row[key] == value, kvs)
+def filter_by_predicate(kvs, predicate):
+    return filter(predicate, kvs)

At that point filter_by_predicate does nothing that the built-in filter does not already do, so it is discarded.

-def filter_by_predicate(kvs, predicate):
-    return filter(predicate, kvs)
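With a predicate, the caller decides the condition. A minimal sketch, assuming hypothetical rows with an `age` column (not part of the original sample data) and a hypothetical helper named `select`. A list comprehension is used so the snippet behaves the same on Python 2 and 3.

```python
def as_kvs(rows):
    keys = rows[0]
    return [dict(zip(keys, value_row)) for value_row in rows[1:]]


def select(kvs, predicate):
    # Hypothetical helper; equivalent to filter(predicate, kvs) on Python 2.
    return [kv for kv in kvs if predicate(kv)]


# Hypothetical sample data for illustration only.
rows = [['name', 'age'], ['Eva', '18'], ['Sunny', '25']]
kvs = as_kvs(rows)

# Equality test, as before.
named_eva = select(kvs, lambda kv: kv['name'] == 'Eva')

# A comparison that the old key/value arguments could not express.
minors = select(kvs, lambda kv: int(kv['age']) < 20)
```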

Now that head exists, `extract_by_specified_key_and_first_one` is probably unnecessary too. As long as the small methods are named well, combining them rarely obscures what the code does.

-def extract_by_specified_key_and_first_one(kvs, key):
-    return head(kvs)[key]

Here is the main body that reflects the above

-def write_first_one_or_empty(rows, sort_key, filter_key, filter_value, extraction_key):
+def write_first_one_or_empty(rows, sort_key, predicate, extraction_key):
     kvs = as_kvs(rows)

     _sorted = sort_by_specified_key(kvs, sort_key)

-    _filtered = filter_by_specified_key_and_value(_sorted, filter_key, filter_value)
+    _filtered = filter(predicate, _sorted)

     if 0 < len(_filtered):
-        extracted_one = extract_by_specified_key_and_first_one(_filtered, extraction_key)
+        extracted_one = head(_filtered)[extraction_key]
         f.write([extracted_one])
     else:
         f.write([])

-def write_all_or_empty(rows, filter_key, filter_value, extraction_key):
+def write_all_or_empty(rows, predicate, extraction_key):
     kvs = as_kvs(rows)

-    _filtered = filter_by_specified_key_and_value(kvs, filter_key, filter_value)
+    _filtered = filter(predicate, kvs)

     if _filtered:
         extracted = extract_by_specified_key(_filtered, extraction_key)
         f.write(extracted)
     else:
         f.write([])

-def write_all_or_error(rows, filter_key, filter_value, extraction_key):
+def write_all_or_error(rows, predicate, extraction_key):
     kvs = as_kvs(rows)

-    _filtered = filter_by_specified_key_and_value(kvs, filter_key, filter_value)
+    _filtered = filter(predicate, kvs)

     if _filtered:
         extracted = extract_by_specified_key(_filtered, extraction_key)
         f.write(extracted)
     else:
         error()

Next problem

The part that writes or handles the abnormal case is slightly duplicated
The processing is similar, so it looks like it could be shared

step-4 Factor out code that looks similar (bad policy)

policy

In the last two of the three main methods, the code from filter through f.write is identical, so extract it.

Difference

+def filter_and_extract_and_write_if_not_empty(kvs, predicate, extraction_key):
+    _filtered = filter(predicate, kvs)
+
+    if _filtered:
+        extracted = extract_by_specified_key(_filtered, extraction_key)
+        f.write(extracted)

Since it was cut out, the number of lines in the main body decreased

 def write_all_or_empty(rows, predicate, extraction_key):
     kvs = as_kvs(rows)

-    _filtered = filter(predicate, kvs)
+    filter_and_extract_and_write_if_not_empty(kvs, predicate, extraction_key)

-    if _filtered:
-        extracted = extract_by_specified_key(_filtered, extraction_key)
-        f.write(extracted)
-    else:
+    if not kvs:
         f.write([])

 def write_all_or_error(rows, predicate, extraction_key):
     kvs = as_kvs(rows)

-    _filtered = filter(predicate, kvs)
+    filter_and_extract_and_write_if_not_empty(kvs, predicate, extraction_key)

-    if _filtered:
-        extracted = extract_by_specified_key(_filtered, extraction_key)
-        f.write(extracted)
-    else:
+    if not kvs:
         error()

Problems with this policy

The abnormal-case handling still needs kvs, so it could not be shared (and checking the unfiltered kvs is not equivalent to checking the filtered result)
The extracted filter_and_extract_and_write_if_not_empty is hard to reuse
If a fourth main method is ever added, will it need exactly this processing?
The method name is long
It has many words and uses `and`; many words plus `and` usually means the method does many things
A method that does many things is hard to understand
A method that does many things rarely fits the purpose when you try to reuse it
Only the `if` branch was extracted, so the `else` branch stayed behind in the main methods
You cannot tell how to handle the abnormal case without reading the whole extracted method

What was wrong

The original processing flow was:
Data structure conversion
Extraction
Transformation
Write, or handle the abnormal case
I regrouped it as:
Data structure conversion
Extraction / transformation / write
or handle the abnormal case
Do not extract code just because it looks similar; extraction only works when it follows the units of the processing flow

step-4 Organize the processing flow (good policy)

Revert the bad policy above and start over

policy

Focus on the flow of writing out a list, not on code that looks similar

Difference

Look at the second main method first.

Here there is no need to treat a non-empty list and an empty list separately. As a rule, when a list is handed to the same processing either way, the code need not (and should not) care about its length.

It is better to pass the list to f.write without checking whether it has contents or is empty.

 def write_all_or_empty(rows, predicate, extraction_key):
     kvs = as_kvs(rows)

     _filtered = filter(predicate, kvs)

-    if _filtered:
-        extracted = extract_by_specified_key(_filtered, extraction_key)
-        f.write(extracted)
-    else:
-        f.write([])
+    extracted = extract_by_specified_key(_filtered, extraction_key)
+    f.write(extracted)
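The `if` can go because mapping over an empty list simply yields an empty list, so the same two lines cover both cases. A small sketch, with a list comprehension standing in for `map` so that it also runs on Python 3:

```python
def extract_by_specified_key(kvs, key):
    # Same result as map(lambda row: row[key], kvs) on Python 2.
    return [row[key] for row in kvs]


with_contents = extract_by_specified_key([{'name': 'Eva'}, {'name': 'Sunny'}], 'name')
empty = extract_by_specified_key([], 'name')
```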

Next, look at the first main method.

When the list is not empty, it takes out the first element, transforms it, and wraps it in a list again.

Change the way of thinking from this flow:

Take out the first element
Transform the single element
Wrap it in a list again

to this flow:

Keep a list with a maximum length of 1
Transform every element (there is only one) with map

First, prepare a method that returns the first matching element as a list with a maximum length of 1.

+def find_first(kvs, predicate):
+    for kv in kvs:
+        if predicate(kv):
+            return [kv]
+    else:
+        return []

Now, as with the second main method, there is no need to care whether the list has contents. And because this is a list-to-list conversion, `extract_by_specified_key` can be used instead of `head` plus key access.

 def write_first_one_or_empty(rows, sort_key, predicate, extraction_key):
     kvs = as_kvs(rows)

     _sorted = sort_by_specified_key(kvs, sort_key)
+    first = find_first(_sorted, predicate)

-    _filtered = filter(predicate, _sorted)
-
-    if 0 < len(_filtered):
-        extracted_one = head(_filtered)[extraction_key]
-        f.write([extracted_one])
-    else:
-        f.write([])
+    extracted = extract_by_specified_key(first, extraction_key)
+    f.write(extracted)
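To see the two pieces working together, here is a sketch that runs find_first and extract_by_specified_key on already sorted data modeled on the first caller. A list comprehension replaces `map` so the snippet also runs on Python 3, and `is_active` is a name introduced only for this illustration.

```python
def find_first(xs, predicate):
    # Return the first matching element wrapped in a list, or [] if none matches.
    for x in xs:
        if predicate(x):
            return [x]
    return []


def extract_by_specified_key(kvs, key):
    return [kv[key] for kv in kvs]


def is_active(kv):
    return kv['status'] == 'active'


kvs = [
    {'status': 'dead', 'code': '001'},
    {'status': 'active', 'code': '002'},
    {'status': 'active', 'code': '003'},
]

# A match gives a one-element list; no match gives an empty list.
found = extract_by_specified_key(find_first(kvs, is_active), 'code')
not_found = extract_by_specified_key(find_first(kvs[:1], is_active), 'code')
```

Neither call site needs an `if`: the list length carries the "found or not" information.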

In main methods 1 and 2, both the `if` branch and the `else` branch now fit into a single unit: "write without caring about the list length". In the bad example, `if` and `else` together formed one write step, and the mistake was to extract only the `if` branch.

Next problem

No thought has been given to access modifiers
The module is called csv, yet it contains plain list operations

step-5 Organize the cut out methods

policy

Decide whether each extracted method is public or private

Difference

head and tail are likely to be used from other modules in the future, so they are public.
On reflection, they are pure list operations, so create l.py and move them out of csv.py.
(list.py would have been fine, but the name clashes with the built-in list(), so I picked something else.)

`l.py`


+# -*- coding: utf-8 -*-
+
+
+def head(xs):
+    return xs[0]
+
+
+def tail(xs):
+    return xs[1:]
+
+
+def find_first(xs, predicate):
+    for x in xs:
+        if predicate(x):
+            return [x]
+    else:
+        return []

Looking closely, sorting kvs and extracting from it are just dictionary operations, so create d.py and move them out of csv.py as well.
While doing so, rename the variable kvs to something suitably abstract.

`d.py`


+# -*- coding: utf-8 -*-
+
+
+def extract_by_specified_key(xs, key):
+    return map(lambda x: x[key], xs)
+
+
+def sort_by_specified_key(xs, key):
+    return sorted(xs, key=lambda x: x[key])

The remaining methods are specific to csv.py, so make them private

-def as_kvs(rows):
+def __as_kvs(rows):

-def error():
+def __error():

Why not extract the seemingly convenient `as_kvs`?

kvs is just a temporary data structure: csv.py converts its expected input into it purely to make internal processing easier.

kvs is in fact a list of dicts, but dict never appears in the public arguments or return values of csv.py. In other words, dict is not part of what the module offers, so it should not be exposed.

In addition, the argument of `as_kvs` is a list, but with restrictions: it must have a header + [body] layout, and every row must have the same number of columns. So I decided it is better kept as private processing in `csv.py` than moved to `l.py` or `d.py`.
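The column restriction is easy to demonstrate. `zip` stops at the shorter input, so a row with a missing column silently yields a dict without that key, and an extra column is silently dropped. A sketch with made-up rows:

```python
def as_kvs(rows):
    keys = rows[0]
    value_rows = rows[1:]
    return [dict(zip(keys, value_row)) for value_row in value_rows]


# Made-up rows that violate the "all rows have the same columns" restriction.
rows = [
    ['status', 'code'],
    ['active'],                # missing the code column
    ['dead', '001', 'extra'],  # one column too many
]
kvs = as_kvs(rows)
```

A function with hidden preconditions like this is a poor candidate for a general-purpose list or dict module.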

Next problem

It is hard to tell how many places use each private method
(That was supposed to be the problem, but more methods moved out to other modules than I expected ...)

step-6 Refactor the `private` method that is used only once (opinions may differ here)

In a module of any size, it can be very hard to know where and how often each private method is used.

That makes it hard to refactor a private method or to judge the impact of a change.

An example with many `private` methods, each used only once:

public 1 -> private 1 -> private 2 -> private 3
public a -> private a -> private b
public x -> private x

The whole picture is much easier to grasp once you see that there are really just three public methods. The private methods can be refactored with peace of mind.

An example mixing `private` methods that are used only once with `private` methods that several parts depend on:

public 1 -> private 1
            |
            V
public a -> private a -> private b
                         ^
                         |
public x -> private x -> private y

private 1 and private x can be changed fairly easily, but a careless change to private b affects every public method.

policy

Define a private method that is used from only one place inside the method that uses it

This is something I came up with on my own. I have no source or discussion to point to, but I think it is a good move and I use it often in throwaway code and solo projects.

Difference

__error is used in only one place, so I defined it with a def inside the def.

While at it, I dropped the annoying __ prefix, since the function is invisible from outside anyway.

 def write_all_or_error(rows, predicate, extraction_key):
+    def write_or_error(kvs):
+        if kvs:
+            f.write(kvs)
+        else:
+            raise Exception("no result")
+
     kvs = __as_kvs(rows)

     _filtered = filter(predicate, kvs)

     extracted = d.extract_by_specified_key(_filtered, extraction_key)
-    if extracted:
-        f.write(extracted)
-    else:
-        __error()
+    write_or_error(extracted)

Of course, __as_kvs is used from several places, so it stays as it is.

The final problem

There are no tests yet
As things stand, file I/O is everywhere, which makes tests hard to write
Testing with file I/O is not impossible, but output such as standard output or sending email may not be testable at all

step-7 Separate conversion from output and write tests

policy

Viewed broadly, the processing flow splits into "conversion" and "output".
Of the two, the part whose behavior really needs checking is the "conversion"
In most cases the "output" is hard to test

So limit the module's responsibility to "conversion" and test that.

Difference

The methods now end with a return and do no "output". Accordingly, the method name prefix changes from `write_` to `extract_`.

-def write_first_one_or_empty(rows, sort_key, predicate, extraction_key):
+def extract_first_one_or_empty(rows, sort_key, predicate, extraction_key):
     kvs = __as_kvs(rows)

     _sorted = d.sort_by_specified_key(kvs, sort_key)
     first = l.find_first(_sorted, predicate)

-    extracted = d.extract_by_specified_key(first, extraction_key)
-    f.write(extracted)
+    return d.extract_by_specified_key(first, extraction_key)

-def write_all_or_empty(rows, predicate, extraction_key):
+def extract_all_or_empty(rows, predicate, extraction_key):
     kvs = __as_kvs(rows)

     _filtered = filter(predicate, kvs)

-    extracted = d.extract_by_specified_key(_filtered, extraction_key)
-    f.write(extracted)
+    return d.extract_by_specified_key(_filtered, extraction_key)

-def write_all_or_error(rows, predicate, extraction_key):
-    def write_or_error(kvs):
-        if kvs:
-            f.write(kvs)
+def extract_all_or_error(rows, predicate, extraction_key):
+    def it_or_error(xs):
+        if xs:
+            return xs
         else:
             raise Exception("no result")

     kvs = __as_kvs(rows)

     _filtered = filter(predicate, kvs)

     extracted = d.extract_by_specified_key(_filtered, extraction_key)
-    write_or_error(extracted)
+    return it_or_error(extracted)

As a result, the `import` of f disappears, which confirms that `csv.py` no longer does any file I/O.

-import f

The test looks like this

`csv_test.py`


# -*- coding: utf-8 -*-

import csv

assert csv.extract_first_one_or_empty(
    [['status', 'code'], ['dead', '001'], ['active', '003'], ['active', '002']],
    'code', lambda kvs: kvs['status'] == 'active', 'code'
) == ['002']

assert csv.extract_first_one_or_empty(
    [['status', 'code'], ['dead', '001']],
    'code', lambda kvs: kvs['status'] == 'active', 'code'
) == []
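The error case can be tested in the same plain-assert style. A sketch, with a simplified copy of extract_all_or_error inlined so that the snippet is self-contained (in the real test it would come from csv.py). List comprehensions replace `filter` and `map` so it also runs on Python 3, and `raises_no_result` is a hypothetical helper introduced for this illustration.

```python
def extract_all_or_error(rows, predicate, extraction_key):
    # Simplified inline copy for illustration.
    keys, value_rows = rows[0], rows[1:]
    kvs = [dict(zip(keys, value_row)) for value_row in value_rows]
    extracted = [kv[extraction_key] for kv in kvs if predicate(kv)]
    if not extracted:
        raise Exception("no result")
    return extracted


def raises_no_result(rows, predicate, extraction_key):
    # Hypothetical test helper: True if the call raises Exception("no result").
    try:
        extract_all_or_error(rows, predicate, extraction_key)
    except Exception as e:
        return str(e) == "no result"
    return False


rows = [['status', 'tel'], ['dead', '090-1111-1111'], ['active', '090-9999-9999']]

assert extract_all_or_error(
    rows, lambda kv: kv['status'] == 'dead', 'tel'
) == ['090-1111-1111']

assert raises_no_result(rows, lambda kv: kv['status'] == 'unknown', 'tel')
```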

The end

Completed form

`csv.py`


# -*- coding: utf-8 -*-

import l
import d


def __as_kvs(rows):
    keys = l.head(rows)
    value_rows = l.tail(rows)

    return [dict(zip(keys, value_row)) for value_row in value_rows]


def extract_first_one_or_empty(rows, sort_key, predicate, extraction_key):
    kvs = __as_kvs(rows)

    _sorted = d.sort_by_specified_key(kvs, sort_key)
    first = l.find_first(_sorted, predicate)

    return d.extract_by_specified_key(first, extraction_key)


def extract_all_or_empty(rows, predicate, extraction_key):
    kvs = __as_kvs(rows)

    _filtered = filter(predicate, kvs)

    return d.extract_by_specified_key(_filtered, extraction_key)


def extract_all_or_error(rows, predicate, extraction_key):
    def it_or_error(xs):
        if xs:
            return xs
        else:
            raise Exception("no result")

    kvs = __as_kvs(rows)

    _filtered = filter(predicate, kvs)

    extracted = d.extract_by_specified_key(_filtered, extraction_key)
    return it_or_error(extracted)

Compared to the first edition

There are no comments at all, yet no information has been lost, because the methods are properly named
The main flow has no control-flow code such as `if` or `for`; it just calls methods
This lets you follow the flow as if reading English
(If there is an inner def, skip it and start reading from the body)
l.py and d.py, which look useful in the future, were born
Because the module only does "conversion" and returns a plain list, the caller can process the result further
The returned list can be fed into another conversion, or freely reshaped and embedded in a report
Tests can be written, and have been
It took many steps, but the line count actually went down
Even after stripping blank lines from both versions and comments from the first version, the finished version, which never had comments, is shorter
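Caller-side composition of conversion and output might look like the following. This is a sketch only: `extract_all_or_empty` is inlined in simplified form, and `write` is a stand-in for `f.write` that records lines in a list instead of touching a file (both are assumptions made to keep the snippet self-contained).

```python
written = []


def write(lines):
    # Stand-in for f.write (assumption): record what would have been written.
    written.append(list(lines))


def extract_all_or_empty(rows, predicate, extraction_key):
    # Simplified inline copy of the final csv.py function.
    keys, value_rows = rows[0], rows[1:]
    kvs = [dict(zip(keys, value_row)) for value_row in value_rows]
    return [kv[extraction_key] for kv in kvs if predicate(kv)]


rows = [['name', 'gender'], ['Eva', 'female'], ['Sunny', 'female']]

# Conversion returns a plain list, so it can be chained with further conversion ...
names = extract_all_or_empty(rows, lambda kv: kv['gender'] == 'female', 'name')
upper_names = [name.upper() for name in names]

# ... and output happens once, at the edge.
write(upper_names)
```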

Summary

Key points

I can't say it in a nutshell ...

Don't write worthless comments

Comments that merely translate the code into Japanese
Things like `return result  # return the result`, which are surprisingly common
You stop needing them once you are used to the code, and they were never needed in the first place
Just read the code
Writing the same thing in both code and comment doubles the work of every fix
And because a comment has no effect on execution however it is written, nothing guarantees that it gets fixed correctly
In most cases, such comments describe the "means"

So what comments should you write instead?

The why
`# This processing exists because of the ~~ specification`, and so on
`# Implemented this way to allow for ~~`, and so on
`# ~~ raises an exception because ~~ makes it impossible`, and so on
Paths to specifications, URLs of bug issues in the language or framework, and so on
Beyond that, basically any extra information that someone later could not get from reading the code alone
In most cases, such comments describe the "purpose"
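To make the contrast concrete, here is a made-up snippet. The first comment restates the code (the "means"); the second records a reason the code cannot express (the "purpose"). The requirement mentioned in that comment is hypothetical and invented purely for illustration, as is the function name.

```python
def first_active_code(kvs):
    # Bad: "sort by code" only repeats what the next line already says.
    # Better (hypothetical requirement): codes are issued in ascending order,
    # so the smallest active code is the oldest record still in use.
    _sorted = sorted(kvs, key=lambda kv: kv['code'])
    active = [kv['code'] for kv in _sorted if kv['status'] == 'active']
    # Slicing keeps the "list of at most one element" shape used in this article.
    return active[:1]
```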

If going without comments still feels painful

If that is because the code is poor, extract a private method instead of writing a comment and name it properly, as in this example
A wrong comment still runs, but wrong code does not get away with it
It can be reused
If you simply cannot read code for lack of skill, or feel uneasy without Japanese, treat going comment-free as training and work on your skills
That may sound harsh, but a team cannot lower its level to match
Leaving aside the extraordinarily difficult or magical parts, the code in most business systems is not that hard

Bad-looking `private` method

A method name that contains `and` is quite suspicious
Methods that do many things are hard to understand
The more a method does, the more specific the situations it fits, so it is hard to reuse
You cannot tell how to use it without reading the implementation
"If I pass this and this, it does that, right? ... Right?" and you end up reading all the way through the body
If you cannot use it without reading it, the line count has not really gone down

Good `private` method

A leaf private method should have a simple name and do one thing
Like len() or head() or .split()
A facade private method that aggregates other private methods is, of course, a different matter
It can be used without reading its body
Nobody reads the body of head() every time they call it

Differences in handling `private` and `public` methods

Think of public as what the module publishes, not just as an access modifier
Do not publish processing that is useless on its own
Having many small, well-prepared public methods makes life easier and easier later on
Conversely, the more poorly organized modules pile up, the more painful things get later

What to test

Target "conversion" rather than "input" or "output", and test it carefully
Test the public methods properly
Then they can be combined with confidence

Do not factor out code just because it looks similar

First sort out the processing flow and state abstractly what you want to do
Factor out the places where the intent is the same
Sometimes part of the processing (the means) happens to match, but do not extract it lightly
The place where filter became find_first versus the places where it stayed filter to the end
The `if` branch of `if write else empty` and of `if write else error`

def in def

A technique for when a private method, created by turning a comment into a method, is called from only one place
The narrower the scope, the easier the code is to understand and the smaller the impact of a change
It is probably possible in most languages

In Scala it can be written straightforwardly

def upperJoin(xs: List[String], sep: String): String = {
    def upper(xs: List[String]): List[String] = xs.map(_.toUpperCase)
    def join(xs: List[String]): String = xs.mkString(sep)

    join(upper(xs))
}

upperJoin(List("hello", "world"), " ") // HELLO WORLD

Even in Java it works, if you use Function<T, R> and the like

public static String upperJoin(List<String> xs, String sep) {
    Function<List<String>, List<String>> upper = _xs -> _xs.stream().map(String::toUpperCase).collect(toList());
    Function<List<String>, String> join = _xs -> _xs.stream().collect(joining(sep));

    return join.apply(upper.apply(xs));
}

upperJoin(asList("hello", "world"), " "); // HELLO WORLD

Haskell can do it too. Or rather, the idea comes from Haskell's way of defining functions as values.

upperJoin xs sep = (join . upper) xs
  where
    upper = map (map toUpper)
    join = intercalate sep

upperJoin ["hello", "world"] " " -- HELLO WORLD

So the Java example is close to the Haskell one, and Python can be written the same way. (PEP 8 advises against binding a lambda to a name, but I dislike nesting, so I tend to write it this way when working alone.)

def upper_join(xs, sep):
    upper = lambda xs: [x.upper() for x in xs]
    join = lambda xs: sep.join(xs)

    return join(upper(xs))

upper_join(['hello', 'world'], ' ') # HELLO WORLD
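For comparison, here is the same thing with nested def statements, which is the form PEP 8 recommends over binding a lambda to a name. A sketch; the inner parameter is renamed to `ys` only to avoid shadowing the outer `xs`.

```python
def upper_join(xs, sep):
    def upper(ys):
        return [y.upper() for y in ys]

    def join(ys):
        # sep comes from the enclosing function's scope.
        return sep.join(ys)

    return join(upper(xs))


result = upper_join(['hello', 'world'], ' ')  # 'HELLO WORLD'
```

The cost is one extra level of indentation per helper; the benefit is that each helper shows its own name in tracebacks instead of `<lambda>`.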

Reflection

By the book, refactoring starts from code that already has tests, but I skipped that
I wanted "output" in the first version, so I started without tests
At this scale I think I would notice a mistake anyway ...
And I prefer writing tests after the code is done in the first place
kvs was a list of dicts, so moving those functions into d.py so casually was a bit off ...
Well, the example is only there for flavor, so never mind

The end

I just wanted to try this, so I am satisfied.

Unrelated, but whenever I see black text on red and green backgrounds, I think it looks like a watermelon.
