[PYTHON] Science Council of Japan Appointment Refusal AI: Toward the Creation of an Elegant Answer

Dear bureaucrats

The refusal to appoint members to the Science Council of Japan has become an issue. How are all of you bureaucrats holding up?

Prime Minister Suga's stated reasons for refusing the appointments have been:

- "A comprehensive, bird's-eye view"
- "Judging with diversity in mind"
- "45% of members belong to the former imperial universities"
- "There are few people from the private sector and few young people, so there is a bias"

The answers have gradually become more concrete. Yet each time, contradictions such as

- "Candidates affiliated with private universities were also refused"
- "Women (who are a minority to begin with) were also refused"

are pointed out within seconds ...

However, the refusal has already been made, and ~~after signing off on the decision~~ you can hardly ask the Prime Minister why. An excellent subordinate is one who grasps a superior's intentions without being told. [^ 1]

Machine learning to unravel the prime minister's wisdom

That said, when you try to work out the intention (the logic) from the list of recommended candidates, it is hard to find any significant pattern even after aggregating the list along various axes in Excel ...

In such a case, let's leave it to "AI". By training a model for "appointment refusal prediction", we will uncover the rules. [^ 2]

(Figure: Picture_kaigi.png)

Data / features / models

- The list data is available on the official website of the Science Council of Japan. "The 25th Science Council of Japan Cooperation Member List (Overall Version), as of October 1, Reiwa 2 (2020)" [^ 3] is used.
- Information on the recommended candidates who were refused appointment (name, affiliation, specialty) is taken from Wikipedia. [^ 4]
- Their specialized fields were assigned by the author, based on the Wikipedia information and cross-checked against the list above.
    - Since "Philosophy" is assigned in the list to members from faculties of theology and graduate schools of religion, "Philosophy" was also assigned to Christian studies.
    - "History of political thought / political philosophy" was assigned both "Political Science" and "Philosophy".

| Name (honorific title omitted) | Specialized field |
|---|---|
| Sadamichi Ashina | Philosophy |
| Shigeki Uno | Political science / Philosophy |
| Masanori Okada | Law |
| Ryuichi Ozawa | Law |
| Yoko Kato | History |
| Takaaki Matsumiya | Law |

- ~~The list was never even looked at~~ The list of recommended candidates was surely read, but on the assumption that not every current member was reviewed, the analysis targets only the 105 people recommended this time.
- Since building a predictive model is not the real goal, all of the data is used as training data.

- As features, in addition to "gender" and "age", keywords contained in the "affiliation / job title" string are converted into features:
    - Whether the name of one of the seven former imperial universities is included
    - Whether "Ritsumeikan University" or "Waseda University" is included
    - Whether "national" or "stock company" is included


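def add_kw_col(df, target, kw):
    """
    Add a 0/1 flag column marking the rows whose `target` column contains `kw`.

    @param {DataFrame} df      Original dataframe
    @param {string}    target  Name of the column to search for the keyword
    @param {string}    kw      Keyword to look for; the flag is 1 if contained, 0 if not
    """

    col = "{}in{}".format(kw, target)
    df[col] = 0
    # regex=False matches kw literally; na=False treats missing values as "not contained"
    df.loc[df[target].str.contains(kw, regex=False, na=False), col] = 1

    return df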

# Example: flag the former imperial universities in the affiliation / job title column.
# Adds one column per university named "<university>in<target column>", holding a 1/0 flag.
teidai = ["University of Tokyo", "Kyoto University", "Osaka University", "Tohoku University", "Nagoya University", "Hokkaido University", "Kyushu University"]
for kw in teidai:
    df_all = add_kw_col(df_all, target="Affiliation / Job title", kw=kw)
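To see what these flag columns look like, here is a minimal, self-contained sketch on a made-up three-row table. The `flag_keywords` helper and the sample affiliations are hypothetical and only illustrate the idea; they are not taken from the actual member list.

```python
import pandas as pd


def flag_keywords(df, target, keywords):
    """Return a copy of df with one 0/1 column per keyword found in df[target]."""
    out = df.copy()
    for kw in keywords:
        # Literal substring match; missing values count as "not contained".
        out["{}in{}".format(kw, target)] = (
            out[target].str.contains(kw, regex=False, na=False).astype(int)
        )
    return out


# Hypothetical sample rows (not real list entries).
toy = pd.DataFrame(
    {
        "Affiliation / Job title": [
            "Professor, Graduate School of Law, Kyoto University",
            "Professor, Ritsumeikan University",
            None,
        ]
    }
)

flagged = flag_keywords(
    toy, "Affiliation / Job title", ["Kyoto University", "Ritsumeikan University"]
)
print(flagged.iloc[:, 1:])
```

Unlike the in-place version, this variant works on a copy, so the input table is left untouched.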

- For the model, a highly interpretable decision tree is used, since the aim is also to produce an elegant answer with a clear reason for refusal.
    - In short, it finds rules of the form "if you split the recommended candidates into two groups at this value of this feature, the appointed and the refused separate relatively cleanly".
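How "cleanly" a split separates the two groups can be measured with Gini impurity, the criterion a decision tree can use to compare candidate splits. The sketch below is only an illustration: the 105 candidates and 6 refusals come from this article, while the split itself (a flag isolating 10 candidates, 4 of them refused) is a hypothetical example.

```python
def gini(counts):
    """Gini impurity of a group given its class counts, e.g. [appointed, refused]."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def split_impurity(left, right):
    """Size-weighted average impurity of the two groups produced by a split."""
    n_left, n_right = sum(left), sum(right)
    n = n_left + n_right
    return (n_left * gini(left) + n_right * gini(right)) / n


# 105 recommended candidates, 6 of them refused (counts from the article).
root = [99, 6]

# Hypothetical split: a flag that isolates 10 candidates, 4 of them refused.
left, right = [6, 4], [93, 2]

print("impurity before split:", round(gini(root), 4))
print("impurity after split: ", round(split_impurity(left, right), 4))
```

A split is worth making when the weighted impurity after the split is lower than before; the tree greedily picks the feature and threshold with the largest reduction.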

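from sklearn import tree

# X: the features described above, y: the appointment-refusal label
# (both prepared beforehand; that code is not shown here)
max_d = 6
clf = tree.DecisionTreeClassifier(criterion='gini',
                                  splitter='best',
                                  max_depth=max_d,
                                  min_samples_split=2,
                                  min_samples_leaf=1,
                                  min_weight_fraction_leaf=0.0,
                                  max_features=None, random_state=None,
                                  max_leaf_nodes=None, class_weight=None)
# The `presort` argument was removed in scikit-learn 0.24, so it is no longer passed
clf = clf.fit(X, y)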

Results / Discussion

The constructed decision tree is shown below.

(Figure: dt_kaigi.png)

Setting aside the fine-grained numbers, written out in words it says ... [^ 6]

  1. If you specialize in law and belong to Ritsumeikan University, **refuse!**
  2. If you specialize in law but not in environmental studies, are 61 years old or older, and belong to Waseda University, **refuse!**
  3. If you specialize in law and are 61 or younger, **refuse!**
  4. If you specialize in philosophy and are under 56 years old, **refuse!**
  5. If you specialize in history and belong to the University of Tokyo, **refuse!**
  6. If you specialize in philosophy and are over 63 years old, refuse, somehow, in some cases
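Rules like the ones above can also be read off a fitted tree as plain text rather than from the figure. The following is a minimal sketch using scikit-learn's `export_text`; the toy features, labels, and variable names (`X_toy`, `y_toy`, `clf_toy`) are hypothetical and are not the article's data.

```python
from sklearn import tree

# Hypothetical toy features: [specializes_in_law, age]; labels: 1 = refused.
feature_names = ["law", "age"]
X_toy = [[1, 55], [1, 58], [1, 66], [0, 52], [0, 60], [0, 70]]
y_toy = [1, 1, 0, 0, 0, 0]

clf_toy = tree.DecisionTreeClassifier(max_depth=2, random_state=0)
clf_toy.fit(X_toy, y_toy)

# export_text renders each split as an indented "feature <= threshold" line,
# with the predicted class at every leaf.
rules = tree.export_text(clf_toy, feature_names=feature_names)
print(rules)
```

Each root-to-leaf path in the printed output corresponds to one "if ... and ..., refuse" sentence.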

It seems that "former imperial university", "young", and "private sector" have nothing to do with it.

As for rule 6, **even "AI" could not work out the reason for refusal**. Professor Ashina, who is reported to have been refused appointment, and Professor Yoshioka, who is a current member, both belong to Kyoto University, are 64 years old, and specialize in philosophy, so this analysis could not reveal what separates them.
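This kind of dead end can be detected before training: if two rows have identical features but different labels, no model that sees only those features can separate them. A minimal sketch, assuming illustrative column names and one extra hypothetical row for contrast:

```python
import pandas as pd

# Two candidates with the attributes described above, plus one unrelated
# hypothetical row for contrast. Column names are illustrative assumptions.
df_demo = pd.DataFrame(
    {
        "kyoto_univ": [1, 1, 0],
        "philosophy": [1, 1, 0],
        "age": [64, 64, 50],
        "refused": [1, 0, 0],
    }
)

feature_cols = ["kyoto_univ", "philosophy", "age"]

# Count the distinct labels within each group of identical feature rows.
labels_per_group = df_demo.groupby(feature_cols)["refused"].nunique()

# Groups with more than one label cannot be separated by any model
# that only sees these features.
conflicts = labels_per_group[labels_per_group > 1]
print(conflicts)
```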

This means the features are insufficient ... Apparently, when making the decision, the Prime Minister obtained information beyond the list from somewhere.

In conclusion

[^ 1]: I don't know whether bureaucrats actually wrote the answers.

[^ 2]: You really shouldn't do this on an actual project. Treat prediction and causal inference as separate problems. Clients tend to start wanting causal conclusions partway through, but let's refuse that elegantly.
