A word diary: These days, the days of fighting God Excel continue.
There is nothing wrong with handling the above table as data, When outputting, there is a request that "please do as shown in the table below".
Surprisingly, I couldn't find a page that does this processing itself, so I will post the implementation result.
dataframe_raw = pd.read_excel("Roster.xlsx") df=dataframe_raw["name"].str.split(' ',expand=True) #In this case, the name was divided by double-byte space, so split it into two columns.
If you divide by split, dataframe_process will be as shown in the table below, so check for duplicate surnames from here. If there are duplicates, special processing is used. If not, the surname is used as it is.
df["name"]="" #Add the storage column to the third column for i in range(0,len(df)): #i to 0~Loop to the length of the roster NAME = df.at[i,0] check_count=(df==NAME).sum() #Count how many character strings are stored in the 0th column of the i-th row (if it is 2 or more, there is a person with the same surname). if check_count>=2: df.at[i,"name"]=df.at[i,0]+"("+df.at[i,1]+")" #If there are duplicates, the character string in the 0th column +("0th character of the first character string")Substitute else : df.at[i,"name"]=df.at[i,0] #If there is no duplication, the character string in the 0th column of the i-th row +(0th character of the character string in the 1st column of the i-line)Substitute
With this process, the data frame will be as shown in the table below, so you can use it by substituting the name column or combining it.
I feel that it is relatively clean, but I am afraid of what to do if a list with the same surname and the same name appears.