I wrote a function that takes a katakana string and returns its vowels, for example コンニチハ → オンイイア.
The conversion follows these rules:
- ン and ッ are left as they are (example: ルンパッパ → ウンアッア).
- ー (the long-vowel mark) becomes the same vowel as the one before it (example: コーラ → オオア).
- ゥ is merged with the preceding kana into one unit when that kana is ト or ド; otherwise it is treated like a full-size kana (examples: ドゥッチ → ウッイ, ウゥア → ウウア).
- ャ, ェ, and ョ are merged with the preceding kana into one unit when that kana is in the i-dan (イ row); otherwise they are treated like full-size kana (examples: キャタツ → アアウ, ケェ → エエ).
- ュ is merged with the preceding kana into one unit when that kana is in the i-dan or is テ or デ; otherwise it is treated like a full-size kana (examples: テュール → ウウウ, デュワー → ウアア).
- ヮ, ァ, ェ, and ォ are merged with the preceding kana into one unit when that kana is in the u-dan (ウ row); otherwise they are treated like full-size kana (example: ウェイ → エイ).
- ィ is merged with the preceding kana into one unit when that kana is in the u-dan or is テ or デ; otherwise it is treated like a full-size kana (example: レモンティー → エオンイイ).
- Characters other than katakana are left as they are.
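The long-vowel rule can be looked at in isolation. The following is a minimal sketch, assuming the input has already been reduced to the vowels ア, イ, ウ, エ, オ plus any remaining ー marks; the helper name `expand_long_marks` is hypothetical, and it uses a regular expression rather than the repeated `str.replace` calls of the main function.

```python
import re

def expand_long_marks(vowels: str) -> str:
    """Replace each run of ー that follows a vowel with copies of that vowel.

    Hypothetical helper: assumes `vowels` already contains only vowel kana,
    ー marks, and characters that should pass through unchanged.
    """
    return re.sub(
        r'([アイウエオ])(ー+)',
        # One copy for the vowel itself plus one per long-vowel mark.
        lambda m: m.group(1) * (1 + len(m.group(2))),
        vowels,
    )

print(expand_long_marks('オーア'))  # prints オオア
```

A ー that does not follow a vowel (for example at the start of the string or after ン) is left untouched by this sketch.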
I have confirmed that it runs on Google Colaboratory (as of March 27, 2020) and on macOS Catalina with Python 3.8.0.
def kana2vowel(text):
    # Conversion table for full-size kana, ゥ, and ヴ
    large_tone = {
        'ア': 'ア', 'イ': 'イ', 'ウ': 'ウ', 'エ': 'エ', 'オ': 'オ',
        'ゥ': 'ウ', 'ヴ': 'ウ',
        'カ': 'ア', 'キ': 'イ', 'ク': 'ウ', 'ケ': 'エ', 'コ': 'オ',
        'サ': 'ア', 'シ': 'イ', 'ス': 'ウ', 'セ': 'エ', 'ソ': 'オ',
        'タ': 'ア', 'チ': 'イ', 'ツ': 'ウ', 'テ': 'エ', 'ト': 'オ',
        'ナ': 'ア', 'ニ': 'イ', 'ヌ': 'ウ', 'ネ': 'エ', 'ノ': 'オ',
        'ハ': 'ア', 'ヒ': 'イ', 'フ': 'ウ', 'ヘ': 'エ', 'ホ': 'オ',
        'マ': 'ア', 'ミ': 'イ', 'ム': 'ウ', 'メ': 'エ', 'モ': 'オ',
        'ヤ': 'ア', 'ユ': 'ウ', 'ヨ': 'オ',
        'ラ': 'ア', 'リ': 'イ', 'ル': 'ウ', 'レ': 'エ', 'ロ': 'オ',
        'ワ': 'ア', 'ヲ': 'オ', 'ン': 'ン',
        'ガ': 'ア', 'ギ': 'イ', 'グ': 'ウ', 'ゲ': 'エ', 'ゴ': 'オ',
        'ザ': 'ア', 'ジ': 'イ', 'ズ': 'ウ', 'ゼ': 'エ', 'ゾ': 'オ',
        'ダ': 'ア', 'ヂ': 'イ', 'ヅ': 'ウ', 'デ': 'エ', 'ド': 'オ',
        'バ': 'ア', 'ビ': 'イ', 'ブ': 'ウ', 'ベ': 'エ', 'ボ': 'オ',
        'パ': 'ア', 'ピ': 'イ', 'プ': 'ウ', 'ペ': 'エ', 'ポ': 'オ'
    }

    # Convert ト/ド + ゥ to ウ
    for k in 'トド':
        while k+'ゥ' in text:
            text = text.replace(k+'ゥ', 'ウ')

    # Convert テ/デ + ィ/ュ to イ/ウ
    for k in 'テデ':
        for k2, v in zip('ィュ', 'イウ'):
            while k+k2 in text:
                text = text.replace(k+k2, v)

    # Convert full-size kana and ヴ to vowels
    text = list(text)
    for i, v in enumerate(text):
        if v in large_tone:
            text[i] = large_tone[v]
    text = ''.join(text)

    # Convert ウー to ウウ
    while 'ウー' in text:
        text = text.replace('ウー', 'ウウ')

    # Convert ウ + ヮ/ァ/ィ/ェ/ォ to a vowel
    for k, v in zip('ヮァィェォ', 'アアイエオ'):
        text = text.replace('ウ'+k, v)

    # Convert イー/ィー to イイ/ィイ
    for k in 'イィ':
        while k+'ー' in text:
            text = text.replace(k+'ー', k+'イ')

    # Convert イ/ィ + ャ/ュ/ェ/ョ to a vowel
    for k, v in zip('ャュェョ', 'アウエオ'):
        text = text.replace('イ'+k, v).replace('ィ'+k, v)

    # Convert the remaining small kana to vowels
    for k, v in zip('ヮァィェォャュョ', 'アアイエオアウオ'):
        text = text.replace(k, v)

    # Convert ー (long-vowel mark) to the preceding vowel
    for k in 'アイウエオ':
        while k+'ー' in text:
            text = text.replace(k+'ー', k+k)

    return text
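As an aside, the full-size part of the conversion table does not have to be typed by hand. The sketch below is one possible alternative, not the method used in this post: it relies on the fact that the Unicode name of a full-size katakana ends in its romanized vowel (for example `KATAKANA LETTER KA`). The function name `build_vowel_table` is hypothetical. Note that, unlike the hand-written table, this version also covers ヰ, ヱ, ヷ, ヸ, ヹ, and ヺ, and it skips every small kana including ゥ.

```python
import unicodedata

def build_vowel_table():
    """Map each full-size katakana to the vowel kana of its row.

    Hypothetical helper; an assumption-based alternative to a literal dict.
    """
    vowels = {'A': 'ア', 'I': 'イ', 'U': 'ウ', 'E': 'エ', 'O': 'オ'}
    table = {}
    for code in range(0x30A1, 0x30FB):  # ァ (U+30A1) through ヺ (U+30FA)
        ch = chr(code)
        name = unicodedata.name(ch, '')
        # Keep only full-size letters; small kana need context-dependent rules.
        if not name.startswith('KATAKANA LETTER ') or 'SMALL' in name:
            continue
        last = name[-1]
        if last in vowels:  # ン ("... LETTER N") has no vowel and is skipped
            table[ch] = vowels[last]
    return table

table = build_vowel_table()
print(''.join(table.get(c, c) for c in 'コンニチハ'))  # prints オンイイア
```

Characters missing from the table, such as ン, ッ, ー, and anything that is not katakana, pass through unchanged in the `table.get(c, c)` lookup.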