[python] Create a list of various character types

Inspired by "Make a simple list of alphabets with Python" (broken link), I tried making lists of various kinds of characters.

I think it can be used in password dictionaries and wordplay games.

Postscript (2018/03/30)

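Since people still seem to read this once in a while, I will also explain the module mentioned in the comment section. For half-width character strings, I think using the string module is the best practice. Note that the output below comes from Python 2; in Python 3, letters, lowercase, and uppercase no longer exist and only the ascii_ variants remain.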

>>> import string
>>> help(string)
(Omission)
DATA
    ascii_letters = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
    ascii_lowercase = 'abcdefghijklmnopqrstuvwxyz'
    ascii_uppercase = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    digits = '0123456789'
    hexdigits = '0123456789abcdefABCDEF'
    letters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuv...\xaf\xb0...
    lowercase = 'abcdefghijklmnopqrstuvwxyz'
    octdigits = '01234567'
    printable = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTU...
    punctuation = '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
    uppercase = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    whitespace = '\t\n\x0b\x0c\r '
>>> string.digits
'0123456789'

In Python, the handling of lists and strings is not so different, so I will omit the explanation of that part.
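
As a small illustration of that point, a string constant can be turned into a list with list(), which gives the same result as the chr()-based comprehensions used in the rest of this article. This is only a sketch, using the Python 3 names from the string module.

```python
import string

# A string is already a sequence of characters, so list() splits it up
lowercase = list(string.ascii_lowercase)
uppercase = list(string.ascii_uppercase)
digits = list(string.digits)

print(lowercase[:3], uppercase[:3], digits[:3])
# ['a', 'b', 'c'] ['A', 'B', 'C'] ['0', '1', '2']
```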

Lowercase alphabet

[chr(i) for i in range(97, 97+26)]
# [chr(i) for i in range(ord('a'), ord('z')+1)]

Uppercase alphabet

[chr(i) for i in range(65, 65+26)]
# [chr(i) for i in range(ord('A'), ord('Z')+1)]

Half-width numbers

[chr(i) for i in range(48, 48+10)]
# [chr(i) for i in range(ord('0'), ord('9')+1)]

Hiragana

[chr(i) for i in range(12353, 12436)]
# [chr(i) for i in range(ord('ぁ'), ord('ん')+1)]

Katakana

[chr(i) for i in range(12449, 12532+1)]
# [chr(i) for i in range(ord('ァ'), ord('ン')+2)]
# If you don't need 'ヴ', subtract 1 from the upper bound
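
The numbers in the hiragana and katakana ranges are simply Unicode code points. As a quick check (a sketch, not part of the original code), ord() shows which characters the bounds correspond to:

```python
# Bounds used for hiragana: 'ぁ' (small a) up to 'ん'
hiragana = [chr(i) for i in range(ord('ぁ'), ord('ん') + 1)]
# Bounds used for katakana: 'ァ' (small a) up to 'ヴ'
katakana = [chr(i) for i in range(ord('ァ'), ord('ヴ') + 1)]

print(ord('ぁ'), ord('ん'))  # 12353 12435
print(ord('ァ'), ord('ヴ'))  # 12449 12532
print(len(hiragana), len(katakana))  # 83 84
```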

Full-width numbers

[chr(i) for i in range(65296, 65296+10)]
# [chr(i) for i in range(ord('０'), ord('９')+1)]

Common kanji

These cannot be generated from a contiguous range. That said, it is not impossible.

[UTF-8 version of common kanji code table] I think there is no choice but to extract the kanji from this CSV and put them in a list ...

After deleting the comment lines from the CSV file, run the following.

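import csv
kanji = []
# Assumes the file is UTF-8 encoded, as the name of the table suggests
with open('/path/to/joyo-kanji-code-u.csv', 'r', encoding='utf-8', newline='') as f:
    data = csv.reader(f)

    for row in data:
        kanji.append(row[0])  # the first column is taken as the kanji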

Yes, this gives a list of 2136 characters ... If you need a list that also includes kanji outside the common-use set, please do your best to build it yourself.
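
If deleting the comment lines by hand is a bother, they could also be skipped while reading. The following is only a sketch: I am assuming that comment lines start with '#' and that the kanji is in the first column, and the in-memory sample data is made up for illustration, so adjust it to the actual file.

```python
import csv
import io

# Hypothetical sample standing in for the real CSV file
sample = io.StringIO("# comment line\n亜,sample\n哀,sample\n\n挨,sample\n")

kanji = []
for row in csv.reader(sample):
    # Skip empty rows and rows starting with '#' (assumed comment marker)
    if not row or row[0].startswith('#'):
        continue
    kanji.append(row[0])

print(kanji)  # ['亜', '哀', '挨']
```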

Half-width symbols

These cannot be generated from a single contiguous range either. I can't help feeling that whoever laid out ASCII is to blame ... I couldn't come up with a good method, so here is one example.

eisu = [chr(i) for i in range(97, 97+26)]
eisu.extend([chr(i) for i in range(65, 65+26)])
eisu.extend([chr(i) for i in range(48, 48+10)])

[chr(i) for i in range(33, 127) if chr(i) not in eisu]
# By the way, if you change 33 to 32, the half-width space is included as well.

As you may have noticed, all half-width characters can be obtained with [chr(i) for i in range(32, 127)].
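
For reference, the string module mentioned in the postscript already covers this case. As far as I can tell, string.punctuation contains the same 32 symbols in the same order, so the following sketch should give the same list as the eisu-based example above:

```python
import string

# Same idea as above, using str.isalnum() instead of the eisu list
symbols = [chr(i) for i in range(33, 127) if not chr(i).isalnum()]

print(symbols == list(string.punctuation))  # True
print(len(symbols))  # 32
```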

Bonus

Copying and pasting these one by one is a hassle, so here is a function that returns whichever list you want. Of course, common kanji and half-width symbols have to be added separately afterwards.

# Lowercase alphabet → (97, 123)
# Uppercase alphabet → (65, 91)
# Half-width numbers → (48, 58)
# Hiragana → (12353, 12436)
# Katakana → (12449, 12532+1)
# Full-width numbers → (65296, 65306)

def moji_list(*args):
    # Each argument is a (start, stop) pair of code points; stop is exclusive
    moji = []
    for start, stop in args:
        moji.extend(chr(j) for j in range(start, stop))
    return moji

moji_list((97, 123), (65, 91), (48, 58))
# ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

I think the ranges can also be specified with ord().
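
For example, a call with ord() might look like the following. This is a minimal sketch that repeats moji_list (same behavior as above) so that it runs on its own; the choice of hiragana plus full-width numbers is just for illustration.

```python
def moji_list(*args):
    # Each argument is a (start, stop) pair of code points; stop is exclusive
    moji = []
    for start, stop in args:
        moji.extend(chr(j) for j in range(start, stop))
    return moji

# Hiragana and full-width numbers, with the bounds written as characters
result = moji_list((ord('ぁ'), ord('ん') + 1), (ord('０'), ord('９') + 1))
print(len(result))            # 93
print(result[0], result[-1])  # ぁ ９
```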