Während ich die Verarbeitung natürlicher Sprache mit Python schrieb, gab es eine Szene, in der ich einen Zähler mit nur Großbuchstaben schrieb. Vergessen Sie es also nicht ・ Wörter zählen, die mit Großbuchstaben beginnen ・ Nur wenn nur der Wortanfang groß geschrieben wird ・ Anwendbar auf das erste Kapital + Zahlen usw.
import re
from collections import Counter
org_text = """
Of course a writer can write from the viewpoint of Southern slave owners who of course would be racist and
view black people with, at best, benevolent paternalism, without her characters'
attitudes necessarily being her own. But it's very apparent that Mitchell was wholly and uncritically sympathetic to her antebellum ancestors.
The Old South was a graceful, chivalrous land where slavery was not a horrible and oppressive institution creating generations of misery and oppression,
but a divinely-ordained means of preserving racial harmony. And for all that Mitchell,
like her characters, probably considered herself to be kind and affectionate to all the African-Americans she knew personally,
there isn't a single black character in the book who isn't an ignorant, semi-human ape -- which is literally how they are described.
Even beloved Mammy is repeatedly compared to a monkey.
"""
#Trennen Sie Zeichenfolgen in ein Array
words= re.split(r'\s|\,|\.|\(|\)',org_text)
#Extrahieren Sie nur Wörter, die mit Großbuchstaben beginnen
r = re.compile("^[A-Z]$|^[A-Z][a-z0-9]+$")
dict_word=[x for x in words if r.match(x)]
counter=Counter(dict_word)
for word,count in counter.most_common():
print("%s,%d" % (word,count))