English word spell-check tool (written in Python)

English typos in the comments of the patches (C language) I write have always been a problem, so I made a tool that takes a patch / diff as input and checks the spelling of the English words in its comments. This is a note about it. It supports C and C++ comments in the /* ... */ format.

The source is published on GitHub, in the MasahikoSawada/Patch-Spell-Checker repository.

How to use

$ git clone git@github.com:MasahikoSawada/Patch-Spell-Checker.git
$ export PATH=$PATH:/path/to/Patch-Spell-Checker
$ export WLIST_DIR=/path/to/Patch-Spell-Checker/wlist.d/

If you don't want to set the WLIST_DIR environment variable, specify the dictionary directory with -d every time you run the tool.
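
The precedence between -d and WLIST_DIR could be implemented along the following lines. This is only a sketch, not the tool's actual code: the function name resolve_wlist_dir and the long option name --dict-dir are assumptions made for illustration.

```python
import argparse


def resolve_wlist_dir(argv, environ):
    """Return the dictionary directory: -d wins, then WLIST_DIR, else None."""
    parser = argparse.ArgumentParser(add_help=False)
    # "-d" is the option named in the article; the long name "--dict-dir"
    # is hypothetical and exists only in this sketch.
    parser.add_argument("-d", "--dict-dir", default=None)
    # parse_known_args ignores the other options (-f, -s, -w, ...).
    args, _unknown = parser.parse_known_args(argv)
    if args.dict_dir is not None:
        return args.dict_dir
    return environ.get("WLIST_DIR")
```

Calling resolve_wlist_dir(sys.argv[1:], os.environ) would then return the directory to scan for dictionary files, or None when neither source is set.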

$ git diff | PatchSpellChecker.py
"xl_heap_lock" might be wrong at line 13.
        "+               * needed before releasing buffer. we can reuse xl_heap_lock "
"pupose" might be wrong at line 14.
        "+               * for this pupose. it should be fine even if we crash midway "
"combocids" might be wrong at line 45.
        "+                       * for logical decoding we need combocids to properly decode the "

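As a rough illustration of what checking a patch involves, the following sketch reads a unified diff, follows /* ... */ comments through context and added lines, and reports unknown words found in added lines only. It is not the tool's actual implementation: check_patch, comment_text, and the word pattern are assumptions for illustration, and known_words stands in for the loaded dictionary.

```python
import re

# Hypothetical word pattern: letters/digits joined by "_" or "-",
# so that xl_heap_lock and lock-manager each count as one word.
WORD_RE = re.compile(r"[A-Za-z][A-Za-z0-9]*(?:[_-][A-Za-z0-9]+)*")


def comment_text(text, in_comment):
    """Return (text inside /* */ comments, updated in_comment state)."""
    parts = []
    pos = 0
    while pos < len(text):
        if in_comment:
            end = text.find("*/", pos)
            if end == -1:
                parts.append(text[pos:])  # comment continues on next line
                break
            parts.append(text[pos:end])
            pos = end + 2
            in_comment = False
        else:
            start = text.find("/*", pos)
            if start == -1:
                break
            pos = start + 2
            in_comment = True
    return " ".join(parts), in_comment


def check_patch(diff_lines, known_words):
    """Return [(word, line_no, line)] for unknown words in added comments."""
    findings = []
    in_comment = False
    for line_no, raw in enumerate(diff_lines, start=1):
        line = raw.rstrip("\n")
        if line.startswith("@@"):
            in_comment = False  # new hunk: forget the previous state
            continue
        if line.startswith("+++") or not line.startswith(("+", " ")):
            continue  # file headers, removed lines, "diff --git", ...
        text, in_comment = comment_text(line[1:], in_comment)
        if not line.startswith("+"):
            continue  # context lines only update the comment state
        for word in WORD_RE.findall(text):
            if word.lower() not in known_words:
                findings.append((word.lower(), line_no, line))
    return findings
```

A hunk that begins in the middle of a comment is not recognized by this sketch, because the opening /* lies outside the hunk; a real tool would need a heuristic for that case.
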
Adding the -s (--source-file) option checks a source file instead of a patch:


$ PatchSpellChecker.py -f src/backend/postmaster/postmaster.c -s
"subprocess" might be wrong at line 11.
        " *       operations, mind you --- it just forks off a subprocess to do them "
"lock-manager" might be wrong at line 18.
        " *       and so it cannot participate in lock-manager operations.  keeping "

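For a whole source file, the comment text and its line numbers can be collected in one pass. The sketch below shows one possible way to do this with a regular expression; comment_lines is a hypothetical helper, not a function of the tool, and it makes no attempt to skip a /* that appears inside a string literal.

```python
import re

# Non-greedy match of everything between "/*" and the next "*/".
COMMENT_RE = re.compile(r"/\*(.*?)\*/", re.DOTALL)


def comment_lines(source):
    """Yield (line_no, text) for every line inside a /* ... */ comment."""
    for match in COMMENT_RE.finditer(source):
        # Line number (1-based) of the first line of the comment body.
        first_line = source.count("\n", 0, match.start(1)) + 1
        for offset, text in enumerate(match.group(1).split("\n")):
            yield first_line + offset, text
```

Each yielded line can then be split into words and compared against the dictionary, with line_no used in the report.
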
Dictionary File

You can register new words by writing them in a *.dict file in the directory that the WLIST_DIR environment variable points to. Registering technical terms improves the accuracy of the spell check. (Free word lists are available on the net; check their licenses and register them yourself.) The dictionary files accept the following formats.

- Document format (free English documents can be pasted in as they are)

Document format


$ cat sentence.txt
PostgreSQL is a powerful open source object-relational database system.
It is fully ACID compliant, has full support for foreign keys, joins, views, triggers, and stored procedures.

- Word format (useful when defining your own jargon)

Word format


$ cat words.txt
PostgreSQL
is
a
ACID
database
system
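
Both formats can be read by the same loader, since a word-per-line file is just a document whose lines each contain one word. The following is a minimal sketch under that assumption; load_wordlist and the word pattern are hypothetical, and lowercasing the words is a choice made for this example rather than documented behavior of the tool.

```python
import glob
import os
import re

# Hypothetical word pattern: letters/digits joined by "_" or "-".
WORD_RE = re.compile(r"[A-Za-z][A-Za-z0-9]*(?:[_-][A-Za-z0-9]+)*")


def load_wordlist(wlist_dir):
    """Collect lowercased words from every *.dict file in wlist_dir."""
    words = set()
    for path in sorted(glob.glob(os.path.join(wlist_dir, "*.dict"))):
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                # Works for full sentences and for one word per line.
                words.update(w.lower() for w in WORD_RE.findall(line))
    return words
```

Using a set keeps lookups fast and makes duplicate entries across dictionary files harmless.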

If you want to extract words from an existing source file, use the -s and -w options together. Piping the output through sort and uniq gives you word data that can be used as a dictionary as is.

$ PatchSpellChecker.py -f src/backend/postmaster/postmaster.c -s -w | sort | uniq
activity_buffer
addr
am_syslogger
antivirus
archiver
archive_recovery
