[Python] [Word] [python-docx] Creating a Word document template in Python using python-docx

I was wondering how to visualize diff information. Ultimately, I want to produce color-coded diff output in PDF format.

After looking into various options, I wondered whether python-docx could be used, so I evaluated it, and I summarize the results here. I also considered Markdown, but if I generate a docx, the detailed layout and appearance can be touched up afterwards in Word's GUI; that lazy-engineer temperament is what made me choose this method (wry smile). And once you have a Word file, you can produce a PDF from it.

I hope this is helpful for those who want to build Word documents from the results of various analyses in Python.

Postscript

What kind of document to make

This time, I checked the following features.

Operating environment, etc.

Ideally this would all be done with Python on Windows, which would go smoothly, but for various reasons I verified operation in the following environment. In short, I wanted to process data such as diff output that I had been working on under Cygwin, so I ended up with this environment.

When using Cygwin, make sure every Python file is encoded in UTF-8 and uses \n only for line breaks.

For the MS-DOS prompt, on the other hand, line breaks are \r\n and the character encoding is SJIS. In the code in this article, wherever you see

# -*- coding: utf-8 -*-

replace it with

# -*- coding: shift-jis -*-

and it should work.
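The encoding and line-ending requirements above can be checked with a small script. The following is only a sketch using the standard library; the function name check_source_file and its parameters are my own and not part of python-docx or Cygwin.

```python
import io


def check_source_file(path, encoding="utf-8", newline="\n"):
    """Return a list of problems found in the file (empty list if none)."""
    problems = []
    with io.open(path, "rb") as f:
        data = f.read()

    # 1. Does the whole file decode with the expected encoding?
    try:
        data.decode(encoding)
    except UnicodeDecodeError as e:
        problems.append("not valid %s: %s" % (encoding, e))

    # 2. Count line endings on the raw bytes.
    crlf = data.count(b"\r\n")
    lf_only = data.count(b"\n") - crlf
    if newline == "\n" and crlf:
        problems.append("%d CRLF line ending(s) found" % crlf)
    if newline == "\r\n" and lf_only:
        problems.append("%d bare LF line ending(s) found" % lf_only)
    return problems
```

For the Cygwin setup, check_source_file("my_script.py") should return an empty list; for the MS-DOS prompt setup, the equivalent call would be check_source_file("my_script.py", encoding="shift_jis", newline="\r\n"). The file name here is just a placeholder.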

Install python-docx

This assumes Python is already installed.

Installation is described at https://python-docx.readthedocs.io/en/latest/user/install.html#install. The requirements are as follows.

Normally, pip install python-docx should be enough. In my environment, installing from the MS-DOS prompt also went smoothly.

Precautions on Cygwin

First, regarding the requirements above: the following libraries must be installed on Cygwin. Without them, the installation fails with errors about missing header files, so install them. Note that they may not be included by default.

Also, if you have Python installed on both Windows and Cygwin as I do, be careful: depending on your PATH settings, pip install may install the package into the Windows Python instead. So I installed it with the following method.

The Cygwin setup notes here (http://qiita.com/GDaigo/items/a80003684fc6ab7505fd#%E3%82%BB%E3%83%83%E3%83%88%E3%82%A2%E3%83%83%E3%83%97) may also be helpful.

easy_install-2.7 python-docx
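When two Pythons coexist, it helps to confirm which interpreter is actually running and whether it can see the package. The sketch below uses only the standard library; describe_install is a name I made up for illustration.

```python
import importlib
import sys


def describe_install(module_name="docx"):
    """Report which interpreter is running and where a module loads from."""
    info = {
        "executable": sys.executable,  # Path of the running interpreter
        "platform": sys.platform,      # "cygwin" under Cygwin, "win32" on Windows
    }
    try:
        module = importlib.import_module(module_name)
        info["module_file"] = getattr(module, "__file__", None)
    except ImportError:
        # The package is not installed for THIS interpreter
        info["module_file"] = None
    return info


if __name__ == "__main__":
    for key, value in sorted(describe_install().items()):
        print("%s: %s" % (key, value))
```

If module_file comes out as None under Cygwin but not under Windows (or the other way around), the package went to the other Python. Running the installer through the interpreter itself, as in python -m pip install python-docx, is one way to tie the installation to a specific Python, provided pip is available for that interpreter.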

How to use python-docx

The official documentation is below.

https://python-docx.readthedocs.io/en/latest/index.html

The tutorial is easy to follow. However, it can be quite hard to find the specific thing you want to do. In my case, I struggled a lot with text formatting, which turned out to be summarized on the following page.

http://python-docx.readthedocs.io/en/latest/user/text.html

Explaining the library's specifications in detail would be tedious, so I have expressed them in the code below instead.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

#
# SimpleDocxService
#   Provides simple services built on python-docx.
#   In practice, this is code for understanding the docx library.
#

from docx import Document
from docx.shared import RGBColor
from docx.shared import Inches
from docx.shared import Pt

class SimpleDocxService:

    def __init__(self):
        self.document = Document()
        self.paragraph = None   # Current paragraph; set by open_text()
        self.latest_run = None

    def set_normal_font(self, name, size):
        # Set the font of the 'Normal' style (size in points)
        font = self.document.styles['Normal'].font
        font.name = name
        font.size = Pt(size)

    def add_head(self, text, lv):
        # Add a heading (lv 0 = Title, 1 = Heading 1, ...)
        self.document.add_heading(text, level=lv)

    def open_text(self):
        # Start a text area by adding a new, empty paragraph
        self.paragraph = self.document.add_paragraph()

    def close_text(self):
        # End the text area
        return  # Currently no processing

    def get_unicode_text(self, text, src_code):
        # python-docx works with unicode, so decode the byte string (Python 2)
        return unicode(text, src_code)

    def adjust_return_code(self, text):
        # Text read from a file keeps its line breaks, which would show up
        # as unwanted breaks in the document, so strip them
        text = text.replace("\n", "")
        text = text.replace("\r", "")
        return text

    def add_text(self, text):
        # Add text to the current paragraph
        self.latest_run = self.paragraph.add_run(text)

    def add_text_italic(self, text):
        # Add text in italics
        self.paragraph.add_run(text).italic = True

    def add_text_bold(self, text):
        # Add text in bold
        self.paragraph.add_run(text).bold = True

    def add_text_color(self, text, r, g, b):
        # Add colored text
        self.paragraph.add_run(text).font.color.rgb = RGBColor(r, g, b)

    def add_picture(self, filename, inch):
        # Insert a picture (width in inches)
        self.document.add_picture(filename, width=Inches(inch))

    def save(self, name):
        # Write the document out as a docx file
        self.document.save(name)

SimpleDocxService is a class that gathers the features evaluated this time behind a simple API. It provides the following functions.

| API | Behavior |
|---|---|
| set_normal_font(name, size) | Sets the font of the standard (Normal) text. name is the font name and size is the size in points. |
| add_head(text, lv) | Creates a heading. text is the heading text. lv is the level (0=Title, 1=Heading 1, ...). |
| open_text() | Opens a text area (*) |
| close_text() | Closes a text area (*) |
| get_unicode_text(text, src_code) | Generates and returns a unicode string from text in the character encoding specified by src_code. |
| adjust_return_code(text) | Generates and returns text with the line breaks removed. |
| add_text(text) | Writes text to the Word document. |
| add_text_italic(text) | Writes text to the Word document in italics. |
| add_text_bold(text) | Writes text to the Word document in bold. |
| add_text_color(text, r, g, b) | Writes text to the Word document in the color specified by r, g, b. Example: r=255, g=0, b=0 gives red. |
| add_picture(filename, inch) | Inserts the image file specified by filename. inch is the width in inches. |
| save(name) | Saves the document as a Word file with the file name specified by name. |

Some supplementary notes.

This concerns how python-docx behaves, so I will explain it with code. The code that actually writes text looks like this. It is taken from the official documentation.

p = document.add_paragraph('A plain paragraph having some ')
p.add_run('bold').bold = True
p.add_run(' and some ')
p.add_run('italic.').italic = True

The official documentation also shows how this is rendered. As you can see, you obtain a paragraph and add text to it. Text formatting can also be applied at the time of add_run. This probably reflects the structure of the docx file itself.

So open_text is used to start a new paragraph. python-docx itself does not require it, but the idea is that close_text marks the end of a series of additions. For that reason, text is written with the SimpleDocxService class as follows.

docx = SimpleDocxService()
docx.open_text()
docx.add_text("This is a my best book.\n")
docx.add_text("Do you know this?")
docx.close_text()

The reason I do this is the relationship with pictures. To insert a picture, use add_picture, as shown in the code above. Now suppose you write the following (code that uses python-docx directly, without the SimpleDocxService class).

p = document.add_paragraph('A plain paragraph having some ')
p.add_run("text1\n")
document.add_picture("sample.png", width=Inches(1.25))
p.add_run("text2\n")

In this case, naturally enough, the result is not

text1
<<sample.png diagram>>
text2

but

text1
text2
<<sample.png diagram>>

because both runs are added to the same paragraph p, while the picture is appended to the document after that paragraph. I wanted to make this explicit in the application code, so I introduced the concept of open and close. This also appears later in the sample application code, so please refer to that as well.

In Python, character encodings are rather troublesome. You need to pay attention to which encoding each library handles. python-docx appears to work with unicode, so Japanese text must be converted to unicode. There are various ways to convert, but I found it necessary to go through this get_unicode_text function to obtain unicode (apparently it is not SJIS just because it is Word...).
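As a side note, the unicode() built-in used in get_unicode_text exists only in Python 2. The following sketch shows one way to do the same conversion that should behave the same on Python 2 and Python 3; read_text and to_unicode are hypothetical helper names, not part of python-docx.

```python
import io


def read_text(path, encoding):
    """Read a file and return a unicode string, decoding with `encoding`."""
    # newline="" keeps \r\n as-is instead of translating it
    with io.open(path, "r", encoding=encoding, newline="") as f:
        return f.read()


def to_unicode(data, encoding):
    """Decode bytes to unicode; pass through text that is already unicode."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data
```

With these, the SJIS sample file could be loaded with read_text("sample.txt", "shift_jis") and the result passed to add_text as-is.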

This is code I arrived at by trial and error; sorry about that. If text containing line breaks is used as-is, unwanted line breaks end up in the document. I use this adjust_return_code function to prevent that.
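Deleting every line break also merges separate paragraphs of the source text into one. As an alternative, the sketch below splits the text on blank lines and joins only the lines inside each paragraph. split_paragraphs and its joiner parameter are my own illustration, not something the original code does.

```python
import re


def split_paragraphs(text, joiner=""):
    """Split text on blank lines and join the lines inside each paragraph.

    joiner="" suits Japanese text; joiner=" " suits English text.
    """
    # Normalize Windows (\r\n) and bare \r line endings to \n first
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    paragraphs = []
    for block in re.split(r"\n\s*\n", normalized):
        lines = [line.strip() for line in block.split("\n")]
        joined = joiner.join(line for line in lines if line)
        if joined:
            paragraphs.append(joined)
    return paragraphs
```

Each returned string could then be passed to its own document.add_paragraph(...) call, so that paragraph boundaries in the text file survive in the Word file.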

In the next section, I actually create a Word file using this SimpleDocxService class.

Actually creating a document

This time, as a sample, I create a Word file with the following structure.

  1. Title
  2. Insert a picture
  3. Heading 1 (1st)
  4. Text file string
  5. Insert another picture
  6. String formatted in Python
  7. Heading 1 (2nd)
  8. String specified in Python

Material

Below is the material used in the sample. I am relying on Pronama-chan again...

First, the image under the title is a file called report_top.png, which looks like this. [image: report_top.png]

Next, the text file is sample.txt, with the following contents. It is an excerpt from my blog...

I think that the basic role of a manager is to move multiple people and achieve results.
Therefore, it is difficult to ignore people's emotional and mental problems. I think this is a little different from accepting the other party. After considering such a problem to some extent, I dared to ignore it.

The other picture is a file called sample_pic.png, which looks like this. [image: sample_pic.png]

The sample shown below was created from these materials. Of course, you do not have to use these particular images and text. Note, however, that the text file is assumed to be Japanese text in SJIS with Windows \r\n line breaks.

Incidentally, the Pronama-chan material was obtained from the following site and then resized and had text added. http://pronama.azurewebsites.net/pronama/

Sample code

Below is sample code that uses the SimpleDocxService class to generate a Word file.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

# The SimpleDocxService class above is assumed to be saved
# as docx_simple_service.py
from docx_simple_service import SimpleDocxService

if __name__ == "__main__":

    docx = SimpleDocxService()

    # Font settings
    docx.set_normal_font("Courier New", 9)

    # Title
    docx.add_head(u"Main title", 0)

    # Insert an image
    docx.add_picture("report_top.png", 3.0)

    # Section heading
    docx.add_head(u"First topic", 1)

    # Put the contents of the shift-jis text file into the docx body
    with open("sample.txt", "rb") as f:
        text = f.read()
    docx.open_text()
    docx.add_text("\n")
    text = docx.get_unicode_text(text, 'shift-jis')
    text = docx.adjust_return_code(text)
    docx.add_text(text)
    docx.close_text()

    # Insert an image
    docx.add_picture("sample_pic.png", 5.0)

    # Generate text in code and put it in the docx.
    # This also demonstrates text formatting.
    docx.open_text()
    docx.add_text("\nThis is a my best book.")
    docx.add_text("\nThis is ")
    docx.add_text_bold("a my best")
    docx.add_text(" book.")
    docx.add_text("\nThis is ")
    docx.add_text_italic("a my best")
    docx.add_text(" book.")
    docx.add_text_color("\nThis is a my best book.", 0xff, 0x00, 0x00)
    docx.close_text()

    # Next section
    docx.add_head(u"Second topic", 1)

    # Generate text in code and put it in the docx.
    docx.open_text()
    docx.add_text(u"\n Yes, that's it.")
    docx.close_text()

    # Save the document
    docx.save("test.docx")

    print "complete."

Execution result

This produces a docx like the following. [image: python_docx_sample.jpg]

It has the structure described above.

With python-docx, the code itself is not difficult once you know how to write it. You can see how it works by comparing the sample code above with the code of the SimpleDocxService class. Within this range, I think you can do various things by modifying the code posted here.

License

I used the following software. Thank you for providing such wonderful software.

That's all.
