A Python program that checks whether content has been updated and, if so, git commits & pushes it to GitLab

What this article covers

- Track some text content over time and check whether it has changed since the last check.
- If it has changed, use the GitLab API to git commit & push the content to GitLab.
- Everything is implemented in Python.
- EtherCalc serves as the content whose updates are tracked.

References (with thanks)

GitLab Docs > API Docs > API resources

Environment

Ubuntu 20.04


$ python3 --version
Python 3.8.2

Installing GitLab and creating a project

Installing GitLab

Now let's get started. The git server is an on-premises GitLab instance. For this article, I set it up quickly on my local machine with Docker. Move to a suitable directory in a terminal and run the following.


git clone https://github.com/sameersbn/docker-gitlab

cd docker-gitlab

docker-compose up -d

After the containers started, I waited a while and ran "docker-compose ps", which showed the following.


           Name                         Command                  State                               Ports                        
----------------------------------------------------------------------------------------------------------------------------------
docker-gitlab_gitlab_1       /sbin/entrypoint.sh app:start    Up (healthy)   0.0.0.0:10022->22/tcp, 443/tcp, 0.0.0.0:10080->80/tcp
docker-gitlab_postgresql_1   /sbin/entrypoint.sh              Up             5432/tcp                                             
docker-gitlab_redis_1        docker-entrypoint.sh --log ...   Up             6379/tcp            

The GitLab container's HTTP port is mapped to 10080, so open http://localhost:10080/ in a browser. When the screen below appears, change the root password.

Screenshot 2019-09-03 15.46.59.png

Go to the screen below and create a development user (username can be anything).

Screenshot 2019-09-03 15.56.31.png

Creating a GitLab project

After logging in as the development user, select "Create a project" on the screen below.

2020-08-17-01.png

I named the project "ethercalc_backup" as shown below, but any name will do. Press the "Create project" button to create the project.

2020-08-17-02.png

The project was created as follows. The Project ID is displayed below the project name (3 in this article). You will use this ID later in your Python code.

2020-08-17-03.png

Getting an access token for the GitLab API

The Python code will call the GitLab API later, and that requires an access token, so create one now. As shown below, open the pull-down menu at the upper right of the browser window and select "Settings".

2020-08-17-10.png

On the screen below, select "Access Tokens" from the menu on the left.

2020-08-17-11.png

On the screen below, enter any name in "Name", check "api" under "Scopes", and press the "Create personal access token" button.

2020-08-17-12.png

The access token was created as follows. Copy it to the clipboard and save it. In this article, we will use "6f8YXyrZ1SCSADHTJ2L9" as an access token.

2020-08-17-13.png
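
Before moving on, it can be worth confirming that the token and the Project ID fit together. The sketch below only builds the request that the GitLab API expects (a PRIVATE-TOKEN header on a projects/<id> URL) and does not send it. The function name build_project_request is made up for illustration; the base URL, Project ID, and token are the values used in this article.

```python
import urllib.request

GITLAB_BASE_URI = "http://localhost:10080/"    # GitLab container from this article
GITLAB_PROJECT_ID = 3                          # Project ID shown on the project page
GITLAB_PRIVATE_TOKEN = "6f8YXyrZ1SCSADHTJ2L9"  # example token from this article


def build_project_request(base_uri, project_id, private_token):
    """Build (but do not send) a GET request for the project's metadata."""
    uri = f"{base_uri}api/v4/projects/{project_id}"
    return urllib.request.Request(uri, headers={"PRIVATE-TOKEN": private_token})


request = build_project_request(GITLAB_BASE_URI, GITLAB_PROJECT_ID, GITLAB_PRIVATE_TOKEN)
print(request.get_method(), request.full_url)
```

Passing this request to urllib.request.urlopen should return the project's JSON description if the token is valid; an HTTP 401 error would point to a problem with the token.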

Installing EtherCalc

This article uses EtherCalc as the source of text content. Like GitLab, it is deployed locally with Docker. In a terminal, move to a suitable directory separate from the GitLab one, create a docker-compose.yml with the same contents as https://github.com/audreyt/ethercalc/blob/master/docker-compose.yml, and start the containers.


wget https://raw.githubusercontent.com/audreyt/ethercalc/master/docker-compose.yml

docker-compose up -d

After the containers started, I waited a while and ran "docker-compose ps", which showed the following.


            Name                          Command               State          Ports        
--------------------------------------------------------------------------------------------
docker-ethercalc_ethercalc_1   sh -c REDIS_HOST=$REDIS_PO ...   Up      0.0.0.0:80->8000/tcp
docker-ethercalc_redis_1       docker-entrypoint.sh redis ...   Up      6379/tcp         

The EtherCalc container's HTTP port is mapped to 80, so open http://localhost/ in a browser.

2020-08-17-04.png

Create two EtherCalc sheets for testing. Open a text editor, create new files foo.sc and bar.sc, and save them with the following contents.


editor foo.sc

foo.sc



socialcalc:version:1.0
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=SocialCalcSpreadsheetControlSave
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

# SocialCalc Spreadsheet Control Save
version:1.0
part:sheet
part:edit
part:audit
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

version:1.5
cell:A1:t:foo1
cell:A2:t:foo2
sheet:c:1:r:2:tvf:1
valueformat:1:text-wiki
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

version:1.0
rowpane:0:1:1
colpane:0:1:1
ecell:A1
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

--SocialCalcSpreadsheetControlSave--

editor bar.sc

bar.sc



socialcalc:version:1.0
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=SocialCalcSpreadsheetControlSave
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

# SocialCalc Spreadsheet Control Save
version:1.0
part:sheet
part:edit
part:audit
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

version:1.5
cell:A1:t:bar1
cell:A2:t:bar2
sheet:c:1:r:2:tvf:1
valueformat:1:text-wiki
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

version:1.0
rowpane:0:1:1
colpane:0:1:1
ecell:A1
--SocialCalcSpreadsheetControlSave
Content-type: text/plain; charset=UTF-8

--SocialCalcSpreadsheetControlSave--

foo.sc and bar.sc above are text files in SocialCalc format, which EtherCalc can import. Exporting and importing in SocialCalc format has the advantage that the sheet's formatting (appearance) is restored as well. CSV files can also be imported, but the formatting is lost.

Run the following to import them.


curl -X PUT -H 'Content-Type: text/x-socialcalc' --data-binary @foo.sc http://localhost/_/foo

curl -X PUT -H 'Content-Type: text/x-socialcalc' --data-binary @bar.sc http://localhost/_/bar
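
The same import can be expressed in Python with urllib, which is what the rest of this article uses. This is a minimal sketch that mirrors the curl options above (PUT method, text/x-socialcalc content type, file contents as the body); build_import_request is a hypothetical helper name, and the request is only built here, not sent.

```python
import urllib.request


def build_import_request(sheet_name, socialcalc_text, base_uri="http://localhost/"):
    """Build (but do not send) the PUT request that imports one sheet."""
    return urllib.request.Request(
        f"{base_uri}_/{sheet_name}",
        data=socialcalc_text.encode("utf-8"),
        headers={"Content-Type": "text/x-socialcalc"},
        method="PUT",
    )


# Stand-in text; in practice this would be the contents of foo.sc.
request = build_import_request("foo", "socialcalc:version:1.0\n")
print(request.get_method(), request.full_url)
```

To actually perform the import, the request would be passed to urllib.request.urlopen with the full text of foo.sc or bar.sc as the body.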

Open http://localhost/foo and http://localhost/bar in your browser. If the cells of each sheet contain the following data, the import succeeded.

2020-08-17-05.png

2020-08-17-06.png

We will download the files in SocialCalc and CSV formats from these URLs and manage them in GitLab.
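
Each sheet URL maps to two download URLs: the SocialCalc export lives under /_/ (the same path used for the import above), and the CSV export is the sheet URL with .csv appended. A small sketch of that mapping, using a hypothetical helper named export_uris:

```python
def export_uris(sheet_uri):
    """Return (socialcalc_uri, csv_uri) for a sheet URL such as http://localhost/foo."""
    base, _, name = sheet_uri.rpartition("/")
    return f"{base}/_/{name}", f"{sheet_uri}.csv"


for uri in ("http://localhost/foo", "http://localhost/bar"):
    print(export_uris(uri))
```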

Python code

Please forgive the rough, unpolished code.

The processing steps are as follows. For each of the foo and bar sheets:

- Download the file from both EtherCalc and GitLab (in both SocialCalc and CSV formats).
- If the file does not exist in GitLab, add it to the git repository as a new file.
- If the file exists in GitLab, compare the EtherCalc and GitLab versions and, if they differ, update the git repository.
- Git commit & push using the GitLab API.
- Afterwards, use the GitLab API to get the git diff and write the result to the log.
- Backups are stored under a directory named ethercalc in the git repository.
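
The per-file decision in the steps above boils down to three cases. A minimal sketch, with a hypothetical helper name decide_action and None standing in for "the file was not found in GitLab":

```python
def decide_action(content_ethercalc, content_gitlab):
    """Return "create", "update", or None (nothing to commit).

    content_gitlab is None when the file is not yet in the repository.
    """
    if content_gitlab is None:
        return "create"
    if content_ethercalc != content_gitlab:
        return "update"
    return None


print(decide_action("foo1", None))    # create
print(decide_action("foo1", "foo0"))  # update
print(decide_action("foo1", "foo1"))  # None
```

The action names "create" and "update" are the ones the GitLab commits API accepts in its actions list.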

Below is the Python code. The code uses a variable called logger, but please note that the code around logging has been omitted.

ethercalc_backup.py



import time
import datetime
import urllib.error
import urllib.parse
import urllib.request
import json
import pprint
import re
import base64

# URLs of the EtherCalc sheets managed in git
ethercalc_uris = ["http://localhost/foo", "http://localhost/bar"]

# GitLab settings
gitlab_base_uri = "http://localhost:10080/"

# Backup destination directory in the git repository
gitlab_backup_directory = "ethercalc"

gitlab_private_token = "6f8YXyrZ1SCSADHTJ2L9"
gitlab_project_id = 3

# Timestamp of this run, used in commit messages
str_now = datetime.datetime.today().strftime("%Y%m%d_%H%M%S")

# Newline
LF = '\n'


def get_gitlab_file(private_token, file_path):
    """
    Get one file from the GitLab repository.

    Parameters
    ----------
    private_token : str
        Access token for the GitLab API

    file_path : str
        File path relative to the top of the git repository

    Returns
    -------
    dict
        Parsed JSON response from GitLab, or an empty dict if the file does not exist (HTTP 404)
    """

    # https://docs.gitlab.com/ee/api/repository_files.html
    gitlab_uri = f"{gitlab_base_uri}api/v4/projects/{gitlab_project_id}/repository/files/{urllib.parse.quote(file_path, safe='')}?ref=master"
    logger.info(f"gitlab_uri={gitlab_uri}")
    headers = {
        "PRIVATE-TOKEN": private_token
    }
    request = urllib.request.Request(gitlab_uri, headers=headers)
    try:
        with urllib.request.urlopen(request) as res:
            res_files = json.loads(res.read())
    except urllib.error.HTTPError as ee:
        if ee.code == 404:
            # The file is not in the repository yet
            return {}
        raise
    # logger.debug(f"gitlab res_files={LF}{pprint.pformat(res_files)}")
    return res_files


def compare_ethercalc_and_gitlab(actions, ethercalc_uri, git_filename):
    """
    Get a file from both EtherCalc and the GitLab repository, compare them,
    and append an action to `actions` if they differ.

    Parameters
    ----------
    actions : list
        List of actions later passed to GitLab's commits API

    ethercalc_uri : str
        EtherCalc URI

    git_filename : str
        File name in the git repository

    Returns
    -------
    None
    """

    logger.info(f"ethercalc URL={ethercalc_uri}")

    # Download from EtherCalc
    request = urllib.request.Request(ethercalc_uri)
    with urllib.request.urlopen(request) as res:
        content_ethercalc = res.read().decode("utf-8")
    # logger.debug(f"content_ethercalc={LF}{content_ethercalc}")

    # Download from GitLab
    action_str = ""
    file_path = f"{gitlab_backup_directory}/{git_filename}"
    res_gitlab_file = get_gitlab_file(gitlab_private_token, file_path)
    try:
        content_gitlab = base64.b64decode(res_gitlab_file["content"]).decode("utf-8")
    except KeyError:
        # The file is not in GitLab yet: create it, then git commit & push later
        action_str = "create"
    else:
        # logger.debug(f"content_gitlab={LF}{content_gitlab}")

        # Compare the files downloaded from EtherCalc and GitLab
        if content_ethercalc == content_gitlab:
            logger.info("content_ethercalc == content_gitlab")
        else:
            logger.info("content_ethercalc != content_gitlab")
            # The contents differ: git commit & push later
            action_str = "update"

    # Register an action only when it is "create" or "update"
    if action_str:
        action = {
            "action": action_str,
            "file_path": file_path,
            "content": content_ethercalc
        }
        actions.append(action)


def main():
    # Process each URL in ethercalc_uris
    actions = list()
    count_commit = 0
    re_compile = re.compile(r".*/(.*?)$")
    for index, ethercalc_uri in enumerate(ethercalc_uris):
        basename, = re_compile.match(ethercalc_uri).groups()    # Extract the sheet name ("foo" or "bar")
        # Insert "_/" before the last path segment: http://localhost/foo -> http://localhost/_/foo
        socialcalc_uri = ethercalc_uri[::-1].replace(basename[::-1], basename[::-1] + "/_", 1)[::-1]
        csv_uri = ethercalc_uri + ".csv"
        logger.info(f"[{index}] {basename}")

        # Download from EtherCalc and GitLab in SocialCalc format and compare the contents
        time.sleep(0.5)     # Pause briefly so as not to overload the server
        compare_ethercalc_and_gitlab(actions, socialcalc_uri, f"{basename}.sc")

        # Download from EtherCalc and GitLab in csv format and compare the contents
        time.sleep(0.5)     # Pause briefly so as not to overload the server
        compare_ethercalc_and_gitlab(actions, csv_uri, f"{basename}.csv")

        if len(actions) == 0:
            # Skip the commit when EtherCalc and GitLab have the same contents
            continue

        # git commit & push
        # https://docs.gitlab.com/ee/api/commits.html
        gitlab_uri = f"{gitlab_base_uri}api/v4/projects/{gitlab_project_id}/repository/commits"
        commit_message = f"backup {str_now} {basename}"
        logger.info(f'git commit -m "{commit_message}"')
        headers = {
            "PRIVATE-TOKEN": gitlab_private_token,
            "Content-Type": "application/json"
        }
        payload = {
            "branch": "master",
            "commit_message": commit_message,
            "actions": actions
        }
        logger.debug(f"payload={LF}{pprint.pformat(payload)}")

        # POST is an HTTP method, not a header, so pass it to Request directly
        request = urllib.request.Request(gitlab_uri, json.dumps(payload).encode("utf-8"), headers=headers, method="POST")
        with urllib.request.urlopen(request) as res:
            res_commit = json.loads(res.read())
        logger.debug(f"gitlab res_commit={LF}{pprint.pformat(res_commit)}")

        # Get the git diff of the new commit and write it to the log
        # https://docs.gitlab.com/ee/api/commits.html
        gitlab_uri = f"{gitlab_base_uri}api/v4/projects/{gitlab_project_id}/repository/commits/{res_commit['id']}/diff"
        logger.info(f"git diff ( {res_commit['id']} )")
        headers = {
            "PRIVATE-TOKEN": gitlab_private_token,
        }
        request = urllib.request.Request(gitlab_uri, headers=headers)
        with urllib.request.urlopen(request) as res:
            res_diff = json.loads(res.read())
        logger.info(f"gitlab res_diff={LF}{pprint.pformat(res_diff)}")

        count_commit += 1
        actions = list()

    logger.info(f"{count_commit} git commit{'' if count_commit == 1 else 's'}")


if __name__ == '__main__':
    try:
        main()
    except Exception as ee:
        logger.exception(ee)
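
The listing above relies on a logger variable that is never defined, because the logging code was left out. As an assumption about what that setup could look like (not the author's original code), the following minimal sketch placed near the top of the script would make it runnable. The logger name "ethercalc_backup" and the log format are arbitrary choices.

```python
import logging
import sys

# Assumed setup: log everything from DEBUG upward to standard output.
logging.basicConfig(
    stream=sys.stdout,
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(message)s",
)
logger = logging.getLogger("ethercalc_backup")

logger.info("logger is ready")
```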

First trial

For the first run, start with the GitLab repository empty. Run the following in your terminal:


python3 ethercalc_backup.py

The following is displayed at the end of the output.


2 git commits

Check the project on the GitLab screen. As shown below, it has 2 commits, and a new ethercalc directory has been created.

2020-08-17-15.png

Under the ethercalc directory there were two commits, one for foo and one for bar, as shown below, and each commit created two new files, one in SocialCalc format and one in CSV format.

2020-08-17-16.png

You can check the contents by clicking the file name.
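
For reference, each of those first-run commits corresponds to one request body for the GitLab commits API with two "create" actions. The sketch below rebuilds such a body for foo; build_commit_payload is a hypothetical helper, and the file contents and timestamp are placeholders rather than real data.

```python
import json


def build_commit_payload(basename, contents, timestamp, directory="ethercalc"):
    """Build the JSON body for one commit that creates every file in `contents`.

    `contents` maps a file extension ("sc", "csv") to the text to store.
    """
    actions = [
        {
            "action": "create",
            "file_path": f"{directory}/{basename}.{ext}",
            "content": text,
        }
        for ext, text in contents.items()
    ]
    return {
        "branch": "master",
        "commit_message": f"backup {timestamp} {basename}",
        "actions": actions,
    }


# Placeholder contents and timestamp, for illustration only.
payload = build_commit_payload("foo", {"sc": "...", "csv": "foo1\nfoo2\n"}, "20200817_120000")
print(json.dumps(payload, indent=2))
```

Because both files travel in one actions list, they land in a single commit, which is why the repository shows one commit per sheet rather than one per file.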

Second trial

For the second run, I changed only the contents of foo in EtherCalc and then ran the Python code. I added "Hello" and other text as follows.

2020-08-17-17.png

Do the following on your terminal:


python3 ethercalc_backup.py

The following is displayed at the end of the output.


1 git commit

Check the project on the GitLab screen. It now has 3 commits, as shown below.

2020-08-17-18.png

When I clicked "3 Commits", a new commit for foo had been added as shown below, while nothing had been added for bar.

2020-08-17-20.png

When I clicked on the commit of the added foo, the difference from the previous commit was displayed as shown below.

2020-08-17-21.png
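
The same difference is what the script writes to its log: the commit diff endpoint returns a list with one entry per changed file, each carrying the path in new_path and the patch text in diff. The following sketch condenses such a response into one line per file. summarize_diff is a hypothetical helper, the sample data is hand-written in the shape of the response rather than real output, and the line counting is only a rough heuristic.

```python
def summarize_diff(res_diff):
    """Return one line per changed file: its path and added/removed line counts."""
    lines = []
    for entry in res_diff:
        body = entry["diff"].splitlines()
        # Count patch lines, skipping any "+++"/"---" file header lines.
        added = sum(1 for line in body if line.startswith("+") and not line.startswith("+++"))
        removed = sum(1 for line in body if line.startswith("-") and not line.startswith("---"))
        lines.append(f"{entry['new_path']}: +{added} -{removed}")
    return lines


# Hand-written sample in the shape of the API response (not real output).
sample = [
    {
        "new_path": "ethercalc/foo.csv",
        "diff": "@@ -1,2 +1,3 @@\n foo1\n-foo2\n+foo2 changed\n+Hello\n",
    }
]
print(summarize_diff(sample))
```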

Third trial

For the third run, I ran the Python code without changing anything in EtherCalc.


python3 ethercalc_backup.py

The following is displayed at the end of the output.


0 git commits

That's all.
