[Python] Rewrite past mistakes (XML)

When you write a program in a hurry, you sometimes settle for a careless design with a casual "well, good enough," and it turns into a mess later. The XML for a game configuration file I made in the past was terrible, so I fixed it using Python. (The meaning of each tag is not explained here.)

<?xml version="1.0" ?>
<charadata>
	<item>
		<id>0</id>
		<textureId>0</textureId>
		<finishIndex>2</finishIndex>
		<soundEffect>0</soundEffect>
		<motion>
			<frameCount>8</frameCount>
			<imageIndex>0</imageIndex>
		</motion>
		<motion>
			<frameCount>8</frameCount>
			<imageIndex>1</imageIndex>
		</motion>
		<motion>
			<frameCount>18</frameCount>
			<imageIndex>2</imageIndex>
		</motion>
		<motion>
			<frameCount/>
			<imageIndex/>
		</motion>
	</item>
	<item>
		<id>1</id>
		<textureId>2</textureId>
		<finishIndex>0</finishIndex>
		<soundEffect>0</soundEffect>
		<motion>
			<frameCount>4</frameCount>
			<imageIndex>0</imageIndex>
		</motion>
		<motion>
			<frameCount/>
			<imageIndex/>
		</motion>
		<motion>
			<frameCount/>
			<imageIndex/>
		</motion>
		<motion>
			<frameCount/>
			<imageIndex/>
		</motion>
	</item>
	<item>
		<id>2</id>
		<textureId>2</textureId>
		<finishIndex>1</finishIndex>
		<soundEffect>0</soundEffect>
		<motion>
			<frameCount>4</frameCount>
			<imageIndex>1</imageIndex>
		</motion>
		<motion>
			<frameCount>4</frameCount>
			<imageIndex>0</imageIndex>
		</motion>
		<motion>
			<frameCount/>
			<imageIndex/>
		</motion>
		<motion>
			<frameCount/>
			<imageIndex/>
		</motion>
	</item>
</charadata>

That is how it should have been. What I had actually written was this:

<?xml version="1.0" ?>
<charadata>
	<item>
	<id>0</id>
	<finishIndex>2</finishIndex> 
	<soundEffect>0</soundEffect>
	<textureId>0</textureId>
	<frameCount_00>8</frameCount_00>
	<frameCount_01>8</frameCount_01>
	<frameCount_02>18</frameCount_02>
	<frameCount_03></frameCount_03>
	<imageIndex_00>0</imageIndex_00>
	<imageIndex_01>1</imageIndex_01>
	<imageIndex_02>2</imageIndex_02>
	<imageIndex_03></imageIndex_03>
	</item>
	
	<item>
	<id>1</id>
	<textureId>2</textureId>
	<finishIndex>0</finishIndex>
	<soundEffect>0</soundEffect>
	<frameCount_00>4</frameCount_00>
	<frameCount_01></frameCount_01>
	<frameCount_02></frameCount_02>
	<frameCount_03></frameCount_03>
	<imageIndex_00>0</imageIndex_00>
	<imageIndex_01></imageIndex_01>
	<imageIndex_02></imageIndex_02>
	<imageIndex_03></imageIndex_03>
	</item>
	
	<item>
	<id>2</id>
	<textureId>2</textureId>
	<finishIndex>1</finishIndex>
	<soundEffect>0</soundEffect>
	<frameCount_00>4</frameCount_00>
	<frameCount_01>4</frameCount_01>
	<frameCount_02></frameCount_02>
	<frameCount_03></frameCount_03>
	<imageIndex_00>1</imageIndex_00>
	<imageIndex_01>0</imageIndex_01>
	<imageIndex_02></imageIndex_02>
	<imageIndex_03></imageIndex_03>
	</item>
</charadata>

The parts that should have been a list were managed as individually numbered tags. A terrible blunder.
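
To see why the numbered tags are awkward, here is a minimal sketch comparing how each layout is read with ElementTree. The two XML strings are cut-down versions of the files above, and the variable names are my own, chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Cut-down samples of the two layouts shown above (assumed for illustration).
LIST_XML = """<item>
  <motion><frameCount>8</frameCount><imageIndex>0</imageIndex></motion>
  <motion><frameCount>18</frameCount><imageIndex>2</imageIndex></motion>
</item>"""

NUMBERED_XML = """<item>
  <frameCount_00>8</frameCount_00><imageIndex_00>0</imageIndex_00>
  <frameCount_01>18</frameCount_01><imageIndex_01>2</imageIndex_01>
</item>"""

# List layout: one findall() covers however many <motion> entries exist.
list_item = ET.fromstring(LIST_XML)
from_list = [(m.findtext("frameCount"), m.findtext("imageIndex"))
             for m in list_item.findall("motion")]

# Numbered layout: tag names must be generated, and the count (2) is hard-coded.
numbered_item = ET.fromstring(NUMBERED_XML)
from_numbered = [(numbered_item.findtext("frameCount_%02d" % i),
                  numbered_item.findtext("imageIndex_%02d" % i))
                 for i in range(2)]

print(from_list)
print(from_numbered)
```

Both reads produce the same pairs, but only the list version keeps working unchanged if an item gains a fifth motion.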

I had an occasion to reuse this badly designed XML, but:

・ There was too much of it to fix by hand

・ Fixing it with a program is more accurate

So I wrote a quick program, shown below. (It really is just textbook-level usage.)

exchange_xml.py


# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET
import xml.dom.minidom as minidom

#Input source
src_path = "./aaaa.xml"
tree = ET.parse(src_path)
src_root = tree.getroot()
print("src_root:"+src_root.tag)

#Output destination
dest_path = "./aaaa02.xml"
dest_root = ET.Element(src_root.tag)
print("dest_root:"+dest_root.tag)

#Get & rewrite elements in XML
for item in src_root.findall("item"):
	#Read each element; if its text is None (similar to NULL), keep the empty string
	id=""
	textureId=""
	finishIndex=""
	soundEffect=""

	frameCount_00=""
	frameCount_01=""
	frameCount_02=""
	frameCount_03=""
	imageIndex_00=""
	imageIndex_01=""
	imageIndex_02=""
	imageIndex_03=""

	if item.find("id").text is not None:
		id = item.find("id").text
	
	if item.find("textureId").text is not None:
		textureId = item.find("textureId").text

	if item.find("finishIndex").text is not None:
		finishIndex = item.find("finishIndex").text

	if item.find("soundEffect").text is not None:
		soundEffect = item.find("soundEffect").text

	if item.find("frameCount_00").text is not None:
		frameCount_00 = item.find("frameCount_00").text

	if item.find("frameCount_01").text is not None:
		frameCount_01 = item.find("frameCount_01").text

	if item.find("frameCount_02").text is not None:
		frameCount_02 = item.find("frameCount_02").text

	if item.find("frameCount_03").text is not None:
		frameCount_03 = item.find("frameCount_03").text

	if item.find("imageIndex_00").text is not None:
		imageIndex_00 = item.find("imageIndex_00").text

	if item.find("imageIndex_01").text is not None:
		imageIndex_01 = item.find("imageIndex_01").text

	if item.find("imageIndex_02").text is not None:
		imageIndex_02 = item.find("imageIndex_02").text

	if item.find("imageIndex_03").text is not None:
		imageIndex_03 = item.find("imageIndex_03").text
	
	#Set the values on the output elements
	dest_item = ET.SubElement(dest_root, "item")
	
	dest_id = ET.SubElement(dest_item, "id")
	dest_id.text = id

	dest_textureId = ET.SubElement(dest_item, "textureId")
	dest_textureId.text = textureId

	dest_finishIndex = ET.SubElement(dest_item, "finishIndex")
	dest_finishIndex.text = finishIndex

	dest_soundEffect = ET.SubElement(dest_item, "soundEffect")
	dest_soundEffect.text = soundEffect
	
	#Set the list (motion) entries
	#Motion 0
	motion00 = ET.SubElement(dest_item, "motion")
	motion00_frameCount = ET.SubElement(motion00, "frameCount")
	motion00_frameCount.text = frameCount_00
	
	motion00_imageIndex = ET.SubElement(motion00, "imageIndex")
	motion00_imageIndex.text = imageIndex_00

	#Motion 1
	motion01 = ET.SubElement(dest_item, "motion")
	motion01_frameCount = ET.SubElement(motion01, "frameCount")
	motion01_frameCount.text = frameCount_01
	
	motion01_imageIndex = ET.SubElement(motion01, "imageIndex")
	motion01_imageIndex.text = imageIndex_01

	#Motion 2
	motion02 = ET.SubElement(dest_item, "motion")
	motion02_frameCount = ET.SubElement(motion02, "frameCount")
	motion02_frameCount.text = frameCount_02
	
	motion02_imageIndex = ET.SubElement(motion02, "imageIndex")
	motion02_imageIndex.text = imageIndex_02

	#Motion 3
	motion03 = ET.SubElement(dest_item, "motion")
	motion03_frameCount = ET.SubElement(motion03, "frameCount")
	motion03_frameCount.text = frameCount_03
	
	motion03_imageIndex = ET.SubElement(motion03, "imageIndex")
	motion03_imageIndex.text = imageIndex_03

#Write the reassembled XML to a new file
string = ET.tostring(dest_root, 'utf-8')
pretty_string = minidom.parseString(string).toprettyxml(indent='	')

with open(dest_path, 'w') as f:
    f.write(pretty_string)

That does the job. It should run on either Python 2.7 or 3.5. It would have been better to use constants or regular expressions for the element names, but the program is not very large, and I figured it was fine to show to people as it is. If you search for the basics of ElementTree and minidom you will find plenty of articles, so please refer to those. This time I only did two things: "read and parse XML" and "rewrite the XML data and output it to a file".
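
For reference, the "constants" idea could look roughly like the sketch below. This is not the script I actually ran: the constant names, `text_or_empty`, and `convert_item` are hypothetical, and an inline sample stands in for `./aaaa.xml` so that the sketch runs on its own. One behavioral difference is assumed and noted in the comments: a missing tag is treated as empty instead of raising an error.

```python
import xml.etree.ElementTree as ET

# Hypothetical constants; the tag names themselves come from the XML above.
SIMPLE_TAGS = ("id", "textureId", "finishIndex", "soundEffect")
MOTION_TAGS = ("frameCount", "imageIndex")
MOTION_COUNT = 4


def text_or_empty(parent, tag):
    """Return the child's text, or "" if the child or its text is missing.

    Unlike the original script, a missing tag does not raise an error here.
    """
    return parent.findtext(tag) or ""


def convert_item(src_item, dest_root):
    """Copy one numbered-tag <item> into dest_root as a list of <motion>."""
    dest_item = ET.SubElement(dest_root, "item")
    for tag in SIMPLE_TAGS:
        ET.SubElement(dest_item, tag).text = text_or_empty(src_item, tag)
    for i in range(MOTION_COUNT):
        motion = ET.SubElement(dest_item, "motion")
        for tag in MOTION_TAGS:
            numbered = "%s_%02d" % (tag, i)  # e.g. frameCount_00
            ET.SubElement(motion, tag).text = text_or_empty(src_item, numbered)
    return dest_item


# Inline sample (item 1 from the numbered-tag file) instead of ./aaaa.xml.
SAMPLE = """<charadata><item>
<id>1</id><textureId>2</textureId><finishIndex>0</finishIndex><soundEffect>0</soundEffect>
<frameCount_00>4</frameCount_00><frameCount_01></frameCount_01>
<frameCount_02></frameCount_02><frameCount_03></frameCount_03>
<imageIndex_00>0</imageIndex_00><imageIndex_01></imageIndex_01>
<imageIndex_02></imageIndex_02><imageIndex_03></imageIndex_03>
</item></charadata>"""

src_root = ET.fromstring(SAMPLE)
dest_root = ET.Element(src_root.tag)
for item in src_root.findall("item"):
    convert_item(item, dest_root)

print(ET.tostring(dest_root).decode("utf-8"))
```

Pretty-printing and writing to a file would work the same way as in the script above, via `minidom.parseString(...).toprettyxml(...)`.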

Actually, I am still a Python beginner. This just happened to be a task a program could handle, so I started out with a casual "Let's do it in Python! It seems to be popular! If the goal is the same, the code should come out about the same no matter who writes it. I'm curious about it!" ... And, well, Python is good. The best part is that you can try out a program interactively. This is kind to beginners (no, really). You can check the grammar and the error messages by typing the program in line by line. You can also think about the flow of the program while actually typing the code (though that is not really a good habit). If I need a bit of processing like this in the future, I think I will build it in Python.
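
As an example of something that is quick to confirm interactively: the `is not None` checks in the script exist because ElementTree returns `None` as the text of an empty element. A minimal sketch with a made-up sample string, which could be typed line by line at the prompt:

```python
import xml.etree.ElementTree as ET

item = ET.fromstring(
    "<item><frameCount_00>8</frameCount_00><frameCount_01></frameCount_01></item>"
)

filled = item.find("frameCount_00").text   # element with content
empty = item.find("frameCount_01").text    # empty element
missing = item.find("frameCount_99")       # tag that does not exist

print(repr(filled))
print(repr(empty))
print(repr(missing))
```

Note that for a tag that does not exist, `find()` itself returns `None`, so calling `.text` on it would raise an `AttributeError`. The script assumes every tag is present in every item.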
