[PYTHON] Recursive function that displays the XML tree structure [Note]

What I want to do

I came across this tip while writing the follow-up to this article. This is a memo on how to display the hierarchy of an XML document in Python when trying to understand its tree structure.

environment

Background

    1. The XML returned by OpenWeatherMap is nested up to four levels deep, and I wanted to work out exactly what to tell ElementTree when extracting the temperature and humidity data.
    2. To do that, I wanted a way to display the XML as a tree.
    3. Recursion turned out to work well, so I am writing it down here.
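As a rough illustration of the goal in item 1, here is a minimal sketch of what the extraction might look like once the structure is known. It assumes a hand-trimmed sample (`SAMPLE_XML`, not the real response) whose element and attribute names are taken from the output shown later in this note. The variable names are only for illustration.

```python
import xml.etree.ElementTree as et

# Hand-trimmed sample for illustration only; not the full API response.
SAMPLE_XML = """<weatherdata>
  <forecast>
    <time from="2017-07-09T15:00:00" to="2017-07-09T18:00:00">
      <temperature unit="celsius" value="23.34" min="22.9" max="23.34"/>
      <humidity value="100" unit="%"/>
    </time>
    <time from="2017-07-14T12:00:00" to="2017-07-14T15:00:00">
      <temperature unit="celsius" value="24.35" min="24.35" max="24.35"/>
      <humidity value="96" unit="%"/>
    </time>
  </forecast>
</weatherdata>"""

root = et.fromstring(SAMPLE_XML)

readings = []
# The path /weatherdata/forecast/time becomes 'forecast/time' here,
# because findall() paths are relative to the element they are called on.
for slot in root.findall('forecast/time'):
    temperature = slot.find('temperature')
    humidity = slot.find('humidity')
    readings.append((
        slot.get('from'),                   # start of the forecast window
        float(temperature.get('value')),    # degrees Celsius (units=metric)
        int(humidity.get('value')),         # percent
    ))

for start, temp_c, humidity_pct in readings:
    print(start, temp_c, humidity_pct)
```

Attribute values come back as strings, so the sketch converts them explicitly. A real response may lack some elements, in which case `find()` returns `None` and would need a check.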

The code I wrote first

# Library imports
import urllib.request
import urllib.parse
import xml.etree.ElementTree as et

url = 'http://api.openweathermap.org/data/2.5/forecast?'  # Base URL
# Values to send in the query string
query = {
    'id': '1850144',
    'APPID': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',  # The APPID you obtained
    'units': 'metric',
    'mode': 'xml',
}
url = url + urllib.parse.urlencode(query)  # Build the request URL
response = urllib.request.urlopen(url)  # Send the HTTP request
root = et.fromstring(response.read())  # Parse the response into an XML Element
for sub1 in root.iter('weatherdata'):
    print("->",sub1.tag, sub1.attrib)
    for sub2 in sub1:
        print("  ->",sub2.tag, sub2.attrib)
        for sub3 in sub2:
            print("    ->",sub3.tag, sub3.attrib)
            for sub4 in sub3:
                print("      ->",sub4.tag, sub4.attrib)

Output of the code I wrote first

-> weatherdata {}
  -> location {}
    -> name {}
    -> type {}
    -> country {}
    -> timezone {}
    -> location {'altitude': '0', 'latitude': '35.6895', 'longitude': '139.6917', 'geobase': 'geonames', 'geobaseid': '1850144'}
  -> credit {}
  -> meta {}
    -> lastupdate {}
    -> calctime {}
    -> nextupdate {}
  -> sun {'rise': '2017-07-08T19:33:18', 'set': '2017-07-09T09:59:30'}
  -> forecast {}
    -> time {'from': '2017-07-09T12:00:00', 'to': '2017-07-09T15:00:00'}
      -> symbol {'number': '800', 'name': 'clear sky', 'var': '02n'}
      -> precipitation {}
      -> windDirection {'deg': '197.01', 'code': 'SSW', 'name': 'South-southwest'}
      -> windSpeed {'mps': '4.32', 'name': 'Gentle Breeze'}
      -> temperature {'unit': 'celsius', 'value': '23.07', 'min': '23.07', 'max': '23.44'}
      -> pressure {'unit': 'hPa', 'value': '1019.84'}
      -> humidity {'value': '97', 'unit': '%'}
      -> clouds {'value': 'clear sky', 'all': '8', 'unit': '%'}
    -> time {'from': '2017-07-09T15:00:00', 'to': '2017-07-09T18:00:00'}
    (omitted)
    -> time {'from': '2017-07-14T09:00:00', 'to': '2017-07-14T12:00:00'}
      -> symbol {'number': '500', 'name': 'light rain', 'var': '10n'}
      -> precipitation {'unit': '3h', 'value': '0.67', 'type': 'rain'}
      -> windDirection {'deg': '347.003', 'code': 'NNW', 'name': 'North-northeast'}
      -> windSpeed {'mps': '6.57', 'name': 'Moderate breeze'}
      -> temperature {'unit': 'celsius', 'value': '24.1', 'min': '24.1', 'max': '24.1'}
      -> pressure {'unit': 'hPa', 'value': '1012.9'}
      -> humidity {'value': '98', 'unit': '%'}
      -> clouds {'value': 'broken clouds', 'all': '76', 'unit': '%'}

Code

The nested loops did not feel elegant, so I came up with the idea of using recursion and wrote a recursive function. I also changed the display to a directory-style path, which makes it possible to track both the depth and the full path of each element. In addition, the name of the root element is available generically as root.tag, so the function should be usable with other XML documents as well.

# Recursive function: print one element, then descend into its children
def print_elements(element, depth, fullpath):
    line = '[' + str(depth) + '] ' + fullpath + '/' + element.tag + str(element.attrib)
    print(line)
    for sub in element:
        print_elements(sub, depth + 1, fullpath + '/' + element.tag)

# Library imports
import urllib.request
import urllib.parse
import xml.etree.ElementTree as et

url = 'http://api.openweathermap.org/data/2.5/forecast?'  # Base URL
# Values to send in the query string
query = {
    'id': '1850144',
    'APPID': 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX',  # The APPID you obtained
    'units': 'metric',
    'mode': 'xml',
}
url = url + urllib.parse.urlencode(query)  # Build the request URL
response = urllib.request.urlopen(url)  # Send the HTTP request
root = et.fromstring(response.read())  # Parse the response into an XML Element
# root.iter(root.tag) yields the root itself (and any descendant with the same tag)
for sub in root.iter(root.tag):
    print_elements(sub, 1, '')

Result

[1] /weatherdata{}
[2] /weatherdata/location{}
[3] /weatherdata/location/name{}
[3] /weatherdata/location/type{}
[3] /weatherdata/location/country{}
[3] /weatherdata/location/timezone{}
[3] /weatherdata/location/location{'altitude': '0', 'latitude': '35.6895', 'longitude': '139.6917', 'geobase': 'geonames', 'geobaseid': '1850144'}
[2] /weatherdata/credit{}
[2] /weatherdata/meta{}
[3] /weatherdata/meta/lastupdate{}
[3] /weatherdata/meta/calctime{}
[3] /weatherdata/meta/nextupdate{}
[2] /weatherdata/sun{'rise': '2017-07-08T19:33:19', 'set': '2017-07-09T09:59:29'}
[2] /weatherdata/forecast{}
[3] /weatherdata/forecast/time{'from': '2017-07-09T15:00:00', 'to': '2017-07-09T18:00:00'}
[4] /weatherdata/forecast/time/symbol{'number': '500', 'name': 'light rain', 'var': '10n'}
[4] /weatherdata/forecast/time/precipitation{'unit': '3h', 'value': '0.0075', 'type': 'rain'}
[4] /weatherdata/forecast/time/windDirection{'deg': '204.001', 'code': 'SSW', 'name': 'South-southwest'}
[4] /weatherdata/forecast/time/windSpeed{'mps': '4.31', 'name': 'Gentle Breeze'}
[4] /weatherdata/forecast/time/temperature{'unit': 'celsius', 'value': '23.34', 'min': '22.9', 'max': '23.34'}
[4] /weatherdata/forecast/time/pressure{'unit': 'hPa', 'value': '1019.56'}
[4] /weatherdata/forecast/time/humidity{'value': '100', 'unit': '%'}
[4] /weatherdata/forecast/time/clouds{'value': 'scattered clouds', 'all': '32', 'unit': '%'}
[3] /weatherdata/forecast/time{'from': '2017-07-09T18:00:00', 'to': '2017-07-09T21:00:00'}
(omitted)
[3] /weatherdata/forecast/time{'from': '2017-07-14T12:00:00', 'to': '2017-07-14T15:00:00'}
[4] /weatherdata/forecast/time/symbol{'number': '500', 'name': 'light rain', 'var': '10n'}
[4] /weatherdata/forecast/time/precipitation{'unit': '3h', 'value': '0.27', 'type': 'rain'}
[4] /weatherdata/forecast/time/windDirection{'deg': '320.001', 'code': 'NW', 'name': 'Northwest'}
[4] /weatherdata/forecast/time/windSpeed{'mps': '5.11', 'name': 'Gentle Breeze'}
[4] /weatherdata/forecast/time/temperature{'unit': 'celsius', 'value': '24.35', 'min': '24.35', 'max': '24.35'}
[4] /weatherdata/forecast/time/pressure{'unit': 'hPa', 'value': '1013.6'}
[4] /weatherdata/forecast/time/humidity{'value': '96', 'unit': '%'}
[4] /weatherdata/forecast/time/clouds{'value': 'few clouds', 'all': '20', 'unit': '%'}

This seems to have worked well. It is nice to be able to take in the overall structure at a glance.
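To try the same idea without an API key, the recursion can be run against a small inline document. The following is a minimal sketch under stated assumptions: `SAMPLE_XML` is a hand-trimmed sample rather than the real response, and `collect_elements` is a hypothetical variant that returns the lines instead of printing them and starts from the root element directly.

```python
import xml.etree.ElementTree as et

# Hand-trimmed sample for illustration only; not the real API response.
SAMPLE_XML = """
<weatherdata>
  <location>
    <name/>
  </location>
  <forecast>
    <time from="2017-07-09T15:00:00" to="2017-07-09T18:00:00">
      <humidity value="100" unit="%"/>
    </time>
  </forecast>
</weatherdata>
"""

def collect_elements(element, depth=1, fullpath=''):
    """Return one '[depth] /path/tag{attrib}' line per element, depth-first."""
    path = fullpath + '/' + element.tag
    lines = ['[' + str(depth) + '] ' + path + str(element.attrib)]
    for sub in element:
        # Each child is one level deeper and lives under this element's path.
        lines.extend(collect_elements(sub, depth + 1, path))
    return lines

root = et.fromstring(SAMPLE_XML)
lines = collect_elements(root)
for line in lines:
    print(line)
```

Returning a list rather than printing makes the result easy to check or write to a file, while the line format stays the same as in the listing above.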

Summary

The tree structure was surprisingly deep. This notation seems good for getting an overview: it works as an intermediate representation for understanding the structure, and it is useful for sorting out which elements and which attributes you actually want. The recursive function can (and should) be reusable with other XML documents.
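Because every `time` element repeats the same children, the full listing gets long. One possible variation, offered as an assumption rather than as part of the original code, is to collapse the listing to one entry per distinct path together with the attribute names seen there. `summarize_paths` and the inline sample are hypothetical.

```python
import xml.etree.ElementTree as et

# Hand-trimmed sample for illustration only; not the real API response.
SAMPLE_XML = """<weatherdata>
  <forecast>
    <time from="2017-07-09T15:00:00" to="2017-07-09T18:00:00">
      <temperature unit="celsius" value="23.34"/>
      <precipitation/>
    </time>
    <time from="2017-07-14T12:00:00" to="2017-07-14T15:00:00">
      <temperature unit="celsius" value="24.35"/>
      <precipitation unit="3h" value="0.27" type="rain"/>
    </time>
  </forecast>
</weatherdata>"""

def summarize_paths(element, fullpath='', summary=None):
    """Map each distinct path to the attribute names seen on it."""
    if summary is None:
        summary = {}
    path = fullpath + '/' + element.tag
    names = summary.setdefault(path, [])
    for key in element.attrib:
        if key not in names:  # keep first-seen order, skip repeats
            names.append(key)
    for sub in element:
        summarize_paths(sub, path, summary)
    return summary

summary = summarize_paths(et.fromstring(SAMPLE_XML))
for path, names in summary.items():
    print(path, names)
```

With this sample the two `time` elements collapse into a single path, and `precipitation` picks up its attributes from the second one even though the first is empty.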

The site that I used as a reference

Result screen in Jupyter Notebook

[Screenshot: the tree structure displayed like a directory (木構造をディレクトリの様に表示.PNG)]
