The route map and the graph network go well together, and when you try to think about something on the route map, you will touch on the shortest path problem that anyone can think of.
Turn the railroad map into a simple graph problem and use Dijkstra's algorithm to find the shortest path. Dijkstra's algorithm is built into the Python library NetworkX, which is convenient, so we will use it this time.
The program is uploaded to Google Colaboratory so you can run it in your browser. Here https://nbviewer.jupyter.org/github/galileo15640215/train/blob/master/train_dijkstra.ipynb
Even though it is a route map, it is too large to handle national data suddenly, and before considering routes nationwide, first extract and create Tokyo Metro routes as a small problem.
Order ・ Data processing ・ Route map within Tokyo Metro ・ The shortest route within the Tokyo Metro area ・ Nationwide route map ・ The shortest route nationwide
From station data.jp https://ekidata.jp/ Business data company20200309.csv Route data line20200306free.csv Station data station20200316free.csv Connection station data join20200306.csv Download. (CSV file name as of May 1, 2020)
We will create the necessary data based on this.
#Required modules
import pandas as pd
import matplotlib.pyplot as plt
import japanize_matplotlib
import numpy as np
import networkx as nx
#Create a pandsa format table from a csv file
station = pd.read_csv("station20200316free.csv")
join = pd.read_csv("join20200306.csv")
#pref = pd.read_csv("pref.csv")
line = pd.read_csv("line20200306free.csv")
company = ("company20200309.csv")
So far, we've put the necessary data into the Pandas table.
Next, extract the data for Tokyo Metro only.
#Tokyo Metro extracts only Tokyo Metro stations from stations nationwide...company_cd == 18
metro = station[["station_cd", "station_name", "line_cd", "lon", "lat"]]
metro = pd.merge(metro, line, on = 'line_cd')
metro = metro[metro["company_cd"] == 18]
metro = metro[["station_cd", "station_name", "line_cd", "lon_x", "lat_x", "line_name", "line_color_c", "line_color_t"]]
lon = metro["lon_x"]
lat = metro["lat_x"]
metro["lon"] = lon
metro["lat"] = lat
metro = metro[["station_cd", "station_name", "line_cd", "lon", "lat", "line_name"]]
#Route to extract the connection side of Tokyo Metro...line_cd == 28001---28010
metro_join = join[(join["line_cd"]==28001)|(join["line_cd"]==28002)|(join["line_cd"]==28003)|(join["line_cd"]==28004)|(join["line_cd"]==28005)|(join["line_cd"]==28006)|(join["line_cd"]==28007)|(join["line_cd"]==28008)|(join["line_cd"]==28009)|(join["line_cd"]==28010)]
metro_join = metro_join[["station_cd1", "station_cd2"]]
From here, create a graph of the Tokyo Metro route map.
#Graph declaration
G = nx.Graph()
#Make the apex the station name
G.add_nodes_from(metro["station_name"])
#Set the coordinates of plot
pos={}
for i, j, k in zip(metro["station_name"], metro["lon"], metro["lat"]):
pos[i] = (j, k)
#List e to station_name and station_Store cd and link
e = []
for i, j in zip(metro["station_name"], metro["station_cd"]):
e.append([i, j])
#Add edge information to the graph
for i, j in zip(metro_join["station_cd1"], metro_join["station_cd2"]):
for k in e:
if k[1] == i:
for l in e:
if l[1] == j:
G.add_edge(k[0], l[0])
#Graph output settings
plt.figure(figsize=(10,10),dpi=200)
plt.title('Tokyo metro', fontsize=20)
plt.axes().set_aspect('equal', 'datalim')
nx.draw_networkx(G, pos, node_color='b', alpha=0.8, node_size=10, font_size=5, font_family='IPAexGothic')
plt.show()
I was able to visualize the route map of Tokyo Metro. From here, give the graph the information needed to solve the shortest path problem.
#Data creation to give the distance between stations as a weight on the side
dist = []
cd1_lat = []
cd1_lon = []
cd2_lat = []
cd2_lon = []
not_exist = [] #station that exists in the join table but not in the station table_Store cd
for i, j in zip(metro_join["station_cd1"], metro_join["station_cd2"]):
flag = True
for k, l, m in zip(metro["station_cd"], metro["lat"], metro["lon"]):
if i == k:
cd1_x = l
cd1_y = m
cd1_lat.append(l)
cd1_lon.append(m)
flag = False
if j == k:
cd2_x = l
cd2_y = m
cd2_lat.append(l)
cd2_lon.append(m)
if flag:
not_exist.append([i, j])
#print(i, j)
else:
dist.append(((cd1_x-cd2_x)**2 + (cd1_y-cd2_y)**2)**0.5)
#If you execute it as it is, an error ValueError: Length of values does not match length of index
#Apparently"station_cd" ==Removed 2800701 and 2800702 because they do not exist in the station table and are unnecessary from the join table
#The following two lines...metro_join = metro_join[metro_join["station_cd1"] != 2800701]...Equivalent to
for i in range(len(not_exist)):
metro_join = metro_join[metro_join["station_cd1"] != not_exist[i][0]]
#Add column to join table
metro_join["cd1_lat"] = cd1_lat
metro_join["cd1_lon"] = cd1_lon
metro_join["cd2_lat"] = cd2_lat
metro_join["cd2_lon"] = cd2_lon
metro_join["distance"] = dist
Add edge weights to the graph
#nodes is station_name
#Give the graph edge weights
for i, j, m in zip(metro_join["station_cd1"], metro_join["station_cd2"], metro_join["distance"]):
for k in e:
if k[1] == i:
for l in e:
if l[1] == j:
G.add_edge(k[0], l[0], weight=m)
The NetworkX library, Networkx.dijkstra_path (), which is needed to solve the shortest path problem, does not seem to support the Japanese apex. Change the vertices from station_name to station_cd and declare a new graph.
#nodes is station_cd
G = nx.Graph()
G.add_nodes_from(metro["station_cd"])
pos={}
for i, j, k in zip(metro["station_cd"], metro["lon"], metro["lat"]):
pos[i] = (j, k)
for i, j, m in zip(metro_join["station_cd1"], metro_join["station_cd2"], metro_join["distance"]):
G.add_edge(i, j, weight=m)
#station_The station with the same name when cd is the apex_Even with name, station for each line_Since cd is set, it does not have a connection side with other lines as it is
#Therefore, connect stations with the same name with a weight of 0.
for i, j in zip(metro["station_name"], metro["station_cd"]):
for k, l in zip(metro["station_name"], metro["station_cd"]):
if i == k and j != l:
G.add_edge(j, l, weight=0)
#Set start and finish stations
st_name = "Ogikubo"
go_name = "Nishi-Funabashi"
#station_from name to station_Search cd
for i, j in zip(metro["station_name"], metro["station_cd"]):
if i == st_name:
st = j
if i == go_name:
go = j
#Search for the shortest path
dij = nx.dijkstra_path(G, st, go)
out = []
for k in range(len(dij)):
for i, j in zip(metro["station_name"], metro["station_cd"]):
if j == dij[k]:
out.append(i)
print(out)
#Declare a graph for the shortest path
G_root = nx.Graph()
G_root.add_nodes_from(dij)
pos_root = {}
for l in dij:
for i, j, k in zip(metro["station_cd"], metro["lon"], metro["lat"]):
if l == i:
pos_root[l] = (j, k)
for i in range(len(dij)-1):
G_root.add_edge(dij[i], dij[i+1])
plt.figure(figsize=(10,10),dpi=200)
plt.title('Tokyo metro', fontsize=20)
plt.axes().set_aspect('equal', 'datalim')
nx.draw_networkx(G, pos, node_color='b', alpha=0.3, node_size=10, with_labels= False)
c = ['green' if n==st else 'red' if n!=go else'yellow' for n in G_root.nodes()]
n_size = [30 if n==st else 10 if n!=go else 30 for n in G_root.nodes()]
nx.draw_networkx(G_root, pos_root, node_color=c, alpha=0.9, node_size=n_size, with_labels= False)
plt.show()
Execution result ['Ogikubo','Minami Asagaya',' Shinkoenji','Higashikoenji','Shin Nakano','Nakano Sakagami','Nishi Shinjuku','Shinjuku','Shinjuku 3-chome','Shinjuku Gyoenmae', "Yotsuya 3-chome", "Yotsuya", "Akasaka Mitsuke", "In front of the Parliament Building", "In front of the Parliament Building", "Kasumigaseki", "Hibiya", "Hibiya", "Ginza", "Ginza", "Kyobashi" ,'Nipponbashi','Nihonbashi','Kayabacho','Monzennakacho','Kiba','Toyocho','Minamisagocho','Nishi Kasai','Kasai','Urayasu','Minami Gyotoku', 'Gyotoku','Myonori',' Log Nakayama',' Nishifunabashi']
I was able to draw the shortest route on the Tokyo Metro. How does it compare to the actual excellent transfer guidance app?
The program has more transfers. However, the route is almost the same for a small one like Tokyo Metro.
The same thing is done nationwide.
zen_join = join
zen = station[["station_cd", "station_name", "line_cd", "lon", "lat"]]
zen = pd.merge(zen, line, on='line_cd')
zen = zen[["station_cd", "station_name", "line_cd", "lon_x", "lat_x", "line_name"]]
lon = zen["lon_x"]
lat = zen["lat_x"]
zen["lon"] = lon
zen["lat"] = lat
zen = zen[["station_cd", "station_name", "line_cd", "lon", "lat", "line_name"]]
cd1_lat = []
cd1_lon = []
cd2_lat = []
cd2_lon = []
dist = []
not_exist = []
for i, j in zip(zen_join["station_cd1"], zen_join["station_cd2"]):
flag = True
for k, l, m in zip(zen["station_cd"], zen["lat"], zen["lon"]):
if i == k:
cd1_x = l
cd1_y = m
cd1_lat.append(l)
cd1_lon.append(m)
flag = False
if j == k:
cd2_x = l
cd2_y = m
cd2_lat.append(l)
cd2_lon.append(m)
if flag:
not_exist.append([i, j])
#print(i, j)
else:
dist.append(((cd1_x-cd2_x)**2 + (cd1_y-cd2_y)**2)**0.5)
#Deleted because it does not exist in the station table and is unnecessary from the join table
for i in range(len(not_exist)):
zen_join = zen_join[zen_join["station_cd1"] != not_exist[i][0]]# & zen_join["station_cd2"] != not_exist[i][1]]
zen_join["cd1_lat"] = cd1_lat
zen_join["cd1_lon"] = cd1_lon
zen_join["cd2_lat"] = cd2_lat
zen_join["cd2_lon"] = cd2_lon
zen_join["distance"] = dist
#nodes is station_cd
G = nx.Graph()
G.add_nodes_from(zen["station_cd"])
pos={}
for i, j, k in zip(zen["station_cd"], zen["lon"], zen["lat"]):
pos[i] = (j, k)
for i, j, m in zip(zen_join["station_cd1"], zen_join["station_cd2"], zen_join["distance"]):
G.add_edge(i, j, weight=m)
for i, j, m in zip(zen["station_name"], zen["station_cd"], zen["lat"]):
for k, l, n in zip(zen["station_name"], zen["station_cd"], zen["lat"]):
if i == k and j != l and m == n:
#print(i, j, k, l)
G.add_edge(j, l, weight=0)
plt.figure(figsize=(10,10),dpi=200)
plt.title('All Japan', fontsize=20)
plt.axes().set_aspect('equal', 'datalim')
nx.draw_networkx(G, pos, alpha=0.0, with_labels=False)
nx.draw_networkx_edges(G, pos)
plt.show()
st_name = "Nemuro"
go_name = "Kagoshima"
for i, j in zip(zen["station_name"], zen["station_cd"]):
if i == st_name:
st = j
if i == go_name:
go = j
dij = nx.dijkstra_path(G, st, go)
out = []
for k in range(len(dij)):
for i, j in zip(zen["station_name"], zen["station_cd"]):
if j == dij[k]:
out.append(i)
print(out)
#nodes is station_cd
plt.figure(figsize=(50,50),dpi=200)
G_root = nx.Graph()
G_root.add_nodes_from(dij)
pos_root = {}
for l in dij:
for i, j, k in zip(zen["station_cd"], zen["lon"], zen["lat"]):
if l == i:
pos_root[l] = (j, k)
for i in range(len(dij)-1):
G_root.add_edge(dij[i], dij[i+1])
plt.figure(figsize=(10,10),dpi=200)
plt.title('All Japan', fontsize=20)
plt.axes().set_aspect('equal', 'datalim')
nx.draw_networkx(G, pos, alpha=0.0, with_labels=False)
nx.draw_networkx_edges(G, pos, alpha=0.3)
nx.draw_networkx(G_root, pos_root, alpha=0.0, with_labels=False)
nx.draw_networkx_edges(G_root, pos_root, alpha=1.0, edge_color='r')
plt.show()
Complete. It's unrealistic to actually make a connection, and unless you're a big train lover, it's hell. It's done. Thanks to the Hokkaido Shinkansen, you can reach Honshu through the Seikan Tunnel. Speaking of the shortest time, it is natural to transfer to the Tohoku Shinkansen and Tokaido Shinkansen, but if the route is the shortest, it seems that you will go to the Sea of Japan side by conventional line without using the Tohoku Shinkansen.
The side information this time was the distance, but if you use it as the time, you can make something like the shortest time for transfer guidance. I would like to try it if I can create time data. To be continued...
Visualize railway line data as a graph with Cytoscape 1 https://qiita.com/keiono/items/29286f49b15a5b13c987 Please refer to here for the processing of station data.
Recommended Posts