[Python] Extract values from (line) graph images

Introduction

Do you have a graph image but not the underlying data? If you have the image, you can extract the values from it.

[Image: line graph from the pandas 0.7.3 documentation, "Plotting with matplotlib"]
↓
array([-0.4028436 , -0.09518499, 0.21247362, ..., 39.12322275, 39.12322275, 39.12322275])
+
[Image]

↑ ~~With a graph this fine-grained, you cannot expect much accuracy...~~

Process flow

Extract the desired graph by selecting its color
↓
Average in the vertical direction
↓
Interpolate to the number of samples you want
↓
Adjust the scale
↓
Output

Implementation

**You can run it in Colab here**

import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact
import requests
from PIL import Image
import io

Load the image data, ignoring the alpha channel for now

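path = "Image path"
im = plt.imread(path)
if im.shape[2] == 4:  # drop the alpha channel
    im = im[:, :, :3]
if im.max() > 1:  # 8-bit image: rescale to the 0-1 range
    im = im / 255  # "im /= 255" would fail on an integer array
# The later steps use the row index as the y coordinate,
# so store the image with row 0 at the bottom.
im = im[::-1]
h, w, _ = im.shape
plt.imshow(im[::-1])  # flip back for display only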
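
If the input may also be grayscale or 8-bit, the normalization could be wrapped in a small helper. This is only a sketch: `to_rgb_float` is a hypothetical name that does not appear in the original code, and it assumes integer images are 8-bit.

```python
import numpy as np

def to_rgb_float(im):
    """Hypothetical helper: return an (H, W, 3) float array scaled to [0, 1]."""
    im = np.asarray(im)
    is_int = np.issubdtype(im.dtype, np.integer)
    if im.ndim == 2:
        # Grayscale: repeat the single channel three times
        im = np.stack([im] * 3, axis=-1)
    # Keep only R, G, B (drops an alpha channel if present)
    im = im[..., :3].astype(float)
    if is_int:
        # Assumption: integer images are 8-bit (0-255)
        im = im / 255
    return im

rgba = np.full((4, 5, 4), 255, dtype=np.uint8)  # opaque white RGBA image
gray = np.zeros((4, 5))                         # black grayscale float image
rgb_from_rgba = to_rgb_float(rgba)
rgb_from_gray = to_rgb_float(gray)
```

Both results have shape `(4, 5, 3)`, so the rest of the code can rely on three color channels.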

Crop the image to just the plot area of the graph, so that the scale can be adjusted later

@interact(x_min=(0, w), x_max=(0, w), y_min=(0,h), y_max=(0,h))
def Plot(x_min=0, x_max=w, y_min=0, y_max=h):
    global imag
    plt.figure(figsize=(7, 7))
    imag = im[min(y_min,y_max-1):max(y_min+1, y_max), min(x_min,x_max-1):max(x_min+1, x_max)]
    plt.imshow(imag[::-1])

Select the graph you want to extract by picking a pixel on it; pixels of a similar color are selected. Adjust thresh so that unwanted parts are excluded (a larger value is stricter).

@interact(x=(0, imag.shape[1] - 1), y=(0, imag.shape[0] - 1), thresh=(1, 10))
def Plot(x, y, thresh):
    global p
    # Keep pixels whose squared RGB distance to the picked pixel is below 1 / 2**thresh
    p = ((imag - imag[y, x]) ** 2).sum(axis=2) < (1 / (1<<thresh))
    print(p.sum())  # number of selected pixels
    plt.imshow(p[::-1])
    # Red crosshair marking the picked pixel
    plt.plot([x, x], [0, imag.shape[0]], color="r")
    plt.plot([0, imag.shape[1]], [imag.shape[0]-y, imag.shape[0]-y], color="r")
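
To get a feel for what thresh does, the selection rule can be tried on a tiny synthetic image. This is a minimal sketch outside of Jupyter; `color_mask` and the three test pixels are made up for illustration.

```python
import numpy as np

def color_mask(im, ref, thresh):
    """Hypothetical helper: True where the squared RGB distance to ref is below 1 / 2**thresh."""
    dist2 = ((im - np.asarray(ref, dtype=float)) ** 2).sum(axis=2)
    return dist2 < 1 / (1 << thresh)

# One row of three pixels: pure red, pale (anti-aliased) red, blue
im = np.ones((1, 3, 3))
im[0, 0] = (1, 0, 0)      # squared distance to red: 0
im[0, 1] = (1, 0.3, 0.3)  # squared distance to red: 0.18
im[0, 2] = (0, 0, 1)      # squared distance to red: 2

loose = color_mask(im, (1, 0, 0), 1)   # limit 1/2
strict = color_mask(im, (1, 0, 0), 5)  # limit 1/32
```

With thresh=1 the pale red pixel is still selected, while thresh=5 keeps only the pure red one. The blue pixel is rejected in both cases.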

For each column, take the average vertical position of the selected pixels

p = np.pad(p, 1, "constant")
sx = np.arange(len(p[0]))[p.argmax(axis=0)!=0]
sy = []

for i in p.T:
    j = np.where(i!=0)[0]
    if j.tolist():
        sy.append(j.mean())
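
The same per-column average can probably be written without the Python loop. The following is a sketch under that assumption; `column_means` is a hypothetical name, and unlike the code above it does not pad the mask, so the row indices are not shifted by one.

```python
import numpy as np

def column_means(mask):
    """Hypothetical helper: for each column with hits, return its index and mean row index."""
    rows = np.arange(mask.shape[0])[:, None]  # column vector of row indices
    counts = mask.sum(axis=0)                 # number of selected pixels per column
    hit = counts > 0                          # columns containing at least one pixel
    sx = np.flatnonzero(hit)
    sy = (mask * rows).sum(axis=0)[hit] / counts[hit]
    return sx, sy

# Small demo mask: column 1 is empty and is skipped
mask = np.zeros((5, 4), dtype=bool)
mask[[1, 3], 0] = True  # rows 1 and 3 -> mean 2
mask[4, 2] = True       # row 4        -> mean 4
mask[0:3, 3] = True     # rows 0, 1, 2 -> mean 1
sx_demo, sy_demo = column_means(mask)
```

Columns without any selected pixel are simply left out; the interpolation in the next step fills those gaps.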

Choose the number of samples, and remove noise with a moving average (convolution)

@interact(sample=(5, 1250), conv_size=(1, 21, 2))
def fit(sample, conv_size):
    global y
    x = np.linspace(sx.min(), sx.max(), sample)
    y = np.convolve(np.pad(np.interp(x, sx, sy), (conv_size-1)//2, "edge"), np.ones(conv_size) / conv_size, "valid")
    plt.plot(x, y)
    plt.xlim(0,len(p[0]))
    plt.ylim(0, len(p))
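
The smoothing part on its own looks like this. It is a minimal sketch of the pad-then-convolve idea used above; `smooth` is a hypothetical name, and conv_size is assumed to be odd, as the slider step of 2 guarantees.

```python
import numpy as np

def smooth(y, conv_size):
    """Hypothetical helper: moving average that keeps the length of y (conv_size must be odd)."""
    pad = (conv_size - 1) // 2
    # Repeat the edge values so the window always has conv_size samples
    padded = np.pad(y, pad, mode="edge")
    kernel = np.ones(conv_size) / conv_size
    # "valid" drops the positions where the window would hang over the ends
    return np.convolve(padded, kernel, mode="valid")

spike = np.array([0.0, 0.0, 3.0, 0.0, 0.0])
smoothed = smooth(spike, 3)
```

Padding with the edge values before a "valid" convolution keeps the output the same length as the input, and avoids the dip toward zero that zero padding would cause at both ends.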

Enter the y-axis range of the plot area you cropped earlier, as two comma-separated numbers (minimum first)

yl = list(map(float, input("Y-range of the cropped graph (min,max)? ").split(",")))

Scale adjustment, output

y_out = y * (yl[1] - yl[0]) / p.shape[0] + yl[0]
y_out

Graph output

plt.plot(y_out)
plt.ylim(*yl)

↑ This code is written for Jupyter, so it will not run unless each part is placed in its own cell.
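
For use outside Jupyter, the steps above could be collapsed into one function without the interactive widgets. This is a rough sketch, not the code from this post: `extract_curve` and its parameters are hypothetical, the image is assumed to be already cropped to the plot area with row 0 at the bottom, smoothing is left out, and row 0 and the last row are assumed to map exactly to the ends of the y range.

```python
import numpy as np

def extract_curve(im, color, thresh, n_samples, y_range):
    """Hypothetical helper: recover the y values of one colored curve.

    im        : float array (H, W, 3) in [0, 1], row 0 = bottom of the plot area
    color     : RGB triple of the curve to extract
    thresh    : squared color distance below which a pixel counts as the curve
    n_samples : number of output points
    y_range   : (y_min, y_max) of the plot area in data units
    """
    h = im.shape[0]
    # 1. Select pixels by color
    mask = ((im - np.asarray(color, dtype=float)) ** 2).sum(axis=2) < thresh
    # 2. Mean row index for every column that contains selected pixels
    cols = np.flatnonzero(mask.any(axis=0))
    rows = np.arange(h)[:, None]
    mean_row = (mask[:, cols] * rows).sum(axis=0) / mask[:, cols].sum(axis=0)
    # 3. Interpolate to the requested number of samples
    x = np.linspace(cols.min(), cols.max(), n_samples)
    y_pix = np.interp(x, cols, mean_row)
    # 4. Scale: row 0 -> y_min, row h-1 -> y_max (assumption, see above)
    y_min, y_max = y_range
    return y_min + y_pix * (y_max - y_min) / (h - 1)

# Synthetic test image: white background with a red line at row = 2 * column
demo = np.ones((101, 50, 3))
demo_cols = np.arange(50)
demo[2 * demo_cols, demo_cols] = (1, 0, 0)
y_demo = extract_curve(demo, (1, 0, 0), 0.1, 50, (0, 100))
```

On this synthetic image the recovered values follow the line `y = 2 * x`. Real images with anti-aliasing, grid lines, or overlapping curves will need the threshold tuned by eye, which is what the interactive version is for.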

In closing

Making the cropping and color selection more interactive, for example with HTML, would make this easier to use.
