[PYTHON] Automatically generate collage from image list

This post continues the previous two posts:

- Collage template automatic generation
- Optimal placement of multiple images

However, I found a method that works better than solving the optimization problem as in "Optimal placement of multiple images", so I adopted that method this time.

Automatic template generation can produce a practically unlimited number of templates. The idea is therefore to generate many templates and select the one that displays the given image list most beautifully. Ideally, the template would be expressed with parameters and optimized directly, but that is left as future work.

First, we have to define what "displaying most beautifully" means. This time, I decided to emphasize the following two criteria.

- The aspect ratio of each image is preserved as much as possible
- The variance of the displayed image sizes is small

The first criterion is easy to understand: the more an image's aspect ratio is distorted, the less beautiful it looks, so it should be preserved as much as possible. With the first criterion alone, however, a layout in which one image is displayed large and the others are very small (in the extreme, not displayed at all) would be acceptable. The second criterion is added to prevent this.

To make these concrete, the following values are given.

- $H$: height of the collage to be created
- $W$: width of the collage to be created
- $\{x_i\}_{i=1}^n$: the $n$ images to be embedded (each $x_i \in \mathbb{R}_{++}^2$ holds the height and width of an image)

A template is then described by its positions $L = [\ell_1 | \cdots | \ell_n]^T \in \mathbb{R}_{+}^{n \times 2}$ and sizes $S = [s_1 | \cdots | s_n]^T \in \mathbb{R}_{+}^{n \times 2}$ (strictly speaking, the definition should also ensure that no region protrudes from the collage, but this is omitted for simplicity). The template is created by the following function.

import numpy as np
from scipy import stats


def generate_template(n, width, height, random_state=1, max_random_state=1000000, offset=0):
    # Each area is [x0, y0, x1, y1]; start with the whole canvas, inset by offset.
    L = [np.array([offset, offset, width-offset, height-offset])]
    # Pre-draw four seeds per split so that every random choice below is reproducible.
    random_state_lists = stats.randint.rvs(0, max_random_state, size=(n-1, 4), random_state=random_state)

    for idx, random_state_list in enumerate(random_state_lists):
        # Choose the index i of the area to split.
        n_areas = len(L)
        if n_areas == 1:
            i = 0
        else:
            p = np.repeat(1 / (n_areas + i), n_areas)
            x = stats.multinomial.rvs(1, p, size=1, random_state=random_state_list[0])[0]
            i = x.argmax()

        # y == 0: cut the area at x = b; y == 1: cut the area at y = b.
        y = stats.bernoulli.rvs(0.5, size=1, random_state=random_state_list[1])[0]
        if y == 0:
            b = stats.uniform.rvs(L[i][0], L[i][2] - L[i][0], size=1, random_state=random_state_list[2])[0]
        else:
            b = stats.uniform.rvs(L[i][1], L[i][3] - L[i][1], size=1, random_state=random_state_list[3])[0]
        # Leave a gap of width offset between the two new areas.
        if y == 0:
            area1 = np.array([L[i][0], L[i][1], b-offset/2, L[i][3]])
            area2 = np.array([b+offset/2, L[i][1], L[i][2], L[i][3]])
        else:
            area1 = np.array([L[i][0], L[i][1], L[i][2], b-offset/2])
            area2 = np.array([L[i][0], b+offset/2, L[i][2], L[i][3]])
        # Replace the chosen area with its two parts.
        L.pop(i)
        L.append(area1)
        L.append(area2)
    return np.array(L)

For a given n, different templates can be generated by changing random_state. The offset parameter is the spacing between images (the thickness of the border).
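To connect the function's output with the notation above, here is a minimal sketch of how positions and sizes could be read off a template. It assumes each row of the returned array is a rectangle `[x0, y0, x1, y1]`, which is my reading of how `generate_template` splits areas. The helper name `to_position_and_size`, the (height, width) ordering, and the hand-written template values are illustrative assumptions, not part of the original code.

```python
import numpy as np


def to_position_and_size(template):
    """Split rows [x0, y0, x1, y1] into positions (y0, x0) and sizes (height, width)."""
    template = np.asarray(template, dtype=float)
    positions = np.c_[template[:, 1], template[:, 0]]
    sizes = np.c_[template[:, 3] - template[:, 1], template[:, 2] - template[:, 0]]
    return positions, sizes


# Hand-written 3-cell template on a 375 x 667 canvas (illustrative values only).
template = np.array([
    [0, 0, 375, 300],
    [0, 300, 150, 667],
    [150, 300, 375, 667],
])
positions, sizes = to_position_and_size(template)
print(positions)                   # one (y0, x0) pair per cell
print(sizes)                       # one (height, width) pair per cell
print(sizes.prod(axis=1).sum())    # total cell area; equals the canvas area when offset=0
```

With `offset=0` the cells tile the canvas exactly, so comparing the summed area with `width * height` is a quick sanity check on a generated template.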

Next, decide where to place each image in the template by choosing the location with the most similar aspect ratio. The location $j_i$ in the template where image $i$ is placed is found as follows.

\begin{align*}
j_i &= \text{argmin}_{j} \ \|r_i - r'_j \|^2 \\
r_i &= \frac{[x_i]_1}{[x_i]_0}, \ r'_j = \frac{[s_j]_1}{[s_j]_0}
\end{align*}

Here, $[x]_i$ denotes the $i$-th component of $x$. In addition, a location that has already been selected cannot be selected again. That is, $\{j_1, \ldots, j_n\} = \{1, \ldots, n\}$.
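The matching rule above can be sketched as a greedy pass: images are processed in order, and each one takes the unused location whose aspect ratio is closest. The function name `greedy_assign` and the ratio values below are made up for illustration; this is a simplified sketch, not the exact code used in the experiment.

```python
import numpy as np


def greedy_assign(image_ratios, cell_ratios):
    """Assign each image, in order, to the unused cell with the closest aspect ratio."""
    image_ratios = np.asarray(image_ratios, dtype=float)
    cell_ratios = np.asarray(cell_ratios, dtype=float)
    # dist2[i, j] = (r_i - r'_j)^2
    dist2 = (image_ratios[:, None] - cell_ratios[None, :]) ** 2
    assignment, used = [], set()
    for order in np.argsort(dist2, axis=1):
        # Walk through the cells from nearest to farthest and take the first free one.
        for j in order:
            if int(j) not in used:
                assignment.append(int(j))
                used.add(int(j))
                break
    return np.array(assignment), dist2


image_ratios = [1.5, 0.7, 1.0]   # width / height of three hypothetical images
cell_ratios = [0.8, 1.6, 1.1]    # width / height of three hypothetical cells
assignment, dist2 = greedy_assign(image_ratios, cell_ratios)
print(assignment)                # assignment[i] is the cell index j_i for image i
```

Because the result depends on the order of the images, a greedy pass does not necessarily minimize the total distance. If an optimal one-to-one matching is wanted, `scipy.optimize.linear_sum_assignment` applied to `dist2` is one standard alternative.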

Using the above, the score function $f$ for the template sizes $S$ is defined as follows, where $\alpha$ is a weight on the size-variance term.

\begin{align*}
f(S) = \sum_{i=1}^n \|r_i - r'_{j_i}\|^2 + \frac{\alpha}{n}
 \sum_{i=1}^n \left( [s_i]_0 [s_i]_1 - \frac{1}{n} \sum_{i'=1}^n [s_{i'}]_0 [s_{i'}]_1 \right)^2
\end{align*}
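As a rough illustration, the score can be transcribed almost directly into NumPy. This is a sketch of the formula as written, not of the `estimate` function in the full listing, which measures size differently. The function name `score`, the default `alpha=1.0`, and the choice to express cell sizes as fractions of the canvas (so that both terms have comparable magnitude) are assumptions of mine.

```python
import numpy as np


def score(image_sizes, cell_sizes, assignment, alpha=1.0):
    """Aspect-ratio mismatch plus alpha times the variance of the cell areas."""
    x = np.asarray(image_sizes, dtype=float)   # rows: (height, width) of image i
    s = np.asarray(cell_sizes, dtype=float)    # rows: (height, width) of cell j
    r = x[:, 1] / x[:, 0]                      # r_i
    r_cell = s[:, 1] / s[:, 0]                 # r'_j
    ratio_term = np.sum((r - r_cell[assignment]) ** 2)
    areas = s[:, 0] * s[:, 1]
    # (alpha / n) * sum of squared deviations equals alpha * mean of squared deviations.
    size_term = alpha * np.mean((areas - areas.mean()) ** 2)
    return ratio_term + size_term


image_sizes = [(100, 200), (100, 100)]     # hypothetical images, (height, width) in pixels
cell_sizes = [(0.5, 0.5), (0.5, 1.0)]      # hypothetical cells, as fractions of the canvas
good = score(image_sizes, cell_sizes, [1, 0])   # each image goes to the cell with its ratio
bad = score(image_sizes, cell_sizes, [0, 1])    # the two images are swapped
print(good, bad)
```

The size term is the same for both assignments because it depends only on the template, so the difference between the two scores comes entirely from the aspect-ratio term.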

Create $m$ templates and call them $S^{(1)}, \ldots, S^{(m)}$. Finally, choose the template that minimizes $f$.
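The selection step is a plain argmin over the candidates. A minimal sketch, assuming each candidate is an array of cell sizes and using a stand-in scoring function (the variance of cell areas only) in place of the full $f$; the names `select_best` and `area_variance` and the candidate values are hypothetical.

```python
import numpy as np


def select_best(candidates, score_fn):
    """Return the index of the lowest-scoring candidate and all scores."""
    scores = np.array([score_fn(c) for c in candidates])
    return int(np.argmin(scores)), scores


def area_variance(sizes):
    """Stand-in score: variance of the cell areas of one template."""
    return np.var(sizes[:, 0] * sizes[:, 1])


# Three hypothetical 2-cell templates, sizes given as fractions of the canvas.
candidates = [
    np.array([[1.0, 0.9], [1.0, 0.1]]),
    np.array([[1.0, 0.5], [1.0, 0.5]]),
    np.array([[0.3, 1.0], [0.7, 1.0]]),
]
best, scores = select_best(candidates, area_variance)
print(best, scores)
```

In the actual experiment, the candidates would come from calling `generate_template` with different `random_state` values.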

I ran an experiment based on the above, using the following images.

https://hapila.jp/wp-content/uploads/2017/04/coffee-stock-photo-0e8b300f42157b6f.jpg
http://www.menshealth.com/sites/menshealth.com/files/coffee-mug.jpg
https://upload.wikimedia.org/wikipedia/commons/thumb/4/45/A_small_cup_of_coffee.JPG/1280px-A_small_cup_of_coffee.JPG

The experimental results are as follows.

The images fit into the template quite well, don't they?

This time, I tried to generate a collage automatically, and I think the result is reasonably good. The remaining future work is:

- Parametrize and optimize templates
- Fast matching between template and image list

The code used for the experiment is listed below. Thank you for reading.

import itertools
import glob
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from skimage import io, transform


def generate_template(n, width, height, random_state=1, max_random_state=1000000, offset=0):
    L = [np.array([offset, offset, width-offset, height-offset])]
    random_state_lists = stats.randint.rvs(0, max_random_state, size=(n-1, 4), random_state=random_state)

    for idx, random_state_list in enumerate(random_state_lists):
        n_areas = len(L)
        if n_areas == 1:
            i = 0
        else:
            p = np.repeat(1 / (n_areas + i), n_areas)
            x = stats.multinomial.rvs(1, p, size=1, random_state=random_state_list[0])[0]
            i = x.argmax()

        y = stats.bernoulli.rvs(0.5, size=1, random_state=random_state_list[1])[0]
        if y == 0:
            b = stats.uniform.rvs(L[i][0], L[i][2] - L[i][0], size=1, random_state=random_state_list[2])[0]
        else:
            b = stats.uniform.rvs(L[i][1], L[i][3] - L[i][1], size=1, random_state=random_state_list[3])[0]
        if y == 0:
            area1 = np.array([L[i][0], L[i][1], b-offset/2, L[i][3]])
            area2 = np.array([b+offset/2, L[i][1], L[i][2], L[i][3]])
        else:
            area1 = np.array([L[i][0], L[i][1], L[i][2], b-offset/2])
            area2 = np.array([L[i][0], b+offset/2, L[i][2], L[i][3]])
        L.pop(i)
        L.append(area1)
        L.append(area2)
    return np.array(L)


def estimate(X, w, h, random_states, offset=0):
    r = np.c_[X[:, 0] / X[:, 1]]
    r2 = r**2
    best_L = None
    best_s = None
    lowest_linkage = np.inf
    for random_state in random_states:
        s = stats.uniform.rvs(0, 1, random_state=random_state)
        L = generate_template(X.shape[0], w*s, h*s, random_state=random_state, offset=offset*s)
        r_temp = np.c_[(L[:, 2] - L[:, 0]) / (L[:, 3] - L[:, 1])]
        r_temp2 = r_temp**2
        # Squared distance (r_i - r'_j)^2 = r_i^2 + r'_j^2 - 2 r_i r'_j
        dist2 = r2 + r_temp2.T - 2 * r.dot(r_temp.T)

        assignment = []
        linkage = 0
        selected = set()
        for i, idx in enumerate(dist2.argsort(axis=1)):
            for j in idx:
                if j not in selected:
                    linkage += dist2[i, j]
                    assignment.append(j)
                    selected.add(j)
                    break
        assignment = np.array(assignment)
        L = L[assignment]
        A = L.copy()
        size = np.c_[L[:, 2] - L[:, 0], L[:, 3] - L[:, 1]]
        L = np.c_[L[:, 0], L[:, 1], np.min(size / X, axis=1)]
        mu = L[:, 2].mean()
        var = np.sqrt(np.mean((L[:, 2] - mu)**2))
        linkage = linkage + var
        if linkage < lowest_linkage:
            lowest_linkage = linkage
            best_L = A
            best_s = s
    return best_L, best_s, lowest_linkage


if __name__ == '__main__':
    images = []
    X = []
    for filename in glob.glob("img/*")[:3]:
        image = io.imread(filename)
        images.append(image)
        X.append((image.shape[0], image.shape[1]))
    X = np.array(X)

    Z = np.c_[X[:, 1], X[:, 0]]

    width, height = 375, 667
    L, s, linkage_value = estimate(Z, width, height, range(100), offset=0)
    L = L / s
    print(linkage_value)

    position = np.int32(np.c_[L[:, 1], L[:, 0]])
    size = np.int32(np.c_[L[:, 3] - L[:, 1], L[:, 2] - L[:, 0]])
    
    canvas = np.zeros((height, width, 3))
    for i in range(len(images)):
        image = images[i]
        print(image.shape)
        canvas[position[i][0]:position[i][0]+size[i][0], position[i][1]:position[i][1]+size[i][1]] = transform.resize(image, size[i])    

    plt.imshow(canvas)
    plt.show()