What is faker

Overview

A library that generates dummy data (test data). Libraries with the same name exist for PHP and Ruby, and it has the feel of a de facto standard. https://github.com/joke2k/faker

I recently made it possible to generate Japanese address data, so this post introduces the library.

What kind of data can be generated

What kind of data can faker generate? Let's write a simple example first.

`sample.py`


from faker import Factory
# With no argument, the default locale (en_US) is used
f = Factory.create()
print(f.name())
print(f.address())
print(f.phone_number())
print(f.date())

`Execution result`


Jennie Homenick
Petramouth, WI 21918-9349
177.513.9541
1998-12-21

The generated data looks plausible, but by default it uses English-language (US) formats. Data for other languages can be generated by passing a locale as the argument to Factory.create.

About Japanese support

Japanese support is the obvious question. Thanks to a commit by @ta2xeo about a month ago, names and phone numbers can now be generated in Japanese.

On top of that, I have now added support for generating addresses as well. Let's look at them together.

`sample_ja_JP.py`


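from faker import Factory
# Pass the locale name to get Japanese data
f = Factory.create('ja_JP')
print(f.name())
print(f.phone_number())
print(f.date())
# address() is called twice because each call returns a new random address
print(f.address())
print(f.address())
# Individual parts of an address
print(f.zipcode())
print(f.prefecture())
print(f.city())
print(f.town())
print(f.chome())
print(f.ban())
print(f.gou())
print(f.building_name())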

`Execution result`


Akiko Matsumoto
070-1472-1794
2011-03-04
11-4-20 Hanakawado, Tsurumi-ku, Yokohama-shi, Fukushima Corp Minowa 553
31-24-20 Ujiie Shinden, Sammu City, Toyama Prefecture
121-0122
Akita
Koganei City
Taitung
11th Street
No. 8
No. 13
Palace

As you can see, for better or worse, almost none of these addresses actually exist. The generated data may not be internally consistent (a prefecture can be paired with a city from a different prefecture), and the various address formats used in Japan are not all supported, but for now it is better than English notation.

How to use

~~It seems that the Japanese version has not been released to PyPI yet.~~ ~~If you want to use it, please install it from the GitHub repository.~~

The Japanese version was released in v0.5.1, so the steps in this section are no longer necessary.

Creating a data masking tool

You can generate test data with a library such as faker, but there are cases where purely dummy data is not enough. In such cases, I usually want to take data from the production environment and mask part of it, so I created a tool for that. It uses faker, of course.

The tool is called Hermes, and it masks only specific columns in a CSV file. It is still rough, but I plan to keep improving it. https://github.com/ohbarye/Hermes
