[PYTHON] Create a dummy data file

You can generate large amounts of dummy data with the faker module.

Installation

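pip install Faker

(The package was formerly published as fake-factory; that name is deprecated and the package is now called Faker.)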

Create a dummy data file

This time, create a CSV file in which each row holds an id, a number of up to 10 digits, and 10 words.

dummy.py


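import csv

from faker import Factory

# newline="" stops the csv module from writing blank lines between rows on Windows
with open("dummy_data.csv", "w", newline="") as f:
    csv_writer = csv.writer(f)
    fake = Factory.create()

    for _ in range(10000):
        # Each row: an MD5 hex string as the id, a number of up to 10 digits, then 10 words
        row = [fake.md5(), fake.random_number(10)]
        row.extend(fake.words(10))
        csv_writer.writerow(row)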

Running it produces output like this:

2109993cebbf9e68b5a74344798c19a3,0,sit,corrupti,eaque,perspiciatis,voluptatum,nihil,quaerat,corporis,asperiores,aut
3728284aa04584cafaaab4118fd77e58,1470,non,qui,vitae,aperiam,ut,est,facilis,perspiciatis,dolores,adipisci
ed599579acda23e99243372106f1f2f8,0,provident,sint,quidem,unde,omnis,perferendis,sint,dolorum,rerum,qui
a117e010335d11c8e88bcd8d359d9429,434500369,enim,atque,earum,nihil,voluptatem,omnis,enim,reiciendis,qui,facilis
b2524affecebe4f67f2dccfca6b6ddf2,6590,commodi,et,maxime,laudantium,eaque,nihil,omnis,perferendis,nesciunt,beatae
aefecf6b23019fbab30947f948b26a18,477210330,doloremque,fugit,est,ut,nobis,sed,aliquam,rem,asperiores,ducimus
834df95fc9e1dff879e3f1d63c870390,36,dolores,at,et,est,id,earum,nulla,ut,autem,ut
fd9a959e399b57749fcaf1b52e0388e0,13,minus,quaerat,tenetur,cumque,rerum,molestiae,repellat,autem,voluptas,repudiandae
f08d779d34eb463d9ee2653fe7f58e59,1746570,perspiciatis,maiores,saepe,porro,quia,iusto,facilis,inventore,repellat,provident
af31877a37fff42e8f624cbe5aa2ae57,5236,odit,neque,voluptatem,facere,corrupti,incidunt,est,et,id,quo

This is a convenient way to get a CSV like this.
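As a quick sanity check of the row layout, here is a minimal sketch that parses two rows copied from the output above and confirms each one is a 32-character hex id, a number, and 10 words. The helper name check_row is only illustrative and is not part of faker; the check uses the standard library alone.

```python
import csv
import io

# Two rows copied verbatim from the sample output above.
sample = (
    "2109993cebbf9e68b5a74344798c19a3,0,sit,corrupti,eaque,perspiciatis,"
    "voluptatum,nihil,quaerat,corporis,asperiores,aut\n"
    "3728284aa04584cafaaab4118fd77e58,1470,non,qui,vitae,aperiam,ut,est,"
    "facilis,perspiciatis,dolores,adipisci\n"
)


def check_row(row):
    """Return True if the row is: 32-char hex id, integer, then 10 words."""
    if len(row) != 12:
        return False
    row_id, number, words = row[0], row[1], row[2:]
    is_hex_id = len(row_id) == 32 and all(c in "0123456789abcdef" for c in row_id)
    return is_hex_id and number.isdigit() and len(words) == 10


results = [check_row(row) for row in csv.reader(io.StringIO(sample))]
print(results)  # prints [True, True]
```

To check the real file instead, the same function could be applied to csv.reader over an open dummy_data.csv.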

The kinds of dummy data you can generate (the attributes available on the fake object) are listed below.

fake.add_provider               fake.name
fake.address                    fake.null_boolean
fake.am_pm                      fake.numerify
fake.boolean                    fake.opera
fake.bothify                    fake.paragraph
fake.bs                         fake.paragraphs
fake.building_number            fake.parse
fake.catch_phrase               fake.phone_number
fake.century                    fake.postcode
fake.chrome                     fake.prefix
fake.city                       fake.provider
fake.city_prefix                fake.providers
fake.city_suffix                fake.pybool
fake.company                    fake.pydecimal
fake.company_email              fake.pydict
fake.company_suffix             fake.pyfloat
fake.country                    fake.pyint
fake.country_code               fake.pyiterable
fake.credit_card_expire         fake.pylist
fake.credit_card_full           fake.pyset
fake.credit_card_number         fake.pystr
fake.credit_card_provider       fake.pystruct
fake.credit_card_security_code  fake.pytuple
fake.date                       fake.random_digit
fake.date_time                  fake.random_digit_not_null
fake.date_time_ad               fake.random_element
fake.date_time_between          fake.random_int
fake.date_time_this_century     fake.random_letter
fake.date_time_this_decade      fake.random_number
fake.date_time_this_month       fake.randomize_nb_elements
fake.date_time_this_year        fake.safari
fake.day_of_month               fake.safe_email
fake.day_of_week                fake.secondary_address
fake.domain_name                fake.seed
fake.domain_word                fake.sentence
fake.email                      fake.sentences
fake.firefox                    fake.sha1
fake.first_name                 fake.sha256
fake.format                     fake.slug
fake.free_email                 fake.state
fake.free_email_domain          fake.state_abbr
fake.geo_coordinate             fake.street_address
fake.get_formatter              fake.street_name
fake.get_providers              fake.street_suffix
fake.internet_explorer          fake.suffix
fake.ipv4                       fake.text
fake.ipv6                       fake.time
fake.iso8601                    fake.timezone
fake.language_code              fake.tld
fake.last_name                  fake.unix_time
fake.latitude                   fake.uri
fake.lexify                     fake.uri_extension
fake.linux_platform_token       fake.uri_page
fake.linux_processor            fake.uri_path
fake.locale                     fake.url
fake.longitude                  fake.user_agent
fake.mac_platform_token         fake.user_name
fake.mac_processor              fake.windows_platform_token
fake.md5                        fake.word
fake.mime_type                  fake.words
fake.month                      fake.year
fake.month_name   

It covers most common needs: addresses, names, credit card numbers, dates, and so on.
