[Python] Using Japanese in folder names and notebook names can cause problems in Databricks

Conclusion first

If you use Japanese in a folder name or notebook name, calling another notebook with dbutils.notebook.run can fail with an error.

What this means

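I have notebooks in the following folder structure. In this translation, the names "My notebook", "MyNotebook Caller", and "test" stand in for names that are written in Japanese in the actual workspace.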

/Users/xxx@yyy.jp
 |-MyNotebook
 |-My notebook
 |-MyNotebookCaller
 |-MyNotebook Caller
 |-test
   |-MyNotebook
   |-MyNotebookCaller

In the following cases, calling another notebook with dbutils.notebook.run failed.

- When Japanese is used in the name of the calling notebook
  - /Users/xxx@yyy.jp/MyNotebook Caller
- When Japanese is used in the name of the folder that contains the calling notebook
  - /Users/xxx@yyy.jp/test/MyNotebookCaller

In the following cases, the call succeeded without any problem.

- When Japanese is used in the name of the called notebook
  - /Users/xxx@yyy.jp/My notebook
- When Japanese is used in the name of the folder that contains the called notebook
  - /Users/xxx@yyy.jp/test/MyNotebook

Verification

Description of each notebook

The processing is simple: MyNotebookCaller (or MyNotebook Caller) calls MyNotebook and passes it parameters, and MyNotebook prints the parameters it receives.

/Users/xxx@yyy.jp/MyNotebook

dbutils.widgets.text("param1", "111")
dbutils.widgets.text("param2", "222")

print("param1:{},param2:{}".format(dbutils.widgets.get("param1"), dbutils.widgets.get("param2")))

/Users/xxx@yyy.jp/My notebook

# Same as /Users/xxx@yyy.jp/MyNotebook

/Users/xxx@yyy.jp/MyNotebookCaller

# Cmd1: Call MyNotebook in the same folder
dbutils.notebook.run(
  "./MyNotebook",
  60,
  {
    "param1": "val1",
    "param2": "val2"
  }
)

# Cmd2: Call My notebook in the same folder
dbutils.notebook.run(
  "./My notebook",
  60,
  {
    "param1": "val1",
    "param2": "val2"
  }
)

# Cmd3: Call MyNotebook in the test folder
dbutils.notebook.run(
  "./test/MyNotebook",
  60,
  {
    "param1": "val1",
    "param2": "val2"
  }
)

/Users/xxx@yyy.jp/MyNotebook Caller

# Same as /Users/xxx@yyy.jp/MyNotebookCaller

/Users/xxx@yyy.jp/test/MyNotebook

# Same as /Users/xxx@yyy.jp/MyNotebook

/Users/xxx@yyy.jp/test/MyNotebookCaller

# Cmd1: Call MyNotebook in the same folder
dbutils.notebook.run(
  "./MyNotebook",
  60,
  {
    "param1": "val1",
    "param2": "val2"
  }
)

# Cmd2: Call MyNotebook in the folder one level above
dbutils.notebook.run(
  "../MyNotebook",
  60,
  {
    "param1": "val1",
    "param2": "val2"
  }
)

Verification ① When Japanese is used for the name of the calling notebook

Try calling MyNotebook in the same folder from /Users/xxx@yyy.jp/MyNotebook Caller.

Cmd1 Call MyNotebook in the same folder


dbutils.notebook.run(
  "./MyNotebook",
  60,
  {
    "param1": "val1",
    "param2": "val2"
  }
)

The result was a WorkflowException. Judging from the message, the error was returned because the name contains Japanese, that is, characters outside the Latin-1 (ASCII) character set.

com.databricks.WorkflowException: com.databricks.common.client.DatabricksServiceHttpClientException: INVALID_PARAMETER_VALUE: Only Latin1 (ASCII) characters are currently supported. Any international characters must be removed or replaced in workflow_context

Next, I tried calling MyNotebook in the test folder.

Cmd2 Call MyNotebook in the test folder


dbutils.notebook.run(
  "./test/MyNotebook",
  60,
  {
    "param1": "val1",
    "param2": "val2"
  }
)

This also raised a WorkflowException.

com.databricks.WorkflowException: com.databricks.common.client.DatabricksServiceHttpClientException: INVALID_PARAMETER_VALUE: Only Latin1 (ASCII) characters are currently supported. Any international characters must be removed or replaced in workflow_context
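Because every failure here carries the same message, it can be useful to tell this failure apart from other WorkflowExceptions. The following is a minimal sketch that matches on the message text shown above. The function name is hypothetical, and the check assumes the wording stays as quoted in this article, which may not hold across Databricks releases.

```python
def is_non_ascii_context_error(message: str) -> bool:
    """Heuristically detect the 'Only Latin1 (ASCII) characters' failure.

    Assumption: the wording matches the message quoted in this article.
    """
    return (
        "INVALID_PARAMETER_VALUE" in message
        and "Only Latin1 (ASCII) characters" in message
    )


# Message text copied from the error shown above.
sample = (
    "com.databricks.WorkflowException: "
    "com.databricks.common.client.DatabricksServiceHttpClientException: "
    "INVALID_PARAMETER_VALUE: Only Latin1 (ASCII) characters are currently "
    "supported. Any international characters must be removed or replaced in "
    "workflow_context"
)
print(is_non_ascii_context_error(sample))

# An unrelated, made-up message for comparison.
print(is_non_ascii_context_error("some other failure"))
```

In a notebook, the string to test would presumably come from the exception caught around dbutils.notebook.run, for example `str(e)`.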

Verification ② When Japanese is used for the name of the folder that contains the calling notebook

Try calling MyNotebook in the same folder from /Users/xxx@yyy.jp/test/MyNotebookCaller.

Cmd1 Call MyNotebook in the same folder


dbutils.notebook.run(
  "./MyNotebook",
  60,
  {
    "param1": "val1",
    "param2": "val2"
  }
)

This also resulted in a WorkflowException.

com.databricks.WorkflowException: com.databricks.common.client.DatabricksServiceHttpClientException: INVALID_PARAMETER_VALUE: Only Latin1 (ASCII) characters are currently supported. Any international characters must be removed or replaced in workflow_context

Verification ③ When Japanese is used for the name of the called notebook

Try calling /Users/xxx@yyy.jp/My notebook from /Users/xxx@yyy.jp/MyNotebookCaller.

Cmd2 Call My notebook in the same folder


dbutils.notebook.run(
  "./My notebook",
  60,
  {
    "param1": "val1",
    "param2": "val2"
  }
)

The call completed normally even though the called notebook's name contains Japanese. The passed parameters were also printed correctly.

param1:val1,param2:val2

Verification ④ When Japanese is used for the name of the storage folder of the called notebook

Try calling /Users/xxx@yyy.jp/test/MyNotebook from /Users/xxx@yyy.jp/MyNotebookCaller.

Cmd3 Call MyNotebook in the test folder


dbutils.notebook.run(
  "./test/MyNotebook",
  60,
  {
    "param1": "val1",
    "param2": "val2"
  }
)

The call completed normally even though the called notebook's folder name contains Japanese. The passed parameters were also printed correctly.

param1:val1,param2:val2

Summary

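Be careful when using Japanese in folder names and notebook names. In these tests, dbutils.notebook.run failed whenever the calling notebook's name or its folder name contained Japanese, while Japanese in the called notebook's name or folder name caused no problem.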
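One way to act on this is to check the caller's path before making the call, so the problem surfaces as a clear message instead of a WorkflowException. The following is a minimal sketch under stated assumptions: `run_with_caller_check` is a hypothetical helper, the caller's path is supplied by hand, and the run function is passed in as an argument so the helper can be tried outside Databricks.

```python
from typing import Callable, Dict


def run_with_caller_check(
    run_fn: Callable[[str, int, Dict[str, str]], str],
    caller_path: str,
    target: str,
    timeout_seconds: int,
    arguments: Dict[str, str],
) -> str:
    """Call run_fn only if the caller's path is pure ASCII.

    run_fn is expected to be dbutils.notebook.run (or a stand-in for testing).
    caller_path must be supplied by you; obtaining it is out of scope here.
    """
    if not caller_path.isascii():
        raise ValueError(
            f"Caller path contains non-ASCII characters: {caller_path!r}. "
            "Rename the notebook or its folder before calling another notebook."
        )
    # The called notebook's path is not checked, because Japanese there
    # caused no failure in the tests described in this article.
    return run_fn(target, timeout_seconds, arguments)
```

In a notebook, the call might look like `run_with_caller_check(dbutils.notebook.run, "/Users/xxx@yyy.jp/MyNotebookCaller", "./MyNotebook", 60, {"param1": "val1", "param2": "val2"})`.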
