[PYTHON] Power Query Technique Collection Part 1

I've summarized the Power Query use cases that I often use. I haven't learned it yet, so there are some points that I can't understand, but if you notice any points, please point them out.

Features of M

For the time being, I will briefly summarize the features of M.

The last point is a bit confusing, but the point is that it isn't executed in order from the top, but is processed according to dependencies like spreadsheet software.

Data type

This is also roughly. For more information, it's best to look at the MS reference.

list

A list is a set of values with a zero-based index. To make a list, just arrange each element with {} as shown below. It may be faster to write a predetermined list directly.

{1, 2, true, "abc", 3.1}

Since the list is a set of ordered values, it can be retrieved with a trailing index starting at 0.

{1, 2, true, "abc", 3.1}{4} #3.1 returns

If you want to retrieve more than one, use List.Range. {0..10} is a list of 11 items from 0 to 10. This will retrieve the index 0 to one before 5.

List.Range({0..10}, 0, 5)

result image.png Note that the index displayed in the result starts at 1 and is different from the index in the function. For example, List.Range ({0..10}, 0, 5) {1} returns the second element 1.

table

Tables can be created with the #table () function. For other examples, you should look at Microsoft's site.

#table(
    {"Name", "Sex"},
    {
        {"Mike","M"},
        {"Anne","F"}
    }
)

function

The function definition can be written as follows.

(x as number) as number =>
    let
        Ret = if x < 2 
            then x
            else Fib(x - 1) + Fib(x - 2)
    in
        Ret

This is a common function to calculate the Fibonacci number, but as you can see, recursion is also possible. If you save this query as Fib, the Fib in the query will call this function itself.

The following is a function that converts an 8-digit character string such as 20200401 to a date type, and if it cannot be converted successfully, it returns the character string as it is.

(x as text) =>
    let
        usrYear = Number.From(Text.Middle(x, 0, 4)),
        usrMonth = Number.From(Text.Middle(x, 4, 2)),
        usrDay = Text.Middle(x, 6, 2),
        isDD = Value.Is(Value.FromText(usrDay), type number),
        usrDate = if isDD 
            then Date.From(#date(usrYear, usrMonth, Number.From(usrDay)))
            else x
    in
        usrDate

If you close the editor and select this query, you can also enter the parameters yourself and see the results. image.png This function can be used in other queries, so it is convenient to keep frequently used functions in the query as user functions.

Use externally stored queries

This is the method I found by google, but it is also possible to save these codes in a text file and call it. If you write the query part with a suitable text editor and save it, you can execute it in the same way with the following query. This is useful if you want to use common functions in different files.

let
    Source = Text.FromBinary(File.Contents("C:\temp\func.m")),
    EvaluatedExpression = Expression.Evaluate(Source, #shared)
in
    EvaluatedExpression

It will not work without the \ # shared parameter. Mystery is. Recursive functions cannot be executed well as they are, so some ingenuity is required.

Fib.m


let 
    Fib = (x as number) as number =>
        let
            Ret = if x < 2 
                then x
                else @Fib(x - 1) + @Fib(x - 2)
        in
            Ret
in
    Fib

If you nest let and embed the Fib function definition itself in this way, you can use it even if you import it as a file. It may be better to write it like this because you can copy it without being affected by the query name. It seems that there is a rule to add @ to the head when calling recursively, but it worked even if it was not added.

Combine multiple functions into one file

Arrange multiple user functions separated by commas and return each as a record. SumProduct is a function I found by google.

myfuncs.m


let 
    Fib = (x as number) as number =>
        if x < 2 then x else @Fib(x - 1) + @Fib(x - 2),
    Multiply = (x, y) => (x * y),
    SumProduct = (L1 as list, L2 as list) as number =>
        let
            Result = List.Accumulate(List.Positions(L1),
            0,
            (state, current) => state + L1{current} * L2{current})
        in
            Result,
    Ret = [
        Fib = Fib,
        Multiply = Multiply,
        SumProduct = SumProduct
    ]
in 
    Ret

This returns a record with each function name as a key and a function as a value, so if you read this text file with a query such as myfuncs,

myfuncs[SumProduct]({1..10},{2..11})

You can call it with () as an argument.

Use Python's DataFrame as a table.

If you set it to use Python, you can load Pandas DataFrame. If you select Python as the query source, the editor will open, so I will write the code here, but here I will create the code with a suitable text editor or IDE and save it as a separate file. For example

bostondf.py


import pandas as pd
from sklearn.datasets import load_boston
boston_dataset = load_boston()
boston = pd.DataFrame(boston_dataset.data, columns=boston_dataset.feature_names)

Create a file called, and write the following in the query editor.

let
    Source = Python.Execute(Text.FromBinary(File.Contents("c:\temp\bostondf.py"))),
    Ret = Source{[Name="boston"]}[Value]
in
    Ret

Then the data frame becomes a table. If you want to define multiple data frames in one code, just enumerate the DFs.

bostondf.py


import pandas as pd
from sklearn.datasets import load_boston
boston_dataset = load_boston()
boston = pd.DataFrame(boston_dataset.data, columns=boston_dataset.feature_names)
boston_01 = boston[:100]
boston_02 = boston[100:200]
boston_03 = boston[200:300]
boston_04 = boston[300:400]
boston_05 = boston[400:500]
boston_06 = boston[500:600]

If you write the query as follows, a table consisting of each DF and table will be returned.

LoadPythonDFs.m


Python.Execute(Text.FromBinary(File.Contents("c:\temp\bostondf.py")))

image.png If you save the query that loaded this code with an appropriate name such as LoadPythonDFs and create a new query as follows, you can create a function that takes a DF with the Python DF name as an argument.

(tName) => LoadPythonDFs{[Name=tName]}[Value]

Calling python is very time consuming, but once you've populated it, Power BI works lightly (probably). It's a bit annoying to be asked for permission every time I edit python code.

When the filename field is not created

When you specify a folder as the data source and import multiple files at once, the Source.Name field may not be created and you may not know which file the data is. Sometimes it works, but for some reason I often make mistakes in my environment. I think this is a bug. In such a case, look at the query and find the part that looks like this. image.png Add one line in between and modify the following lines as follows: image.png This will add a congratulatory Source.Name item.

Delete fields at once

When deleting unnecessary fields, it is often decided what to delete. I will solidify the fields that do not make such a table.

let
    UnecessaryFields = #table(
    {"DataName", "ListofFields"},
    {
        {"Report1",
            {"Unnecessary field 1", "Unnecessary field 2", "Unnecessary field 3"}
        },
        {"Report2",
            {"Unnecessary field 1", "Unnecessary field 2", "Unnecessary field 3"}
        }
    }
)
in
    UnecessaryFields

And if you write it like this in the query, you can delete the specified fields at once.

Remove_Fields = Table.RemoveColumns(
    "Previous table", Table.SelectRows(UnecessaryFields, each [DataName] = "Report1"
        ){0}[ListofFields])

random number

You can get random numbers from a to b with Number.RandomBetween (a, b), but if you add them to the column as they are, they may all be the same number. (Sometimes it doesn't happen. Mystery) In this case, if you add or subtract a number column to the first argument, a different random number will be calculated for each record.

let
    ListN = #table({"i"}, List.Transform({1..10}, each {_})),
    #"Added Custom" = Table.Buffer(Table.AddColumn(ListN, "rnd", 
        each Number.RandomBetween(0 + [i] - [i], 100)))
in
    #"Added Custom"

two-way Lookup For example, there are two tables: image.png image.png Temporarily save the first table as People. To bring the items from the second table from the first table (for example, you want to know the Name with ID = 001), add a custom column to the second table and write: I will. image.png In this way, you can do the same thing as pasting a relation. However, if there are multiple keys in the master table, an error will occur, so it is necessary to set the master ID to unique in advance. The same can be achieved more easily by merging queries. You can use this to associate an HTML color code with a particular value in your data and change the background color etc. depending on the value in the conditional formatting of the report. You can do it without using it.

Link Collection

Finally, the links that I referred to. Microsoft Official M Formula Language Reference Power Query M Primer Introduction M or DAX Basics of M

Recommended Posts

Power Query Technique Collection Part 1
FastAPI Tutorial Memo Part 3 Query Parameters