Addition that is sly to write once

Maybe I like it after all. Koyu's. Nostalgic C. "[Mahjong is fun](https://www.google.co.jp/search?q=%E9%BA%BB%E9%9B%80%E3%81%A3%E3%81%A6% E6% A5% BD% E3% 81% 97% E3% 81% 84% E3% 82% 88% E3% 81% AD & source = lnms & tbm = isch & sa = X) "(Saki) and Yuka, Mahjong game implementation issues It's interesting because there are so many things that can be said in the waiting search, which is equivalent to the medium difficulty level. The previous Story of Ritsu and the story of the state machine are low difficulty, and the AI of NPC is high difficulty (well, so that the number of listenings will improve) It's a lot of tiles, but it's relatively strong, but it's not human-like, so it's difficult to calculate what for a human-like and strong AI). Shi. Moreover, since there is no in-depth commentary on computer mahjong yet (it seems to be a pattern in which a book was not published because it was not sold, although it was released), it is a treasure trove of material. If you do it carefully, it will be one book up to the number of listenings, and it will be a rather rugged book. (The following is not polite, so I don't know: I feel like I'm just saying the answer)

data form

There are various ways to express mahjong tehai.

/*(1)How to put each type of tile in an array*/
   t_pai pai[14];
   int painum;
/*(2)How to put the number of tiles you have in an array for each type of tile*/
   int painum[9+9+9+3+4+1];/*Contains one dummy element*/
   int pairednum[4];/*Stores only the number of red tiles, overlapping with painum (1 dummy is included)*/
/*(3)Is there a way to handle it like a tree structure? I'm not sure, but I'm sure it's inefficient*/
   t_pai_node top;
   top.same(), top.more(), top.less(), //And? In short, I wanted to say topless.

(1) is suitable for handling in the display system and imitation flow, but as you can see if you read this article to the end, (2) is faster for analysis such as mahjong tile search. Even if (1) is left as it is, I think that it can be made faster as it is if you do it in a smart way, but since you have to bite the if step by step and scan it, the code becomes complicated. Efficient listing of winning candidate tiles is difficult, and face decomposition requires repeated swaps and sorts. However, since the number of elements is small, that is the only merit, but it is inevitable that the TRY will be deep, and the algorithm will be difficult to get into the pipeline, so (2) will win. I will consider the objection. And I feel like a lot of mahjong software is doing that.

Since it is easy to generate (1) to (2), it is appropriate to manage it in (1) except when analyzing tiles, and to roughly generate (2) when analyzing tiles. think. The reason why the dummy is prepared in (2) is that (1) → (2) can be performed without branch jump even when the number of sounds is small. In other words, the last element is the number of null tiles.

It may also be painum [12 + 12 + 12 + 12 + 1] for another optimization. We'll talk about that later.

The squeal tiles are not mentioned here, but should be managed separately. Reflect only in the primary array (engraving / ordering data) for winning combination in advance.

Below, I will write on the premise of the data format of (2).

Reprint of comments attached to a certain place

There are two main policies for searching for mahjong tiles.

A [There is potential, but it tends to be complicated to identify the tiles]
try single horse (1 to 3 owned tiles)
Remove the engraving and Junko from both ends with priority on engraving, and if you can remove it, you can wait for a single horse
try Sparrow head confirmed (2-4 owned tiles)
Remove the engraving and Junko from both ends with priority on engraving, and if there is a surplus, it is a waiting form
Find the derivative form by accommodating the last remaining waiting form with the mens who removed it above.
For example, if the last is 45, interact with the removed Junko that contains 3 or 6.
For example, if there is 678, read as (45/678) → (456/78).
Further from 78 the same sentence below
For example, if there is 234, read (45 ・ 234) → (345 ・ 24)
For example, if 66 is a sparrow head, it should be read as (45.66) → (456/6) waiting for a single horse.
For example, if you have 666, change (45/666) → (456/66) to read as waiting for a shabo with a sparrow head.
As you can imagine, search for it so that it doesn't loop.
Remove the 4-card tile from the tile

B [It takes a lot of work to identify the waiting tile, but it is fast if you just find the waiting tile]
try Add the tiles you own and the tiles before and after
try Sparrow head confirmed (2-4 additional tiles owned)
Remove the engraving and Junko from both ends with priority on engraving, and it is OK if it can be removed

B'[Specific effort of the above waiting form]
The added tiles are included in the sparrow head → Single horse
The added tiles are included in Junko → Depending on how they are included, side tiles, fittings, and both sides
The added tiles are included in the engraving → Shabo
If it is a triple engraving type, a waiting type when it is read as Junko is also added
The above can be interpreted more than once → The highest score should be interpreted by calculating the mark and judging the combination.
→ The Tsumo mark changes depending on the waiting form, so at least the Tsumo flag must be visible.
→ If you decide the score, you can see the self-wind, field wind, reach state, dora, etc.
→ This is the same for A. It's hard to modularize it neatly
Most of the game data needs to be visible here

B is smooth for the purpose of identifying the waiting tile and judging the role. The calculation of the distance including the number of listeners and the probability to the winning form is an extension of A, so in many cases, both will be implemented. In addition, if you do not specify the waiting form in B and then pass the flag for each waiting form, whether it is Tsumo or Iwate, you will not be able to judge Tsumori 3 memorization or pinf I make a mistake. As I wrote in the comments, this kind of processing is a very annoying thing if you have to complete the whole game data (honestly, you have to pass almost everything except the river).

And it's optimization!

So, the point is how to write this with the minimum jump and using only the memory that fits in the L1 cache, but if you write a mahjong game server, most of the processing is in the form of tiles. Since you are doing analysis, the faster this guy is, the more people you can accommodate. State machines are only troublesome to deal with such small rules as compound squeal and agari, doubloons, dark cans, and chankans (although it is very troublesome to make the rules variable). It's not like consuming processing time (certainly). Now, I think the lobby and game flow can be managed with Node.js + react.js + socket.io. However, only tile analysis cannot support C10K without low-level coding using C ++ binding.

The number of listenings is reduced by at most 1 per Tsumo, so if you can calculate the number of listenings correctly, you can analyze the tiles on the server once every few rounds. (Number of listening-1) The tour can skip the tile analysis. And this is the most root pruning and the most effective. Based on that, we will further optimize the winning judgment, tenpai judgment, reachability judgment, etc., including the calculation of the number of listenings. In addition, the number of listenings needs to be calculated in three ways: seven pairs, Kokushi, and face-to-face, but the number of hearings to face-to-face requires pruning that goes beyond what is explained here. This is because the number of listenings can be calculated from the combination of Toko, Chiitoitsu, and Face, but the division is unexpectedly ambiguous, and if you do it poorly, you may fall into a deep search. It's annoying to get the wrong value. If I feel like it, I'll write it soon, but I'll start with the basics.

  a[0]>=If 3
　[ a[0], a[1], a[2] ] = [ a[0]-3, a[1], a[2] ] /*Try to take the engraving from the edge*/
Otherwise
　st = min(a[0],a[1],a[2])When
　[ a[0], a[1], a[2] ] = [ a[0]-st, a[1]-st, a[2]-st ] /*Try to take Junko from the edge*/

/*Of course after this
a[1], a[2], a[3]
a[2], a[3], a[4]
a[3], a[4], a[5]
a[4], a[5], a[6]
a[5], a[6], a[7]
a[6], a[7], a[8]
The same process is applied to each of them.
The engraving is
a[7]
a[8]
The same process is applied to.
If you want to do character tile processing with the memory allocation described later
a[9]
Also requires engraving.
*/

It is only necessary to perform the calculation by combining bit operations and table references without branching, and to keep a record of what was closed. Well, the answer in most cases is "subtract 0 for what you want without any impact." Then, write the result record for the time being, and make the increment to the pointer address 1 only when you want to keep the record, and add 0 to the pointer address in other cases, and it will be overwritten eventually. Eventually it will be ignored.

The following is a simple correction of the above beginning without jumping.

`sample`


if(a[0] >= 3)
{
    a[0] -= 3;
    *result++ = pai_1pin_ko;
}
//  ↓
int kotsu[] = {0,0,0,3,3};
*result = pai_1pin_ko;
result += kotsu[a[0]]>>1;
a[0] -= kotsu[a[0]];

It feels like. When I first wrote it in logical operation, I made a mistake, so I got angry and rewrote it to a table reference. It's fast enough because an array of this size can never go out of cache. Maybe faster than logical operations.

Incidentally added

The one above wins the table reference. Small table references are as fast as stupid. However, min has a famous arithmetic method, and logical operations win. That is this. This 31-bit right shift is a standard for doing things like ternary operators using positive and negative operations. It comes out very often to eliminate jumps.

`sample`


st = min(a[0], a[1], a[2]);
    /*↓*/
/*This is bit operation. Famous guy*/
int ab = a[0] + (((a[1]-a[0])>>31) & (a[1]-a[0]));
int st = ab + (((a[2]-ab)>>31) & (a[2]-ab));
assert((0x80000000>>31)==(-1));

For example, when doing a positive number check, do something like ((-a) >> 31). Since it depends on the environment, I also posted an assert to judge whether it works or not. In terms of application in mahjong, when you say the tiles you own when listing the winning candidate tiles and the tiles before and after that,

p[0]=a[1]*a[2],p[1]=a[2]*a[3]..., 
q[1]=a[2]*a[0],q[2]=a[3]*a[1]..., 
r[2]=a[0]*a[1],r[3]=a[1]*a[2]..., 
p[0]+=q[0]+r[0]+a[0],p[1]+=q[1]+r[1]+a[1]...
p[0]=((-p[0])>>31),p[1]=((-p[1])>>31)...

After doing so, you can select the tiles that have become non-zero.

In the above code, the buffers for a, p, q, and r are only written for clarity and beauty, and in reality, only two buffers, the original data a and the result holding p, are required.

By the way, even if you say the front and back tiles, when considering whether 2 tiles can be a winning candidate tile if 2 cylinders are not in the hand tile, if there are only 3 tiles and there are no 4 or 1 tiles You can't be a winning candidate tile, right? That's why I look at the two next to each other.

Here, multiplication and addition come out, but the magnitude of the value has no meaning. 0 × 0 = 0,0 × positive number = 0, positive number × 0 = 0, positive number × positive number = positive number, then 0 + 0 = 0,0 + positive number = positive number, positive number +0 = It is used as a kind of logical product / sum of positive numbers, positive numbers + positive numbers = positive numbers. Also, in this case, 7 pairs of the same formula appear in p and r, so assemble them so that they can be used well. You may leave it to the compiler.

By the way, if you optimize like a junior high school student. This is a simple formula transformation.

p[2] = (a[0]+a[3])*(a[1]+a[4])-a[0]*a[4]+a[2];

The amount of calculation itself is the original p [2] = a [0] * a [1] + a [1] * a [3] + a [3] * a [4] + a [2]; It's the same as, but there is also an evil way to calculate here using the progress of the calculation of "muscle mod 3" that will come out later. It's more effective to keep memory access in the L1 cache, reduce jumps, and run the pipeline, rather than doing it that far. Readability is so delicious. There is a scene where this process is close to "Muscle mod 3".

It may be a good idea to remove the tiles that have already been used with 4 tiles from consideration because the use of 4 tiles is very infrequent, so it may be a good idea to identify the winning tiles and then eliminate them later. There was a story that made me laugh a little when using 4 cards, and I remember a bug report that "1111234444" was treated as a no-ten after the stream. No, it ’s not in the world. The story goes awry, but there is a similar story in the Novetan form, with 3 Furo, "3456", "3" coming out of the upper family, "Ron" choices, but "Chi" not. Or something. You can't sing because the replacement is confirmed.

And back to the story, if you want to judge whether you can make it a sparrow head, it is ((2-a) >> 31). It's not always the case that you can judge so honestly ... next!

Great technique "color-coded mod 3"

By the way, there are at most 34 types of pre-verification of the above-mentioned winning candidate tiles, including character tiles. There are at most 25 types that should be tried in the preliminary verification because there are restrictions on the number of handicrafts. If you decide not to scan when you are not listening, if you find more than 20 candidates, you will be able to prun because you will never hear. However, the narrowing down that makes such narrowing down ridiculous is the narrowing down by the following "color-coded mod 3".

In fact, you can narrow down the number of tiles for each color, such as which color the winning candidate tiles are in. Looking at the mod 3 of the number of tiles held for each color, first of all, in the winning form, there is a sparrow head somewhere, and all the others are facets of 3 each, so (2, 0 for others) It is a combination of. That's definitely the case with face-to-face hands.

Then, when it comes to tenpai, one is drawn from the face, so there are two patterns depending on whether you draw one from 2 or 0. In other words, it is only a tenpai when (1, other is 0) and (2,2, others are 0). And since the place where the winning tile is drawn is non-zero, it means that there is a possibility that there is a winning tile at the non-zero place. Arranging by pattern, when (1, other is 0), the sparrow head is the same color as the winning tile, and when (2, 2, other is 0), the sparrow head and the winning tile are different colors. At this point, the winning tile candidates and sparrow head candidates will be narrowed down considerably.

It doesn't make much sense to suppress jumps in the process of narrowing down (although it is better to use the table well), so use if properly to divide the pattern. When it is 1, it is necessary to perform "try winning tile" x "try sparrow head candidate", but when it is 2, 2, one will perform "try sparrow head candidate" + the other will perform "try winning tile". Therefore, if you write a judgment code specialized for it, you can further optimize it. (Personally, the former is called product search and the latter is called sum search.)

Based on the result of this judgment, you can replace the start pointer and make it run, but let's make the best use of the pipeline by interleaving the verification process for different colors. Then, ** the compiler does not know that the range of pointers pointing to each color does not overlap **. So ** you have to manually interleave **. However, it is in a state like a mixed-color winning judgment with well-conditioned conditions, so if you have the guts, you can write super fast. This mod 3 method is one of the basic ideas that often appears in pruning in other tile analysis (such as listening count calculation).

By the way, as an aside, it is okay to judge the reachability by discarding one by one (especially if the policy is to calculate on the client side), but for safety, it is judged on the server side so far. If you want to notify the client, the loop of "try discard try try winning candidate tile" is unexpectedly deep, so you want to optimize it. Also, if you run NPCs on a reasonable number of servers, it's a must.

In the reachability judgment, that is, the tenpai judgment in the state of 14 sheets, "mod 3 by color" is (2, others 0), (1,1, others 0), (1,2,2, others There are only 3 ways of 0). And you have already identified the colors of the candidates for disposal for each pattern, haven't you? If there is 1, the color of 1 is the candidate for discarding, otherwise the color of 0,2 is a candidate for discarding tiles. If you think about it further, the color of the discarded tiles can be further limited. If we call "complete face decomposition" "Tris", If 1,1, The condition is that "one 1 succeeds in product search, another 1 is Tris by cutting one, and other colors are also Tris". Similarly If 1,2,2 "2, 2 succeeded in sum search, only 1 is Tris by cutting one piece, other colors are also Tris", If 2, a "2 cuts one sheet and succeeds in product search, 0 is all Tris" b "2 is Tris with waiting tiles, 0 is Tris with sparrow head, other colors are Tris" c "2 is Tris with sparrow head, one 0 is Tris with waiting tiles, other colors are Tris" It will be. It seems to be difficult, but when I write the code, it feels good to narrow down more and more. Since it is "Sparrow Tris" ⊃ "Machi-Tile Tris", you can check which category it belongs to in the Machi-Tile Tris judgment at once. (That is, there are three types of output: waiting tile Tris and Jakuto Tris, non-waiting Tris and Jakuto Tris, both failing: see "Muscle mod 3" below) And Furiten from the end If the possibility of reach is judged first and treated differently, the case of 2 is narrowed down a little more, and if all 0s are Tris, it always belongs to a when it is not a winning form. So, in other cases, there is always one 0 whose Tris judgment is NG (if there are two or more, it is not ten), and b and c can be judged at once by fixing the color. is. Therefore, the cases where you have to make a "waiting Tris judgment", which is equivalent to a "try winning candidate tile", have been extremely narrowed down. At most twice.

Then, if you search by mixing "complete face decomposition of color A and sum search of colors B and C" for each combination of patterns, you can write a code that is easy to get into the pipeline. Here, it is necessary to devise so as not to judge the order of the tiles, but with 12 elements for each of the Mantsutsu rope characters, 1 2 3 4 5 6 7 8 9 ◆◆◆ 　①②③④⑤⑥⑦⑧⑨◆◆◆ 　１２３４５６７８９◆◆◆ Southeast ◇ Northwest ◇ From white ◇ Middle ◆◆ It is also a good idea to place it in memory like this. It is OK if you perform the engraving search up to a [9] and overwrite ◇ with 0 after calculating the winning tile candidates. The merit of being able to use high-speed algorithms such as Tris judgment and product search without exception should outweigh the loss that causes some wasteful processing. Up to this point, 10 elements are fine, but if you want to use "Muscle mod 3" in the next section with branch suppression, you have to use 12 elements. Well, it doesn't increase the amount of calculation anyway, and I think it's okay to secure it just by touching it when calculating the sum of "muscle mod 3".

You can even use "Muscle mod 3"

I rewrote it here. Don't look at the edit history. I'm afraid (after that). The situation is slightly different from that of "color-coded", so it's complicated. So let's take a second look and go politely.

What is "Muscle mod 3"?

int suji[3];
suji[0] = (a[0]+a[3]+a[6]) mod 3;
suji[1] = (a[1]+a[4]+a[7]) mod 3;
suji[2] = (a[2]+a[5]+a[6]) mod 3;

And if you abbreviate it as s (suji [0], suji [1], suji [2]) etc., wherever the engraving is, s (0,0,0), where is Junko? Is also s (1,1,1), so if Tris passes, it will always be s (a, a, a). Is it OK so far?

So, if it is a 13-sheet system (n * 3 + 1), it should add up to paisum mod 3 == 1, so it will always be in the form of s (a, a, a + 1) in no particular order. Removing the sparrow head means removing the same two tiles, so s (-2,0,0) = s (+ 1,0,0), assuming a sparrow tile is also s (1,0) , 0). (Which source is still undecided)

Since s (a, a, a + 1) may be duplicated, in order to add 1 to two places to make s (a', a', a'),

Either s (a ** + 1 **, a ** + 1 **, a + 1) or s (a, a, a + 1 ** + 1 + 1 **).

OK. Now you can use it for product search. Assuming a sparrow head at a + 1 of s (a, a, a + 1), the waiting tiles are in the same line (so, even if there is only one tile in the hand tile, the sparrow head TRY is required) .. And if you assume a sparrow head in either a, the waiting tiles are in different a lines (so you can only try the sparrow head TRY in this case when you already have two or more tiles in your hand).

Therefore, if you keep track of which muscles you should try waiting for when picking up a sparrow head, you will always get better with 3 Tris for each sparrow head. I have to make a false judgment of Tris with a broken sparrow head of -1, but at most 9 times * Tris 3 times, on average 3-4 times x 3 times loop, so enroll Then you can judge the product by jumping the table. Branch prediction with a table jump will also be correct to some extent. Is it about 30%?

Ah, it was refreshing. I didn't say it as usual, but I'm really sorry to write the first thing.

Although the story goes awry

When I was young, I dreamed that if I gave a domain of input with a hint to the compiler, I could do crazy optimization, but I did various crazy optimizations using the domain myself. When I saw it, I came to think that it was not very promising to let the machine do it. However, it seems that there is still more CPU rate-determining processing, and considering that the clock speed is near the theoretical limit, it may be a way of life. Since the usage will be limited, it will be an extension of basic procedural and functional languages such as C and Haskell, respectively, but for language implementers, the theme of "optimization by using the domain", How is it?

[GO] Basic policy for searching for mahjong tiles

Addition that is sly to write once

data form

Reprint of comments attached to a certain place

And it's optimization!

sample

Incidentally added

sample

Great technique "color-coded mod 3"

You can even use "Muscle mod 3"

Although the story goes awry

`sample`

`sample`