Range story

This is the day 23 entry of Ruby Advent Calendar 2020.

I can't speak to deep internals, so I planned to write something based on the documentation. Then I happened to run into a case where code using Range changed its behavior after a Ruby version upgrade. That prompted me to look into endless and beginless Ranges, so I will write about that.

Overview

Starting with Ruby 2.6, a Range without an end (endless) is available, and from Ruby 2.7, a Range without a start (beginless) is available. A Range with neither a start nor an end behaves differently in each version, which I found interesting, so I will introduce it here.

Range without end and Range without start

(1..)  # no end (endless)
(..10) # no start (beginless)

You cannot omit both the start and the end.

..  # syntax error

However, you can still write a Range with neither a start nor an end by giving nil explicitly.

nil..nil
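As a small sketch (assuming Ruby 2.7 or later, since beginless ranges are needed), the literal forms and the explicit-nil forms produce equal Ranges, and the missing side is simply stored as nil:

```ruby
# Assumes Ruby 2.7 or later.
endless   = (1..)
beginless = (..10)
unbounded = (nil..nil)

p endless == (1..nil)      #=> true
p beginless == (nil..10)   #=> true
p endless.end              #=> nil
p beginless.begin          #=> nil
p unbounded.begin.nil? && unbounded.end.nil?  #=> true
```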

Differences between versions when the end or start is nil

The results of evaluating each expression in each version are as follows.

1..nil
#  < 2.6: ArgumentError
# >= 2.6: 1..

nil..10
#  < 2.7: ArgumentError
# >= 2.7: ..10

nil..nil
#  < 2.6: nil..nil
# ~> 2.6.0: nil..
# >= 2.7: nil..nil

The differences in the first two are straightforward, but the last one is harder to understand.
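To check which of the results above your own Ruby produces, something like the following could be used. This is only a sketch: try_range is a hypothetical helper name, and it uses Range.new rather than the literal syntax so that the file still parses on older versions.

```ruby
# Hypothetical helper: builds a Range and reports either its inspect
# string or the ArgumentError raised by versions that reject the nil.
def try_range(first, last)
  Range.new(first, last).inspect
rescue ArgumentError => e
  "ArgumentError: #{e.message}"
end

puts "Ruby #{RUBY_VERSION}"
puts try_range(1, nil)
puts try_range(nil, 10)
puts try_range(nil, nil)
```

The output depends on the version you run it with, so no expected output is shown here.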

< 2.6: nil..nil

This is literally the range from nil to nil. Its size is nil, and it covers only nil.

range = nil..nil
range.size        #=> nil
range.cover?(nil) #=> true

~> 2.6.0: nil..

This is an endless Range that starts at nil. Again, its size is nil, and it covers only nil.

range = nil..nil
range.size        #=> nil
range.cover?(nil) #=> true

>= 2.7: nil..nil

Finally, this is the Range with neither a start nor an end mentioned at the beginning. Its size is Infinity, and **it covers everything**.

range = nil..nil
range.size         #=> Infinity
range.cover?(nil)  #=> true
range.cover?(1)    #=> true
range.cover?('s')  #=> true
range.cover?(true) #=> true

By the way, I showed size for the earlier nil..nil and nil.. cases as well because the last nil..nil has a size of Infinity, but note that **Range#size, unlike Array#size, does not simply represent the number of elements**.

('a'..'z').size      #=> nil
('a'..'z').to_a.size #=> 26

Although it is described as a method that returns the number of elements, it returns nil for a Range such as 'a'..'z', whose start and end are neither instances of a Numeric subclass nor nil.
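If you actually need the number of elements, counting them is one option. A small sketch comparing the two (assuming Ruby 2.6 or later for the endless range at the end):

```ruby
letters = ('a'..'z')
p letters.size       #=> nil
p letters.count      #=> 26
p letters.to_a.size  #=> 26

numbers = (1..10)
p numbers.size       #=> 10
p numbers.count      #=> 10

# For an endless numeric range, size reports Infinity.
endless = (1..)
p endless.size       #=> Infinity
```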

Range cannot be destructively changed

Ruby's Range class is immutable. That is, the object itself cannot be modified destructively, so the interval represented by a Range object can never change once the object has been created.

Being told that makes me want to find a way to change it destructively anyway... but after a lot of research, I found no way to do so.
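Since a Range cannot be mutated, the usual alternative is to build a new one from the old one's start and end. A minimal sketch, where with_end is a hypothetical helper name of my own and not something Ruby provides:

```ruby
# Hypothetical helper: returns a new Range with a different end,
# leaving the original untouched.
def with_end(range, new_end)
  Range.new(range.begin, new_end, range.exclude_end?)
end

original = (1..10)
extended = with_end(original, 20)

p original                     #=> 1..10
p extended                     #=> 1..20
p original.respond_to?(:end=)  #=> false
```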

Rurima's story

In Rurima (the Japanese Ruby reference manual), Range#begin and Range#first, and likewise Range#end and Range#last, are given the same description, but their results differ when the Range has no start or no end.

Range#begin

(..10).begin #=> nil
(..10).first #=> RangeError

Range#end

(1..).end #=> nil
(1..).last #=> RangeError

When I read the source code, the implementations turned out to be slightly different.

/* Range#begin */
static VALUE
range_begin(VALUE range)
{
    return RANGE_BEG(range);
}

/* Range#first */
static VALUE
range_first(int argc, VALUE *argv, VALUE range)
{
    VALUE n, ary[2];

    if (NIL_P(RANGE_BEG(range))) {
        rb_raise(rb_eRangeError, "cannot get the first element of beginless range");
    }
    if (argc == 0) return RANGE_BEG(range);

    rb_scan_args(argc, argv, "1", &n);
    ary[0] = n;
    ary[1] = rb_ary_new2(NUM2LONG(n));
    rb_block_call(range, idEach, 0, 0, first_i, (VALUE)ary);

    return ary[1];
}

I can't really read C, but at a glance it seems that Range#first and Range#last raise an exception when the value that begin or end would return is nil. The rest of the function appears to handle the case where an argument is passed, since Range#first and Range#last can return multiple values counted from the start or the end.
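The behavior described above can be tried out directly. A small sketch, assuming Ruby 2.7 or later; the exception messages are only printed, not relied upon, because their wording may differ between versions:

```ruby
# Assumes Ruby 2.7 or later.
bounded   = (1..10)
beginless = (..10)
endless   = (1..)

p bounded.begin     #=> 1
p bounded.first     #=> 1
p bounded.first(3)  #=> [1, 2, 3]
p bounded.last(3)   #=> [8, 9, 10]

# begin/end simply return the stored value, even when it is nil.
p beginless.begin   #=> nil
p endless.end       #=> nil

# first/last raise RangeError when the corresponding side is missing.
[[beginless, :first], [endless, :last]].each do |range, meth|
  begin
    range.public_send(meth)
  rescue RangeError => e
    puts "#{range.inspect}.#{meth} raised RangeError: #{e.message}"
  end
end

# An endless range still has a first element.
p endless.first(3)  #=> [1, 2, 3]
```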

If I come up with wording that explains this difference in behavior well, I will open a PR with the fix.

By the way, you can propose changes to Rurima by opening a PR from the "edit" link on the page.

Rurima also says that Range#size returns nil if the start or end is not an instance of a Numeric subclass. However, the end of 1..nil is nil, which is not an instance of a Numeric subclass, and yet size returns Infinity rather than nil. I don't know how this should be interpreted [^ 1], so I would not call it a simple mistake, but there seems to be room to make the wording easier to understand.

Range#size

Summary

It's fun to discover new things by rereading the documentation and actually playing with objects of classes you use every day.

As for the case I mentioned at the beginning, where code using Range changed its behavior after a Ruby version upgrade, the code looked like this.

(time_or_nil..).cover?(time)

This code tested a Time object against the endless Range described above (the same kind of check as ===) and returned true if the time was included. The Range could end up as nil.., in which case the check always returned false, but after upgrading to 2.7 it always returned true. Please be careful, since others may run into the same situation.

In a personal project this would be manageable, but at work I don't know all the code that other people have written, so behavior changes caused by a version upgrade like this are especially easy to miss. I want to write proper tests day to day so that such changes are caught when they happen.
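One way to make such code independent of the Ruby version is to handle the nil case explicitly instead of relying on how nil.. behaves. A minimal sketch, assuming the intended meaning is "no start time means no match" (the Ruby 2.6 behavior); on_or_after? is a hypothetical name and not the code from the actual application:

```ruby
# Hypothetical helper. Assumption: a missing start time should never
# match, which is what the original code did on Ruby 2.6.
def on_or_after?(start_time, time)
  return false if start_time.nil?

  (start_time..).cover?(time)
end

p on_or_after?(nil, Time.now)             #=> false
p on_or_after?(Time.at(0), Time.at(100))  #=> true
p on_or_after?(Time.at(100), Time.at(0))  #=> false
```

If the opposite meaning is wanted (no start time means everything matches), the guard clause would return true instead.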

Reference material

Rurima

[^ 1]: Should the nil returned by (1..nil).end really be regarded as the end? Since nil here expresses "no end", one could say that the end is "absent" rather than nil.
