Create an easy-to-extend Stream API alternative

Introduction

The Stream API, introduced in Java SE 8, processes sequences of data. Complicated processing that used to require hand-written loops over collections can now be expressed in easy-to-understand code. However, there are some points that become concerns when you actually use it.

I would like to create a new API that improves on these points. The policy I adopt is worked out step by step in the sections below.

All the code written for this article can be found at "Try to make an easy-to-extend Stream API alternative".

An alternative interface to Stream

To create an alternative to the Stream API, you first need to define the equivalent of the `Stream` interface. You could define a new interface, but here we use the existing `Iterable` interface. The `Collection` interface extends `Iterable`, so a collection can be used directly, without a conversion step like `list.stream()`.
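As a minimal sketch of this point (my own illustration, not from the article): any collection is already an `Iterable`, and because `Iterable` has `Iterator<T> iterator()` as its only abstract method, a lazy one can even be written as a lambda.

```java
import java.util.List;

public class IterableDemo {
    public static void main(String[] args) {
        // A List is already an Iterable; no conversion like list.stream() is needed.
        Iterable<String> fromList = List.of("a", "b", "c");

        // Iterable's only abstract method is iterator(), so a lambda
        // returning an Iterator also yields a valid Iterable.
        Iterable<String> fromLambda = () -> List.of("a", "b", "c").iterator();

        StringBuilder sb = new StringBuilder();
        for (String s : fromList)
            sb.append(s);
        for (String s : fromLambda)
            sb.append(s);
        System.out.println(sb); // abcabc
    }
}
```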

Intermediate processing

map

First, let's implement `map`. It receives an `Iterable<T>` and returns an `Iterable<U>`. `List<U>` implements `Iterable<U>`, so if you think you can simply return a `List<U>`, you might implement it like this:

public static <T, U> Iterable<U> map(Function<T, U> mapper, Iterable<T> source) {
    List<U> result = new ArrayList<>();
    for (T element : source)
        result.add(mapper.apply(element));
    return result;
}

However, with this, if `source` has one million elements, an `ArrayList` holding one million results is created up front. An intermediate operation should instead apply `mapper` lazily, producing each result only when the element is actually needed. To do this, you must implement an `Iterator<U>` that applies `mapper` sequentially. Once you have the `Iterator<U>`, implementing the `Iterable<U>` that returns it is easy, because `Iterable` is a functional interface whose only abstract method is `Iterator<U> iterator()`.

public static <T, U> Iterable<U> map(Function<T, U> mapper, Iterable<T> source) {
    return () -> new Iterator<U>() {

        final Iterator<T> iterator = source.iterator();

        @Override
        public boolean hasNext() {
            return iterator.hasNext();
        }

        @Override
        public U next() {
            return mapper.apply(iterator.next());
        }

    };
}

Before testing, let's define the terminal operation `toList`. It receives an `Iterable<T>` and returns a `List<T>`. It's easy because the enhanced for statement does the work.

public static <T> List<T> toList(Iterable<T> source) {
    List<T> result = new ArrayList<>();
    for (T element : source)
        result.add(element);
    return result;
}

The test looks like this. For comparison, I also write the equivalent code using the Stream API.

@Test
void testMap() {
    List<String> actual = toList(
        map(String::toUpperCase,
            List.of("a", "b", "c")));
    List<String> expected = List.of("A", "B", "C");
    assertEquals(expected, actual);

    List<String> stream = List.of("a", "b", "c").stream()
        .map(String::toUpperCase)
        .collect(Collectors.toList());
    assertEquals(stream, expected);
}

Because the Stream API chains instance methods, processing reads naturally from top to bottom. With static methods the order of description is reversed, innermost first, which may feel a little strange.

filter

Next, let's implement `filter`. As with `map`, the elements need to be processed lazily, so we define an anonymous inner class that implements `Iterator<T>`. `map` is simple because input and output correspond one-to-one, but for `filter` they don't, which makes it a little more troublesome.

public static <T> Iterable<T> filter(Predicate<T> selector, Iterable<T> source) {
    return () -> new Iterator<T>() {

        final Iterator<T> iterator = source.iterator();
        boolean hasNext = advance();
        T next;

        boolean advance() {
            while (iterator.hasNext())
                if (selector.test(next = iterator.next()))
                    return true;
            return false;
        }

        @Override
        public boolean hasNext() {
            return hasNext;
        }

        @Override
        public T next() {
            T result = next;
            hasNext = advance();
            return result;
        }
    };
}

`filter` must look ahead: when the iterator is created, it searches for the first element that satisfies `selector`. If one is found, it is saved in the instance variable `next` and returned when `next()` is called; at the same time, the search advances to the next element that satisfies `selector`. Let's test it: extract the even numbers from a sequence of integers and multiply each by 10.

@Test
void testFilter() {
    List<Integer> actual = toList(
        map(i -> i * 10,
            filter(i -> i % 2 == 0,
                List.of(0, 1, 2, 3, 4, 5))));
    List<Integer> expected = List.of(0, 20, 40);
    assertEquals(expected, actual);

    List<Integer> stream = List.of(0, 1, 2, 3, 4, 5).stream()
        .filter(i -> i % 2 == 0)
        .map(i -> i * 10)
        .collect(Collectors.toList());
    assertEquals(stream, actual);
}

With a Stream, once a terminal operation has run, the Stream cannot be reused. With an `Iterable`, you can save an intermediate result and reuse it later.

@Test
void testSaveFilter() {
    Iterable<Integer> saved;
    List<Integer> actual = toList(
        map(i -> i * 10,
            saved = filter(i -> i % 2 == 0,
                List.of(0, 1, 2, 3, 4, 5))));
    List<Integer> expected = List.of(0, 20, 40);
    assertEquals(expected, actual);

    assertEquals(List.of(0, 2, 4), toList(saved));
}
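One thing to keep in mind when reusing a saved `Iterable`: it is lazy, so every traversal re-runs the predicate against the source. A small sketch of this (my own example; `filter` is copied from above, and the `AtomicInteger` counter is just a way to observe the predicate invocations):

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

public class LazyFilterDemo {

    // filter copied from the article so this sketch is self-contained
    public static <T> Iterable<T> filter(Predicate<T> selector, Iterable<T> source) {
        return () -> new Iterator<T>() {
            final Iterator<T> iterator = source.iterator();
            boolean hasNext = advance();
            T next;

            boolean advance() {
                while (iterator.hasNext())
                    if (selector.test(next = iterator.next()))
                        return true;
                return false;
            }

            @Override public boolean hasNext() { return hasNext; }
            @Override public T next() { T result = next; hasNext = advance(); return result; }
        };
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        Iterable<Integer> evens = filter(
            i -> { calls.incrementAndGet(); return i % 2 == 0; },
            List.of(0, 1, 2, 3));
        for (int ignored : evens) { }  // first traversal: 4 predicate calls
        for (int ignored : evens) { }  // second traversal: 4 more
        System.out.println(calls.get()); // 8
    }
}
```

If a pipeline is expensive and traversed many times, materializing it once with `toList` may be the better choice.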

Terminal operations

Terminal operations are easy: like `toList()` above, you just accumulate the result using the enhanced for statement.

toMap

`toMap()` simply builds a `Map` using two `Function`s that extract the key and the value from each element.

public static <T, K, V> Map<K, V> toMap(Function<T, K> keyExtractor,
    Function<T, V> valueExtractor, Iterable<T> source) {
    Map<K, V> result = new LinkedHashMap<>();
    for (T element : source)
        result.put(keyExtractor.apply(element), valueExtractor.apply(element));
    return result;
}
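No test appears above for `toMap`, so here is a minimal usage sketch (my own example, with the method copied from the article). Note that since `Map.put` is used, a later element silently overwrites an earlier one with the same key:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ToMapDemo {

    // toMap copied from the article
    public static <T, K, V> Map<K, V> toMap(Function<T, K> keyExtractor,
        Function<T, V> valueExtractor, Iterable<T> source) {
        Map<K, V> result = new LinkedHashMap<>();
        for (T element : source)
            result.put(keyExtractor.apply(element), valueExtractor.apply(element));
        return result;
    }

    public static void main(String[] args) {
        // map each string to its length; LinkedHashMap keeps insertion order
        Map<String, Integer> lengths = toMap(Function.identity(), String::length,
            List.of("one", "three", "four"));
        System.out.println(lengths); // {one=3, three=5, four=4}
    }
}
```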

groupingBy

Next, let's implement a simple `groupingBy`. It just extracts a key from each element and builds a `Map`; elements with the same key are collected into a list.

public static <T, K> Map<K, List<T>> groupingBy(Function<T, K> keyExtractor,
    Iterable<T> source) {
    Map<K, List<T>> result = new LinkedHashMap<>();
    for (T e : source)
        result.computeIfAbsent(keyExtractor.apply(e), k -> new ArrayList<>()).add(e);
    return result;
}

Let's test it. This example groups strings by their length.

@Test
public void testGroupingBy() {
    Map<Integer, List<String>> actual = groupingBy(String::length,
        List.of("one", "two", "three", "four", "five"));
    Map<Integer, List<String>> expected = Map.of(
        3, List.of("one", "two"),
        5, List.of("three"),
        4, List.of("four", "five"));
    assertEquals(expected, actual);

    Map<Integer, List<String>> stream =
        List.of("one", "two", "three", "four", "five").stream()
        .collect(Collectors.groupingBy(String::length));
    assertEquals(stream, actual);
}

Next is an overload of `groupingBy` that groups by key and then aggregates the elements sharing each key.

static <T, K, V> Map<K, V> groupingBy(Function<T, K> keyExtractor,
    Function<Iterable<T>, V> valueAggregator, Iterable<T> source) {
    return toMap(Entry::getKey, e -> valueAggregator.apply(e.getValue()),
        groupingBy(keyExtractor, source).entrySet());
}

Elements with the same key are aggregated by `valueAggregator`. Let's define another terminal operation, `count`, for testing.

public static <T> long count(Iterable<T> source) {
    long count = 0;
    for (@SuppressWarnings("unused") T e : source)
        ++count;
    return count;
}

The following groups strings by length and counts how many strings have each length.

@Test
public void testGroupingByCount() {
    Map<Integer, Long> actual = groupingBy(String::length, s -> count(s),
        List.of("one", "two", "three", "four", "five"));
    Map<Integer, Long> expected = Map.of(3, 2L, 5, 1L, 4, 2L);
    assertEquals(expected, actual);

    Map<Integer, Long> stream = List.of("one", "two", "three", "four", "five").stream()
        .collect(Collectors.groupingBy(String::length, Collectors.counting()));
    assertEquals(stream, actual);
}

Things that are difficult to implement with the Stream API

Finally, let's implement operations that are difficult to achieve with the Stream API. Both are intermediate operations. In the Stream API, all intermediate operations are instance methods of `Stream`, so users cannot easily add their own.
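To see why static methods make extension easy, here is a hypothetical new intermediate operation, `takeWhile`, built with exactly the same pattern as `filter` above (this operation is my own example, not from the article; the Stream API itself only gained `takeWhile` in Java 9, and only because the JDK authors added it):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

public class TakeWhileDemo {

    // A user-defined intermediate operation: yields elements until the
    // condition first fails, then stops for good.
    public static <T> Iterable<T> takeWhile(Predicate<T> condition, Iterable<T> source) {
        return () -> new Iterator<T>() {
            final Iterator<T> iterator = source.iterator();
            boolean hasNext = advance();
            T next;

            boolean advance() {
                return iterator.hasNext() && condition.test(next = iterator.next());
            }

            @Override public boolean hasNext() { return hasNext; }
            @Override public T next() { T result = next; hasNext = advance(); return result; }
        };
    }

    public static void main(String[] args) {
        List<Integer> result = new ArrayList<>();
        // stops at 3, so the trailing 2 is never reached
        for (int n : takeWhile(i -> i < 3, List.of(0, 1, 2, 3, 4, 2)))
            result.add(n);
        System.out.println(result); // [0, 1, 2]
    }
}
```

Any operation written this way composes with `map`, `filter`, and the rest, because they all share the `Iterable` contract.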

zip

`zip` merges two sequences into one by combining their elements pairwise from the beginning. If the two inputs have different lengths, the result matches the shorter one and the extra elements of the longer one are ignored.

static <T, U, V> Iterable<V> zip(BiFunction<T, U, V> zipper, Iterable<T> source1,
    Iterable<U> source2) {
    return () -> new Iterator<V>() {

        final Iterator<T> iterator1 = source1.iterator();
        final Iterator<U> iterator2 = source2.iterator();

        @Override
        public boolean hasNext() {
            return iterator1.hasNext() && iterator2.hasNext();
        }

        @Override
        public V next() {
            return zipper.apply(iterator1.next(), iterator2.next());
        }

    };
}

Here is a test that pairs a sequence of integers with a sequence of strings.

@Test
void testZip() {
    List<String> actual = toList(
        zip((x, y) -> x + "-" + y,
            List.of(0, 1, 2),
            List.of("zero", "one", "two")));
    List<String> expected = List.of("0-zero", "1-one", "2-two");
    assertEquals(expected, actual);
}
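To confirm the claimed behavior for inputs of different lengths, a quick sketch (my own example; `zip` is copied from above):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiFunction;

public class ZipLengthDemo {

    // zip copied from the article
    public static <T, U, V> Iterable<V> zip(BiFunction<T, U, V> zipper,
        Iterable<T> source1, Iterable<U> source2) {
        return () -> new Iterator<V>() {
            final Iterator<T> iterator1 = source1.iterator();
            final Iterator<U> iterator2 = source2.iterator();

            // iteration ends as soon as either input is exhausted
            @Override public boolean hasNext() {
                return iterator1.hasNext() && iterator2.hasNext();
            }

            @Override public V next() {
                return zipper.apply(iterator1.next(), iterator2.next());
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> result = new ArrayList<>();
        // the second input has only two elements, so the 3 is ignored
        for (int n : zip((x, y) -> x + y, List.of(1, 2, 3), List.of(10, 20)))
            result.add(n);
        System.out.println(result); // [11, 22]
    }
}
```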

cumulative

This is an intermediate operation that accumulates elements. It is similar to `reduce()`, but `reduce()` is a terminal operation that returns a single value, while `cumulative` returns the sequence of intermediate results.

public static <T, U> Iterable<U> cumulative(U unit, BiFunction<U, T, U> function,
    Iterable<T> source) {
    return () -> new Iterator<U>() {

        final Iterator<T> iterator = source.iterator();
        U accumulator = unit;

        @Override
        public boolean hasNext() {
            return iterator.hasNext();
        }

        @Override
        public U next() {
            return accumulator = function.apply(accumulator, iterator.next());
        }

    };
}

Here is a test that computes the partial sums from the beginning of the sequence.

@Test
public void testCumulative() {
    List<Integer> actual = toList(
        cumulative(0, (x, y) -> x + y,
            List.of(0, 1, 2, 3, 4, 5)));
    List<Integer> expected = List.of(0, 1, 3, 6, 10, 15);
    assertEquals(expected, actual);
}
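`cumulative` works with any combining function, not just addition. For instance, a running maximum (my own example; `cumulative` is reproduced from above):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiFunction;

public class CumulativeMaxDemo {

    // cumulative reproduced from the article
    public static <T, U> Iterable<U> cumulative(U unit, BiFunction<U, T, U> function,
        Iterable<T> source) {
        return () -> new Iterator<U>() {
            final Iterator<T> iterator = source.iterator();
            U accumulator = unit;

            @Override public boolean hasNext() { return iterator.hasNext(); }

            @Override public U next() {
                return accumulator = function.apply(accumulator, iterator.next());
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> result = new ArrayList<>();
        // running maximum: the unit is the smallest possible value
        for (int n : cumulative(Integer.MIN_VALUE, Math::max, List.of(3, 1, 4, 1, 5)))
            result.add(n);
        System.out.println(result); // [3, 3, 4, 4, 5]
    }
}
```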

flatMap

`flatMap()` also exists in the Stream API, but its implementation is not obvious, so I'll include it here.

public static <T, U> Iterable<U> flatMap(Function<T, Iterable<U>> flatter, Iterable<T> source) {
    return () -> new Iterator<U>() {

        final Iterator<T> parent = source.iterator();
        Iterator<U> child = null;
        boolean hasNext = advance();
        U next;

        boolean advance() {
            while (true) {
                if (child == null) {
                    if (!parent.hasNext())
                        return false;
                    child = flatter.apply(parent.next()).iterator();
                }
                if (child.hasNext()) {
                    next = child.next();
                    return true;
                }
                child = null;
            }
        }

        @Override
        public boolean hasNext() {
            return hasNext;
        }

        @Override
        public U next() {
            U result = next;
            hasNext = advance();
            return result;
        }

    };
}

Below is the test. Each element is duplicated, inflating the sequence to twice its original length.

@Test
public void testFlatMap() {
    List<Integer> actual = toList(
        flatMap(i -> List.of(i, i),
            List.of(0, 1, 2, 3)));
    List<Integer> expected = List.of(0, 0, 1, 1, 2, 2, 3, 3);
    assertEquals(expected, actual);

    List<Integer> stream = List.of(0, 1, 2, 3).stream()
        .flatMap(i -> Stream.of(i, i))
        .collect(Collectors.toList());
    assertEquals(stream, actual);
}

Finally

I haven't implemented everything the Stream API supports, but I've found that creating an equivalent is surprisingly easy.