Item 47: Prefer Collection to Stream as a return type

47. Prefer Collection to Stream as the return type

What type should a sequence of elements be returned as?

Possible return types for a sequence of elements are the collection interfaces, Iterable, arrays, and streams.

Returning a Stream

You may hear that a stream is the obvious choice for returning a sequence of elements, but as mentioned in Item 45, streams do not make iteration obsolete; good code uses each where it fits.

Because Stream does not extend Iterable, the only way to iterate over a returned stream with a for-each statement is through Stream's iterator method. At first glance, the code below looks as if it should work with a method reference to iterator.

// Won't compile, due to limitations on Java's type inference
for (ProcessHandle ph : ProcessHandle.allProcesses()::iterator) {
    // Process the process
}

However, this code does not compile. To make it compile, the method reference must be cast to an appropriately parameterized Iterable, as follows.

// Hideous workaround to iterate over a stream
for (ProcessHandle ph : (Iterable<ProcessHandle>)
                        ProcessHandle.allProcesses()::iterator) {
    // Process the process
}

This code works, but it is too noisy and opaque to use in practice. A better workaround is an adapter method. The JDK doesn't provide such a method, but it is easy to write:

// Adapter from Stream<E> to Iterable<E>
public static <E> Iterable<E> iterableOf(Stream<E> stream) {
    return stream::iterator;
}

With this adapter method, you can iterate over any stream with a for-each statement, as follows.

for (ProcessHandle p : iterableOf(ProcessHandle.allProcesses())) {
    // Process the process
}
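As a quick check of the adapter, here is a minimal runnable sketch. The class name IterableOfDemo and the helper drain are hypothetical names introduced only for this illustration; iterableOf is the adapter shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class IterableOfDemo {
    // Adapter from Stream<E> to Iterable<E> (same as above)
    static <E> Iterable<E> iterableOf(Stream<E> stream) {
        return stream::iterator;
    }

    // Hypothetical helper: copies a stream into a list using for-each
    static <E> List<E> drain(Stream<E> stream) {
        List<E> result = new ArrayList<>();
        for (E e : iterableOf(stream)) {
            result.add(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(drain(Stream.of("a", "b", "c"))); // prints [a, b, c]
    }
}
```

Note that the Iterable produced this way can be traversed only once, because the underlying stream is consumed by the first traversal; a second for-each over the same Iterable throws IllegalStateException.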

Returning an Iterable

Conversely, a client that wants to process the sequence as a stream is stuck if the API returns only an Iterable. The JDK does not provide an adapter for this case either, but one is easy to write:

// Adapter from Iterable<E> to Stream<E>
public static <E> Stream<E> streamOf(Iterable<E> iterable) {
    return StreamSupport.stream(iterable.spliterator(), false);
}
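The following sketch shows the adapter in use. The class StreamOfDemo and the methods names and upperCased are hypothetical, standing in for an API that exposes only an Iterable.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class StreamOfDemo {
    // Adapter from Iterable<E> to Stream<E> (same as above)
    static <E> Stream<E> streamOf(Iterable<E> iterable) {
        return StreamSupport.stream(iterable.spliterator(), false);
    }

    // Hypothetical API that exposes only an Iterable
    static Iterable<String> names() {
        return List.of("ant", "bee", "cat");
    }

    // Client code that wants a stream pipeline
    static List<String> upperCased() {
        return streamOf(names())
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(upperCased()); // prints [ANT, BEE, CAT]
    }
}
```

The second argument to StreamSupport.stream selects a sequential (false) or parallel (true) stream.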

Returning a Collection

The Collection interface is a subtype of Iterable and also has a stream method, so it supports both iteration and stream processing. **Collection, or an appropriate subtype of Collection, is therefore usually the best return type for a method that returns a sequence of elements.** If the sequence is small enough to fit comfortably in memory, you can return a standard implementation such as ArrayList or HashSet, but **you should not store a large sequence in memory just to return it as a Collection.**
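To illustrate why a Collection serves both kinds of client, here is a small sketch. The class CollectionReturnDemo and the method primaryColors are made up for this example; the point is only that the same return value works with for-each and with a stream pipeline, with no adapter in between.

```java
import java.util.Collection;
import java.util.List;

class CollectionReturnDemo {
    // Hypothetical API method that returns a small sequence as a Collection
    static Collection<String> primaryColors() {
        return List.of("red", "green", "blue");
    }

    // Client that prefers iteration
    static int totalLengthByLoop() {
        int total = 0;
        for (String color : primaryColors())
            total += color.length();
        return total;
    }

    // Client that prefers streams
    static int totalLengthByStream() {
        return primaryColors().stream().mapToInt(String::length).sum();
    }

    public static void main(String[] args) {
        System.out.println(totalLengthByLoop());   // prints 12
        System.out.println(totalLengthByStream()); // prints 12
    }
}
```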

If the sequence to be returned is large but can be represented concisely, consider implementing a special-purpose collection. For example, consider a method that returns the power set of a given set. The power set of {a, b, c} is {{}, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}; in general, the power set of a set with n elements has 2^n elements. Do not even consider storing the power set in a standard collection, because it grows exponentially. A custom collection for it is easy to implement on top of AbstractList. The trick is to treat the index of each element of the power set as a bit vector, in which bit n indicates whether the nth element of the source set is present. The code looks like this:

package tryAny.effectiveJava;

import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Returns the power set of an input set as a custom collection
public class PowerSet {
    public static final <E> Collection<Set<E>> of(Set<E> s) {
        List<E> src = new ArrayList<>(s);
        if (src.size() > 30)
            throw new IllegalArgumentException("Set too big " + s);
        return new AbstractList<Set<E>>() {
            @Override
            public int size() {
                return 1 << src.size(); // 2 to the power srcSize
            }

            @Override
            public boolean contains(Object o) {
                return o instanceof Set && src.containsAll((Set<?>) o);
            }

            @Override
            public Set<E> get(int index) {
                Set<E> result = new HashSet<>();
                for (int i = 0; index != 0; i++, index >>= 1)
                    if ((index & 1) == 1)
                        result.add(src.get(i));
                return result;
            }
        };
    }
}

In the code above, an exception is thrown if the input set has more than 30 elements. This is because the size method of Collection returns an int, so the largest size it can report is 2^31 − 1 (Integer.MAX_VALUE); a 31-element set would need a power set of 2^31 elements.
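The index-to-subset decoding and the size limit can be checked in isolation with a short sketch. The class PowerSetIndexDemo and its methods subsetAt and powerSetSize are hypothetical names; the loop is the same one used in get above, except that LinkedHashSet is swapped in for HashSet so the printed order is predictable.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class PowerSetIndexDemo {
    // Decodes an index into a subset the same way PowerSet.get does:
    // bit i of the index says whether src.get(i) is included.
    // LinkedHashSet (an assumption for this demo) keeps insertion order.
    static <E> Set<E> subsetAt(List<E> src, int index) {
        Set<E> result = new LinkedHashSet<>();
        for (int i = 0; index != 0; i++, index >>= 1)
            if ((index & 1) == 1)
                result.add(src.get(i));
        return result;
    }

    // Mirrors the size() computation: 2 to the power n, as an int
    static int powerSetSize(int n) {
        return 1 << n;
    }

    public static void main(String[] args) {
        List<String> src = List.of("a", "b", "c");
        System.out.println(subsetAt(src, 5));  // index 5 is binary 101: prints [a, c]
        System.out.println(powerSetSize(30));  // prints 1073741824
        System.out.println(powerSetSize(31));  // overflows int: prints -2147483648
    }
}
```

The last line shows why the guard is needed: with 31 elements the shift overflows and size would report a negative number.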

In some cases, the return type is chosen solely on the basis of ease of implementation. For example, consider writing a method that returns all the (contiguous) sublists of an input list. Three lines of code suffice to generate these sublists and put them in a standard collection, but the memory required to hold that collection is quadratic in the size of the source list. That is not as bad as the power set, which grows exponentially, but it is still unacceptable. Implementing a custom collection, as was done for the power set, would be tedious.
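To make the cost of the materializing approach concrete, here is a minimal sketch of it. The class MaterializedSubLists and the method allSubLists are hypothetical names. A list of n elements has n(n+1)/2 non-empty sublists, so the returned collection grows quadratically.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MaterializedSubLists {
    // Eagerly stores every non-empty sublist in a standard collection
    static <E> List<List<E>> allSubLists(List<E> src) {
        List<List<E>> result = new ArrayList<>();
        for (int start = 0; start < src.size(); start++)
            for (int end = start + 1; end <= src.size(); end++)
                result.add(src.subList(start, end));
        return result;
    }

    public static void main(String[] args) {
        // prints [[a], [a, b], [a, b, c], [b], [b, c], [c]]
        System.out.println(allSubLists(List.of("a", "b", "c")));
        // 100 elements already yield 100 * 101 / 2 = 5050 sublists
        System.out.println(allSubLists(Collections.nCopies(100, 0)).size()); // prints 5050
    }
}
```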

However, returning all the sublists of a list as a stream can be implemented directly with a little ingenuity. The sublists that contain the first element of the list are called prefixes. For example, the prefixes of (a, b, c) are (a), (a, b), and (a, b, c). The sublists that contain the last element of the list are called suffixes. For example, the suffixes of (a, b, c) are (a, b, c), (b, c), and (c). The key insight is that the sublists of a list are simply the suffixes of its prefixes, plus the empty list. The implementation is as follows.

package tryAny.effectiveJava;

import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;
import java.util.stream.Stream;

// Returns a stream of all the sublists of its input list
public class SubLists {
    public static <E> Stream<List<E>> of(List<E> list) {
        return Stream.concat(Stream.of(Collections.emptyList()), prefixes(list).flatMap(SubLists::suffixes));
    }

    private static <E> Stream<List<E>> prefixes(List<E> list) {
        return IntStream.rangeClosed(1, list.size()).mapToObj(end -> list.subList(0, end));
    }

    private static <E> Stream<List<E>> suffixes(List<E> list) {
        return IntStream.range(0, list.size()).mapToObj(start -> list.subList(start, list.size()));
    }
}

This implementation is similar in spirit to the obvious nested for loop:

for (int start = 0; start < src.size(); start++)
    for (int end = start + 1; end <= src.size(); end++)
        System.out.println(src.subList(start, end));

A literal translation of this for loop into a stream is more concise than the previous implementation, but perhaps a little less readable. Like the for loop, it does not emit the empty list. Specifically, it looks like this:

// Returns a stream of all the sublists of its input list
public static <E> Stream<List<E>> of(List<E> list) {
   return IntStream.range(0, list.size())
      .mapToObj(start ->
         IntStream.rangeClosed(start + 1, list.size())
            .mapToObj(end -> list.subList(start, end)))
      .flatMap(x -> x);
}
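The literal translation, like the for loop it mirrors, omits the empty list. One possible way to add it back, sketched here under the hypothetical class name SubListsWithEmpty, is to prepend it with Stream.concat, as the prefix/suffix version does.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class SubListsWithEmpty {
    // Literal for-loop translation, with the empty list prepended via concat
    static <E> Stream<List<E>> of(List<E> list) {
        Stream<List<E>> nonEmpty = IntStream.range(0, list.size())
            .mapToObj(start ->
                IntStream.rangeClosed(start + 1, list.size())
                    .mapToObj(end -> list.subList(start, end)))
            .flatMap(x -> x);
        return Stream.concat(Stream.of(List.<E>of()), nonEmpty);
    }

    public static void main(String[] args) {
        // prints [[], [a], [a, b], [a, b, c], [b], [b, c], [c]]
        System.out.println(of(List.of("a", "b", "c")).collect(Collectors.toList()));
    }
}
```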

Neither implementation is bad, but both force some clients to adapt: some need extra code to convert the stream into something they can iterate over, and others must use a stream where iteration would be more natural. The code that converts a stream for iteration not only clutters the client code but is also slower than a Collection-based implementation.
