Any thoughts on Ceylon-style union and intersection types?

Of course Kotlin is by-far my favorite new JVM-based language, but I do find myself getting a little jealous of Ceylon's union and intersection types as described here: http://ceylon-lang.org/documentation/1.0/tour/types/

Any thoughts on whether we could have something similar in Kotlin?

Ian.

We could have many things, if there were good reasons. Tell me what the good reasons for this complication of the type system are?

Wait, you mean we actually need reasons for new features?!Ā Ā That's crazy talk ;-)

Like many features, this is more about affording greater flexibility and less verbosity, than enabling something fundamentally new.

As a specific example, Iā€™ve been working on creating a class that represents rational numbers, allowing them to be added, subtracted, etc.Ā Ā I want to be able to construct a rational number from a String or a Number.

Right now, Iā€™d either need to supply factory methods for the alternate means of construction since constructors canā€™t be overloaded (right?), or have the constructor accept an Any, and then use a case statement to figure out how to handle it, but thatā€™s ugly.Ā Ā 

With a union type I could declare the constructor as follows:

class Rational(val nominator : Number|String, val denominator : Long = 1) : Number() {
Ā Ā ā€¦
}

Note that the nominator can be either a String or a Number, and this will be enforced by the compiler.Ā Ā It seems like a fairly natural companion to Kotlinā€™s pattern matching (indeed, the first time I saw this was with the ML programming language back in the 90s - so itā€™s not a new idea by any means).

Constructors can be overloaded byfactory methods:

class Foo(val x: Int) {...} fun Foo(s: String) = Foo(s.parseInt())

Ah!Ā Ā Ok, I had overlooked that in the documentation, good to know.

I guess in that case union types provide a more concise syntax for defining functions which can accept multiple types that donā€™t necessarily have a common supertype:

fun concatinate(toConcat : List<Int>|String) {

}

I donā€™t know, I suppose itā€™s really a judgement call as to whether you feel this functionality is worthy of the effort.Ā Ā Certainly the creators of Ceylon thought it was as did the creators of ML and Haskell, and it seems like a natural evolution of the type system.

Strategically, if Kotlin seeks to be the definitive post-Java JVM language, that you consider stealing any good ideas implemented by competitive languages.

Anyway, I trust your jugement.

Neither Ceylon, ML nor Haskell has overloading.

I'm currently facing a type heirarchy challenge in Java, and am wondering if union types might be a good solution, or if there's another Kotlin feature that would work well.

In my model, I have classes A, B, C, etc., where A is the base class and the others form a heirarchy. I also have an interface X that some objects may posses. Letā€™s say I have classes A, B extends A, C extends A, BX extends B implements X, CX extends C implements X (plus others, such as DX extends C implements X). I have a special container MyList&lt;T extends A&gt; that I use to hold my objects. I would like to be able to declare a list MyList&lt;A &amp; X&gt; to hold objects from this heirarchy that have the interface X. AFAIK, this is not possible in Java even though Java looks like it almost has union types. Eg, I can have a function &lt;U extends A &amp; X&gt; void foo(MyList&lt;U&gt; list) { A a = list.get(0); X x = list.get(0); } but it can accept either a MyList<BX> or a MyList<CX> but not a list containing objects of types BX and CX simultaneously.

Itā€™d be very nice if I could write functions that take MyList<A&X>, but it seems I am forced to give up on trying to have explicit strong static typing. In java, Iā€™ve resorted to declaring a MyXList&lt;T extends A&gt; extends MyList&lt;T&gt; that serves at least as a marker in the functionā€™s signature.

I understand such situations arenā€™t common, but Iā€™m just throwing it out as an example.

I'm writing a multidimensional array library in Kotlin with support for advanced indexing ala Numpy/Matlab/Octave. I've run into a case where Union types would be valuable. As far as I can see, function overloading does not provide an alternative.

My usecase is as follows: A view of an Nd array (for the sake of discussion, 3d) X is given by X[a, b, c] where each of the ā€œcoordinatesā€ a, b and c can be of type SliceBuilder, Array<Int> or Int. I currently get around the absence of Union types in Kotlinā€™s type system by using Any (see lines 95 and 109 of https://github.com/burjorjee/Nd/blob/master/src/Nd.kt). But Any is too broad. The correct type of of the variable coords in this case is best specified by Array<SliceBuilder | Array<Int> | Int>.

You can always use multiple inheritance of interfaces (traits in Kotlin) to model this. It may double the number of .class files, btu you'll have the static typing

To preserve type safety, you could either have conversion functions on arrays (why not lists, btw?) and ints to SliceBuilder or present some overloads (not all 2^N) alongside a, say, getUnsafe() method that takes Any

One could perhaps get away with storing an Int as a Slice or Array<Int> of size one. However, one cannot store an Array<Int> (which may be an arbitrary sequence of integers) as a Slice since there is no way of representing, say, array(1,3,4, 70) as a slice because the increment varies from element to element. One could store a Slice, say Slice(0, 10000000, 2), as an Array<Int>, but this would involve an unneccessary increase in the amount of memory used. The type of a "coordinate" in advanced indexing schemes such as the one used by Numpy really is most naturally represented by Int | Array<Int> | Slice.

Why Array<Int> instead of List<Int>? No good reason. List<Int> may indeed offer more flexibility to users. Better still would be Iterable<Int> where Array<T> and List<T> are subclasses of Iterable<out T>

My understanding is that union and intersection types allow Ceylon to denote all types that the compiler needs to reason about in the language. That's not only cool but also immensely useful, because it allows the compiler to produce an actually readable error message if you do a mistake, especially with generics. I don't know if adding unions and intersections to Kotlin would give us the same benefit, given the differences in the type systems (mixed-site variance?), but I guess it's at least worth exploring. And if I remember correctly, the compiler already has intersection types, so...

It kind of follows from what you say that it will be enough if the compiler uses intersection and union types in error messages. Good idea, we'll think about it.

Yes, kind of. I still think that it's better for the compiler to use actual language syntax, rather than make up something extralinguistic, but every improvement of error messages is good. I surely don't want to see the dreaded capture#178 that we know from Java :-)

Capture-types arise from use-site variance, not from intersection types, but I see your point :)

Here's one case for intersection types: multiple upper bounds in generic functions (but I guess it applies to generic classes too). The documentation gives this example:

fun cloneWhenGreater<T : Comparable<T>>(list : List<T>, threshold : T) : List<T>
  Ā Ā where T : Cloneable {
Ā Ā return list when {it > threshold} map {it.clone()}
}

Here, T has to be a subtype of Comparable<T> and it also has to be a subtype of Cloneable. Which in fact means that T has to be a subtype of an intersection type Comparable<T> & Cloneable, right? I think I'd like that more than the where syntax, which feels more like a hack. But, if I remember correctly, there are other usecases for the where clause, so this itself wouldn't be enough to get rid of it :-(

I don't think it's really a case for intersection types. It's a case for a syntax, not for a new notion in type system

I have a couple of possible uses for union types in mind, but Iā€™m happy to hear of ways to do these things with existing (1.0.1) language features.

Firstly, from long-ago experience with Dylan, I recall that it can be useful to have explicit union types in order to put them in your API. If your API implements a union of, say, Int and String by having two overloads for every relevant method, then itā€™s possible to add a new method but forget to add one of the overloads. If you have an explicit union type, youā€™ll have to handle all contained types within your one, non-overloaded method, or the compiler will complain. This is even more useful if you can have a ā€œtypedefā€ of your union type.

Come to think of it, as I write this, I suppose another reason for union types is to be able to use them as a return type in an API. You canā€™t do that with overloads, because it may not be the case that, say, the overload which accepts Int always returns Int while the one which accepts String always returns String.

Secondly, it seems really useful for JavaScript interop. Iā€™ve been playing with calling PouchDB from Kotlin (as a complete Kotlin and PouchDB newbie :wink: ) and had problems because the function passed to a JS Promiseā€™s then method is allowed to return either a Promise or a value. Take the following definitions ā€¦

@native("Object")
class JSMap<T> {
    @nativeGetter
    operator fun get(key: String): T = noImpl

    @nativeSetter
    operator fun set(key: String, value: T): Unit = noImpl
}

fun jsobject(init: dynamic.() -> Unit): dynamic {
    return (JSMap<dynamic>()).apply(init)
}

@native
interface BulkQueryResult<T> {
    val total_rows: Int
    val offset: Int
    val rows: Array<BulkQueryRow<T>>
}

@native
interface BulkQueryRow<T> {
    val id: String
    val key: String
    val value: BulkQueryRowValue
    val doc: T?
}

@native
interface BulkQueryRowValue {
    val rev: String
    val deleted: Boolean?
}

@native
class PouchDB(var name: String, var options: JSMap<dynamic> = JSMap())
{
    fun put(doc: dynamic): Promise<dynamic> = noImpl
    fun get(id: String): Promise<dynamic> = noImpl
    fun <T> allDocs(options: JSMap<dynamic> = JSMap()): Promise<BulkQueryResult<T>> = noImpl
}

I want to be able to write the following (to find the set of edges not present in a DAG).

@native
abstract class Promise<T> {
    fun <U> then(result: (T) -> U): Promise<U>
    fun <U> then(result: (T) -> Promise<U>): Promise<U>
    fun catch(error: (T) -> Unit): Unit
}

fun proposeEdges(db: PouchDB): Promise<dynamic> {
    val allPossibleEdges: MutableCollection<Edge> = mutableListOf<Edge>()

    // Read all nodes and edges
    return db.allDocs<Node>(jsobject {
        startkey = "Node_"
        endkey = "Node_\uffff"
        include_docs = true
    }).then {
        val nodes = it.rows.mapNotNull { it.doc }
        // Find set of all possible edges
        for (from in nodes) {
            for (to in nodes) {
                allPossibleEdges.add(edge("from_${from._id}_to_${to._id}").apply {
                    this.from = from._id
                    this.to = to._id
                })
            }
        }
        it
    }.then {
        // Remove existing edges
        db.allDocs<Edge>(jsobject {
            startkey = "Edge_"
            endkey = "Edge_\uffff"
            include_docs = true
        })
    }.then {
        val edges = it.rows.mapNotNull { it.doc }
        allPossibleEdges.removeAll { possibleE ->
            edges.any { actualE ->
                actualE.from == possibleE.from &&
                        actualE.to == possibleE.to
            }
        }
    }

    // Return result
    return allPossibleEdges
}

but I have to write the following, explicitly disambiguating by name the ā€œoverloadsā€ which take functions returning U or Promise<U> (with no other changes).

@native
abstract class Promise<T> {
    @native("then") fun <U> thenV(result: (T) -> U): Promise<U>
    @native("then") fun <U> thenP(result: (T) -> Promise<U>): Promise<U>
    fun catch(error: (T) -> Unit): Unit
}

fun proposeEdges(db: PouchDB): Promise<dynamic> {
    val allPossibleEdges: MutableCollection<Edge> = mutableListOf<Edge>()

    // Read all nodes and edges
    return db.allDocs<Node>(jsobject {
        startkey = "Node_"
        endkey = "Node_\uffff"
        include_docs = true
    }).thenV {
        val nodes = it.rows.mapNotNull { it.doc }
        // Find set of all possible edges
       for (from in nodes) {
            for (to in nodes) {
                allPossibleEdges.add(edge("from_${from._id}_to_${to._id}").apply {
                    this.from = from._id
                    this.to = to._id
                })
            }
        }
        it
    }.thenP {
        // Remove existing edges
        db.allDocs<Edge>(jsobject {
            startkey = "Edge_"
            endkey = "Edge_\uffff"
            include_docs = true
        })
    }.thenV {
        val edges = it.rows.mapNotNull { it.doc }
        allPossibleEdges.removeAll { possibleE ->
            edges.any { actualE ->
                actualE.from == possibleE.from &&
                        actualE.to == possibleE.to
            }
        }
    }

    // Return result
    return allPossibleEdges
}

If I donā€™t do this, IDEA gives me errors because itā€™s inferring the ā€œwrongā€ return type for the lambda. Maybe this is just IDEA not being smart enough?

I think union types can be implemented in Kotlin not by changing the type-system, but by transforming at compile-time union declarations into sealed classes.

Iā€™ve been using sealed classes hierarchies to emulate union types (because I found them to be extremely useful in practice) but itā€™s very inconvenient to create type hierarchies all the time. Union types can be seen as simple ad-hoc sealed class hierarchies.

Suppose we want to emulate Either.

Currently, we need something like this:

sealed class Either<out S, out F>
{
    data class Success<out S>(val value: S) : Either<S, Nothing>()
    data class Failure<out F>(val failure: F) : Either<Nothing, F>()
}

Which can be used as:

val e: Either<String, Exception> = Either.Success("hi")
val a: Either<String, Exception> = Either.Failure(Exception("error"))

val s: String = when (e)
{
    is Either.Success<String>    -> e.value
    is Either.Failure<Exception> -> e.failure.toString()
}

If Kotlin supported union types, we could just do:

val e: String|Exception = "hi"
val a: String|Exception = Exception("error")

This is kind of the same as Optional VS null, null is better in Kotlin as you donā€™t need to wrap it into something and is just as safe because the compiler checks nullable types.

The when block would work the same. The Either type could be generated automatically as Union2<A, b> or something.

This gives the same power as sealed classes but avoid us having to create strict hierarchies every time we need something like this (which in my experience is a lot).

Intersection types are also really helpful when you try to model some complex domains. For example, I am now trying to model items which may or may not appear in a List. But some of them may or may not appear also in an enumeration (but the set of types that can go in a List is not exactly the same as the types can be part of an enumeration). So currently I need to model this using sealed classes + interfaces, but the interfaces cannot express that implementations must be one of the sealed class members, so modelling this reasonably in Kotlin is impossible. If Kotlin had intersection types, that would be trivial:

interface CanBeListItem
interface CanBeEnumeration

sealed class Item {
    data class A(..): Item(), CanBeListItem
    data class B(..): Item(), CanBeEnumeration // some types can be both
    data class ListItem(itemsType: Item & CanBeListItem): Item()
    data class EnumItem(enumsType: Item & CanBeEnumeration): Item()
}

The above is a oversimplification, but I hope my point comes across. I donā€™t think these use-cases are edge cases, I come across this kind of thing quite often! Hope others agree with me and you consider this option.

2 Likes

The last example somehow resembles a feature request I posted recently:
https://youtrack.jetbrains.com/issue/KT-19645.

The only difference is that I wanted to use generics instead to limit set of possible branches for ā€˜whenā€™ construct while passing sealed type as a subject.