文档结构  
翻译进度:35%     翻译赏金:0 元 (?)    ¥ 我要打赏

什么是泛型? 为什么它们被认为有用? 为什么Go没有泛型? Gophers应该用什么替代?

本文研究泛型的本质,并探索可用于解决这种编程范式缺失的各种技术。

更新:标题中的“...”是一个简单的英语省略号! 非常感谢读者指出,在这种情况下,“...”可以(并且将)被理解为Go的省略号。 在这种情况下,标题当然完全是垃圾。

首先,一个重要的说明

Go中有关泛型的问题是多年以前的问题,已经在Go论坛,新闻组和电子邮件列表中进行了讨论。 重新讨论这个问题,不是本文的目标。 我认为所有的都可以说,而且也已经说过。

第 1 段(可获 1.8 积分)

这里有一个关于讨论情况的很好总结。

本文只专注于实现其他语言尝试用泛型解决的一些目标的替代方法。

让我们从简要地看待泛型的动机以及可能的缺点开始。

问题

许多数据结构和算法适用于一系列数据类型。 例如,可以定义排序树来保存int类型,或 map [string] string或某种结构类型的元素。 排序算法能够排序其元素彼此可比的任何类型。 所以,如果你需要,比如说,一个对字符串排序的树,你可以坐下来写一个。 足够简单!

第 2 段(可获 1.45 积分)

但是如果你需要一个排序树用于许多不同类型呢? 树数据类型带有几个方法,如插入,查找,删除等。如果有一个有N个方法,M种元素类型的树要实现,你需要实现N×M个方法! 听起来像乏味,重复的工作。

我们不能让编译器这样做吗?

输入泛型

为了解决这个问题,许多编程语言有一个称为“泛型”的概念。 泛型的关键是以一种抽象的形式,用不透明的类型占位符而不是真实的类型来编写代码一次,就像这个Java例子:

第 3 段(可获 1.3 积分)
// Java pseudo code
public class SortedTree<E implements Comparable<E>> {
	void insert(E comparableSortTreeElement) {
		//...
	}
}

E是一个类型占位符; 它可以出现在类或接口定义以及函数参数列表中。 SortedTree使用了通过占位符替换的实际类型:

// Java pseudo code (again)

SortedTree<String> sortedTreeOfStrings = new SortedTree<String>();
sortedTreeOfStrings.insert("abcde");

上面这些行使用字符串元素实例化排序树。 insert 方法只接受了字符串,排序算法便使用字符串比较方法。 如果我们使用Integer替换,比如inSortedTree <Integer>,树将是一个整数的排序树。

第 4 段(可获 0.85 积分)

总而言之,泛型允许用类型参数定义类和函数。 类型参数可以限于某个子集(例如,参数T必须是可比较的),否则它们是非特定的类型。 结果是一个代码模板,可以根据需要应用于不同的类型。

缺点

虽然泛型可能派上用场,他们也有一些字符串附加。

  1. 性能:将通用代码模板转换为实际代码需要时间,无论是在编译时还是在运行时。 正像Russ Cox在2009年说:通常的困境是:你想要慢程序员,慢编译器和膨胀的二进制文件,还是缓慢的执行时间?(“慢程序员”部分是指根本没有泛型,也没有任何合适的替代)。
  2. 复杂性:泛型本身并不复杂,但是当与其他语言特性(例如继承)集成时,或者当开发人员开始创建嵌套泛型,
// C#
List<Dictionary<string<IEnumerable<HttpRequest>>>>

(感谢Jonathan Oliver的这个漂亮的例子),或者看似递归继承,

// Java
public abstract class Enum<E extends Enum<E>>

(取自 Angelika Langer所著的Java泛型常见问题)。

 

第 5 段(可获 2.2 积分)

Go没有泛型

我们都知道,Go是故意设计的很简单,并且泛型被认为增加了一种语言的复杂性(见上一节)。 因此,除了继承,多态性和当时的“最先进的”语言的一些其他功能,泛型被从Go的特性列表中删除。

事实上,Go确实有一些泛型 ……

Go中确实有一些“通用”语言结构。

首先,有三种通用的数据类型,你可以使用(可能已经这样做,但没有注意到):

  • 数组
  • 切片
  • 映射
第 6 段(可获 1.24 积分)

所有这些都可以在任意元素类型上实例化。 甚至map类型的键和值都是如此。 这使得map相当通用。 例如,map [string] struct {}可以用作一个Set类型,其中每个元素都是唯一的。

第二,一些内置函数可以对一系列数据类型进行操作,这使得它们几乎像泛型函数一样运行。 例如,len()适用于字符串,数组,切片和映射。

虽然这绝对不是“真实的东西”,但是有可能是那些“内部泛型”已经涵盖了你的需求。

第 7 段(可获 1.11 积分)

但如果项目确实对泛型有迫切需求时,我们该如何应对?

如果不是完全无法使用泛型,你可能会认为一些问题用泛型解决起来会非常方便。那你还能做些什么呢?

幸运的是,有一些可供选择的替代方案。

1. 重新审视需求

让我们回头去看看最初的需求,再次审视你的技术或功能说明文档(你应该有一份)。文档真的需要用到泛型吗?与那些在设计上就支持泛型的基于类型系统的语言相比,Go有不同的设计哲学:

如果说C++和Java是专注于类型的层次结构和类型分类的语言,那Go则专注于组合。(Rob Pike)

第 8 段(可获 1.39 积分)

So think of the paradigms that come with Go–most notably composition and embedding–and verify if any of these would help approaching the problem in a way more natural to Go.

2. Consider copy & paste

This may sound like a foolish advice (and if applied improperly, it surely is), but do not hastily reject it.

Here is the point:

Every time you think that you need to create a generic object, do a quick Litmus test: Ask yourself, “How many times would I ever have to instantiate this generic object in my application or library?” In other words, is it worth to construct a generic object when there may only be one or two actual implementations of it? In this case, creating a generic object would just be an over-abstraction.

第 9 段(可获 1.64 积分)

(At this point, I cannot help but thinking of Joel Spolsky’s witty article onArchitecture Astronauts who build abstractions on top of other abstractions until they run out of oxygen. (Small caveat: the article is from 2001. Expect a couple of references to outdated software concepts of which you never may have heard if you are young enough.)) <– Yes, I nested two parens here. So?

An excellent real-life example can be found right in the standard library. The two packages strings and bytes have almost identical API’s, yet no generics were used in the making of these packages (or so I strongly guess).

Remember that this article tries to remain neutral and takes no stand on generics. I don’t want to indicate that copy&paste is better than generics. I only want to encourage you to seek out other options even if they look strange at a first glance.

第 10 段(可获 1.86 积分)

3. Use interfaces

Interfaces define behavior without requiring any implementation details. This is ideal for defining ‘generic’ behavior:

  • Find a set of basic operations that your generic algorithm or data container can use to process the data.
  • Define an interface containing these operations.
  • To ‘instantiate’ your generic entity on a given data type, implement the interface methods for that type.

The sort package is an example for this technique. It contains a sort interface (appropriately called sort.Interface) that declares three methods: Len()Less(), and Swap().

type Interface interface {
	Len() int
	Less(i, j int) bool
	Swap(i, j int)
}

By just implementing these three methods for a data container (say, a slice of structs), sort.Sort() can be applied to any kind of data, as long as the data has a reasonable definition of ‘less than’.

A nice aspect here is that the code of sort.Sort() does not know anything about the data it sorts, and actually it does not have to. It simply relies on the three interface methods LenLess, and Swap.

第 11 段(可获 1.85 积分)

There are already a couple of examples in the sort package doc, and the following technique makes heavy use of an empty interface, so I’ll skip the code part and move right on to type assertions.

4. Use type assertions

Generic container types usually do not need to care much about the actual type of their contents. (Except maybe for basic operations like sorting, but we already know a solution for that.) Hence the value can be stored in the container as a ‘type without properties’. Such a type is already built into Go: the empty interface, declared as interface{}. This interface has no particular behavior, hence objects with any behavior satisfy this interface.

第 12 段(可获 1.43 积分)

It is quite easy to build a container type based on interface{}. We only need a way to recover the actual data type when reading elements from the container. For that purpose, Go has type assertions. Here is an example that implements a generic container object.

To keep the code short and concise, the container object is kept super-simple. It features only two methods, Put and Get.

Container is a generic container, accepting anything.

type Container []interface{}

Put adds an element to the container.

func (c *Container) Put(elem interface{}) {
	*c = append(*c, elem)
}
第 13 段(可获 0.99 积分)

Get gets an element from the container.

func (c *Container) Get() interface{} {
	elem := (*c)[0]
	*c = (*c)[1:]
	return elem
}

The calling code does the type assertion when retrieving an element.

func assertExample() {
	intContainer := &Container{}
	intContainer.Put(7)
	intContainer.Put(42)
	elem, ok := intContainer.Get().(int) // assert that the actual type is int
	if !ok {
		fmt.Println("Unable to read an int from intContainer")
	}
	fmt.Printf("assertExample: %d (%T)\n", elem, elem)
}

This looks so straightforward that we might easily forget the downsides of this technique. We give up compile-time type checking here, exposing the application to increased risk of type-related failure at runtime. Also, conversions to and from interfaces come with a cost. And finally, all ‘magic’ happens outside our Container type, at the caller’s level. Usually you would rather want a technique that hides the type conversion mechanisms from the caller.

第 14 段(可获 1.13 积分)

5. Use reflection

Remember the generic dilemma from the ‘downsides’ section? One of the three options to pick from was “slow execution times”. The technique discussed below is a manifestation of this option. Yes, I am talking about reflection.

Reflection allows a program to work with objects whose types are not known at compile time. Putting performance concerns aside, this sounds like a good fit for solving the generics problem, so let’s give it a try.

The code below implements a simple generic container. You can see a lot of type checking going on via the reflect package. Worth noting: A couple of standard operations do not work on variables of type reflect.Value even if the actual type would allow this operation. You need to substitute these operations with methods from reflect. I added comments to the parts of the code where this occurs.

第 15 段(可获 1.78 积分)

Cabinet has one field, s, that holds a slice of a given type. (As the name Container is already taken, I had to choose another name.)

type Cabinet struct {
	s reflect.Value
}

NewCabinet creates a new Cabinet struct where s holds a slice of type []t. Formally, sremains a reflect.Value as defined in the struct. The code needs to deal with that fact in some places below.

func NewCabinet(t reflect.Type) *Cabinet {
	return &Cabinet{
		s: reflect.MakeSlice(reflect.SliceOf(t), 0, 10), // cap is arbitrary, we need to pass one here
	}
}

Set sets the element at index i to the passed-in value.

第 16 段(可获 0.84 积分)
func (c *Cabinet) Put(val interface{}) {

The passed-in val must match the type of the elements of slice s. Otherwise, Put panics, as this is a programmer error.

	if reflect.ValueOf(val).Type() != c.s.Type().Elem() {
		panic(fmt.Sprintf("Put: cannot put a %T into a slice of %s", val, c.s.Type().Elem()))
	}

AppendSlice is a replacement for the builtinappend function, which fails on a reflect.Valueeven if the actual value’s type is a slice.

	c.s = reflect.Append(c.s, reflect.ValueOf(val))
}

Get gets the element at index i. There is no way (or so it seems) to have a function return areflect.Value type that turns into the actual type of the returned data. Hence the Get function has a second parameter that must be a reference of the receiving variable. SeereflectExample().

func (c *Cabinet) Get(retref interface{}) {

Index(i) replaces the index operator [i] as s is only a reflect.Value (even though it effectively contains a slice).

	retref = c.s.Index(0)
	c.s = c.s.Slice(1, c.s.Len())
}
func reflectExample() {
	f := 3.14152
	g := 0.0
	c := NewCabinet(reflect.TypeOf(f))

Try c.Put(0, “blabla”) to see the type check panicking

	c.Put(f)

The syntax g = c.Get(0) is not possible, see the comment on Get().

	c.Get(&g)
	fmt.Printf("reflectExample: %f (%T)\n", g, g)
}

Frankly, I really had a hard time wrapping my head around the semantics of the reflect package until I got this code running. 

第 17 段(可获 1.86 积分)

And I realized that when moving from real types to the level ofreflect.Value and reflect.Type, a couple of things are not possible anymore. For example, you need replacement functions for append(), the[i] operator, etc, and there is no way to return a reflect.Value and have it magically turn into the actual type that it contains. (That’s why Get() needs to receive a reference to fill with the return value.)

I feel that I have put most of the time of writing this article into the Reflection code. A Go proverb says, “Clear is better than clever”, but this code is not only anything but clear, it is also far from being clever.

My $0.02 on generics through reflection?

Avoid.

第 18 段(可获 1.46 积分)

6. Use a code generator

Far better than reflection IMHO is code generation. The concept is rather simple:

  • Write a template file that uses generic ‘mock’ types. These types are just placeholders for the real types.
  • Run the file through a converter. The converter identifies the mock types and replaces them by real types.

To illustrate this, here is a quick walkthrough using genny. (There are also other converters with similar features out there, like genericgengengen, and more.)

So let’s write another generic container. Whenever we need a generic type, we can declare one like so:

第 19 段(可获 1.24 积分)
type APlaceholder generic.Type

APlaceholder then can be used in the template code like any other type. The template code compiles without problems, and it is even possible to write tests against this code. This is how our container template looks like:

package capsule

import "github.com/cheekybits/genny/generic"

type Item generic.Type

type ItemCapsule struct {
	s []Item
}

func NewItemCapsule() *ItemCapsule {
	return &ItemCapsule{s: []Item{}}
}

func (c *ItemCapsule) Put(val Item) {
	c.s = append(c.s, val)
}

func (c *ItemCapsule) Get() Item {
	r := c.s[0]
	c.s = c.s[1:]
	return r
}

(When you go get the code of this article, this template code is incapsule/capsule.go.)

In the main file, generics.go, we want to have a Capsule containing uint32 elements. So we place a go:generate directive at the top of the file:

//go:generate genny -in=template.go -out=uint32capsule.go gen "Item=uint32"

Before compiling the main binary, calling go generate executes genny, which produces an output file where Item is replaced by uint32. Note that this also applies to names that contain the string ‘Item’ - like ‘ItemCapsule’ in our example. This way, it is possible to create multiple replacements from the same template without name conflicts. (For example, a Uint32Capsule and a StringCapsule.)

第 20 段(可获 1.64 积分)

After running go generate generics.go, we can find the fileuint32capsule.go in the project dir with the following code:

Generated code

type Uint32Capsule struct {
	s []uint32
}

func NewUint32Capsule() *Uint32Capsule {
	return &Uint32Capsule{s: []uint32{}}
}

func (c *Uint32Capsule) Put(val uint32) {
	c.s = append(c.s, val)
}

func (c *Uint32Capsule) Get() uint32 {
	r := c.s[0]
	c.s = c.s[1:]
	return r
}

Now we can write the calling function as if the Uint32Capsule was handcoded. Everything is just plain, clear, idiomatic Go.

func generateExample() {
	var u uint32 = 42
	c := NewUint32Capsule()
	c.Put(u)
	v := c.Get()
	fmt.Printf("generateExample: %d (%T)\n", v, v)
}

For completness of our test code, here is the main function.

main simply runs all examples.

func main() {
	assertExample()
	reflectExample()
	generateExample()
}

Get the complete test code from GitHub:

    go get -d github.com/appliedgo/generics
    cd $GOPATH/src/github.com/appliedgo/generics
    go build
    ./generics

You don’t need to get genny nor run go generate if you just want to run the examples in generics.go. The code generated by genny is included in generics.go.

第 21 段(可获 1.05 积分)

Summary

Let me do a quick roundup of the examined techniques.

Requirements review 

  • Pro: Makes you think about your design.
  • Con: Likely to apply to only a small set of problem types.

Copy&Paste

  • Pros:
    • Quick
    • Needs no external libraries or tools
  • Cons:
    • Code bloat
    • Breaks the DRY principle

Interfaces (with methods)

  • Pros:
    • Clean and idiomatic
    • Needs no external libraries or tools
  • Cons:
    • Additional coding required - each interface method needs to be implemented for each target type.
    • Runtime overhead (however maybe not that dramatic)

Type assertions (with empty interfaces)

  • Pros:
    • Code stays quite clear
    • Needs no external libraries or tools
  • Cons:
    • No compile-time type checking
    • Runtime overhead from converting to and from interfaces
    • Caller is required to do the type assertion
第 22 段(可获 1.53 积分)

Reflection

  • Pros:
    • Versatile
    • Needs no external libraries or tools
  • Cons:
    • Anything but clear
    • No compile-time type checking
    • Considerable runtime overhead

Code generation

  • Pros:
    • Very clear code possible (depending on the tool), both in the templates and in the generated code
    • Compile-time type checking
    • Some tools even allow writing tests against the generic code template
    • No runtime overhead
  • Cons:
    • Possible binary bloat, if a template gets instantiated many times for different target types (but how often does this happen in your apps?)
    • The build process requires an extra step and a third party tool (but as far as I have seen, these tools usually are go gettable)

That’s it for this time. I hope you enjoyed reading, even though this article ended up being quite long and without any images (sorry for that). I try to keep it shorter in the future. Until next time!

第 23 段(可获 1.8 积分)

文章评论