Encode/Gob

寒江蓑笠翁 · about 13 min read · Tech Log · go, Golang, Encoding


The encoding library that ships with the Go standard library


Introduction

Package gob manages streams of gobs - binary values exchanged between an Encoder (transmitter) and a Decoder (receiver). A typical use is transporting arguments and results of remote procedure calls (RPCs) such as those provided by net/rpc.

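As the package documentation says, gob is a binary, stream-oriented encoding for Go types that is built into the standard library. Its main use is as the wire format of the net/rpc package, but nothing stops us from using it in our own projects.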

Usage

package main

import (
    "bytes"
    "encoding/gob"
    "fmt"
)

type Peron struct {
    Name   string
    Age    int
    Salary float64
    Home   []string
}

func main() {
    mike := Peron{
       Name:   "mike",
       Age:    18,
       Salary: 5000,
       Home:   []string{"USA"},
    }

    buffer := bytes.NewBuffer(nil)
    err := gob.NewEncoder(buffer).Encode(mike)
    if err != nil {
       panic(err)
    }

    fmt.Printf("mike is %T, value: %+v\n", mike, mike)
    var mikeClone Peron
    err = gob.NewDecoder(buffer).Decode(&mikeClone)
    if err != nil {
       panic(err)
    }
    fmt.Printf("mike clone is %T, value: %+v\n", mikeClone, mikeClone)
}

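gob exposes a very general streaming Encoder/Decoder interface. It is simple to use and works natively with Go types, although it supports Go types only. It has many possible uses, for example:

  1. Encoding data into a binary format for storage
  2. Carrying messages in RPC network communication
  3. Local caching, and so on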
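
Because the interface is stream-oriented, one Encoder can write many values to the same writer and one Decoder can read them back in order. The following is a minimal sketch of that pattern for the storage use case. The `Record` type and the `encodeAll`, `decodeAll`, and `roundTrip` helpers are names invented for this illustration; an in-memory buffer stands in for a file or a network connection.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"io"
)

// Record is a hypothetical type used only for this illustration.
type Record struct {
	Name string
	Age  int
}

// encodeAll writes every record to w through a single Encoder.
func encodeAll(w io.Writer, records []Record) error {
	enc := gob.NewEncoder(w)
	for _, r := range records {
		if err := enc.Encode(r); err != nil {
			return err
		}
	}
	return nil
}

// decodeAll reads records from r until Decode reports io.EOF.
func decodeAll(r io.Reader) ([]Record, error) {
	dec := gob.NewDecoder(r)
	var out []Record
	for {
		var rec Record
		err := dec.Decode(&rec)
		if err == io.EOF {
			return out, nil
		}
		if err != nil {
			return nil, err
		}
		out = append(out, rec)
	}
}

// roundTrip encodes the records into a buffer and decodes them back.
func roundTrip(records []Record) ([]Record, error) {
	var buf bytes.Buffer
	if err := encodeAll(&buf, records); err != nil {
		return nil, err
	}
	return decodeAll(&buf)
}

func main() {
	out, err := roundTrip([]Record{{Name: "mike", Age: 18}, {Name: "lucy", Age: 20}})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out)
}
```

Since `encodeAll` and `decodeAll` take an `io.Writer` and an `io.Reader`, the same code should work unchanged with an `*os.File` or a `net.Conn`.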

Caveats

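Do not assume that gob is safe to use everywhere just because it is in the standard library, and do not reach for it as a drop-in replacement for JSON. It has quite a few pitfalls; read through the ones below before deciding whether to adopt it widely.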

A concrete type cannot be decoded into an interface

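If you serialize a struct and want to deserialize it into an interface type, it will not work: an interface on the receiving side can only be filled from a value that the sending side also encoded as an interface.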

package main

import (
    "encoding/gob"
    "fmt"
    "os"
)

type User struct {
    Name string
    Age  int
}

func main() {

    data := User{Name: "Alice", Age: 30}
    file, _ := os.Create("data.gob")
    encoder := gob.NewEncoder(file)
    encoder.Encode(data)
    file.Close()

    file, _ = os.Open("data.gob")
    decoder := gob.NewDecoder(file)

    var decodedData interface{}
    err := decoder.Decode(&decodedData)
    if err != nil {
        fmt.Println("decode failed:", err)
        return
    }

    // Recover the data with a type assertion
    if user, ok := decodedData.(User); ok {
        fmt.Printf("decode succeeded: Name=%s, Age=%d\n", user.Name, user.Age)
    }
}

This produces the following error:

decode failed: gob: local interface type *interface {} can only be decoded from remote interface type; received concrete type User = struct { Name string; Age int; }

gob sees that the sender transmitted a concrete struct rather than an interface value, so it refuses to decode it into any.
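
For comparison, here is a minimal sketch of the variant that does work, assuming you control both sides: register the concrete type and pass a pointer to the interface variable to Encode, so that gob transmits an interface value instead of a bare struct. The `roundTripAny` helper is a name made up for this example, and a buffer replaces the file.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

type User struct {
	Name string
	Age  int
}

func init() {
	// The concrete type must be registered so it can travel inside an interface.
	gob.Register(User{})
}

// roundTripAny encodes v as an interface value and decodes it back into one.
func roundTripAny(v interface{}) (interface{}, error) {
	var buf bytes.Buffer
	// Passing &v (pointer to the interface) tells gob to encode an interface
	// value; passing v directly would encode the concrete struct instead.
	if err := gob.NewEncoder(&buf).Encode(&v); err != nil {
		return nil, err
	}
	var out interface{}
	if err := gob.NewDecoder(&buf).Decode(&out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	out, err := roundTripAny(User{Name: "Alice", Age: 30})
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	if user, ok := out.(User); ok {
		fmt.Printf("decoded: Name=%s, Age=%d\n", user.Name, user.Age)
	}
}
```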

Exchanged types must be registered on both ends

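Concrete types that the two sides of an RPC exchange as interface values must be registered on both ends. gob transmits the type's registered name along with the value, and if that name is missing from the local type registry, the value cannot be encoded or decoded.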

func init() {
    gob.Register(remote.User{})
}

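Concrete types that you pass directly to Encode and Decode need no registration; Register is only required for concrete types that travel inside interface values.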
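
A small sketch of the registration rule, under the assumption that the payload is carried in an interface-typed field. `Envelope`, `Square`, `Circle`, and `roundTrip` are hypothetical names for this illustration. `Square` is registered and `Circle` deliberately is not, so the second call is expected to come back with an error.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Envelope carries an arbitrary payload in an interface-typed field.
type Envelope struct {
	Payload interface{}
}

type Square struct{ Side float64 }
type Circle struct{ Radius float64 }

func init() {
	// Only Square is registered; Circle is left out on purpose.
	gob.Register(Square{})
}

// roundTrip encodes an Envelope and decodes it again.
func roundTrip(e Envelope) (Envelope, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(e); err != nil {
		return Envelope{}, err
	}
	var out Envelope
	if err := gob.NewDecoder(&buf).Decode(&out); err != nil {
		return Envelope{}, err
	}
	return out, nil
}

func main() {
	got, err := roundTrip(Envelope{Payload: Square{Side: 2}})
	fmt.Println("registered payload:", got.Payload, err)

	_, err = roundTrip(Envelope{Payload: Circle{Radius: 1}})
	fmt.Println("unregistered payload:", err)
}
```

In a real RPC setup the encoder and decoder live in different processes, so the `gob.Register` call has to appear in both programs.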

Only exported fields are serialized

gob serializes exported fields only. Suppose one of your structs has a field of an exported struct type whose own fields are all private:

type User struct {
    Name string
    Age  int
    M    Man
}

type Man struct {
    n int
}

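Encoding it then fails with:

panic: gob: type main.Man has no exported fields

Note that gob reports this through the error returned by Encode; the panic: prefix above comes from passing that error to panic(err), as the earlier examples do. The failure is still easy to miss. It only shows up at run time, so if you ignore the returned error, or embed a type from a third-party library that happens to contain a struct with only private fields you did not know about, you find out in production rather than at compile time.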
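
To check how this failure surfaces, the sketch below encodes the `User` and `Man` types from above and prints whatever Encode returns instead of calling `panic(err)`. The `encodeUser` helper is a name invented for this example. In the encoding/gob implementation as I understand it, the problem arrives as the returned error, so a caller that checks it can handle the case without recover.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

type User struct {
	Name string
	Age  int
	M    Man
}

// Man has only an unexported field, so gob has nothing to send for it.
type Man struct {
	n int
}

// encodeUser encodes u into a throwaway buffer and returns Encode's error.
func encodeUser(u User) error {
	var buf bytes.Buffer
	return gob.NewEncoder(&buf).Encode(u)
}

func main() {
	err := encodeUser(User{Name: "Alice", Age: 30, M: Man{n: 1}})
	if err != nil {
		// Handle the failure as an ordinary error.
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println("encoded successfully")
}
```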

A pointer to a zero value is treated as nil

package main

import (
    "bytes"
    "encoding/gob"
    "fmt"

    "github.com/samber/lo"
)

type Peron struct {
    Name   string
    Age    int
    Salary float64
    Home   []string
    Wife   *string // pointer to a string
}

func main() {
    mike := Peron{
       Name:   "mike",
       Age:    18,
       Salary: 5000,
       Home:   []string{"USA"},
       Wife:   lo.ToPtr(""), // empty string
    }

    buffer := bytes.NewBuffer(nil)
    err := gob.NewEncoder(buffer).Encode(mike)
    if err != nil {
       panic(err)
    }

    fmt.Printf("mike is %T, value: %+v\n", mike, mike)
    var mikeClone Peron
    err = gob.NewDecoder(buffer).Decode(&mikeClone)
    if err != nil {
       panic(err)
    }
    fmt.Printf("mike clone is %T, value: %+v\n", mikeClone, mikeClone)
}

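The field is already treated as empty at serialization time: gob flattens pointers and omits zero-valued fields, so a pointer to "" is never sent and the decoded field stays nil.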

mike is main.Peron, value: {Name:mike Age:18 Salary:5000 Home:[USA] Wife:0xc000024650}
mike clone is main.Peron, value: {Name:mike Age:18 Salary:5000 Home:[USA] Wife:<nil>}

You might find that reasonable enough for strings, but what about floating-point numbers, such as latitude and longitude?

package main

import (
    "bytes"
    "encoding/gob"
    "fmt"

    "github.com/samber/lo"
)

type Peron struct {
    Name      string
    Age       int
    Salary    float64
    Home      []string
    Wife      *string
    Latitude  *float64
    Longitude *float64
}

func main() {
    mike := Peron{
       Name:      "mike",
       Age:       18,
       Salary:    5000,
       Home:      []string{"USA"},
       Wife:      lo.ToPtr(""),
       Latitude:  lo.ToPtr(0.0),
       Longitude: lo.ToPtr(0.0),
    }

    buffer := bytes.NewBuffer(nil)
    err := gob.NewEncoder(buffer).Encode(mike)
    if err != nil {
       panic(err)
    }

    fmt.Printf("mike is %T, value: %+v\n", mike, mike)
    var mikeClone Peron
    err = gob.NewDecoder(buffer).Decode(&mikeClone)
    if err != nil {
       panic(err)
    }
    fmt.Printf("mike clone is %T, value: %+v\n", mikeClone, mikeClone)
}

The coordinate (0.0, 0.0) is a perfectly valid value. If you happen to represent it with pointers, you will find that your values have been "lost":

mike is main.Peron, value: {Name:mike Age:18 Salary:5000 Home:[USA] Wife:0xc000024650 Latitude:0xc00000a128 Longitude:0xc00000a170}
mike clone is main.Peron, value: {Name:mike Age:18 Salary:5000 Home:[USA] Wife:<nil> Latitude:<nil> Longitude:<nil>}
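
One possible workaround, sketched here as an assumption rather than an established gob idiom, is to replace the pointer with a small struct that carries an explicit presence flag, similar in spirit to `sql.NullFloat64`. The `OptFloat` and `Location` types and the `roundTrip` helper are hypothetical. Because the flag is `true` when the value is set, it is a non-zero field and is transmitted even when the number itself is 0.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// OptFloat is a hypothetical optional float: Set records whether Value is meaningful.
type OptFloat struct {
	Value float64
	Set   bool
}

// Location uses OptFloat instead of *float64 for its coordinates.
type Location struct {
	Latitude  OptFloat
	Longitude OptFloat
}

// roundTrip encodes a Location with gob and decodes it again.
func roundTrip(in Location) (Location, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return Location{}, err
	}
	var out Location
	if err := gob.NewDecoder(&buf).Decode(&out); err != nil {
		return Location{}, err
	}
	return out, nil
}

func main() {
	// (0.0, 0.0) is explicitly marked as present on both axes.
	in := Location{
		Latitude:  OptFloat{Value: 0, Set: true},
		Longitude: OptFloat{Value: 0, Set: true},
	}
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	fmt.Printf("in:  %+v\nout: %+v\n", in, out)
}
```

The trade-off is that callers now check `Set` instead of comparing against nil, but "present and zero" and "absent" remain distinguishable after decoding.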

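Is the performance good?

Talking about performance without data is pointless, so let's go straight to a benchmark and look at the code and the numbers.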

package main

import (
	"bytes"
	"encoding/gob"
	"encoding/json"
	"fmt"
	"math/rand"
	"testing"
	"time"

	"github.com/bytedance/sonic"
	"github.com/vmihailenco/msgpack/v5"
)

// User is the data structure used in the benchmarks
type User struct {
	ID        int       `json:"id" msgpack:"id"`
	Name      string    `json:"name" msgpack:"name"`
	Email     string    `json:"email" msgpack:"email"`
	CreatedAt time.Time `json:"created_at" msgpack:"created_at"`
	Active    bool      `json:"active" msgpack:"active"`
	Balance   float64   `json:"balance" msgpack:"balance"`
}

// generateUsers generates random user data
func generateUsers(count int) []User {
	users := make([]User, count)
	for i := 0; i < count; i++ {
		users[i] = User{
			ID:        i + 1,
			Name:      fmt.Sprintf("User %d", i+1),
			Email:     fmt.Sprintf("user%d@example.com", i+1),
			CreatedAt: time.Now().Add(-time.Duration(rand.Intn(365)) * 24 * time.Hour),
			Active:    rand.Intn(2) == 0,
			Balance:   rand.Float64() * 10000,
		}
	}
	return users
}

// Benchmark functions
func BenchmarkSerialization(b *testing.B) {
	// Test several data sizes
	dataSizes := []int{1000, 10000, 100000}

	for _, size := range dataSizes {
		users := generateUsers(size)

		b.Run(fmt.Sprintf("JSON_Encode=%d", size), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				_, err := json.Marshal(users)
				if err != nil {
					b.Fatal(err)
				}
			}
		})

		b.Run(fmt.Sprintf("Sonic_Encode=%d", size), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				_, err := sonic.Marshal(users)
				if err != nil {
					b.Fatal(err)
				}
			}
		})

		b.Run(fmt.Sprintf("Gob_Encode=%d", size), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				var buf bytes.Buffer
				enc := gob.NewEncoder(&buf)
				err := enc.Encode(users)
				if err != nil {
					b.Fatal(err)
				}
			}
		})

		b.Run(fmt.Sprintf("MsgPack_Encode=%d", size), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				_, err := msgpack.Marshal(users)
				if err != nil {
					b.Fatal(err)
				}
			}
		})
	}
}

func BenchmarkDeserialization(b *testing.B) {
	dataSizes := []int{1000, 10000, 100000}

	for _, size := range dataSizes {
		users := generateUsers(size)

		// Prepare the serialized data
		jsonData, _ := json.Marshal(users)
		sonicData, _ := sonic.Marshal(users)

		var gobBuf bytes.Buffer
		gobEnc := gob.NewEncoder(&gobBuf)
		gobEnc.Encode(users)
		gobData := gobBuf.Bytes()

		msgpackData, _ := msgpack.Marshal(users)

		b.Run(fmt.Sprintf("JSON_Decode=%d", size), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				var result []User
				err := json.Unmarshal(jsonData, &result)
				if err != nil {
					b.Fatal(err)
				}
			}
		})

		b.Run(fmt.Sprintf("Sonic_Decode=%d", size), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				var result []User
				err := sonic.Unmarshal(sonicData, &result)
				if err != nil {
					b.Fatal(err)
				}
			}
		})

		b.Run(fmt.Sprintf("Gob_Decode=%d", size), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				var result []User
				buf := bytes.NewBuffer(gobData)
				dec := gob.NewDecoder(buf)
				err := dec.Decode(&result)
				if err != nil {
					b.Fatal(err)
				}
			}
		})

		b.Run(fmt.Sprintf("MsgPack_Decode=%d", size), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				var result []User

				err := msgpack.Unmarshal(msgpackData, &result)
				if err != nil {
					b.Fatal(err)
				}
			}
		})
	}
}

Results

goos: windows
goarch: amd64
pkg: golearn
cpu: 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz
BenchmarkSerialization
BenchmarkSerialization/JSON_Encode=1000
BenchmarkSerialization/JSON_Encode=1000-16         	    1886	    612071 ns/op
BenchmarkSerialization/Sonic_Encode=1000
BenchmarkSerialization/Sonic_Encode=1000-16        	    4869	    244013 ns/op
BenchmarkSerialization/Gob_Encode=1000
BenchmarkSerialization/Gob_Encode=1000-16          	    5115	    229241 ns/op
BenchmarkSerialization/MsgPack_Encode=1000
BenchmarkSerialization/MsgPack_Encode=1000-16      	    3966	    297808 ns/op
BenchmarkSerialization/JSON_Encode=10000
BenchmarkSerialization/JSON_Encode=10000-16        	     198	   6096757 ns/op
BenchmarkSerialization/Sonic_Encode=10000
BenchmarkSerialization/Sonic_Encode=10000-16       	     278	   4261545 ns/op
BenchmarkSerialization/Gob_Encode=10000
BenchmarkSerialization/Gob_Encode=10000-16         	     355	   3190153 ns/op
BenchmarkSerialization/MsgPack_Encode=10000
BenchmarkSerialization/MsgPack_Encode=10000-16     	     327	   4316431 ns/op
BenchmarkSerialization/JSON_Encode=100000
BenchmarkSerialization/JSON_Encode=100000-16       	      16	  64004156 ns/op
BenchmarkSerialization/Sonic_Encode=100000
BenchmarkSerialization/Sonic_Encode=100000-16      	      30	  36431373 ns/op
BenchmarkSerialization/Gob_Encode=100000
BenchmarkSerialization/Gob_Encode=100000-16        	      45	  25052718 ns/op
BenchmarkSerialization/MsgPack_Encode=100000
BenchmarkSerialization/MsgPack_Encode=100000-16    	      38	  31971308 ns/op
BenchmarkDeserialization
BenchmarkDeserialization/JSON_Decode=1000
BenchmarkDeserialization/JSON_Decode=1000-16       	     862	   1400651 ns/op
BenchmarkDeserialization/Sonic_Decode=1000
BenchmarkDeserialization/Sonic_Decode=1000-16      	    3397	    344001 ns/op
BenchmarkDeserialization/Gob_Decode=1000
BenchmarkDeserialization/Gob_Decode=1000-16        	    6163	    212270 ns/op
BenchmarkDeserialization/MsgPack_Decode=1000
BenchmarkDeserialization/MsgPack_Decode=1000-16    	    2448	    501270 ns/op
BenchmarkDeserialization/JSON_Decode=10000
BenchmarkDeserialization/JSON_Decode=10000-16      	      82	  14697723 ns/op
BenchmarkDeserialization/Sonic_Decode=10000
BenchmarkDeserialization/Sonic_Decode=10000-16     	     372	   3311510 ns/op
BenchmarkDeserialization/Gob_Decode=10000
BenchmarkDeserialization/Gob_Decode=10000-16       	     673	   1859699 ns/op
BenchmarkDeserialization/MsgPack_Decode=10000
BenchmarkDeserialization/MsgPack_Decode=10000-16   	     243	   4851535 ns/op
BenchmarkDeserialization/JSON_Decode=100000
BenchmarkDeserialization/JSON_Decode=100000-16     	       7	 144560900 ns/op
BenchmarkDeserialization/Sonic_Decode=100000
BenchmarkDeserialization/Sonic_Decode=100000-16    	      42	  28476936 ns/op
BenchmarkDeserialization/Gob_Decode=100000
BenchmarkDeserialization/Gob_Decode=100000-16      	      70	  17558089 ns/op
BenchmarkDeserialization/MsgPack_Decode=100000
BenchmarkDeserialization/MsgPack_Decode=100000-16  	      24	  48077071 ns/op
PASS

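The numbers are indeed good. In this benchmark gob is faster than sonic at every data size for both encoding and decoding, and at 100000 elements it is roughly 2.5x faster than encoding/json at encoding and about 8x faster at decoding.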
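
One detail worth keeping in mind when reading these numbers: the Gob_Encode benchmark creates a new Encoder on every iteration, and gob sends a description of each type once per stream before the first value of that type. The sketch below makes that cost visible by measuring how many bytes each Encode call adds to a shared buffer. `Record` and `encodedSizes` are names invented for this illustration, and no particular byte counts are claimed, only that the first message should be larger than the following ones.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Record is a hypothetical type used only for this illustration.
type Record struct {
	Name string
	Age  int
}

// encodedSizes encodes the records through one Encoder and returns
// the number of bytes each Encode call appended to the stream.
func encodedSizes(records ...Record) ([]int, error) {
	var buf bytes.Buffer
	enc := gob.NewEncoder(&buf)
	sizes := make([]int, 0, len(records))
	for _, r := range records {
		before := buf.Len()
		if err := enc.Encode(r); err != nil {
			return nil, err
		}
		sizes = append(sizes, buf.Len()-before)
	}
	return sizes, nil
}

func main() {
	r := Record{Name: "mike", Age: 18}
	sizes, err := encodedSizes(r, r, r)
	if err != nil {
		panic(err)
	}
	// The first entry includes the type description; later ones carry data only.
	fmt.Println("bytes per Encode call:", sizes)
}
```

For long-lived streams such as an RPC connection, reusing one Encoder therefore amortizes the type description over many messages.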

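Conclusion

That said, if you really need RPC communication, I recommend Protobuf: it performs better and is more portable. If you are after extreme performance, I recommend FlatBuffers. If you just want an out-of-the-box binary encoding library for Go types with decent performance, will never need to interoperate with other languages, and can accept the issues described above, then gob is a reasonable choice.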

Contributors: 246859