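Performance Differences Between Different Ways of Writing the Same Code

There are usually several ways to achieve the same result, and they differ in performance and readability. This article looks at the performance differences between them.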




len(str) == 0 vs str == ""


This section is based on:

[A Go question: what is the difference between len == 0 and == "" for a string?](segmentfault.com/a/119000003…)


package gotest
func Test1() bool {
  var v string
  if v == "" {
    return true
  }
  return false
}
func Test2() bool {
  var v string
  if len(v) == 0 {
    return true
  }
  return false
}
package gotest
import (
  "testing"
)
func BenchmarkTest1(b *testing.B) {
  for i := 0; i < b.N; i++ {
    Test1()
  }
}
func BenchmarkTest2(b *testing.B) {
  for i := 0; i < b.N; i++ {
    Test2()
  }
}

Run go test -test.bench=".*":

goos: darwin
goarch: amd64
pkg: note/performance
BenchmarkTest1-8        1000000000               0.467 ns/op
BenchmarkTest2-8        1000000000               0.464 ns/op
PASS
ok      note/performance        1.290s

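Line 4 of the output shows that BenchmarkTest1 ran 1000000000 times, averaging 0.467 ns per operation.

Line 5 shows that BenchmarkTest2 also ran 1000000000 times, averaging 0.464 ns per operation.

The last line shows a total run time of 1.290s.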


Use -count to specify how many times each benchmark is run, for example go test -test.bench=".*" -count=5:

goos: darwin
goarch: amd64
pkg: note/performance
BenchmarkTest1-8        1000000000               0.485 ns/op
BenchmarkTest1-8        1000000000               0.484 ns/op
BenchmarkTest1-8        1000000000               0.464 ns/op
BenchmarkTest1-8        1000000000               0.497 ns/op
BenchmarkTest1-8        1000000000               0.479 ns/op
BenchmarkTest2-8        1000000000               0.490 ns/op
BenchmarkTest2-8        1000000000               0.476 ns/op
BenchmarkTest2-8        1000000000               0.482 ns/op
BenchmarkTest2-8        1000000000               0.469 ns/op
BenchmarkTest2-8        1000000000               0.474 ns/op
PASS
ok      note/performance        5.791s

go test --bench=. -benchmem

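(Adding the -benchmem flag reports the number of bytes allocated per operation and the number of allocations per operation. Reference: "Go benchmark performance testing".)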

goos: darwin
goarch: amd64
pkg: note/performance
BenchmarkTest1-8        1000000000               0.471 ns/op           0 B/op          0 allocs/op
BenchmarkTest2-8        1000000000               0.462 ns/op           0 B/op          0 allocs/op
PASS
ok      note/performance        1.457s

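After several runs, the conclusions are:

<1>. There is almost no difference in performance.

<2>. Neither version allocates memory: both report 0 allocs/op. (This also shows that declaring a variable does not necessarily trigger any initialization work; the Go compiler optimizes it away.)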


A closer look at the assembly for the two functions shows where, if anywhere, they differ:

go tool compile -S gotest.go:

"".Test1 STEXT nosplit size=6 args=0x8 locals=0x0
        0x0000 00000 (gotest.go:3)      TEXT    "".Test1(SB), NOSPLIT|ABIInternal, $0-8
        0x0000 00000 (gotest.go:3)      PCDATA  $0, $-2
        0x0000 00000 (gotest.go:3)      PCDATA  $1, $-2
        0x0000 00000 (gotest.go:3)      FUNCDATA        $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (gotest.go:3)      FUNCDATA        $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (gotest.go:3)      FUNCDATA        $2, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (gotest.go:6)      PCDATA  $0, $0
        0x0000 00000 (gotest.go:6)      PCDATA  $1, $0
        0x0000 00000 (gotest.go:6)      MOVB    $1, "".~r0+8(SP)
        0x0005 00005 (gotest.go:6)      RET
        0x0000 c6 44 24 08 01 c3                                .D$...
"".Test2 STEXT nosplit size=6 args=0x8 locals=0x0
        0x0000 00000 (gotest.go:11)     TEXT    "".Test2(SB), NOSPLIT|ABIInternal, $0-8
        0x0000 00000 (gotest.go:11)     PCDATA  $0, $-2
        0x0000 00000 (gotest.go:11)     PCDATA  $1, $-2
        0x0000 00000 (gotest.go:11)     FUNCDATA        $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (gotest.go:11)     FUNCDATA        $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (gotest.go:11)     FUNCDATA        $2, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
        0x0000 00000 (gotest.go:14)     PCDATA  $0, $0
        0x0000 00000 (gotest.go:14)     PCDATA  $1, $0
        0x0000 00000 (gotest.go:14)     MOVB    $1, "".~r0+8(SP)
        0x0005 00005 (gotest.go:14)     RET
        0x0000 c6 44 24 08 01 c3                                .D$...
go.cuinfo.packagename. SDWARFINFO dupok size=0
        0x0000 67 6f 74 65 73 74                                gotest
go.loc."".Test1 SDWARFLOC size=0
go.info."".Test1 SDWARFINFO size=46
        0x0000 03 22 22 2e 54 65 73 74 31 00 00 00 00 00 00 00  ."".Test1.......
        0x0010 00 00 00 00 00 00 00 00 00 00 01 9c 00 00 00 00  ................
        0x0020 01 0f 7e 72 30 00 01 03 00 00 00 00 00 00        ..~r0.........
        rel 0+0 t=24 type.bool+0
        rel 10+8 t=1 "".Test1+0
        rel 18+8 t=1 "".Test1+6
        rel 28+4 t=30 gofile../Users/dashen/go/src/note/performance/gotest.go+0
        rel 40+4 t=29 go.info.bool+0
go.range."".Test1 SDWARFRANGE size=0
go.debuglines."".Test1 SDWARFMISC size=11
        0x0000 04 02 14 06 41 04 01 03 7b 06 01                 ....A...{..
go.loc."".Test2 SDWARFLOC size=0
go.info."".Test2 SDWARFINFO size=46
        0x0000 03 22 22 2e 54 65 73 74 32 00 00 00 00 00 00 00  ."".Test2.......
        0x0010 00 00 00 00 00 00 00 00 00 00 01 9c 00 00 00 00  ................
        0x0020 01 0f 7e 72 30 00 01 0b 00 00 00 00 00 00        ..~r0.........
        rel 0+0 t=24 type.bool+0
        rel 10+8 t=1 "".Test2+0
        rel 18+8 t=1 "".Test2+6
        rel 28+4 t=30 gofile../Users/dashen/go/src/note/performance/gotest.go+0
        rel 40+4 t=29 go.info.bool+0
go.range."".Test2 SDWARFRANGE size=0
go.debuglines."".Test2 SDWARFMISC size=13
        0x0000 04 02 03 08 14 06 41 04 01 03 73 06 01           ......A...s..
gclocals·33cdeccccebe80329f1fdbee7f5874cb SRODATA dupok size=8
        0x0000 01 00 00 00 00 00 00 00                          ........

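The generated assembly is identical for the two functions, so the Go compiler clearly optimizes this case: in both, the comparison has been folded away at compile time and the body is reduced to storing true (MOVB $1) into the return slot.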
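Because v is a local zero-value string, both functions collapse to a constant, so the benchmark above mostly measures loop overhead. Below is a minimal sketch of one way to make the two comparisons do real work: take the string as a parameter, prevent inlining, and store the result in a package-level variable. The names isEmptyByCompare, isEmptyByLen, sink and input are hypothetical and not part of the original code.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a package-level variable; writing results to it keeps the
// compiler from discarding the calls as dead code.
var sink bool

// input is a package-level variable read at run time, so the comparison
// cannot be folded into a constant.
var input = "hello"

//go:noinline
func isEmptyByCompare(s string) bool { return s == "" }

//go:noinline
func isEmptyByLen(s string) bool { return len(s) == 0 }

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	byCompare := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = isEmptyByCompare(input)
		}
	})
	byLen := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = isEmptyByLen(input)
		}
	})
	fmt.Println(`s == "":    `, byCompare)
	fmt.Println(`len(s) == 0:`, byLen)
}
```

Even in this form the two variants would be expected to time about the same; the sketch only makes sure that the thing being timed is the comparison itself.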


Generating a pprof profile:

go test -bench=".*" -cpuprofile=cpu.profile ../xxx (where xxx is the package directory)

This produces an xxx.test binary in the directory.

go tool pprof xxx.test cpu.profile:


File: performance.test
Type: cpu
Time: Apr 12, 2021 at 5:20pm (CST)
Duration: 1.23s, Total samples = 970ms (78.99%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) 
(pprof) o
  call_tree                 = false                
  compact_labels            = true                 
  cumulative                = flat                 //: [cum | flat]
  divide_by                 = 1                    
  drop_negative             = false                
  edgefraction              = 0.001                
  focus                     = ""                   
  granularity               = filefunctions        //: [addresses | filefunctions | files | functions | lines]
  hide                      = ""                   
  ignore                    = ""                   
  mean                      = false                
  nodecount                 = -1                   //: default
  nodefraction              = 0.005                
  noinlines                 = false                
  normalize                 = false                
  output                    = ""                   
  prune_from                = ""                   
  relative_percentages      = false                
  sample_index              = cpu                  //: [samples | cpu]
  show                      = ""                   
  show_from                 = ""                   
  tagfocus                  = ""                   
  taghide                   = ""                   
  tagignore                 = ""                   
  tagshow                   = ""                   
  trim                      = true                 
  trim_path                 = ""                   
  unit                      = minimum 

Run go tool pprof -web xxx.test cpu.profile to view the profile in a browser.


Performance of several int-to-string conversions

package shuang
import (
  "fmt"
  "strconv"
  "testing"
)
func BenchmarkSprintf(b *testing.B) {
  n := 10
  b.ResetTimer()
  for i := 0; i < b.N; i++ {
    fmt.Sprintf("%d", n)
  }
}
func BenchmarkItoa(b *testing.B) {
  n := 10
  b.ResetTimer()
  for i := 0; i < b.N; i++ {
    strconv.Itoa(n)
  }
}
func BenchmarkFormatInt(b *testing.B) {
  n := int64(10)
  b.ResetTimer()
  for i := 0; i < b.N; i++ {
    strconv.FormatInt(n, 10)
  }
}

Run go test -test.bench=".*" -benchmem:


goos: darwin
goarch: amd64
pkg: dashen
BenchmarkSprintf-8      14417409                75.9 ns/op            16 B/op          2 allocs/op
BenchmarkItoa-8         452276205                2.64 ns/op            0 B/op          0 allocs/op
BenchmarkFormatInt-8    492620018                2.42 ns/op            0 B/op          0 allocs/op
PASS
ok      dashen  4.518s

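Line 4 shows that BenchmarkSprintf-8 ran 14417409 times, averaging 75.9 ns per operation, with 2 allocations and 16 bytes allocated per operation.

Line 5 shows that BenchmarkItoa-8 ran 452276205 times, averaging 2.64 ns per operation, with no allocations.

Line 6 shows that BenchmarkFormatInt-8 ran 492620018 times, averaging 2.42 ns per operation, with no allocations.

The last line shows a total run time of 4.518s.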


So strconv.FormatInt(n, 10) and strconv.Itoa(n) perform about the same, and fmt.Sprintf() is the slowest.
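One caveat about the numbers above: n is 10, and the strconv implementations I am aware of serve small non-negative values (below 100) from a precomputed table, which would explain the 0 allocs/op. The following sketch, under that assumption, counts allocations for a small and a larger value with testing.AllocsPerRun. The helper names (allocsFor, viaSprintf, viaItoa, viaFormatInt) and the value 1234567 are hypothetical choices for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

// sink forces the converted string to escape, so that a heap
// allocation (if there is one) is actually counted.
var sink string

func viaSprintf(n int) string   { return fmt.Sprintf("%d", n) }
func viaItoa(n int) string      { return strconv.Itoa(n) }
func viaFormatInt(n int) string { return strconv.FormatInt(int64(n), 10) }

// allocsFor reports the average number of allocations per call of convert(n).
func allocsFor(convert func(int) string, n int) float64 {
	return testing.AllocsPerRun(1000, func() { sink = convert(n) })
}

func main() {
	for _, n := range []int{10, 1234567} {
		fmt.Printf("n=%d: Sprintf=%.0f Itoa=%.0f FormatInt=%.0f allocs/op\n",
			n,
			allocsFor(viaSprintf, n),
			allocsFor(viaItoa, n),
			allocsFor(viaFormatInt, n))
	}
}
```

If the larger value does show an allocation for Itoa and FormatInt, the ranking of the three approaches is unlikely to change, but the gap to fmt.Sprintf() would be smaller than the table above suggests.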

Reference: "Converting integers to strings in Golang"




Performance of several ways to concatenate strings


Concatenate the two strings "hello" and "world" into "hello,world":

package shuang
import (
  "bytes"
  "fmt"
  "strings"
  "testing"
)
func BenchmarkAddStringWithOperator(b *testing.B) {
  hello := "hello"
  world := "world"
  for i := 0; i < b.N; i++ {
    _ = hello + "," + world
  }
}
func BenchmarkAddStringWithSprintf(b *testing.B) {
  hello := "hello"
  world := "world"
  for i := 0; i < b.N; i++ {
    _ = fmt.Sprintf("%s,%s", hello, world)
  }
}
func BenchmarkAddStringWithJoin(b *testing.B) {
  hello := "hello"
  world := "world"
  for i := 0; i < b.N; i++ {
    _ = strings.Join([]string{hello, world}, ",")
  }
}
func BenchmarkAddStringWithBuffer(b *testing.B) {
  hello := "hello"
  world := "world"
  for i := 0; i < 1000; i++ {
    var buffer bytes.Buffer
    buffer.WriteString(hello)
    buffer.WriteString(",")
    buffer.WriteString(world)
    _ = buffer.String()
  }
}

Run go test -test.bench=".*" -benchmem:

goos: darwin
goarch: amd64
pkg: dashen
BenchmarkAddStringWithOperator-8        52448029                21.4 ns/op             0 B/op          0 allocs/op
BenchmarkAddStringWithSprintf-8          8755447               136 ns/op              48 B/op          3 allocs/op
BenchmarkAddStringWithJoin-8            31878931                37.0 ns/op            16 B/op          1 allocs/op
BenchmarkAddStringWithBuffer-8          1000000000               0.000104 ns/op        0 B/op          0 allocs/op
PASS
ok      dashen  4.420s

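Line 4 shows that BenchmarkAddStringWithOperator-8 ran 52448029 times, averaging 21.4 ns per operation, with no allocations.

Line 5 shows that BenchmarkAddStringWithSprintf-8 ran 8755447 times, averaging 136 ns per operation, with 3 allocations and 48 bytes allocated per operation.

Line 6 shows that BenchmarkAddStringWithJoin-8 ran 31878931 times, averaging 37.0 ns per operation, with 1 allocation and 16 bytes allocated per operation.

Line 7 shows that BenchmarkAddStringWithBuffer-8 ran 1000000000 times, averaging 0.000104 ns per operation, with no allocations.

The last line shows a total run time of 4.420s.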


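So fmt.Sprintf() and strings.Join() both allocate, and buffer.WriteString() appears to perform best (this last figure turns out to be a benchmark mistake, corrected later in this article).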


The assembly for the four approaches can be inspected with `go tool compile -S gotest.go`.


Further reading:

"Several ways of concatenating strings in Golang"


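Later I submitted a few changes to Go (go/ast: use strings.Builder, go/parser: use strings.Builder) and learned that bytes.Buffer does allocate and that strings.Builder is the recommended choice. Looking at the earlier code again, I found that the line for i := 0; i < 1000; i++ { in the benchmark was wrong: the loop ran a fixed 1000 times instead of b.N times, so the framework divided the cost of a constant amount of work by b.N, which produced the implausible 0.000104 ns/op and 0 allocs/op. The benchmark file after fixing this and adding strings.Builder: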

package shuang
import (
  "bytes"
  "fmt"
  "strings"
  "testing"
)
func BenchmarkAddStringWithOperator(b *testing.B) {
  hello := "hello"
  world := "world"
  for i := 0; i < b.N; i++ {
    _ = hello + "," + world
  }
}
func BenchmarkAddStringWithSprintf(b *testing.B) {
  hello := "hello"
  world := "world"
  for i := 0; i < b.N; i++ {
    _ = fmt.Sprintf("%s,%s", hello, world)
  }
}
func BenchmarkAddStringWithJoin(b *testing.B) {
  hello := "hello"
  world := "world"
  for i := 0; i < b.N; i++ {
    _ = strings.Join([]string{hello, world}, ",")
  }
}
func BenchmarkAddStringWithBuffer(b *testing.B) {
  hello := "hello"
  world := "world"
  //for i := 0; i < 1000; i++ { // with a fixed count like this, no allocations are reported
  for i := 0; i < b.N; i++ {
    var buffer bytes.Buffer
    buffer.WriteString(hello)
    buffer.WriteString(",")
    buffer.WriteString(world)
    _ = buffer.String()
  }
}
func BenchmarkAddStringWithBuilder(b *testing.B) {
  hello := "hello"
  world := "world"
  for i := 0; i < b.N; i++ {
    var sbuilder strings.Builder
    sbuilder.WriteString(hello)
    sbuilder.WriteString(",")
    sbuilder.WriteString(world)
    _ = sbuilder.String()
  }
}
goos: darwin
goarch: arm64
pkg: shuang/bc2
BenchmarkAddStringWithOperator-8        73952530                15.59 ns/op            0 B/op          0 allocs/op
BenchmarkAddStringWithSprintf-8         12973456                91.93 ns/op           48 B/op          3 allocs/op
BenchmarkAddStringWithJoin-8            44449520                26.82 ns/op           16 B/op          1 allocs/op
BenchmarkAddStringWithBuffer-8          36167272                32.81 ns/op           64 B/op          1 allocs/op
BenchmarkAddStringWithBuilder-8         36218533                32.60 ns/op           24 B/op          2 allocs/op
PASS
ok      shuang/bc2      6.839s

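strings.Builder was added to the strings package in Go 1.10, the release after Go 1.9.

strings.Builder appears to make one more allocation per operation than bytes.Buffer, but it allocates fewer bytes per operation (24 B versus 64 B).

strings.Join also uses strings.Builder internally, so why does it beat a hand-written strings.Builder on both time per operation and bytes allocated per operation? Something to look into when time permits.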
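One likely explanation, based on my reading of the strings package: strings.Join works out the final length first and calls Grow once before writing, so the result needs a single allocation, whereas a zero-value Builder starts with no capacity and has to grow while it is being written to. The sketch below compares the two with testing.AllocsPerRun; concatBuilder, concatGrownBuilder and sink are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// sink forces the result to escape so allocations are counted.
var sink string

// concatBuilder writes into a zero-value Builder, which grows on demand.
func concatBuilder(a, b string) string {
	var sb strings.Builder
	sb.WriteString(a)
	sb.WriteString(",")
	sb.WriteString(b)
	return sb.String()
}

// concatGrownBuilder reserves the final length up front before writing.
func concatGrownBuilder(a, b string) string {
	var sb strings.Builder
	sb.Grow(len(a) + 1 + len(b))
	sb.WriteString(a)
	sb.WriteByte(',')
	sb.WriteString(b)
	return sb.String()
}

func main() {
	plain := testing.AllocsPerRun(1000, func() { sink = concatBuilder("hello", "world") })
	grown := testing.AllocsPerRun(1000, func() { sink = concatGrownBuilder("hello", "world") })
	fmt.Printf("zero-value Builder: %.0f allocs/op\n", plain)
	fmt.Printf("Builder with Grow:  %.0f allocs/op\n", grown)
}
```

If this explanation holds, the version with Grow should report the same single allocation as strings.Join in the table above.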


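Everything above only builds hello,world over and over. There is another kind of concatenation, appending (append rather than add), where each loop iteration appends a new string to the end of the existing one.

For example, appending the string "cuishuang," 10000 times (in the benchmarks below the number of appends is b.N):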

package shuang
import (
  "bytes"
  "fmt"
  "strings"
  "testing"
)
func BenchmarkAddStringWithOperator(b *testing.B) {
  hello := "hello"
  world := "world"
  for i := 0; i < b.N; i++ {
    _ = hello + "," + world
  }
}
func BenchmarkAddStringWithSprintf(b *testing.B) {
  hello := "hello"
  world := "world"
  for i := 0; i < b.N; i++ {
    _ = fmt.Sprintf("%s,%s", hello, world)
  }
}
func BenchmarkAddStringWithJoin(b *testing.B) {
  hello := "hello"
  world := "world"
  for i := 0; i < b.N; i++ {
    _ = strings.Join([]string{hello, world}, ",")
  }
}
func BenchmarkAddStringWithBuffer(b *testing.B) {
  hello := "hello"
  world := "world"
  //for i := 0; i < 1000; i++ { // with a fixed count like this, no allocations are reported
  for i := 0; i < b.N; i++ {
    var buffer bytes.Buffer
    buffer.WriteString(hello)
    buffer.WriteString(",")
    buffer.WriteString(world)
    _ = buffer.String()
  }
}
func BenchmarkAddStringWithBuilder(b *testing.B) {
  hello := "hello"
  world := "world"
  for i := 0; i < b.N; i++ {
    var sbuilder strings.Builder
    sbuilder.WriteString(hello)
    sbuilder.WriteString(",")
    sbuilder.WriteString(world)
    _ = sbuilder.String()
  }
}
// Appending inside a loop
func BenchmarkAppendWithAdd(b *testing.B) {
  var s string
  for i := 0; i < b.N; i++ {
    s = s + "cuishuang,"
    // used to check that all variants produce the same result
    //if i == 10 {
    //  fmt.Println("s is:", s)
    //}
  }
}
func BenchmarkAppendWithSprintf(b *testing.B) {
  var s string
  for i := 0; i < b.N; i++ {
    s = fmt.Sprintf("%s%s", s, "cuishuang,")
    // used to check that all variants produce the same result
    //if i == 10 {
    //  fmt.Println("s is:", s)
    //}
  }
}
func BenchmarkAppendWithJoin(b *testing.B) {
  var s string
  for i := 0; i < b.N; i++ {
    s = strings.Join([]string{s, "cuishuang,"}, "")
    // used to check that all variants produce the same result
    //if i == 10 {
    //  fmt.Println("s is:", s)
    //}
  }
  _ = s
}
func BenchmarkAppendWithBytesBuffer(b *testing.B) {
  var byt bytes.Buffer
  for i := 0; i < b.N; i++ {
    byt.WriteString("cuishuang,")
    // used to check that all variants produce the same result
    //if i == 10 {
    //  fmt.Println("s is:", s)
    //}
  }
  byt.String()
}
func BenchmarkAppendWithStringBuilder(b *testing.B) {
  var sbuilder strings.Builder
  for i := 0; i < b.N; i++ {
    sbuilder.WriteString("cuishuang,")
    // used to check that all variants produce the same result
    //if i == 10 {
    //  fmt.Println("s is:", s)
    //}
  }
  sbuilder.String()
}
goos: darwin
goarch: arm64
pkg: shuang/bc2
BenchmarkAddStringWithOperator-8        76988056                15.62 ns/op            0 B/op          0 allocs/op
BenchmarkAddStringWithSprintf-8         12866815                97.66 ns/op           48 B/op          3 allocs/op
BenchmarkAddStringWithJoin-8            43484365                27.59 ns/op           16 B/op          1 allocs/op
BenchmarkAddStringWithBuffer-8          35710210                32.96 ns/op           64 B/op          1 allocs/op
BenchmarkAddStringWithBuilder-8         34601161                33.51 ns/op           24 B/op          2 allocs/op
BenchmarkAppendWithAdd-8                  317286             95134 ns/op         1590508 B/op          1 allocs/op
BenchmarkAppendWithSprintf-8               95326             84243 ns/op          959952 B/op          3 allocs/op
BenchmarkAppendWithJoin-8                 272533            100585 ns/op         1366737 B/op          1 allocs/op
BenchmarkAppendWithBytesBuffer-8        100000000               12.74 ns/op           31 B/op          0 allocs/op
BenchmarkAppendWithStringBuilder-8      100000000               10.83 ns/op           57 B/op          0 allocs/op
PASS
ok      shuang/bc2      75.146s

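Because a string is immutable, every concatenation with "+" allocates new space, concatenates, and copies, which is very expensive when there is a lot of data. With Buffer and similar approaches, the data lives in an underlying byte slice that is extended by appending, and if the total length is known in advance the space can be reserved up front. The slice is converted to a string only at the end. This avoids allocating over and over, and reduces both the memory used and the number of copies, so performance is naturally much better.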
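In the Append benchmarks above the accumulated string keeps growing across all b.N iterations, so the reported ns/op depends on how large b.N happens to get. A sketch of an alternative measurement, assuming a fixed 10000 appends per operation as mentioned in the text, is shown below. The names appendWithPlus, appendWithBuilder, piece, count and sink are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

const (
	piece = "cuishuang,"
	count = 10000 // appends per operation, as in the text above
)

// sink keeps the results alive so the work is not optimized away.
var sink string

// appendWithPlus builds a new, longer string on every iteration.
func appendWithPlus(n int) string {
	var s string
	for i := 0; i < n; i++ {
		s += piece
	}
	return s
}

// appendWithBuilder reserves the final size once and appends in place.
func appendWithBuilder(n int) string {
	var sb strings.Builder
	sb.Grow(n * len(piece))
	for i := 0; i < n; i++ {
		sb.WriteString(piece)
	}
	return sb.String()
}

func main() {
	plus := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = appendWithPlus(count)
		}
	})
	builder := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = appendWithBuilder(count)
		}
	})
	fmt.Println("+ operator:     ", plus, plus.MemString())
	fmt.Println("strings.Builder:", builder, builder.MemString())
}
```

Here one operation always means building the same 10000-piece string, so the ns/op and B/op of the two approaches can be compared directly.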

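Reference: "Go strings: Buffer and Builder"

In general, strings.Builder performs slightly better than bytes.Buffer.

One reason is that the String() method of bytes.Buffer, which turns the byte slice into a string at the end, is a plain conversion from byte slice to string, and that conversion has to allocate new space and copy the data: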

// To build strings more efficiently, see the strings.Builder type.
func (b *Buffer) String() string {
    if b == nil {
        // Special case, useful in debugging.
        return "<nil>"
    }
    return string(b.buf[b.off:])
}

strings.Builder uses *(*string)(unsafe.Pointer(&b.buf)): with unsafe.Pointer as the intermediary, the program bypasses the type system and reinterprets the address instead of copying:

// String returns the accumulated string.
func (b *Builder) String() string {
    return *(*string)(unsafe.Pointer(&b.buf))
}

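In other words, when the byte slice is finally converted to a string, bytes.Buffer allocates a new block of memory to hold the resulting string, whereas strings.Builder uses *(*string)(unsafe.Pointer(&byteSli)) to return the underlying []byte directly as a string, which saves the allocation.

See the section on converting a byte slice to a string below.


Further reading:

"Why strings.Builder is faster than bytes.Buffer when converting to a string"

"Go strings.Builder and bytes.Buffer"

"Performance comparison of Go bytes.Buffer and strings.Builder"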




Converting a byte slice to a string

package main
import "fmt"
func main() {
  str := `{"default":{"common":{"pet":{"five":"斑斑","four":"皮瓜瓜","one":"弥弥懵","three":"呆呆","two":"黄橙橙"},"relation":{"father":"cuixxxxxxx","mother":"yinxxxxx","wife":"pengxx"}}}}`
  fmt.Println([]byte(str))
}

First, obtain the byte slice.

Output:

[123 34 100 101 102 97 117 108 116 34 58 123 34 99 111 109 109 111 110 34 58 123 34 112 101 116 34 58 123 34 102 105 118 101 34 58 34 230 150 145 230 150 145 34 44 34 102 111 117 114 34 58 34 231 154 174 231 147 156 231 147 156 34 44 34 111 110 101 34 58 34 229 188 165 229 188 165 230 135 181 34 44 34 116 104 114 101 101 34 58 34 229 145 134 229 145 134 34 44 34 116 119 111 34 58 34 233 187 132 230 169 153 230 169 153 34 125 44 34 114 101 108 97 116 105 111 110 34 58 123 34 102 97 116 104 101 114 34 58 34 99 117 105 120 120 120 120 120 120 120 34 44 34 109 111 116 104 101 114 34 58 34 121 105 110 120 120 120 120 120 34 44 34 119 105 102 101 34 58 34 112 101 110 103 120 120 34 125 125 125 125]

byteSliToStr.go:

package main
import (
  "testing"
  "unsafe"
)
/**
  Original string
`{"default":{"common":{"pet":{"five":"斑斑","four":"皮瓜瓜","one":"弥弥懵","three":"呆呆","two":"黄橙橙"},"relation":{"father":"cuixxxxxxx","mother":"yinxxxxx","wife":"pengxx"}}}}`
*/
func BenchmarkString(b *testing.B) {
  byteSli := []byte{123, 34, 100, 101, 102, 97, 117, 108, 116, 34, 58, 123, 34, 99, 111, 109, 109, 111, 110, 34, 58, 123, 34, 112, 101, 116, 34, 58, 123, 34, 102, 105, 118, 101, 34, 58, 34, 230, 150, 145, 230, 150, 145, 34, 44, 34, 102, 111, 117, 114, 34, 58, 34, 231, 154, 174, 231, 147, 156, 231, 147, 156, 34, 44, 34, 111, 110, 101, 34, 58, 34, 229, 188, 165, 229, 188, 165, 230, 135, 181, 34, 44, 34, 116, 104, 114, 101, 101, 34, 58, 34, 229, 145, 134, 229, 145, 134, 34, 44, 34, 116, 119, 111, 34, 58, 34, 233, 187, 132, 230, 169, 153, 230, 169, 153, 34, 125, 44, 34, 114, 101, 108, 97, 116, 105, 111, 110, 34, 58, 123, 34, 102, 97, 116, 104, 101, 114, 34, 58, 34, 99, 117, 105, 120, 120, 120, 120, 120, 120, 120, 34, 44, 34, 109, 111, 116, 104, 101, 114, 34, 58, 34, 121, 105, 110, 120, 120, 120, 120, 120, 34, 44, 34, 119, 105, 102, 101, 34, 58, 34, 112, 101, 110, 103, 120, 120, 34, 125, 125, 125, 125}
  _ = string(byteSli)
}
func BenchmarkUnsafe(b *testing.B) {
  byteSli := []byte{123, 34, 100, 101, 102, 97, 117, 108, 116, 34, 58, 123, 34, 99, 111, 109, 109, 111, 110, 34, 58, 123, 34, 112, 101, 116, 34, 58, 123, 34, 102, 105, 118, 101, 34, 58, 34, 230, 150, 145, 230, 150, 145, 34, 44, 34, 102, 111, 117, 114, 34, 58, 34, 231, 154, 174, 231, 147, 156, 231, 147, 156, 34, 44, 34, 111, 110, 101, 34, 58, 34, 229, 188, 165, 229, 188, 165, 230, 135, 181, 34, 44, 34, 116, 104, 114, 101, 101, 34, 58, 34, 229, 145, 134, 229, 145, 134, 34, 44, 34, 116, 119, 111, 34, 58, 34, 233, 187, 132, 230, 169, 153, 230, 169, 153, 34, 125, 44, 34, 114, 101, 108, 97, 116, 105, 111, 110, 34, 58, 123, 34, 102, 97, 116, 104, 101, 114, 34, 58, 34, 99, 117, 105, 120, 120, 120, 120, 120, 120, 120, 34, 44, 34, 109, 111, 116, 104, 101, 114, 34, 58, 34, 121, 105, 110, 120, 120, 120, 120, 120, 34, 44, 34, 119, 105, 102, 101, 34, 58, 34, 112, 101, 110, 103, 120, 120, 34, 125, 125, 125, 125}
  _ = *(*string)(unsafe.Pointer(&byteSli))
}

bench_test.go:

package main
import (
  "testing"
)
func BenchmarkTest1(b *testing.B) {
  for i := 0; i < b.N; i++ {
    BenchmarkString(b)
  }
}
func BenchmarkTest2(b *testing.B) {
  for i := 0; i < b.N; i++ {
    BenchmarkUnsafe(b)
  }
}

Run go test -test.bench=".*" -benchmem:

goos: darwin
goarch: arm64
pkg: xxxx
BenchmarkTest1-8        16376076                61.51 ns/op          192 B/op          1 allocs/op
BenchmarkTest2-8        34398655                33.49 ns/op            0 B/op          0 allocs/op
PASS
ok      xxxx      2.363s

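Line 4 shows that BenchmarkTest1 (which calls BenchmarkString) ran 16376076 times, averaging 61.51 ns per operation, with 1 allocation and 192 bytes allocated per operation.

Line 5 shows that BenchmarkTest2 (which calls BenchmarkUnsafe) ran 34398655 times, averaging 33.49 ns per operation, with no allocations.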


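So the unsafe "black magic" really does save one allocation.

It also shows that string(byteSli) is a deep copy: a new block of memory is allocated for the newly created string.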
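The price of skipping the copy is that the string and the byte slice share memory, so a later write to the slice can show through in a string that is supposed to be immutable. The sketch below illustrates this with the same pointer cast used above; copyToString and aliasToString are hypothetical helper names. (Go 1.20 and later also provide unsafe.String and unsafe.SliceData for this kind of conversion.)

```go
package main

import (
	"fmt"
	"unsafe"
)

// copyToString makes an independent copy of the bytes.
func copyToString(b []byte) string { return string(b) }

// aliasToString reinterprets the slice header as a string header without
// copying. The returned string shares memory with b, so b must not be
// modified for as long as the string is in use.
func aliasToString(b []byte) string { return *(*string)(unsafe.Pointer(&b)) }

func main() {
	data := []byte("hello")
	copied := copyToString(data)
	aliased := aliasToString(data)

	data[0] = 'j' // mutate the original bytes

	fmt.Println("copied: ", copied)  // unaffected by the write
	fmt.Println("aliased:", aliased) // may now reflect the write
}
```

This is why the zero-copy conversion is only appropriate when the byte slice is guaranteed not to change afterwards, which is the guarantee strings.Builder gives by never exposing its internal buffer.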




Converting a string to a byte slice


Now the reverse of the operation above.

strToByteSli.go:

package main
import (
  "reflect"
  "testing"
  "unsafe"
)
func BenchmarkByteStyle(b *testing.B) {
  str := `{"default":{"common":{"pet":{"five":"斑斑","four":"皮瓜瓜","one":"弥弥懵","three":"呆呆","two":"黄橙橙"},"relation":{"father":"cuixxxxxxx","mother":"yinxxxxx","wife":"pengxx"}}}}`
  _ = []byte(str)
}
func BenchmarkWithUnsafe(b *testing.B) {
  str := `{"default":{"common":{"pet":{"five":"斑斑","four":"皮瓜瓜","one":"弥弥懵","three":"呆呆","two":"黄橙橙"},"relation":{"father":"cuixxxxxxx","mother":"yinxxxxx","wife":"pengxx"}}}}`
  sh := (*reflect.StringHeader)(unsafe.Pointer(&str))
  bh := reflect.SliceHeader{
    Data: sh.Data,
    Len:  sh.Len,
    Cap:  sh.Len,
  }
  _ = *(*[]byte)(unsafe.Pointer(&bh))
}

bench_test.go:

package main
import (
  "testing"
)
func BenchmarkTest3(b *testing.B) {
  for i := 0; i < b.N; i++ {
    BenchmarkByteStyle(b)
  }
}
func BenchmarkTest4(b *testing.B) {
  for i := 0; i < b.N; i++ {
    BenchmarkWithUnsafe(b)
  }
}

Run go test -test.bench=".*" -benchmem:

goos: darwin
goarch: arm64
pkg: xxxx
BenchmarkTest3-8        34892566                34.03 ns/op          192 B/op          1 allocs/op
BenchmarkTest4-8        1000000000               0.3148 ns/op          0 B/op          0 allocs/op
PASS
ok      xxxx      2.873s

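Line 4 shows that BenchmarkTest3 (which calls BenchmarkByteStyle) ran 34892566 times, averaging 34.03 ns per operation, with 1 allocation and 192 bytes allocated per operation.

Line 5 shows that BenchmarkTest4 (which calls BenchmarkWithUnsafe) ran 1000000000 times, averaging 0.3148 ns per operation, with no allocations.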


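Using unsafe not only saves one allocation, it also makes the average time per operation about 100 times shorter. (Going from []byte to string, by contrast, the unsafe version was only about twice as fast as string(byteSli).)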
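reflect.StringHeader and reflect.SliceHeader are deprecated as of Go 1.20, which added unsafe.StringData and unsafe.Slice for this purpose. A minimal sketch of the same zero-copy conversion, assuming Go 1.20 or later, is shown below; the function name stringToBytesNoCopy is hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringToBytesNoCopy returns a byte slice that shares memory with s.
// Requires Go 1.20 or later. The result must be treated as read-only:
// string data may live in read-only memory, and writing to it can crash
// the program.
func stringToBytesNoCopy(s string) []byte {
	if s == "" {
		return nil // unsafe.StringData is unspecified for the empty string
	}
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	s := `{"default":"value"}`
	b := stringToBytesNoCopy(s)
	fmt.Println(len(b) == len(s), string(b) == s)
}
```

The performance characteristics should be comparable to the header-based version above, since neither copies the data; the difference is that this form does not rely on the layout of the deprecated header structs.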

Reference: "Does converting between string and []byte copy memory?"

