A Go library implementing an FST (finite state transducer)——mark下

简介:

https://github.com/couchbaselabs/vellum

Building an FST

To build an FST, create a new builder using the New() method. This method takes an io.Writer as an argument. As the FST is being built, data will be streamed to the writer as soon as possible. With this builder you MUST insert keys in lexicographic order. Inserting keys out of order will result in an error. After inserting the last key into the builder, you MUST call Close() on the builder. This will flush all remaining data to the underlying writer.

In memory:

  var buf bytes.Buffer
  builder, err := vellum.New(&buf, nil) if err != nil { log.Fatal(err) }

To disk:

  f, err := os.Create("/tmp/vellum.fst") if err != nil { log.Fatal(err) } builder, err := vellum.New(f, nil) if err != nil { log.Fatal(err) }

MUST insert keys in lexicographic order:

err = builder.Insert([]byte("cat"), 1) if err != nil { log.Fatal(err) } err = builder.Insert([]byte("dog"), 2) if err != nil { log.Fatal(err) } err = builder.Insert([]byte("fish"), 3) if err != nil { log.Fatal(err) } err = builder.Close() if err != nil { log.Fatal(err) }

Using an FST

After closing the builder, the data can be used to instantiate an FST. If the data was written to disk, you can use the Open()method to mmap the file. If the data is already in memory, or you wish to load/mmap the data yourself, you can instantiate the FST with the Load() method.

Load in memory:

  fst, err := vellum.Load(buf.Bytes())
  if err != nil { log.Fatal(err) }

Open from disk:

  fst, err := vellum.Open("/tmp/vellum.fst") if err != nil { log.Fatal(err) }

Get key/value:

  val, exists, err = fst.Get([]byte("dog"))
  if err != nil { log.Fatal(err) } if exists { fmt.Printf("contains dog with val: %d\n", val) } else { fmt.Printf("does not contain dog") }

Iterate key/values:

  itr, err := fst.Iterator(startKeyInclusive, endKeyExclusive)
  for err == nil { key, val := itr.Current() fmt.Printf("contains key: %s val: %d", key, val) err = itr.Next() } if err != nil { log.Fatal(err) }

How does the FST get built?

A full example of the implementation is beyond the scope of this README, but let's consider a small example where we want to insert 3 key/value pairs.

First we insert "are" with the value 4.

step1

Next, we insert "ate" with the value 2.

step2

Notice how the values associated with the transitions were adjusted so that by summing them while traversing we still get the expected value.

At this point, we see that state 5 looks like state 3, and state 4 looks like state 2. But, we cannot yet combine them because future inserts could change this.

Now, we insert "see" with value 3. Once it has been added, we now know that states 5 and 4 can longer change. Since they are identical to 3 and 2, we replace them.

step3

Again, we see that states 7 and 8 appear to be identical to 2 and 3.

Having inserted our last key, we call Close() on the builder.

step4

Now, states 7 and 8 can safely be replaced with 2 and 3.

For additional information, see the references at the bottom of this document.















本文转自张昺华-sky博客园博客,原文链接:http://www.cnblogs.com/bonelee/p/6806129.html,如需转载请自行联系原作者


相关文章
Go REFLECT Library | 05 - reflect.Value 动态修变量值
Go REFLECT Library | 05 - reflect.Value 动态修变量值
Go REFLECT Library | 05 - reflect.Value 动态修变量值
|
存储 Go 索引
Go REFLECT Library | 03 - 反射的值 Value
Go REFLECT Library | 03 - 反射的值 Value
Go REFLECT Library | 03 - 反射的值 Value
|
自然语言处理 Java 编译器
Go REFLECT Library | 01 - 反射的类型 Type
Go REFLECT Library | 01 - 反射的类型 Type
Go REFLECT Library | 01 - 反射的类型 Type
Go REFLECT Library | 06 - reflect.Type 和 reflect.Value 应用
Go REFLECT Library | 06 - reflect.Type 和 reflect.Value 应用
|
JSON Go 数据格式
Go REFLECT Library | 04 - 反射的值 Value
Go REFLECT Library | 04 - 反射的值 Value
|
Go 索引
Go REFLECT Library | 02 - 反射的类型 Type
Go REFLECT Library | 02 - 反射的类型 Type
|
负载均衡 Go Unix
ALiyun LOG Go Consumer Library 快速入门及原理剖析
Aliyun LOG Go Consumer Library是用go语言编写的协同消费库,主要处理多个消费者同时消费logstore时自动分配shard问题。 其中消费组会自动处理shard的负载均衡,消费者failover等逻辑,这部分说明会在本篇下面的篇幅中进行详细的介绍。
11605 1
|
Go
Go Consumer Library发布
信息摘要: Go SDK提供Consumer Library模式,提供断点续传、多实例负载均衡、弹性伸缩等实时消费能力适用客户: Go语言开发者版本/规格功能: Aliyun LOG Go Consumer Library是用Go语言编写的协同消费库,支持多个消费者同时消费logstore。
1103 0
|
1天前
|
安全 大数据 Go
深入探索Go语言并发编程:Goroutines与Channels的实战应用
在当今高性能、高并发的应用需求下,Go语言以其独特的并发模型——Goroutines和Channels,成为了众多开发者眼中的璀璨明星。本文不仅阐述了Goroutines作为轻量级线程的优势,还深入剖析了Channels作为Goroutines间通信的桥梁,如何优雅地解决并发编程中的复杂问题。通过实战案例,我们将展示如何利用这些特性构建高效、可扩展的并发系统,同时探讨并发编程中常见的陷阱与最佳实践,为读者打开Go语言并发编程的广阔视野。
|
4天前
|
Go
golang语言之go常用命令
这篇文章列出了常用的Go语言命令,如`go run`、`go install`、`go build`、`go help`、`go get`、`go mod`、`go test`、`go tool`、`go vet`、`go fmt`、`go doc`、`go version`和`go env`,以及它们的基本用法和功能。
14 6
下一篇
DDNS