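Kiali is a service mesh topology visualization tool. It uses the observability metrics collected by the mesh to draw the traffic flows in the mesh as an intuitive topology graph. Kiali provides an interactive web-based GUI that shows how the microservices in a cluster interact and connect at several levels (application, version, workload).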
Kiali's topology graph (Graph, hereafter simply "the graph") is rendered with Cytoscape.js.
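Cytoscape.js is a widely used library for visualization on the web. It contains a graph theory model and an optional renderer for displaying interactive graphs. The library aims to make it as easy as possible for programmers and scientists to use graph theory in their applications, whether for server-side analysis in a Node.js application or for a rich user interface.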
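After the frontend issues a request, how does Kiali fetch the data and turn it into the final topology graph? This article walks through the graph generation flow based on the Kiali source code.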
When a user opens the page, the frontend first sends a request to /api/namespaces/graph. The route for /api/namespaces/graph is defined at line 1061 of routing/routes.go.
The request is then handled by the GraphNamespaces handler in handlers/graph.go, which does two main things:
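- Builds the Options from the request parameters
- Obtains the business.Layer from the authentication info. The Layer wraps the clients used to access each type of resource, including k8s, Prometheus, and Jaeger.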
Options:

    type Options struct {
        ConfigVendor    string
        TelemetryVendor string
        ConfigOptions
        TelemetryOptions
    }
ConfigOptions:

    type ConfigOptions struct {
        BoxBy string
        CommonOptions
    }
TelemetryOptions:

    type TelemetryOptions struct {
        AccessibleNamespaces map[string]time.Time
        Appenders            RequestedAppenders // requested appenders, nil if param not supplied
        IncludeIdleEdges     bool               // include edges with request rates of 0
        InjectServiceNodes   bool               // inject destination service nodes between source and destination nodes.
        Namespaces           NamespaceInfoMap
        Rates                RequestedRates
        CommonOptions
        NodeOptions
    }
CommonOptions:

    type CommonOptions struct {
        Duration  time.Duration
        GraphType string
        Params    url.Values // make available the raw query params for vendor-specific handling
        QueryTime int64      // unix time in seconds
    }
The request is then passed on to GraphNamespaces in the api package.
In a mesh environment, TelemetryVendor is "Istio", so graphNamespacesIstio is called.
It in turn calls BuildNamespacesTrafficMap in the istio package to generate the final TrafficMap (simply a map from string to *Node).
Node definition:

    type Node struct {
        ID        string   // unique identifier for the node
        NodeType  string   // Node type
        Cluster   string   // Cluster
        Namespace string   // Namespace
        Workload  string   // Workload (deployment) name
        App       string   // Workload app label value
        Version   string   // Workload version label value
        Service   string   // Service name
        Edges     []*Edge  // child nodes
        Metadata  Metadata // app-specific data
    }

    type Edge struct {
        Source   *Node
        Dest     *Node
        Metadata Metadata // app-specific data
    }
BuildNamespacesTrafficMap works in two steps:
- Generate a traffic map for each namespace
- Build the global traffic map and convert it
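The TrafficMap of each namespace is built by buildNamespaceTrafficMap. For ordinary HTTP/gRPC requests, it constructs a query string for each kind of traffic metric (there are three) and uses the promQuery function to fetch the required metrics.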
- Incoming: query source telemetry to capture unserviced namespace services' incoming traffic
- Incoming: query destination telemetry to capture namespace services' incoming traffic
- Outgoing: query source telemetry to capture namespace workloads' outgoing traffic
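To make the idea of "constructing a query string" concrete, here is a simplified sketch of how the second and third queries might be assembled. `rateQuery` is a hypothetical helper, and the label filters and the group-by list are assumptions that are much shorter than what Kiali actually sends; only the metric name `istio_requests_total` and its `reporter` label come from Istio's standard metrics.

```go
package main

import "fmt"

// groupBy is a shortened, assumed list of labels to aggregate on.
const groupBy = "source_workload,destination_workload,response_code"

// rateQuery builds a PromQL expression for the per-second request rate over
// the given window, filtered by reporter and by one namespace label.
// It is an illustrative helper, not a Kiali function.
func rateQuery(reporter, nsLabel, namespace string, windowSeconds int) string {
	return fmt.Sprintf(`sum(rate(istio_requests_total{reporter=%q,%s=%q}[%ds])) by (%s)`,
		reporter, nsLabel, namespace, windowSeconds, groupBy)
}

func main() {
	// Incoming traffic, as reported by the destination side.
	fmt.Println(rateQuery("destination", "destination_workload_namespace", "bookinfo", 60))
	// Outgoing traffic, as reported by the source side.
	fmt.Println(rateQuery("source", "source_workload_namespace", "bookinfo", 60))
}
```

Querying both reporters is what lets the graph include traffic where only one side of the connection produces telemetry.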
The result of a PromQL query is a Vector, i.e. a slice of Sample pointers. All Samples in a Vector share the same timestamp.
Sample definition:

    type Sample struct {
        Metric    Metric      `json:"metric"`
        Value     SampleValue `json:"value"`
        Timestamp Time        `json:"timestamp"`
    }
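Given the Vector, the populateTrafficMap function extracts the individual parameters from the metric labels.
In populateTrafficMap, TS is short for timestamp.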
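The extraction step can be pictured with the sketch below. To stay free of external dependencies it defines local stand-ins for the Prometheus model types; `trafficParams`, `labelOrUnknown`, and `extractParams` are hypothetical names, and falling back to "unknown" for a missing label is an assumption of this example.

```go
package main

import "fmt"

// Local stand-ins for the Prometheus model types, simplified for the sketch.
type Metric map[string]string

type Sample struct {
	Metric    Metric
	Value     float64
	Timestamp int64
}

type Vector []*Sample

// trafficParams is a hypothetical container for the values read from one sample.
type trafficParams struct {
	SourceNs, SourceWl string
	DestNs, DestWl     string
	Code               string
	Rate               float64
}

// labelOrUnknown returns the label value, or "unknown" when it is absent or empty.
func labelOrUnknown(m Metric, name string) string {
	if v, ok := m[name]; ok && v != "" {
		return v
	}
	return "unknown"
}

// extractParams turns every sample of the vector into one trafficParams value.
func extractParams(v Vector) []trafficParams {
	params := make([]trafficParams, 0, len(v))
	for _, s := range v {
		params = append(params, trafficParams{
			SourceNs: labelOrUnknown(s.Metric, "source_workload_namespace"),
			SourceWl: labelOrUnknown(s.Metric, "source_workload"),
			DestNs:   labelOrUnknown(s.Metric, "destination_workload_namespace"),
			DestWl:   labelOrUnknown(s.Metric, "destination_workload"),
			Code:     labelOrUnknown(s.Metric, "response_code"),
			Rate:     s.Value,
		})
	}
	return params
}

func main() {
	v := Vector{{
		Metric: Metric{
			"source_workload_namespace":      "bookinfo",
			"source_workload":                "productpage-v1",
			"destination_workload_namespace": "bookinfo",
			"destination_workload":           "reviews-v1",
			"response_code":                  "200",
		},
		Value:     1.5,
		Timestamp: 1700000000,
	}}
	for _, p := range extractParams(v) {
		fmt.Printf("%s/%s -> %s/%s code=%s rate=%.1f\n", p.SourceNs, p.SourceWl, p.DestNs, p.DestWl, p.Code, p.Rate)
	}
}
```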
The parameters are passed to the addTraffic function, which calls addNode to add nodes to the map and addEdgeTraffic to add an edge to the source node.
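A stripped-down version of this step might look like the following. The function names echo the ones in the text, but the bodies, the ID scheme, and the plain `Rate` field (Kiali keeps such values in Metadata) are simplifications assumed for illustration.

```go
package main

import "fmt"

// Simplified node and edge types; the real ones carry many more fields.
type Node struct {
	ID        string
	Namespace string
	Workload  string
	Edges     []*Edge
}

type Edge struct {
	Source *Node
	Dest   *Node
	Rate   float64 // assumed field; stands in for traffic stored in Metadata
}

type TrafficMap map[string]*Node

// addNode returns the existing node for the workload or creates a new one.
// The "namespace_workload" ID scheme is an assumption of this sketch.
func addNode(tm TrafficMap, namespace, workload string) *Node {
	id := namespace + "_" + workload
	if n, ok := tm[id]; ok {
		return n
	}
	n := &Node{ID: id, Namespace: namespace, Workload: workload}
	tm[id] = n
	return n
}

// addEdgeTraffic adds the rate to the source->dest edge, creating it if needed.
func addEdgeTraffic(source, dest *Node, rate float64) *Edge {
	for _, e := range source.Edges {
		if e.Dest == dest {
			e.Rate += rate
			return e
		}
	}
	e := &Edge{Source: source, Dest: dest, Rate: rate}
	source.Edges = append(source.Edges, e)
	return e
}

// addTraffic makes sure both endpoints exist, then records the traffic between them.
func addTraffic(tm TrafficMap, srcNs, srcWl, dstNs, dstWl string, rate float64) {
	source := addNode(tm, srcNs, srcWl)
	dest := addNode(tm, dstNs, dstWl)
	addEdgeTraffic(source, dest, rate)
}

func main() {
	tm := TrafficMap{}
	addTraffic(tm, "bookinfo", "productpage-v1", "bookinfo", "reviews-v1", 1.0)
	addTraffic(tm, "bookinfo", "productpage-v1", "bookinfo", "reviews-v1", 0.5)
	src := tm["bookinfo_productpage-v1"]
	fmt.Println(len(tm), len(src.Edges), src.Edges[0].Rate)
}
```

Because edges hang off the source node, the map itself only needs to index nodes; the whole graph is reachable by walking each node's Edges slice.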
Once the namespace TrafficMap has been built, each enabled appender adds nodes to it or removes nodes from it.
MergeTrafficMaps then merges it into the global traffic map.
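The merge can be thought of along the lines of the sketch below. It is an assumed simplification rather than Kiali's MergeTrafficMaps: edges point at their destination by ID instead of by pointer, and an edge already present in the global map is simply kept as it is.

```go
package main

import "fmt"

// Simplified types: an edge refers to its destination by node ID.
type Edge struct {
	DestID string
	Rate   float64
}

type Node struct {
	ID    string
	Edges []*Edge
}

type TrafficMap map[string]*Node

// hasEdge reports whether the node already has an edge to destID.
func hasEdge(n *Node, destID string) bool {
	for _, e := range n.Edges {
		if e.DestID == destID {
			return true
		}
	}
	return false
}

// mergeTrafficMaps folds a namespace traffic map into the global one.
// New nodes are adopted as they are; for nodes seen before, only edges
// that the global node does not have yet are appended.
func mergeTrafficMaps(global, ns TrafficMap) {
	for id, nsNode := range ns {
		node, isDup := global[id]
		if !isDup {
			global[id] = nsNode
			continue
		}
		for _, e := range nsNode.Edges {
			if !hasEdge(node, e.DestID) {
				node.Edges = append(node.Edges, e)
			}
		}
	}
}

func main() {
	global := TrafficMap{
		"a": {ID: "a", Edges: []*Edge{{DestID: "b", Rate: 1}}},
		"b": {ID: "b"},
	}
	ns := TrafficMap{
		"a": {ID: "a", Edges: []*Edge{{DestID: "b", Rate: 1}, {DestID: "c", Rate: 2}}},
		"c": {ID: "c"},
	}
	mergeTrafficMaps(global, ns)
	fmt.Println(len(global), len(global["a"].Edges))
}
```

The same node can appear in several namespace maps because traffic that crosses a namespace boundary is seen from both sides, which is why duplicate nodes and edges have to be reconciled here.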
Finally, the finalizers operate on the global traffic map, and BuildNamespacesTrafficMap is done.
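- appender: the request parameters determine which appenders are enabled; they are then added to the appenders slice in order. The individual appenders are defined under graph/telemetry/istio/appender/.
- finalizers: as with appenders, order is important, so the order must not be changed arbitrarily.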
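Why the order matters is easiest to see with a toy pipeline. The `Appender` interface below is a reduced, assumed form (Kiali's real interface takes more arguments), and `addIdle` and `dropDead` are made-up appenders that exist only for this example.

```go
package main

import "fmt"

type Node struct {
	ID   string
	Pods int
}

type TrafficMap map[string]*Node

// Appender is a reduced, hypothetical version of an appender interface.
type Appender interface {
	Name() string
	AppendGraph(tm TrafficMap)
}

// addIdle adds known workloads that had no traffic (made-up example appender).
type addIdle struct{ known []string }

func (a addIdle) Name() string { return "addIdle" }
func (a addIdle) AppendGraph(tm TrafficMap) {
	for _, id := range a.known {
		if _, ok := tm[id]; !ok {
			tm[id] = &Node{ID: id}
		}
	}
}

// dropDead removes nodes that have no pods (made-up example appender).
type dropDead struct{}

func (dropDead) Name() string { return "dropDead" }
func (dropDead) AppendGraph(tm TrafficMap) {
	for id, n := range tm {
		if n.Pods == 0 {
			delete(tm, id) // deleting while ranging over a map is safe in Go
		}
	}
}

// runAppenders applies the appenders strictly in slice order and
// returns the names in the order they ran.
func runAppenders(tm TrafficMap, appenders []Appender) []string {
	ran := make([]string, 0, len(appenders))
	for _, a := range appenders {
		a.AppendGraph(tm)
		ran = append(ran, a.Name())
	}
	return ran
}

func main() {
	first := TrafficMap{"reviews": {ID: "reviews", Pods: 1}}
	runAppenders(first, []Appender{addIdle{known: []string{"ratings"}}, dropDead{}})

	second := TrafficMap{"reviews": {ID: "reviews", Pods: 1}}
	runAppenders(second, []Appender{dropDead{}, addIdle{known: []string{"ratings"}}})

	// Same appenders, different order, different graph.
	fmt.Println(len(first), len(second))
}
```

In the first run the idle node is added and then removed again because it has no pods; in the second run it survives, so swapping two appenders changes the resulting graph.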
generateGraph converts the traffic map into a vendorConfig in Cytoscape.js format.
    type Config struct {
        Timestamp int64    `json:"timestamp"`
        Duration  int64    `json:"duration"`
        GraphType string   `json:"graphType"`
        Elements  Elements `json:"elements"`
    }

    type Elements struct {
        Nodes []*NodeWrapper `json:"nodes"`
        Edges []*EdgeWrapper `json:"edges"`
    }

    type NodeWrapper struct {
        Data *NodeData `json:"data"`
    }

    type EdgeWrapper struct {
        Data *EdgeData `json:"data"`
    }

    type NodeData struct {
        // Cytoscape Fields
        ID     string `json:"id"`               // unique internal node ID (n0, n1...)
        Parent string `json:"parent,omitempty"` // Compound Node parent ID

        // App Fields (not required by Cytoscape)
        NodeType string `json:"nodeType"`
        Cluster string `json:"cluster"`
        Namespace string `json:"namespace"`
        Workload string `json:"workload,omitempty"`
        App string `json:"app,omitempty"`
        Version string `json:"version,omitempty"`
        Service string `json:"service,omitempty"` // requested service for NodeTypeService
        Aggregate string `json:"aggregate,omitempty"` // set like "<aggregate>=<aggregateVal>"
        DestServices []graph.ServiceName `json:"destServices,omitempty"` // requested services for [dest] node
        Labels map[string]string `json:"labels,omitempty"` // k8s labels associated with the node
        Traffic []ProtocolTraffic `json:"traffic,omitempty"` // traffic rates for all detected protocols
        HealthData interface{} `json:"healthData"` // data to calculate health status from configurations
        HealthDataApp interface{} `json:"-"` // for local use to generate appBox health
        HasCB bool `json:"hasCB,omitempty"` // true (has circuit breaker) | false
        HasFaultInjection bool `json:"hasFaultInjection,omitempty"` // true (vs has fault injection) | false
        HasHealthConfig HealthConfig `json:"hasHealthConfig,omitempty"` // set to the health config override
        HasMirroring bool `json:"hasMirroring,omitempty"` // true (has mirroring) | false
        HasMissingSC bool `json:"hasMissingSC,omitempty"` // true (has missing sidecar) | false
        HasRequestRouting bool `json:"hasRequestRouting,omitempty"` // true (vs has request routing) | false
        HasRequestTimeout bool `json:"hasRequestTimeout,omitempty"` // true (vs has request timeout) | false
        HasTCPTrafficShifting bool `json:"hasTCPTrafficShifting,omitempty"` // true (vs has tcp traffic shifting) | false
        HasTrafficShifting bool `json:"hasTrafficShifting,omitempty"` // true (vs has traffic shifting) | false
        HasVS *VSInfo `json:"hasVS,omitempty"` // it can be empty if there is a VS without hostnames
        HasWorkloadEntry []graph.WEInfo `json:"hasWorkloadEntry,omitempty"` // static workload entry information | empty if there are no workload entries
        IsBox string `json:"isBox,omitempty"` // set for NodeTypeBox, current values: [ 'app', 'cluster', 'namespace' ]
        IsDead bool `json:"isDead,omitempty"` // true (has no pods) | false
        IsGateway *GWInfo `json:"isGateway,omitempty"` // Istio ingress/egress gateway information
        IsIdle bool `json:"isIdle,omitempty"` // true | false
        IsInaccessible bool `json:"isInaccessible,omitempty"` // true if the node exists in an inaccessible namespace
        IsOutside bool `json:"isOutside,omitempty"` // true | false
        IsRoot bool `json:"isRoot,omitempty"` // true | false
        IsServiceEntry *graph.SEInfo `json:"isServiceEntry,omitempty"` // set static service entry information
    }

    type EdgeData struct {
        // Cytoscape Fields
        ID     string `json:"id"`     // unique internal edge ID (e0, e1...)
        Source string `json:"source"` // parent node ID
        Target string `json:"target"` // child node ID

        // App Fields (not required by Cytoscape)
        DestPrincipal string `json:"destPrincipal,omitempty"` // principal used for the edge destination
        IsMTLS string `json:"isMTLS,omitempty"` // set to the percentage of traffic using a mutual TLS connection
        ResponseTime string `json:"responseTime,omitempty"` // in millis
        SourcePrincipal string `json:"sourcePrincipal,omitempty"` // principal used for the edge source
        Throughput string `json:"throughput,omitempty"` // in bytes/sec (request or response, depends on client request)
        Traffic ProtocolTraffic `json:"traffic,omitempty"` // traffic rates for the edge protocol
    }
The actual conversion is done mainly by buildConfig in the cytoscape package.
Over the whole process, the data format is converted as follows:
Prometheus samples -> Kiali traffic map -> Cytoscape.js config JSON