The Ogg Skeleton Metadata Bitstream
Overview
Ogg Skeleton provides structuring information for multitrack Ogg files. It is compatible with Ogg Theora and provides extra clues for synchronization and content negotiation such as language selection.
Ogg is a generic container format for time-continuous data streams, enabling interleaving of several tracks of frame-wise encoded content in a time-multiplexed manner. As an example, an Ogg physical bitstream could encapsulate several tracks of video encoded in Theora and multiple tracks of audio encoded in Speex, Vorbis, or FLAC at the same time. A player that decodes such a bitstream could then, for example, play one video channel as the main video playback, alpha-blend another one on top of it (e.g. a caption track), play a main Vorbis audio track together with several FLAC audio tracks simultaneously (e.g. as sound effects), and provide a choice of Speex channels (e.g. providing commentary in different languages). Such a file can be created with Ogg. It is, however, not possible to generically parse such a file, seek on it, understand what codecs it contains, and dynamically handle and play back its content.
Ogg does not know anything about the content it carries and leaves it to the media mapping of each codec to declare and describe itself. There is no meta information available at the Ogg level about the content tracks encapsulated within an Ogg physical bitstream.
This is particularly a problem if you don't have all the decoder libraries available and just want to parse an Ogg file to find out what type of data it encapsulates (as the "file" command under *nix does when it determines a file's type through magic numbers), or want to seek to a temporal offset without having to decode the data (such as on a Web server that just serves out Ogg files and parts thereof).
Ogg Skeleton is being designed to overcome these problems. Ogg Skeleton is a logical bitstream within an Ogg stream that contains information about the other encapsulated logical bitstreams. For each logical bitstream it provides information such as its media type, and explains the way the granulepos field in Ogg pages is mapped to time.
Ogg Skeleton is also designed to allow the creation of substreams from Ogg physical bitstreams that retain the original timing information. For example, when cutting out the segment between the 7th and the 59th second of an Ogg file, it would be nice for the cut-out file to start with a playback time of 7 seconds rather than 0. This is of particular interest if you're streaming this file from a Web server after a query for a temporal subpart such as http://example.com/video.ogv?t=7-59.
Specification
How to describe the logical bitstreams within an Ogg container?
The following information about a logical bitstream is of interest as meta information in the Skeleton:
- the serial number: it identifies a content track
- the MIME type: it identifies the content type
- other generic name-value fields that can provide meta information such as the language of a track or the video height and width
- the number of header packets: this informs a parser about the number of actual header packets in an Ogg logical bitstream
- the granule rate: the rate in Hz at which content is sampled for the particular logical bitstream, which allows a granule position to be mapped to time by calculating "granulepos / granulerate"
- the preroll: the number of past content packets to take into account when decoding the current Ogg page, which is necessary for seeking (Vorbis generally has 2, Speex 3)
- the granuleshift: the number of lower bits from the granulepos field that are used to provide position information for sub-seekable units (like the keyframe shift in Theora)
- a basetime: it provides a mapping for granule position 0 (for all logical bitstreams) to a playback time; an example use: most content in professional analog video creation actually starts at a time of 1 hour, and adding this field allows producers to retain this mapping when digitizing their content
- a UTC time: it provides a mapping for granule position 0 (for all logical bitstreams) to a real-world clock time, which makes it possible to record e.g. the recording or broadcast time of some content
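The time mapping described in the list above can be sketched in a few lines of Python. This is a minimal illustration, not part of the specification: the function name `granule_to_time` is hypothetical, and the way the granulepos is split when a granuleshift is set (upper bits plus lower bits, both counted in granules) is an assumption modeled on the Theora keyframe shift that the list mentions.

```python
from fractions import Fraction


def granule_to_time(granulepos, rate_num, rate_den, granuleshift=0,
                    basetime=Fraction(0)):
    """Map a granule position to a playback time in seconds.

    Hypothetical helper: with granuleshift == 0 this is exactly
    "granulepos / granulerate" (plus the basetime offset).
    """
    if rate_num <= 0 or rate_den <= 0:
        raise ValueError("granule rate must be a positive rational")
    # Assumption (Theora-style): the upper bits count granules up to the
    # last seekable unit, the lower `granuleshift` bits count granules
    # since that unit. Their sum is the absolute granule number.
    upper = granulepos >> granuleshift
    lower = granulepos & ((1 << granuleshift) - 1)
    granulerate = Fraction(rate_num, rate_den)
    return basetime + Fraction(upper + lower) / granulerate


# Audio sampled at 44100 Hz: granule 441000 is 10 seconds in.
print(granule_to_time(441000, 44100, 1))                      # 10
# Video at 25 Hz, shift of 6 bits: keyframe 96 plus 4 frames = frame 100.
print(granule_to_time((96 << 6) | 4, 25, 1, granuleshift=6))  # 4
# A basetime of one hour shifts granule position 0 to 3600 seconds.
print(granule_to_time(0, 25, 1, basetime=Fraction(3600)))     # 3600
```

Using `Fraction` keeps the result exact, which matches the way the rates and times are stored as numerator and denominator pairs.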
How to allow the creation of substreams from an Ogg physical bitstream?
When cutting out a subpart of an Ogg physical bitstream, the aim is to keep all the content pages intact (including the framing and granule positions) and just change some information in the Skeleton that allows reconstruction of the accurate time mapping. When remultiplexing such a bitstream, it is necessary to take into account all the different contained logical bitstreams. A given cut-in time maps to several different byte positions in the Ogg physical bitstream because each logical bitstream has its relevant information for that time at a different location. In addition, the resolution of each logical bitstream may not be high enough to accommodate the given cut-in time, and thus some surplus information may need to be remuxed into the new bitstream.
The following information needs to be added to the Skeleton to allow a correct presentation of a subpart of an Ogg bitstream:
- the presentation time: this is the actual cut-in time, and all logical bitstreams are meant to start presenting from this time onwards, not from the time their data starts, which may be some time before that (because this time may have mapped right into the middle of a packet, or because the logical bitstream has a preroll or a keyframe shift)
- the basegranule: this represents the granule number with which this logical bitstream starts in the remuxed stream and provides for each logical bitstream the accurate start time of its data stream; this information is necessary to allow correct decoding and timing of the first data packets contained in a logical bitstream of a remuxed Ogg stream
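To make the relationship between presentation time and basegranule concrete, the following Python sketch computes how much decoded material at the start of a track lies before the cut-in point and should therefore not be presented. The function name `leading_time_to_skip` is hypothetical, and the sketch assumes a granuleshift of 0, so that the basegranule is a plain granule count.

```python
from fractions import Fraction


def leading_time_to_skip(presentation_time, basegranule, granulerate):
    """Return the duration between a track's first data and the cut-in time.

    Hypothetical helper, assuming granuleshift == 0. All times are in
    seconds and kept as exact fractions.
    """
    if granulerate <= 0:
        raise ValueError("granule rate must be positive")
    # Start time of the track's data in the remuxed stream.
    data_start = Fraction(basegranule) / Fraction(granulerate)
    # Data that starts at or after the cut-in time needs no skipping.
    return max(Fraction(0), Fraction(presentation_time) - data_start)


# Cut-in at 7 s. The video track (25 Hz) had to start at frame 150 (6 s),
# e.g. because that is where the preceding keyframe sits.
print(leading_time_to_skip(7, 150, 25))        # 1
# The audio track (44100 Hz) starts at sample 304290, i.e. at 6.9 s.
print(leading_time_to_skip(7, 304290, 44100))  # 1/10
```

A player would decode the skipped portion (it may be needed as preroll or as a keyframe reference) but only start presenting each track at the presentation time.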
Ogg Skeleton version 3.0 Format Specification
Adding the above information into an Ogg bitstream without breaking existing Ogg functionality and code requires the use of a logical bitstream for Ogg Skeleton. This logical bitstream may be ignored on decoding, so that existing players can still play back Ogg files that have a Skeleton bitstream. Skeleton enriches the Ogg bitstream with meta information about its structure and content.
The Skeleton logical bitstream starts with an ident header that contains information about all of the logical bitstreams and is mapped into the Skeleton bos page. The first 8 bytes provide the magic identifier "fishead\0". The fishead is followed by a set of secondary header packets, each of which contains information about one logical bitstream. These secondary header packets are identified by an 8-byte code of "fisbone\0". The Skeleton logical bitstream has no actual content packets. Its eos page is included in the stream before any data pages of the other logical bitstreams appear and contains a packet of length 0.
The fishead ident header looks as follows:
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Identifier 'fishead\0'                                        | 0-3
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               | 4-7
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Version major                 | Version minor                 | 8-11
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Presentationtime numerator                                    | 12-15
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               | 16-19
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Presentationtime denominator                                  | 20-23
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               | 24-27
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Basetime numerator                                            | 28-31
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               | 32-35
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Basetime denominator                                          | 36-39
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               | 40-43
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | UTC                                                           | 44-47
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               | 48-51
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               | 52-55
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               | 56-59
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               | 60-63
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
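As an illustration of the layout above, here is a minimal Python sketch that unpacks a 64-byte fishead packet with the standard `struct` module. The diagram gives only field widths, so two things here are assumptions: that the integers are little-endian, and that the 64-bit numerators and denominators are signed. The names `Fishead` and `parse_fishead` are hypothetical, and the 20-byte UTC field is returned as raw bytes because its encoding is not described in this section.

```python
import struct
from fractions import Fraction
from typing import NamedTuple, Optional

# 8-byte magic, two 16-bit version fields, four 64-bit integers, 20 bytes UTC.
# "<" = little-endian without padding (assumed, not stated in the diagram).
FISHEAD = struct.Struct("<8sHHqqqq20s")  # 64 bytes in total


class Fishead(NamedTuple):
    version: tuple
    presentation_time: Optional[Fraction]
    basetime: Optional[Fraction]
    utc: bytes


def _rational(num: int, den: int) -> Optional[Fraction]:
    # This sketch treats a zero denominator as "value not set".
    return Fraction(num, den) if den else None


def parse_fishead(packet: bytes) -> Fishead:
    """Decode the fixed fields of a fishead ident header packet."""
    if len(packet) < FISHEAD.size:
        raise ValueError("fishead packet is shorter than 64 bytes")
    magic, major, minor, pt_n, pt_d, bt_n, bt_d, utc = FISHEAD.unpack_from(packet)
    if magic != b"fishead\x00":
        raise ValueError("packet does not start with 'fishead\\0'")
    return Fishead((major, minor), _rational(pt_n, pt_d),
                   _rational(bt_n, bt_d), utc)


# Build a version 3.0 header with a presentation time of 7000/1000 seconds.
example = FISHEAD.pack(b"fishead\x00", 3, 0, 7000, 1000, 0, 1000, bytes(20))
head = parse_fishead(example)
print(head.version, head.presentation_time, head.basetime)  # (3, 0) 7 0
```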
The version fields provide version information for the Skeleton track, currently 3.0 (the number having evolved within the Annodex project).
Presentation time and basetime are specified as rational numbers, the denominator providing the temporal resolution at which the time is given (e.g. to specify time in milliseconds, provide a denominator of 1000).
The fisbone secondary header packet looks as follows:
     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Identifier 'fisbone\0'                                        | 0-3
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               | 4-7
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Offset to message header fields                               | 8-11
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Serial number                                                 | 12-15
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Number of header packets                                      | 16-19
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Granulerate numerator                                         | 20-23
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               | 24-27
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Granulerate denominator                                       | 28-31
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               | 32-35
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Basegranule                                                   | 36-39
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               | 40-43
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Preroll                                                       | 44-47
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Granuleshift  | Padding/future use                            | 48-51
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    | Message header fields ...                                     | 52-
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
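The fisbone layout can be decoded in the same spirit. The Python sketch below is illustrative only and rests on several assumptions that the diagram does not settle: little-endian integers, signed 64-bit granule fields, an offset that is counted from the start of the offset field itself (so a value of 44 points at byte 52), and message header fields stored as CRLF-separated "Name: value" lines in the style of HTTP. The name `parse_fisbone` and the dictionary keys are hypothetical.

```python
import struct
from fractions import Fraction

# Magic, offset, serial number, header count, granulerate num/den,
# basegranule, preroll, granuleshift, 3 bytes padding: 52 bytes in total.
FISBONE = struct.Struct("<8sIIIqqqIB3s")
OFFSET_FIELD_POS = 8  # byte position of the offset field (from the diagram)


def parse_fisbone(packet: bytes) -> dict:
    """Decode a fisbone packet into a plain dictionary."""
    if len(packet) < FISBONE.size:
        raise ValueError("fisbone packet is shorter than 52 bytes")
    (magic, offset, serialno, num_headers, gr_n, gr_d,
     basegranule, preroll, granuleshift, _padding) = FISBONE.unpack_from(packet)
    if magic != b"fisbone\x00":
        raise ValueError("packet does not start with 'fisbone\\0'")

    # Assumption: the offset is relative to the offset field, not the packet.
    # Honoring it lets the parser skip fixed fields added by later versions.
    start = OFFSET_FIELD_POS + offset
    headers = {}
    for line in packet[start:].decode("utf-8", errors="replace").splitlines():
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip()] = value.strip()

    return {
        "serialno": serialno,
        "num_headers": num_headers,
        "granulerate": Fraction(gr_n, gr_d) if gr_d else None,
        "basegranule": basegranule,
        "preroll": preroll,
        "granuleshift": granuleshift,
        "headers": headers,
    }


# A track with serial number 1234, 3 header packets, 44100 Hz, preroll 2.
fixed = FISBONE.pack(b"fisbone\x00", 44, 1234, 3, 44100, 1, 0, 2, 0, bytes(3))
bone = parse_fisbone(fixed + b"Content-Type: audio/vorbis\r\n")
print(bone["serialno"], bone["granulerate"], bone["headers"])
```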
The MIME type is provided as a message header field specified in the same way that HTTP header fields are given (e.g. "Content-Type: audio/vorbis"). Further meta information (such as language and screen size) is also included as message header fields. The offset to the message header fields at the beginning of a fisbone packet is included for forward compatibility: it allows further fields to be included in the packet without disrupting the message header field parsing.
The granule rate is again given as a rational number in the same way that presentation time and basetime were provided above.
A further restriction on how to encapsulate Skeleton into Ogg is proposed to allow for easier parsing:
- there can only be one Skeleton logical bitstream in an Ogg bitstream
- the Skeleton bos page is the very first bos page in the Ogg stream, so that it can be identified straight away and decoders don't mistake the stream for e.g. Ogg Vorbis without this meta information
- the bos pages of all the other logical bitstreams come next (a requirement of Ogg)
- the secondary header pages of all logical bitstreams come next, including Skeleton's secondary header packets
- the Skeleton eos page ends the control section of the Ogg stream before any content pages of any of the other logical bitstreams appear
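The ordering restrictions above can be checked mechanically. The Python sketch below works on a deliberately simplified, hypothetical model of a stream: a list of (stream name, page kind) pairs, where the kind is one of "bos", "header", "eos" or "data". It does not read real Ogg pages, and the function name `check_skeleton_layout` is an illustration rather than an existing API.

```python
def check_skeleton_layout(pages):
    """Return a list of violated ordering rules (empty if the layout is fine).

    `pages` is a sequence of (stream, kind) tuples in physical order;
    the Skeleton track is expected under the stream name "skeleton".
    """
    pages = list(pages)
    problems = []

    # Rules 1 and 2: exactly one Skeleton bitstream, and its bos page first.
    skeleton_bos = [i for i, (stream, kind) in enumerate(pages)
                    if stream == "skeleton" and kind == "bos"]
    if len(skeleton_bos) != 1:
        problems.append("expected exactly one Skeleton bos page")
    elif skeleton_bos[0] != 0:
        problems.append("Skeleton bos page is not the first page")

    seen_non_bos = False
    skeleton_ended = False
    for stream, kind in pages:
        # Rule 3: all bos pages precede every other page.
        if kind == "bos":
            if seen_non_bos:
                problems.append(f"bos page of {stream} follows a non-bos page")
        else:
            seen_non_bos = True
        # Rules 4 and 5: header pages come before the Skeleton eos page,
        # content pages of the other streams come after it.
        if stream == "skeleton" and kind == "eos":
            skeleton_ended = True
        elif kind == "header" and skeleton_ended:
            problems.append(f"header page of {stream} follows the Skeleton eos page")
        elif stream != "skeleton" and kind == "data" and not skeleton_ended:
            problems.append(f"data page of {stream} precedes the Skeleton eos page")
    return problems


layout = [("skeleton", "bos"), ("theora", "bos"), ("vorbis", "bos"),
          ("theora", "header"), ("vorbis", "header"), ("skeleton", "header"),
          ("skeleton", "eos"), ("theora", "data"), ("vorbis", "data")]
print(check_skeleton_layout(layout))  # []
```

Returning a list of messages rather than raising on the first problem makes it easier to report every rule a malformed file breaks in one pass.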