Protobuf是一种可以实现内存与外存交换的协议接口。这是由谷歌开发的开源工具,目前研究Caffe源码时用到。
一个软件项目 = 数据结构 + 算法 + 参数,对于数据结构和算法我们都已经有较多研究,但不同开发者对参数管理却各有千秋。有人喜欢TXT格式化的参数文件,有人喜欢BIN简单高效,也有人喜欢图形化界面的直观。不一致的参数管理带来很多问题,例如一个项目组内不同成员必须约定一套统一的参数方案,或者称为通信协议,这样便于模块集成。而Protobuf工具就完美解决了这个问题,关键部分代码自动生成,节省了大量的开发、调试时间。
首先下载protobuf,地址(打不开?……不解释)
这里用Linux版本2.5.0
解压:
tar zxvf protobuf-2.5.0.tar.gz
切到主目录:
cd protobuf-2.5.0
编译:
./configure
make
sudo make install
添加环境变量:
export PKG_CONFIG_PATH=$(pwd)
编译examples:
cd examples/
make cpp
这里我们只编译C++代码。
编译完成,生成了以下可执行文件:
add_person_cpp
list_people_cpp
这是个通讯录的例子。我们首先运行add_person_cpp:
./add_person_cpp zyk zyk: File not found. Creating a new file. Enter person ID number: 123 Enter name: zhaoyongke Enter email address (blank for none): zhaoyongke@yeah.net Enter a phone number (or leave blank to finish): 188188188 Is this a mobile, home, or work phone?(回车) Unknown phone type. Using default. Enter a phone number (or leave blank to finish):(回车)
然后运行list_people_cpp:
./list_people_cpp zyk Person ID: 123 Name: zhaoyongke E-mail address: zhaoyongke@yeah.net Home phone #: 188188188
可见我们生成了新的通讯录zyk,里面保存了相应的信息。
例子运行结束了,我们看下代码是如何生成的。
protobuf使用前,先编写proto文件,这是描述我们需要配置参数的数据结构。这个例子里面的proto如下:
// See README.txt for information and build instructions. package tutorial; option java_package = "com.example.tutorial"; option java_outer_classname = "AddressBookProtos"; message Person { required string name = 1; required int32 id = 2; // Unique ID number for this person. optional string email = 3; enum PhoneType { MOBILE = 0; HOME = 1; WORK = 2; } message PhoneNumber { required string number = 1; optional PhoneType type = 2 [default = HOME]; } repeated PhoneNumber phone = 4; } // Our address book file is just one of these. message AddressBook { repeated Person person = 1; }
前几行是定义包的,可以忽略。
message Person{...}定义了一个需要传输的参数结构体,可见包括这么几个单元:name(string类型)、id(int32类型)、email(string类型)、phone(PhoneNumber类型,嵌套在Person内的类)。前面标记为“required”是必须有值的,而“optional“则为可选项,”repeated“表示后面单元为相同类型的一组向量。
有了如上定义,我们可以用protobuf工具生成接口代码,命令如下:
protoc --cpp_out=. addressbook.proto
运行后生成了两个文件:addressbook.pb.cc 和addressbook.pb.h,代码比较长就不贴了。我们的应用程序可以通过自动生成的接口实现参数的序列化/反序列化,代码如下:
//add_person.c #include <iostream> #include <fstream> #include <string> #include "addressbook.pb.h" using namespace std; // This function fills in a Person message based on user input. void PromptForAddress(tutorial::Person* person) { cout << "Enter person ID number: "; int id; cin >> id; person->set_id(id); cin.ignore(256, '\n'); cout << "Enter name: "; getline(cin, *person->mutable_name()); cout << "Enter email address (blank for none): "; string email; getline(cin, email); if (!email.empty()) { person->set_email(email); } while (true) { cout << "Enter a phone number (or leave blank to finish): "; string number; getline(cin, number); if (number.empty()) { break; } tutorial::Person::PhoneNumber* phone_number = person->add_phone(); phone_number->set_number(number); cout << "Is this a mobile, home, or work phone? "; string type; getline(cin, type); if (type == "mobile") { phone_number->set_type(tutorial::Person::MOBILE); } else if (type == "home") { phone_number->set_type(tutorial::Person::HOME); } else if (type == "work") { phone_number->set_type(tutorial::Person::WORK); } else { cout << "Unknown phone type. Using default." << endl; } } } // Main function: Reads the entire address book from a file, // adds one person based on user input, then writes it back out to the same // file. int main(int argc, char* argv[]) { // Verify that the version of the library that we linked against is // compatible with the version of the headers we compiled against. GOOGLE_PROTOBUF_VERIFY_VERSION; if (argc != 2) { cerr << "Usage: " << argv[0] << " ADDRESS_BOOK_FILE" << endl; return -1; } tutorial::AddressBook address_book; { // Read the existing address book. fstream input(argv[1], ios::in | ios::binary); if (!input) { cout << argv[1] << ": File not found. Creating a new file." << endl; } else if (!address_book.ParseFromIstream(&input)) { cerr << "Failed to parse address book." << endl; return -1; } } // Add an address. PromptForAddress(address_book.add_person()); { // Write the new address book back to disk. fstream output(argv[1], ios::out | ios::trunc | ios::binary); if (!address_book.SerializeToOstream(&output)) { cerr << "Failed to write address book." << endl; return -1; } } // Optional: Delete all global objects allocated by libprotobuf. google::protobuf::ShutdownProtobufLibrary(); return 0; }
可见只需要调用addressbook.pb.h中声明的tutorial::AddressBook类、Person类中的接口(add_person(), add_phone(), set_number(), set_email()等)就能操作相应的参数,最后将内存中的参数序列化为文件只需要执行SerializeToOstream()。相应的读取参数文件的操作为ParseFromIstream()。这里贴出例子中的第二个程序如下:
// list_people.c #include <iostream> #include <fstream> #include <string> #include "addressbook.pb.h" using namespace std; // Iterates though all people in the AddressBook and prints info about them. void ListPeople(const tutorial::AddressBook& address_book) { for (int i = 0; i < address_book.person_size(); i++) { const tutorial::Person& person = address_book.person(i); cout << "Person ID: " << person.id() << endl; cout << " Name: " << person.name() << endl; if (person.has_email()) { cout << " E-mail address: " << person.email() << endl; } for (int j = 0; j < person.phone_size(); j++) { const tutorial::Person::PhoneNumber& phone_number = person.phone(j); switch (phone_number.type()) { case tutorial::Person::MOBILE: cout << " Mobile phone #: "; break; case tutorial::Person::HOME: cout << " Home phone #: "; break; case tutorial::Person::WORK: cout << " Work phone #: "; break; } cout << phone_number.number() << endl; } } } // Main function: Reads the entire address book from a file and prints all // the information inside. int main(int argc, char* argv[]) { // Verify that the version of the library that we linked against is // compatible with the version of the headers we compiled against. GOOGLE_PROTOBUF_VERIFY_VERSION; if (argc != 2) { cerr << "Usage: " << argv[0] << " ADDRESS_BOOK_FILE" << endl; return -1; } tutorial::AddressBook address_book; { // Read the existing address book. fstream input(argv[1], ios::in | ios::binary); if (!address_book.ParseFromIstream(&input)) { cerr << "Failed to parse address book." << endl; return -1; } } ListPeople(address_book); // Optional: Delete all global objects allocated by libprotobuf. google::protobuf::ShutdownProtobufLibrary(); return 0; }
相信做完这个实验,你将不再对Caffe代码中的参数初始化、参数保存操作感到陌生,一切都很自然。
除了上述简单功能,Protobuf还可以用来传递不同语言(C/C++与Java、Python)之间的参数,省去了自己手动维护数据结构的繁琐工作。也可以支持客户端/服务器模式,在主机/从机之间传递参数。