Mapping hot C/C++ functions onto huge pages with gcc

Overview

Approach:

Use __attribute__((section)):

In C++, __attribute__((section)) places hot functions in a dedicated section. For example:

__attribute__((section(".hot_functions"))) void hot_function1() { /*...*/ }

__attribute__((section(".hot_functions"))) void hot_function2() { /*...*/ }

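Then, in the linker script, place this .hot_functions section in a memory region intended for huge pages.

Writing the linker script

To place the ".hot_functions" section in a huge-page region from the linker script, you first need to know your system's huge page size. Assuming it is 2 MB, you can create a linker script named "linker_script.ld" as follows: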

/* linker_script.ld */
OUTPUT_FORMAT(elf64-x86-64)
OUTPUT_ARCH(i386:x86-64)
/* Pull in the existing default linker script */
INCLUDE /usr/lib/ldscripts/elf_x86_64.x
MEMORY
{
  /* Define the huge-page memory region, assumed to start at address 0x40000000 */
  BIGPAGES (rw) : ORIGIN = 0x40000000, LENGTH = 2M
}
SECTIONS
{
  /* Preceding sections */
  /* ... */
  /* Place the hot-function section in the huge-page memory region */
  .hot_functions : ALIGN(2M) {
    *(.hot_functions)
  } > BIGPAGES
  /* Following sections */
  /* ... */
}

Then use this linker script when compiling and linking:

g++ -o my_program my_program.cpp -Wl,-T,linker_script.ld

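With this, all hot functions marked __attribute__((section(".hot_functions"))) are placed in the 2 MB-aligned region. Note that the linker script only fixes the address and alignment; whether those pages are actually backed by huge pages is decided by the kernel at run time. This example assumes the x86-64 architecture and the ELF format, so adjust it for your system and object file format.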
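The 2 MB figure used above is an assumption about the target machine. As a minimal sketch, assuming a Linux system that exposes the "Hugepagesize:" entry in /proc/meminfo, the default huge page size can be read as follows. The helper names parse_hugepage_bytes and system_hugepage_bytes are hypothetical and not part of the original article.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Parse the "Hugepagesize:" entry (reported in kB) from meminfo-style text.
// Returns the size in bytes, or 0 if the entry is missing.
inline unsigned long long parse_hugepage_bytes(std::istream& in) {
  std::string line;
  while (std::getline(in, line)) {
    std::istringstream fields(line);
    std::string key, unit;
    unsigned long long value = 0;
    // Lines without a unit (e.g. "HugePages_Total: 0") fail this extraction
    // and are skipped.
    if ((fields >> key >> value >> unit) && key == "Hugepagesize:") {
      assert(unit == "kB");  // this entry is expected to be reported in kB
      return value * 1024ULL;
    }
  }
  return 0;
}

// Convenience wrapper for the running system (Linux only, assumed path).
inline unsigned long long system_hugepage_bytes() {
  std::ifstream meminfo("/proc/meminfo");
  return parse_hugepage_bytes(meminfo);
}
```

If system_hugepage_bytes() returns a value other than 2097152, the ALIGN(2M) and LENGTH = 2M values in the linker script would need to be adjusted to match.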

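Default linear addresses of user-space programs on x86_64

Text segment: 0x00400000
Data and BSS segments: the addresses immediately following the text segment (depending on its size)

These values vary with the operating system, compiler, linker, and program structure. Still, 0x400000 is a typical text-segment load address for non-PIE user-space programs on x86_64.

When you use huge-page mappings in a user-space program, make sure the chosen address does not collide with other program segments and follows the operating system's memory layout conventions. To map specific functions (such as hot functions) onto huge pages, adapt the linker script example shown above.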
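The collision and alignment requirements above come down to simple interval arithmetic. The following is a small sketch, assuming a 2 MiB huge page size; the Region type and both helper functions are illustrative names, not something the toolchain provides.

```cpp
#include <cassert>
#include <cstdint>

// A half-open address range [start, start + length).
struct Region {
  std::uint64_t start;
  std::uint64_t length;
};

constexpr std::uint64_t kHugePage = 2ULL * 1024 * 1024;  // assumed 2 MiB

// True if both the start address and the length are multiples of the
// huge page size, so the region can be covered by whole huge pages.
constexpr bool is_huge_aligned(const Region& r) {
  return r.start % kHugePage == 0 && r.length % kHugePage == 0;
}

// True if the two regions share at least one byte.
constexpr bool overlaps(const Region& a, const Region& b) {
  return a.start < b.start + b.length && b.start < a.start + a.length;
}
```

For instance, the BIGPAGES region from the linker script (0x40000000, 2 MiB) passes is_huge_aligned and does not overlap a text segment that starts at 0x00400000 and is a few megabytes long.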

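Linear addresses of user-space programs and the kernel

On x86_64, user-space programs and the kernel use different linear address layouts. The two address ranges are separate, so their default text-segment addresses differ.

User-space programs:

Default text segment linear address: 0x00400000

User-space programs run in the lower part of the virtual address range, and each program's address space is isolated from the others. The default text-segment address is typically 0x00400000 (although it can differ depending on the operating system, compiler, and linker).

x86_64 Linux kernel space:

Default text segment linear address: 0xffffffff81000000

Default kernel data segment: 0xffffffff81a00000

Default kernel BSS segment: 0xffffffff81c00000

Kernel space occupies the upper part of the virtual address range, separate from user space. The kernel's default text-segment address is usually 0xffffffff81000000, but the value can change with the operating system, kernel configuration, and build options.

In short, user-space programs and the kernel have different linear address layouts, and their default text segments sit in different parts of the virtual address range.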
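To make the split concrete, here is a small sketch that classifies an address as belonging to the lower (user) or upper (kernel) half. It assumes 48-bit virtual addresses (4-level paging); systems with 5-level paging use different boundaries. The names AddressHalf and classify are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

enum class AddressHalf { User, Kernel, NonCanonical };

// Assumes 48-bit canonical addresses: the lower half ends at
// 0x00007fffffffffff and the upper half starts at 0xffff800000000000.
constexpr AddressHalf classify(std::uint64_t addr) {
  return addr <= 0x00007fffffffffffULL   ? AddressHalf::User
         : addr >= 0xffff800000000000ULL ? AddressHalf::Kernel
                                         : AddressHalf::NonCanonical;
}
```

With these boundaries, 0x00400000 falls in the user half and 0xffffffff81000000 in the kernel half, matching the defaults listed above.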

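If the default linker script is built into the linker binary

In this case there may be no script file on disk (such as elf_x86_64.x) to include. You can pass -Wl,--verbose during the build to obtain the contents of the default linker script; the linker then writes detailed information about the link, including the script, to standard output.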

g++ -o my_program my_program.cpp -Wl,--verbose

In the output, look for lines like the following:

==================================================
attempt to open /usr/lib/gcc/x86_64-linux-gnu/...

Locate the part that holds the default linker script, which should look similar to:

/* Script for -z combreloc: combine and sort reloc sections */
OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", "elf64-x86-64")
OUTPUT_ARCH(i386:x86-64)
ENTRY(_start)
SEARCH_DIR("/usr/x86_64-linux-gnu/lib64"); SEARCH_DIR("/usr/local/lib64"); ...

Copy the default linker script you found into a new file (for example, default_ld_script.ld), then pull that file into your custom linker script with the INCLUDE directive:

/* linker_script.ld */
INCLUDE default_ld_script.ld
MEMORY
{
  /* Define the huge-page memory region, assumed to start at address 0x40000000 */
  BIGPAGES (rw) : ORIGIN = 0x40000000, LENGTH = 2M
}
SECTIONS
{
  /* Preceding sections */
  /* ... */
  /* Place the hot-function section in the huge-page memory region */
  .hot_functions : ALIGN(2M) {
    *(.hot_functions)
  } > BIGPAGES
  /* Following sections */
  /* ... */
}

Then compile and link again:

g++ -o my_program my_program.cpp -Wl,-T,linker_script.ld

This lets you include the default linker script from your custom linker script.
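Copying the script by hand is error-prone. GNU ld prints its internal script between two separator lines made of '=' characters, so the extraction can be sketched as below. This assumes the verbose output has been captured to a stream first; is_separator and extract_default_script are hypothetical helper names.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// True for a non-empty line made up only of '=' characters.
inline bool is_separator(const std::string& line) {
  return !line.empty() && line.find_first_not_of('=') == std::string::npos;
}

// Returns the text between the first two separator lines,
// or an empty string if fewer than two separators are present.
inline std::string extract_default_script(std::istream& verbose_output) {
  std::string line, script;
  int separators_seen = 0;
  while (std::getline(verbose_output, line)) {
    if (is_separator(line)) {
      if (++separators_seen == 2) return script;
      continue;
    }
    if (separators_seen == 1) script += line + '\n';
  }
  return "";  // no complete script block found
}
```

The returned text could then be written to default_ld_script.ld. The surrounding "attempt to open" trace lines fall outside the separators and are dropped.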

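Steps with gcc

When you use the gcc driver, building a C++ source file into an executable works much like the earlier example and still uses the "linker_script.ld" created above. Here is an example of compiling C++ code with gcc:

First, make sure your C++ code already uses __attribute__((section(".hot_functions"))) to place the hot functions in the dedicated section: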

// my_program.cpp
__attribute__((section(".hot_functions"))) void hot_function1() { /*...*/ }
__attribute__((section(".hot_functions"))) void hot_function2() { /*...*/ }
int main() {
  // Your main function code
  return 0;
}

Then build your C++ code into an executable with the following command, using the "linker_script.ld" created earlier:

gcc -o my_program my_program.cpp -Wl,-T,linker_script.ld -lstdc++

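This command tells GCC to use the linker_script.ld linker script and links the C++ standard library through the -lstdc++ option. The hot functions in the resulting executable "my_program" are placed in the huge-page region.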

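Hot functions in a shared library

In this case you need to map the hot functions inside a shared object (.so) file onto huge pages. To do that, use the same linker script when you build the shared object.

First, make sure the C++ code places the hot functions in the dedicated section (as described earlier):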

// my_shared_library.cpp
__attribute__((section(".hot_functions"))) void hot_function1() { /*...*/ }
__attribute__((section(".hot_functions"))) void hot_function2() { /*...*/ }

Then build the C++ code into a shared object with the following command, using "linker_script.ld":

g++ -shared -fPIC -o libmy_shared_library.so my_shared_library.cpp -Wl,-T,linker_script.ld

Next, create a main program that links against the shared library you just built. Start with a header file that declares the hot functions:

// my_shared_library.h
#ifndef MY_SHARED_LIBRARY_H
#define MY_SHARED_LIBRARY_H
void hot_function1();
void hot_function2();
#endif // MY_SHARED_LIBRARY_H

Then write a simple program that uses the hot functions from the shared library:

// main.cpp
#include "my_shared_library.h"
int main() {
  hot_function1();
  hot_function2();
  return 0;
}

Finally, compile the main program and link it against the shared library:

g++ -o my_program main.cpp -L. -lmy_shared_library

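After these steps, the hot functions in the shared library are placed in the aligned section. With dynamic linking, however, the operating system is responsible for loading the shared object into memory, so there is no guarantee that the hot functions always end up on huge pages. To address this, you can try facilities provided by the operating system, such as madvise or the MAP_HUGETLB flag of mmap on Linux, to request huge pages when the hot functions are loaded. These approaches require changing the program so that it controls the loading and memory mapping of the shared library at run time.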
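As a rough sketch of the madvise route mentioned above: the call only makes sense on a range whose bounds are multiples of the huge page size, so the range is first shrunk to its aligned interior. This assumes Linux, a 2 MiB huge page size, and a kernel with transparent huge pages enabled; whether the hint is honored for file-backed code depends on the kernel configuration. Range, huge_aligned_inner, and advise_huge are illustrative names.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <sys/mman.h>

constexpr std::uintptr_t kHugePage = 2UL * 1024 * 1024;  // assumed 2 MiB

struct Range {
  std::uintptr_t begin;
  std::uintptr_t end;  // one past the last byte
};

// Shrink [begin, end) to the largest sub-range whose bounds are both
// multiples of kHugePage. An empty result has begin == end.
constexpr Range huge_aligned_inner(std::uintptr_t begin, std::uintptr_t end) {
  const std::uintptr_t lo = (begin + kHugePage - 1) / kHugePage * kHugePage;
  const std::uintptr_t hi = end / kHugePage * kHugePage;
  return lo < hi ? Range{lo, hi} : Range{lo, lo};
}

// Ask the kernel to consider transparent huge pages for the range.
// Returns 0 on success or the errno value reported by madvise().
inline int advise_huge(Range r) {
  if (r.begin == r.end) return EINVAL;  // nothing aligned to advise
  void* addr = reinterpret_cast<void*>(r.begin);
  return madvise(addr, r.end - r.begin, MADV_HUGEPAGE) == 0 ? 0 : errno;
}
```

A caller would pass the start and end addresses of the mapped hot section and treat a nonzero return value as "no huge pages for this range" rather than as a fatal error.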

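Libraries loaded with dlopen()

When the shared library is loaded with dlopen(), you can try mapping the library's hot functions onto huge pages at run time through mmap(). The following is one way to combine mmap() and dlopen() for this purpose:

First, create a simple shared library, as before: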

// my_shared_library.cpp
#include <iostream>
extern "C" {
__attribute__((section(".hot_functions"))) void hot_function1() {
  std::cout << "Hot function 1." << std::endl;
}
__attribute__((section(".hot_functions"))) void hot_function2() {
  std::cout << "Hot function 2." << std::endl;
}
}

Build the shared library:

g++ -shared -fPIC -o libmy_shared_library.so my_shared_library.cpp -Wl,-T,linker_script.ld

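Then, in the main program, map the shared library with mmap() and the MAP_HUGETLB flag. This example is Linux-only. Be aware that MAP_HUGETLB is supported only for anonymous mappings and for files on a hugetlbfs mount, so the mmap() call below is expected to fail with EINVAL for a library stored on a regular filesystem, and that this mapping is separate from the one dlopen() creates: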

// main.cpp
#include <dlfcn.h>
#include <fcntl.h>
#include <iostream>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
int main() {
  const char* lib_path = "./libmy_shared_library.so";
  // Get the size of the shared library file
  struct stat st;
  if (stat(lib_path, &st) != 0) {
    perror("Error obtaining file size");
    return 1;
  }
  // Open the shared library file
  int fd = open(lib_path, O_RDONLY);
  if (fd < 0) {
    perror("Error opening file");
    return 1;
  }
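/*
Here, PROT_READ | PROT_EXEC requests a memory region that allows read and execute access, so that hot functions in the mapped shared library can be executed.
Note, however, that mapping a shared library onto huge pages and making sure every hot function uses them involves implementation details. In some cases you may need finer control over how the library is loaded to make sure the hot functions always end up on huge pages. How huge pages are mapped and how memory attributes are set also differs between platforms and operating systems, so it is important to know the best practice for your specific environment.
  */
  // Map the shared library onto huge pages with mmap() and MAP_HUGETLB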
  void* addr = mmap(NULL, st.st_size, PROT_READ | PROT_EXEC, MAP_PRIVATE | MAP_HUGETLB, fd, 0);
  if (addr == MAP_FAILED) {
    perror("Error mapping shared library to huge pages");
    return 1;
  }
  // Load the shared library with dlopen()
  void* handle = dlopen(lib_path, RTLD_NOW | RTLD_GLOBAL);
  if (!handle) {
    std::cerr << "Error loading shared library: " << dlerror() << std::endl;
    return 1;
  }
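  // Look up the hot function symbols
  using hot_function_type = void (*)();
  hot_function_type hot_function1 = reinterpret_cast<hot_function_type>(dlsym(handle, "hot_function1"));
  hot_function_type hot_function2 = reinterpret_cast<hot_function_type>(dlsym(handle, "hot_function2"));
  if (!hot_function1 || !hot_function2) {
    std::cerr << "Error resolving hot function symbols" << std::endl;
    return 1;
  }
  // Call the hot functions
  hot_function1();
  hot_function2();
  // Clean up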
  if (munmap(addr, st.st_size) != 0) {
    perror("Error unmapping shared library");
  }
  if (close(fd) != 0) {
    perror("Error closing file");
  }
  if (dlclose(handle) != 0) {
    std::cerr << "Error unloading shared library: " << dlerror() << std::endl;
  }
  return 0;
}
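To see where a function obtained from dlsym() actually lives, one option is to look up its address in the process's memory map. The sketch below parses /proc/<pid>/maps-style text and returns the mapping that contains a given address. It assumes the Linux maps format ("start-end perms ..."); Mapping and find_mapping are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

struct Mapping {
  std::uintptr_t start;
  std::uintptr_t end;  // one past the last byte
  std::string perms;   // e.g. "r-xp"
  bool found;
};

// Scan maps-style text for the mapping that contains addr.
inline Mapping find_mapping(std::istream& maps, std::uintptr_t addr) {
  std::string line;
  while (std::getline(maps, line)) {
    std::istringstream fields(line);
    std::string range, perms;
    if (!(fields >> range >> perms)) continue;
    const std::size_t dash = range.find('-');
    if (dash == std::string::npos) continue;
    // Both bounds are hexadecimal without a "0x" prefix.
    const std::uintptr_t start = std::stoull(range.substr(0, dash), nullptr, 16);
    const std::uintptr_t end = std::stoull(range.substr(dash + 1), nullptr, 16);
    if (addr >= start && addr < end) return Mapping{start, end, perms, true};
  }
  return Mapping{0, 0, "", false};
}
```

In the program above, this could be called with a std::ifstream opened on /proc/self/maps and the address of hot_function1 cast to std::uintptr_t. The matching entry in /proc/self/smaps then reports how much of that mapping, if any, is backed by huge pages.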

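Exporting hot class member functions

Class member functions can be mapped onto huge pages in much the same way as free functions. To demonstrate, we create a simple class named MyClass with two hot member functions marked __attribute__((section(".hot_functions"))). Here is the complete source of the shared library: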

// my_shared_library.cpp
#include <iostream>
class MyClass {
public:
  __attribute__((section(".hot_functions"))) void hot_member_function1() {
    std::cout << "Hot member function 1." << std::endl;
  }
  __attribute__((section(".hot_functions"))) void hot_member_function2() {
    std::cout << "Hot member function 2." << std::endl;
  }
};
// Export factory functions for creating and deleting MyClass instances
extern "C" {
__attribute__((visibility("default"))) MyClass* create_my_class() {
  return new MyClass();
}
__attribute__((visibility("default"))) void delete_my_class(MyClass* instance) {
  delete instance;
}
}

Build the shared library:

g++ -shared -fPIC -o libmy_shared_library.so my_shared_library.cpp -Wl,-T,linker_script.ld

Next, create a header file that contains the MyClass declaration and the factory function declarations:

// my_shared_library.h
#ifndef MY_SHARED_LIBRARY_H
#define MY_SHARED_LIBRARY_H
class MyClass {
public:
  void hot_member_function1();
  void hot_member_function2();
};
extern "C" {
MyClass* create_my_class();
void delete_my_class(MyClass* instance);
}
#endif // MY_SHARED_LIBRARY_H

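In the main program, map the shared library with mmap() and the MAP_HUGETLB flag, as shown earlier. Then use the factory functions to create and delete a MyClass instance and call the hot member functions: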

// main.cpp
#include "my_shared_library.h"
#include <dlfcn.h>
#include <fcntl.h>
#include <iostream>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
int main() {
  const char* lib_path = "./libmy_shared_library.so";
  // Get the size of the shared library file
  struct stat st;
  if (stat(lib_path, &st) != 0) {
    perror("Error obtaining file size");
    return 1;
  }
  // Open the shared library file
  int fd = open(lib_path, O_RDONLY);
  if (fd < 0) {
    perror("Error opening file");
    return 1;
  }
  // Map the shared library onto huge pages with mmap() and MAP_HUGETLB
  void* addr = mmap(NULL, st.st_size, PROT_READ | PROT_EXEC, MAP_PRIVATE | MAP_HUGETLB, fd, 0);
  if (addr == MAP_FAILED) {
    perror("Error mapping shared library to huge pages");
    return 1;
  }
  // Load the shared library with dlopen()
  void* handle = dlopen(lib_path, RTLD_NOW | RTLD_GLOBAL);
  if (!handle) {
    std::cerr << "Error loading shared library: " << dlerror() << std::endl;
    return 1;
  }
  // Look up the factory function symbols
  using factory_function_type = MyClass* (*)();
  using delete_function_type = void (*)(MyClass*);
  factory_function_type create_my_class = reinterpret_cast<factory_function_type>(dlsym(handle, "create_my_class"));
  delete_function_type delete_my_class = reinterpret_cast<delete_function_type>(dlsym(handle, "delete_my_class"));
  // Create a MyClass instance through the factory function
  MyClass* my_instance = create_my_class();
  // Call the hot member functions
  my_instance->hot_member_function1();
  my_instance->hot_member_function2();
  // Delete the MyClass instance through the factory function
  delete_my_class(my_instance);
  // Clean up
  if (munmap(addr, st.st_size) != 0) {
    perror("Error unmapping shared library");
  }
  if (close(fd) != 0) {
    perror("Error closing file");
  }
  if (dlclose(handle) != 0) {
    std::cerr << "Error unloading shared library: " << dlerror() << std::endl;
  }
  return 0;
}

Compile the main program and link it against the shared library:

g++ -o my_program main.cpp -ldl

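In this example, MyClass's hot member functions are placed in the huge-page section. We use factory functions to create and delete MyClass instances because the class constructor and destructor cannot be accessed directly when the shared library is loaded dynamically. Because main.cpp calls the non-virtual member functions directly, their symbols must also be resolvable when the program is linked; the interface-based variant later in this article avoids that requirement.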

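Instantiating a shared-library class directly in an external program

To instantiate a class from the shared library directly in an external program, you need to export the class definition from the shared library and include the class declaration in the external program. Because constructors and destructors are subject to name mangling, they may not be directly accessible when the shared library is loaded. One way around this is to create factory functions in the shared library for instantiating and destroying the class from the external program. The following example shows how this works.

First, create a class named MyClass in the shared library and declare it as an exported symbol: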

// my_shared_library.cpp
#include <iostream>
class __attribute__((visibility("default"))) MyClass {
public:
  __attribute__((section(".hot_functions"))) void hot_member_function1() {
    std::cout << "Hot member function 1." << std::endl;
  }
  __attribute__((section(".hot_functions"))) void hot_member_function2() {
    std::cout << "Hot member function 2." << std::endl;
  }
};

Build the shared library:

g++ -shared -fPIC -o libmy_shared_library.so my_shared_library.cpp -Wl,-T,linker_script.ld

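Next, include the declaration of the shared library's class in the external program and load the shared library dynamically with dlopen(). Then create a MyClass instance and call the hot member functions. Because the class definition is already exported as a symbol of the shared library, no factory function is needed here. This is the source of the external program: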

// main.cpp
#include "my_shared_library.h"
#include <dlfcn.h>
#include <iostream>
int main() {
  // Load the shared library with dlopen()
  void* handle = dlopen("./libmy_shared_library.so", RTLD_NOW | RTLD_GLOBAL);
  if (!handle) {
    std::cerr << "Error loading shared library: " << dlerror() << std::endl;
    return 1;
  }
  // Create a MyClass instance
  MyClass my_instance;
  // Call the hot member functions
  my_instance.hot_member_function1();
  my_instance.hot_member_function2();
  // Clean up
  if (dlclose(handle) != 0) {
    std::cerr << "Error unloading shared library: " << dlerror() << std::endl;
  }
  return 0;
}

Compile the main program and link it against the shared library:

g++ -o my_program main.cpp -ldl

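In this example, MyClass's hot member functions are placed in the huge-page section and you can instantiate the class directly in the external program. Because the shared library is still loaded dynamically with dlopen(), instantiating its classes directly in the external program can still run into problems, such as name mangling issues when the constructor and destructor are linked.

You can also separate the class interface (an abstract base class) from its implementation so that the implementation part of the shared library is not exposed. This design is commonly called "programming to an interface" and serves a similar purpose to the PImpl idiom.

First, define an interface (abstract base class) in the shared library, as follows: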

// my_shared_library_interface.h
#ifndef MY_SHARED_LIBRARY_INTERFACE_H
#define MY_SHARED_LIBRARY_INTERFACE_H
class IMyClass {
public:
  virtual void hot_member_function1() = 0;
  virtual void hot_member_function2() = 0;
  virtual ~IMyClass() = default;
};
extern "C" {
IMyClass* create_my_class();
void delete_my_class(IMyClass* instance);
}
#endif // MY_SHARED_LIBRARY_INTERFACE_H

Then implement that interface in the shared library:

// my_shared_library.cpp
#include "my_shared_library_interface.h"
#include <iostream>
class MyClass : public IMyClass {
public:
  __attribute__((section(".hot_functions"))) void hot_member_function1() override {
    std::cout << "Hot member function 1." << std::endl;
  }
  __attribute__((section(".hot_functions"))) void hot_member_function2() override {
    std::cout << "Hot member function 2." << std::endl;
  }
};
extern "C" {
IMyClass* create_my_class() {
  return new MyClass();
}
void delete_my_class(IMyClass* instance) {
  delete instance;
}
}

Build the shared library:

g++ -shared -fPIC -o libmy_shared_library.so my_shared_library.cpp -Wl,-T,linker_script.ld

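In the main program, include the shared library's interface header and use the factory functions to create and destroy instances of the class in the shared library. This keeps the library's implementation details hidden from the external program and exposes only the interface header.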

// main.cpp
#include "my_shared_library_interface.h"
#include <dlfcn.h>
#include <iostream>
int main() {
  // Load the shared library with dlopen()
  void* handle = dlopen("./libmy_shared_library.so", RTLD_NOW | RTLD_GLOBAL);
  if (!handle) {
    std::cerr << "Error loading shared library: " << dlerror() << std::endl;
    return 1;
  }
  // Look up the factory function symbols
  using factory_function_type = IMyClass* (*)();
  using delete_function_type = void (*)(IMyClass*);
  factory_function_type create_my_class = reinterpret_cast<factory_function_type>(dlsym(handle, "create_my_class"));
  delete_function_type delete_my_class = reinterpret_cast<delete_function_type>(dlsym(handle, "delete_my_class"));
  // Create a MyClass instance through the factory function
  IMyClass* my_instance = create_my_class();
  // Call the hot member functions
  my_instance->hot_member_function1();
  my_instance->hot_member_function2();
  // Delete the MyClass instance through the factory function
  delete_my_class(my_instance);
  // Clean up
  if (dlclose(handle) != 0) {
    std::cerr << "Error unloading shared library: " << dlerror() << std::endl;
  }
  return 0;
}

Compile the main program and link it against the shared library:

g++ -o my_program main.cpp -ldl

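By using an interface (abstract base class) and factory functions, you can create and operate on instances of a class in the shared library without exposing its internal implementation. This protects the library's implementation and exposes only the necessary interface to external programs.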
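One possible refinement of the factory pair is to tie the two functions together with std::unique_ptr and a custom deleter, so an instance is always released through the library's own delete function. The sketch below is self-contained: the demo namespace provides local stand-ins for what dlsym() would return, and make_instance, MyClassPtr, and the counters are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <memory>

// Reduced copy of the interface from the text (one method is enough here).
class IMyClass {
public:
  virtual void hot_member_function1() = 0;
  virtual ~IMyClass() = default;
};

using CreateFn = IMyClass* (*)();
using DeleteFn = void (*)(IMyClass*);
using MyClassPtr = std::unique_ptr<IMyClass, DeleteFn>;

// Wrap the factory pair so destruction always goes through `destroy`.
inline MyClassPtr make_instance(CreateFn create, DeleteFn destroy) {
  assert(create != nullptr && destroy != nullptr);  // dlsym() may return null
  return MyClassPtr(create(), destroy);
}

// Local stand-ins for the symbols normally obtained via dlsym().
namespace demo {
int calls = 0;
int deletions = 0;
class Impl : public IMyClass {
public:
  void hot_member_function1() override { ++calls; }
};
IMyClass* create() { return new Impl(); }
void destroy(IMyClass* p) {
  ++deletions;
  delete p;
}
}  // namespace demo
```

In a real program the two function pointers would come from dlsym(), and every MyClassPtr must go out of scope before dlclose() is called, since the deleter's code lives in the library.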

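Evaluating the optimization

How much performance improves depends on many factors, such as program structure, access patterns, and system architecture. Mapping hot functions onto huge pages can improve code locality and reduce page faults and TLB misses, which may improve performance. The actual gain, however, has to be determined by measurement.

One possible way to compare performance:

Build two versions of the shared library and program, one with the huge-page optimization and one without.
Run both versions and record the user-mode and kernel-mode CPU time from /proc/self/stat. Compare the CPU time of the two versions to see whether the huge-page optimization effectively reduces it.
Use std::set_new_handler() and std::get_new_handler() from the <new> library to observe memory allocation behavior in both versions at run time (note that a new-handler is only invoked when an allocation fails, so this captures allocation failures rather than every allocation). Compare the results to see whether the huge-page optimization reduces memory management overhead.
Use the std::chrono library to measure the execution time of both versions. Compare the difference to see whether the huge-page optimization makes the program run faster.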
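For the CPU time step, the relevant values are fields 14 (utime) and 15 (stime) of /proc/<pid>/stat, both expressed in clock ticks. A minimal parsing sketch follows; CpuTicks and parse_stat are hypothetical names, and the field positions are taken from the proc(5) layout.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct CpuTicks {
  unsigned long long utime;  // user-mode time in clock ticks
  unsigned long long stime;  // kernel-mode time in clock ticks
  bool ok;                   // false if the line could not be parsed
};

// Parse utime (field 14) and stime (field 15) from a stat line.
// Field 2 (comm) is wrapped in parentheses and may contain spaces,
// so parsing resumes after the last ')'.
inline CpuTicks parse_stat(const std::string& stat_line) {
  const std::size_t close = stat_line.rfind(')');
  if (close == std::string::npos) return CpuTicks{0, 0, false};
  std::istringstream rest(stat_line.substr(close + 1));
  std::string skipped;
  // Fields 3 through 13 come before utime and are not needed here.
  for (int field = 3; field <= 13; ++field) {
    if (!(rest >> skipped)) return CpuTicks{0, 0, false};
  }
  CpuTicks t{0, 0, false};
  t.ok = static_cast<bool>(rest >> t.utime >> t.stime);
  return t;
}
```

A program would read the single line from /proc/self/stat just before exiting and divide the tick counts by sysconf(_SC_CLK_TCK) to obtain seconds.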

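The comparison may show that the huge-page version improves on the unoptimized program in CPU time, memory management, and execution time. The size of the improvement varies with the program and the system, though, and is not always significant. In some cases huge pages yield only limited gains or even hurt certain workloads. Real testing and comparison are therefore essential to confirm that the optimization behaves as expected.

Comparing with a simulated program

The following is a simple example program in two versions, one using the default page size and one using huge pages. We compare the execution time of the two versions.

Suppose our shared library contains one hot function that scans a large block of memory: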

// my_shared_library.cpp
#include <cstddef>
#include <cstdint>
extern "C" {
__attribute__((section(".hot_functions"))) void scan_memory(volatile uint8_t* data, std::size_t size) {
  for (std::size_t i = 0; i < size; ++i) {
    data[i]++;
  }
}
}

Build this shared library as before:

g++ -shared -fPIC -o libmy_shared_library.so my_shared_library.cpp -Wl,-T,linker_script.ld

Then we create a simple program that loads the shared library and calls the hot function. We start with a version that does not use huge pages:

// main_default_page_size.cpp
#include <chrono>
#include <cstdint>
#include <dlfcn.h>
#include <iostream>
#include <vector>
int main() {
  constexpr std::size_t kDataSize = 128 * 1024 * 1024; // 128 MiB
  std::vector<uint8_t> data(kDataSize);
  void* handle = dlopen("./libmy_shared_library.so", RTLD_NOW);
  if (!handle) {
    std::cerr << "Error loading shared library: " << dlerror() << std::endl;
    return 1;
  }
  using scan_memory_type = void (*)(volatile uint8_t*, std::size_t);
  scan_memory_type scan_memory = reinterpret_cast<scan_memory_type>(dlsym(handle, "scan_memory"));
  auto start = std::chrono::steady_clock::now();
  scan_memory(data.data(), data.size());
  auto end = std::chrono::steady_clock::now();
  std::chrono::duration<double> elapsed_seconds = end - start;
  std::cout << "Elapsed time (default page size): " << elapsed_seconds.count() << "s" << std::endl;
  if (dlclose(handle) != 0) {
    std::cerr << "Error unloading shared library: " << dlerror() << std::endl;
  }
  return 0;
}

Next, create a version that uses huge pages, as described earlier:

// main_huge_page_size.cpp
// ... (same as main_default_page_size.cpp, except for the mmap part)
  // Map the shared library onto huge pages with mmap() and MAP_HUGETLB
  void* addr = mmap(NULL, st.st_size, PROT_READ | PROT_EXEC, MAP_PRIVATE | MAP_HUGETLB, fd, 0);
  if (addr == MAP_FAILED) {
    perror("Error mapping shared library to huge pages");
    return 1;
  }
// ... (same as main_default_page_size.cpp)

Compile the two programs:

g++ -o main_default_page_size main_default_page_size.cpp -ldl
g++ -o main_huge_page_size main_huge_page_size.cpp -ldl

Now run the two versions and record their execution times:

./main_default_page_size
./main_huge_page_size

Compare the execution times of the two versions to see whether the huge-page optimization makes the program run faster.
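A single timed run can be noisy, so one option is to repeat the measurement and compare medians instead. The sketch below is an assumption about how such a harness might look, not part of the original benchmark; median and median_seconds are hypothetical helpers built on std::chrono::steady_clock.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

// Median of a non-empty sample set (upper median for even sizes).
inline double median(std::vector<double> samples) {
  assert(!samples.empty());
  const auto mid = static_cast<std::ptrdiff_t>(samples.size() / 2);
  std::nth_element(samples.begin(), samples.begin() + mid, samples.end());
  return samples[static_cast<std::size_t>(mid)];
}

// Run fn() `runs` times and return the median wall-clock time in seconds.
template <typename Fn>
double median_seconds(Fn&& fn, int runs) {
  std::vector<double> samples;
  for (int i = 0; i < runs; ++i) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto end = std::chrono::steady_clock::now();
    samples.push_back(std::chrono::duration<double>(end - start).count());
  }
  return median(std::move(samples));
}
```

In the benchmark programs, the lambda passed to median_seconds would wrap the scan_memory(data.data(), data.size()) call, with the same number of runs used for both versions.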

Below is an example of the output these two programs might produce:

Elapsed time (default page size): 0.345s
Elapsed time (huge page size): 0.294s

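The example above focuses on execution time. A similar approach can be used to compare the link time of the two versions: record timestamps around the build and compute the time spent. The steps for measuring link time with the std::chrono library are:

Get the current timestamp with std::chrono::steady_clock::now().
Run the link step.
Get the current timestamp again with std::chrono::steady_clock::now().
Compute the difference between the two timestamps.

In this example we predict the link-time difference between the program with the huge-page optimization and the one without it.

Because the example program is simple and the change mainly concerns memory layout, the link-time difference can be expected to be small. The huge-page optimization mainly affects run-time performance, not the link stage.

Keep in mind, however, that the actual difference depends on many factors, such as system load, compiler implementation, and hardware performance. Measuring in the real environment is the only way to get precise numbers.

Here is a rough prediction; note that it is only an estimate: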

Link time (default page size): 0.450s
Link time (huge page size): 0.460s

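In this prediction, the link time of the program with the huge-page optimization is slightly higher than that of the program without it. Actual link times may differ, so real measurement matters.

In this simple example, the huge-page version may show an improvement in execution time. This is only a toy case, however; the actual gain depends on the program's workload, access patterns, hardware, and system environment.

When you run the program on a real system, the results may differ from what you expect. It is therefore important to test and compare performance in the real environment to confirm that the optimization actually helps. When tuning performance, also pay attention to memory access patterns, data structures, and other aspects of the program, since they can have a significant effect as well.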
