Linux for Beginners, Note 111: Memory


Contents

Memory subsystem components

Memory improvements

Viewing system calls

Strategies for using memory

Tuning page allocation

Tuning overcommit

Slab cache

ARP cache

Page cache

Tuning strategy (per the official Red Hat Enterprise Linux 6 Performance Tuning Guide)

Tuning for interprocess communication




Memory subsystem components

slab allocator

buddy system

kswapd

pdflush

mmu


Virtualized environments

PA --> HA --> MA

VM address translation: PA --> HA


GuestOS, OS

Shadow PT



Memory improvements



Hugetlbfs

Check whether hugetlbfs is enabled

cat /proc/meminfo | grep Huge


Enable huge pages (persistent)

/etc/sysctl.conf

Add: vm.nr_hugepages = n


Enable immediately (runtime only)

sysctl -w vm.nr_hugepages=n



Configure hugetlbfs if needed by application

Create a mount point and mount hugetlbfs

mkdir /hugepages

mount -t hugetlbfs none /hugepages
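
A minimal verification sketch (the page count 128 is an example value, not a prescription from these notes):

sysctl -w vm.nr_hugepages=128        # reserve 128 huge pages at runtime (example value)
grep -i huge /proc/meminfo           # confirm HugePages_Total / HugePages_Free
mkdir -p /hugepages
mount -t hugetlbfs none /hugepages   # let applications map huge-page-backed files
mount | grep hugetlbfs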



Viewing system calls

Trace every system call made by a program

strace -o /tmp/strace.out -p PID

grep mmap /tmp/strace.out


Summarize system calls

strace -c -p PID or

strace -c COMMAND
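
For instance, a hedged usage sketch (PID 1234 and the 10-second window are placeholders):

strace -o /tmp/strace.out -p 1234 &     # attach to the process and log every syscall
sleep 10; kill $!                       # stop tracing after a short sampling window
grep -c -E 'mmap|brk' /tmp/strace.out   # count memory-allocation related calls

strace -c ls /tmp                       # per-syscall summary for a one-off command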




Strategies for using memory

Reduce overhead for tiny memory objects

Slab cache

cat /proc/slabinfo

Reduce or defer service time for slower subsystems

Filesystem metadata: buffer cache(slab cache)

Disk IO: page cache

Interprocess communications: shared memory

Network IO: buffer cache, arp cache, connection tracking




Considerations when tuning memory

How should pages be reclaimed to avoid pressure?

Larger writes are usually more efficient due to re-sorting



Tuning page allocation

Set using 


vm.min_free_kbytes

Tuning vm.min_free_kbytes is only necessary when an application regularly
needs to allocate a large block of memory, then frees that same memory

It may well be the case that the system has too little disk bandwidth, too
little CPU power, or too little memory to handle its load


Consequences

Reduces service time for demand paging

Memory is not available for other usage

Can cause pressure on ZONE_NORMAL


Exhausting memory will crash the system
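
A minimal sketch of inspecting and raising the reserve (65536 KB is an arbitrary example figure, not a recommendation from these notes):

cat /proc/sys/vm/min_free_kbytes                         # current reserve, in KB
sysctl -w vm.min_free_kbytes=65536                       # raise it at runtime (example value)
echo "vm.min_free_kbytes = 65536" >> /etc/sysctl.conf    # persist across reboots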



Tuning overcommit

Set using


cat /proc/sys/vm/overcommit_memory

vm.overcommit_memory

0 = heuristic overcommit

1 = always overcommit

2 = strict accounting: commit limit is all swap plus a percentage of RAM (the percentage may be > 100)


vm.overcommit_ratio

Specifies the percentage of physical memory that counts toward the commit
limit when vm.overcommit_memory is set to 2


View Committed_AS in /proc/meminfo

An estimate of how much RAM is required to avoid an out of memory (OOM)

condition for the current workload on a system


OOM

Out Of Memory
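
A hedged sketch of strict commit accounting (the 80% ratio is an example value only):

sysctl -w vm.overcommit_memory=2                     # refuse allocations beyond swap + 80% of RAM
sysctl -w vm.overcommit_ratio=80
grep -E 'CommitLimit|Committed_AS' /proc/meminfo     # compare current commitment to the limit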




Slab cache

Tiny kernel objects are stored in slab

Extra overhead of tracking is better than using 1 page/object

Example: filesystem metadata(dentry and inode caches)


Monitoring

/proc/slabinfo

slabtop

vmstat -m


Tuning a particular slab cache

echo "cache_name limit batchcount shared" > /proc/slabinfo

limit: the maximum number of objects that will be cached for each CPU

batchcount: the maximum number of global cache objects that will be
   transferred to the per-CPU cache when it becomes empty

shared: the sharing behavior for Symmetric Multi-Processing (SMP) systems
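
A hedged example for one cache (the dentry cache name is real, but the numbers are illustrative; writing to /proc/slabinfo requires root and a kernel built with the SLAB allocator):

grep dentry /proc/slabinfo                  # inspect current objects and tunables
echo "dentry 256 32 8" > /proc/slabinfo     # limit=256, batchcount=32, shared=8 (example values)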




ARP cache

ARP entries map protocol (IP) addresses to hardware (MAC) addresses

cached in /proc/net/arp

By default, the cache is limited to 512 entries as a soft limit 

and 1024 entries as a hard limit


Garbage collection removes stale or older entries


Insufficient ARP cache leads to

Intermittent timeouts between hosts

ARP thrashing


Too much ARP cache puts pressure on ZONE_NORMAL

List entries

ip neighbor list

Flush cache

ip neighbor flush dev ethX


Tuning ARP cache

Minimum number of entries to keep; below this threshold, garbage collection leaves the ARP table alone

net.ipv4.neigh.default.gc_thresh1

default 128


Soft upper limit

net.ipv4.neigh.default.gc_thresh2

default 512

Becomes hard limit after 5 seconds


Hard upper limit

net.ipv4.neigh.default.gc_thresh3


Garbage collection frequency in seconds

net.ipv4.neigh.default.gc_interval
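
A hedged sketch for a host on a large subnet (the threshold values are examples, not recommendations from these notes; eth0 is a placeholder interface):

ip neighbor list                                     # current ARP/neighbour entries
sysctl -w net.ipv4.neigh.default.gc_thresh1=512      # gc leaves the table alone below this
sysctl -w net.ipv4.neigh.default.gc_thresh2=2048     # soft limit (example value)
sysctl -w net.ipv4.neigh.default.gc_thresh3=4096     # hard limit (example value)
ip neighbor flush dev eth0                           # flush one interface if entries go stale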




Page cache

A large percentage of paging activity is due to I/O requests

File reads: each page of file read from disk into memory

These pages form the page cache


Page cache is always checked for IO requests

Directory reads

Reading and writing regular files

Reading and writing via block device files, DISK IO

Accessing memory mapped files, mmap

Accessing swapped out pages


Pages in the page cache are associated with file data



Tuning page cache

View page cache allocation in /proc/meminfo

Tune length/size of memory

vm.lowmem_reserve_ratio

vm.vfs_cache_pressure


Tune arrival/completion rate

vm.page-cluster

vm.zone_reclaim_mode
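
A quick, minimal sketch for inspecting these knobs and the current page cache usage:

sysctl vm.lowmem_reserve_ratio vm.vfs_cache_pressure vm.page-cluster vm.zone_reclaim_mode
grep -E '^Cached:|^Dirty:|^Writeback:' /proc/meminfo   # page cache size and dirty/writeback backlog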



vm.lowmem_reserve_ratio

For some specialised workloads on highmem machines it is dangerous for the 

kernel to allow process memory to be allocated from the "lowmem" zone


Linux page allocator has a mechanism which prevents allocations which could

use highmem from using too much lowmem


The 'lowmem_reserve_ratio' tunable determines how aggressive the kernel is 

in defending these lower zones


If you have a machine which uses highmem or ISA DMA and your applications
are using mlock(), or if you are running with no swap, then you probably
should change the lowmem_reserve_ratio setting
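
A minimal sketch for inspecting the setting (the value is a list of per-zone ratios; the protected reserve in each zone is the zone size divided by the ratio, so a larger ratio means a smaller reserve):

cat /proc/sys/vm/lowmem_reserve_ratio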



vfs_cache_pressure

Controls the tendency of the kernel to reclaim the memory which is used for

caching of directory and inode objects


At the default value of vfs_cache_pressure=100 the kernel will attempt to 

reclaim dentries and inodes at a "fair" rate with respect to pagecache and 

swapcache reclaim


Decreasing vfs_cache_pressure causes the kernel to prefer to retain dentry 

and inode caches


When vfs_cache_pressure=0, the kernel will never reclaim dentries and 

inodes due to memory pressure and this can easily lead to out-of-memory

conditions


Increasing vfs_cache_pressure beyond 100 causes the kernel to prefer to 

reclaim dentries and inodes.
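
A hedged sketch that favours keeping dentry/inode caches (50 is an example value only):

sysctl -w vm.vfs_cache_pressure=50   # below 100 = prefer to retain dentry/inode caches
slabtop -o | head -20                # one-shot view of the largest slab caches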



page-cluster

page-cluster controls the number of pages which are written to swap in a
single attempt

It is a logarithmic value: setting it to zero means "1 page", setting it to
1 means "2 pages", setting it to 2 means "4 pages", etc


The default value is three (eight pages at a time)

There may be some small benefits in tuning this to a different value if 

your workload is swap-intensive
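
For example (a hedged sketch; whether smaller clusters help depends on the workload and swap device):

cat /proc/sys/vm/page-cluster    # default 3, i.e. 2^3 = 8 pages per swap write
sysctl -w vm.page-cluster=0      # swap a single page at a time (example value)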



zone_reclaim_mode

Zone_reclaim_mode allows someone to set more or less aggressive approaches

to reclaim memory when a zone runs out of memory


If it is set to zero then no zone reclaim occurs


Allocations will be satisfied from other zones/nodes in the system


The value is a bitmask ORed together from

1 = Zone reclaim on

2 = Zone reclaim writes dirty pages out

4 = Zone reclaim swaps pages
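
A hedged example of combining the bits (3 = 1 | 2 is an illustrative combination, not a recommendation):

sysctl -w vm.zone_reclaim_mode=3   # reclaim within the zone and write out dirty pages
sysctl -w vm.zone_reclaim_mode=0   # disable zone reclaim; allocations spill to other zones/nodes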



Anonymous pages

Anonymous pages can be another large consumer of memory


Are not associated with a file, but instead contain:

Program data - arrays, heap allocations, etc

Anonymous memory regions

Dirty memory mapped process private pages

IPC shared memory regions pages


View summary usage

grep Anon /proc/meminfo

cat /proc/PID/statm

Anonymous pages = RSS - Shared


Anonymous pages are eligible for swap
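
A hedged sketch for one process (PID 1234 is a placeholder; /proc/PID/statm reports values in pages):

cat /proc/1234/statm                                    # fields: size resident shared text lib data dt
awk '{print "anon pages:", $2 - $3}' /proc/1234/statm   # anonymous pages = resident - shared
grep -i anon /proc/meminfo                              # system-wide anonymous memory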




Tuning strategy

Hardware tuning: hardware selection

Software tuning: kernel tuning via /proc, /sys

   application tuning



Kernel tuning

1. Process management, CPU

2. Memory tuning

3. I/O tuning

4. Filesystems

5. Network subsystem


Tuning approach

1. Examine the performance metrics and locate the bottleneck

2. Tune




Red Hat publishes an official document, the Red Hat Enterprise Linux 6 Performance Tuning Guide, which can be found by searching.





Tuning for interprocess communication

ipcs (interprocess communication facilities)

Commands for managing interprocess communication

ipcs

ipcrm


shared memory

kernel.shmmni

Specifies the maximum number of shared memory segments 

system-wide, default = 4096


kernel.shmall

Specifies the total amount of shared memory, in pages, that

can be used at one time on the system, default=2097152


This should be at least kernel.shmmax/PAGE_SIZE


kernel.shmmax

Specifies the maximum size of a shared memory segment that 

can be created
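
A hedged sizing sketch (an 8 GB maximum segment is an example figure; the arithmetic assumes a 4096-byte page size):

getconf PAGE_SIZE                      # usually 4096 bytes
sysctl -w kernel.shmmax=8589934592     # allow a single 8 GB segment (example value)
sysctl -w kernel.shmall=2097152        # in pages: 8589934592 / 4096 = 2097152
ipcs -m                                # list current shared memory segments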



messages

kernel.msgmnb

Specifies the maximum number of bytes in a single message
queue, default = 16384


kernel.msgmni

Specifies the maximum number of message queue identifiers, 

default=16


kernel.msgmax

Specifies the maximum size of a message that can be passed 

between processes


This memory cannot be swapped, default=8192
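
A minimal sketch for inspecting the current message queue limits and usage:

sysctl kernel.msgmnb kernel.msgmni kernel.msgmax
ipcs -q    # existing message queues
ipcs -l    # all System V IPC limits currently in force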

This post is reproduced from the Winthcloud blog on 51CTO; original link: http://blog.51cto.com/winthcloud/1930795. Please contact the original author before reprinting.


Winthcloud

