Today I got my hands on a 1.2TB SSD from Shannon Systems (上海宝存). I have been using an OCZ RevoDrive3 X2 all along, so let's compare how the two cards perform.
Thanks to Shannon Systems for providing the SSD card for testing.
First, the performance specifications published by Shannon.
Shannon Direct-IO SSD G2
Bandwidth / IOPS
Usable capacity | 800GB | 1200GB | 1600GB | 3200GB |
---|---|---|---|---|
Read bandwidth | 1.4GB/s | 2.0GB/s | 2.4GB/s | 2.0GB/s |
Write bandwidth | 1.2GB/s | 1.8GB/s | 1.8GB/s | 1.9GB/s |
Random read IOPS (4KB) | 300,000 | 450,000 | 590,000 | 500,000 |
Random write IOPS (4KB) | 310,000 | 460,000 | 480,000 | 480,000 |
Latency
Usable capacity | 800GB | 1200GB | 1600GB | 3200GB |
---|---|---|---|---|
Random read latency (4KB) | 67us | 67us | 67us | 70us |
Random write latency (4KB) | 9us | 9us | 9us | 9us |
Interface configuration
Usable capacity | 800GB/1200GB | 1600GB/3200GB |
---|---|---|
Interface | PCIe 2.0 x8, half-height half-length | PCIe 2.0 x8, full-height half-length |
Power consumption | 6W~25W | 8W~25W |
Environmental specifications
 | Min | Max |
---|---|---|
Operating temperature | 0℃ | 50℃ |
Non-operating temperature | -40℃ | 70℃ |
Operating humidity | 5% | 95% |
Altitude (m) | 3000 | |
Airflow (LFM) | 300 | |
Next, the test environments for the two SSDs under comparison.
Test environment 1
Host: DELL R720xd
Memory: 32GB
CPU: 8-core Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz
SSD: Shannon 1.2TB
OS: CentOS 6.5 x64
Kernel: 2.6.32-431.el6.x86_64
SSD driver information
# modinfo shannon
filename: /lib/modules/2.6.32-431.el6.x86_64/extra/shannon/shannon.ko
license: GPL
alias: pci:v00001CB0d00000275sv*sd*bc*sc*i*
alias: pci:v00001CB0d00000265sv*sd*bc*sc*i*
alias: pci:v000010EEd00006024sv*sd*bc*sc*i*
depends:
vermagic: 2.6.32-431.el6.x86_64 SMP mod_unload modversions
parm: shannon_sector_size:int
parm: shannon_debug_level:int
parm: shannon_force_rw:int
parm: shannon_major:int
parm: shannon_auto_attach:int
PCI interface information (the Shannon card; vendor ID 1cb0 matches the driver's PCI aliases above)
# lspci -vvvv -s 41:00.0
41:00.0 Mass storage controller: Device 1cb0:0275
Subsystem: Device 1cb0:0275
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 150
Region 0: Memory at d40fc000 (32-bit, non-prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [48] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee000d8 Data: 0000
Capabilities: [60] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 256 bytes, MaxReadReq 4096 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Latency L0 unlimited, L1 unlimited
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range B, TimeoutDis-, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [9c] MSI-X: Enable- Count=1 Masked-
Vector table: BAR=0 offset=00000180
PBA: BAR=0 offset=000001e0
Capabilities: [fc] #00 [0000]
Capabilities: [100 v1] Device Serial Number 00-00-00-01-01-00-0a-35
Kernel driver in use: shannon
Kernel modules: shannon
Test environment 2
Host: DELL R610
Memory: 96GB
CPU: 8-core Intel(R) Xeon(R) CPU E5504 @ 2.00GHz
SSD: OCZ RevoDrive3 X2 960G
OS: CentOS 5.8 x64
Kernel: 2.6.18-308.el5
SSD driver information
# modinfo ocz10xx
filename: /lib/modules/2.6.18-308.el5/extra/ocz10xx.ko
version: 2.3.1.1977
license: Proprietary
description: OCZ Linux driver
author: OCZ Technology Group, Inc.
srcversion: 27F0A3AF2BD189FDFA8ED54
alias: pci:v00001B85d00001084sv*sd*bc*sc*i*
alias: pci:v00001B85d00001083sv*sd*bc*sc*i*
alias: pci:v00001B85d00001044sv*sd*bc*sc*i*
alias: pci:v00001B85d00001043sv*sd*bc*sc*i*
alias: pci:v00001B85d00001042sv*sd*bc*sc*i*
alias: pci:v00001B85d00001041sv*sd*bc*sc*i*
alias: pci:v00001B85d00001022sv*sd*bc*sc*i*
alias: pci:v00001B85d00001021sv*sd*bc*sc*i*
alias: pci:v00001B85d00001080sv*sd*bc*sc*i*
depends: scsi_mod
vermagic: 2.6.18-308.4.1.el5 SMP mod_unload gcc-4.1
parm: ocz_msi_enable: Enable MSI Support for OCZ VCA controllers (default=0) (int)
PCI interface information
# lspci -vvvv -s 04:00.0
04:00.0 SCSI storage controller: OCZ Technology Group, Inc. RevoDrive 3 X2 PCI-Express SSD 240 GB (Marvell Controller) (rev 02)
Subsystem: OCZ Technology Group, Inc. RevoDrive 3 X2 PCI-Express SSD 240 GB (Marvell Controller)
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 122
Region 0: Memory at df1a0000 (64-bit, non-prefetchable) [size=128K]
Region 2: Memory at df1c0000 (64-bit, non-prefetchable) [size=256K]
Expansion ROM at df100000 [disabled] [size=64K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [70] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 256 bytes, MaxReadReq 256 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <512ns, L1 <64us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM L0s Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis+
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [140 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Kernel driver in use: ocz10xx
Kernel modules: ocz10xx
Note that although the OCZ card is x8-capable (LnkCap), it negotiated only a x4 link (LnkSta) in this slot.
How to install the Shannon driver:
1. Use a pre-built package
Find the driver RPM that matches your kernel version:
# uname -r
2.6.32-431.el6.x86_64
Install:
shannon-2.6.32-431.el6.x86_64.x86_64-v2.6-9.x86_64.rpm
2. Or build from source
If the package set does not include a build for your kernel version, you can build directly from the source RPM:
shannon-v2.6-9.src.rpm
# rpm -ivh shannon-v2.6-9.src.rpm
# cd /root/rpmbuild
# ll
total 8
drwxr-xr-x 2 root root 4096 Jun 19 21:00 SOURCES
drwxr-xr-x 2 root root 4096 Jun 19 21:00 SPECS
# cd SPECS/
# ll
total 8
-rw-rw-r--. 1 spike spike 7183 May 21 17:10 shannon-driver.spec
# rpmbuild -bb shannon-driver.spec
# cd ..
# ll
total 24
drwxr-xr-x 3 root root 4096 Jun 19 21:05 BUILD
drwxr-xr-x 2 root root 4096 Jun 19 21:05 BUILDROOT
drwxr-xr-x 3 root root 4096 Jun 19 21:05 RPMS
drwxr-xr-x 2 root root 4096 Jun 19 21:00 SOURCES
drwxr-xr-x 2 root root 4096 Jun 19 21:00 SPECS
drwxr-xr-x 2 root root 4096 Jun 19 21:05 SRPMS
# cd RPMS
# ll
total 4
drwxr-xr-x 2 root root 4096 Jun 19 21:05 x86_64
# cd x86_64/
# ll
total 392
-rw-r--r-- 1 root root 401268 Jun 19 21:05 shannon-2.6.32-431.el6.x86_64.x86_64-v2.6-9.x86_64.rpm
# rpm -ivh shannon-2.6.32-431.el6.x86_64.x86_64-v2.6-9.x86_64.rpm
Once installed, you can delete the rpmbuild directory.
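After installing, it's worth a quick sanity check that the module loads and the card appears; a minimal sketch (the /dev/scta and /dev/dfa names follow the Shannon convention shown later in this article):
# modprobe shannon
# lsmod | grep shannon    # the module should be resident
# ls -l /dev/sct* /dev/df*    # controller node /dev/scta, block device /dev/dfa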
Shannon's command-line management tools on Linux
They can monitor the SSD's status, erase data, change the reserved capacity, and more.
(Because the Shannon device has not been added to the smartmontools database, smartctl cannot report this SSD's status.)
Even after downloading the latest smartctl, I still could not read the Shannon's information:
# /opt/smartmontools-6.2/sbin/smartctl -A /dev/dfa
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-2.6.32-431.el6.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
/dev/dfa: Unable to detect device type
Please specify device type with the -d option.
Use smartctl -h to get a usage summary
Let's see which tools the Shannon driver package ships:
# rpm -qa|grep shan
shannon-2.6.32-431.el6.x86_64.x86_64-v2.6-9.x86_64
# rpm -ql shannon-2.6.32-431.el6.x86_64.x86_64-v2.6-9.x86_64
/lib/modules/2.6.32-431.el6.x86_64/extra/shannon/Module.symvers
/lib/modules/2.6.32-431.el6.x86_64/extra/shannon/shannon.ko
/lib/udev/rules.d/60-persistent-storage-shannon.rules
/usr/bin/shannon-attach
/usr/bin/shannon-beacon
/usr/bin/shannon-bugreport
/usr/bin/shannon-detach
/usr/bin/shannon-format
/usr/bin/shannon-status
Here is each management tool in turn.
1. shannon-attach
Binds the SSD card to the operating system. Normally you don't need to run this, unless you have detached the card.
# /usr/bin/shannon-attach -h
Usage: shannon-attach [OPTIONS] [device]
Attaches Direct-IO PCIe SSD card and makes it available to the operating system.
OPTIONS:
-h, --help
Display this usage message.
-n, --readonly
Attach the Direct-IO drive and set access mode to readonly.
-r, --reduced-write
Attach the Direct-IO drive and set access mode to reduced-write.
If no -n or -r option is given, set access mode to normal readwrite.
[device]
Device node for the Direct-IO PCIe SSD card (/dev/sctx).
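For example, to re-attach a previously detached card (using the device node /dev/scta reported by shannon-status later in this article; a sketch rather than an official procedure):
# /usr/bin/shannon-attach /dev/scta    # attach in normal readwrite mode
# /usr/bin/shannon-attach -n /dev/scta    # alternatively, attach readonly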
2. shannon-beacon
Lights the card's LED; handy for physically telling one SSD from another when several are installed.
# /usr/bin/shannon-beacon -h
Usage: shannon-beacon [OPTIONS] [device]
Lights the Direct-IO PCIe SSD card's yellow LED to locate the device.
This utility always turns the LED on, unless you specifically use the -o option.
OPTIONS:
-h, --help
Display this usage message.
-l, --on
Light on the yellow LED.
-o, --off
Turn off the yellow LED.
[device]
Device node for the Direct-IO PCIe SSD card (/dev/sctx).
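For example, to identify which physical card is /dev/scta and then switch the LED off again:
# /usr/bin/shannon-beacon -l /dev/scta    # light the yellow LED
# /usr/bin/shannon-beacon -o /dev/scta    # turn it off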
3. shannon-bugreport
Collects a bug report. For example:
# /usr/bin/shannon-bugreport -h
usage: shannon-bugreport
# /usr/bin/shannon-bugreport
$hostname$
Linux-2.6.32-431.el6.x86_64
Copying files into system... done.
Copying files into system... done.
Copying files into system... done.
Copying files into proc... done.
Copying files into proc/self... done.
Copying files into log... done.
Copying files into log... done.
Copying files into log... done.
Copying files into log... done.
Dumping dmesg ... done.
Copying files into sys... done.
Copying files into sys... done.
Copying files into sys... done.
Copying files into sys... done.
Dumping lspci -vvvvv... done.
Dumping uname -a... done.
Dumping hostname ... done.
Dumping ps aux... done.
Dumping ps aux --sort start_time... done.
Dumping pstree ... done.
Dumping lsof ... done.
Dumping w ... done.
Dumping lsmod ... done.
Dumping dmidecode ... done.
Dumping sar -A... done.
Dumping sar -r... done.
Dumping sar ... done.
Dumping iostat -dmx 1 5... done.
Dumping vmstat 1 5... done.
Dumping top -bc -d1 -n5... done.
Copying files into disk... done.
Copying files into disk... done.
Dumping df -h... done.
Dumping pvs ... done.
Dumping vgs ... done.
Dumping lvs ... done.
Dumping dmsetup table... done.
Dumping dmsetup status... done.
Gathering information using shannon-status... done.
Dumping numactl --hardware... done.
Dumping numactl --show... done.
Dumping numastat ... done.
Copying files into debug... done.
Building tarball...
The report is saved as a tarball under /tmp, convenient for sending diagnostic data to the vendor:
Tarball: /tmp/shannon-bugreport-20140619-211326.tar.gz
Plz send it to our customer service, including steps to reproduce the problem.
All the information would help us address the issue tremendously.
4. shannon-detach
Removes the SSD device from the operating system, i.e., unbinds it.
# /usr/bin/shannon-detach -h
Usage: shannon-detach [OPTIONS] [device]
Detaches and removes the corresponding /dev/dfx Direct-IO block device.
OPTIONS:
-h, --help
Display this usage message.
[device]
Device node for the Direct-IO PCIe SSD card (/dev/sctx).
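For example (unmount any filesystem on the block device first; a sketch assuming the /mnt mount point used later in this article):
# umount /mnt
# /usr/bin/shannon-detach /dev/scta    # /dev/dfa disappears until the next shannon-attach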
5. shannon-format
Formats the SSD device; it can also probe the current physical capacity, erase the drive, set the minimum logical access unit, set the user-visible capacity, and so on.
Note that an SSD's physical capacity is normally larger than its usable capacity: the reserved portion is used for error correction and for replacing bad blocks. Flash cells tolerate only a limited number of program/erase cycles; beyond that limit a cell is physically worn out and becomes a bad block, and bad blocks need spare capacity to be remapped and repaired.
Shannon reserves about 27% by default; the reserve level can be seen with the status command.
As bad blocks accumulate, the SSD keeps working, but once the reserved capacity is nearly exhausted the drive degrades further as more blocks go bad, and is eventually scrapped.
So when bad blocks pile up, you can shrink the user-visible capacity to keep a 27% (or larger) reserve and extend the drive's service life, until all blocks have gone bad and it can no longer be written.
# /usr/bin/shannon-format -h
Usage: shannon-format [OPTIONS] [device]
Direct-IO PCIe SSD card is pre-formated before shipped to the customer.
This tool can perform a re-format as needed.
WARNING:
Re-format will erase all your data on the drive.
Please use shannon-detach to detach block device before using this tool,
OPTIONS:
-h, --help
Display this usage message.
-p, --probe
Probe current physical capacity and advertised user capacity.
-e, --erase
Erase all data and re-format the drive without changing setting.
-y, --always-yes
Auto-answer "yes" to all queries from this tool (i.e. bypass prompts).
-i DELAY, --interrupt=DELAY
Set interrupt delay (unit: us).
-b SIZE, --logical-block=SIZE
Set logical block size, i.e the minimum access unit from host.
-s CAPACITY, --capacity=CAPACITY
Set user capacity as a specific size(in TB, GB, or MB)
or as a percentage(such as 70%) of the advertised capacity.
-o CAPACITY, --overformat=CAPACITY
Over-format user capacity (to greater than the advertised capacity)
as a specific size(in TB, GB, or MB)
or as a percentage(e.g. 70%) of physical capacity.
-a, --advertised Set user capacity to advertised capacity directly.
Warning: -s, -o , -a options are mutually exclusive!
[device]
Device node for the Direct-IO PCIe SSD card (/dev/sctx).
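As a hypothetical run of shrinking the user capacity to enlarge the reserve (device node and percentage are placeholders; -s wipes the drive):
# /usr/bin/shannon-format -p /dev/scta    # probe physical vs. advertised user capacity
# /usr/bin/shannon-detach /dev/scta    # the block device must be detached first
# /usr/bin/shannon-format -s 70% /dev/scta    # DESTROYS ALL DATA: user capacity = 70% of advertised
# /usr/bin/shannon-attach /dev/scta    # re-attach; the card returns as /dev/dfa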
6. shannon-status
As mentioned above, the Shannon SSD is not in the smartmontools database, but the vendor ships this tool to inspect the SSD's status: remaining life, bad-block percentage, reserved space, temperature sensor readings, and more.
# /usr/bin/shannon-status -h
Usage: shannon-status [OPTIONS] [device]
Shows information about Direct-IO PCIe card(s) installed on the system.
OPTIONS:
-h, --help
Display this usage message.
-r SECS, --refresh=SECS
Set refresh interval of monitoring (unit: second, default: 2 seconds).
-a, --all
Find all Direct-IO drive(s), and provide basic information of them.
-m, --monitor
If given, this tool will open a monitoring window,
which dynamically shows detailed information of the specified drive.
-p, --print
Generate key=value format output for easier parsing.
[device]
Device node for the Direct-IO PCIe SSD card (/dev/sctx).
For example:
# /usr/bin/shannon-status -a
Found Shannon PCIE SSD card /dev/scta:
Direct-IO drive scta at PCI Address:41:00:0:  (PCI address; use lspci against this address to see the device details)
Model:sh-shannon-pcie-ssd, SN: 006819246149b014  (serial number)
Device state: attached as disk /dev/dfa, Access mode: readwrite
Firmware version: 0c321351, Driver version: 2.6.9  (firmware and driver versions)
Vendor:1cb0, Device:0275, Sub vendor:1cb0, Sub device:0275
Flash manufacturer: 98, Flash id: 3a
Channels: 7, Lunsets in channel: 8, Luns in lunset: 2, Available luns: 112
Eblocks in lun: 2116, Pages in eblock: 256, Nand page size: 32768  (NAND flash geometry)
Logical sector: 512, Physical sector: 4096  (logical and physical sector sizes in bytes; align partitions to the physical sector size, i.e., 4K alignment)
User capacity: 1200.00 GB/1117.59 GiB  (user-visible capacity)
Physical capacity: 1632.37 GB/1520.26 GiB  (physical capacity; roughly 400GB is held back for bad-block repair)
Overprovision: 27%, warn at 10%
Error correction: 35 bits per 880 bytes codeword  (error-correction parameters)
Controller internal temperature: 71 degC, max 77 degC  (temperature sensors)
Controller board temperature: 53 degC, max 59 degC
NAND Flash temperature: 53 degC, max 63 degC
Internal voltage: 1001 mV, max 1028 mV  (voltages)
Auxiliary voltage: 1795 mV, max 1804 mV
Media status: 1.0760% bad block  (media status: already about 1% bad blocks)
Power on hours: 9 hours 15 minutes, Power cycles: 3  (power-on time and power-cycle count)
Lifetime data volumes:  (lifetime statistics)
Host write data : 13774.36 GB / 12828.37 GiB
Host read data : 5172.11 GB / 4816.90 GiB
Total write data : 19974.61 GB / 18602.80 GiB
Write amplifier : 1.4501
Estimated life left: 99% left  (estimated remaining life; a useful reference for deciding when to replace the drive)
Totally found 1 Direct-IO PCIe SSD card on this system.
Key=value output format; the information is much the same as above.
# /usr/bin/shannon-status -p /dev/scta
drive=/dev/scta
pci_address=41:00:0
model=sh-shannon-pcie-ssd
serial_number=006819246149b014
device_state=attached as disk /dev/dfa
access_mode=readwrite
firmware_version=0c321351
driver_version=2.6.9
vendor_id=1cb0
device_id=0275
subsystem_vendor_id=1cb0
subsystem_device_id=0275
flash_manufacturer=98
flash_id=3a
channels=7
lunsets_in_channel=8
luns_in_lunset=2
available_luns=112
eblocks_in_lun=2116
pages_in_eblock=256
nand_page_size=32768
logical_sector=512
physical_sector=4096
user_capacity=1200.00 GB
physical_capacity=1632.37 GB
overprovision=27%
error_correction=35 bits per 880 bytes codeword
controller_temp=71 degC
controller_temp_max=77 degC
board_temp=53 degC
board_temp_max=59 degC
flash_temp=53 degC
flash_temp_max=63 degC
internal_voltage=1004 mV
internal_voltage_max=1028 mV
auxiliary_voltage=1790 mV
auxiliary_voltage_max=1804 mV
bad_block_percentage=1.0760%
power_on_hours=9 hours 15 minutes
power_cycles=3
host_write_data=13801.32 GB
host_read_data=5172.11 GB
total_write_data=20001.57 GB
write_amplifier=1.4493
estimated_life_left=99%
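The key=value form lends itself to scripting; for instance, to pull just the health fields for a monitoring agent (a sketch):
# /usr/bin/shannon-status -p /dev/scta | grep -E '^(bad_block_percentage|estimated_life_left)='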
Real-time monitoring example:
# /usr/bin/shannon-status -m -r 1 /dev/scta
Direct-IO PCIe SSD Card Monitor Program
Commands: q|Q exit; g|G general info; m|M main window
We are now monitoring disk 'scta' at PCI:41:00:0
Capacity: 1200.00 GB/1117.59 GiB, Block size: 4096, Overprovision: 27%
Power on hours : 9 hours 20 minutes
Power cycles : 3
Controller internal temp : 72 degC, max 77 degC
Controller board temp : 53 degC, max 59 degC
NAND Flash temperature : 53 degC, max 63 degC
Internal voltage : 990 mV, max 1028 mV
Auxiliary voltage : 1781 mV, max 1804 mV
Free block count : 32
Host write data : 13987.89 GB / 13027.24 GiB
Write Bandwith : 733.258 MB/s / 699.289 MiB/s
Write IOPS : 89.509 K
Avg write latency : 0.013 ms
Host read data : 5172.11 GB / 4816.90 GiB
Read Bandwith : 0.000 MB/s / 0.000 MiB/s
Read IOPS : 0.000 K
Avg read latency : 0.000 ms
Total write data : 20188.16 GB / 18801.69 GiB
Total write Bandwith : 733.258 MB/s / 699.289 MiB/s
Write amplifier : life 1.443, transient 1.000
Buffer write percentage : 99%
The corresponding PCI interface information can again be inspected with lspci -vvvv -s 41:00.0; the output is identical to the one shown under test environment 1 above.
Now for the key part. Two things get tested: first, the performance of the fsync interface, which directly affects database checkpoints, xlog flushes, and the like; second, PostgreSQL read/write performance, with the test model given further below.
1. fsync interface performance
Mind physical-block alignment when partitioning; this article uses 4K alignment. (According to a Shannon engineer I spoke to yesterday, their SSDs do not suffer from this alignment problem.)
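For reference, a minimal 4K-aligned partitioning sketch (device, filesystem, and mount point are placeholders; a 1MiB start offset is a multiple of the 4096-byte physical sector):
# parted -s /dev/dfa mklabel gpt
# parted -s -a optimal /dev/dfa mkpart primary 1MiB 100%    # 1MiB start => 4K-aligned
# mkfs.ext4 /dev/dfa1
# mount -o noatime /dev/dfa1 /mnt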
Also, run the test only after more than 50% of the drive has been written, because the OCZ has a known issue: once more than half of its capacity has been used, performance drops.
/dev/dfa1 788G 733G 15G 99% /mnt
Finally, note that CPU differences may prevent a single process from pushing IOPS to the device's limit; in that case, run several test processes in parallel until iostat reports %util = 100 for the block device.
This article runs 2 processes in parallel, with one 8KB block per fsync. (To test 4K blocks, rebuild PostgreSQL with --with-wal-blocksize=4; the supported block sizes are 1, 2, 4, 8, 16, 32 and 64. Alternatively, use dd with obs to set the block size and oflag=sync,nonblock,noatime for synchronous writes; see the sketch below.)
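If you don't want to build pg_test_fsync, a rough dd-based equivalent of the 8KB synchronous-write test looks like this (path and count are placeholders; bs sets both the read and write block size):
# dd if=/dev/zero of=/ssd/ddtest bs=8k count=100000 oflag=sync,nonblock,noatime
# iostat -dmx 1    # in another terminal: watch the device's %util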
OCZ results
Process 1
# /data_ssd0/pgsql9.3.4/bin/pg_test_fsync -f /ssd/1
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 10448.416 ops/sec 96 usecs/op
fdatasync 1002.611 ops/sec 997 usecs/op
fsync 532.699 ops/sec 1877 usecs/op
fsync_writethrough n/a
open_sync 6525.048 ops/sec 153 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 3097.379 ops/sec 323 usecs/op
fdatasync 939.144 ops/sec 1065 usecs/op
fsync 481.663 ops/sec 2076 usecs/op
fsync_writethrough n/a
open_sync 3378.838 ops/sec 296 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 4442.898 ops/sec 225 usecs/op
2 * 8kB open_sync writes 3129.062 ops/sec 320 usecs/op
4 * 4kB open_sync writes 1919.259 ops/sec 521 usecs/op
8 * 2kB open_sync writes 978.837 ops/sec 1022 usecs/op
16 * 1kB open_sync writes 518.352 ops/sec 1929 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 493.002 ops/sec 2028 usecs/op
write, close, fsync 522.625 ops/sec 1913 usecs/op
Non-Sync'ed 8kB writes:
write 189268.154 ops/sec 5 usecs/op
Process 2
# /data_ssd0/pgsql9.3.4/bin/pg_test_fsync -f /ssd/2
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 3278.009 ops/sec 305 usecs/op
fdatasync 1074.887 ops/sec 930 usecs/op
fsync 496.757 ops/sec 2013 usecs/op
fsync_writethrough n/a
open_sync 5928.269 ops/sec 169 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 3312.785 ops/sec 302 usecs/op
fdatasync 1006.541 ops/sec 994 usecs/op
fsync 524.735 ops/sec 1906 usecs/op
fsync_writethrough n/a
open_sync 3129.400 ops/sec 320 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 4433.001 ops/sec 226 usecs/op
2 * 8kB open_sync writes 3117.236 ops/sec 321 usecs/op
4 * 4kB open_sync writes 1911.074 ops/sec 523 usecs/op
8 * 2kB open_sync writes 978.444 ops/sec 1022 usecs/op
16 * 1kB open_sync writes 551.965 ops/sec 1812 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 553.674 ops/sec 1806 usecs/op
write, close, fsync 601.139 ops/sec 1664 usecs/op
Non-Sync'ed 8kB writes:
write 194588.313 ops/sec 5 usecs/op
Two processes were already enough to hit the OCZ's fsync ceiling.
Shannon results
Process 1
# /opt/pgsql9.3.4/bin/pg_test_fsync -f /ssd/1
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 36722.105 ops/sec 27 usecs/op
fdatasync 38222.994 ops/sec 26 usecs/op
fsync 33121.821 ops/sec 30 usecs/op
fsync_writethrough n/a
open_sync 46673.776 ops/sec 21 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 21869.056 ops/sec 46 usecs/op
fdatasync 21256.613 ops/sec 47 usecs/op
fsync 17299.647 ops/sec 58 usecs/op
fsync_writethrough n/a
open_sync 23230.884 ops/sec 43 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 22936.731 ops/sec 44 usecs/op
2 * 8kB open_sync writes 22550.950 ops/sec 44 usecs/op
4 * 4kB open_sync writes 13588.508 ops/sec 74 usecs/op
8 * 2kB open_sync writes 432.588 ops/sec 2312 usecs/op
16 * 1kB open_sync writes 269.104 ops/sec 3716 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 15686.374 ops/sec 64 usecs/op
write, close, fsync 15723.477 ops/sec 64 usecs/op
Non-Sync'ed 8kB writes:
write 187060.951 ops/sec 5 usecs/op
Process 2
# /opt/pgsql9.3.4/bin/pg_test_fsync -f /ssd/2
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 37496.895 ops/sec 27 usecs/op
fdatasync 38063.802 ops/sec 26 usecs/op
fsync 33193.801 ops/sec 30 usecs/op
fsync_writethrough n/a
open_sync 46719.307 ops/sec 21 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 21853.038 ops/sec 46 usecs/op
fdatasync 21022.927 ops/sec 48 usecs/op
fsync 17531.537 ops/sec 57 usecs/op
fsync_writethrough n/a
open_sync 23044.216 ops/sec 43 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 22933.531 ops/sec 44 usecs/op
2 * 8kB open_sync writes 22986.931 ops/sec 44 usecs/op
4 * 4kB open_sync writes 13640.880 ops/sec 73 usecs/op
8 * 2kB open_sync writes 462.776 ops/sec 2161 usecs/op
16 * 1kB open_sync writes 260.950 ops/sec 3832 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 15492.679 ops/sec 65 usecs/op
write, close, fsync 15737.468 ops/sec 64 usecs/op
Non-Sync'ed 8kB writes:
write 190884.093 ops/sec 5 usecs/op
Two fsync processes put no visible pressure on the Shannon card, so I ran 3 processes; the results follow.
Process 1
# /opt/pgsql9.3.4/bin/pg_test_fsync -f /ssd/1
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 29959.548 ops/sec 33 usecs/op
fdatasync 26916.071 ops/sec 37 usecs/op
fsync 19083.832 ops/sec 52 usecs/op
fsync_writethrough n/a
open_sync 27749.411 ops/sec 36 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 14162.292 ops/sec 71 usecs/op
fdatasync 15988.317 ops/sec 63 usecs/op
fsync 12431.108 ops/sec 80 usecs/op
fsync_writethrough n/a
open_sync 13515.127 ops/sec 74 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 18787.955 ops/sec 53 usecs/op
2 * 8kB open_sync writes 13915.061 ops/sec 72 usecs/op
4 * 4kB open_sync writes 8684.830 ops/sec 115 usecs/op
8 * 2kB open_sync writes 347.167 ops/sec 2880 usecs/op
16 * 1kB open_sync writes 269.654 ops/sec 3708 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 13616.426 ops/sec 73 usecs/op
write, close, fsync 13115.266 ops/sec 76 usecs/op
Non-Sync'ed 8kB writes:
write 161001.305 ops/sec 6 usecs/op
Process 2
# /opt/pgsql9.3.4/bin/pg_test_fsync -f /ssd/2
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 29773.938 ops/sec 34 usecs/op
fdatasync 26694.424 ops/sec 37 usecs/op
fsync 19525.263 ops/sec 51 usecs/op
fsync_writethrough n/a
open_sync 27408.323 ops/sec 36 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 14181.855 ops/sec 71 usecs/op
fdatasync 16052.939 ops/sec 62 usecs/op
fsync 12490.750 ops/sec 80 usecs/op
fsync_writethrough n/a
open_sync 13723.671 ops/sec 73 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 18485.904 ops/sec 54 usecs/op
2 * 8kB open_sync writes 13690.145 ops/sec 73 usecs/op
4 * 4kB open_sync writes 8538.992 ops/sec 117 usecs/op
8 * 2kB open_sync writes 336.883 ops/sec 2968 usecs/op
16 * 1kB open_sync writes 314.388 ops/sec 3181 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 13707.362 ops/sec 73 usecs/op
write, close, fsync 12839.225 ops/sec 78 usecs/op
Non-Sync'ed 8kB writes:
write 153528.755 ops/sec 7 usecs/op
Process 3
# /opt/pgsql9.3.4/bin/pg_test_fsync -f /ssd/3
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 29921.701 ops/sec 33 usecs/op
fdatasync 26656.149 ops/sec 38 usecs/op
fsync 19701.098 ops/sec 51 usecs/op
fsync_writethrough n/a
open_sync 27229.080 ops/sec 37 usecs/op
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync 14177.152 ops/sec 71 usecs/op
fdatasync 15772.293 ops/sec 63 usecs/op
fsync 12589.742 ops/sec 79 usecs/op
fsync_writethrough n/a
open_sync 13787.782 ops/sec 73 usecs/op
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write 18697.155 ops/sec 53 usecs/op
2 * 8kB open_sync writes 13377.013 ops/sec 75 usecs/op
4 * 4kB open_sync writes 8537.356 ops/sec 117 usecs/op
8 * 2kB open_sync writes 349.729 ops/sec 2859 usecs/op
16 * 1kB open_sync writes 311.943 ops/sec 3206 usecs/op
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 13971.711 ops/sec 72 usecs/op
write, close, fsync 12785.385 ops/sec 78 usecs/op
Non-Sync'ed 8kB writes:
write 147119.435 ops/sec 7 usecs/op
Summing the "one 8kB write" results across all processes, the Shannon 1.2TB far outperforms the OCZ RevoDrive3 X2 960G:
OCZ
open_datasync 13726.425 ops/sec
fdatasync 2077.498 ops/sec
fsync 1029.456 ops/sec
fsync_writethrough n/a
open_sync 12453.317 ops/sec
Shannon
open_datasync 89655.187 ops/sec
fdatasync 80266.644 ops/sec
fsync 58310.193 ops/sec
fsync_writethrough n/a
open_sync 82386.814 ops/sec
2. Database TPS performance (of limited value for judging the SSD itself, since the bottleneck may be the CPU)
Database version: PostgreSQL 9.3.4
Database configuration. The key setting is the WAL fsync method; based on the results above, I chose the best-performing method for each card: open_datasync for the Shannon, and open_sync for the OCZ (which does not support open_datasync).
grep "^[a-z]" postgresql.conf
listen_addresses = '0.0.0.0' # what IP address(es) to listen on;
port = 5432 # (change requires restart)
max_connections = 100 # (change requires restart)
unix_socket_directories = '.' # comma-separated list of directories
shared_buffers = 2048MB # min 128kB
maintenance_work_mem = 512MB # min 1MB
vacuum_cost_delay = 10 # 0-100 milliseconds
vacuum_cost_limit = 10000 # 1-10000 credits
bgwriter_delay = 10ms # 10-10000ms between rounds
wal_level = hot_standby # minimal, archive, or hot_standby
synchronous_commit = on # synchronization level;
wal_sync_method = open_datasync # the default is the first option
wal_buffers = 16384kB # min 32kB, -1 sets based on shared_buffers
checkpoint_segments = 128 # in logfile segments, min 1, 16MB each
effective_cache_size = 24000MB
log_destination = 'csvlog' # Valid values are combinations of
logging_collector = on # Enable capturing of stderr and csvlog
log_truncate_on_rotation = on # If on, an existing log file with the
log_timezone = 'PRC'
autovacuum = on # Enable autovacuum subprocess? 'on'
log_autovacuum_min_duration = 0 # -1 disables, 0 logs all actions and
datestyle = 'iso, mdy'
timezone = 'PRC'
lc_messages = 'C' # locale for system error message
lc_monetary = 'C' # locale for monetary formatting
lc_numeric = 'C' # locale for number formatting
lc_time = 'C' # locale for time formatting
default_text_search_config = 'pg_catalog.english'
Test model
postgres=# create table test (id int primary key, info text, crt_time timestamp);
CREATE TABLE
postgres=# create or replace function f(v_id int) returns void as
$$
declare
begin
update test set info=md5(now()::text),crt_time=now() where id=v_id;
if not found then
insert into test values (v_id, md5(now()::text), now());
end if;
return;
exception when others then
return;
end;
$$ language plpgsql strict;
CREATE FUNCTION
$ vi test.sql
\setrandom vid 1 10000000
select f(:vid);
Test method
pgbench -M prepared -n -r -f ./test.sql -c 12 -j 4 -T 300
Results
OCZ
pgbench -M prepared -n -r -f ./test.sql -c 12 -j 4 -T 300 -h /data_ssd1/test -p 5432 -U postgres postgres
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 12
number of threads: 4
duration: 300 s
number of transactions actually processed: 6215372
tps = 20714.778666 (including connections establishing)
tps = 20715.587367 (excluding connections establishing)
statement latencies in milliseconds:
0.001759 \setrandom vid 1 10000000
0.575683 select f(:vid);
There is a clear gap compared with asynchronous commit:
pgbench -M prepared -n -r -f ./test.sql -c 12 -j 4 -T 30 -h /data_ssd1/test -p 5432 -U postgres postgres
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 12
number of threads: 4
duration: 30 s
number of transactions actually processed: 1284258
tps = 42797.519722 (including connections establishing)
tps = 42813.485019 (excluding connections establishing)
statement latencies in milliseconds:
0.001762 \setrandom vid 1 10000000
0.276727 select f(:vid);
Shannon
$ /opt/pgsql9.3.4/bin/pgbench -M prepared -n -r -f ./test.sql -c 12 -j 4 -T 300 -h /mnt/pg_root -p 5432 -U postgres postgres
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 12
number of threads: 4
duration: 300 s
number of transactions actually processed: 10865222
tps = 36214.699980 (including connections establishing)
tps = 36216.135752 (excluding connections establishing)
statement latencies in milliseconds:
0.002184 \setrandom vid 1 10000000
0.327720 select f(:vid);
Still a sizable gap compared with asynchronous commit (synchronous_commit = off):
$ /opt/pgsql9.3.4/bin/pgbench -M prepared -n -r -f ./test.sql -c 12 -j 4 -T 30 -h /mnt/pg_root -p 5432 -U postgres postgres
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 12
number of threads: 4
duration: 30 s
number of transactions actually processed: 1740723
tps = 58010.769125 (including connections establishing)
tps = 58032.766226 (excluding connections establishing)
statement latencies in milliseconds:
0.001867 \setrandom vid 1 10000000
0.203650 select f(:vid);
Judging from the async-commit numbers, the OCZ host is noticeably slower because its CPU is weaker than that of the Shannon host.
Also note that the TPS gap between the OCZ and the Shannon is nowhere near as large as the raw fsync gap, because the bottleneck is no longer I/O but the CPU.
Test data summary:
(Note that the two test machines' CPUs run at 2.0GHz and 2.5GHz respectively, so you could scale the OCZ numbers by a factor of about 1.25. When I have time I will move the OCZ card into the same server, on a PCIe slot of the same speed, and retest; treat the current numbers as indicative only.)
(Likewise, the database tests are not pure I/O tests; much of the cost is on the CPU, so they too are for reference only.)
(The fsync tests measured fsync of a single 8K block.)
As for the flash-cell lifetime of this Shannon SSD: I have only had it for a day, so it has not accumulated much write wear yet. Below is its current health status.
At the current usage intensity (essentially abuse), the vendor's endurance figure (capacity × 10,000) gives roughly 200 days before the drive wears out: 1.2TB × 10,000 ≈ 12PB of total writes, and the status below shows about 48TB written in 21 hours of power-on time, i.e. ~55TB/day, so 12PB ÷ 55TB/day ≈ 220 days.
Direct-IO drive scta at PCI Address:41:00:0:
Model:sh-shannon-pcie-ssd, SN: 006819246149b014
Device state: attached as disk /dev/dfa, Access mode: readwrite
Firmware version: 0c321351, Driver version: 2.6.9
Vendor:1cb0, Device:0275, Sub vendor:1cb0, Sub device:0275
Flash manufacturer: 98, Flash id: 3a
Channels: 7, Lunsets in channel: 8, Luns in lunset: 2, Available luns: 112
Eblocks in lun: 2116, Pages in eblock: 256, Nand page size: 32768
Logical sector: 512, Physical sector: 4096
User capacity: 1200.00 GB/1117.59 GiB
Physical capacity: 1632.37 GB/1520.26 GiB
Overprovision: 27%, warn at 10%
Error correction: 35 bits per 880 bytes codeword
Controller internal temperature: 70 degC, max 77 degC
Controller board temperature: 52 degC, max 59 degC
NAND Flash temperature: 52 degC, max 63 degC
Internal voltage: 999 mV, max 1028 mV
Auxiliary voltage: 1787 mV, max 1807 mV
Media status: 1.0760% bad block
Power on hours: 21 hours 9 minutes, Power cycles: 3
Lifetime data volumes:
Host write data : 41516.25 GB / 38665.02 GiB
Host read data : 5172.11 GB / 4816.90 GiB
Total write data : 48235.25 GB / 44922.58 GiB
Write amplifier : 1.1618
Estimated life left: 99% left
Finally, a quick look at typical SSD use cases.
1. As a filesystem second-level cache, e.g. ZFS's L2ARC (the bigger the better, though it only holds clean data) and SLOG (under 10GB is usually enough; mirroring is recommended).
2. As a similar second-level cache via Facebook's flashcache.
3. As the database statistics temp directory (stats_temp_directory); see the sketch after this list.
4. As a tablespace for a database's hot data.
5. As the operating system's swap partition.
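For item 3, a minimal sketch, assuming the SSD is mounted at /mnt and PostgreSQL runs as the postgres OS user:
# mkdir /mnt/pg_stats_tmp
# chown postgres:postgres /mnt/pg_stats_tmp
Then set stats_temp_directory = '/mnt/pg_stats_tmp' in postgresql.conf and reload the server; the statistics collector's temporary files will live on the SSD from then on.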
Other
Usage notes
1. Align partitions to the physical access unit.
[Reference]
2. $SRC/contrib/pg_test_fsync/pg_test_fsync.c
#define XLOG_BLCKSZ_K (XLOG_BLCKSZ / 1024)