Shannon (上海宝存) PCI-E SSD vs OCZ RevoDrive3 X2 PCI-E SSD on CentOS 6.5 2.6.32-431.el6.x86_64 - Alibaba Cloud Developer Community


Summary:
Today I received a 1.2TB SSD from Shanghai Shannon (宝存). I have long been using an OCZ RevoDrive3 X2, so this post compares the performance of the two cards.
Thanks to Shanghai Shannon for providing the SSD card for testing.

First, the performance figures published by Shannon.

Shannon Direct-IO SSD G2

Bandwidth and IOPS

User capacity            800GB     1200GB    1600GB    3200GB
Read bandwidth           1.4GB/s   2.0GB/s   2.4GB/s   2.0GB/s
Write bandwidth          1.2GB/s   1.8GB/s   1.8GB/s   1.9GB/s
Random read IOPS (4KB)   300,000   450,000   590,000   500,000
Random write IOPS (4KB)  310,000   460,000   480,000   480,000

Latency

User capacity               800GB   1200GB   1600GB   3200GB
Random read latency (4KB)   67us    67us     67us     70us
Random write latency (4KB)  9us     9us      9us      9us
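For intuition, a queue-depth-1 latency figure implies a per-thread IOPS ceiling of 1/latency. A quick arithmetic check against the 4KB latencies above:

```python
def qd1_iops(latency_us: float) -> float:
    """Upper bound on single-threaded IOPS at queue depth 1."""
    return 1_000_000 / latency_us

print(f"{qd1_iops(67):.0f}")  # ~14925 reads/s per thread at 67us
print(f"{qd1_iops(9):.0f}")   # ~111111 writes/s per thread at 9us
```

The headline 450,000 random-read IOPS therefore requires deep queues or many concurrent threads, not a single synchronous reader.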

Interface and form factor

User capacity  800GB/1200GB                          1600GB/3200GB
Interface      PCIe 2.0 x8, half-height half-length  PCIe 2.0 x8, full-height half-length
Power          6W~25W                                8W~25W

Environmental specifications

                           Min     Max
Operating temperature      0℃     50℃
Non-operating temperature  -40℃   70℃
Operating humidity         5%     95%
Altitude (m)                       3000
Airflow (LFM)              300

Next, the test environments for the two SSDs being compared.
Test environment 1
Host    DELL R720xd
Memory  32G
CPU     8-core Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz
SSD     Shannon 1.2TB
OS      CentOS 6.5 x64
Kernel  2.6.32-431.el6.x86_64
SSD driver information:
# modinfo shannon
filename:       /lib/modules/2.6.32-431.el6.x86_64/extra/shannon/shannon.ko
license:        GPL
alias:          pci:v00001CB0d00000275sv*sd*bc*sc*i*
alias:          pci:v00001CB0d00000265sv*sd*bc*sc*i*
alias:          pci:v000010EEd00006024sv*sd*bc*sc*i*
depends:        
vermagic:       2.6.32-431.el6.x86_64 SMP mod_unload modversions 
parm:           shannon_sector_size:int
parm:           shannon_debug_level:int
parm:           shannon_force_rw:int
parm:           shannon_major:int
parm:           shannon_auto_attach:int
PCI interface information (the dump below is of the OCZ RevoDrive3 card):
# lspci -vvvv -s 04:00.0
04:00.0 SCSI storage controller: OCZ Technology Group, Inc. RevoDrive 3 X2 PCI-Express SSD 240 GB (Marvell Controller) (rev 02)
        Subsystem: OCZ Technology Group, Inc. RevoDrive 3 X2 PCI-Express SSD 240 GB (Marvell Controller)
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 122
        Region 0: Memory at df1a0000 (64-bit, non-prefetchable) [size=128K]
        Region 2: Memory at df1c0000 (64-bit, non-prefetchable) [size=256K]
        Expansion ROM at df100000 [disabled] [size=64K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 256 bytes, MaxReadReq 256 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <512ns, L1 <64us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM L0s Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis+
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis+
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt+ UnxCmplt+ RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr+ BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [140 v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed- WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
                        Status: NegoPending- InProgress-
        Kernel driver in use: ocz10xx
        Kernel modules: ocz10xx

Test environment 2
Host    DELL R610
Memory  96G
CPU     8-core Intel(R) Xeon(R) CPU E5504 @ 2.00GHz
SSD     OCZ RevoDrive3 X2 960G
OS      CentOS 5.8 x64
Kernel  2.6.18-308.el5
SSD driver information:
# modinfo ocz10xx
filename:       /lib/modules/2.6.18-308.el5/extra/ocz10xx.ko
version:        2.3.1.1977
license:        Proprietary
description:    OCZ Linux driver
author:         OCZ Technology Group, Inc.
srcversion:     27F0A3AF2BD189FDFA8ED54
alias:          pci:v00001B85d00001084sv*sd*bc*sc*i*
alias:          pci:v00001B85d00001083sv*sd*bc*sc*i*
alias:          pci:v00001B85d00001044sv*sd*bc*sc*i*
alias:          pci:v00001B85d00001043sv*sd*bc*sc*i*
alias:          pci:v00001B85d00001042sv*sd*bc*sc*i*
alias:          pci:v00001B85d00001041sv*sd*bc*sc*i*
alias:          pci:v00001B85d00001022sv*sd*bc*sc*i*
alias:          pci:v00001B85d00001021sv*sd*bc*sc*i*
alias:          pci:v00001B85d00001080sv*sd*bc*sc*i*
depends:        scsi_mod
vermagic:       2.6.18-308.4.1.el5 SMP mod_unload gcc-4.1
parm:           ocz_msi_enable: Enable MSI Support for OCZ VCA controllers (default=0) (int)
PCI interface information (the dump below is of the Shannon card):
# lspci -vvvv -s 41:00.0
41:00.0 Mass storage controller: Device 1cb0:0275
        Subsystem: Device 1cb0:0275
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 150
        Region 0: Memory at d40fc000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [48] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: 00000000fee000d8  Data: 0000
        Capabilities: [60] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 unlimited
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported+
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 256 bytes, MaxReadReq 4096 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Latency L0 unlimited, L1 unlimited
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range B, TimeoutDis-, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 65ms to 210ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [9c] MSI-X: Enable- Count=1 Masked-
                Vector table: BAR=0 offset=00000180
                PBA: BAR=0 offset=000001e0
        Capabilities: [fc] #00 [0000]
        Capabilities: [100 v1] Device Serial Number 00-00-00-01-01-00-0a-35
        Kernel driver in use: shannon
        Kernel modules: shannon

How to install the Shannon driver:
1. Use a pre-built package
Find the driver rpm that matches your kernel version:
# uname -r
2.6.32-431.el6.x86_64
Install:
shannon-2.6.32-431.el6.x86_64.x86_64-v2.6-9.x86_64.rpm

2. Or build from source
If no pre-built package matches your kernel version, build directly from the source rpm:
shannon-v2.6-9.src.rpm
# rpm -ivh shannon-v2.6-9.src.rpm
# cd /root/rpmbuild
# ll
total 8
drwxr-xr-x 2 root root 4096 Jun 19 21:00 SOURCES
drwxr-xr-x 2 root root 4096 Jun 19 21:00 SPECS
# cd SPECS/
# ll
total 8
-rw-rw-r--. 1 spike spike 7183 May 21 17:10 shannon-driver.spec
# rpmbuild -bb shannon-driver.spec 
# cd ..
# ll
total 24
drwxr-xr-x 3 root root 4096 Jun 19 21:05 BUILD
drwxr-xr-x 2 root root 4096 Jun 19 21:05 BUILDROOT
drwxr-xr-x 3 root root 4096 Jun 19 21:05 RPMS
drwxr-xr-x 2 root root 4096 Jun 19 21:00 SOURCES
drwxr-xr-x 2 root root 4096 Jun 19 21:00 SPECS
drwxr-xr-x 2 root root 4096 Jun 19 21:05 SRPMS
# cd RPMS
# ll
total 4
drwxr-xr-x 2 root root 4096 Jun 19 21:05 x86_64
# cd x86_64/
# ll
total 392
-rw-r--r-- 1 root root 401268 Jun 19 21:05 shannon-2.6.32-431.el6.x86_64.x86_64-v2.6-9.x86_64.rpm
# rpm -ivh shannon-2.6.32-431.el6.x86_64.x86_64-v2.6-9.x86_64.rpm
Once installation is done, the rpmbuild directory can be deleted.

Shannon's command-line management tools on Linux
They can monitor the SSD's status, erase its data, adjust the reserved capacity, and so on.
(Because the Shannon device is not in the smartmontools drive database, smartctl cannot report its status.)
Even with the latest smartctl I still could not read the Shannon card's information:
# /opt/smartmontools-6.2/sbin/smartctl -A /dev/dfa
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-2.6.32-431.el6.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

/dev/dfa: Unable to detect device type
Please specify device type with the -d option.

Use smartctl -h to get a usage summary

Let's see which tools the Shannon driver package ships:
# rpm -qa|grep shan
shannon-2.6.32-431.el6.x86_64.x86_64-v2.6-9.x86_64
# rpm -ql shannon-2.6.32-431.el6.x86_64.x86_64-v2.6-9.x86_64
/lib/modules/2.6.32-431.el6.x86_64/extra/shannon/Module.symvers
/lib/modules/2.6.32-431.el6.x86_64/extra/shannon/shannon.ko
/lib/udev/rules.d/60-persistent-storage-shannon.rules
/usr/bin/shannon-attach
/usr/bin/shannon-beacon
/usr/bin/shannon-bugreport
/usr/bin/shannon-detach
/usr/bin/shannon-format
/usr/bin/shannon-status

The management tools, one by one.
1. Attach the SSD card to the operating system. Normally this is not needed, unless you have detached the card.
# /usr/bin/shannon-attach -h

Usage: shannon-attach [OPTIONS] [device]
Attaches Direct-IO PCIe SSD card and makes it available to the operating system.

OPTIONS:
   -h, --help
        Display this usage message.
   -n, --readonly
        Attach the Direct-IO drive and set access mode to readonly.
   -r, --reduced-write
        Attach the Direct-IO drive and set access mode to reduced-write.
        If no -n or -r option is given, set access mode to normal readwrite.
   [device]
        Device node for the Direct-IO PCIe SSD card (/dev/sctx).

2. LED beacon, typically used to physically identify one card among several.
# /usr/bin/shannon-beacon -h

Usage: shannon-beacon [OPTIONS] [device]
Lights the Direct-IO PCIe SSD card's yellow LED to locate the device.
This utility always turns the LED on, unless you specifically use the -o option.

OPTIONS:
   -h, --help
        Display this usage message.
   -l, --on
        Light on the yellow LED.
   -o, --off
        Turn off the yellow LED.
   [device]
        Device node for the Direct-IO PCIe SSD card (/dev/sctx).

3. Collect a bug report. For example:
# /usr/bin/shannon-bugreport -h
usage: shannon-bugreport

# /usr/bin/shannon-bugreport
$hostname$
Linux-2.6.32-431.el6.x86_64
Copying files into system...   done.
Copying files into system...   done.
Copying files into system...   done.
Copying files into proc...   done.
Copying files into proc/self...   done.
Copying files into log...   done.
Copying files into log...   done.
Copying files into log...   done.
Copying files into log...   done.
Dumping dmesg ...   done.
Copying files into sys...   done.
Copying files into sys...   done.
Copying files into sys...   done.
Copying files into sys...   done.
Dumping lspci -vvvvv...   done.
Dumping uname -a...   done.
Dumping hostname ...   done.
Dumping ps aux...   done.
Dumping ps aux --sort start_time...   done.
Dumping pstree ...   done.
Dumping lsof ...   done.
Dumping w ...   done.
Dumping lsmod ...   done.
Dumping dmidecode ...   done.
Dumping sar -A...   done.
Dumping sar -r...   done.
Dumping sar ...   done.
Dumping iostat -dmx 1 5...   done.
Dumping vmstat 1 5...   done.
Dumping top -bc -d1 -n5...   done.
Copying files into disk...   done.
Copying files into disk...   done.
Dumping df -h...   done.
Dumping pvs ...   done.
Dumping vgs ...   done.
Dumping lvs ...   done.
Dumping dmsetup table...   done.
Dumping dmsetup status...   done.
Gathering information using shannon-status...   done.
Dumping numactl --hardware...   done.
Dumping numactl --show...   done.
Dumping numastat ...   done.
Copying files into debug...   done.

Building tarball...
The report is stored as a tarball under /tmp, convenient for sending diagnostic data to the vendor.
 Tarball: /tmp/shannon-bugreport-20140619-211326.tar.gz
 Plz send it to our customer service, including steps to reproduce the problem.
 All the information would help us address the issue tremendously.

4. Remove an SSD device from the operating system, i.e. detach it.
# /usr/bin/shannon-detach -h

Usage: shannon-detach [OPTIONS] [device]
Detaches and removes the corresponding /dev/dfx Direct-IO block device.

OPTIONS:
   -h, --help
        Display this usage message.
   [device]
        Device node for the Direct-IO PCIe SSD card (/dev/sctx).

5. Format the SSD device: query the current physical capacity, erase it, set the minimum logical access unit, set the user-visible capacity, and so on.
Note that an SSD's physical capacity is generally larger than its usable capacity; the reserved capacity is used for error correction and bad-block replacement. Flash cells tolerate only a limited number of program/erase cycles; exceeding the limit physically damages a cell, turning it into a bad block, and bad blocks need spare capacity to be remapped and repaired.
Shannon reserves about 27% by default; the reserved value can be seen with the status command.
With many bad blocks the SSD can still be used, but once the reserved capacity is nearly exhausted, the drive will eventually fail as bad blocks keep accumulating.
So when bad blocks pile up, you can shrink the user-visible capacity to keep 27% or more spare space and extend the drive's service life, until so many blocks have gone bad that nothing more can be written.
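As a sanity check, the overprovisioning ratio can be derived from the capacities that shannon-status reports later in this article (physical 1632.37 GB vs. user 1200.00 GB). A minimal sketch:

```python
def overprovision_pct(physical_gb: float, user_gb: float) -> float:
    """Fraction of physical flash reserved for ECC and bad-block replacement."""
    return (physical_gb - user_gb) / physical_gb * 100

# capacities reported by shannon-status for the 1.2TB card
pct = overprovision_pct(1632.37, 1200.00)
print(f"{pct:.1f}% reserved")  # prints "26.5% reserved", matching the advertised ~27%
```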
# /usr/bin/shannon-format -h

Usage: shannon-format [OPTIONS] [device]
Direct-IO PCIe SSD card is pre-formated before shipped to the customer.
This tool can perform a re-format as needed.
WARNING:
   Re-format will erase all your data on the drive.
   Please use shannon-detach to detach block device before using this tool,
OPTIONS:
   -h, --help
        Display this usage message.
   -p, --probe
        Probe current physical capacity and advertised user capacity.
   -e, --erase
        Erase all data and re-format the drive without changing setting.
   -y, --always-yes
        Auto-answer "yes" to all queries from this tool (i.e. bypass prompts).
   -i DELAY, --interrupt=DELAY
        Set interrupt delay (unit: us).
   -b SIZE, --logical-block=SIZE
        Set logical block size, i.e the minimum access unit from host.
   -s CAPACITY, --capacity=CAPACITY
        Set user capacity as a specific size(in TB, GB, or MB)
        or as a percentage(such as 70%) of the advertised capacity.
   -o CAPACITY, --overformat=CAPACITY
        Over-format user capacity (to greater than the advertised capacity)
        as a specific size(in TB, GB, or MB)
        or as a percentage(e.g. 70%) of physical capacity.
   -a, --advertised     Set user capacity to advertised capacity directly.
   Warning: -s, -o , -a options are mutually exclusive!
   [device]
        Device node for the Direct-IO PCIe SSD card (/dev/sctx).

6. As noted above, the Shannon SSD is not in the smartmontools database; fortunately the vendor provides this tool to inspect the SSD's status, including remaining life, bad-block ratio, reserved space, temperature sensor readings, and so on.
# /usr/bin/shannon-status -h

Usage: shannon-status [OPTIONS] [device]
Shows information about Direct-IO PCIe card(s) installed on the system.
OPTIONS:
   -h, --help
        Display this usage message.
   -r SECS, --refresh=SECS
        Set refresh interval of monitoring (unit: second, default: 2 seconds).
   -a, --all
        Find all Direct-IO drive(s), and provide basic information of them.
   -m, --monitor
        If given, this tool will open a monitoring window,
        which dynamically shows detailed information of the specified drive.
   -p, --print
       Generate key=value format output for easier parsing.
   [device]
        Device node for the Direct-IO PCIe SSD card (/dev/sctx).
For example:
# /usr/bin/shannon-status -a
Found Shannon PCIE SSD card /dev/scta:

Direct-IO drive scta at PCI Address:41:00:0  (PCI address; use lspci to see the matching interface and driver details)
Model:sh-shannon-pcie-ssd, SN: 006819246149b014  (serial number)
Device state: attached as disk /dev/dfa, Access mode: readwrite
Firmware version: 0c321351, Driver version: 2.6.9  (firmware and driver versions)
Vendor:1cb0, Device:0275, Sub vendor:1cb0, Sub device:0275
Flash manufacturer: 98, Flash id: 3a
Channels: 7, Lunsets in channel: 8, Luns in lunset: 2, Available luns: 112
Eblocks in lun: 2116, Pages in eblock: 256, Nand page size: 32768  (flash geometry)
Logical sector: 512, Physical sector: 4096  (logical and physical sector sizes in bytes; align partitions to the physical sector size, i.e. 4K alignment)
User capacity: 1200.00 GB/1117.59 GiB  (user-visible capacity)
Physical capacity: 1632.37 GB/1520.26 GiB  (physical capacity; roughly 400GB is reserved for repairing bad blocks)
Overprovision: 27%, warn at 10%
Error correction: 35 bits per 880 bytes codeword  (error-correction parameters)
Controller internal temperature: 71 degC, max 77 degC  (temperature sensors)
Controller board temperature: 53 degC, max 59 degC
NAND Flash temperature: 53 degC, max 63 degC
Internal voltage: 1001 mV, max 1028 mV  (voltages)
Auxiliary voltage: 1795 mV, max 1804 mV
Media status: 1.0760% bad block  (media status: already about 1% bad blocks)
Power on hours: 9 hours 15 minutes, Power cycles: 3  (power-on time and power-cycle count)
Lifetime data volumes:  (lifetime counters)
  Host write data    : 13774.36 GB / 12828.37 GiB
  Host read data     : 5172.11 GB / 4816.90 GiB
  Total write data   : 19974.61 GB / 18602.80 GiB
  Write amplifier    : 1.4501
  Estimated life left: 99% left  (estimated remaining life; a useful indicator of when to replace the drive)

Totally found 1 Direct-IO PCIe SSD card on this system.
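Since the physical sector is 4096 bytes, partitions on this card should start on a 4K boundary. A minimal sketch of the check (start sectors are counted in 512-byte units, as fdisk reports them; the sample values below are illustrative, not taken from these test machines):

```python
def is_4k_aligned(start_sector: int, logical_sector: int = 512) -> bool:
    """A partition is 4K-aligned when its byte offset is a multiple of 4096."""
    return (start_sector * logical_sector) % 4096 == 0

print(is_4k_aligned(2048))  # True  -- 1 MiB offset, the modern fdisk default
print(is_4k_aligned(63))    # False -- legacy DOS start sector, misaligned
```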
Key=value output format; the information is essentially the same as above.
# /usr/bin/shannon-status -p /dev/scta 
drive=/dev/scta
pci_address=41:00:0
model=sh-shannon-pcie-ssd
serial_number=006819246149b014
device_state=attached as disk /dev/dfa
access_mode=readwrite
firmware_version=0c321351
driver_version=2.6.9
vendor_id=1cb0
device_id=0275
subsystem_vendor_id=1cb0
subsystem_device_id=0275
flash_manufacturer=98
flash_id=3a
channels=7
lunsets_in_channel=8
luns_in_lunset=2
available_luns=112
eblocks_in_lun=2116
pages_in_eblock=256
nand_page_size=32768
logical_sector=512
physical_sector=4096
user_capacity=1200.00 GB
physical_capacity=1632.37 GB
overprovision=27%
error_correction=35 bits per 880 bytes codeword
controller_temp=71 degC
controller_temp_max=77 degC
board_temp=53 degC
board_temp_max=59 degC
flash_temp=53 degC
flash_temp_max=63 degC
internal_voltage=1004 mV
internal_voltage_max=1028 mV
auxiliary_voltage=1790 mV
auxiliary_voltage_max=1804 mV
bad_block_percentage=1.0760%
power_on_hours=9 hours 15 minutes
power_cycles=3
host_write_data=13801.32 GB
host_read_data=5172.11 GB
total_write_data=20001.57 GB
write_amplifier=1.4493
estimated_life_left=99%
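The -p output is easy to consume from monitoring scripts. A minimal sketch of parsing it into a dict (fed here from a captured sample of the lines above, rather than by invoking the live tool):

```python
sample = """\
drive=/dev/scta
overprovision=27%
bad_block_percentage=1.0760%
estimated_life_left=99%
"""

def parse_status(text: str) -> dict:
    """Split 'key=value' lines; values keep their units as plain strings."""
    pairs = (line.split("=", 1) for line in text.splitlines() if "=" in line)
    return {k.strip(): v.strip() for k, v in pairs}

status = parse_status(sample)
print(status["estimated_life_left"])  # prints "99%"
```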
Example of real-time monitoring:
# /usr/bin/shannon-status -m -r 1 /dev/scta 

                Direct-IO PCIe SSD Card Monitor Program
Commands: q|Q exit; g|G general info; m|M main window

We are now monitoring disk 'scta' at PCI:41:00:0
Capacity: 1200.00 GB/1117.59 GiB, Block size: 4096, Overprovision: 27%
 Power on hours           : 9 hours 20 minutes
 Power cycles             : 3
 Controller internal temp : 72 degC, max 77 degC
 Controller board temp    : 53 degC, max 59 degC
 NAND Flash temperature   : 53 degC, max 63 degC
 Internal voltage         : 990 mV, max 1028 mV 
 Auxiliary voltage        : 1781 mV, max 1804 mV
 Free block count         : 32
 Host write data          : 13987.89 GB / 13027.24 GiB
 Write Bandwith           : 733.258 MB/s / 699.289 MiB/s
 Write IOPS               : 89.509 K
 Avg write latency        : 0.013 ms
 Host read data           : 5172.11 GB / 4816.90 GiB
 Read Bandwith            : 0.000 MB/s / 0.000 MiB/s
 Read IOPS                : 0.000 K
 Avg read latency         : 0.000 ms
 Total write data         : 20188.16 GB / 18801.69 GiB
 Total write Bandwith     : 733.258 MB/s / 699.289 MiB/s
 Write amplifier          : life 1.443, transient 1.000
 Buffer write percentage  : 99%

The matching PCI interface information can be viewed with lspci -vvvv -s 41:00.0 (the output is identical to the Shannon dump shown earlier).

Now the main event. Two things will be measured: first, fsync performance, which directly affects database checkpoints and xlog flushes; second, PostgreSQL read/write performance, with the test model given later.

1. Testing fsync performance.
Pay attention to physical-block alignment when partitioning; this article uses 4K alignment. (According to a Shannon engineer I spoke with yesterday, the Shannon card does not suffer from misalignment issues.)

Also, run the test only after the drive is more than 50% full, because the OCZ has a known issue: once more than half the capacity is used, performance drops.
/dev/dfa1       788G  733G   15G  99% /mnt

Finally, note that CPU differences may prevent IOPS from reaching the device's limit; in that case run multiple test processes until iostat shows %util = 100 for the block device.

This article runs 2 processes concurrently, with an 8KB block per fsync. (To test 4K blocks, recompile PostgreSQL with --with-wal-blocksize=4; the supported block sizes are 1,2,4,8,16,32,64. Alternatively, use dd with obs to set the block size and oflag=sync,nonblock,noatime for synchronous writes.)
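The core of pg_test_fsync's "one 8kB write" case can be sketched as a write-then-fsync loop. This is a simplified stand-in, not the actual pg_test_fsync implementation, and it writes to a temp file rather than a file on the SSD under test:

```python
import os
import tempfile
import time

def fsync_ops_per_sec(path: str, block: int = 8192, n: int = 200) -> float:
    """Time n cycles of rewriting one 8kB block followed by fsync()."""
    buf = b"\0" * block
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(n):
            os.lseek(fd, 0, os.SEEK_SET)  # rewrite the same block, WAL-style
            os.write(fd, buf)
            os.fsync(fd)                  # force the block to stable storage
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return n / elapsed

with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name
print(f"{fsync_ops_per_sec(path):.0f} fsync ops/sec")
os.unlink(path)
```

Running several copies of such a loop in parallel is what drives the device toward its fsync ceiling, as the multi-process tests below do with pg_test_fsync itself.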

OCZ test results
Process 1
# /data_ssd0/pgsql9.3.4/bin/pg_test_fsync -f /ssd/1
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                   10448.416 ops/sec      96 usecs/op
        fdatasync                        1002.611 ops/sec     997 usecs/op
        fsync                             532.699 ops/sec    1877 usecs/op
        fsync_writethrough                            n/a
        open_sync                        6525.048 ops/sec     153 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                    3097.379 ops/sec     323 usecs/op
        fdatasync                         939.144 ops/sec    1065 usecs/op
        fsync                             481.663 ops/sec    2076 usecs/op
        fsync_writethrough                            n/a
        open_sync                        3378.838 ops/sec     296 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write        4442.898 ops/sec     225 usecs/op
         2 *  8kB open_sync writes       3129.062 ops/sec     320 usecs/op
         4 *  4kB open_sync writes       1919.259 ops/sec     521 usecs/op
         8 *  2kB open_sync writes        978.837 ops/sec    1022 usecs/op
        16 *  1kB open_sync writes        518.352 ops/sec    1929 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close               493.002 ops/sec    2028 usecs/op
        write, close, fsync               522.625 ops/sec    1913 usecs/op

Non-Sync'ed 8kB writes:
        write                           189268.154 ops/sec       5 usecs/op
Process 2
# /data_ssd0/pgsql9.3.4/bin/pg_test_fsync -f /ssd/2
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                    3278.009 ops/sec     305 usecs/op
        fdatasync                        1074.887 ops/sec     930 usecs/op
        fsync                             496.757 ops/sec    2013 usecs/op
        fsync_writethrough                            n/a
        open_sync                        5928.269 ops/sec     169 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                    3312.785 ops/sec     302 usecs/op
        fdatasync                        1006.541 ops/sec     994 usecs/op
        fsync                             524.735 ops/sec    1906 usecs/op
        fsync_writethrough                            n/a
        open_sync                        3129.400 ops/sec     320 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write        4433.001 ops/sec     226 usecs/op
         2 *  8kB open_sync writes       3117.236 ops/sec     321 usecs/op
         4 *  4kB open_sync writes       1911.074 ops/sec     523 usecs/op
         8 *  2kB open_sync writes        978.444 ops/sec    1022 usecs/op
        16 *  1kB open_sync writes        551.965 ops/sec    1812 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close               553.674 ops/sec    1806 usecs/op
        write, close, fsync               601.139 ops/sec    1664 usecs/op

Non-Sync'ed 8kB writes:
        write                           194588.313 ops/sec       5 usecs/op
Two processes already hit the OCZ's fsync ceiling.

Shannon test results
Process 1
# /opt/pgsql9.3.4/bin/pg_test_fsync -f /ssd/1
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                   36722.105 ops/sec      27 usecs/op
        fdatasync                       38222.994 ops/sec      26 usecs/op
        fsync                           33121.821 ops/sec      30 usecs/op
        fsync_writethrough                            n/a
        open_sync                       46673.776 ops/sec      21 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                   21869.056 ops/sec      46 usecs/op
        fdatasync                       21256.613 ops/sec      47 usecs/op
        fsync                           17299.647 ops/sec      58 usecs/op
        fsync_writethrough                            n/a
        open_sync                       23230.884 ops/sec      43 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write       22936.731 ops/sec      44 usecs/op
         2 *  8kB open_sync writes      22550.950 ops/sec      44 usecs/op
         4 *  4kB open_sync writes      13588.508 ops/sec      74 usecs/op
         8 *  2kB open_sync writes        432.588 ops/sec    2312 usecs/op
        16 *  1kB open_sync writes        269.104 ops/sec    3716 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close             15686.374 ops/sec      64 usecs/op
        write, close, fsync             15723.477 ops/sec      64 usecs/op

Non-Sync'ed 8kB writes:
        write                           187060.951 ops/sec       5 usecs/op
Process 2
# /opt/pgsql9.3.4/bin/pg_test_fsync -f /ssd/2
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                   37496.895 ops/sec      27 usecs/op
        fdatasync                       38063.802 ops/sec      26 usecs/op
        fsync                           33193.801 ops/sec      30 usecs/op
        fsync_writethrough                            n/a
        open_sync                       46719.307 ops/sec      21 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                   21853.038 ops/sec      46 usecs/op
        fdatasync                       21022.927 ops/sec      48 usecs/op
        fsync                           17531.537 ops/sec      57 usecs/op
        fsync_writethrough                            n/a
        open_sync                       23044.216 ops/sec      43 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write       22933.531 ops/sec      44 usecs/op
         2 *  8kB open_sync writes      22986.931 ops/sec      44 usecs/op
         4 *  4kB open_sync writes      13640.880 ops/sec      73 usecs/op
         8 *  2kB open_sync writes        462.776 ops/sec    2161 usecs/op
        16 *  1kB open_sync writes        260.950 ops/sec    3832 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close             15492.679 ops/sec      65 usecs/op
        write, close, fsync             15737.468 ops/sec      64 usecs/op

Non-Sync'ed 8kB writes:
        write                           190884.093 ops/sec       5 usecs/op
Two concurrent fsync processes clearly put no real pressure on the Shannon card, so I ran three processes. The results follow.
Process 1
# /opt/pgsql9.3.4/bin/pg_test_fsync -f /ssd/1
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                   29959.548 ops/sec      33 usecs/op
        fdatasync                       26916.071 ops/sec      37 usecs/op
        fsync                           19083.832 ops/sec      52 usecs/op
        fsync_writethrough                            n/a
        open_sync                       27749.411 ops/sec      36 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                   14162.292 ops/sec      71 usecs/op
        fdatasync                       15988.317 ops/sec      63 usecs/op
        fsync                           12431.108 ops/sec      80 usecs/op
        fsync_writethrough                            n/a
        open_sync                       13515.127 ops/sec      74 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write       18787.955 ops/sec      53 usecs/op
         2 *  8kB open_sync writes      13915.061 ops/sec      72 usecs/op
         4 *  4kB open_sync writes       8684.830 ops/sec     115 usecs/op
         8 *  2kB open_sync writes        347.167 ops/sec    2880 usecs/op
        16 *  1kB open_sync writes        269.654 ops/sec    3708 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close             13616.426 ops/sec      73 usecs/op
        write, close, fsync             13115.266 ops/sec      76 usecs/op

Non-Sync'ed 8kB writes:
        write                           161001.305 ops/sec       6 usecs/op
Process 2
# /opt/pgsql9.3.4/bin/pg_test_fsync -f /ssd/2
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                   29773.938 ops/sec      34 usecs/op
        fdatasync                       26694.424 ops/sec      37 usecs/op
        fsync                           19525.263 ops/sec      51 usecs/op
        fsync_writethrough                            n/a
        open_sync                       27408.323 ops/sec      36 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                   14181.855 ops/sec      71 usecs/op
        fdatasync                       16052.939 ops/sec      62 usecs/op
        fsync                           12490.750 ops/sec      80 usecs/op
        fsync_writethrough                            n/a
        open_sync                       13723.671 ops/sec      73 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write       18485.904 ops/sec      54 usecs/op
         2 *  8kB open_sync writes      13690.145 ops/sec      73 usecs/op
         4 *  4kB open_sync writes       8538.992 ops/sec     117 usecs/op
         8 *  2kB open_sync writes        336.883 ops/sec    2968 usecs/op
        16 *  1kB open_sync writes        314.388 ops/sec    3181 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close             13707.362 ops/sec      73 usecs/op
        write, close, fsync             12839.225 ops/sec      78 usecs/op

Non-Sync'ed 8kB writes:
        write                           153528.755 ops/sec       7 usecs/op
Process 3
# /opt/pgsql9.3.4/bin/pg_test_fsync -f /ssd/3
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                   29921.701 ops/sec      33 usecs/op
        fdatasync                       26656.149 ops/sec      38 usecs/op
        fsync                           19701.098 ops/sec      51 usecs/op
        fsync_writethrough                            n/a
        open_sync                       27229.080 ops/sec      37 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                   14177.152 ops/sec      71 usecs/op
        fdatasync                       15772.293 ops/sec      63 usecs/op
        fsync                           12589.742 ops/sec      79 usecs/op
        fsync_writethrough                            n/a
        open_sync                       13787.782 ops/sec      73 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write       18697.155 ops/sec      53 usecs/op
         2 *  8kB open_sync writes      13377.013 ops/sec      75 usecs/op
         4 *  4kB open_sync writes       8537.356 ops/sec     117 usecs/op
         8 *  2kB open_sync writes        349.729 ops/sec    2859 usecs/op
        16 *  1kB open_sync writes        311.943 ops/sec    3206 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
        write, fsync, close             13971.711 ops/sec      72 usecs/op
        write, close, fsync             12785.385 ops/sec      78 usecs/op

Non-Sync'ed 8kB writes:
        write                           147119.435 ops/sec       7 usecs/op

Aggregating the "one 8kB write" results across all processes, the 1.2TB Shannon far outstrips the OCZ RevoDrive3 X2 960G:
OCZ
        open_datasync                   13726.425 ops/sec
        fdatasync                        2077.498 ops/sec
        fsync                            1029.456 ops/sec
        fsync_writethrough                            n/a
        open_sync                       12453.317 ops/sec
Shannon
        open_datasync                   89655.187 ops/sec
        fdatasync                       80266.644 ops/sec
        fsync                           58310.193 ops/sec
        fsync_writethrough                            n/a
        open_sync                       82386.814 ops/sec
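For context, pg_test_fsync's "one 8kB write" number is essentially a tight write-then-sync loop. The actual tool is C code in the PostgreSQL source tree; the sketch below is my own minimal Python approximation of the same measurement, not the real implementation:

```python
import os
import time

def fsync_ops_per_sec(path, seconds=1.0, blocksize=8192):
    """Repeatedly overwrite one 8kB block and fsync it, returning ops/sec.

    This mimics pg_test_fsync's fsync case: each loop iteration is one
    durable 8kB write, so the rate is bounded by the device's sync latency.
    """
    buf = b"\x00" * blocksize
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        ops = 0
        start = time.time()
        while time.time() - start < seconds:
            os.lseek(fd, 0, os.SEEK_SET)   # always rewrite the same block
            os.write(fd, buf)
            os.fsync(fd)                   # force the write to stable storage
            ops += 1
        return ops / (time.time() - start)
    finally:
        os.close(fd)
```

On a write-back-cached SSD such as the Shannon, each fsync is absorbed by the card's power-protected cache, which is why it sustains tens of thousands of ops/sec where a plain disk would manage a few hundred.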


2. Database TPS test (of limited value for judging the SSD itself, since the bottleneck may sit on the CPU)
Database version: PostgreSQL 9.3.4
Database configuration: the key knob is wal_sync_method. Based on the fsync results above, I picked the fastest method for each card: open_datasync for the Shannon, and open_sync for the OCZ (which does not support open_datasync).

grep "^[a-z]" postgresql.conf 
listen_addresses = '0.0.0.0'            # what IP address(es) to listen on;
port = 5432                             # (change requires restart)
max_connections = 100                   # (change requires restart)
unix_socket_directories = '.'   # comma-separated list of directories
shared_buffers = 2048MB                 # min 128kB
maintenance_work_mem = 512MB            # min 1MB
vacuum_cost_delay = 10                  # 0-100 milliseconds
vacuum_cost_limit = 10000               # 1-10000 credits
bgwriter_delay = 10ms                   # 10-10000ms between rounds
wal_level = hot_standby                 # minimal, archive, or hot_standby
synchronous_commit = on         # synchronization level;
wal_sync_method = open_datasync         # the default is the first option
wal_buffers = 16384kB                   # min 32kB, -1 sets based on shared_buffers
checkpoint_segments = 128               # in logfile segments, min 1, 16MB each
effective_cache_size = 24000MB
log_destination = 'csvlog'              # Valid values are combinations of
logging_collector = on          # Enable capturing of stderr and csvlog
log_truncate_on_rotation = on           # If on, an existing log file with the
log_timezone = 'PRC'
autovacuum = on                 # Enable autovacuum subprocess?  'on'
log_autovacuum_min_duration = 0 # -1 disables, 0 logs all actions and
datestyle = 'iso, mdy'
timezone = 'PRC'
lc_messages = 'C'                       # locale for system error message
lc_monetary = 'C'                       # locale for monetary formatting
lc_numeric = 'C'                        # locale for number formatting
lc_time = 'C'                           # locale for time formatting
default_text_search_config = 'pg_catalog.english'

Test model
postgres=# create table test (id int primary key, info text, crt_time timestamp);
CREATE TABLE
postgres=# create or replace function f(v_id int) returns void as 
$$
declare
begin
  update test set info=md5(now()::text),crt_time=now() where id=v_id;
  if not found then
    insert into test values (v_id, md5(now()::text), now());         
  end if;
  return;
  exception when others then
    return;
end;
$$ language plpgsql strict;
CREATE FUNCTION

$ vi test.sql
\setrandom vid 1 10000000
select f(:vid);

Test method
pgbench -M prepared -n -r -f ./test.sql -c 12 -j 4 -T 300

Results comparison
OCZ
pgbench -M prepared -n -r -f ./test.sql -c 12 -j 4 -T 300 -h /data_ssd1/test -p 5432 -U postgres postgres
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 12
number of threads: 4
duration: 300 s
number of transactions actually processed: 6215372
tps = 20714.778666 (including connections establishing)
tps = 20715.587367 (excluding connections establishing)
statement latencies in milliseconds:
        0.001759        \setrandom vid 1 10000000
        0.575683        select f(:vid);
There is still a gap versus asynchronous commit:
pgbench -M prepared -n -r -f ./test.sql -c 12 -j 4 -T 30 -h /data_ssd1/test -p 5432 -U postgres postgres
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 12
number of threads: 4
duration: 30 s
number of transactions actually processed: 1284258
tps = 42797.519722 (including connections establishing)
tps = 42813.485019 (excluding connections establishing)
statement latencies in milliseconds:
        0.001762        \setrandom vid 1 10000000
        0.276727        select f(:vid);

Shannon
$ /opt/pgsql9.3.4/bin/pgbench -M prepared -n -r -f ./test.sql -c 12 -j 4 -T 300 -h /mnt/pg_root -p 5432 -U postgres postgres
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 12
number of threads: 4
duration: 300 s
number of transactions actually processed: 10865222
tps = 36214.699980 (including connections establishing)
tps = 36216.135752 (excluding connections establishing)
statement latencies in milliseconds:
        0.002184        \setrandom vid 1 10000000
        0.327720        select f(:vid);
There is still a sizable gap versus asynchronous commit (synchronous_commit = off):
$ /opt/pgsql9.3.4/bin/pgbench -M prepared -n -r -f ./test.sql -c 12 -j 4 -T 30 -h /mnt/pg_root -p 5432 -U postgres postgres
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 12
number of threads: 4
duration: 30 s
number of transactions actually processed: 1740723
tps = 58010.769125 (including connections establishing)
tps = 58032.766226 (excluding connections establishing)
statement latencies in milliseconds:
        0.001867        \setrandom vid 1 10000000
        0.203650        select f(:vid);
Judging from the asynchronous-commit numbers, the OCZ host's weaker CPU clearly caps its throughput below that of the Shannon host.
Note also that the database TPS gap between the OCZ and the Shannon is much smaller than the raw fsync gap, because the bottleneck has shifted from I/O to the CPU.

Test data summary:
(Note that the two test machines' CPUs run at 2.0GHz and 2.5GHz respectively, so the OCZ numbers could be scaled by a factor of about 1.25. When I have time I will move the OCZ card into the same server, on a PCI-E slot of the same speed, and retest; the current numbers are for reference only.)
(Also note that the database tests are not pure I/O tests; much of the cost falls on the CPU, so treat them as reference only.)
(The fsync test syncs a single 8kB block.)
[Image: summary table of test results]

As for the cell endurance of this Shannon SSD: I have only had it for a day, so it has not seen much wear yet. Below is its current health status.
At the current usage intensity (essentially abuse), and going by the vendor's endurance figure (capacity * 10000), the card would be worn out in roughly 200 days.
Direct-IO drive scta at PCI Address:41:00:0:
Model:sh-shannon-pcie-ssd, SN: 006819246149b014
Device state: attached as disk /dev/dfa, Access mode: readwrite
Firmware version: 0c321351, Driver version: 2.6.9
Vendor:1cb0, Device:0275, Sub vendor:1cb0, Sub device:0275
Flash manufacturer: 98, Flash id: 3a
Channels: 7, Lunsets in channel: 8, Luns in lunset: 2, Available luns: 112
Eblocks in lun: 2116, Pages in eblock: 256, Nand page size: 32768
Logical sector: 512, Physical sector: 4096
User capacity: 1200.00 GB/1117.59 GiB
Physical capacity: 1632.37 GB/1520.26 GiB
Overprovision: 27%, warn at 10%
Error correction: 35 bits per 880 bytes codeword
Controller internal temperature: 70 degC, max 77 degC
Controller board temperature: 52 degC, max 59 degC
NAND Flash temperature: 52 degC, max 63 degC
Internal voltage: 999 mV, max 1028 mV
Auxiliary voltage: 1787 mV, max 1807 mV
Media status: 1.0760% bad block
Power on hours: 21 hours 9 minutes, Power cycles: 3
Lifetime data volumes:
  Host write data    : 41516.25 GB / 38665.02 GiB
  Host read data     : 5172.11 GB / 4816.90 GiB
  Total write data   : 48235.25 GB / 44922.58 GiB
  Write amplifier    : 1.1618
  Estimated life left: 99% left
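The 200-day figure can be sanity-checked from the counters in this status dump, assuming the vendor's capacity * 10000 rule of thumb for total write endurance:

```python
# Sanity check of the ~200-day lifetime estimate, using the vendor's
# rule of thumb (endurance = user capacity * 10000) and the drive's
# own lifetime counters after 21h9m powered on.
capacity_gb = 1200
endurance_gb = capacity_gb * 10000           # 12,000,000 GB of total writes
total_written_gb = 48235.25                  # "Total write data" above
powered_hours = 21 + 9 / 60                  # "Power on hours" above
writes_per_day_gb = total_written_gb / powered_hours * 24
days_to_wearout = endurance_gb / writes_per_day_gb
print(round(days_to_wearout))                # on the order of 200 days
```

At roughly 55TB written per day, this works out to a bit over 200 days, consistent with the estimate above.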

Finally, a brief look at typical SSD application scenarios.
1. As a filesystem second-level cache, e.g. ZFS L2ARC (the bigger the better, though it holds only non-dirty data) and SLOG (10GB or so is usually enough; a mirror is recommended).
2. Similar second-level caching, e.g. Facebook's flashcache.
3. As the database statistics directory (stats_temp_directory).
4. As a tablespace for hot database tables.
5. As an OS swap partition.
Others

Usage notes
1. Align partitions to the drive's physical access unit.
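As a sketch of what "aligned" means here, using this drive's 512-byte logical / 4096-byte physical sectors from the status output above (the helper name is mine, for illustration only):

```python
def partition_aligned(start_sector, logical_bytes=512, physical_bytes=4096):
    """True when the partition's byte offset falls on a physical-sector
    boundary, so a 4kB I/O never straddles two physical sectors."""
    return (start_sector * logical_bytes) % physical_bytes == 0

# A modern fdisk/parted default start of sector 2048 (1 MiB) is aligned;
# the legacy DOS start of sector 63 is not.
print(partition_aligned(2048), partition_aligned(63))
```

Misaligned partitions turn each 4kB write into a read-modify-write across two physical sectors, which hurts both throughput and flash wear.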

[References]
2. $SRC/contrib/pg_test_fsync/pg_test_fsync.c
#define XLOG_BLCKSZ_K   (XLOG_BLCKSZ / 1024)
