89.2. mcelog - Decode kernel machine check log on x86 machines

$ sudo apt-get install mcelog
Decode machine check ASCII output from kernel logs
--cpu CPU           Set CPU type CPU to decode (see below for valid types)
--cpumhz MHZ        Set CPU Mhz to decode time (output unreliable, not needed on new kernels)
--raw               (with --ascii) Dump in raw ASCII format for machine processing
--daemon            Run in background waiting for events (needs newer kernel)
--ignorenodev       Exit silently when the device cannot be opened
--file filename     With --ascii read machine check log from filename instead of stdin
--syslog            Log decoded machine checks in syslog (default stdout or syslog for daemon)
--syslog-error      Log decoded machine checks in syslog with error level
--no-syslog         Never log anything to syslog
--logfile filename  Append log output to logfile instead of stdout
--dmi               Use SMBIOS information to decode DIMMs (needs root)
--no-dmi            Don't use SMBIOS information
--dmi-verbose       Dump SMBIOS information (for debugging)
--filter            Inhibit known bogus events (default on)
--no-filter         Don't inhibit known broken events
--config-file filename Read config information from config file instead of /etc/mcelog/mcelog.conf
--foreground        Keep in foreground (for debugging)
--num-errors N      Only process N errors (for testing)
--pidfile file      Write pid of daemon into file
--no-imc-log        Disable extended iMC logging
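
The options above can be combined to decode machine checks that were saved in a kernel log file. The following is a minimal sketch, not a definitive setup: the helper name decode_mce_log and the example paths are hypothetical, and it relies only on the --ascii and --file options referenced in the list above.

```shell
#!/bin/sh
# Hypothetical helper: decode machine checks saved in a kernel log file.
# Relies only on the --ascii and --file options referenced in the option list.
decode_mce_log() {
    input="$1"
    # Stop early if mcelog is not available on this machine.
    if ! command -v mcelog >/dev/null 2>&1; then
        echo "mcelog is not installed" >&2
        return 127
    fi
    # With --ascii, --file reads the log from the given file instead of stdin.
    mcelog --ascii --file "$input"
}
```

A call might look like `decode_mce_log /var/log/kern.log`, where the path is only an example. For continuous monitoring, something like `sudo mcelog --daemon --logfile /var/log/mcelog` should append decoded events to the given file (again an example path), assuming the kernel is new enough for daemon mode.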

Original source: Netkiller series handbooks
