I recently purchased three S1200SPOR servers, and all three have the same issue: at random intervals, the fans spin up to maximum for a second or two before settling back down. The intervals range from a few minutes to an hour, and they do not correlate with server load.
I have upgraded the firmware to S1200SP.86B.03.01.0026.092720170729, checked for thermal trips, and examined the SEL, but none of that helped. However, while watching the IPMI sensors, I did notice one correlation: the fan spin-ups seem to coincide with the "P1 Therm Margin" sensor returning no reading. My working theory is that the system is occasionally unable to read that temperature, so it spins up the fans "just in case"; on the next poll it reads the temperature correctly and returns the fans to normal.
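For anyone who wants to check the same correlation, here is a rough sketch of the kind of polling that can be used. It assumes `ipmitool` is installed and can reach the local BMC, and that a failed read appears as "No Reading" in the last column of `ipmitool sdr elist full`. The function names and the one-second interval are illustrative choices, not anything from Intel's tooling, and a full SDR read may itself take longer than a second on some systems.

```python
import subprocess
import time
from datetime import datetime


def parse_sdr_elist(text):
    """Parse `ipmitool sdr elist` output into {sensor name: (status, reading)}."""
    sensors = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        name, _sensor_id, status, _entity, reading = fields[:5]
        sensors[name] = (status, reading)
    return sensors


def missing_reading(sensors, name="P1 Therm Margin"):
    """Return True if the named sensor is absent or reports "No Reading"."""
    _status, reading = sensors.get(name, ("ns", "No Reading"))
    return reading == "No Reading"


def fan_speeds(sensors):
    """Return the readings of every sensor whose name starts with "System Fan"."""
    return {n: r for n, (_s, r) in sensors.items() if n.startswith("System Fan")}


def watch(interval=1.0):
    """Poll the BMC and print a timestamped line whenever the margin read fails."""
    while True:
        out = subprocess.run(
            ["ipmitool", "sdr", "elist", "full"],
            capture_output=True, text=True, check=True,
        ).stdout
        sensors = parse_sdr_elist(out)
        if missing_reading(sensors):
            stamp = datetime.now().isoformat(timespec="seconds")
            print(stamp, fan_speeds(sensors), flush=True)
        time.sleep(interval)
```

Calling `watch()` as root and leaving it running in a terminal gives a list of timestamps that can be compared against when the fans are heard spinning up.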
I checked the sensor wire and it is secure. (Fiddling with it doesn't trigger the fan response either.) I might conclude it is a hardware fault, but this is happening on all three systems, which were purchased at different times over the last few months.
I also found this thread: "R1304SPOSHBNR 1U chassis fans randomly spin up to maximum for a few seconds and then go back to normal", but the information there doesn't help: the SEL doesn't report any errors, the problem is intermittent, I don't have a RAID controller in these systems, and the standard checks didn't reveal anything promising.
If anyone has any ideas on where to go next, I would appreciate the input.
Thanks!
System info dump (all three set up the same):
OS: CentOS 7
# dmidecode
Getting SMBIOS data from sysfs.
SMBIOS 2.7 present.
64 structures occupying 3770 bytes.
Table at 0x80630000.
Handle 0x0006, DMI type 0, 24 bytes
BIOS Information
Vendor: Intel Corporation
Version: S1200SP.86B.03.01.0026.092720170729
Release Date: 09/27/2017
Address: 0xF0000
Runtime Size: 64 kB
ROM Size: 16384 kB
Characteristics:
PCI is supported
PNP is supported
BIOS is upgradeable
BIOS shadowing is allowed
Boot from CD is supported
Selectable boot is supported
EDD is supported
5.25"/1.2 MB floppy services are supported (int 13h)
3.5"/720 kB floppy services are supported (int 13h)
3.5"/2.88 MB floppy services are supported (int 13h)
Print screen service is supported (int 5h)
8042 keyboard services are supported (int 9h)
Serial services are supported (int 14h)
Printer services are supported (int 17h)
CGA/mono video services are supported (int 10h)
ACPI is supported
USB legacy is supported
LS-120 boot is supported
ATAPI Zip drive boot is supported
BIOS boot specification is supported
Function key-initiated network boot is supported
Targeted content distribution is supported
UEFI is supported
BIOS Revision: 0.0
Firmware Revision: 0.0
Handle 0x0007, DMI type 1, 27 bytes
System Information
Manufacturer: Intel Corporation
Product Name: S1200SP
Version: R1304SPOSHBNR
Serial Number: QSCD74200212
UUID: 71C4BAF6-8AB0-E711-AB21-A4BF0128AAE2
Wake-up Type: Power Switch
SKU Number: SKU Number
Family: Family
Handle 0x0008, DMI type 2, 17 bytes
Base Board Information
Manufacturer: Intel Corporation
Product Name: S1200SP
Version: H57534-260
Serial Number: QSSA74100091
Asset Tag: Base Board Asset Tag
Features:
Board is a hosting board
Board is replaceable
Location In Chassis: Part Component
Chassis Handle: 0x0000
Type: Motherboard
Contained Object Handles: 0
Handle 0x001D, DMI type 4, 48 bytes
Processor Information
Socket Designation: CPU 1
Type: Central Processor
Family: Xeon
Manufacturer: Intel(R) Corporation
ID: E9 06 09 00 FF FB EB BF
Signature: Type 0, Family 6, Model 158, Stepping 9
Flags:
FPU (Floating-point unit on-chip)
VME (Virtual mode extension)
DE (Debugging extension)
PSE (Page size extension)
TSC (Time stamp counter)
MSR (Model specific registers)
PAE (Physical address extension)
MCE (Machine check exception)
CX8 (CMPXCHG8 instruction supported)
APIC (On-chip APIC hardware supported)
SEP (Fast system call)
MTRR (Memory type range registers)
PGE (Page global enable)
MCA (Machine check architecture)
CMOV (Conditional move instruction supported)
PAT (Page attribute table)
PSE-36 (36-bit page size extension)
CLFSH (CLFLUSH instruction supported)
DS (Debug store)
ACPI (ACPI supported)
MMX (MMX technology supported)
FXSR (FXSAVE and FXSTOR instructions supported)
SSE (Streaming SIMD extensions)
SSE2 (Streaming SIMD extensions 2)
SS (Self-snoop)
HTT (Multi-threading)
TM (Thermal monitor supported)
PBE (Pending break enabled)
Version: Intel(R) Xeon(R) CPU E3-1220 v6 @ 3.00GHz
Voltage: 0.9 V
External Clock: 100 MHz
Max Speed: 4200 MHz
Current Speed: 3000 MHz
Status: Populated, Enabled
Upgrade: Other
L1 Cache Handle: 0x001A
L2 Cache Handle: 0x001B
L3 Cache Handle: 0x001C
Serial Number: To Be Filled By O.E.M.
Asset Tag: To Be Filled By O.E.M.
Part Number: To Be Filled By O.E.M.
Core Count: 4
Core Enabled: 4
Thread Count: 4
Characteristics:
64-bit capable
Multi-Core
Execute Protection
Enhanced Virtualization
Power/Performance Control
Handle 0x000E, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x001E
Error Information Handle: Not Provided
Total Width: Unknown
Data Width: Unknown
Size: No Module Installed
Form Factor: Unknown
Set: None
Locator: DIMM_A1
Bank Locator: NODE 0 CHANNEL 0 DIMM 1
Type: Unknown
Type Detail: None
Speed: Unknown
Manufacturer: Empty/NO DIMM
Serial Number: Not Specified
Asset Tag: Not Specified
Part Number: Not Specified
Rank: Unknown
Configured Clock Speed: Unknown
Minimum Voltage: Unknown
Maximum Voltage: Unknown
Configured Voltage: Unknown
Handle 0x0010, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x001E
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 8192 MB
Form Factor: DIMM
Set: None
Locator: DIMM_A2
Bank Locator: NODE 0 CHANNEL 0 DIMM 2
Type: DDR4
Type Detail: Synchronous
Speed: 2133 MHz
Manufacturer: Kingston
Serial Number: BA3287C7
Asset Tag: DIMM_A2_AssetTag
Part Number: 9965669-023.A00G
Rank: 2
Configured Clock Speed: 2133 MHz
Minimum Voltage: Unknown
Maximum Voltage: Unknown
Configured Voltage: 1.2 V
Handle 0x001F, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x001E
Error Information Handle: Not Provided
Total Width: Unknown
Data Width: Unknown
Size: No Module Installed
Form Factor: Unknown
Set: None
Locator: DIMM_B1
Bank Locator: NODE 0 CHANNEL 1 DIMM 1
Type: Unknown
Type Detail: None
Speed: Unknown
Manufacturer: Empty/NO DIMM
Serial Number: Not Specified
Asset Tag: Not Specified
Part Number: Not Specified
Rank: Unknown
Configured Clock Speed: Unknown
Minimum Voltage: Unknown
Maximum Voltage: Unknown
Configured Voltage: Unknown
Handle 0x0020, DMI type 17, 40 bytes
Memory Device
Array Handle: 0x001E
Error Information Handle: Not Provided
Total Width: Unknown
Data Width: Unknown
Size: No Module Installed
Form Factor: Unknown
Set: None
Locator: DIMM_B2
Bank Locator: NODE 0 CHANNEL 1 DIMM 2
Type: Unknown
Type Detail: None
Speed: Unknown
Manufacturer: Empty/NO DIMM
Serial Number: Not Specified
Asset Tag: Not Specified
Part Number: Not Specified
Rank: Unknown
Configured Clock Speed: Unknown
Minimum Voltage: Unknown
Maximum Voltage: Unknown
Configured Voltage: Unknown
# sensors
acpitz-virtual-0
Adapter: Virtual device
temp1: +27.8°C (crit = +119.0°C)
temp2: +29.8°C (crit = +119.0°C)
coretemp-isa-0000
Adapter: ISA adapter
Physical id 0: +44.0°C (high = +80.0°C, crit = +100.0°C)
Core 0: +39.0°C (high = +80.0°C, crit = +100.0°C)
Core 1: +40.0°C (high = +80.0°C, crit = +100.0°C)
Core 2: +42.0°C (high = +80.0°C, crit = +100.0°C)
Core 3: +44.0°C (high = +80.0°C, crit = +100.0°C)
power_meter-acpi-0
Adapter: ACPI interface
power1: 36.00 W (interval = 4294967.29 s)
# ipmitool sdr elist full
System Airflow | 11h | ok | 23.1 | 8 CFM
BB P1 VR Temp | 20h | ok | 7.1 | -74 degrees C
P1 VR Ctrl Temp | BAh | ok | 7.1 | 31 degrees C
Front Panel Temp | 21h | ok | 12.1 | 23 degrees C
SSB Temp | 22h | ok | 7.1 | 32 degrees C
BB Inlet Temp | 23h | ok | 7.1 | 22 degrees C
BB BMC Temp | 24h | ok | 7.1 | 30 degrees C
BB Mem VR temp | 25h | ok | 7.1 | -75 degrees C
HSBP 1 Temp | 29h | ok | 15.1 | 24 degrees C
Exit Air Temp | 2Eh | ok | 7.1 | 28 degrees C
System Fan 1 | 34h | ok | 29.1 | 5978 RPM
System Fan 2 | 31h | ok | 29.2 | 5978 RPM
System Fan 3 | 32h | ok | 29.3 | 5978 RPM
PS1 Input Power | 54h | ok | 10.3 | 35 Watts
PS1 Temperature | 5Ch | ok | 10.1 | 41 degrees C
P1 Therm Margin | 74h | ok | 3.1 | -55 degrees C <----- this will say "No Reading" rather than "-55 degrees C" when the fans spin up, then back to "-55 degrees C" when the fans spin back down
P1 Therm Ctrl % | 78h | ok | 3.1 | 0 percent
P1 DTS Therm Mgn | 83h | ok | 3.1 | -34 degrees C
DIMM Thrm Mrgn 1 | B0h | ok | 8.1 | -60 degrees C
Agg Therm Mgn 1 | C8h | ok | 7.1 | -12 degrees C
BB +12.0V | D0h | ok | 7.1 | 12.10 Volts
BB +3.3V Vbat | DEh | ok | 7.1 | 3.08 Volts
MTT CPU1 | 34h | ok | 3.1 | 0 percent
(I have had fan spin-ups since the last system boot shown in the log below.)
# ipmitool sel elist
1 | 12/12/2017 | 16:15:21 | Event Logging Disabled System Event Log | Log area reset/cleared | Asserted
2 | 12/12/2017 | 19:49:35 | Power Unit Pwr Unit Status | Power off/down | Asserted
3 | 12/13/2017 | 15:36:02 | Processor P1 Status | Presence detected | Asserted
4 | 12/13/2017 | 15:36:09 | Drive Slot / Bay HDD 0 Status | Drive Present | Asserted
5 | 12/13/2017 | 15:36:09 | Drive Slot / Bay HDD 1 Status | Drive Present | Asserted
6 | 12/13/2017 | 15:36:29 | System Event BIOS Evt Sensor | OEM System boot event | Asserted
7 | 12/13/2017 | 15:54:26 | Power Supply PS1 Status | Presence detected | Deasserted
8 | 12/13/2017 | 15:54:31 | Physical Security #0x04 | General Chassis intrusion | Asserted
9 | 12/13/2017 | 15:54:35 | Fan #0x30 | Lower Non-critical going low | Asserted
a | 12/13/2017 | 15:54:35 | Fan #0x30 | Lower Critical going low | Asserted
b | 12/13/2017 | 15:54:35 | Fan #0x33 | Lower Non-critical going low | Asserted
c | 12/13/2017 | 15:54:35 | Fan #0x33 | Lower Critical going low | Asserted
d | 12/13/2017 | 15:54:36 | Physical Security #0x04 | General Chassis intrusion | Deasserted
e | 12/13/2017 | 15:54:36 | Fan #0x30 | Lower Non-critical going low | Deasserted
f | 12/13/2017 | 15:54:36 | Fan #0x30 | Lower Critical going low | Deasserted
10 | 12/13/2017 | 15:54:36 | Fan #0x33 | Lower Non-critical going low | Deasserted
11 | 12/13/2017 | 15:54:36 | Fan #0x33 | Lower Critical going low | Deasserted
12 | 12/13/2017 | 21:53:14 | Power Unit Pwr Unit Status | Power off/down | Asserted
13 | 12/14/2017 | 15:15:22 | Processor P1 Status | Presence detected | Asserted
14 | 12/14/2017 | 15:15:28 | Drive Slot / Bay HDD 0 Status | Drive Present | Asserted
15 | 12/14/2017 | 15:15:28 | Drive Slot / Bay HDD 1 Status | Drive Present | Asserted
16 | 12/14/2017 | 15:15:47 | System Event BIOS Evt Sensor | OEM System boot event | Asserted
17 | 12/14/2017 | 15:36:49 | Power Unit Pwr Unit Status | Power off/down | Asserted
18 | 12/14/2017 | 15:36:52 | Power Unit Pwr Unit Status | Power off/down | Deasserted
19 | 12/14/2017 | 15:37:17 | System Event BIOS Evt Sensor | OEM System boot event | Asserted
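To double-check that the SEL really has nothing after the most recent boot, the output above can be filtered with something like the following. This is only a sketch: it assumes the six-column, pipe-separated layout shown above and treats "OEM System boot event" as the boot marker; the function names are my own.

```python
def parse_sel_elist(text):
    """Parse `ipmitool sel elist` output into a list of dicts, one per entry."""
    entries = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        entries.append({
            "id": fields[0],
            "date": fields[1],
            "time": fields[2],
            "sensor": fields[3],
            "event": fields[4],
            "direction": fields[5],
        })
    return entries


def events_since_last_boot(entries):
    """Return the entries logged after the last boot event (all if none found)."""
    last_boot = -1
    for i, entry in enumerate(entries):
        if entry["event"] == "OEM System boot event":
            last_boot = i
    return entries[last_boot + 1:]
```

Run against the log above, `events_since_last_boot` returns an empty list, because entry 19 (the latest boot event) is the final record. That matches what I see: the spin-ups since that boot left no trace in the SEL.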