staging - Disk errors on storage1

ZFS detected a new disk error on storage1:

root@storage1:~# zpool status -v
  pool: data
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: resilvered 830G in 11:21:40 with 0 errors on Sun Jun 13 12:33:12 2021
remove: Removal of vdev 1 copied 513G in 3h57m, completed on Mon Apr 12 16:15:40 2021
    108M memory used for removed device mappings
config:

	NAME                                            STATE     READ WRITE CKSUM
	data                                            DEGRADED     0     0     0
	  mirror-0                                      ONLINE       0     0     0
	    wwn-0x5000c500a22eef5e                      ONLINE       0     0     0
	    wwn-0x5000c500a23e85cf                      ONLINE       0     0     0
	  mirror-2                                      ONLINE       0     0     0
	    wwn-0x5000c500a23d19b6                      ONLINE       0     0     0
	    wwn-0x5000c500a22ef2c4                      ONLINE       0     0     0
	  mirror-3                                      ONLINE       0     0     0
	    wwn-0x5000c500a23e7af4                      ONLINE       0     0     0
	    wwn-0x5000c500a23d253b                      ONLINE       0     0     0
	  mirror-4                                      DEGRADED     0     0     0
	    wwn-0x5000c500a23cf9ba                      ONLINE       0     0     0
	    spare-1                                     DEGRADED     6     0     0
	      wwn-0x5000c500a23e4511                    DEGRADED    15     2    49  too many errors
	      wwn-0x5000c500c4be3956                    ONLINE       0     0 3.45K
	  mirror-5                                      ONLINE       0     0     0
	    wwn-0x5000c500d5dda886                      ONLINE       0     0     0
	    wwn-0x5000c500a22eed6f                      ONLINE       0     0     0
	cache
	  nvme-INTEL_SSDPED1K375GAQ_FUKS70860038375AGN  ONLINE       0     0     0
	spares
	  wwn-0x5000c500c4be3956                        INUSE     currently in use
	  wwn-0x5000c500d5de652a                        AVAIL   

errors: No known data errors
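
To spot the affected devices without reading the whole table, the device rows can be filtered for non-zero READ/WRITE/CKSUM counters. The following is a minimal sketch, not part of the original report: the function name `devices_with_errors` and the regular expression are illustrative assumptions, and the sample text is a whitespace-normalised excerpt of the output above. Counters are kept as strings so that abbreviated values such as `3.45K` are not reinterpreted.

```python
import re

# Excerpt of the `zpool status -v` output above (column spacing shortened).
SAMPLE = """\
NAME                          STATE     READ WRITE CKSUM
data                          DEGRADED     0     0     0
  mirror-4                    DEGRADED     0     0     0
    wwn-0x5000c500a23cf9ba    ONLINE       0     0     0
    spare-1                   DEGRADED     6     0     0
      wwn-0x5000c500a23e4511  DEGRADED    15     2    49  too many errors
      wwn-0x5000c500c4be3956  ONLINE       0     0 3.45K
spares
  wwn-0x5000c500c4be3956      INUSE     currently in use
"""

# A device row is: name, state, then three counters that start with a digit
# and may carry a unit suffix (for example "3.45K").
COUNTER = r"(\d[\d.]*[KMGT]?)"
ROW = re.compile(r"^\s*(\S+)\s+(\S+)\s+" + COUNTER + r"\s+" + COUNTER + r"\s+" + COUNTER)


def devices_with_errors(status_text):
    """Return (name, state, read, write, cksum) for rows with any non-zero counter."""
    found = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, section label, or spare-status line
        name, state, read, write, cksum = match.groups()
        if any(counter != "0" for counter in (read, write, cksum)):
            found.append((name, state, read, write, cksum))
    return found


for row in devices_with_errors(SAMPLE):
    print(*row)
```

On the excerpt this lists `spare-1` and both of its children: the original disk `wwn-0x5000c500a23e4511` and the in-use spare `wwn-0x5000c500c4be3956`.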

The disk has several dead sectors (Reallocated_Sector_Ct is 16 and Reported_Uncorrect is 2):

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   077   064   044    Pre-fail  Always       -       49244880
  3 Spin_Up_Time            0x0003   092   091   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       26
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       16
  7 Seek_Error_Rate         0x000f   094   060   045    Pre-fail  Always       -       2591822807
  9 Power_On_Hours          0x0032   067   067   000    Old_age   Always       -       29725 (204 244 0)
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       26
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   098   098   000    Old_age   Always       -       2
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   070   063   040    Old_age   Always       -       30 (Min/Max 28/37)
191 G-Sense_Error_Rate      0x0032   097   097   000    Old_age   Always       -       6251
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       282
193 Load_Cycle_Count        0x0032   073   073   000    Old_age   Always       -       55159
194 Temperature_Celsius     0x0022   030   040   000    Old_age   Always       -       30 (0 11 0 0 0)
195 Hardware_ECC_Recovered  0x001a   028   001   000    Old_age   Always       -       49244880
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       21543h+27m+28.037s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       19384770747
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       123220748329
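
The sector-related counters can be pulled out of the SMART table above with a few lines of Python. This is a hedged sketch rather than an established procedure: the function name `nonzero_watched` is hypothetical, and the set of watched attribute IDs (5, 187, 197, 198) is an assumption about which attributes matter for bad sectors, not something stated in this ticket. The sample rows are copied from the table above.

```python
# Rows copied from the SMART attribute table above.
SMART_SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       16
  9 Power_On_Hours          0x0032   067   067   000    Old_age   Always       -       29725 (204 244 0)
187 Reported_Uncorrect      0x0032   098   098   000    Old_age   Always       -       2
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

# Assumption: these attribute IDs are the ones of interest for bad sectors.
WATCHED_IDS = {5, 187, 197, 198}


def nonzero_watched(table_text, watched=WATCHED_IDS):
    """Map attribute name to raw value for watched attributes with a raw value above zero."""
    hits = {}
    for line in table_text.splitlines():
        # Ten columns; the last one (RAW_VALUE) may itself contain spaces.
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # header or unrelated line
        if int(fields[0]) not in watched:
            continue
        raw = int(fields[9].split()[0])  # leading integer of RAW_VALUE
        if raw > 0:
            hits[fields[1]] = raw
    return hits


print(nonzero_watched(SMART_SAMPLE))
```

For the rows shown, only `Reallocated_Sector_Ct` (16) and `Reported_Uncorrect` (2) are reported; the pending and offline-uncorrectable counters are zero.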

Migrated from T3380 (view on Phabricator)