View Single Post
Old 19th May 2020, 02:25   #1  |  Link
FranceBB
Broadcast Encoder
 
FranceBB's Avatar
 
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 2,902
Is my HDD going to fail?

Hi there,
my computer began to fail to boot randomly when I use Fedora: sometimes it does sometimes it doesn't.
Generally, when it boots, it shows a list with a series of logs of the things that are starting 'till it gets to the login page (it's how Fedora works), however when it doesn't it just shows a blank black screen, no logs, no messages, nothing, just a black empty screen. If I reboot and I try two or three times without changing anything, it finally boots and everything works as expected.
I even tried with 10 different kernels with no success: it still behaves the same (with the exception that on top of this, newer kernels can't detect my fan so it's a "no no" for me).
I have two drives, an old Toshiba HDD from 2016 which has Fedora and a relatively new NVME Intel SSD from 2018 which has Windows 10 and my beloved Windows XP.
This is the output of the S.M.A.R.T Test for both of them:

SATA HDD from 2016 with Fedora:

Code:
=== START OF INFORMATION SECTION ===
Model Family:     Toshiba 2.5" HDD MQ01ABD...
Device Model:     TOSHIBA MQ01ABD100
Serial Number:    76GJT8W0T
LU WWN Device Id: 5 000039 7219070e9
Firmware Version: AX0R2J
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon May 18 23:52:35 2020 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(  120) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 238) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   050    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   050    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0027   100   100   001    Pre-fail  Always       -       1326
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       1796
  5 Reallocated_Sector_Ct   0x0033   100   100   050    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   050    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   050    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   044   044   000    Old_age   Always       -       22509
 10 Spin_Retry_Count        0x0033   135   100   030    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       1779
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       46
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       385
193 Load_Cycle_Count        0x0032   048   048   000    Old_age   Always       -       520262
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       36 (Min/Max 12/47)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
220 Disk_Shift              0x0002   100   100   000    Old_age   Always       -       0
222 Loaded_Hours            0x0032   055   055   000    Old_age   Always       -       18192
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
224 Load_Friction           0x0022   100   100   000    Old_age   Always       -       0
226 Load-in_Time            0x0026   100   100   000    Old_age   Always       -       268
240 Head_Flying_Hours       0x0001   100   100   001    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     22506         -
# 2  Short offline       Completed without error       00%      9637         -
# 3  Extended offline    Aborted by host               90%      9637         -
# 4  Extended offline    Completed without error       00%      6676         -
# 5  Extended offline    Aborted by host               90%      6673         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
NVME SSD from 2019 with Windows 10 and WinXP:

Code:
=== START OF INFORMATION SECTION ===
Model Number:                       INTEL SSDPEKNW010T8
Serial Number:                      BTNH90810PRS1P0B
Firmware Version:                   002C
PCI Vendor/Subsystem ID:            0x8086
IEEE OUI Identifier:                0x5cd2e4
Controller ID:                      1
Number of Namespaces:               1
Namespace 1 Size/Capacity:          1,024,209,543,168 [1.02 TB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Mon May 18 23:54:08 2020 BST
Firmware Updates (0x14):            2 Slots, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x005f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Maximum Data Transfer Size:         32 Pages
Warning  Comp. Temp. Threshold:     77 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     4.00W       -        -    0  0  0  0        0       0
 1 +     3.00W       -        -    1  1  1  1        0       0
 2 +     2.20W       -        -    2  2  2  2        0       0
 3 -   0.0300W       -        -    3  3  3  3     5000    5000
 4 -   0.0040W       -        -    4  4  4  4     5000    9000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        33 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    7,937,553 [4.06 TB]
Data Units Written:                 9,910,024 [5.07 TB]
Host Read Commands:                 185,445,136
Host Write Commands:                62,433,576
Controller Busy Time:               2,095
Power Cycles:                       351
Power On Hours:                     4,628
Unsafe Shutdowns:                   43
Media and Data Integrity Errors:    0
Error Information Log Entries:      0
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Thermal Temp. 1 Transition Count:   609
Thermal Temp. 1 Total Time:         7024
Is it the Hard Drive? Is it going to die? Should I backup everything immediately or perhaps it could be something else?
FranceBB is offline   Reply With Quote