Samsung SSD 840: Endurance Destruct Test

    2013 - 10.10

    When you operate a datacenter with many servers, you probably also have a large number of installed disks. In most cases, even if you run a cluster, there is still a SPOF (single point of failure) somewhere, especially in small companies.

    At such a SPOF it is critical to know the condition of the system. When the system uses HDDs, it is important to know their condition in order to prevent failures, plan maintenance, etc.

    We will talk about the Samsung SSD 840 PRO series. We know this SSD has very good performance and lifetime, but before we use it in production we must know how to monitor its condition. There are many articles and technical specifications, but we still had a lot of unanswered questions.


    (Update: 10.10.2013)

    (Samsung performance issues, read new post, 03.12.2014)

    For example:

    1. What happens when the normalized SMART value “Wearleveling count” (WLC) drops to 0?
    2. Will it affect disk performance?
    3. How does the temperature change while the disk is in use?
    4. How much data can we write to the disk without errors?

    Hardware

    • Samsung SSD 840 PRO – 128 GB
    • Dell Server R320
    • 32 GB ECC RAM
    • 1x CPU, Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz (6 Cores + HT)
    • SAS Controller, PERC H710 NV
    • OS Debian Squeeze
    • Linux Kernel 3.2.2

    Tests

    Our test consists of two stages that run in a loop.

    Stage 1 is a performance test. It writes 5 GB of data to the disk for each of the sequential read/write and random read/write tests. During this stage we write about 20 GB of data. We measure speed, latency and IOPS.

    Stage 2 is a fill test. It measures the same values as Stage 1 but fills the whole disk.

    All stages write data with the “sync” option.
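The original test scripts are not published, so the following is only a rough sketch of how the two stages could be expressed as fio command lines. The device path, the job names and the 4k block size are assumptions and not values from the article. The commands are only built, not executed, because running them against a real device destroys its contents.

```python
def fio_command(name, rw, filename, size=None, block_size="4k"):
    """Build (but do not run) a fio command line for one test step."""
    cmd = [
        "fio",
        f"--name={name}",
        f"--filename={filename}",
        f"--rw={rw}",          # read, write, randread or randwrite
        f"--bs={block_size}",  # assumption: the article does not state the block size
        "--sync=1",            # the "sync" option: synchronous (O_SYNC) writes
    ]
    if size is not None:
        cmd.append(f"--size={size}")
    # Without --size, fio uses the full size of a block device.
    return cmd


DEVICE = "/dev/sdX"  # placeholder, not the device used in the article

# Stage 1: four 5 GB performance tests (about 20 GB in total).
STAGE1 = [
    fio_command(f"perf-{rw}", rw, DEVICE, size="5g")
    for rw in ("write", "read", "randwrite", "randread")
]

# Stage 2: one sequential pass that fills the whole disk.
STAGE2 = [fio_command("fill", "write", DEVICE)]
```

Each list could be handed to `subprocess.run` in a loop to repeat the two stages, which is the overall shape of the test described above.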

    The test server is monitored separately for CPU, memory, system load average, network, active socket statistics, number of processes, IO stats, free disk space and number of connected users. We also monitor all SMART values of the SSD.

    You may ask why we monitor all these values. The simple answer is: to know what is going on on the server. When we process the results of these tests, we may see strange values in some time intervals. We need additional information about what happened in order to decide whether such a result is false or reflects the real behavior of the SSD.

    I use the FIO utility for all tests on the SSD.

    Overall I have 58 graphs from the server; most of them show SMART values.

    Wearleveling count

    What happens when it drops to 0?

    Simply nothing :). This value dropped to 1 after about 465 TB of data had been written. I calculated this figure from the number of tests and cross-checked it against the SMART value “Total LBAs Written”. What I found is that this is only a pre-fail value; there were no errors or sector reallocations. To date the disk has written another 235 TB without any errors or reallocations, and the test still continues.

    ssd-test-war

    ssd-test-lbas

    Wearleveling count vs performance

    As you can see in the graphs below, there is no performance decrease before or after we reach 1% WLC.

    ssd-test-iostat

    These graphs are only for comparison and show indicative values. You should see no performance decrease during the test.

    When we want exact values we must look into the result logs from the fio utility; see the section “Graphs from FIO test utility” in this article.

    Temperature

    In our datacenter the temperature of the SSD is normally around 26 °C. When the SSD starts writing and reading, the temperature rises to as much as 39.9 °C.

    ssd-test-temp

    Data written

    As I mentioned above, the test isn’t done yet. When I wrote this article, Total LBAs Written was 1503673300498. This value shows how many 512-byte blocks were written.

    LBA_Value * 512 = Bytes written

    So we have written about 700 TB to this 128 GB disk. Each test cycle writes 148 GB (128 GB to fill the disk and 20 GB for the performance test). We ran about 5000 tests, which I can see on the counter. When we divide 700 TB by 148 GB we get around 4870 tests. It’s really close; the difference is around 3%.
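The conversion above can be checked with a few lines of Python. This is a minimal sketch using only the numbers quoted in the article. It assumes binary units (1 TB = 2^40 bytes, 1 GB = 2^30 bytes), which is the interpretation under which the raw counter works out to about 700 TB.

```python
SECTOR_BYTES = 512                  # one LBA is a 512-byte block
TOTAL_LBAS_WRITTEN = 1503673300498  # SMART raw value quoted in the article

bytes_written = TOTAL_LBAS_WRITTEN * SECTOR_BYTES

# Binary units, assumed to be what the article calls "TB" and "GB".
tib_written = bytes_written / 2**40
# Decimal terabytes, for comparison.
tb_written_decimal = bytes_written / 10**12

# One test cycle: 128 GB fill stage + 20 GB performance stage.
CYCLE_GIB = 128 + 20
estimated_cycles = bytes_written / (CYCLE_GIB * 2**30)

print(f"{tib_written:.1f} TiB written ({tb_written_decimal:.1f} TB decimal)")
print(f"about {estimated_cycles:.0f} test cycles")
```

The estimate lands within a few percent of the roughly 5000 runs shown on the test counter, which is the same cross-check made in the text.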

    ssd-test-lbas

    Bandwidth and IOPS

    First, a look at the manufacturer’s site for the specification.

    Manufacturer specification:

    • Seq. read up to 530 MB/s
    • Seq. write up to 390 MB/s
    • Rand. read speed up to 97 000 IOPS
    • Rand. write speed up to 90 000 IOPS

    Let’s see my measured values:

    • Seq. read ~ 275 MB/s
    • Seq. write ~ 300 MB/s
    • Random read over 250 MB/s
    • Random write ~ 100 MB/s
    • Random read speed ~ 65 000 IOPS
    • Random write speed ~ 28 000 IOPS
    • Seq. read speed ~ 67 000 IOPS
    • Seq. write speed ~ 75 000 IOPS

    Quite interesting results. Some values are similar to the manufacturer’s specification and some are really different.
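To put a number on "similar" and "really different", the measured values can be expressed as a percentage of the specification. This is a small sketch that uses only the four metrics appearing in both lists above; the dictionary keys are labels chosen for this example.

```python
# Manufacturer specification ("up to" values).
SPEC = {
    "seq read MB/s": 530,
    "seq write MB/s": 390,
    "rand read IOPS": 97_000,
    "rand write IOPS": 90_000,
}

# Approximate values measured in this test.
MEASURED = {
    "seq read MB/s": 275,
    "seq write MB/s": 300,
    "rand read IOPS": 65_000,
    "rand write IOPS": 28_000,
}


def percent_of_spec(measured, spec):
    """Return each measured value as a percentage of its specified value."""
    return {key: round(100 * measured[key] / spec[key], 1) for key in spec}


RATIOS = percent_of_spec(MEASURED, SPEC)
for key, pct in RATIOS.items():
    print(f"{key}: {pct}% of spec")
```

Sequential write comes closest to the specification at roughly three quarters, while random write reaches under a third of the quoted figure.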

    Graphs from FIO test utility

    Samsung-128G-bw

    Samsung-128G-IOPS

     

    Conclusion

    The test still continues; the disk is at 1% WLC, but no reserved blocks have been used and there have been no reallocations or errors.

    By the time we reached 1% WLC, the disk had written about 465 TB of data.

    This means that if your server writes 20 GB of data daily, it will take about 65 years to get there. If you rewrite the whole disk every day, you reach 1% WLC in about 10 years.

    If you plan to renew hardware every 5 years, you are safe even if you rewrite the whole disk twice a day.
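The lifetime estimates above are simple division and can be reproduced as follows. This is a sketch, assuming binary units (1 TB = 1024 GB) and a constant daily write volume; the function name is made up for this example.

```python
ENDURANCE_TB = 465   # data written when WLC reached 1% in this test
GB_PER_TB = 1024     # assumption: binary units
DISK_GB = 128


def years_to_reach_wlc_1(daily_write_gb, endurance_tb=ENDURANCE_TB):
    """Years of constant daily writes needed to reach the tested endurance."""
    days = endurance_tb * GB_PER_TB / daily_write_gb
    return days / 365


years_at_20_gb = years_to_reach_wlc_1(20)               # light server load
years_full_rewrite = years_to_reach_wlc_1(DISK_GB)      # whole disk once a day
years_two_rewrites = years_to_reach_wlc_1(2 * DISK_GB)  # whole disk twice a day

print(f"20 GB/day:      {years_at_20_gb:.1f} years")
print(f"1 rewrite/day:  {years_full_rewrite:.1f} years")
print(f"2 rewrites/day: {years_two_rewrites:.1f} years")
```

These figures describe only the single 128 GB drive tested here, so they are a rough planning aid rather than a guarantee for other drives or workloads.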

    From what I see, it’s good to monitor these values:

    • Normalized value of WLC
    • Reallocated sector count
    • Normalized value of Used Reserved Blocks Count
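A monitoring script can read these values from the SMART attribute table. The sketch below parses one attribute line in the column layout ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE, which is the layout discussed in the comments under this article. The sample line is the one a reader quotes there, with a plain hyphen in the WHEN_FAILED column; the class and function names are made up for this example.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int   # normalized value (starts at 100 on a new disk)
    worst: int   # lowest normalized value seen so far
    thresh: int  # failure threshold for the normalized value
    raw: str     # vendor-specific raw value


def parse_attribute_line(line: str) -> SmartAttribute:
    """Parse one line of a SMART attribute table into its key columns."""
    # Columns: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED
    #          WHEN_FAILED RAW_VALUE (the raw value may contain spaces).
    fields = line.split(None, 9)
    return SmartAttribute(
        attr_id=int(fields[0]),
        name=fields[1],
        value=int(fields[3]),
        worst=int(fields[4]),
        thresh=int(fields[5]),
        raw=fields[9],
    )


def has_failed(attr: SmartAttribute) -> bool:
    """An attribute counts as failed once VALUE drops to THRESH or below."""
    return attr.value <= attr.thresh


SAMPLE = "177 Wear_Leveling_Count 0x0013 098 098 000 Pre-fail Always - 61"
wlc = parse_attribute_line(SAMPLE)
print(wlc.name, wlc.value, wlc.raw, has_failed(wlc))
```

For trend monitoring, the normalized `value` would be recorded over time and graphed rather than only compared against the threshold.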

    Dictionary

    Wearleveling count – The maximum number of erase operations performed on a single flash memory block.

    Reallocated sector count – When encountering a read/write/check error, a device remaps a bad sector to a “healthy” one taken from a special reserve pool. Normalized value of the attribute decreases as the number of available spares decreases. On a regular hard drive, Raw value indicates the number of remapped sectors, which should normally be zero. On an SSD, the Raw value indicates the number of failed flash memory blocks.

    Used Reserved Blocks Count – On an SSD, this attribute describes the state of the reserve block pool. The value of the attribute shows the percentage of the pool remaining. The Raw value sometimes contains the actual number of used reserve blocks.

     

    UPDATE 10.10.2013

    Our SSD finally died after almost 5 months of heavy write testing.

    Some numbers

    • more than 3 PB written
    • rewritten more than 24 400 times
    • stable temperature of about 37 °C

    and some nice graphs

    Graphs: realloc-sec-count, program-fail-count-total, wear-leveling-count, used-reserved-blocks, total-lba, runtime-bad-block, power-on-hours, ecc-recovered, temperature, data-lba-all, iostat

    Graphs: realloc-sect-normalized, used_rsvd_blk_count-normalized, wear-normalized

     

    UPDATE: (Samsung performance issues, read new post, 03.12.2014)


    24 Responses to “Samsung SSD 840: Endurance Destruct Test”

    1. Andrew says:

      That drive sure took a beating. Interesting to watch how it wore.

      Thank you for sharing all that detailed and well put together info.

    2. toffitomek says:

      brilliant and useful article!

      How did you generate SMART graphs..?

    3. […] just have to share this: /samsung-ssd-84…destruct-test/ This person made a write endurance test on a Samsung 840 Pro, 128 GB version in order to check […]

    4. Alex says:

      Thanks so much! My 840 Evo had a wear leveling count of 1 after only 20 days! I was horrified. Your work is the best proof I need that the drive is safe to use.

    5. Andrey says:

      Hello,
      thanks for sharing that information – you did really impressive piece of work. Could you also share fio scripts that you used to perform tests. I want to try doing similar things working with bigger 840 Pro’s.
      Can you also share how did you bypass server raid controller. I am struggling with Dell r620 and its PERC H710 raid controller. It doesn’t allow me to pass trim commands to the drives. Can you advise something about it?

      thank you

      • Robert Vojčík says:

        Hi Andrey,

        many people ask me about the scripts. Originally the scripts were not ready for distribution; they were very specific. I plan to publish similar scripts on GitHub, both for graphing SMART values and as an example of the testing. So stay tuned 😉

    6. Anonymous says:

      […] load, e.g. as storage in a DB server or as a disk cache. There is an endurance test of the 840 Pro 128GB here, which also shows nicely how the reallocated sectors develop and what dimensions they reach […]

    7. Chris Wetemans says:

      Was it a pro or non-pro 840 you used?

    8. Vitaliy says:

      Thank you for great article! I don’t have any doubts about buying this ssd. You’ve made a good job.

    9. Michał says:

      I was wondering how do you normalize Wear Leveling Count? For example I get:

      177 Wear_Leveling_Count 0x0013 098 098 000 Pre-fail Always – 61

      No information about normalized value.

      • Robert Vojčík says:

        Hi Michal,

        this line consists of “ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE”

        The VALUE column is the normalized form of the last column, which is the actual raw value. The best normalized value, on a new disk, is 100, and as data is written it slowly goes down to 0. Your value is now 98, which indicates the disk is in good health.

    10. David Henderson says:

      Wow! So I have the 250gb version, so I should expect it to get about double the data written since you used a 128GB, correct?

      • Paul M says:

        as a rough estimation, yes; however, there may be a *proportionally* smaller area of flash for over-provisioning, but if you use TRIM then that shouldn’t make much difference depending on how often you push it into write amplification.

    11. supertramp says:

      Hello everyone,

      I have Samsung 840 EVO 120GB SSD. Today I realised I have a “used reserve block count error” on HDTune pro.

      Here is the screenshot:

      http://imgur.com/Bms5qnR

      Samsung Magician and SSDLife reporting the same smart values but no warning.

      Any ideas about the meaning of this?

      Waiting to hear from you guys.

      Best

    12. […] year ago I wrote blogpost about endurance and performance test of Samsung SSD 840 PRO. Some things has changed, especially firmware of […]

    13. Poma says:

      Thank you for that useful article

    14. […] I think the calculation is based on 5000 or 6000 specified P/E cycles, but as this endurance test of an 840 Pro 128GB shows, Samsung usually states the guaranteed cycles very conservatively: Further, you can see […]
