Welcome to Myce’s review of the Smart Storage Systems Cloudspeed
1000E SATA Enterprise SSD.
Sandisk recently completed the acquisition of Smart Storage
Systems.
The last Smart Storage Systems drive that we reviewed was
the truly outstanding Optimus,
our current Editor’s choice from amongst SAS based Enterprise SSDs. However, the
Cloudspeed is a relatively inexpensive solution that targets mixed workloads
and Smart tells us that the Cloudspeed will be sold at a sub 1.30 USD per
gigabyte price point – this will put it head to head with Intel’s DC
S3500, which we reviewed recently. With the DC S3500 we saw how Intel had
managed with the packaging of its new smaller size 20nm MLC NAND (as compared
to its sister drive, the Intel DC S3700,
which uses the same set of components but with 25nm eMLC NAND) – it is fair to
say that write performance and endurance suffered. So it will be interesting
to see how Smart Storage Systems has managed with the packaging of its new 19nm
MLC NAND solution and how it compares with its closest competitor.
Market Positioning and Specification
Market Positioning
This is how Smart Storage Systems positions the Cloudspeed
product series -

Specification
Here is Smart Storage Systems’ specification for the Cloudspeed
1000 and 1000E -

You can see that the sole difference between the 1000 and
the 1000E is the amount of NAND set aside for use by the controller. The
greater amount set aside in the 1000E results in improved endurance and a
slight increase in random Read/Write IOPS.
Product Image
Here is a picture of the Cloudspeed 1000E that I tested –

I understand the Cloudspeed uses 19nm Toshiba MLC NAND and a
Marvell 9187 Controller.
Now let's head to the next page, to look at Myce’s
Enterprise Testing Methodology.....
Please click
here
to view or download a detailed introduction to Myce’s Enterprise Class Solid
State Storage (‘SSS’) Testing Methodology as a PDF.
Put briefly:
All testing is performed on an OakGate Technology test unit.
We perform two sets of Performance Tests:
- A full set of the mandatory Storage Networking Industry
Association’s (‘SNIA’) tests as specified in their Solid State Storage
Performance Test Specification Enterprise V1.0 – SNIA
SSS PTS Version 1.0.
- A set of tests, known as the ‘Myce/OakGate Full
Characterisation Test Set’, that provides readers with a fuller
characterisation of the solution.
We also review other important factors such as Power
Consumption, Data Reliability and Failover features.
A word about SNIA testing – before striking a partnership
with OakGate Technology I spent some time researching how I may implement SNIA
testing using freely available tools such as IOMeter and FIO. I arrived at the
conclusion that whilst it was theoretically possible it was impractical. The
reason is that, without the automation offered by a test bench such as
the OakGate Unit, the only way to meet the SSS PTS requirements is to run the
maximum number of test cycles, then manually look back at the results to
determine when/if steady state was achieved in the workload specific test
cycle, and then harvest the data from the qualifying Measurement Window. This
means that the test runs would always take the maximum elapsed time, and there
would be a great deal of human effort required to review, gather, and report
upon the data. I empathise with, acknowledge, and respect the efforts of other
reviewers who endeavour to meet the SNIA’s principles in their testing - I am
privileged and thankful to be able to use a superb test bench which automates
the whole process and allows me to meet the SNIA’s specification in full.
Before we move on, let’s remind ourselves of some basics –
When reviewing the performance of an SSS solution there are
three basic metrics that we look at:
1. IOPS – the number of
Input/Output Operations per Second
2. Bandwidth – the number of
bytes transferred per second (usually measured in Megabytes per second, ‘MB/s’)
3. Latency – the amount of time
each IO request will take to complete (usually, in the context of SSS
solutions, measured in Microseconds, which are millionths of a second).
It is true to say that IOPS and Bandwidth had both been
growing rapidly before the advent of SSS solutions, but Latency can only be significantly
decreased by eliminating mechanical devices, and thus lower Latency is the single
most important benefit that SSS solutions deliver to enhance performance.
Latency in a technical environment is synonymous with delay.
In the context of an SSS solution it is the amount of time between an IO
request being made, and when the request is serviced.
Bandwidth, also commonly referred to as ‘Throughput’, is the
amount of data that can be transferred from a storage device to a host, in a
given amount of time. In the context of SSS solutions it is typically measured
in Megabytes per second (MB/s).
A great enterprise SSS solution
offers an effective balance of all three metrics. High IOPS and Bandwidth is
simply not enough if Latency (the delay in an IO operation) is too high. As we
will see in the test results presented below, as Latency increases IOPS will
inevitably decrease.
Queue Depth is the average number
of IO requests outstanding. If you are running an application and the Average
Queue Depth is one or higher while CPU utilisation is low, then the application’s
performance is most probably suffering from a ‘Storage Bottleneck’.
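The way these metrics depend on one another can be sketched with Little's Law (outstanding IOs = IOPS x average Latency). The snippet below is only an illustration and is not part of the Myce/OakGate tooling; the function names, the sample latencies and the convention of 1 MB = 1,000,000 bytes are all assumptions.

```python
def iops_from_latency(queue_depth: int, avg_latency_us: float) -> float:
    """Little's Law rearranged: IOPS = outstanding IOs / average latency."""
    return queue_depth * 1_000_000 / avg_latency_us


def bandwidth_mb_s(iops: float, block_size_bytes: int) -> float:
    """Bandwidth in MB/s, assuming 1 MB = 1,000,000 bytes."""
    return iops * block_size_bytes / 1_000_000


# Illustrative figures only (not measured results): at Queue Depth 1,
# doubling the average latency halves the achievable IOPS.
fast = iops_from_latency(queue_depth=1, avg_latency_us=100.0)
slow = iops_from_latency(queue_depth=1, avg_latency_us=200.0)
print(fast, slow)
print(bandwidth_mb_s(fast, 4096))
```

With a fixed Queue Depth, any rise in average Latency lowers IOPS in direct proportion, and Bandwidth follows from IOPS multiplied by the IO size.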
Another important factor to
remember is that SSS performance is influenced by previous workloads, not just
the current workload, and especially by what has previously been written to the
drive. As specified in the SNIA SSS PTS the goal of all good Enterprise level
testing is to provide consistent circumstances, so that results can be compared
fairly across different SSS solutions – it is for this reason that all of our
tests start with a purge of the drive, so that it starts in a ‘Fresh Out of the
Box’ (FOB) state. Most tests then have a pre-conditioning phase where the
drive is put into a ‘Steady State’ before the test phase begins. Put briefly, a
‘Steady State’ is achieved when the performance of the drive no longer varies
over time and settles into a consistent level of performance for the workload
in hand. You can find a detailed explanation of ‘Steady State’ and how it is
determined in the SNIA tests in our Enterprise Testing Methodology paper, which
can be viewed or downloaded as a PDF by clicking here.
For interest, here are some
generally accepted assumptions that differentiate the use and therefore the
approach to testing Enterprise/Server and Consumer/Client SSS solutions:
Enterprise/Server SSS
assumptions:
- The drive is always full
- The drive is being accessed 100% of the time (i.e. the
drive gets no idle time)
- Failure is catastrophic for many users
- The Enterprise market chooses SSS solutions based on their
performance in steady state, recognising that steady state, full, and worst case
are not the same thing
Consumer/Client SSS
assumptions:
- The drive typically has less than 50% of its user space
occupied
- The drive is accessed around 8 hours per day, 5 days per
week, and typically data is written far less frequently
- Failure is catastrophic for a single user
- The consumer/client market generally chooses SSS solutions
based on their performance in the FOB state
Esther Spanjer, Director, SSD Technical Marketing at Smart Storage Systems, said, 'I
am happy to commend Myce for their high level of professionalism and
cooperation during the review process'. Ms. Spanjer added, 'I wish them every
success in their partnership with OakGate Technology and their initiative to
provide authoritative performance reviews for the Enterprise Solid State
Storage market'.
Now let's head to the next page, to look at the results
of our SNIA IOPS (Input/Output Operations per Second) Test.....
Here is the specification for this test -

IOPS performance will typically
vary greatly depending on the nature of the IO traffic, including the mixture
of Read and Write operations, and the mixture of Block Sizes (the size of the
IO operation’s data packet, also referred to as IO Size). This test is designed
to benchmark the IOPS performance profile for random IO operations for 56
different combinations of Read/Write mix % and Block Sizes when in a Steady
State, which are of interest to most users.
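The 56 combinations can be enumerated as a simple test matrix. The sketch below assumes seven Read/Write mixes and eight Block Sizes; the specific values are an assumption based on a reading of the SNIA SSS PTS and should be checked against the specification itself before being relied upon.

```python
from itertools import product

# Assumed values (see note above): (read %, write %) mixes and block sizes in KiB.
READ_WRITE_MIXES = [(100, 0), (95, 5), (65, 35), (50, 50), (35, 65), (5, 95), (0, 100)]
BLOCK_SIZES_KIB = [1024, 128, 64, 32, 16, 8, 4, 0.5]

# Every mix is run against every block size.
test_matrix = list(product(READ_WRITE_MIXES, BLOCK_SIZES_KIB))
print(len(test_matrix))  # 7 mixes x 8 block sizes = 56
```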
All of the SNIA’s test
specifications define a ‘required’ set of parameters that must be run for the
test and then allow the operator to elect to run additional tests with
different parameters of their choice. It is the mandatory test with the
required parameters that we run. Note that all of the mandatory tests must be
conducted with fully random data.
As previously mentioned, a key
principle of SNIA testing is to provide a consistent basis for comparing
different solutions from different manufacturers - myce.com/blog will be in a strong
position to publish meaningful comparisons as we gain more experience in the
review of Enterprise level SSS solutions.
Here is the report of the results -

The second table confirms the Range in the Measurement
Window (the maximum variation of a 4K Round value from the Average of the 4K Round
values) and the slope of the best linear fit through the 4K values (please see the
Testing Methodology paper for a detailed specification of the criteria for
determining the achievement of Steady State, click here).
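A Steady State check of this kind can be sketched in a few lines. This is a hedged illustration rather than the OakGate implementation: the five-round window, the 20% range limit, the 10% slope limit and the use of maximum minus minimum for the range are assumptions, and the sample round values are made up.

```python
from statistics import mean


def is_steady_state(rounds, window=5, max_range=0.20, max_slope=0.10):
    """Check the last `window` round values against two assumed criteria."""
    if len(rounds) < window:
        return False
    y = rounds[-window:]
    x = list(range(window))
    avg = mean(y)
    # Criterion 1: spread of the values relative to their average.
    range_ok = (max(y) - min(y)) <= max_range * avg
    # Criterion 2: total rise or fall of the least-squares line over the window.
    x_avg = mean(x)
    slope = sum((xi - x_avg) * (yi - avg) for xi, yi in zip(x, y)) / sum(
        (xi - x_avg) ** 2 for xi in x
    )
    slope_ok = abs(slope) * (window - 1) <= max_slope * avg
    return range_ok and slope_ok


# Made-up 4K write IOPS per round, for illustration only.
settling = [60000, 45000, 33000, 30500, 29800, 29400, 29300, 29200]
print(is_steady_state(settling[:5]))  # early rounds, still falling
print(is_steady_state(settling))      # last five rounds have flattened out
```

A test bench that applies such a check after every round can stop as soon as the window qualifies, instead of always running the maximum number of rounds.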

You can see here that Steady State Convergence was
determined at the end of Round 5. The Steady State Convergence Plot provides a
visual confirmation of Steady State Convergence.

This graph shows the average results gathered in the
Measurement Window. You can see an expected drop in IOPS performance as IO size
increases and/or the percentage of Writes increases.


This is an alternative method for presenting the results
from the Measurement Window; one which personally I prefer. Users can simply
refer to the grid to obtain the R/W mix and Block Size value of interest.
For example, Online Transaction Processing applications typically run at a Block
Size of 8K and a Read/Write Mix of 65/35, and users can quickly understand how
the device might perform under Steady State for these access characteristics.
You can see that the 4K 100% Read IOPS result is 82,439,
which exceeds Smart’s specification of 80,000, and that the 4K 100% Write IOPS
result is 29,154, which in the context of this test falls slightly short of
Smart’s specification of 30,000.
Now let's head to the next page, to look at the
results of the SNIA Write Saturation Test.....
Here is the specification for this test -

The objective of this test is
to observe how the drive’s performance evolves over
time, from a ‘factory fresh’, ‘fresh out of the box’ (‘FOB’) state. When a
drive is in a FOB state (e.g. after it has been purged by a
SATA Secure Erase or SCSI Format), we can expect an initial period of time when
writes can easily be accommodated by clean/empty blocks. Once all of the clean
blocks have been written to once, the drive’s controller must first clean
blocks (with erase write operations) before it can write new data, and we can
expect a slow-down. The slow-down is usually quite dramatic and is commonly
referred to as the ‘write cliff’.
The Write Saturation Test is
easy to run as it requires no steady state determination – it can be easily run
in freely available software, such as IOMeter.
Here is the report of the
results -


You can see here a significant drop in Write IOPS
performance as the Cloudspeed 1000E reaches a Steady State. The marked fall, at
around Round 39 occurs when all of the available NAND has been written to once
and the drive must clean blocks on the fly, in preparation for accommodating
further writes – this is commonly referred to as the ‘Write Cliff’.
This is a picture of typical behaviour.
Note that the test was halted, as specified in the SNIA SSS
PTS, when 4 x the User Capacity had been written to the drive. You can see that
the Cloudspeed is settling into a steady state at around the 30,000 IOPS level.

You can also see that the latency graph line is a mirror
image of the IOPS graph line.

This is a graph showing the Maximum Write Latency values
that occurred in each Round.
Now let's head to the next page, to look at the SNIA
Throughput Test.....
Please note that we have moved up to Version 1.1 of the SNIA
Throughput Test specification. The v1.0 Throughput Test is missing a pre-fill
stage between the purge and the test loop, both of which sit within the overall
loop on Block Size.
Here is the specification for the Version 1.1 test -

The test is designed to measure the sequential Read and
Write IO performance for two Block Sizes, when under Steady State conditions.
One can easily compare the results produced by this test with box-top numbers,
which are usually stated as “Up to xxx MB/S”.
Here is the report of the results -



You can see here that Steady State was achieved for both Write
IO sizes by the end of Round 5.

You can see here that Steady State for both Read IO sizes
was achieved by the end of Round 6.

You can see here the average of the values recorded in the
Measurement Window. These results exceed Smart’s specification of up to 350 MB/s
Writes and 450 MB/s Reads.
Now let's head to the next page, to look at the results
of the SNIA Latency Test.....
Here is the specification for this test -

The Latency Test measures average and maximum response times
using random IOs at specified Block Sizes and Read/Write mixes, taken under
steady state conditions. The test runs at a Queue Depth of 1 (1 outstanding
IO), thus the results give the baseline response time for a single IO request.
The test also reports maximum latency values, which can be
helpful to see if there might be processes within the drive that may cause max
Latency values to become larger.
Here is the report of the results -


These are the Average and Maximum Latency Values observed in
the Measurement Window (measured in Milliseconds).

You can see here that Steady State Convergence was achieved
at the end of Round 13.

Here is a graph of the Maximum Latency results.

Here you can see a graph of the Average Latency results.

Here is a 3D graph showing, at a glance, the Maximum Latency
values for each combination of Read/Write Mix and IO Size.

Here is a 3D graph showing, at a glance, the Average Latency
values for each combination of Read/Write Mix and IO Size. These are very good
Latency results, particularly for the 4K and 8K 100% Writes.
Now let's head to the next page, to look at the results
for the Myce/OakGate Read and Write Latency Tests......
Here are the specifications for the tests -


These tests steadily increase the random 4K IO demand in terms
of IOPS, and report the drive’s response in terms of Average IOPS, Average
Latency and Maximum Latency. They are designed to show a drive’s maximum IOPS
capability and report the all-important Latency numbers for each level of IOPS
demanded. The Maximum Latency numbers give us an insight into the occurrence
of Latency peaks that could cause an unexpectedly slow response from time to time.
Here are the results –
Firstly, here is a graph showing the result for the
Pre-Conditioning in Step 2 -

You may notice that I have increased the duration of the
preconditioning step to 3 hours – this was necessary as the drive had not
fallen into a steady state after the normal 2 hours.
4K Latency Read Test

You can see that the drive can no longer meet the increase
in IOPS demand at around 77,000 IOPS, which in the context of this test is
falling slightly short of Smart’s specification of 80,000.

You can see a gradual increase in read latency up to the
maximum IOPS mark. The average Read Latency is excellent.

Here we can see that the Max Read Latencies are remarkably
consistent.
Let’s have a look at the distribution of the Latency results
at the 75,000 IOPS mark –

As this is the first time in this review that we are
looking at a High Resolution Latency Histogram, here’s an explanation – the X
axis to the left is the count of the IOs in the observation period (in a Round)
that had a Latency of the value along the Y axis (please note that the X axis
is logarithmic to allow the low order counts of the huge number of IOs that
have been measured to be visible); the Y axis is the Latency value measured in
Microseconds; the X axis to the right is the % of the Total IOs observed that
have a Latency <= a given Latency value; the rate of getting to 100% is
highlighted by the red graph line.
You can see that 99.9% of the Latency values are <= 550
Microseconds.
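For readers who want to derive a figure such as "99.9% of values <= N Microseconds" from their own histogram data, the cumulative calculation can be sketched as follows. The function name and the bucket counts are made up for illustration and are not the measured results of this review.

```python
def latency_at_percentile(histogram, percentile):
    """Return the smallest latency bucket at which the cumulative share of
    IOs reaches `percentile` (0-100). `histogram` maps a latency in
    microseconds to the count of IOs observed at that latency."""
    total = sum(histogram.values())
    cumulative = 0
    for latency_us in sorted(histogram):
        cumulative += histogram[latency_us]
        if cumulative * 100 >= percentile * total:
            return latency_us
    raise ValueError("percentile must be between 0 and 100")


# Made-up bucket counts, for illustration only.
sample = {100: 900_000, 250: 90_000, 550: 9_500, 2000: 500}
print(latency_at_percentile(sample, 99.0))
print(latency_at_percentile(sample, 99.9))
```

The handful of IOs in the slowest bucket barely move the average, which is why the percentile figures and the maximum values are reported alongside it.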
4K Latency Write Test

You can see here that the Cloudspeed fails to meet the
increase in IOPS demand beyond 31,000. This however exceeds Smart’s specification
of 30,000 for maximum write IOPS.

Here we can see that Average Write Latency stays below 100
Microseconds until 25,000 IOPS.

Here are the Maximum Write Latency plots.
Now let’s have a look at the distribution of the Latency
Values at the 30,000 IOPS Mark –

You can see that 99.9% of the Latency Values are <= 7.82
Milliseconds (ms).
Now let's have a look at the distribution of the latency
values in a test designed to show the Cloudspeed 1000E’s Quality of Service
(QoS) for 4K Writes and Reads when in a Steady State. The specification for the
test is:
1) Purge the Drive
2) Precondition the drive by performing 4K random
writes for 3 hours (100% random data)
3) Perform 60 rounds of 4K Random Writes,
with each round consisting of 9 seconds warm up and 51 seconds of performance
measurement
4) Perform 60 rounds of 4K Random Reads, with each round consisting
of 9 seconds warm up and 51 seconds of performance measurement.
The test was performed at a Queue Depth of 1.
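The test plan above can be written down as data, which also makes its running time easy to check. This is a minimal sketch; the `Phase` class and its field names are hypothetical and are not taken from the OakGate software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    rounds: int
    warm_up_s: int
    measure_s: int

    @property
    def total_s(self) -> int:
        """Elapsed seconds for all rounds of this phase."""
        return self.rounds * (self.warm_up_s + self.measure_s)


qos_phases = [
    Phase("4K random writes", rounds=60, warm_up_s=9, measure_s=51),
    Phase("4K random reads", rounds=60, warm_up_s=9, measure_s=51),
]
PRECONDITION_S = 3 * 60 * 60  # 3 hours of 4K random writes

for phase in qos_phases:
    print(phase.name, phase.total_s / 60, "minutes")
total_hours = (PRECONDITION_S + sum(p.total_s for p in qos_phases)) / 3600
print(total_hours, "hours, excluding the purge")
```

Each measured phase therefore lasts one hour, of which 51 minutes contribute to the reported results.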
Here are the results -

Here are the Average Write Latency plots per Round.

And here are the Maximum Write Latency plots per Round.

And here is the High Resolution Latency Histogram for Round
30.
You can see that 99.9% of the Latency Values were <= 2,470
Microseconds (2.47ms) and that 99% of the values were <= 30 Microseconds
(which is excellent). If you look carefully you will also see that there are
relatively few outliers.

Here are the average Read Latency plots per round.

And here are the Maximum Read Latency plots per Round.

And here is the High Resolution latency Histogram for Round 30.
You can see that 99.9% of Latency values were <= 120 Microseconds. You can
also see that there are remarkably few outliers. This is an exceptionally good
result.
Now let's head to the next page, to look at the results
for the Myce/OakGate Reads and Writes Tests.....
Here is the specification for the tests -

The tests are designed to show the Random and Sequential,
Read and Write, performance metrics for different combinations of Queue Depth
and IO size.
Here are the results -
Random Reads

Here you can see IOPS drop as IO size increases. You can
also see that there is good scalability up to the maximum SATA Queue Depth of 32.

Here you can see a gradual increase in Bandwidth as IO Size
increases, plus again there is good scalability up to the maximum Queue Depth
of 32.

You can see here that Read Latency increases as IO Size and
Queue Depth increase.
Random Writes
When we tested the Intel DC S3500 we found that Random Writes
was its Achilles' heel. For interest we include the Random Writes results for
the DC S3500 here, to show how the Cloudspeed compares.

Here are the results for the Cloudspeed 1000E; you can see a
very distinctive and healthy IOPS peak for the 4K IO Size.

And here are the results for the DC S3500 for comparison, with the
obvious lack of a 4K peak.

Here are the Bandwidth results for the Cloudspeed. You can
see that for the 4K IO Size and beyond they are nearly twice those of the
DC S3500, whose results follow -


Not surprisingly, we see the same pattern echoed in
the latency results.

We are pleased to see that the Cloudspeed does not have the
same Achilles' heel as the Intel DC S3500.
Sequential Reads

You can see here that Sequential Read IOPS decreases as IO
Size and Queue Depth increase.

You can see here that
Bandwidth scales all the way up to the maximum Queue Depth of 32.

You can see here that Read Latency increases as IO Size and
Queue Depth increase.
Sequential Writes


It is interesting to note that the Sequential Write
Bandwidth results for the Cloudspeed are significantly better than those for
the Intel DC S3500, particularly for the larger IO sizes.

Now let's head to the next page, to look at the results
for the Myce/OakGate 4K Mixed Reads/Writes Tests.....
Myce/OakGate 4K Mixed Reads/Writes Tests

This test is designed to show the performance metrics for
different combinations of Queue Depth and Read/Write mix (the % of Reads and
the % of Writes making up the IO traffic).
4K Mixed R/W Test

You can see that there is no dramatic decrease in Read IOPS
as a small % of writes enters the mix.








Now let's head to the next page, to look at the results
of the Myce/OakGate Entropy Tests.....

These tests are designed to show performance metrics for
different combinations of Queue Depth and Entropy % (Entropy % is the degree to
which the data is random and therefore incompressible). Testing with
different Entropy % levels has become important with the advent of controllers,
such as those from LSI SandForce, that compress data before writing it to NAND.
Controllers that compress data can be expected to perform better with highly
compressible data (i.e. data with low Entropy).
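The link between Entropy % and compressibility is easy to demonstrate on a host machine. The sketch below builds buffers with a chosen share of random bytes and compresses them with zlib; how the OakGate unit actually generates its data patterns is not known here, so `make_buffer` is a hypothetical stand-in and zlib merely stands in for whatever a compressing controller might do.

```python
import os
import zlib


def make_buffer(size: int, entropy_pct: int) -> bytes:
    """Build a buffer whose first `entropy_pct` percent is random bytes and
    whose remainder is zeros (trivially compressible)."""
    random_len = size * entropy_pct // 100
    return os.urandom(random_len) + bytes(size - random_len)


for pct in (0, 50, 100):
    data = make_buffer(1 << 20, pct)
    ratio = len(zlib.compress(data)) / len(data)
    print(f"entropy {pct:3d}%: compressed to {ratio:.0%} of original size")
```

A controller that compresses data has far less to write to NAND at low Entropy, whereas a controller that does not compress should show no such difference.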
The first test performs 5 minutes of Random 4K writes for
each combination of Queue Depth and Entropy %.
The second test does the same thing for a mixture of Read
and Write traffic (70% Reads, 30% Writes).
4K Entropy Write Test

You can see there is little or no variance in performance to
be found in any of the Entropy tests, as the degree of random data increases
(and this comment applies to all of the test results for the Myce/OakGate
Entropy Tests). We can conclude that the Cloudspeed’s controller does not
compress data.


4K Entropy 70%_Reads_30%_Writes Test
As we saw no evidence of compression in the 4K Entropy Write
Test we skip the presentation of the 70/30 entropy results.
Now let's head to the next page, to look at Power
Consumption and Data Reliability.....
Power Consumption
I believe most people know that data centres are already one
of the major consumers of electricity in the industrialised world; indeed it is
estimated that currently 2% of all electricity consumption goes into IT
applications. According to the European Union the energy consumption of data
centres was 46 Terawatt hours in 2006 and is set to rise to 93 Terawatt hours by 2020. This
is roughly equivalent to one hundred million 100W light bulbs burning 24 hours a day,
365 days a year.
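The light bulb comparison can be checked with a few lines of arithmetic. The constant names are illustrative; the figures are those quoted above.

```python
BULBS = 100_000_000       # one hundred million bulbs
WATTS_PER_BULB = 100
HOURS_PER_YEAR = 24 * 365

watt_hours = BULBS * WATTS_PER_BULB * HOURS_PER_YEAR
terawatt_hours = watt_hours / 1e12
print(terawatt_hours)  # 87.6
```

At 87.6 Terawatt hours per year, the comparison is of the same order as the 93 Terawatt hours projected for 2020.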
Typically 40% of the power consumed by data centres is for
the IT load and 35% is for cooling the system. Generally speaking, if a drive
consumes more power it will produce more heat – so power consumption is indeed
a double edged sword. It is no surprise then that a significant proportion of
a data centre’s power consumption goes on servers. I understand cloud based
applications, such as Facebook, are the primary cause of the growth in servers
and the demand for storage space.
I recently listened to a BBC Radio 4 Programme that quoted
IBM as saying that 90% of the world’s data has been created in the last 2 years
– staggering!
If you are a Facebook user, like me and the Reynolds sibs, and
you reside in Europe – this is most probably where your data is (click here). Some
interesting Facebook statistics – Facebook has more than 1 Billion monthly
active users, it generates 1 Trillion page views per month and more than 219
Billion photos have been uploaded since launch – amazing!
I’ve heard that Google has more than 1 million servers and
that Microsoft has more than 300,000 in its Chicago based data centre alone –
fortunately for humanity the very large players are also amongst the most
efficient (understandably, as the economics associated with power consumption
are huge for them). So suffice to say, the power consumption of SSS Enterprise
solutions is a very important global consideration.
My thanks to Anna of Intel for pointing me to the following
Info-graphs -


The following graph uses the typical Power Consumption, when
active, as published in the respective manufacturer’s specification (please
note that the value for the Samsung 843 is the average of the typical read
active and write active values, as specified by Samsung). The value for the
Kingston E100 is calculated as the average of 1.2W (TYP) Read and 2.7W (TYP)
Write.

The Cloudspeed has low Power Consumption requirements.
Data Reliability
The 'Unrecoverable Bit Error
Rate' (UBER), as defined by JEDEC, the global leader in developing open
standards for the microelectronics industry, is a metric for data corruption
rate equal to the number of data errors per bit read after applying any
specified error correction method. UBER = number of data errors / number of
bits read. JEDEC specifies that the maximum error rate allowable for an
Enterprise level SSS solution is one error in every 10^16 bits read.

The Cloudspeed exceeds the JEDEC requirement and has an
excellent UBER of 1 in 10^18.
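To put these error rates into perspective, the UBER formula can be inverted to give the amount of data read, on average, per unrecoverable error. This is a simple sketch; the function names are illustrative, and 1 PB is taken to be 10^15 bytes.

```python
def bits_per_error(uber: float) -> float:
    """Expected number of bits read per unrecoverable error."""
    return 1 / uber


def petabytes_per_error(uber: float) -> float:
    """The same figure expressed in petabytes (assuming 1 PB = 10^15 bytes)."""
    return bits_per_error(uber) / 8 / 1e15


JEDEC_ENTERPRISE_UBER = 1e-16  # maximum allowable rate quoted in the text
CLOUDSPEED_UBER = 1e-18        # rate quoted in the text for the Cloudspeed

print(petabytes_per_error(JEDEC_ENTERPRISE_UBER))
print(petabytes_per_error(CLOUDSPEED_UBER))
```

That works out at roughly 1.25 PB read per error at the JEDEC limit, against roughly 125 PB for the Cloudspeed's quoted rate.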
The Cloudspeed has a 5 year warranty and Smart Storage
Systems specifies endurance as follows.
Cloudspeed 1000 - 1-6 Drive Writes per day
Cloudspeed 1000E – 3-7 Drive Writes per day
The low order of the range is for a random workload and the
high order for a sequential workload as endurance of NAND does depend on the
type of workload. Endurance is significantly greater than that achieved by
Intel in the DC S3500 – impressive!
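Drive Writes per day can be converted into total data written over the warranty period once a capacity is chosen. In the sketch below the 400 GB capacity is purely hypothetical and chosen for illustration; the Drive Writes per day values are the 1000E figures quoted above.

```python
def lifetime_writes_tb(capacity_gb: float, dwpd: float, warranty_years: int = 5) -> float:
    """Total data written over the warranty, in terabytes (1 TB = 1000 GB)."""
    return capacity_gb * dwpd * 365 * warranty_years / 1000


# capacity_gb=400 is a hypothetical capacity, not a figure from this review.
for label, dwpd in (("random workload", 3), ("sequential workload", 7)):
    print(label, lifetime_writes_tb(capacity_gb=400, dwpd=dwpd), "TB")
```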
So, the Guardian Technology has allowed Smart Storage
Systems to maintain an excellent endurance level in spite of the move to 19nm
NAND. Sandisk must be delighted to have acquired the Guardian Technology IP in
its acquisition of Smart Storage Systems.
Needless to say, the Cloudspeed includes full end-to-end
data protection and power failure support.
Now let's head to the next page, to look at the
Conclusions of this review.....
Now that we have reviewed both the Intel DC S3500 and the
Smart Storage Systems Cloudspeed 1000E I feel a new category for the Myce
‘Editor’s choice’ for Enterprise solutions is called for and this is for the
best ‘Mixed Workload, Value, Enterprise SSD’. So, we now have three categories
– the Best Premium SAS Enterprise SSD (held by the Smart Optimus); the Best Premium
SATA Enterprise SSD (held by the Intel DC S3700) and the Best Mixed Workload, Value
Enterprise SSD.
The Cloudspeed 1000E has proved to be another very
impressive solution from Smart Storage Systems, especially in the way that
Smart’s Guardian Technology has maintained an excellent level of endurance
despite the move to 19nm MLC NAND and in the way that excellent random write
speeds have been maintained. Smart Storage Systems is to be congratulated and
I am delighted to award the Cloudspeed 1000E our Excellent rating and to name
it as our Editor’s Choice from amongst ‘Mixed workload, Value Enterprise SSDs’.

















